Compare commits

9276 Commits

Author SHA1 Message Date
Juan A. Suarez Romero
b43b55d461 nir/spirv: return after emitting a branch in block
When emitting a branch in a block, it does not make sense to continue
processing further instructions, as they will not be reachable.

This fixes a nasty case of a loop containing a branch whose then-part
and else-part both exit the loop:

%1 = OpLabel
     OpLoopMerge %2 %3 None
     OpBranchConditional %false %2 %2
%3 = OpLabel
     OpBranch %1
%2 = OpLabel
    [...]

We know that block %1 will always branch to block %2, which is the merge
block for the loop, and thus a break is emitted. If we continue
processing further instructions, we will process the branch
conditional and emit the proper NIR conditional, which leads to
instructions after the break.

This fixes dEQP-VK.graphicsfuzz.continue-and-merge.

CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-28 09:47:06 +01:00
Eric Engestrom
0c3287e94d egl/android: replace magic 0=CbCr,1=CrCb with simple enum
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-28 07:44:46 +00:00
Caio Marcelo de Oliveira Filho
6a553bedcc st/nir: count num_uniforms for FS builtin shader
Usually the uniforms will be assigned locations and have their slots
counted automatically, but for builtin shaders the location assignment
is manual.  So count them too, otherwise we get num_uniforms == 0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-27 22:18:24 -08:00
Ray Zhang
b344e32cdf glx: fix shared memory leak in X11
Call XShmDetach to allow the X server to free the shared memory.

Fixes: bcd80be49a "drisw/glx: use XShm if possible"
Signed-off-by: Ray Zhang <zhanglei002@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-02-28 14:23:02 +10:00
Timothy Arceri
e907337fad radeonsi/nir: move si_lower_nir() call into compiler thread
This helps improve compile times. For example the shader-db dolphin
shader shaders/dolphin/ubershaders/120.shader_test goes from
~1.69 -> ~1.57 seconds on my machine with this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-28 11:54:06 +11:00
Timothy Arceri
7536af670b glsl: fix shader cache for packed param list
Some types of params such as some builtins are always padded. We
need to keep track of this so we can restore the list correctly.

Here we also remove a couple of cache entries that are not actually
required as they get rebuilt by the _mesa_add_parameter() calls.

This patch fixes a bunch of arb_texture_multisample and
arb_sample_shading piglit tests for the radeonsi NIR backend.

Fixes: edded12376 ("mesa: rework ParameterList to allow packing")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-28 11:47:37 +11:00
Yevhenii Kolesnikov
07f4b4e403 i965: Fix allow_higher_compat_version workaround limited by OpenGL 3.0
Added check for higher compat profile being allowed
before assigning certain extensions.

Fixes: 272fe94942 (mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile)

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107052
2019-02-28 10:25:16 +11:00
Lionel Landwerlin
6e184147dd intel/compiler: use correct swizzle for replacement
The optimization in 4cd1a0be76 introduced a replacement of:

cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D
...
cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D

by:

cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D
...
mov(8) vgrf11.y:D, vgrf15.yyyy:D

The first cmp instruction stores to x while the replacement mov
sources from y. We need to take into account where the replacement on
the scan_inst destination is going to store its result so that the
replacement mov can source from the correct location.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4cd1a0be76 ("i965/vec4: Propagate conditional modifiers from more compares to other compares")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-27 20:06:42 +00:00
Jonathan Marek
61e3188633 freedreno: catch failing fd_blit and fallback to software blit
Fixes cases where the fd_blit fails and never happens (ex: blit to etc1)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
e3591b0339 freedreno: use renderonly path for buffers allocated with modifiers
Now that freedreno has create_with_modifiers(), this "hack" is needed to
make some cases work. Copied from vc4.

Fixes: 41ddf1d1

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
6c0fefb448 freedreno: a2xx: fix mipmapping for NPOT textures
Fixes: 3a273a4a

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
4f23767590 freedreno: a2xx: fix fast clear for some gmem configurations
In freedreno_gmem.c, gmem_align of 0x8000 is used. Alignment used here
should be the same.

Fixes: 912a9c8d

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
8eca6df5ed freedreno: a2xx: add use_hw_binning function
Fixes: cb2322c7

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
357313ab0f freedreno: a2xx: don't write 4th vertex in mem2gmem
There is only room for 3 vertices now (RECT has 3 vertices).

Fixes: 6ef7700a

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Erik Faye-Lund
71a76a47cc swr/codegen: fix autotools build
When the output directory was changed, the BUILT_SOURCES and build-rule
target-path was no longer correct, leading to races to generate the
sources and compiling them.

Fix this by updating both sets of paths, so automake sees what's going on
here.

Fixes: 773b3ceaca ("swr/rast: Fix autotools and scons codegen")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-02-27 17:59:06 +00:00
Timo Aaltonen
738626daca util/os_misc: Add check for PIPE_OS_HURD
Fix build on Hurd.

Signed-off-by: Timo Aaltonen <tjaalton@debian.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-27 14:56:48 +00:00
Lionel Landwerlin
2fff5966d6 vulkan/overlay: install layer binary in libdir
This will allow multilib.

v2: Drop path from json file, dlopen should be able to locate the lib in libdir

v3: Switch from configure_file to install_data (Dylan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109788
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-27 11:45:42 +00:00
Eric Engestrom
7763e664ce meson/swr: replace hard-coded path with current_build_dir()
Fixes: 93cd9905c8 "swr/rast: Cleanup and generalize gen_archrast"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Alok Hota <alok.hota@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-02-27 11:13:05 +00:00
Gert Wollny
b7201a468d nir: Add possibility to not lower to source mod 'abs' for ops with three sources
This is useful for r600 since there the abs source modifier is not supported
for ops with three sources

v2: Use correct logic to enable lowering to abs source mod (Eric Anholt)

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-27 11:04:06 +00:00
Gurchetan Singh
ce112fcc87 virgl/vtest: deprecate protocol version 1
This is a partial revert of 9d81cd ("virgl: Pass resource size and
transfer offsets").

The adjustments made in the client code mean there are various
mismatches when transferring data.

Let's fall back to protocol version 0 and deprecate protocol
version 1.  We can still use the protocol version 1 slots for
a shared memory transfer mechanism later.

Fixes:
  dEQP-GLES31.functional.copy_image.mixed.viewclass_128_bits_mixed.*_renderbuffer

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-02-27 11:02:29 +00:00
Tapani Pälli
b9acfef337 util: fix a warning when building against clang7 headers
Header xmmintrin.h conditionally includes emmintrin.h that defines
_MM_DENORMALS_ZERO_MASK, add ifndef to fix this warning.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-27 08:57:41 +02:00
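The guard described in b9acfef337 above can be sketched as follows. This is a minimal illustration, not the actual Mesa change: 0x0040 is the denormals-are-zero bit of the x86 MXCSR register, and `with_denormals_zero()` is a hypothetical helper added only to give the macro a user.

```c
#include <assert.h>

/* Define the mask only if no previously included header (for example
 * emmintrin.h pulled in through xmmintrin.h) has already provided it. */
#ifndef _MM_DENORMALS_ZERO_MASK
#define _MM_DENORMALS_ZERO_MASK 0x0040
#endif

/* Hypothetical helper: return an MXCSR value with the DAZ bit set. */
static unsigned
with_denormals_zero(unsigned csr)
{
   return csr | _MM_DENORMALS_ZERO_MASK;
}
```

With the guard in place, the local definition is skipped whenever the toolchain headers already define the macro, so no redefinition warning is emitted.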
Tapani Pälli
d1af8115f8 iris: add libmesa_iris_gen8 library to the build
Patch fixes iris build on Android.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-27 08:57:41 +02:00
Tapani Pälli
5e52184f72 android: make libbacktrace optional on USE_LIBBACKTRACE
Otherwise with VNDK enabled we fail linking:
   src/gallium/targets/dri/Android.mk: error: gallium_dri (native:vendor)
   should not link to libbacktrace.vendor (native:vndk_private)

Option makes it possible to use libbacktrace only when VNDK is not
enabled.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-27 08:56:46 +02:00
Tapani Pälli
a3c366c4b2 android: add liblog to libmesa_intel_common build
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-27 08:53:09 +02:00
Alyssa Rosenzweig
b7a5b81d14 panfrost/midgard: Allow flt to run on most units
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-27 03:56:56 +00:00
Alyssa Rosenzweig
4c82abb9b6 panfrost: Expose perf counters in environment
Previously, they were guarded by an #ifdef, which is generally bad form.
This patch instead guards them behind an environment variable.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-27 03:56:38 +00:00
Alyssa Rosenzweig
60270c83b5 panfrost: Identify 4-bit channel texture formats
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-27 03:56:17 +00:00
Alyssa Rosenzweig
90fd82c540 panfrost: Add RGB565, RGB5A1 texture formats
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-27 03:55:19 +00:00
Jose Maria Casanova Crespo
4122665dd9 iris: Enable ARB_shader_draw_parameters support
Additional VERTEX_ELEMENT_STATE entries are used to store basevertex,
baseinstance and drawid, updating the DWordLength of the
3DSTATE_VERTEX_ELEMENTS command.

This passes all piglit tests for spec.*draw_parameters.* tests
and VK-GL-CTS KHR-GL45.shader_draw_parameters_tests.* tests.

Now we only mark a dirty_update when parameters are changed or
when we have an indirect draw.

We enable PIPE_CAP_DRAW_PARAMETERS on Iris.

There is no edge flag support in the Vertex Elements setup.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-26 13:28:38 -08:00
Pierre Moreau
1c9fdcefd4 clover: Fix indentation issues
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
5285fff5f9 clover: Only use devices supporting IR_NATIVE
Currently clover will advertise any device that advertises
PIPE_CAP_COMPUTE, even if they do not support PIPE_SHADER_IR_NATIVE,
which is the IR used internally by clover.
This avoids clover advertising devices as available even though they
actually are not supported.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
8f9b4a2be6 clover: Move platform extensions definitions to clover/platform.cpp
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2019-02-26 21:02:07 +01:00
Pierre Moreau
b033620abf clover: Move device extensions definitions to core/device.cpp
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2019-02-26 21:02:07 +01:00
Pierre Moreau
d42f5896c5 clover: Validate program and library linking options
Program linking options are only valid when creating an executable, and
only if the library was created with the `-enable-link-options` option,
which itself is only valid when creating a library.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
fccc6ecb52 clover: Disallow creating libraries from other libraries
If creating a library, do not allow non-compiled objects in it, as
executables are not allowed, and libraries would make it really hard to
enforce the "-enable-link-options" flag.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2019-02-26 21:02:07 +01:00
Pierre Moreau
bad161c894 clover/api: Fail if trying to build a non-executable binary
From the OpenCL 1.2 Specification, Section 5.6.2 (about clBuildProgram):

> If program is created with clCreateProgramWithBinary, then the
> program binary must be an executable binary (not a compiled binary or
> library).

Reviewed-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
25d4e65eb7 clover/api: Rework the validation of devices for building
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
505ec3a530 clover: Add a helper for checking if an IR is supported
Reviewed-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
67769c913f clover: Remove the TGSI backend as unused
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
669d00ba4c clover: Avoid warnings from new OpenCL headers
* Avoid warnings from references to deprecated CL 1.0, 1.2, 2.0 and 2.1 APIs.
* Avoid warnings from not defining CL_TARGET_OPENCL_VERSION.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Karol Herbst
ba8d21a8d3 clover: update ICD table to support everything up to 2.2
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2019-02-26 21:02:07 +01:00
Pierre Moreau
dddc5649bf include/CL: Update to the latest OpenCL 2.2 headers
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-02-26 21:02:07 +01:00
Marek Olšák
2ae07830e7 gallium/u_tests: use a compute-only context to test GCN compute ring
2019-02-26 14:58:55 -05:00
Marek Olšák
a1378639ab radeonsi: always use compute rings for clover on CI and newer (v2)
Initialize all non-compute context functions to NULL.

v2: fix SI
2019-02-26 14:58:55 -05:00
Bas Nieuwenhuizen
c0110477b5 radv: Interpolate less aggressively.
Seems like dxvk used integer builtins without setting the flat
interpolation decoration.

I believe in the current spec the app is required to set these,
but in the meantime to avoid breaking things in stable releases
(and so close to release for 19.0), only expand the interpolation
to float16 and struct (which cannot be builtins as our spirv parser
lowers the builtin block).

Fixes: f324784104 "radv: Allow interpolation on non-float types."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-26 18:51:35 +00:00
Drew Davenport
1fd79b4b6d util: Don't block SIGSYS for new threads
SIGSYS is needed for programs using seccomp for sandboxing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-26 19:39:14 +01:00
Rob Clark
64206102fc freedreno/ir3: gsampler2DMSArray fixes
Array index should come before sample-id.  And exclude all isam variants
(which take integer texel coords) from adding of offset.

Fixes dEQP-GLES31.functional.texture.multisample.samples_1.use_texture_*_2d_array

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
a06bb486b0 freedreno/ir3/a6xx: fix atomic shader outputs
We also need to put in the output mov.  Possibly we could just fixup the
output register to read it directly from the dummy, but that is more
work and I guess dEQP is probably the only time you encounter this.

Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.const_literal_fragment

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
db1fa21374 freedreno/a6xx: vertex_id is not _zero_based
Fixes dEQP-GLES31.functional.draw_base_vertex.draw_elements_base_vertex.builtin_variable.vertex_id

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
79180a0566 freedreno/a6xx: fix DRAW_IDX_INDIRECT max_indicies
The indirect offset does not affect the index buffer size.  Fixes all of
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_100x100_drawcount_*
with drawcount > 1.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
cabe55a2e7 freedreno/ir3/a6xx: fix non-ssa atomic dst
We weren't propagating the array info for cases where result of atomic
is array/reg.  This can happen, for example, if result is part of a phi
web lowered to regs.

Fixes dEQP-GLES31.functional.ssbo.atomic.compswap.*

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
edd5b3126d freedreno/a6xx: fix ssbo alignment
Fixes a bunch of deqp ssbo tests that use multiple ssbo blocks packed
into a single buffer.

Note the a5xx value seems suspicious, but this is what blob seems to
advertise.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
cb884d8ab2 freedreno/ir3: use nopN encoding when possible
Use the (nopN) encoding for slightly denser shaders. This lets us fold
nop instructions into the previous alu instruction in certain cases.

Shouldn't change the # of cycles a shader takes to execute, but reduces
the size.  (ex: glmark2 refract goes from 168 to 116 instructions)

Currently only enabled for a6xx, but I think we could enable this for
a5xx and possibly a4xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
04c2520d91 freedreno/a6xx: fix hangs with large shaders
We were overflowing instrlen (which is # of groups of 16 instructions)
in a couple dEQP tests, causing gpu hangs:

dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Brian Paul
6dabcb5bcf mesa: fix display list corner case assertion
This fixes a failed assertion in glDeleteLists() for the following
case:

list = glGenLists(1);
glDeleteLists(list, 1);

when those are the first display list commands issued by the
application.

When we generate display lists, we plug in empty lists created with
the make_list() helper.  This function uses the OPCODE_END_OF_LIST
opcode but does not call dlist_alloc() which would set the
InstSize[OPCODE_END_OF_LIST] element to non-zero.

When the empty list was deleted, we failed the InstSize[opcode] > 0
assertion.

Typically, display lists are created with glNewList/glEndList so we
set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc().  That's why
this bug wasn't found before.

To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST]
element in make_list().

The game oolite was hitting this.

Fixes: https://github.com/OoliteProject/oolite/issues/325
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-26 09:56:45 -07:00
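The failure mode described in 6dabcb5bcf above can be modelled with a toy version of the size table. This is a simplified sketch, not the Mesa display list code: only `InstSize`, `OPCODE_END_OF_LIST` and `make_list` are names taken from the commit message; the node type, the other opcode and `delete_list()` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

enum opcode { OPCODE_NOP, OPCODE_END_OF_LIST, OPCODE_COUNT };

/* Per-opcode instruction size; zero means "never initialized". */
static unsigned InstSize[OPCODE_COUNT];

struct dlist_node { enum opcode opcode; };

/* Create an empty list consisting of a single END_OF_LIST node. */
static struct dlist_node *
make_list(void)
{
   struct dlist_node *n = malloc(sizeof *n);
   if (n == NULL)
      return NULL;
   n->opcode = OPCODE_END_OF_LIST;
   /* The fix: initialize the size here too, since the usual
    * allocation path may never have run for this opcode. */
   InstSize[OPCODE_END_OF_LIST] = 1;
   return n;
}

/* Deleting a list walks its nodes and relies on a valid size. */
static void
delete_list(struct dlist_node *n)
{
   assert(InstSize[n->opcode] > 0);
   free(n);
}
```

Without the assignment in `make_list()`, calling `delete_list()` on a freshly generated empty list would trip the assertion, which mirrors the glGenLists/glDeleteLists sequence in the commit message.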
Brian Paul
cb52d4482d svga: fix dma.pending > 0 test
The dma.pending field is boolean, so testing for > 0 isn't right.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2019-02-26 09:56:45 -07:00
Brian Paul
96ea977c79 svga: assorted whitespace and formatting fixes
Remove trailing whitespace, etc.

Trivial.
2019-02-26 09:56:45 -07:00
Brian Paul
a81eebf9bc st/mesa: whitespace/formatting fixes in st_cb_texture.c
Remove trailing whitespace, replace tabs w/ spaces, etc.

Trivial.
2019-02-26 09:56:45 -07:00
Eleni Maria Stea
fd37a19ac4 i965: fixed clamping in set_scissor_bits when the y is flipped
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
          https://bugs.freedesktop.org/show_bug.cgi?id=109594

v2:
   - I initially clamped the values inside the if (Y is flipped) case
   and made a mistake in the calculation: the clamp of bbox[2] should
   instead be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1, and I
   shouldn't have changed the ScissorRectangleYMax calculation. As the
   fixed code is equivalent to using CLAMP instead of MAX2 at the top of
   the function when bbox[2] and bbox[3] are calculated, and the latter is
   clearer, I replaced it. (Nanley Chery)

v3:
   - Reversed the CLAMP change in bbox[3] as the API guarantees that the
   viewport height is positive. (Nanley Chery)

v4:
  - Added nomination for the mesa-stable branch and the link to the second
  bugzilla bug (Nanley Chery)

CC: <mesa-stable@lists.freedesktop.org>
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-26 08:23:26 -08:00
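The clamping discussed in fd37a19ac4 above can be illustrated with a reduced, one-dimensional sketch. It is written under assumptions and is not the i965 code: `flipped_scissor_y()` and `struct y_range` are hypothetical, and the empty-range check stands in for whatever handling the driver performs when the rectangle collapses.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

struct y_range {
   bool empty;
   int ymin, ymax;   /* inclusive, measured from the top (y flipped) */
};

/* Compute the flipped scissor Y range for a viewport starting at vp_y
 * with height vp_height, inside a framebuffer of height fb_height. */
static struct y_range
flipped_scissor_y(int vp_y, int vp_height, int fb_height)
{
   struct y_range r = { true, 0, 0 };

   assert(vp_height >= 0);

   /* Clamp on both sides: a lower bound alone would let y0 exceed
    * fb_height and make the flipped ymax negative. */
   int y0 = CLAMP(vp_y, 0, fb_height);
   int y1 = MIN2(y0 + vp_height, fb_height);

   if (y0 == y1)
      return r;   /* fully clipped: nothing to draw */

   r.empty = false;
   r.ymin = fb_height - y1;
   r.ymax = fb_height - y0 - 1;
   return r;
}
```

For a viewport that starts below the framebuffer (for example vp_y = 200 with fb_height = 100), the two-sided clamp collapses the range to empty instead of producing a negative ymax that would not fit an unsigned field.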
Eduardo Lima Mitev
0bf667984b freedreno/a6xx: Silence compiler warnings
util_format_compose_swizzles() expects 'const unsigned char' and we
are feeding it 'char'.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-02-26 14:15:33 +01:00
Kasireddy, Vivek
7cab8d3661 i965: Add support for sampling from XYUV images
Add support to the i965 DRI driver to sample from XYUV8888 buffers.

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 13:08:52 +00:00
Kasireddy, Vivek
65600d0946 dri: Add XYUV8888 format
In addition to adding this format to the dri_interface header,
add an entry in the android and wayland backends as well.

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 13:08:52 +00:00
Vivek Kasireddy
ff14d06be5 drm-uapi: Update headers from drm-next
Pull new updates from drm-next as of the following commit:

commit a5f2fafece141ef3509e686cea576366d55cabb6
Merge: 71f4e45a4ed3 860433ed2a55
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Feb 20 12:16:30 2019 +1000

    Merge https://gitlab.freedesktop.org/drm/msm into drm-next

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 13:08:51 +00:00
Kasireddy, Vivek
78fb3fd17e nir/lower_tex: Add support for XYUV lowering
The memory layout associated with this format would be:
Byte:      0 1 2 3
Component: V U Y X

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 13:08:51 +00:00
Lionel Landwerlin
913d711e0f imgui: update memory editor
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-26 12:49:07 +00:00
Lionel Landwerlin
ab9ae080ec imgui: update commit
In commit 3950e7c11e ("imgui: bump copy") I forgot to update the
README about what copy of imgui we carry.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-26 12:49:04 +00:00
Eric Engestrom
a213b927f2 driinfo: add DTD to allow the xml to be validated
This DTD can be used to validate the output and make sure any parsers
out there can handle it:
$ xmllint --noout --valid driinfo.xml

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-26 12:48:28 +00:00
Lionel Landwerlin
9646750822 vulkan/overlay: fix includes
The Loader/Validation-Layers repository allows the user to choose where
header files are installed. On my system I chose /usr/include
thinking it was the obvious "base" location, but it turns out the
headers end up being installed right there rather than in a vulkan
subdirectory. On Debian/Ubuntu the selected installation path is
/usr/include/vulkan, so just go with that.

Hopefully other distros don't choose another path.

Note that the validation layer doesn't provide a .pc file so we have
no way of querying where the headers are installed.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109739
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 12:29:54 +00:00
Lionel Landwerlin
47ef52d333 vulkan/overlay: fix missing installation of layer
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109739
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 12:29:46 +00:00
Eric Engestrom
318e550549 dri_interface: add missing #include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-26 12:03:20 +00:00
Eric Engestrom
7f5d9c2757 gitlab-ci: always run the containers build
If the job creating the containers was manually cancelled the first time
a fork was created, the fork would be left unable to use the CI
(until the next automatic regeneration of the container).

Avoid this by always running the container-generation job, even though
99% of the time it will spin up, see that the container exists and shut
down.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2019-02-26 12:02:14 +00:00
Emil Velikov
40a82e6463 docs: mention "Allow commits from members who can merge..."
Mention the tick-box otherwise only the MR author can rebase the series.

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-26 11:27:10 +00:00
Emil Velikov
d9d1cb43d7 egl/android: bump the number of drmDevices to 64
It's the current maximum supported by the kernel. Stay consistent with
the rest of Mesa and use the same number.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-26 11:07:23 +00:00
Emil Velikov
02344fe80b loader: use loader_open_device() to handle O_CLOEXEC
Some platforms lack O_CLOEXEC. The loader_open_device() handles those
appropriately, so use the helper.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 11:07:23 +00:00
Emil Velikov
f0a7b463b5 meson: egl: correctly manage loader/xmlconfig
An earlier commit introduced support for Haiku yet did not properly
annotate the loader/xmlconfig dependencies.

Thus we ended up adding inc_loader for each !haiku platform - see
659910eda0 9a96bf0ecd c731508b98 ec6cb01e21.

One piece remained though - the wayland platform. Hence the following
would fail:

 meson -Dgallium-drivers=etnaviv -Ddri-drivers=''\
       -Dtools=etnaviv -Dplatforms=wayland -Dglx=disabled \
       build/

Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Reported-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 834d221512 ("meson: Add Haiku platform support v4")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-02-26 11:07:23 +00:00
Emil Velikov
9d84a922b8 egl/dri: de-duplicate dri2_load_driver*
The difference between the three functions is the list of mandatory
driver extensions. Pass that as an argument to the common helper.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 11:07:23 +00:00
Samuel Pitoiset
4924dfc851 radv: don't copy buffer descriptors list for samplers
Sampler descriptors don't have a buffer list.

This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.sampler_*.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-26 11:22:28 +01:00
Samuel Pitoiset
9256e0a09d radv: fix out-of-bounds access when copying descriptors BO list
We shouldn't increment the buffer list pointers twice.

This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-26 11:22:22 +01:00
Tapani Pälli
1d5e5ec30a nir: use nir_variable_create instead of open-coding the logic
Fixes: 3d7611e9 "st/nir: use NIR for asm programs"
Reported-by: Matthias Lorenz <oschowa@web.de>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-26 09:00:36 +02:00
Tapani Pälli
22267feff1 nir: initialize value in copy_prop_vars_block
Fixes the following valgrind warning:

   ==27561== Conditional jump or move depends on uninitialised value(s)
   ==27561==    at 0x667856B: value_set_ssa_components (nir_opt_copy_prop_vars.c:78)
   ==27561==    by 0x667A1C4: copy_prop_vars_block (nir_opt_copy_prop_vars.c:797)

Fixes: 62332d139c "nir: Add a local variable-based copy propagation pass"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-26 08:56:25 +02:00
Eric Anholt
97566efe5c v3d: Rematerialize MOVs of uniforms instead of spilling them.
If we have a MOV of a uniform value available to spill, that's one of our
best choices.  We can just not spill the value, and emit a new load of the
uniform as the fill.  This saves bothering the TMU and the thrsw, and is
the same cost in uniforms (since the spill offset is a uniform anyway).

This doesn't have a huge impact on shader-db, since there aren't a whole
lot of spills and we usually copy-prop the uniforms at the VIR level such
that the only uniform MOVs are from vir_lower_uniforms:

total instructions in shared programs: 6430292 -> 6430279 (<.01%)
total uniforms in shared programs: 2386023 -> 2385787 (<.01%)
total spills in shared programs: 4961 -> 4960 (-0.02%)
total fills in shared programs: 6352 -> 6350 (-0.03%)

However, I'm interested in dropping the uniforms copy-prop in the backend,
since it would be cheaper to not load repeated uniforms if we have the
registers to spare.  This also saves many spills on
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20, which is what
motivated a bunch of my recent backend work in the first place:

before: 46 spills, 106 fills, 3062 instructions
after: 0 spills, 0 fills, 2611 instructions
2019-02-25 21:33:47 -08:00
Eric Anholt
e0fada983d v3d: Dump the VIR after register spilling if we were forced to.
Spilling is unusual, but one often has to debug it when it happens, so
dump it.
2019-02-25 21:26:24 -08:00
Eric Anholt
2786d2161a v3d: Fix vir_is_raw_mov() for input unpacks.
There are no users at the moment, but I wanted to start using this in
register spilling.
2019-02-25 21:26:24 -08:00
Mathias Fröhlich
1ab2159249 st/mesa: Reduce array updates due to current changes.
Since we use bitmasks, we can easily check whether any
current value is potentially uploaded on array setup.
So check for any potential vertex program input that is not
already a vao enabled array. Only flag array update if there is
a potential overlap.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-02-26 05:42:04 +01:00
Dylan Baker
6f42303646 meson/iris: Use current coding style
Just a few minor style things.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-25 23:37:27 +00:00
Timothy Arceri
603206d0a6 radeonsi: fix query buffer allocation
Fix the logic for buffer full check on alloc.

This patch just takes the fix Nicolai attached to the bug report
and updates it to work on master.

Fixes: e0f0d3675d ("radeonsi: factor si_query_buffer logic out of si_query_hw")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561
2019-02-26 09:55:41 +11:00
Eric Anholt
7c1bf075f3 nir: Just return when asked to rewrite uses of an SSA def to itself.
The nir_builder swizzling improvement to not emit extra MOVs resulted in
nir_lower_tex() trying to rewrite an SSA def to itself, triggering the
assert on all texturing in v3d.  There's no work to be done in this case,
so just stop asserting.

Fixes: 743700be1f ("nir/builder: Don't emit no-op swizzles")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-25 21:25:24 +00:00
Samuel Pitoiset
5671f38085 radv: fix clearing attachments in secondary command buffers
If no framebuffer is bound, get the number of samples and the
image format from the render pass.

This fixes new CTS dEQP-VK.geometry.layered.*.secondary_cmd_buffer.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-25 21:42:50 +01:00
Alok Hota
773b3ceaca swr/rast: Fix autotools and scons codegen
Use new input flags for gen_archrast.py

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:39 -06:00
Alok Hota
16e10b8c30 swr/rast: Add general SWTag statistics
Update Archrast parser to use stats, used with an internal tool

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:36 -06:00
Alok Hota
b45a15a39f swr/rast: Add string handling to AR event framework
For use by an internal tool

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:31 -06:00
Alok Hota
8608a747aa swr/rast: Add initial SWTag proto definitions
Update gen_archrast.py to properly generate event IDs

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:17 -06:00
Alok Hota
93cd9905c8 swr/rast: Cleanup and generalize gen_archrast
Update meson.build to accommodate

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:07 -06:00
Daniel Schürmann
0bd45f96b9 nir: Use SM5 properties to optimize shift(a@32, iand(31, b))
This is a common pattern from HLSL->SPIRV translation
and supported in HW by all current NIR backends.

vkpipeline-db results anv (SKL):

    total instructions in shared programs: 6403130 -> 6402380 (-0.01%)
    instructions in affected programs: 204084 -> 203334 (-0.37%)
    helped: 208
    HURT: 0

    total cycles in shared programs: 1915629582 -> 1918198408 (0.13%)
    cycles in affected programs: 1158892682 -> 1161461508 (0.22%)
    helped: 107
    HURT: 86

shader-db results on i965 (KBL):

    total instructions in shared programs: 15284592 -> 15284568 (<.01%)
    instructions in affected programs: 81683 -> 81659 (-0.03%)
    helped: 24
    HURT: 0

    total cycles in shared programs: 375013622 -> 375013932 (<.01%)
    cycles in affected programs: 40169618 -> 40169928 (<.01%)
    helped: 13
    HURT: 9

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-25 12:59:44 -06:00
Daniel Schürmann
0525bdc225 nir: Define shifts according to SM5 specification.
SPIR-V shifts are undefined for values >= bitsize, but SM5 shifts
are defined to only use the least significant bits.
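
As a rough illustration of the SM5 rule described above, the following C
sketch masks the shift count to its five least significant bits for a
32-bit value. The helper names sm5_ishl32 and sm5_ushr32 are made up for
this example and are not NIR functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of NIR: SM5-style 32-bit shifts that only
 * look at the 5 least significant bits of the shift count. */
static uint32_t sm5_ishl32(uint32_t a, uint32_t b)
{
   uint32_t count = b & 31u;
   /* A plain C shift by 32 or more would be undefined behaviour. */
   assert(count < 32u);
   return a << count;
}

static uint32_t sm5_ushr32(uint32_t a, uint32_t b)
{
   return a >> (b & 31u);
}
```

Under this definition an explicit iand(31, b) on the shift count of a
32-bit shift is redundant, because the shift already ignores the upper
bits of the count.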

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-25 12:59:43 -06:00
Jason Ekstrand
c4fb6b0c81 intel/eu: Add an EOT parameter to send_indirect_[split]_message
For split indirect sends we have to put the EOT parameter in the
extended descriptor as well as the instruction itself so just calling
brw_inst_set_eot is insufficient.  Moving the EOT handling into
the send_indirect_[split]_message helper lets us handle it properly.
2019-02-25 11:35:12 -06:00
Sergii Romantsov
dcc4866419 d3d: meson: do not prefix user provided d3d-drivers-path
The user can select the location where the d3d drivers
are installed by the d3d-drivers-path meson option.

By default path will be $prefix/$libdir/d3d.

Currently we add $prefix to the user provided path.
Resulting in an incorrect or even missing path.

Based on logic of
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-25 16:07:02 +00:00
Sergii Romantsov
f6556ec7d1 dri: meson: do not prefix user provided dri-drivers-path
The user can select the location where the dri drivers
are installed by the dri-drivers-path meson option.

By default path will be $prefix/$libdir/dri.

Currently we add $prefix to the user provided path.
Resulting in an incorrect or even missing path.

v2: fixed dri_search_path by default, rebased to master

v3: new commit-message (Emil Velikov), cc mesa-stable

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698
CC: Rafael Antognolli <rafael.antognolli@intel.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Fixes: 306914db92 (meson: Add dridriverdir variable to dri.pc.)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-25 16:07:02 +00:00
Lionel Landwerlin
30828f4646 intel/aub_viewer: silence more compiler warnings
format not a string literal and no format arguments.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-25 13:11:16 +00:00
Lionel Landwerlin
91df8b1780 intel/aub_viewer: silence compiler warning
buffer_addr may be used uninitialized.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-25 13:11:13 +00:00
Lionel Landwerlin
f1da10e0c5 intel/aub_viewer: printout 48bits addresses
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-25 13:11:05 +00:00
Gert Wollny
875942c059 mesa/core: Enable EXT_depth_clamp for GLES >= 2.0
The extension NV_depth_clamp is written against OpenGL 1.2.1, and
since GLES 2.0 is based on GL 2.0 there is no reason not to enable
this extension also for GLES >= 2.0.

v2: Use EXT_depth_clamp that has been proposed to Khronos

v3: - Fix check for extension availability (Erik Faye-Lund)
    - Also fix the test in is_enabled
v4: - Test both, ARB and EXT extension (Erik)
v5: - Fix white space errors (Erik)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-25 09:44:27 +00:00
Kenneth Graunke
b45186a6cd iris: Properly allow rendering to RGBX formats.
I was converting them at pipe_surface creation time, but not when
answering queries about whether formats support rendering.  This caused
a lot of FBO incomplete errors for formats that ought to be supported.

Fixes "Child of Light", which uses PIPE_FORMAT_R8G8B8X8_UNORM_SRGB.

Also fixes Witcher 1 using wined3d (GL) according to Timur Kristóf.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109738
2019-02-25 01:11:27 -08:00
Kenneth Graunke
fce089c8a2 iris: Drop RGBX -> RGBA for storage image usages
GLSL doesn't expose RGB/RGBX image formats, so this isn't needed.
2019-02-25 00:57:50 -08:00
Kenneth Graunke
6921588d54 mesa: Fix RGBBuffers for renderbuffers with sized internal formats
For texture attachments, 'f' is texImg->_BaseFormat, but for
renderbuffer attachments, 'f' is att->Renderbuffer->InternalFormat.

InternalFormat may be something like GL_RGB8, which causes our
(f == GL_RGB) check to fail.  Switch to using a proper _BaseFormat,
which drops the size.

Fixes dEQP-GLES31.functional.draw_buffers_indexed.random.
max_required_draw_buffers.15 on iris when combined with a driver fix.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
2019-02-25 00:57:42 -08:00
Oscar Blumberg
da9c030763 glsl: Fix function return typechecking
apply_implicit_conversion only converts and checks base types but we
need actual type equality for function returns, otherwise you can
return a vec2 from a function declared as returning a float.
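
The difference between the two checks can be sketched in C with a heavily
simplified type model. The struct and function names below are
hypothetical and only stand in for the real GLSL type machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a GLSL type. */
enum base_type { BASE_FLOAT, BASE_INT };

struct simple_type {
   enum base_type base;
   unsigned vector_elements;   /* 1 for scalars, 2 for vec2, ... */
};

/* Only compares base types, so float and vec2 look compatible. */
static bool same_base_type(const struct simple_type *a,
                           const struct simple_type *b)
{
   assert(a != NULL && b != NULL);
   return a->base == b->base;
}

/* Full equality, which is what a return statement needs. */
static bool return_type_matches(const struct simple_type *declared,
                                const struct simple_type *returned)
{
   assert(declared != NULL && returned != NULL);
   return declared->base == returned->base &&
          declared->vector_elements == returned->vector_elements;
}
```

With a float return type and a vec2 value, the base-type check passes
while the full equality check rejects the return.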

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-25 08:49:06 +02:00
Jordan Justen
bd0ad651e0 iris: Always use in-tree i915_drm.h
Ref: f1374805a8 "drm-uapi: use local files, not system libdrm"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-24 21:06:40 -08:00
Alyssa Rosenzweig
f943047e48 panfrost: Decode render target swizzle/channels
On MRT-capable systems, the framebuffer format is encoded as a 64-bit
word in the render target descriptor. Previously, the two 32-bit
words were exposed as opaque hex values. This commit identifies a 12-bit
Mali swizzle and a 2-bit channel counter, removing some of the magic. It
also adds decoding support for the AFBC and MSAA enable bits, which were
already known but otherwise ignored in pandecode.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 04:49:50 +00:00
Alyssa Rosenzweig
c6be9969d2 panfrost/midgard: Add fround(_even), ftrunc, ffma
These ops were discovered by invoking the correspondingly named GLSL
functions. The rounding ops here behave exactly as expected and are mapped
to their corresponding NIR ops where applicable. The ffma behaves as a
LUT instruction and requires some special argument packing (since
Midgard normally only allows for 2 arguments); this quirk will be
addressed in the future, but for now FMA is still lowered.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:36:26 +00:00
Alyssa Rosenzweig
4a4726af3c panfrost/nondrm: Split out dump_counters
Previously, this function was implicitly part of the job submit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:34:16 +00:00
Alyssa Rosenzweig
cdca103d43 panfrost/nondrm: Make COHERENT_LOCAL explicit
This flag corresponds to what was MEM_COHERENT_LOCAL in the vendor
driver, which seems to influence the cache policy, necessary for the
varying temporary storage but nothing else.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:32:45 +00:00
Alyssa Rosenzweig
f44d4653a9 panfrost/nondrm: Flag CPU-invisible regions
Potentially, the kernel could optimize these allocations, or perhaps we
can save on mapping costs.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:31:09 +00:00
Alyssa Rosenzweig
10cc251842 panfrost/meson: Remove subdir for nondrm
This change fixes cross builds with the (temporary) non-DRM overlay.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:27:26 +00:00
Alyssa Rosenzweig
77fea552f6 panfrost: Use tiler fast path (performance boost)
For reasons that are still unclear (speculation included in the comment
added in this patch), the tiler? metadata has a fast path that we were
not enabling; there looks to be a possible time/memory tradeoff, but the
details remain unclear.

Regardless, this patch improves performance dramatically. Particular
wins are for geometry-heavy scenes. For instance, glmark2-es2's
Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped
from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious
when vsync is disabled, as in glmark2-es2-wayland.

With this patch, on GLES 2.0 samples not involving FBOs, it appears
performance is converging with (and sometimes surpassing) the blob.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:25:50 +00:00
Jason Ekstrand
743700be1f nir/builder: Don't emit no-op swizzles
The nir_swizzle helper is used some on its own but it's also called by
nir_channel and nir_channels which are used everywhere.  It's pretty
quick to check while we're walking the swizzle anyway whether or not
it's an identity swizzle.  If it is, we now don't bother emitting the
instruction.  Sure, copy-prop will clean it up for us but there's no
sense making more work for the optimizer than we have to.
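
A minimal sketch of such an identity check, assuming a swizzle is given as
an array of channel indices. The function name and signature are
hypothetical and are not the nir_builder API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when applying `swiz` to a source with
 * `src_components` channels would yield the source unchanged. */
static bool swizzle_is_identity(const unsigned *swiz,
                                unsigned num_components,
                                unsigned src_components)
{
   assert(swiz != NULL);

   /* Selecting fewer (or more) channels than the source has is never a
    * no-op, even if the selected channels are in order. */
   if (num_components != src_components)
      return false;

   for (unsigned i = 0; i < num_components; i++) {
      if (swiz[i] != i)
         return false;
   }
   return true;
}
```

A caller could then return the source value directly when the check
succeeds instead of emitting a new instruction.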

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-24 20:01:27 -06:00
Jason Ekstrand
724371c6b9 nir/split_vars: Don't compact vectors unnecessarily
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-24 20:01:18 -06:00
Erik Faye-Lund
7a6a5d4bfa st/mesa: remove unused header-file
This header has been unused since f8f2520e88 ("st/mesa: Remove
unnecessary headers"). And in the more than 8 years since, this
hasn't been useful. So let's just get rid of it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-24 20:53:37 +01:00
Maya Rashish
021c496135 configure: fix test portability
From the bash manual:

string1 == string2
string1 = string2
       True if the strings are equal.  = should be used with the test
       command for POSIX conformance.
2019-02-24 19:26:15 +00:00
David Shao
6fa923a65d meson: ensure that xmlpool_options.h is generated for gallium targets that need it
Fixes: 68076b8747 "meson: build gallium vdpau state tracker"
Fixes: 22a817af8a "meson: build gallium xvmc state tracker"
Fixes: 5a785d51a6 "meson: build gallium va state tracker"
Fixes: 0ba909f0f1 "meson: build gallium xa state tracker"
Fixes: 1d36dc674d "meson: build gallium omx state tracker"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-24 09:00:39 +00:00
Matthias Lorenz
f91654120b vulkan/overlay: Add fps counter
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109747
2019-02-24 01:07:26 +00:00
Lionel Landwerlin
239b0d8570 Revert "anv: add support for INTEL_DEBUG=bat"
This reverts commit e4d88396d2.

Apologies, I pushed the wrong commit.
2019-02-24 01:06:39 +00:00
Lionel Landwerlin
e4d88396d2 anv: add support for INTEL_DEBUG=bat
As requested by Ken ;)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-23 23:29:04 +00:00
Christian Gmeiner
c56e734496 etnaviv: blt: mark used src resource as read from
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-02-23 16:00:50 +01:00
Christian Gmeiner
7244e76804 etnaviv: rs: mark used src resource as read from
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-02-23 16:00:25 +01:00
Vinson Lee
2bd08b8b9d gallium/auxiliary/vl: Fix duplicate symbol build errors.
CXXLD    gallium_dri.la
duplicate symbol _compute_shader_video_buffer in:
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
duplicate symbol _compute_shader_weave in:
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
duplicate symbol _compute_shader_rgba in:
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)

Fixes: 9364d66cb7 ("gallium/auxiliary/vl: Add video compositor compute shader render")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
2019-02-22 23:07:26 -08:00
Caio Marcelo de Oliveira Filho
4c160b6bd8 nir: fix MSVC build
Zero initialize struct with {0} instead of {}.
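
For illustration, a small sketch of the portable form. The struct is a
made-up example, not one from NIR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct used only for this example. */
struct example_state {
   int count;
   float scale;
   void *ptr;
};

static struct example_state make_zeroed_state(void)
{
   /* "= {0}" is valid in every C standard and zero-initializes all
    * members. The empty form "= {}" is only standard since C23, and the
    * commit above reports that it broke the MSVC build. */
   struct example_state s = {0};
   assert(s.count == 0 && s.ptr == NULL);
   return s;
}
```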
2019-02-22 22:38:05 -08:00
Caio Marcelo de Oliveira Filho
eb13211997 nir/copy_prop_vars: add tests for load/store elements of vectors
Test using array deref on vectors in loads and stores.  These are
marked DISABLED_ as this optimization is currently not done.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
4f3809d389 nir: nir_build_deref_follower accept array derefs of vectors
Code itself already supports it, just make sure we can use it for
those cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
c4beadd28e nir/copy_prop_vars: change test helper to get intrinsics
Replace find_next_intrinsic(intrinsic, after) with
get_intrinsic(intrinsic, index).  This makes it slightly more convenient
to check the resulting loads/stores/copies, since in most tests we
know which one we care about.  The cost is to perform more traversals,
but for such tests this is not a problem.

Added the ASSERT_EQ() on count to some tests missing it, so the
indices queried are always expected to find something.

Also, drop two nir_print_shader leftover calls in a test.

v2: Remove redundant assertions.  nir_src_comp_as_uint already
    assert what we need.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
fdcb9779d9 nir/copy_prop_vars: keep track of components in copy_entry
When a copy_entry is SSA, store not only the nir_ssa_def* for each
component, but also the source component they come from.  At the
moment this is always a match (i.e. 'component[i] == i'), because all
the operations for a copy_entry happen using definitions with the same
size.  This prepares the code for array_derefs of vectors, in which
'component[i] != i'.

Also, extract setting all SSA components into a function of its own.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
6624decbb5 nir/copy_prop_vars: add debug helpers
Disabled by default, to be used during development.  Adding those
so I don't rewrite some ad-hoc version of them every time I'm working
with this pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
60d9bb9ff5 nir/copy_prop_vars: don't get confused by array_deref of vectors
For now these derefs are not handled, so don't let these get into the
copies list -- which would cause wrong propagations.  For load_derefs,
do nothing.  For store_derefs, invalidate whatever the store is
writing to.  For copy_derefs, invalidate whatever the copy is writing
to.

These cases will happen once derefs to SSBOs/UBOs are kept around long
enough to get optimized by copy_prop_vars.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Timothy Arceri
f48527e51a nir: allow nir_lower_phis_to_scalar() on more src types
Rather than only lowering if all srcs are scalarizable we instead
check that at least one src is scalarizable.

We change undef type to return false otherwise it will cause
regressions when it is the only scalarizable src.
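
The change in the decision rule can be sketched as follows. The source
kinds and function names are hypothetical simplifications; the real pass
distinguishes more kinds of sources than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of phi source kinds. */
enum src_kind { SRC_ALU, SRC_UNDEF, SRC_OTHER };

static bool src_is_scalarizable(enum src_kind kind)
{
   switch (kind) {
   case SRC_ALU:
      return true;
   case SRC_UNDEF:
      /* Undefs no longer count, so a phi whose only candidate source is
       * an undef is left alone. */
      return false;
   default:
      return false;
   }
}

/* New rule: lower the phi if at least one source is scalarizable. */
static bool should_lower_phi(const enum src_kind *srcs, size_t num_srcs)
{
   assert(srcs != NULL || num_srcs == 0);
   for (size_t i = 0; i < num_srcs; i++) {
      if (src_is_scalarizable(srcs[i]))
         return true;
   }
   return false;
}
```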

total instructions in shared programs: 13219105 -> 13024547 (-1.47%)
instructions in affected programs: 1153797 -> 959239 (-16.86%)
helped: 581
HURT: 74

total cycles in shared programs: 333968972 -> 324807922 (-2.74%)
cycles in affected programs: 129809402 -> 120648352 (-7.06%)
helped: 571
HURT: 131

total spills in shared programs: 57947 -> 29130 (-49.73%)
spills in affected programs: 53364 -> 24547 (-54.00%)
helped: 351
HURT: 0

total fills in shared programs: 51310 -> 25468 (-50.36%)
fills in affected programs: 44882 -> 19040 (-57.58%)
helped: 351
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-23 11:11:51 +11:00
Alok Hota
6053499f2e swr/rast: bypass size limit for non-sampled textures
This fixes a bug where SWR will fail to render in cases with large
buffer allocations, e.g. very large meshes whose vertex buffers exceed
2GB

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-22 23:35:11 +00:00
Marek Olšák
b326a15eda tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics
This might have decreased performance for radeonsi/tgsi, because
most shaders claimed they used bindless.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-22 18:00:54 -05:00
Jordan Justen
cf652205cf iris: Add gitlab-ci build testing
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-22 14:08:21 -08:00
Rob Clark
fd360c82f0 freedreno/a6xx: cube image fix
Note that emit_intrinsic_load_image() already swaps a .3d flag with an
.a flag.  I tried doing things the other way around (going back to .3d)
but that didn't work.  And treating cube images as 2d array is also what
blob does, so let's just go with that.

Fixes dEQP-GLES31.functional.image_load_store.cube.load_store.*

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-22 14:05:32 -05:00
Rob Clark
f90c3b4485 freedreno/a6xx: fix border-color offset
Fixes nearly all of dEQP-GLES31.functional.texture.border_clamp.* when
run after a test that binds textures used in vertex shader.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-22 14:05:32 -05:00
Rob Clark
bdedb8277a freedreno/ir3: don't hardcode wrmask
Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.const_literal.vertex.samplercubeshadow
and few other similar tests that do multiple texture fetches into
individual components of a packet output.  Mostly works around the
issue mentioned in ra_block_find_definers().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-22 14:05:32 -05:00
Rob Clark
5d4fa194b8 freedreno: fix race condition
rsc->write_batch can be cleared behind our back, so we need to acquire
the lock *before* deref'ing.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-22 14:05:32 -05:00
Kenneth Graunke
3090c6b9e9 vulkan: Fix 32-bit build for the new overlay layer
vulkan_core.h defines non-dispatchable handles as (struct object *)
on 64-bit systems, but uint64_t on 32-bit systems.  The former can be
implicitly cast to void *, but the latter requires an explicit cast.

While here, %lu is the wrong format specifier for uint64_t on 32-bit
systems, so use PRIu64, fixing a warning.
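
Both points can be sketched in plain C. The helper names are hypothetical
and the handle is modelled as a bare uint64_t, as it would be on a 32-bit
build.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: PRIu64 expands to the right conversion specifier
 * for uint64_t on both 32-bit and 64-bit targets, unlike "%lu". */
static int format_handle(char *buf, size_t size, uint64_t handle)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, "%" PRIu64, handle);
}

/* Hypothetical helper: an integer handle needs an explicit cast to become
 * a pointer. Going through uintptr_t avoids a size-mismatch warning where
 * pointers are narrower than 64 bits. */
static void *handle_to_ptr(uint64_t handle)
{
   return (void *)(uintptr_t)handle;
}
```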

Reported-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-22 08:56:54 -08:00
Juan A. Suarez Romero
4f917e6a61 anv: advertise 8 subpixel precision bits
On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is
used to select between 8 bit subpixel precision (value 0) or 4 bit
subpixel precision (value 1). As this value is not set, it
takes the value 0, so 8 bits are used.

On the other side, in the Vulkan CTS tests, the reference rasterizer,
which computes the expected values for the tests, uses 8 bit precision.
If it is changed to use 4 bit, as ANV was advertising so far, some of
the tests will fail.

So it seems ANV is actually using 8 bits.

v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason)

v3: use _8Bit definition as value (Jason)

v4: (by Jason)
anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect

This field was added on gen8 even though there's an identically defined
one in 3DSTATE_SF.

CC: Jason Ekstrand <jason@jlekstrand.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 17:53:55 +01:00
Juan A. Suarez Romero
3b423eeb2d genxml: add missing field values for 3DSTATE_SF
Fill out "Vertex Sub Pixel Precision Select" possible values.

CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 17:53:45 +01:00
Bas Nieuwenhuizen
f324784104 radv: Allow interpolation on non-float types.
In particular structs containing floats and 16-bit floating point
types.

Fixes: 62024fa775 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
Fixes: da29594636 "spirv: Only split blocks"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109735
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-22 17:06:55 +01:00
Bas Nieuwenhuizen
a1fdd4a4a7 radv: Fix float16 interpolation set up.
float16 types can have non-flat interpolation so set up the HW
correctly for that.

Fixes: 62024fa775 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-22 17:06:55 +01:00
Ilia Mirkin
ae2cb72804 nv50: disable compute
It causes more trouble than it's worth. Now vl tries to create compute
shaders without all the proper checking. Since there's really no
(current) way to use compute on nv50, just mark it disabled.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109742
Fixes: f6ac0b5d71 ("gallium/auxiliary/vl: Add compute shader to support video compositor render")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-22 09:42:41 -05:00
Lionel Landwerlin
1d626fc028 intel: fix urb size for CFL GT1
Same 192Kb amount as SKL/KBL GT1 applies.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: de7ed0ba55 ("i965/CFL: Add PCI Ids for Coffee Lake.")
2019-02-22 11:53:49 +00:00
Samuel Iglesias Gonsálvez
bd2c5a8203 isl: the display engine requires 64B alignment for linear surfaces
v2: Add PRM quote (Lionel)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-22 11:45:45 +00:00
Gert Wollny
2ee197d6e8 virgl: Enable mixed color FBO attachments only when the host supports
it

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2019-02-22 10:44:08 +01:00
Mauro Rossi
338dacc341 android: intel/isl: remove redundant building rules
Fixes the following building error:

including ./external/mesa/Android.mk ...
build/core/base_rules.mk:183: *** external/mesa/src/intel:
MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by external/mesa/src/intel.
make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1

ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c
and that source file includes isl_tiled_memcpy.c source

Fixes: 96bb328 ("iris: add Android build")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-22 07:56:11 +02:00
Kenneth Graunke
b21de090d6 Revert "iris: Enable auxiliary buffer support"
This reverts commit cd0ced49e7.

It breaks glxgears rendering.
2019-02-21 15:50:46 -08:00
Kenneth Graunke
e2cb0c5e0e iris: Enable -msse2 and -mstackrealign
This is needed for gen_clflush.h intrinsics to work on 32-bit builds.
i965 and anv both set these, and iris needs to as well.

Tested-by: Mark Janes <mark.a.janes@intel.com>
2019-02-21 14:51:15 -08:00
Francisco Jerez
7272fe9c08 intel/fs: Rely on undocumented unrestricted regioning for 32x16-bit integer multiply.
Even though the hardware spec claims that any "integer DWord multiply"
operation is affected by the regioning restrictions of CHV/BXT/GLK,
this is inconsistent with the behavior of the simulator and with
empirical evidence -- Return false from has_dst_aligned_region_restriction()
for such instructions as a micro-optimization.

Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Francisco Jerez
e03be78252 intel/fs: Implement extended strides greater than 4 for IR source regions.
Strides up to 32B can be implemented for the source regions of most
instructions by leveraging either the vertical or the horizontal
stride of the hardware Align1 region.  The main motivation for this is
that currently the lower_integer_multiplication() pass will happily
double the stride of one of the 32-bit sources, which can blow up if
the stride of the original source was already the maximum value
allowed by the hardware.

An alternative would be to use the regioning legalization pass in
order to lower such strides into the composition of multiple legal
strides, but that would be somewhat less efficient.

This showed up as a regression from my commit cbea91eb57
in Vulkan 1.1 CTS tests on CHV/BXT platforms, however it was really a
pre-existing problem that had affected conformance on other platforms
without native support for integer multiplication.  CHV/BXT were
getting around it because the code I removed in that commit had the
"fortunate" side effect of emitting narrower regions that didn't hit
the hardware stride limit after lowering.  Beyond fixing the
regression this fixes ~90 additional Vulkan 1.1 subgroup CTS tests on
ICL (that's why this patch is marked for inclusion in mesa-stable even
though the original regressing patch was not).

According to Jason, a nearly equivalent change had been committed
previously as e8c9e65185 and then (mistakenly?) reverted as
a31d038208.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328
Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Francisco Jerez
7f9f6263c1 intel/fs: Cap dst-aligned region stride to maximum representable hstride value.
This is required in combination with the following commit, because
otherwise if a source region with an extended 8+ stride is present in
the instruction (which we're about to declare legal) we'll end up
emitting code that attempts to write to such a region, even though
strides greater than four are still illegal for the destination.

Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Francisco Jerez
e2f475ddff intel/fs: Lower integer multiply correctly when destination stride equals 4.
Because the "low" temporary needs to be accessed with word type and
twice the original stride, attempting to preserve the alignment of the
original destination can potentially lead to instructions with illegal
destination stride greater than four.  Because the CHV/BXT alignment
restrictions are now being enforced by the regioning lowering pass run
after lower_integer_multiplication(), there is no real need to
preserve the original strides anymore.
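
For background, one common way to build a 32x32-bit multiply out of
32x16-bit multiplies is sketched below in scalar C. This only shows the
arithmetic behind a "low" and "high" partial product. It is an assumption
for illustration and does not model the regions, strides or types that
lower_integer_multiplication() actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32x16-bit multiply that keeps the low
 * 32 bits of the product. */
static uint32_t mul_32x16(uint32_t a, uint16_t b)
{
   return a * (uint32_t)b;
}

/* Low 32 bits of a * b, assembled from two 32x16-bit partial products:
 * a * b = a * lo16(b) + ((a * hi16(b)) << 16)   (mod 2^32) */
static uint32_t mul_32x32_lowered(uint32_t a, uint32_t b)
{
   uint32_t low = mul_32x16(a, (uint16_t)(b & 0xffffu));
   uint32_t high = mul_32x16(a, (uint16_t)(b >> 16));
   uint32_t result = low + (high << 16);
   assert(result == a * b);
   return result;
}
```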

Note that this bug can be reproduced on stable branches, but
back-porting would be non-trivial, because the fix relies on the
regioning lowering pass recently introduced.

Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Francisco Jerez
c3c27762f7 intel/fs: Exclude control sources from execution type and region alignment calculations.
Currently the execution type calculation will return a bogus value in
cases like:

  mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u

Which will be considered to have a 32-bit integer execution type even
though the actual indirect move operation will be carried out with
16-bit precision.

Similarly there's no need to apply the CHV/BXT double-precision region
alignment restrictions to such control sources, since they aren't
directly involved in the double-precision arithmetic operations
emitted by these virtual instructions.  Applying the CHV/BXT
restrictions to control sources was expected to be harmless if mildly
inefficient, but unfortunately it exposed problems at codegen level
for virtual instructions (namely the SHUFFLE instruction used for the
Vulkan 1.1 subgroup feature) that weren't prepared to accept control
sources with an arbitrary strided region.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328
Reported-by: Mark Janes <mark.a.janes@intel.com>
Fixes: efa4e4bc5f "intel/fs: Introduce regioning lowering pass."
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
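A minimal sketch of the execution type rule described in c3c27762f7, not the compiler's actual code. The `Source` class and `exec_type_bits` helper are hypothetical, and treating both the index and the immediate of the indirect move as control sources is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    bits: int         # size of the source type in bits
    is_control: bool  # True for sources that only steer the operation (e.g. an index)


def exec_type_bits(dst_bits, sources):
    """Widest type among the data sources; control sources are ignored."""
    data_bits = [s.bits for s in sources if not s.is_control]
    return max(data_bits, default=dst_bits)


# mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u
srcs = [Source(16, False), Source(32, True), Source(32, True)]
print(exec_type_bits(16, srcs))
```

With the control sources filtered out, the example instruction is treated as a 16-bit operation rather than a 32-bit one.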
Timothy Arceri
d9e08e753b nir: clone instruction set rather than removing individual entries
This reduces the time spent in nir_opt_cse() by almost a half.

The massif tool from valgrind reported no change in peak
memory use with the large dolphin uber shaders I used for
testing.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 08:36:36 +11:00
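The idea in d9e08e753b can be sketched as follows. This is not NIR's implementation: the dominator tree is a plain dict, instructions are strings, and the `cse` function is hypothetical. Each block works on its own clone of the set inherited from its dominator, so nothing has to be removed entry by entry when the walk leaves the block.

```python
def cse(block, dom_children, instrs, available=frozenset()):
    """Count redundant instructions while walking a dominator tree."""
    local = set(available)  # clone once per block instead of undoing insertions later
    removed = 0
    for instr in instrs[block]:
        if instr in local:
            removed += 1    # an equivalent instruction dominates this one
        else:
            local.add(instr)
    for child in dom_children.get(block, ()):
        removed += cse(child, dom_children, instrs, local)
    return removed
```

Sibling blocks never see each other's entries, because each one starts from the parent's set rather than from a shared, mutated one.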
Jordan Justen
cd0ac3a6af genxml: Remove extra space in gen4/45/5 field name
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 13:17:10 -08:00
Jordan Justen
a9b0b72a78 genxml/gen_bits_header.py: Use regex to strip non-alphanum chars
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 13:15:59 -08:00
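As a rough illustration of a9b0b72a78, a regex can drop every character that is not a letter or digit in one pass. The pattern and the `to_alphanum` name below are assumptions; the actual expression used in gen_bits_header.py is not shown in this log.

```python
import re

# Hypothetical pattern: anything outside ASCII letters and digits.
_NON_ALNUM = re.compile(r"[^A-Za-z0-9]")


def to_alphanum(name):
    """Return name with all non-alphanumeric characters removed."""
    return _NON_ALNUM.sub("", name)
```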
Kenneth Graunke
cd0ced49e7 iris: Enable auxiliary buffer support
This currently regresses KHR-GL4x.compute_shader.resource-texture,
but that's a pre-existing bug (https://bugs.freedesktop.org/109113)
which should be fixed up once we have fast clear support.
2019-02-21 10:26:12 -08:00
Rafael Antognolli
db81445837 iris: Flag ALL_DIRTY_BINDINGS on aux state change.
If we change the aux state for a given resource, we need to re-emit the
binding table pointers for any stage that has such a resource bound. Since
we don't track that, flag IRIS_ALL_DIRTY_BINDINGS and emit all of them.
2019-02-21 10:26:12 -08:00
Rafael Antognolli
95589652a1 iris: Skip resolve if there's no context.
If iris_resource_get_handle() gets called without a context, we can't
resolve the resource. Hopefully it shouldn't be compressed anyway, so
let's just add an assert to ensure it's correct.
2019-02-21 10:26:12 -08:00
Rafael Antognolli
36138bb7fc iris/clear: Pass on render_condition_enabled. 2019-02-21 10:26:12 -08:00
Rafael Antognolli
8190165d13 iris: Avoid leaking if we fail to allocate the aux buffer.
Otherwise we could leak the aux state map or the aux BO.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
7da53d7188 iris: Only resolve compute resources for compute shaders 2019-02-21 10:26:12 -08:00
Kenneth Graunke
95a36bd55c iris: Fix aux usage in render resolve code 2019-02-21 10:26:12 -08:00
Rafael Antognolli
4f191feb0c iris: Pin HiZ buffers when rendering. 2019-02-21 10:26:12 -08:00
Rafael Antognolli
dfd54f9954 iris: Flush before hiz_exec. 2019-02-21 10:26:12 -08:00
Kenneth Graunke
f3f7d45a63 iris: Allow disabling aux via INTEL_DEBUG options 2019-02-21 10:26:12 -08:00
Kenneth Graunke
4634b754f4 iris: do flush for buffers still 2019-02-21 10:26:12 -08:00
Kenneth Graunke
15822f33ad iris: make surface states for CCS_D too
CCS_E can fall back to CCS_D with incompatible format views

CCS_D is pretty useless without fast clears and we may as well use NONE,
but we're surely going to hook those up at some point, so may as well
just go ahead and do it now...
2019-02-21 10:26:12 -08:00
Rafael Antognolli
689b590069 iris: Skip msaa16 on gen < 9.
Also needed to add gen information to KEY_INIT.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
fd2038b22a iris: Set program key fields for MCS 2019-02-21 10:26:12 -08:00
Kenneth Graunke
92c310fd3f iris: don't use hiz for MSAA buffers 2019-02-21 10:26:12 -08:00
Kenneth Graunke
2cddc953cd iris: some initial HiZ bits 2019-02-21 10:26:12 -08:00
Kenneth Graunke
9b1126c990 iris: disable aux for external things 2019-02-21 10:26:12 -08:00
Kenneth Graunke
45f4dab62b iris: Resolves for compute 2019-02-21 10:26:12 -08:00
Kenneth Graunke
ecc897b8ad iris: consider framebuffer parameter for aux usages 2019-02-21 10:26:12 -08:00
Kenneth Graunke
b77d2dc71b iris: Make blit code use actual aux usages 2019-02-21 10:26:12 -08:00
Kenneth Graunke
bfc76d3525 iris: store modifier info in res 2019-02-21 10:26:12 -08:00
Kenneth Graunke
56f1fe3eac iris: pin the buffers 2019-02-21 10:26:12 -08:00
Kenneth Graunke
f8aa9aa353 iris: resolve before transfer maps 2019-02-21 10:26:12 -08:00
Kenneth Graunke
c53a67d469 iris: be sure to skip buffers in resolve code
Buffers don't have ISL surfaces, and this can get us into trouble.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
5eb75345b8 iris: try to fix copyimage vs copybuffers 2019-02-21 10:26:12 -08:00
Kenneth Graunke
d8f3bc1c4c iris: actually use the multiple surf states for aux modes 2019-02-21 10:26:12 -08:00
Kenneth Graunke
3c979b0e6d iris: add some draw resolve hooks 2019-02-21 10:26:12 -08:00
Kenneth Graunke
53c484ba8a iris: blorp using resolve hooks 2019-02-21 10:26:12 -08:00
Kenneth Graunke
77a1070d36 iris: Initial import of resolve code 2019-02-21 10:26:12 -08:00
Kenneth Graunke
f879349398 iris: create aux surface if needed 2019-02-21 10:26:12 -08:00
Kenneth Graunke
3efd5299af iris: Fill out SURFACE_STATE entries for each possible aux usage 2019-02-21 10:26:12 -08:00
Kenneth Graunke
3cfc6a207b iris: Fill out res->aux.possible_usages 2019-02-21 10:26:12 -08:00
Kenneth Graunke
a7bc4d6074 iris: Add iris_resource fields for aux surfaces
But without fast clears or HiZ per-level tracking just yet.
2019-02-21 10:26:12 -08:00
Jordan Justen
d0996d5fab iris: Emit default L3 config for the render pipeline
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:12 -08:00
Kenneth Graunke
51ddc40084 iris: Always emit at least one BLEND_STATE 2019-02-21 10:26:12 -08:00
Kenneth Graunke
d6dd57d43c iris: Add missing depth cache flushes 2019-02-21 10:26:12 -08:00
Kenneth Graunke
1b5c342f33 iris: Simplify iris_get_depth_stencil_resources
We can safely assume that the given resource is depth, depth/stencil,
or stencil already.  The stencil-only case is easily detectable with
a single format check, and all other cases are handled identically.

This saves some CPU overhead.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
07ec1f0b25 iris: Make an IRIS_MAX_MIPLEVELS define 2019-02-21 10:26:12 -08:00
Rafael Antognolli
455c959689 iris: Store internal_format when getting resource from handle. 2019-02-21 10:26:12 -08:00
Kenneth Graunke
973f01d55a iris: Move create and bind driver hooks to the end of iris_program.c
This just moves the code for dealing with pipe_shader_state /
pipe_compute_state / iris_uncompiled_shader to the end of the file.
Now that those do precompiles, they want to call the actual compile
functions.  Putting them at the end eliminates the need for a bunch
of prototypes.
2019-02-21 10:26:12 -08:00
Timur Kristóf
cacf84ed5f iris: implement clearing render target and depth stencil
v2 (Kenneth Graunke): split color/depthstencil cases, fix iris_clear
2019-02-21 10:26:12 -08:00
Kenneth Graunke
8ab82bd1fd iris: Drop XXX about checking for swizzling
Caio noted that this is not necessary on Gen8+:

   "Before Gen8, there was a historical configuration control field to
    swizzle address bit[6] for in X/Y tiling modes.  This was set in
    three different places: TILECTL[1:0], ARB_MODE[5:4], and
    DISP_ARB_CTL[14:13].  For Gen8 and subsequent generations, the
    swizzle fields are all reserved, and the CPU's memory controller
    performs all address swizzling modifications."

Since we don't support earlier hardware, we can skip it entirely.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
bf23e79629 iris: Set HasWriteableRT correctly
A bit of irritating state cross dependency here, but nothing too hard
2019-02-21 10:26:12 -08:00
Kenneth Graunke
d612cd1bf8 iris: Set 3DSTATE_WM::ForceThreadDispatchEnable
The Vulkan driver only sets this if color writes are disabled, which
is more conservative - but would require us to inspect blend state.

(If color writes are enabled, we don't need to force anything, because
the internal signal is already correct.  But it shouldn't hurt to do so.)
2019-02-21 10:26:12 -08:00
Kenneth Graunke
27d751cdd8 iris: Drop XXX about alpha testing
I was misreading i965 - the 3DSTATE_WM::PixelShaderKillsPixel bit from
Gen < 8 needed all of this, but the 3DSTATE_PS_EXTRA bit only needs
prog_data->uses_kill.
2019-02-21 10:26:12 -08:00
Andre Heider
bffb65d28e iris: improve PIPE_CAP_VIDEO_MEMORY bogus value
-1 is a little too bogus for most games ;)

Signed-off-by: Andre Heider <a.heider@gmail.com>
2019-02-21 10:26:12 -08:00
Andre Heider
f89a578818 iris: fix build with gallium nine
Signed-off-by: Andre Heider <a.heider@gmail.com>
2019-02-21 10:26:12 -08:00
Kenneth Graunke
be49fb051d iris: Stop chopping off the first nine characters of the renderer string 2019-02-21 10:26:12 -08:00
Kenneth Graunke
15341778ba iris: rework num textures to util_lastbit 2019-02-21 10:26:12 -08:00
Kenneth Graunke
974229df46 iris: Add PIPE_CAP_MAX_VARYINGS 2019-02-21 10:26:11 -08:00
Kenneth Graunke
1cd001aa63 iris: Make an iris_batch_reference_signal_syncpt helper function.
Suggested by Chris Wilson.  More obvious what's going on.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
9376799bd6 iris: Use READ_ONCE and WRITE_ONCE for snapshots_landed
Suggested by Chris Wilson, if only to make it obvious to the human
readers that these are volatile reads.  It may also be necessary for
the compiler in a few cases.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
18e31a9b31 iris: Fix accidental busy-looping in query waits
When switching from bo_wait to sync-points, I missed that we turned an
if (not landed) bo_wait into a while (not landed) check_syncpt(), which
has a timeout of 0.  This meant, rather than sleeping until the batch
is complete, we'd busy-loop, continually asking the kernel "is the batch
done yet???".  This is not what we want at all - if we wanted a busy
loop, we'd just loop on !snapshots_landed.  We want to sleep.

Add an effectively infinite timeout so that we sleep.
2019-02-21 10:26:11 -08:00
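The difference between the two behaviours in 18e31a9b31 can be sketched with a `threading.Event` standing in for the sync point. This is an analogy only: `wait_until_landed` is a hypothetical name and no kernel interface is involved.

```python
import threading


def wait_until_landed(done, timeout_s=None):
    """Block until `done` is set; return how many wait calls were needed."""
    calls = 0
    while not done.is_set():
        calls += 1
        # timeout_s=0 returns immediately, so the loop spins (the bug);
        # timeout_s=None sleeps until the event is signalled (the fix).
        done.wait(timeout_s)
    return calls
```

With a zero timeout the call count grows for as long as the work is pending, while an unbounded timeout needs a single wait.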
Kenneth Graunke
3b1ac8244e iris: Add a timeout_nsec parameter, rename check_syncpt to wait_syncpt
I want to be able to wait with a non-zero timeout from elsewhere.
2019-02-21 10:26:11 -08:00
Sagar Ghuge
c24a574e6c iris: Don't allocate a BO per query object
Instead of allocating a 4K BO per query object, we can create a large blob
of memory and split it into pieces as required.

With one BO shared by multiple query objects, we don't want to wait on all
of them. Instead, when we write the last snapshot, we create a sync point,
and check sync points while waiting on a particular object.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-02-21 10:26:11 -08:00
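A sketch of the suballocation described in c24a574e6c, under assumed sizes. The `QueryPool` class, the 64-byte slot size, and the (BO index, offset) return value are all hypothetical; only the 4K figure comes from the message.

```python
class QueryPool:
    """Hand out fixed-size slots from shared buffers instead of one buffer per query."""

    def __init__(self, bo_size=4096, slot_size=64):
        self.bo_size = bo_size
        self.slot_size = slot_size
        self.num_bos = 0
        self.next_offset = bo_size  # forces a buffer allocation on first use

    def alloc(self):
        """Return (buffer index, byte offset) for a new query object."""
        if self.next_offset + self.slot_size > self.bo_size:
            self.num_bos += 1       # current buffer is full, start a new one
            self.next_offset = 0
        offset = self.next_offset
        self.next_offset += self.slot_size
        return self.num_bos - 1, offset
```

Because many queries share one buffer, waiting on the buffer itself would wait on unrelated queries, which is why the message pairs this change with per-query sync points.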
Kenneth Graunke
a1ebac3750 iris: Implement ALT mode for ARB_{vertex,fragment}_shader
Fixes gl-1.0-spot-light
2019-02-21 10:26:11 -08:00
Kenneth Graunke
732c3a90a4 iris: Fix bug in bound vertex buffer tracking
res might be NULL, at which point this is an unbind.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
4bfd12bbf7 iris: minor tidying 2019-02-21 10:26:11 -08:00
Kenneth Graunke
b1bacbf038 iris: Unreference some more things on state module teardown 2019-02-21 10:26:11 -08:00
Kenneth Graunke
e092ed9213 iris: Drop dead state_size hash table
I inherited this from i965.  It would be nice to track the state size
so INTEL_DEBUG=color,bat decoding can print the right number of e.g.
binding table entries or blend states, but...without a single point
of entry for state, it's a little tricky to get right.  Punt for now,
and drop the dead code in the meantime.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
6e41f1b459 iris: Drop comment about ISP_DIS
i965 re-emits 3DSTATE_CONSTANT_* on every batch, so there's no point in
restoring the constants from the context.  Iris actually re-pins the
constant buffers properly across the batch, and avoids re-emitting the
constant packets unless it's necessary.  So, we don't want ISP_DIS.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
edd3ce5a63 iris: Enable PIPE_CAP_COMPACT_ARRAYS 2019-02-21 10:26:11 -08:00
Kenneth Graunke
1db394f46b iris: Remap stream output indexes back to VARYING_SLOT_*.
Previously I had a hack in st/mesa to make it stop remapping
VARYING_SLOT_* into the naively compacted slots, which aren't
what we want.  But that wasn't very feasible, as we'd have to
update all drivers, or add capability bits, and it gets messy fast.

It turns out that I can map back to VARYING_SLOT_* in about 5 LOC,
so let's just do that.  It removes the need for hacks, and is easy.

This also fixes KHR-GL46.enhanced_layouts.xfb_capture_struct, which
apparently with my hack was still getting the wrong slot info.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
5d3d757178 iris: Zero the compute predicate when changing the render condition
1. Set a render condition.  We emit it immediately on the render
   engine, and stash q->bo as ice->state.compute_predicate in case
   the compute engine needs it.

2. Clear the render condition.  We were incorrectly leaving a stale
   compute_predicate kicking around...

3. Dispatch compute.  We would then read the stale compute predicate,
   and try to load it into MI_PREDICATE_DATA.  But q->bo may have been
   freed altogether, causing us to try and use garbage memory as a BO,
   adding it to the validation list, failing asserts, and tripping
   EINVALs in execbuf.

Huge thanks to Mark Janes for narrowing this sporadic GL CTS failure
down to a list of 48 tests I could easily run to reproduce it.  Huge
thanks to the Valgrind authors for the memcheck tool that immediately
pinpointed the problem.
2019-02-21 10:26:11 -08:00
Caio Marcelo de Oliveira Filho
4fd1f70e62 iris: always include an extra constbuf0 if using UBOs
In st_nir_lower_uniforms_to_ubo() all UBO accesses in the shader have
their index incremented to make room for uniforms in constbuf0.  So if
we use UBOs, we always need to include the extra binding entry in the
table.

To avoid doing these checks both when compiling the shader and when
assigning binding tables, store the num_cbufs in iris_compiled_shader.

Fixes a bunch of tests from Piglit and CTS that use UBOs but don't use
uniforms or system values.  Note that some tests fitting these criteria
were passing because the UBOs were moved to be push
constants (avoiding the problem).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 10:26:11 -08:00
Kenneth Graunke
4801af2f26 iris: Do binder address allocations per-context, not globally.
iris_bufmgr allocates addresses across the entire screen, since buffers
may be shared between multiple contexts.  There used to be a single
special address, IRIS_BINDER_ADDRESS, that was per-context - and all
contexts used the same address.  When I moved to the multi-binder
system, I made a separate memory zone for them.  I wanted there to be
2-3 binders per context, so we could cycle them to avoid the stalls
inherent in pinning two buffers to the same address in back-to-back
batches.  But I figured I'd allow 100 binders just to be wildly
excessive/cautious.

What I didn't realize was that we need 2-3 binders per *context*,
and what I did was allocate 100 binders per *screen*.  Web browsers,
for example, might have 1-2 contexts per tab, leading to hundreds of
contexts, and thus binders.

To fix this, we stop allocating VMA for binders in bufmgr, and let
the binder handle it itself.  Binders are per-context, and they can
assign context-local addresses for the buffers by simply doing a
ringbuffer style approach.  We only hold on to one binder BO at a
time, so we won't ever have a conflicting address.

This fixes dEQP-EGL.functional.multicontext.non_shared_clear.

Huge thanks to Tapani Pälli for debugging this whole mess and
figuring out what was going wrong.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-21 10:26:11 -08:00
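The "ringbuffer style approach" from 4801af2f26 might look roughly like this. The `BinderZone` class and all addresses and sizes are hypothetical; the sketch only shows that consecutive binders within one context get distinct addresses, wrapping when the zone is exhausted.

```python
class BinderZone:
    """Per-context address allocator that cycles through a fixed zone."""

    def __init__(self, zone_start, zone_size, binder_size):
        self.zone_start = zone_start
        self.zone_end = zone_start + zone_size
        self.binder_size = binder_size
        self.next = zone_start

    def alloc_address(self):
        if self.next + self.binder_size > self.zone_end:
            self.next = self.zone_start  # wrap around to the start of the zone
        addr = self.next
        self.next += self.binder_size
        return addr
```

As long as the zone holds at least two binders and a context keeps only one binder BO alive at a time, back-to-back allocations never collide.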
Kenneth Graunke
0f33204f05 iris: Fix memzone_for_address for the surface and binder zones
We use > for IRIS_MEMZONE_DYNAMIC because IRIS_BORDER_COLOR_POOL_ADDRESS
lives at the very start of that zone.  However, IRIS_MEMZONE_SURFACE and
IRIS_MEMZONE_BINDER are normal zones.  They used to be a single zone
(surface) with a single binder BO at the beginning, similar to the
border color pool.  But when I moved us to multiple binders, I made them
have a real zone (if a small one).  So both zones should use >=.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-21 10:26:11 -08:00
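A sketch of the comparison rule in 0f33204f05. The zone layout, the string zone names, and the explicit border color pool case are assumptions for illustration, not the driver's real address map.

```python
GB = 1 << 30
# Hypothetical zone start addresses.
BINDER_START, SURFACE_START, DYNAMIC_START, OTHER_START = 1 * GB, 2 * GB, 3 * GB, 4 * GB


def memzone_for_address(addr):
    if addr >= OTHER_START:
        return "other"
    if addr == DYNAMIC_START:
        return "border_color_pool"  # special address at the very start of the zone
    if addr > DYNAMIC_START:        # strict: the first address is the pool above
        return "dynamic"
    if addr >= SURFACE_START:       # ordinary zone: the first address belongs to it
        return "surface"
    if addr >= BINDER_START:        # ordinary zone as well
        return "binder"
    return "shader"
```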
Kenneth Graunke
3bcb1a7fcd iris: Don't whack SO dirty bits when finishing a BLORP op
Re-emitting 3DSTATE_SO_BUFFERS can be hazardous, as it could zero
offsets.  Plus, it's just not necessary - BLORP doesn't change these.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
b9697dd820 iris: Fix SO issue with INTEL_DEBUG=reemit, set fewer bits
INTEL_DEBUG=reemit was breaking streamout tests, by re-emitting
3DSTATE_SO_BUFFER commands that tell the HW to zero the SO write
offsets.  We would need to alter them to use 0xFFFFFFFF for the offset.

Also, have each upload function only flag bits relevant to its own
pipeline.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
61798e3c88 iris: CS stall on VF cache invalidate workarounds
See commit 31e4c9ce40 in i965.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
c81941f1e7 iris: Pay attention to blit masks
For combined depth/stencil formats, we may want to only blit one half.
If PIPE_BLIT_Z is set, blit depth; if PIPE_BLIT_S is set, blit stencil.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7837fec740 iris: Assert about blits with color masking
st/mesa never asks for this today, but in theory someone might, and we
don't support it.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
0f677b0d87 iris: Don't enable smooth points when point sprites are enabled
dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_*.primitives.points
2019-02-21 10:26:11 -08:00
Kenneth Graunke
3b336a1513 iris: Allow sample mask of 0
I think this was an attempt to work around various sample mask bugs I
had early on.  It's not correct.  A sample mask of 0 is legal and means
to disable all samples.

Fixes dEQP-GLES31.functional.texture.multisample.*.*sample_mask*
2019-02-21 10:26:11 -08:00
Kenneth Graunke
e17333ea1e iris: fail to create screen for older unsupported HW
loader shouldn't try, but let's be paranoid
2019-02-21 10:26:11 -08:00
Kenneth Graunke
1f91f688e8 iris: Switch to the new PIPELINE_STATISTICS_QUERY_SINGLE capability
I had a hack in place earlier to pass the query type as q->index
for the regular statistics query, but we ended up adjusting the
interface and adding a new query type.  Use that instead, fixing
pipeline statistics queries since the rebase.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
a23c06cabc iris: Use new PIPE_STAT_QUERY enums rather than hardcoded numbers. 2019-02-21 10:26:11 -08:00
Kenneth Graunke
5aef30b886 iris: Fix Broadwell WaDividePSInvocationCountBy4
We were dividing by 4 in calculate_result_on_gpu(), and also in
iris_get_query_result().  We should stop doing the latter, and instead
divide by 4 in calculate_result_on_cpu() as well.

Otherwise, if snapshots were available, and you hit the
calculate_result_on_cpu() path, but requested it be written to a QBO,
you'd fail to get a divide.
2019-02-21 10:26:11 -08:00
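The point of 5aef30b886 is that the divide belongs in the result calculation, so every consumer sees the same value. A minimal sketch, with a hypothetical `calculate_result` signature:

```python
def calculate_result(begin, end, is_ps_invocations, needs_divide_by_4):
    """Compute a query result from two snapshots, applying the workaround once."""
    result = end - begin
    if is_ps_invocations and needs_divide_by_4:
        result //= 4  # workaround applied here, not in the caller that reads it back
    return result
```

Whether the value is returned to the application or written to a query buffer object, it has then already been divided exactly once.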
Kenneth Graunke
7f318bf2ac iris: Delete genx->bound_vertex_buffers
This is actually stored in ice->state, as it isn't gen-specific
2019-02-21 10:26:11 -08:00
Kenneth Graunke
02991e2878 iris: Drop a dead comment 2019-02-21 10:26:11 -08:00
Kenneth Graunke
572fad1e84 iris: Don't check other batches for our batch BO
This is an awkward corner case.  We create batches in order, each of
which creates and pins a BO.  The other batches may not be set up yet,
so it may not be safe to ask whether they reference a BO.

Just avoid this for now.  We could avoid it for other context-local BOs
too, but we currently don't have a flag for that (and I'm not certain
whether it's worth it).
2019-02-21 10:26:11 -08:00
Kenneth Graunke
8eda6f2288 iris: Handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE somewhat
Various places in the transfer code need to know whether they must
read the existing resource's values.  Rather than checking both flags
everywhere, just make PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE also flag
PIPE_TRANSFER_DISCARD_RANGE - if we can discard everything, we can
discard a subrange, too.

Obviously, we can do better for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE,
but eventually u_threaded_context should handle swapping out buffers
for new idle buffers, anyway.  In the meantime, this is at least better.
2019-02-21 10:26:11 -08:00
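The flag implication in 8eda6f2288 can be sketched as below. The bit values are placeholders chosen for the example, not the Gallium definitions, and `normalize_usage` is a hypothetical name.

```python
# Placeholder bit values for illustration only.
DISCARD_RANGE = 1 << 0
DISCARD_WHOLE_RESOURCE = 1 << 1


def normalize_usage(usage):
    """Discarding the whole resource implies discarding any subrange of it."""
    if usage & DISCARD_WHOLE_RESOURCE:
        usage |= DISCARD_RANGE
    return usage
```

After normalization, later code only needs to test the range flag to decide whether existing contents must be read.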
Kenneth Graunke
bacc722d13 iris: Flush the render cache in flush_and_dirty_for_history
BLORP uses the render engine to write to buffers, and we need to flush
that data out to the actual surface (finishing the write).  Then, the
rest of this function invalidates any caches that might have stale data
which needs to be refetched.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7a9e87c224 iris: Implement multi-slice copy_region
I don't know if this is required - surprisingly, I haven't seen it
matter - but I'd like to use it for multi-slice transfer maps.  We may
as well do the right thing.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
307f3f9924 iris: Leave a comment about why Broadwell images are broken
There are a variety of ways to fix this, many of which are simple, but
I could use some advice on which ones other people prefer, and so we'll
punt until after the holidays.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7ed1383c0a iris: Fix surface states for Gen8 lowered-to-untyped images
We have to use SURFTYPE_BUFFER and ISL_FORMAT_RAW for these.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
477e7d575b iris: Fill out brw_image_params for storage images on Broadwell 2019-02-21 10:26:11 -08:00
Kenneth Graunke
7e35333c73 iris: Don't make duplicate system values
We were relying on CSE/GVN/etc to coalesce all intrinsics that load the
same value, but that's a bad idea.  We might have a couple intrinsics
that reload the same value.  If so, we only want to set up the uniform
on the first one we see.
2019-02-21 10:26:11 -08:00
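A sketch of the deduplication in 7e35333c73: a uniform slot is set up only the first time a given system value is seen, and later loads of the same value reuse it. The string names and the `assign_system_values` helper are hypothetical.

```python
def assign_system_values(intrinsics):
    """Map each distinct system value to one uniform slot, in first-seen order."""
    slots = {}
    for sysval in intrinsics:
        if sysval not in slots:
            slots[sysval] = len(slots)  # set up the uniform only once
    return slots
```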
Kenneth Graunke
bc3bb28645 iris: Don't enable push constants just because there are system values
System values are built-in uniforms.  We set them up as UBO values, and
might pull or push them.  UBO push analysis will take care of that.  We
only want to enable push constants if there's an actual range being
pushed.  Otherwise, we might get into a scenario where 3DSTATE_PS
enables push constants but 3DSTATE_CONSTANT_PS isn't pushing anything.

This fixes GPU hangs in Broadwell image load store tests which have
unused image param system values but no other uniforms.  (We shouldn't
be making those anyway, but that's a separate fix...)
2019-02-21 10:26:11 -08:00
Kenneth Graunke
2ca0d913ea iris: Fix framebuffer layer count
cso_fb->layers is only valid for no-attachment framebuffers.  Use the
helper function to get the real value, then stash it so we don't have
to call the helper function on the old value for comparison, or at draw
time for Force Zero RTA Index setting.

This fixes Force Zero RTA Index being set even when attempting layered
rendering.
2019-02-21 10:26:11 -08:00
Dave Airlie
df60241ff7 iris: handle qbo fragment shader invocation workaround 2019-02-21 10:26:11 -08:00
Dave Airlie
5ae2e5aa94 iris: add fs invocations query workaround for broadwell 2019-02-21 10:26:11 -08:00
Dave Airlie
8806b29e16 iris: setup gen8 caps 2019-02-21 10:26:11 -08:00
Dave Airlie
1bbf095473 iris: limit gen8 to 8 samples 2019-02-21 10:26:11 -08:00
Dave Airlie
823609b1a3 iris/WIP: add broadwell support
This adds all the state changes, MOCS changes,
2019-02-21 10:26:11 -08:00
Kenneth Graunke
5be72d9a20 iris: Delete bogus comment about cube array counting.
Both 'z' and 'depth' are counted in slices, according to the Gallium
docs (context.rst).  In our temporary memory, we allocate `box.depth`
slices, so we need to rebase the starting slice (box.z) down to 0,
and back again when writing on unmap.

There's nothing strange about cubes here.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
73709be0c3 iris: Fix compute scratch pinning
Thanks to Eero Tamminen for helping catch this.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
3ab3aa23c2 iris: Add a more long term TODO about timebase scaling 2019-02-21 10:26:11 -08:00
Kenneth Graunke
7ddc1f8ded iris: Only resolve inputs for actual shader stages
We don't need to consider compute at render time, and don't need to
consider disabled stages.  4% on drawoverhead.
2019-02-21 10:26:11 -08:00
Rhys Kidd
6c17e7d95f iris: Fix assertion in iris_resource_from_handle() tiling usage
Assertion error:

  iris_resource_from_handle: Assertion `res->bo->tiling_mode ==
      isl_tiling_to_i915_tiling(res->surf.tiling)' failed.

This patch fixes 16 piglit tests on KBL:
glx/glx-multithread-texture
glx/glx-query-drawable-glx_fbconfig_id-glxpbuffer
glx/glx-query-drawable-glx_fbconfig_id-glxpixmap
glx/glx-query-drawable-glx_preserved_contents
glx/glx-query-drawable-glxpbuffer-glx_height
glx/glx-query-drawable-glxpbuffer-glx_width
glx/glx-query-drawable-glxpixmap-glx_height
glx/glx-query-drawable-glxpixmap-glx_width
glx/glx-swap-pixmap
glx/glx-swap-pixmap-bad
glx/glx-tfp
glx/glx-visuals-depth -pixmap
glx/glx-visuals-stencil -pixmap
spec/egl 1.4/eglcreatepbuffersurface and then glclear
spec/egl 1.4/largest possible eglcreatepbuffersurface and then glclear
spec/egl_nok_texture_from_pixmap/basic

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
2019-02-21 10:26:11 -08:00
Kenneth Graunke
73d525f188 iris: Fix scratch space allocation on Icelake.
Gen9-10 have fewer than 4 subslices per slice, so they need this to be
rounded up.  Gen11 isn't documented as needing this hack, and it can
also have more than 4 subslices, so the hack actually can break things.

Fixes tests/spec/arb_enhanced_layouts/execution/component-layout/
sso-vs-gs-fs-array-interleave
2019-02-21 10:26:11 -08:00
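One way to read 73d525f188 is sketched below. The function name, its parameters, and the exact form of the rounding are assumptions; the message only states that Gen9-10 round up to 4 subslices per slice and that Gen11 should not.

```python
def scratch_subslice_count(gen, num_slices, subslice_total):
    """Number of subslices to size scratch space for (illustrative only)."""
    if 9 <= gen <= 10:
        return 4 * num_slices  # round up: these parts have fewer than 4 per slice
    return subslice_total      # Gen11 can exceed 4 per slice, so use the real count
```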
Kenneth Graunke
154e3e45bb iris: better MOCS 2019-02-21 10:26:11 -08:00
Dave Airlie
aaaf611130 iris: fix gpu calcs for timestamp queries 2019-02-21 10:26:11 -08:00
Kenneth Graunke
3c45d03049 iris: only mark depth/stencil as writable if writes are actually enabled 2019-02-21 10:26:11 -08:00
Kenneth Graunke
3a938a4b23 iris: more dead comments 2019-02-21 10:26:11 -08:00
Kenneth Graunke
e169cb09c3 iris: pin and re-pin the scratch BO 2019-02-21 10:26:11 -08:00
Kenneth Graunke
dd0d47a5d2 iris: delete finished comments 2019-02-21 10:26:11 -08:00
Kenneth Graunke
32ee2e4c27 iris: always pin the binder...in the compute context, too.
not sure why this hasn't tripped things up
2019-02-21 10:26:11 -08:00
Kenneth Graunke
fbfe07c4f3 iris: Track blend enables, save outbound for resolve code 2019-02-21 10:26:11 -08:00
Kenneth Graunke
5481887ca8 iris: whitespace fixes 2019-02-21 10:26:11 -08:00
Kenneth Graunke
b2fa90706e iris: Make an alloc_surface_state helper
This does the gtt_offset addition for us
2019-02-21 10:26:11 -08:00
Kenneth Graunke
b358c4b92b iris: Use a surface state fill helper
This will check aux_usage eventually
2019-02-21 10:26:11 -08:00
Kenneth Graunke
b92ca4d0f6 iris: don't print the pointer in INTEL_DEBUG=submit
lots of noise in diff, hope was it would be useful for gdb, but the
GEM handle is good enough
2019-02-21 10:26:11 -08:00
Kenneth Graunke
ad969a00c0 iris: Fix the prototype for iris_bo_alloc_tiled
This now matches the actual function in iris_bufmgr.c, as well as the
equivalent brw_bufmgr.c function...
2019-02-21 10:26:11 -08:00
Kenneth Graunke
598a78849e iris: Fix for PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET
This fixes ext_transform_feedback-builtin-varyings gl_Position after the
combination of my transform feedback reworks and my vertex buffer
reworks (?)
2019-02-21 10:26:11 -08:00
Kenneth Graunke
392fba5f31 iris: drop unnecessary genx->streamout field 2019-02-21 10:26:11 -08:00
Kenneth Graunke
5307ff6a5f iris: Implement DrawTransformFeedback()
We get the count by dividing the offset by the stride.
2019-02-21 10:26:11 -08:00
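The arithmetic behind 5307ff6a5f, shown on the CPU purely for illustration (the helper name is hypothetical):

```python
def transform_feedback_vertex_count(offset_bytes, stride_bytes):
    """Vertices captured so far = bytes written divided by bytes per vertex."""
    return offset_bytes // stride_bytes
```

For example, a write offset of 96 bytes with a 12-byte stride corresponds to 8 vertices.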
Jason Ekstrand
2e103fff63 iris: Copy anv's MI_MATH helpers for multiplication and division
(import done by Ken but with author set to Jason because it's his
code that's being imported, so he deserves the credit)
2019-02-21 10:26:11 -08:00
Kenneth Graunke
52baba80f3 iris: only get space for one offset in stream output targets
Target corresponds to a buffer, buffer only records one offset, not
multiple.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
31357bae4b iris: Move iris_stream_output_target def to iris_context.h
now that it doesn't have genxml
2019-02-21 10:26:10 -08:00
Kenneth Graunke
cf4931e586 iris: Don't bother packing 3DSTATE_SO_BUFFER at create time
We have to do half the packet late anyway, we may as well just do it
all at set time.  This also lets us move the struct def out of genxml
2019-02-21 10:26:10 -08:00
Kenneth Graunke
754d678b0a iris: Add _MI_ALU helpers that don't paste
This lets you pass arguments as function parameters
2019-02-21 10:26:10 -08:00
Kenneth Graunke
5094062bbe iris: Reorder LRR parameters to have dst first.
LRI and LRM both put dst first, be consistent.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
2f5d85661f iris: rewrite set_vertex_buffer and VB handling
I was using the Gallium API wrong.  set_* functions with start_slot
and count parameters are supposed to update a subrange of the items.
I had been trashing all bound vertex buffers and starting over.

This should hopefully also make it easier to slot in additional
VERTEX_BUFFER_STATEs at draw time, say, for shader draw parameters.
2019-02-21 10:26:10 -08:00
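The subrange semantics described in 2f5d85661f can be sketched as follows. The list-based state and this `set_vertex_buffers` function are stand-ins, and treating a missing buffer array as "unbind the range" is an assumption about the convention.

```python
def set_vertex_buffers(bound, start_slot, count, buffers=None):
    """Update only slots [start_slot, start_slot + count); leave the rest alone."""
    for i in range(count):
        bound[start_slot + i] = buffers[i] if buffers is not None else None
    return bound
```

The earlier behaviour described in the message would correspond to clearing `bound` entirely before the loop, which loses bindings outside the requested range.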
Kenneth Graunke
286b8b8f99 iris: handle PatchVerticesIn as a system value. 2019-02-21 10:26:10 -08:00
Tapani Pälli
96bb328e9b iris: add Android build
Note that at least the following additional libs/components require
changes since they refer to the BOARD_GPU_DRIVERS variable, which is
used to select the driver:

  - mixins
  - minigbm
  - libdrm
  - drm_gralloc

v2: (feedback by Gustaw Smolarczyk) Fix trailing \ in a few cases

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-21 10:26:10 -08:00
Kenneth Graunke
97e82e80f9 iris: override alpha to one src1 blend factors
No idea why this used to pass and doesn't after updating...seems like
we should have been handling it all along...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
90b2745148 iris: Always do rasterizer discard in clipper
but continue doing it in SOL if possible because it's faster

Fixes ./bin/ext_transform_feedback-discard-drawarrays - simpler too
2019-02-21 10:26:10 -08:00
Kenneth Graunke
5f511798d0 iris: Fix primitive generated query active flag 2019-02-21 10:26:10 -08:00
Kenneth Graunke
99cab4d381 iris: Enable guardband clipping 2019-02-21 10:26:10 -08:00
Kenneth Graunke
f062dcdfbb iris: Clamp viewport extents to the framebuffer dimensions
Fixes arb_framebuffer_no_attachments-query's resize subtest.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
fb2df1b5d5 iris: Fix clear dimensions
Fixes depthstencil-render-miplevels 1024 s=z24_s8
2019-02-21 10:26:10 -08:00
Kenneth Graunke
2e79e46d23 iris: Drop continues in resolve
Now that we u_bit_scan we know it exists
2019-02-21 10:26:10 -08:00
Kenneth Graunke
5fde1fa988 iris: Replace num_textures etc with a bitmask we can scan
More accurate bounds, plus can skip dead ones
2019-02-21 10:26:10 -08:00
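A minimal sketch of scanning a bitmask of bound slots instead of looping up to a count. `bit_scan_sketch` is a hypothetical stand-in for Mesa's `u_bit_scan` helper and is not its actual implementation; it assumes a GCC/Clang-style `__builtin_ctz`. `collect_bound_sketch` is likewise made up for illustration.

```c
#include <stdint.h>

/* Return the index of the lowest set bit and clear it.
 * The mask must be non-zero (assumption: GCC/Clang builtin available). */
static inline int
bit_scan_sketch(uint32_t *mask)
{
   int i = __builtin_ctz(*mask);
   *mask &= *mask - 1;   /* clear the lowest set bit */
   return i;
}

/* Visit only the slots whose bit is set; unbound ("dead") slots in the
 * middle of the range are never touched.  Returns how many were visited. */
static unsigned
collect_bound_sketch(uint32_t bound_mask, int *visited, unsigned max)
{
   unsigned n = 0;
   while (bound_mask && n < max)
      visited[n++] = bit_scan_sketch(&bound_mask);
   return n;
}
```

Because every index returned comes from a set bit, the loop body does not need a "does this slot exist?" check, which is presumably why the follow-up commit could drop the `continue`s.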
Kenneth Graunke
7ad7d0beea iris: Fix set_sampler_views with start > 0 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1c6fea8e7b iris: fix set_sampler_views to not unbind, be better about bounds 2019-02-21 10:26:10 -08:00
Kenneth Graunke
598ce8e88e iris: fix overhead regression from flushing for storage images
st calls us with count = 32 but a NULL pointer...we only really care
about the highest non-NULL image...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
4749f6cc4f iris: Fix NOS mechanism
Set bits, not values
2019-02-21 10:26:10 -08:00
Kenneth Graunke
a24734a2d7 iris: re-pin inherited streamout buffers 2019-02-21 10:26:10 -08:00
Kenneth Graunke
19803d0aa7 iris: reemit SBE when sprite coord origin changes
fixes arb_point_sprite-checkerboard
2019-02-21 10:26:10 -08:00
Kenneth Graunke
480c62bc7e iris: omask can kill 2019-02-21 10:26:10 -08:00
Kenneth Graunke
bd031eb2e8 iris: reject all clipping when we can't use streamout render disabled 2019-02-21 10:26:10 -08:00
Kenneth Graunke
72cf2185c8 iris: make clipper statistics dynamic 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1114f0c1ce iris: CS stall for stream out -> VB
i965 doesn't do this, but I suspect it just stalls a lot and doesn't hit
this.  Fixes ext_transform_feedback-position render among others.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c03fbb41aa iris: fix dma buf import strides 2019-02-21 10:26:10 -08:00
Kenneth Graunke
90274bd48f iris: fix alpha channel for RGB BC1 formats 2019-02-21 10:26:10 -08:00
Jason Ekstrand
47d4ea1a16 iris: Allocate buffer resources separately
(cleaned up by Ken - make sure a bunch of things were more obviously
not using res->surf, do allow checking res->surf.tiling == LINEAR,
drop format cpp checks that aren't needed, drop memzone handling for
images, assume buffers / non-buffers in a few places...)
2019-02-21 10:26:10 -08:00
Kenneth Graunke
585c95f8cc iris: Don't bother considering if the underlying surface is a cube
Dave fixed it to consider whether the sampler view is a cube.
With that, there's no point (possibly harm) in looking if the original
resource was a cube...if it's an array view, we don't want to treat it
as a cube anymore...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
773adeb9e9 iris: move some non-buffer case code in a bit 2019-02-21 10:26:10 -08:00
Kenneth Graunke
2c0f001295 iris: Stop leaking iris_uncompiled_shaders like mad
Now shader-db actually executes.  We still need a plan for culling
dead iris_compiled_shaders...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
68d531d7d7 iris: Destroy the bufmgr
Plugs a 12360 byte leak
2019-02-21 10:26:10 -08:00
Kenneth Graunke
7c29c3d01e iris: Fix IRIS_MEMZONE_COUNT to exclude the border color pool
This is supposed to exclude single address zones.  We were getting
too many VMA allocators but failing to set them up, which worked out
because we also forgot to destroy them...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
6cb211121b iris: Unref unbound_tex resource
Plugs a 12536 byte leak
2019-02-21 10:26:10 -08:00
Kenneth Graunke
f73fdb4001 iris: Destroy the border color pool
This plugs a 12224 byte leak
2019-02-21 10:26:10 -08:00
Kenneth Graunke
3d55e9a2aa iris: Destroy transfer helper on screen teardown
Plugs a 16 byte leak
2019-02-21 10:26:10 -08:00
Kenneth Graunke
bdc1269eb2 iris: Fix failed to compile TCS message 2019-02-21 10:26:10 -08:00
Kenneth Graunke
fbf3124771 iris: Rework tiling/modifiers handling
We were being very picky about things being Y tiled.  But, not
everything can be - for example, > 16382 surfaces on SKL GT1-3
have to fall back to linear.

Instead, give ISL options and let it pick.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
761a5fb36a iris: fix conditional compute, don't stomp predicate for pipelined queries 2019-02-21 10:26:10 -08:00
Kenneth Graunke
40b12c103c iris: check query first
this lets us avoid the predicate bit in more cases, which is nice
2019-02-21 10:26:10 -08:00
Kenneth Graunke
0c3ea03e4b iris: for BLORP, only use the predicate enable bit when USE_BIT 2019-02-21 10:26:10 -08:00
Dave Airlie
7bbf3ff4a9 iris: add conditional render support 2019-02-21 10:26:10 -08:00
Kenneth Graunke
dbe198d6ba iris: drop key_size_for_cache
dead since my program cache API rework.  we could still use it for one
function, but it's so trivial to pass the size, that it's probably not
worth the extra code
2019-02-21 10:26:10 -08:00
Dave Airlie
e4115eaca0 iris: iris add load register reg32/64
These will be needed for broadwell and conditional render
2019-02-21 10:26:10 -08:00
Dave Airlie
311a1b3198 iris: execute compute related query on compute batch.
This only happens for the compute invocations query.
2019-02-21 10:26:10 -08:00
Dave Airlie
00645ea01c iris: fix cube texture view 2019-02-21 10:26:10 -08:00
Kenneth Graunke
39d1056d10 iris: fix some SO overflow query bugs and tidy the code a bit 2019-02-21 10:26:10 -08:00
Dave Airlie
527e5bcdc7 iris: add initial transform feedback overflow query paths (V3)
v2: fix cpu overflow calc
v3: use a struct
2019-02-21 10:26:10 -08:00
Kenneth Graunke
0ded23a552 iris: actually flush for storage images 2019-02-21 10:26:10 -08:00
Kenneth Graunke
69e97670bc iris: add an extra BT assert from Chris Wilson 2019-02-21 10:26:10 -08:00
Kenneth Graunke
4312784674 iris: add assertions about binding table starts 2019-02-21 10:26:10 -08:00
Kenneth Graunke
240615695d iris: drop pull constant binding table entry
nothing uses this
2019-02-21 10:26:10 -08:00
Kenneth Graunke
10d04cdaa4 iris: Use program's num textures not the state tracker's bound
the state tracker might bind more textures than the program is using.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
855ff47d36 iris: Enable precompiles 2019-02-21 10:26:10 -08:00
Kenneth Graunke
ed4ffb9715 iris: rework program cache interface
This exposes iris_upload_shader() without having to bind it, which will
be useful for precompiles.  It also lets us examine the old programs and
flag dirty bits at a higher level, rather than cramming all that
knowledge into the cache layer.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
701a6b6006 iris: Use wrappers for create_xs_state rather than a switch statement 2019-02-21 10:26:10 -08:00
Kenneth Graunke
e628095b9a iris: fix comment location 2019-02-21 10:26:10 -08:00
Kenneth Graunke
e5df8913e1 iris: export iris_upload_shader 2019-02-21 10:26:10 -08:00
Kenneth Graunke
d525b3dfad iris: fix prototype warning 2019-02-21 10:26:10 -08:00
Kenneth Graunke
84a8c63527 iris: Re-pin even if nothing is dirty 2019-02-21 10:26:10 -08:00
Kenneth Graunke
415ede346d iris: Flush for history at various moments
When we blit, transfer, or copy_resource to a buffer, we need to flush
to ensure any stale data for that buffer is invalidated in the caches.

bind_history will inform us which caches need to be flushed.

Also, for any push constant buffers, we need to flag those dirty so
that we re-emit 3DSTATE_CONSTANT_*, causing the data to be re-pushed.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c8579e708e iris: add iris_flush_and_dirty_for_history 2019-02-21 10:26:10 -08:00
Kenneth Graunke
d169747a3e iris: Track a binding history for buffer resources
This will let us know what caches to flush / state to dirty when
altering the contents of a buffer.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
f49f506b13 iris: drop long dead XXX comment 2019-02-21 10:26:10 -08:00
Kenneth Graunke
5dbd6df9f7 iris: Do the 48-bit vertex buffer address invalidation workaround 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1b1ea23766 iris: Fix VIEWPORT/LAYER in stream output info
Fixes glsl-1.50-transform-feedback-builtins and
ext_transform_feedback-builtin-varyings gl_PointSize
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c5b22441f1 iris: Fix buffer -> buffer copy_region
Size can be too large for a surf, blorp_buffer_copy chops things up
into segments we can actually handle

Fixes map_buffer_range_test and copy_buffer_coherency
2019-02-21 10:26:10 -08:00
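The idea of chopping an oversized copy into segments can be sketched on the CPU as below. This is not how `blorp_buffer_copy` is implemented; `copy_in_segments_sketch` and its `max_segment` parameter are hypothetical, and the real limit comes from what a surface can describe.

```c
#include <stddef.h>
#include <string.h>

/* Copy `size` bytes in pieces no larger than `max_segment`.
 * Returns the number of segments used (0 if max_segment is 0). */
static unsigned
copy_in_segments_sketch(unsigned char *dst, const unsigned char *src,
                        size_t size, size_t max_segment)
{
   unsigned segments = 0;

   if (max_segment == 0)
      return 0;

   while (size > 0) {
      size_t n = size < max_segment ? size : max_segment;
      memcpy(dst, src, n);
      dst += n;
      src += n;
      size -= n;
      segments++;
   }
   return segments;
}
```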
Kenneth Graunke
beb2d5e065 iris: Lie about indirects
fixes interpolateAt tests
2019-02-21 10:26:10 -08:00
Kenneth Graunke
b9ccb00e2c iris: Enable ctx->Const.UseSTD430AsDefaultPacking
hooray for obscurely named pipe caps with bizarre descriptions!
2019-02-21 10:26:10 -08:00
Kenneth Graunke
39cb10613c iris: update comment 2019-02-21 10:26:10 -08:00
Kenneth Graunke
f9612e7682 iris: RT flush for memorybarrier with texture bit
PIXEL_BUFFER_BARRIER_BIT turns into PIPE_BARRIER_TEXTURE and it ought
to trigger an RT flush, according to brw_memory_barrier
2019-02-21 10:26:10 -08:00
Kenneth Graunke
2c23721397 iris: PIPE_CONTROL workarounds for GPGPU mode 2019-02-21 10:26:10 -08:00
Kenneth Graunke
f1a7392be1 iris: Put batches in an array
We keep re-making this array all over the place
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c2a77efa71 iris: put render batch first in fence code
this shouldn't matter, but it will make the next refactor easier
2019-02-21 10:26:10 -08:00
Kenneth Graunke
d918c09975 iris: flush the compute batch too if border pool is redone 2019-02-21 10:26:10 -08:00
Kenneth Graunke
017b556609 iris: leave a TODO 2019-02-21 10:26:10 -08:00
Chris Wilson
f459c56be6 iris: Add fence support using drm_syncobj 2019-02-21 10:26:10 -08:00
Kenneth Graunke
db199d9d07 iris: Add wait fences to properly sync between render/compute
When flushing a batch due to a data dependency, we need to not only
kick off the other batch's work, but stall our execution until it
completes.  Just wait on last_syncpt after flushing it.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
d69bc4ac12 iris: Hang on to the last batch's sync-point, so we can wait on it 2019-02-21 10:26:10 -08:00
Chris Wilson
fae74234d9 iris: Tag each submitted batch with a syncobj
(adjusted by Ken to make the signalling sync object immediately on
batch reset, rather than batch finish time.  this will work better
with deferred flushes...)
2019-02-21 10:26:10 -08:00
Kenneth Graunke
3e332af611 iris: Drop vestiges of throttling code 2019-02-21 10:26:10 -08:00
Chris Wilson
54347c078e iris: Merge two walks of the exec_bos list 2019-02-21 10:26:10 -08:00
Kenneth Graunke
3455f57575 iris: replace vestiges of fence fds with newer exec_fence API
patch by me and Chris Wilson
2019-02-21 10:26:10 -08:00
Kenneth Graunke
11da219be9 iris: Avoid synchronizing due to the workaround BO 2019-02-21 10:26:10 -08:00
Kenneth Graunke
30d7bebc8a iris: Avoid cross-batch synchronization on read/reads
This avoids flushing batches just because e.g. both are reading the same
dynamic state streaming buffer, or shader assembly buffer.
2019-02-21 10:26:10 -08:00
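The rule behind the commit above, that two batches only need to synchronize on a shared buffer when at least one of them writes it, can be sketched like this. The enum and function names are hypothetical and not taken from the driver.

```c
#include <stdbool.h>

enum bo_access_sketch {
   BO_ACCESS_READ_SKETCH,
   BO_ACCESS_WRITE_SKETCH,
};

/* Decide whether the other batch must be flushed before we use a BO
 * it also references.  Read/read sharing (e.g. both batches reading the
 * same shader assembly buffer) needs no synchronization. */
static bool
needs_cross_batch_flush_sketch(enum bo_access_sketch other_batch,
                               enum bo_access_sketch this_batch)
{
   return other_batch == BO_ACCESS_WRITE_SKETCH ||
          this_batch == BO_ACCESS_WRITE_SKETCH;
}
```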
Kenneth Graunke
b21e916a62 iris: Combine iris_use_pinned_bo and add_exec_bo 2019-02-21 10:26:10 -08:00
Kenneth Graunke
fb4c898842 iris: Use iris_use_pinned_bo rather than add_exec_bo directly
less special this way
2019-02-21 10:26:10 -08:00
Chris Wilson
e5528151a7 iris: Fix assigning the output handle for exporting for KMS
Fixes gbm_bo_get_handle() used for KMS in glamor.
2019-02-21 10:26:10 -08:00
Chris Wilson
01e729f883 iris: Tidy exporting the flink handle 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1b69b14c2a iris: Fix SLM
Now that Jason has set up the L3 we can do this.  Also, my assert was
useless because we hadn't set up the field in the first place.  Oops.
2019-02-21 10:26:10 -08:00
Jason Ekstrand
f9c5e277ac iris: Don't set constant read lengths at upload time
They're set in derived_data as part of store_cs_state
2019-02-21 10:26:10 -08:00
Jason Ekstrand
a90a0e22cb iris: Configure the L3$ on the compute context 2019-02-21 10:26:10 -08:00
Kenneth Graunke
25a41b1aef iris: properly pin stencil buffers 2019-02-21 10:26:10 -08:00
Kenneth Graunke
8545e39808 iris: Fix TCS/TES slot unification
TCS outputs, TES inputs...not TCS inputs

Fixes some barrier tests
2019-02-21 10:26:10 -08:00
Kenneth Graunke
da5590496e iris: more todo notes 2019-02-21 10:26:10 -08:00
Kenneth Graunke
9878ea842f iris: scissored and mirrored blits 2019-02-21 10:26:10 -08:00
Kenneth Graunke
25f194d5ac iris: more TODO 2019-02-21 10:26:10 -08:00
Kenneth Graunke
5207a5f5d5 iris: Fix independent alpha blending.
independent_blend_enable means per-RT blending, not RGB != A
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c06f6d12a5 iris: "Fix" transfer maps of buffers
x should be in bytes, not cpp units

This generally worked out because PIPE_BUFFER is supposedly required
to be R8_UINT or R8_UNORM.  I hear some state trackers pass
PIPE_FORMAT_NONE instead, however, which would make this break.

Just do the right thing directly, to be defensive and clear.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
b2c04aa3a0 iris: Fix SourceAlphaBlendFactor 2019-02-21 10:26:10 -08:00
Kenneth Graunke
89833eddab iris: leave another TODO 2019-02-21 10:26:10 -08:00
Kenneth Graunke
983e2ae7d2 iris: only clip lower if there's something to clip against 2019-02-21 10:26:10 -08:00
Kenneth Graunke
e11c497fc6 iris: fix sysval only binding tables 2019-02-21 10:26:10 -08:00
Kenneth Graunke
2ddbc1025e iris: don't forget to upload CS consts 2019-02-21 10:26:10 -08:00
Kenneth Graunke
f1f84a1ae7 iris: drop param stuffs 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1b5d35319e iris: don't trip on param asserts
I'd rather not rewrite i965's compute system value handling right now :(
2019-02-21 10:26:10 -08:00
Kenneth Graunke
f4829a2fe1 iris: don't support pull constants.
I don't think it matters, we won't have any params anyway, but let's
be sure it doesn't try
2019-02-21 10:26:10 -08:00
Kenneth Graunke
911f9e8f3f iris: regather info so we get CLIP_DIST slots, not CLIP_VERTEX 2019-02-21 10:26:09 -08:00
Kenneth Graunke
6d19fe376d iris: enable push constants if we have sysvals but no uniforms 2019-02-21 10:26:09 -08:00
Kenneth Graunke
1ef68d77c0 iris: drop iris_setup_push_uniform_range
it doesn't do anything, we have no params.  I guess I thought there
would be some, but they all get dead code eliminated even if we try
to make them exist in the first place.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
7eeb124c02 iris: fix more uniform setup 2019-02-21 10:26:09 -08:00
Kenneth Graunke
50743eb748 iris: fix num clip plane consts 2019-02-21 10:26:09 -08:00
Kenneth Graunke
a98634a28f iris: actually upload clip planes. 2019-02-21 10:26:09 -08:00
Kenneth Graunke
c60ce3f4fd iris: bypass params and do it ourselves
the backend keeps dead code eliminating them all, so we can't do that,
plus we don't want to because params[] is lame
2019-02-21 10:26:09 -08:00
Kenneth Graunke
78fc760bab iris: dodge backend UCP lowering 2019-02-21 10:26:09 -08:00
Kenneth Graunke
deb6d588a6 iris: fix system value remapping 2019-02-21 10:26:09 -08:00
Kenneth Graunke
2b0a2915dc iris: hook up key stuff for clip plane lowering 2019-02-21 10:26:09 -08:00
Kenneth Graunke
2876dd1a37 iris: lower user clip planes 2019-02-21 10:26:09 -08:00
Kenneth Graunke
80c856cbee iris: only bother with params if there are any... 2019-02-21 10:26:09 -08:00
Kenneth Graunke
2186d83185 iris: fill out params array with built-ins, like clip planes 2019-02-21 10:26:09 -08:00
Kenneth Graunke
d3e8ff143d iris: add param domain defines 2019-02-21 10:26:09 -08:00
Kenneth Graunke
ecb28b2802 iris: drop unnecessary param[] setup from iris_setup_uniforms
the backend just considers these dead anyway
2019-02-21 10:26:09 -08:00
Kenneth Graunke
ed08f022f0 iris: Defer cbuf0 upload to draw time 2019-02-21 10:26:09 -08:00
Kenneth Graunke
e98cf9c24b iris: Clone the NIR
The backend compiler used to do this for us, but after a rebase, it's
now the driver's responsibility.  This lets us alter it for say, clip
vertex lowering, at the global level rather than the per-variant level.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
587e438128 iris: Print the batch name when decoding 2019-02-21 10:26:09 -08:00
Kenneth Graunke
2727a942a4 iris: partial set_query_active_state
used to avoid OQ during clears for example

fixes occlusion_query_meta_no_fragments
2019-02-21 10:26:09 -08:00
Kenneth Graunke
64af1d9248 iris: Fix multiple RTs with non-independent blending
rt[i] isn't filled out in this case, so we have to use rt[0]
2019-02-21 10:26:09 -08:00
Kenneth Graunke
58507c02ce iris: Fix TextureBarrier
I don't know how I came up with the old one, this is now what i965 does
Also we now do compute batches too
2019-02-21 10:26:09 -08:00
Kenneth Graunke
e5d84bbd36 iris: Fix MSAA smooth points
Fixes bin/ext_framebuffer_multisample-point-smooth 2 -auto -fbo
2019-02-21 10:26:09 -08:00
Kenneth Graunke
4d219b0eb3 iris: implement scratch space!
we borrow the approach from anv rather than i965, as it works better
with pre-baked state that needs to contain scratch BO addresses

fixes a bunch of varying packing tests
2019-02-21 10:26:09 -08:00
Kenneth Graunke
9511b89ef9 iris: tidy more warnings 2019-02-21 10:26:09 -08:00
Kenneth Graunke
846316b258 iris: Enable msaa_map transfer helpers
This does the downsampling for us.  It'll use BLORP anyway because
it uses blit(), and that uses BLORP.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
9ec927497e iris: Actually create/destroy HW contexts
The intention is that render and compute use their own contexts,
and each is PIPELINE_SELECT'd to the right pipeline.  But we hadn't
actually made them, so we got the fd-default context.

Thanks to Chris Wilson for catching this!
2019-02-21 10:26:09 -08:00
Kenneth Graunke
cb5f47f585 iris: Don't leak the compute batch 2019-02-21 10:26:09 -08:00
Kenneth Graunke
fbe5d75f11 iris: cross batch flushing 2019-02-21 10:26:09 -08:00
Kenneth Graunke
c3cc525c7a iris: Cross-link iris_batches so they can potentially flush each other
This makes e.g. the render batch aware of the compute batch, so it can
ask questions like "is this BO referenced by some other batch?" and do
something about that.
2019-02-21 10:26:09 -08:00
Dave Airlie
ed016b2a0b iris: fix crash in sparse vertex array
this fixes crash in array-stride piglit.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
bcac11c8f1 iris: Use at least 1x1 size for null FB surface state.
Otherwise we get 0 - 1 = 0xffffffff and fail to pack SURFACE_STATE.

Fixes some object namespace pollution gltexsubimage2d tests
2019-02-21 10:26:09 -08:00
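The underflow mentioned in the commit above is ordinary unsigned wraparound. A small sketch, assuming a hardware field that stores "dimension minus one" (both function names are hypothetical):

```c
#include <stdint.h>

/* Encode a dimension the way a "value - 1" hardware field would. */
static uint32_t
encode_dim_minus_one_sketch(uint32_t dim)
{
   return dim - 1;   /* wraps to 0xffffffff when dim == 0 */
}

/* Clamp a possibly-zero framebuffer dimension to at least 1. */
static uint32_t
null_surface_dim_sketch(uint32_t dim)
{
   return dim > 1 ? dim : 1;
}
```

With the clamp applied first, a 0x0 framebuffer encodes as 0 (a 1x1 surface) rather than as a value too large to pack.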
Kenneth Graunke
9c8fdf8133 iris: Drop B5G5R5X1 support
This is oddly renderable but not supported for sampling, which is the
opposite of other X formats.  Just skip it and fall back to BGRA.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
4b31f506f8 iris: Enable A8/A16_UNORM in an inefficient manner
These currently just use the 'A' hardware formats, rather than the
faster 'R' formats.  glBitmap handling needs these, it seems. :(
2019-02-21 10:26:09 -08:00
Kenneth Graunke
80497af192 iris: Enable ARB_shader_stencil_export 2019-02-21 10:26:09 -08:00
Kenneth Graunke
3e6aaa1ba5 iris: Disable a PIPE_CONTROL workaround on Icelake 2019-02-21 10:26:09 -08:00
Kenneth Graunke
84a419432d iris: Flag constants dirty on program changes
3DSTATE_CONSTANT_* looks at prog_data->ubo_ranges.  We were getting
saved by iris_set_constant_buffers() usually happening when changing
programs (as they usually change uniforms too), but with the clear
shader that doesn't use uniforms, we weren't getting one and were
leaving push constants enabled, screwing things up.

Also clean up a bit of a mess left by the hacks - we were missing
bindings in the VS/FS/CS case, among other issues...
2019-02-21 10:26:09 -08:00
Kenneth Graunke
317ba8796f iris: allow binding a null vertex buffer
PBO upload apparently does this...
2019-02-21 10:26:09 -08:00
Kenneth Graunke
aef1ba5ce4 iris: fix overhead regression from "don't stomp each other's dirty bits"
The change from dirty = 0ull to dirty &= ~NOT_MY_BITS broke the "nothing
to do?  skip it!" optimization.  thanks to Chris for noticing this!
2019-02-21 10:26:09 -08:00
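One way to see the regression described above: once each context clears only its own dirty bits, a check of the form "is `dirty` zero?" no longer fires while the other context still has bits pending. The sketch below tests only the caller's own bits instead. The flag names and the function are hypothetical, and this is an assumption about the shape of the fix, not the actual driver code.

```c
#include <stdint.h>
#include <stdbool.h>

#define DIRTY_RENDER_A_SKETCH (1ull << 0)
#define DIRTY_RENDER_B_SKETCH (1ull << 1)
#define DIRTY_COMPUTE_SKETCH  (1ull << 2)
#define ALL_RENDER_SKETCH \
   (DIRTY_RENDER_A_SKETCH | DIRTY_RENDER_B_SKETCH)

/* Returns true if any render state had to be emitted. */
static bool
upload_render_state_sketch(uint64_t *dirty)
{
   /* "Nothing to do? Skip it!": look only at the bits we own. */
   if (!(*dirty & ALL_RENDER_SKETCH))
      return false;

   /* ... emit render state here ... */

   /* Clear only what we processed; compute bits stay pending. */
   *dirty &= ~ALL_RENDER_SKETCH;
   return true;
}
```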
Kenneth Graunke
525d89cafc iris: delete dead code 2019-02-21 10:26:09 -08:00
Kenneth Graunke
8a98e90415 iris: Fix refcounting of grid surface 2019-02-21 10:26:09 -08:00
Jason Ekstrand
8e8868d5ad iris/compute: Zero out the last grid size on indirect dispatches 2019-02-21 10:26:09 -08:00
Jason Ekstrand
c16e711ff2 iris/compute: Don't increment the grid size offset
It may be in the dynamic state buffer but the fact that we have a
resource takes care of that.  We don't need to add in the address of
the dynamic state buffer again.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
a3e813c5af iris: SO_DECL_LIST fix 2019-02-21 10:26:09 -08:00
Kenneth Graunke
927c4a21bd iris: Fall back to 1x1x1 null surface if no framebuffer supplied
If the state tracker never gave us the framebuffer dimensions via
a set_framebuffer_state() call, just fall back to the unbound texture
null surface, which is 1x1x1.  Otherwise we'd use a NULL resource
(no pun intended).
2019-02-21 10:26:09 -08:00
Kenneth Graunke
5d1a9db720 iris: Fix off by one in scissoring, empty scissors, default scissors 2019-02-21 10:26:09 -08:00
Kenneth Graunke
938d63b2e8 iris: Move snapshots_landed to the front.
Transform feedback overflow queries need to write additional data,
and it would be nice to have this field remain at a consistent offset.
2019-02-21 10:26:09 -08:00
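Why putting the field first keeps its offset consistent can be sketched with two layouts that share a leading member. Both struct definitions are hypothetical illustrations, not the driver's actual query structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for an ordinary begin/end query. */
struct query_snapshots_sketch {
   uint64_t snapshots_landed;   /* first, so always at offset 0 */
   uint64_t start;
   uint64_t end;
};

/* Hypothetical layout for an overflow query that records more data. */
struct so_overflow_sketch {
   uint64_t snapshots_landed;   /* same offset as above */
   struct {
      uint64_t prims_begin[2];
      uint64_t prims_end[2];
   } stream[4];
};

/* Code that polls the flag can use one offset for both layouts. */
_Static_assert(offsetof(struct query_snapshots_sketch, snapshots_landed) ==
               offsetof(struct so_overflow_sketch, snapshots_landed),
               "snapshots_landed must sit at the same offset");
```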
Kenneth Graunke
ba2a4207f9 iris: Clamp UBO and SSBO access to the actual BO size, for safety 2019-02-21 10:26:09 -08:00
Kenneth Graunke
a9b32f2bbf iris: Fix texture buffer / image buffer sizes.
Also fix image buffers with offsets.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
d1f8947792 iris: fix SF_CLIP_VIEWPORT array indexing with multiple VPs
fixes bunches of viewport stuffs
2019-02-21 10:26:09 -08:00
Kenneth Graunke
5bd49a47b6 iris: flag CC_VIEWPORT when changing num viewports
this also has a loop over num_viewports
2019-02-21 10:26:09 -08:00
Kenneth Graunke
d98967d936 iris: fix UBOs with bindings that have an offset 2019-02-21 10:26:09 -08:00
Kenneth Graunke
3f70956a4e iris: try and avoid pointless compute submissions
if apps don't use compute shaders, we don't even want to kick off the
compute initialization batch
2019-02-21 10:26:09 -08:00
Kenneth Graunke
97125e9bb3 iris: fix SBA flushing by refactoring code 2019-02-21 10:26:09 -08:00
Kenneth Graunke
8fa99481e7 iris: do PIPELINE_SELECT for render engine, add flushes, GLK hacks 2019-02-21 10:26:09 -08:00
Kenneth Graunke
b2d223b6bf iris: hack to avoid memorybarriers out the wazoo
we don't want to emit piles of pipe controls to a compute batch if
it isn't necessary...

prevents double-batch-wraps in cs-op-selection-bool-bvec4-bvec4
(but it's still kinda a big ol' hack...)
2019-02-21 10:26:09 -08:00
Kenneth Graunke
b3a40c27a2 iris: don't let render/compute contexts stomp each other's dirty bits
only clear what you process
2019-02-21 10:26:09 -08:00
Kenneth Graunke
f8796079da iris: better dirty checking 2019-02-21 10:26:09 -08:00
Kenneth Graunke
06a993dac2 iris: rewrite grid surface handling
now we only upload a new grid when it's actually changed, which saves us
from having to emit a new binding table every time it changes.

this also moves a bunch of non-gen-specific stuff out of iris_state.c
2019-02-21 10:26:09 -08:00
Kenneth Graunke
155e1a63d5 iris: XXX for compute state tracking :/
Maybe we should just move dirty to batch, it would help with the
reset stuff too
2019-02-21 10:26:09 -08:00
Kenneth Graunke
643030f4fb iris: fix whitespace 2019-02-21 10:26:09 -08:00
Kenneth Graunke
b0dc11993e iris: bail if SLM is needed 2019-02-21 10:26:09 -08:00
Kenneth Graunke
973b937cac iris: leave XXX about unnecessary binding table uploads 2019-02-21 10:26:09 -08:00
Kenneth Graunke
7fb8c20d7b iris: drop unnecessary #ifdefs 2019-02-21 10:26:09 -08:00
Kenneth Graunke
549db5b90e iris: drop XXX that Jordan handled 2019-02-21 10:26:09 -08:00
Jordan Justen
942bdb2906 iris/compute: Support indirect compute dispatch
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
b35c8f2182 iris/compute: Push subgroup-id
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
229450a2a6 iris/compute: Flush compute batch on memory-barriers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
fb4637797e iris/compute: Provide binding table entry for gl_NumWorkGroups
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
fcd0364857 iris/compute: Wait on compute batch when mapping
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
ea416d0b5d iris/program: Don't try to push ubo ranges for compute
We only can push constants for compute shaders from one range.

Gallium glsl-to-nir (src/mesa/state_tracker/st_glsl_to_nir.cpp) lowers
all uniform accesses to a ubo.

Unfortunately we also load the subgroup-id as a uniform in the
compiler. Since we use the 1 push range for this subgroup-id, we then
lose the ability to actually push the ubo with all the normal user
uniform values.

In other words, there is lots of room for performance improvement, but
at least retrieving the uniforms as pull-constants is functional for
now.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
c7cfa4000f iris/compute: Get group counts from grid->grid
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
fd9ccd8b5d iris/compute: Flush compute batches
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
9b5cda95aa iris/compute: Add MEDIA_STATE_FLUSH following WALKER
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
6ebd04ac8f iris: Add iris_restore_compute_saved_bos
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
622aaa290f iris: Add IRIS_DIRTY_CONSTANTS_CS
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
25f1625edf iris/compute: Set mask bits on PIPELINE_SELECT
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Kenneth Graunke
9fc672428d iris: little bits of compute basics 2019-02-21 10:26:09 -08:00
Kenneth Graunke
860ce6af3f iris: drop XXX's about swizzling
pretty sure this is unnecessary on modern HW
2019-02-21 10:26:09 -08:00
Kenneth Graunke
12de56f53d iris: drop dead format //'s
these just aren't supported
2019-02-21 10:26:09 -08:00
Kenneth Graunke
f6c68066a6 iris: yes 2019-02-21 10:26:09 -08:00
Kenneth Graunke
752abeb690 iris: initial compute caps
RET macro borrowed from freedreno
2019-02-21 10:26:09 -08:00
Kenneth Graunke
4da28c2c22 iris: Enable fb fetch
needed for ES 3.2
2019-02-21 10:26:09 -08:00
Kenneth Graunke
be905bd461 iris: advertise GL_ARB_shader_texture_image_samples 2019-02-21 10:26:09 -08:00
Jordan Justen
6441e906e8 iris: Set num_uniforms in bytes
Ref: brw_nir_lower_uniforms, type_size_scalar_bytes

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Kenneth Graunke
c29fd34259 iris: move images next to textures in binding table 2019-02-21 10:26:09 -08:00
Kenneth Graunke
0d9c5b4e7e iris: null for non-existent cbufs
prevents BTs from being shifted down incorrectly
2019-02-21 10:26:09 -08:00
Kenneth Graunke
98e8f80e7d iris: actually set image access 2019-02-21 10:26:09 -08:00
Jason Ekstrand
d9aee25a46 iris: Don't lower image formats for write-only images 2019-02-21 10:26:09 -08:00
Kenneth Graunke
a06f0fe517 iris: set image access correctly 2019-02-21 10:26:09 -08:00
Kenneth Graunke
5d1dadfc38 iris: bother with BTIs 2019-02-21 10:26:09 -08:00
Kenneth Graunke
f5b887da6c iris: implement set_shader_images hook 2019-02-21 10:26:09 -08:00
Kenneth Graunke
26a54ae4b2 iris: lower storage image derefs 2019-02-21 10:26:09 -08:00
Kenneth Graunke
e97a24da89 iris: set the binding table size
we weren't doing mark_surface_used on images (i965 does it while
uploading the unnecessary image uniforms), so our binding tables were
too small...
2019-02-21 10:26:09 -08:00
Kenneth Graunke
28b41992c8 iris: X32_S8X24 :/
This can happen when faking Z32_S8X24 and setting StencilSampling = true

I guess we'll just turn it into S8_UINT...

Fixes KHR-GL45.texture_swizzle.functional
2019-02-21 10:26:09 -08:00
Kenneth Graunke
6e7957a22d iris: enable I/L formats 2019-02-21 10:26:09 -08:00
Kenneth Graunke
bfbebbaa36 iris: Use R/RG instead of I/L/A when sampling 2019-02-21 10:26:09 -08:00
Kenneth Graunke
94569a6458 iris: rework format translation apis 2019-02-21 10:26:09 -08:00
Kenneth Graunke
b9eeed3e8f iris: Allow PIPE_CONTROL with Stall at Scoreboard and RT flush
It's nonsensical, but not illegal, and mandatory on Icelake
2019-02-21 10:26:09 -08:00
Kenneth Graunke
65d1cda995 iris: add gen11 to genX_call 2019-02-21 10:26:09 -08:00
Kenneth Graunke
0fdcb20803 iris: inline stage_from_pipe to avoid unused warnings 2019-02-21 10:26:09 -08:00
Kenneth Graunke
6fbb6ba290 iris: pipe to scs -> iris_pipe.h 2019-02-21 10:26:09 -08:00
Kenneth Graunke
87351b8dfe iris: force persample interp cap 2019-02-21 10:26:09 -08:00
Kenneth Graunke
90b9efc1f9 iris: stencil texturing 2019-02-21 10:26:09 -08:00
Kenneth Graunke
9b229d266d iris: fix Z32_S8 depth sampling
We were accidentally using the ISL_FORMAT_R32_FLOAT_X8X24_TYPELESS
format, which is NOT what we use.  We just store R32_FLOAT depth.

fixes Piglit's texwrap GL_ARB_depth_buffer_float
2019-02-21 10:26:09 -08:00
Kenneth Graunke
822f91508e iris: don't mark contains_draw = false when chaining batches
chaining to a new batch reuses create_batch(), but we don't need to do
the work of pinning BOs we inherit from a previous batch...when that is
actually part of the same execbuf invocation.

instead, just flag it when setting primary_batch_size = 0, in
iris_batch_reset
2019-02-21 10:26:09 -08:00
Kenneth Graunke
294ce58a30 iris: vma_free bo->size, not bo_size
this is more obviously correct.  I think the two end up being the same
in practice, since this is in the alloc_from_cache case, and presumably
bo from the bucket has bo->size == bucket->size, and bo_size also is
bucket->size...

still.  better to do the obvious thing.

brw_bufmgr already does it this way.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
2f24000662 iris: drop a bunch of pipe_sampler_state stuff we don't need 2019-02-21 10:26:09 -08:00
Kenneth Graunke
c6016d3761 iris: just mark snapshots_landed from the CPU
otherwise, get results may check q->map->snapshots_landed...before our
commands to initialize it to false have actually executed...so it'd get
some random garbage from the BO...
2019-02-21 10:26:09 -08:00
Kenneth Graunke
3c0ef22edb iris: Enable ARB_shader_vote
The easiest get out the vote campaign ever
2019-02-21 10:26:08 -08:00
Kenneth Graunke
0395eba20f iris: magic number 36 -> #define 2019-02-21 10:26:08 -08:00
Kenneth Graunke
57f8a623c5 iris: better query file comment 2019-02-21 10:26:08 -08:00
Kenneth Graunke
d3a5d87219 iris: early return properly 2019-02-21 10:26:08 -08:00
Kenneth Graunke
07ff8c752f iris: 36-bit overflow fixes 2019-02-21 10:26:08 -08:00
Kenneth Graunke
dff174c103 iris: Need to | 1 when asking for timestamps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
1d91eba7dc iris: glGet timestamps, more correct timestamps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
36fbcfb06c iris: ...and SO prims emitted queries
looks like we have queries

some fails still due to races between snapshots_written and start/end
not being garbage...not sure what that's about
2019-02-21 10:26:08 -08:00
Kenneth Graunke
ec82be57e8 iris: timestamps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
23572cdd07 iris: drop explicit pinning
writes will already rw_bo or ro_bo that
2019-02-21 10:26:08 -08:00
Kenneth Graunke
d8875fe406 iris: primitives generated query support 2019-02-21 10:26:08 -08:00
Kenneth Graunke
ffae6e3105 iris: pipeline stats 2019-02-21 10:26:08 -08:00
Kenneth Graunke
7840d0e091 iris: play chicken with timer queries for now
they have been crashy in the past and I don't want to risk tanking my
laptop right before my XDC talk
2019-02-21 10:26:08 -08:00
Kenneth Graunke
0b095c665d iris: gpr0 to bool
I think OQ is basically working now.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
f5a8908bd1 iris: fix random failures via CS stall...but why? 2019-02-21 10:26:08 -08:00
Kenneth Graunke
ad14795805 iris: flush batch when asking for result via QBO 2019-02-21 10:26:08 -08:00
Kenneth Graunke
cf261caad9 iris: results write 2019-02-21 10:26:08 -08:00
Kenneth Graunke
d4e4517569 iris: gen10+ workarounds and break fix 2019-02-21 10:26:08 -08:00
Kenneth Graunke
dca5632de1 iris: initial query code 2019-02-21 10:26:08 -08:00
Kenneth Graunke
dd478913d5 iris: LRM/SRM/SDI hooks 2019-02-21 10:26:08 -08:00
Kenneth Graunke
af9fe0d472 iris: rw_bo for pipe controls
this is used for WRITE IMMEDIATE...
but maybe we don't want to for the workaround BO?
2019-02-21 10:26:08 -08:00
Kenneth Graunke
30c370ed4b iris: use 0 for TCS passthrough program string ID
the passthrough shader doesn't need a real program string ID - that's
basically used for ARB programs indicating total program source code
changes, or other pre-baked uniform changes, etc...none of which a
passthrough shader has...so we don't need a unique identifier to
distinguish them.  We want to use a consistent value so we find
existing passthrough shaders in the cache.
2019-02-21 10:26:08 -08:00
Caio Marcelo de Oliveira Filho
54e23442e2 iris: Add support for TCS passthrough
If no TCS is provided, create a "passthrough" TCS that will take the
default values set in the API as constants and pass to the TES, along
with any other inputs it expects.  The code to create the NIR shader
is the same as in i965.

Tested with

    ./piglit run -t 'tess' quick_shader r

and fixed a dozen crashes from that list.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
5395658c61 iris: inherit the index buffer properly 2019-02-21 10:26:08 -08:00
Kenneth Graunke
a858b69880 iris: delete bogus comment
Caio asked what was wrong.  There is nothing wrong.  :)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
f2f506fa43 iris: properly re-pin stencil buffers 2019-02-21 10:26:08 -08:00
Kenneth Graunke
aaced066e8 iris: fix context restore of 3DSTATE_CONSTANT ranges
if clean we want to DO the pinning...not SKIP the pinning.

thanks to Jordan Justen for catching this!
2019-02-21 10:26:08 -08:00
Kenneth Graunke
58a6c99ebe iris: silence const warning
not sure why this is labeled const, I'm pretty sure we are taking the
reference and owning this, so there's no particular reason we can't
change it.  it certainly seems to be working for non-compute.  and,
freedreno's ir3_shader.c seems to do this as well.  still...gross :/
2019-02-21 10:26:08 -08:00
Kenneth Graunke
897f8d9232 iris: refactor program CSO stuff 2019-02-21 10:26:08 -08:00
Caio Marcelo de Oliveira Filho
fb4a3e2736 iris: Fix uses of gl_TessLevel*
The backend compiler expects the gl_TessLevel* variables to be mapped
as inputs instead of system values.  Use the new PIPE_CAP to get this
behavior from GLSL compiler.

Tested with:
tests/spec/arb_tessellation_shader/execution/vs-tcs-tes-tessinner-tessouter-inputs-quads.shader_test
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2b956a093a iris: totally untested icelake support 2019-02-21 10:26:08 -08:00
Kenneth Graunke
921790b080 iris: initialize "don't suck" bits, as Ben likes to call them 2019-02-21 10:26:08 -08:00
Kenneth Graunke
73a4cef220 iris: refactor LRIs in context setup
we're going to have more of them, so reduce the boilerplate
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2d1db44e8e iris: enable ARB_enhanced_layouts 2019-02-21 10:26:08 -08:00
Kenneth Graunke
c0422d623c iris: re-pin binding table contents if we didn't re-emit them
fixes glsl-vs-loop and other regressions from multibinder.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2963276a58 iris: move binder pinning outside the dirty == 0 check
This might be a new batch with back-to-back non-dirty calls; if so, we
need to inherit the old binder...
2019-02-21 10:26:08 -08:00
Chris Wilson
1a61a211f0 iris: fix memzone_for_address since multibinder changes 2019-02-21 10:26:08 -08:00
Kenneth Graunke
f6924e2379 iris: update comments for multibinder 2019-02-21 10:26:08 -08:00
Kenneth Graunke
5cb0527c4f iris: fix SO offset writes for multiple streams 2019-02-21 10:26:08 -08:00
Kenneth Graunke
eff081cdd9 iris: Support multiple binder BOs, update Surface State Base Address 2019-02-21 10:26:08 -08:00
Kenneth Graunke
148e315d96 iris: fix null FB and unbound tex surface state addresses 2019-02-21 10:26:08 -08:00
Kenneth Graunke
f838400a59 iris: set EXEC_OBJECT_CAPTURE on all driver internal buffers 2019-02-21 10:26:08 -08:00
Kenneth Graunke
938afd484a iris: fix constant buffer 0 to be absolute
thanks to Jason for catching this.  Fixes some va64 tests.  Surprisingly
not much else, as apparently getting to UBO range 4 is uncommon!
2019-02-21 10:26:08 -08:00
Kenneth Graunke
5a2257bb2f iris: don't unconditionally emit 3DSTATE_VF / 3DSTATE_VF_TOPOLOGY
this was just laziness on my part
2019-02-21 10:26:08 -08:00
Kenneth Graunke
4c27cb031c iris: skip over whole function if dirty == 0
kinda pointless in non-pathological cases, but does boost our score in
the drawarrays case by 50%...
2019-02-21 10:26:08 -08:00
Kenneth Graunke
888efcd192 iris: Allow inlining of require/get_command_space
eliminates so many callqs for ptr++
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2ebce6f8c8 iris: use Eric's new caps helper
this does change a couple caps...PRIMITIVE_RESTART_FOR_PATCHES...
2019-02-21 10:26:08 -08:00
Kenneth Graunke
3e7a41f228 iris: new caps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
52eb8d5593 iris: fix blend state memcpy
thanks to Jason for noticing grumpy valgrind
2019-02-21 10:26:08 -08:00
Kenneth Graunke
9ce92fa036 iris: Skip primitive ID overrides if the shader wrote a custom value
Fixes glsl-1.50/execution/geometry/primitive-id-out
2019-02-21 10:26:08 -08:00
Kenneth Graunke
47d3019c4a iris: fix crash when binding optional shader for the first time 2019-02-21 10:26:08 -08:00
Kenneth Graunke
6331b754df iris: handle level/layer in direct maps
needed now that we do 1D linear
2019-02-21 10:26:08 -08:00
Kenneth Graunke
9f7654139b iris: use linear for 1D textures
This gets us the gen9 compact linear storage
2019-02-21 10:26:08 -08:00
Kenneth Graunke
b2a5e1ebb3 iris: big old hack for tex-miplevel-selection
copied from ilo.  I don't understand this at all..
2019-02-21 10:26:08 -08:00
Kenneth Graunke
e4d22b16c8 iris: fix sampler state setting 2019-02-21 10:26:08 -08:00
Kenneth Graunke
b3bb33c4c1 iris: try to hack around binder issue 2019-02-21 10:26:08 -08:00
Kenneth Graunke
d2516358f9 iris: fix line-aa-width
we should probably move the roundf to st_atom_raster
2019-02-21 10:26:08 -08:00
Kenneth Graunke
701b47a197 iris: implement get_sample_position
Fixes arb_sample_shading/builtin-gl-sample-position
2019-02-21 10:26:08 -08:00
Kenneth Graunke
7ed4b80233 iris: z_res -> s_res
fixes crashes introduced a few commits ago
2019-02-21 10:26:08 -08:00
Kenneth Graunke
d1cb4b330a iris: reenable R32G32B32 texture buffers
This dropped us from GL 4.2 to GL 3.3 by mistake.  Thanks to Dave for
catching this!
2019-02-21 10:26:08 -08:00
Chris Wilson
367f6bbd01 iris: Record reusability of bo on construction
We know that if the bufmgr->reuse is set to false or if the bo is too
large for a bucket, the same will be true when we come to free the bo.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
abe7dbfa4a iris: Reduce binder alignment from 64 to 32
3DSTATE_BINDING_TABLE_POINTER_XS's alignment requirement is only 32B.

Makes us waste less precious binder space.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
04e8c5bb43 iris: precompute hashes for cache tracking
saves a touch of cpu overhead in the new resolve tracking
2019-02-21 10:26:08 -08:00
Chris Wilson
d209cc5170 iris: AMD_pinned_memory
(rebased by Ken, mainly set res->internal_format)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
93c1921ce2 iris: proper cache tracking
this is copied from the i965 aux resolve stuff...minus the aux resolves
2019-02-21 10:26:08 -08:00
Kenneth Graunke
5e30b1083b iris: Move cache tracking to iris_resolve.c 2019-02-21 10:26:08 -08:00
Kenneth Graunke
42dccb1233 iris: use consistent copyright formatting
some of them had typos, didn't say 'authors or copyright holders',
or other mistakes.  This is now https://opensource.org/licenses/MIT
text, formatted consistently.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
1d33982e9b iris: track depth/stencil writes enabled 2019-02-21 10:26:08 -08:00
Kenneth Graunke
3fecb1c44d iris: Move iris_sampler_view declaration to iris_resource.h
We'll need this for resolve tracking.  There's also no genxml stuff here
2019-02-21 10:26:08 -08:00
Kenneth Graunke
b75b52530a iris: Move things to iris_shader_state
We didn't originally have this struct, so we had lots of ad-hoc arrays.
Now that we have it, it makes sense to group things there.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
410a555bfb iris: move iris_shader_state from ice->shaders.state to ice->state.shaders
it's more state related...
2019-02-21 10:26:08 -08:00
Kenneth Graunke
33701d5341 iris: Drop bogus sampler state saving
We do this in an earlier loop.  This was just reading things out of the
array, and saving them back over the same array...but in the wrong slots
2019-02-21 10:26:08 -08:00
Kenneth Graunke
aba2cee711 iris: rename pipe to base 2019-02-21 10:26:08 -08:00
Kenneth Graunke
7705f62cb6 iris: don't emit SBE all the time 2019-02-21 10:26:08 -08:00
Kenneth Graunke
630d602900 iris: port non-bucket alignment bugfix
Sergii's 24839663a4
2019-02-21 10:26:08 -08:00
Kenneth Graunke
ad6ba5a712 iris: drop pwrite
nobody uses it
2019-02-21 10:26:08 -08:00
Kenneth Graunke
aad70ad8a1 iris: drop dead assignments
Eric's commit 9a6a631762
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2bd7d6fa71 iris: last VUE map NOS, handle > 16 FS inputs
not sure if the UNCOMPILED_FS flagging is still needed, should
reevaluate those hacks at some point
2019-02-21 10:26:08 -08:00
Kenneth Graunke
ee8cb7e0ee iris: implement ARB_clear_texture 2019-02-21 10:26:08 -08:00
Kenneth Graunke
84b30a2900 iris: call maybe_flush for each blorp operation
otherwise with high layer counts we may exceed two batches worth of
commands... (!)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
0e059e4829 iris: assert depth is 1 in resource_copy_region
given the dstz parameter I don't think it does multiple slices..
2019-02-21 10:26:08 -08:00
Kenneth Graunke
03933a2d1b iris: blorp blit multiple slices
fixes getteximage-depth
2019-02-21 10:26:08 -08:00
Kenneth Graunke
84832ab7d4 iris: Fix tiled memcpy for cubes...and for array slices
tiled_memcpy_map was not offsetting map->ptr based on the slice,
while unmap was.  also, we were doing offsetting wrong for cubes.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
bce7398646 iris: disallow RGB32 formats too 2019-02-21 10:26:08 -08:00
Kenneth Graunke
ea19d359cc iris: Convert RGBX to RGBA for rendering.
Fixes a bunch of RGB bugs.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
906becec70 iris: we can do multisample Z resolves 2019-02-21 10:26:08 -08:00
Kenneth Graunke
1f156f004b iris: deal with Marek's new MSAA caps
storage sample count is equal to sample count for us, for now,
so 0 the pipe cap and ignore the new parameter
2019-02-21 10:26:08 -08:00
Kenneth Graunke
532cf23d25 iris: say no to more formats
copied from brw_surface_formats.c
2019-02-21 10:26:08 -08:00
Kenneth Graunke
d5146ba670 iris: actually do stencil blits 2019-02-21 10:26:08 -08:00
Kenneth Graunke
ad76389f88 iris: refcounting, who needs it?
that's right, we do!
2019-02-21 10:26:08 -08:00
Kenneth Graunke
be60e3247c iris: drop stencil handling now that u_transfer_helper does it 2019-02-21 10:26:08 -08:00
Kenneth Graunke
b932938d01 iris: use u_transfer_helper for depth stencil packing/unpacking 2019-02-21 10:26:08 -08:00
Kenneth Graunke
853230b5e6 iris: WTF transfers
stencil unfortunately is stored in the Weird Tile Format (WTF or Tile-W)
which needs special CPU detiling code.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
d93a20e258 iris: allow S8 as a stencil format 2019-02-21 10:26:08 -08:00
Kenneth Graunke
7972599eab iris: actually emit stencil packets 2019-02-21 10:26:08 -08:00
Kenneth Graunke
753646dd6b iris: clear stencil 2019-02-21 10:26:08 -08:00
Kenneth Graunke
9ec2d3640e iris: depth or stencil fixes 2019-02-21 10:26:08 -08:00
Kenneth Graunke
763f9095ea iris: fill out more caps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
2d578e71d5 iris: get angry about execbuf failures
want this to be easy to detect for now
2019-02-21 10:26:08 -08:00
Kenneth Graunke
a378ee3607 iris: simplify batch len qword alignment
Split from a patch by Chris Wilson so I can test it independently
2019-02-21 10:26:08 -08:00
Kenneth Graunke
621cb43f41 iris: rename ring to engine
makes more sense these days.  split from a patch by Chris Wilson
2019-02-21 10:26:08 -08:00
Kenneth Graunke
1a9651f29a iris: remember to set bo->userptr 2019-02-21 10:26:08 -08:00
Chris Wilson
796ad6fe97 iris: Wrap userptr for creating bo 2019-02-21 10:26:08 -08:00
Kenneth Graunke
5911fb8801 iris: sync bugfixes from brw_bufmgr
I wrote softpin support here first, then debugged and landed it in brw;
some of those fixes need to get brought back.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
dfe1ee4f6f iris: comment everything
1. Write the code
2. Add comments
3. PROFIT (or just avoid cost of explaining or relearning things...)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
387a414f2c iris: add minor comments 2019-02-21 10:26:08 -08:00
Dave Airlie
9d39e69219 iris: fix some hangs around null framebuffers
This fixes some cases in fbo-none* and framebuffer_no_attachments.

I'm not sure this is correct otherwise; the tests don't all pass yet.

No idea if this is in any way the correct answer
2019-02-21 10:26:08 -08:00
Chris Wilson
02b82fe80a iris: Set resource modifier on handle
Required for gbm_bo_create_with_modifiers
2019-02-21 10:26:08 -08:00
Kenneth Graunke
682aeff8d0 iris: we don't support textureGatherOffsets, need it lowered 2019-02-21 10:26:08 -08:00
Kenneth Graunke
03dc99475d iris: cube arrays are cubes too 2019-02-21 10:26:08 -08:00
Kenneth Graunke
80c7096672 iris: fix sample mask
0xffffffff does not mean 1, it means enable as many as there actually
are.  we don't get set_sample_mask() calls until some masking is
actually applied...i.e. it doesn't get updated based on # of samples
in the FBO changing.
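
As a rough illustration of the point above (not the driver's actual code), the API mask can be treated as "enable every sample that exists" and clamped to the current sample count at emit time. The function name and the clamping rule are assumptions made for this sketch.

```python
def effective_sample_mask(api_mask: int, num_samples: int) -> int:
    """Clamp an API sample mask to the samples the framebuffer really has.

    Hypothetical helper: a sample count of 0 or 1 is treated as
    single-sampled, so only bit 0 can survive the clamp.
    """
    samples = max(num_samples, 1)
    return api_mask & ((1 << samples) - 1)


# 0xffffffff means "all available samples", not "one sample".
print(hex(effective_sample_mask(0xFFFFFFFF, 4)))
print(hex(effective_sample_mask(0xFFFFFFFF, 1)))
```

Computing the clamp when state is emitted, rather than when the mask is set, covers the case described above where the FBO sample count changes without a new set_sample_mask() call.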
2019-02-21 10:26:08 -08:00
Kenneth Graunke
e990558152 iris: drop pipe_shader_state
looking at the freedreno code, this is totally unnecessary!  we can just
store the NIR and be happy, and not have any vestiges of TGSI.

plus we can reuse this structure for compute shaders, without needing a
pipe_compute_state base.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
834b97c34b iris: fix GS output component limit
this is total, so should be 1024, not 128
2019-02-21 10:26:08 -08:00
Kenneth Graunke
c9f9a6f61b iris: Avoid croaking when trying to create FBO surfaces with bad formats
create_surface happens before st_validate_attachment, which actually
does the "hey, this is a render target now, is that OK?" check

Fixes asserts in ./bin/arb_texture_view-rendering-formats, allowing the
rest of the tests to run.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
8da91ebb68 iris: enable texture gather 2019-02-21 10:26:08 -08:00
Kenneth Graunke
f3dd70182d iris: BIG OL' HACK for UBO updates
We need to re-push data when UBO changes.  This will need to be replaced
with a usage history based flushing system later.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
a7311ef068 iris: update a todo comment 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8e7b0deee2 iris: Don't reserve new binding table section unless things are dirty 2019-02-21 10:26:07 -08:00
Kenneth Graunke
870f2e8434 iris: implement texture/memory barriers 2019-02-21 10:26:07 -08:00
Kenneth Graunke
82ee971497 iris: drop unused bo parameter 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f0159d5ca3 iris: update bindings when changing programs
the binding table layout depends on program info.

not known to fix anything yet.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
b0e9c5797b iris: fix for disabling ssbos 2019-02-21 10:26:07 -08:00
Kenneth Graunke
b7b061c4e2 iris: fix SSBO indexing
st/nir offsets SSBO indexes by MaxABOs.  This is not what we want,
as it bloats the binding tables.  We'll need to adjust it to use
info->num_abos as the offset and buffer base instead.  For now,
just use the inefficient format to get us rolling.  We can add a
PIPE_CAP later.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
376c7253f8 iris: enable SSBOs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
75709d982b iris: fix TBO alignment to match 965 2019-02-21 10:26:07 -08:00
Kenneth Graunke
77b9219818 iris: unbind compiled shaders if none are present
avoids the case where you have a stale compiled shader bound, but no
uncompiled shader bound, which is not just boats, but an entire marina
2019-02-21 10:26:07 -08:00
Kenneth Graunke
fd5ed7b46b iris: shorten loop
num_ubos doesn't include Tim's magic UBO for regular uniforms, so +1
2019-02-21 10:26:07 -08:00
Kenneth Graunke
bf795b0244 iris: emit binding table for atomic counters and SSBOs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
2d5f545464 iris: implement set_shader_buffers
for SSBOs/ABOs.  We just stream out SURFACE_STATE for now...since it's
a set_* API...and the buffer offset may change...not sure where else
we'd do it.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
541cb60e7e iris: export get_shader_info 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f0558ca22c iris: fix msaa flipping filters 2019-02-21 10:26:07 -08:00
Kenneth Graunke
2c73d7e3f1 iris: expose more things that we already support 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5b8dd5f303 iris: fix blorp filters
we have to switch to blorp enums after the rebase, but also we were
probably doing it wrong for MSAA before this.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
3aa1fcc65a iris: hack around samples confusion 2019-02-21 10:26:07 -08:00
Kenneth Graunke
2c15f38a29 iris: point sprite enables 2019-02-21 10:26:07 -08:00
Kenneth Graunke
c60a4de1f5 iris: reemit blend state for alpha test function changes
fixes bin/fbo-alphatest-formats GL_EXT_texture_snorm
2019-02-21 10:26:07 -08:00
Kenneth Graunke
a4036635b1 iris: fix Z24
This was backwards.

thanks to Jason Ekstrand for realizing that I was seeing the wrong bits.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
a12a370d7b iris: fix EmitNoIndirect
we were using pipe stages, which are ordered dumbly for historical
reasons.  we want gl_shader_stage here.  this got us the wrong options
2019-02-21 10:26:07 -08:00
Kenneth Graunke
5bd861de8b iris: assert about passthrough shaders to make this easier to detect
otherwise it just silently fails and looks like some obscure problem
2019-02-21 10:26:07 -08:00
Kenneth Graunke
5e19885d5a iris: fill out MAX_PATCH_VERTICES 2019-02-21 10:26:07 -08:00
Kenneth Graunke
3e9e3121e5 iris: fix SGVS when there are no valid vertex elements
tessellation nop.shader_test now passes
2019-02-21 10:26:07 -08:00
Kenneth Graunke
5520a54bc5 iris: vertex ID, instance ID 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a9083bdb71 iris: don't emit SO_BUFFERS and SO_DECL_LIST unless streamout is enabled
Otherwise on the first draw, if XFB isn't enabled, we get a pile of
MI_NOOPS where SO_BUFFERS should be
2019-02-21 10:26:07 -08:00
Kenneth Graunke
ebb960c6d3 iris: compile a TCS...don't bother with passthrough yet 2019-02-21 10:26:07 -08:00
Kenneth Graunke
9aa8be3d8e iris: TES program key inputs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
fcee21da6b iris: fix texture buffer stride 2019-02-21 10:26:07 -08:00
Kenneth Graunke
3c41d4cf3f iris: fix sampler views of TBOs
we can't read levels/layers, they're invalid for PIPE_BUFFER
2019-02-21 10:26:07 -08:00
Kenneth Graunke
6e7e49cc4f iris: fix crash 2019-02-21 10:26:07 -08:00
Kenneth Graunke
841fc3e3ca iris: record FS NOS 2019-02-21 10:26:07 -08:00
Kenneth Graunke
d223b316ad iris: NOS mechanics 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a6d480f892 iris: bind state helper function 2019-02-21 10:26:07 -08:00
Kenneth Graunke
48b826cdaf iris: s/hwcso/state/g 2019-02-21 10:26:07 -08:00
Kenneth Graunke
aeb6fc8782 iris: bits of multisample program key 2019-02-21 10:26:07 -08:00
Kenneth Graunke
e6b1cc2106 iris: save query type 2019-02-21 10:26:07 -08:00
Kenneth Graunke
44ba48eba7 iris: draw indirect support? 2019-02-21 10:26:07 -08:00
Kenneth Graunke
b030671298 iris: fix CC_VIEWPORT
I was confusing depth bounds test with depth clamping
2019-02-21 10:26:07 -08:00
Kenneth Graunke
fdbc205552 iris: multislice transfer maps 2019-02-21 10:26:07 -08:00
Kenneth Graunke
44248d16d2 iris: disable 6x MSAA support 2019-02-21 10:26:07 -08:00
Kenneth Graunke
bc1b4db3b3 iris: fix sample mask for MSAA-off 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7b8c0f058e iris: actually pin the buffers 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5635abadef iris: fix SO_DECL_LIST 2019-02-21 10:26:07 -08:00
Kenneth Graunke
dc3b927e97 iris: bother setting program_string_id...
not sure how useful this really is...

./bin/ext_transform_feedback-tessellation triangles flat_first
is hitting a case where we rebind the same VS program, but with
different streamout info...which isn't in the key...but is in the
cache...so we don't rebuild it...
2019-02-21 10:26:07 -08:00
Kenneth Graunke
9c1cefff52 iris: set even if no outputs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
cef0b8b13b iris: streamout 2019-02-21 10:26:07 -08:00
Kenneth Graunke
059c096eff iris: SO buffers 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5c00f5fdca iris: Implement 3DSTATE_SO_DECL_LIST 2019-02-21 10:26:07 -08:00
Kenneth Graunke
6794f1ffb9 iris: rearrange iris_resource.h 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a3f77eceb4 iris: slab allocate transfers
apparently we need this for u_threaded_context
2019-02-21 10:26:07 -08:00
Kenneth Graunke
5165308169 iris: don't crash on shader perf logs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f20fc950a7 iris: fix depth bounds clamp enables
fixes depthrange-clear among others
2019-02-21 10:26:07 -08:00
Kenneth Graunke
eb274a31bc iris: fix clip flagging on fb changes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
0232fbc2c4 iris: comment out l/a/i/la
in hopes of r/rg fallbacks
2019-02-21 10:26:07 -08:00
Kenneth Graunke
cf34dd7a61 iris: actually handle array layers in blits 2019-02-21 10:26:07 -08:00
Kenneth Graunke
33a17d566f iris: keep DISCARD_RANGE
this isn't really an iris_bo_map flag, but the various resource mappers
want to check for it to avoid making temp copies.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
c0ab9c9890 iris: actually set cube bit properly 2019-02-21 10:26:07 -08:00
Kenneth Graunke
d849501f4c iris: rename map->stride 2019-02-21 10:26:07 -08:00
Kenneth Graunke
36301bbe40 iris: fix zoffset asserts with 2DArray/Cube 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7f39f4843f iris: SBE change stash
not used yet, but want to flag it so I don't forget
2019-02-21 10:26:07 -08:00
Kenneth Graunke
8a080223e6 iris: just malloc one iris_genx_state instead of a bunch of oddball pieces
Things that are gen-specific can go in iris_genx_state.  Things that are
gen-agnostic can go directly in ice->state.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
a7e0edffb6 iris: dead pointer 2019-02-21 10:26:07 -08:00
Kenneth Graunke
ccec5bab5b iris: implement border color, fix other sampler nonsense 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8a16249285 iris: border color memory zone :(
They took away our pointer bits, so now we need a pile of special code
to handle this instead of just using u_upload_mgr. :(
2019-02-21 10:26:07 -08:00
Kenneth Graunke
1c19e3b21f iris: don't include binder in surface VMA range 2019-02-21 10:26:07 -08:00
Kenneth Graunke
1cea195a95 iris: state ref tuple 2019-02-21 10:26:07 -08:00
Kenneth Graunke
c0e80a8d0a iris: null surface for unbound textures
avoids crashes...may not be really right
2019-02-21 10:26:07 -08:00
Kenneth Graunke
d358a4a040 iris: depth clears 2019-02-21 10:26:07 -08:00
Kenneth Graunke
470fb01a7a iris: fix GS dispatch mode 2019-02-21 10:26:07 -08:00
Kenneth Graunke
01483c7933 iris: fix 3DSTATE_VERTEX_ELEMENTS / VF_INSTANCING for 0 elements 2019-02-21 10:26:07 -08:00
Kenneth Graunke
4c9067ae1d iris: don't emit garbage 3DSTATE_VERTEX_BUFFERS when there aren't any 2019-02-21 10:26:07 -08:00
Kenneth Graunke
adf0c20461 iris: geometry shader support 2019-02-21 10:26:07 -08:00
Kenneth Graunke
de08ac9b0f iris: TES uniform fixes
not that we have a TES, but...
2019-02-21 10:26:07 -08:00
Kenneth Graunke
d207f97840 iris: larger polygon offset 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5188e54e97 iris: fix provoking vertex ordering
had this backwards
2019-02-21 10:26:07 -08:00
Kenneth Graunke
cbbd6a61c4 iris: maybe-flush before blorp operations
otherwise if we have a lot of back-to-back blorp operations we can
potentially overflow even the chained batch
2019-02-21 10:26:07 -08:00
Kenneth Graunke
e0f3971280 iris: lightmodel flat 2019-02-21 10:26:07 -08:00
Kenneth Graunke
4d04111bfb iris: implement copy image 2019-02-21 10:26:07 -08:00
Kenneth Graunke
40fd2fd603 iris: fall back to u_generate_mipmap
It just does blits between layers, which is all we'd do anyway,
and it already should use BLORP because of iris_blit().  Plus it
handles 3D, which our code in i965 doesn't.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
6cf04c6ded iris: clear fix 2019-02-21 10:26:07 -08:00
Kenneth Graunke
d416b81779 iris: shader dirty bits 2019-02-21 10:26:07 -08:00
Kenneth Graunke
b7cd3a083a iris: rework DEBUG_REEMIT
don't want to have to special case this everywhere
2019-02-21 10:26:07 -08:00
Kenneth Graunke
72416a2d0d iris: clears 2019-02-21 10:26:07 -08:00
Kenneth Graunke
eef0d33cee iris: better boxing on maps 2019-02-21 10:26:07 -08:00
Kenneth Graunke
419fac2fc6 iris: fix fragcoord ytransform
the TGSI in the name is a misnomer, it actually controls wpos_ytransform
lowering in NIR these days.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
e67951227d iris: Disable unsupported mirror clamp modes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
234cf647a4 iris: tidy comments about mirroring modes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a3a998f19a iris: fix QWord aligned endings after batch chaining rework
I need to save the primary batch size after expanding it to include
MI_BATCH_BUFFER_END and the QWord padding NOP
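
A minimal sketch of the size bookkeeping described above, assuming 4-byte command dwords and a one-dword padding NOP. The helper name is hypothetical and this is not the iris implementation.

```python
DWORD = 4  # bytes per command dword
QWORD = 8  # the batch length must be a multiple of this


def finish_batch(used_bytes: int) -> int:
    """Return the batch size after the terminator and any padding NOP."""
    assert used_bytes % DWORD == 0
    used_bytes += DWORD          # room for MI_BATCH_BUFFER_END
    if used_bytes % QWORD != 0:
        used_bytes += DWORD      # one NOP dword restores QWord alignment
    return used_bytes


# Record the size only after it has been expanded, as the message says.
primary_batch_size = finish_batch(16)
print(primary_batch_size)
```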
2019-02-21 10:26:07 -08:00
Kenneth Graunke
aacbcbbf47 iris: colorize batchbuffer failures to make them stand out 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8e2b71b190 iris: bad inherited comments 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8c54433275 iris: Handle batch submission failure "better"
We used to not reset the batch, and just keep appending to it, so you'd
get the same invalid contents over and over.

I'd also really like to know about this, so aborting seems wise for now,
if not for the long term
2019-02-21 10:26:07 -08:00
Kenneth Graunke
d0b55ca782 iris: don't always flush 2019-02-21 10:26:07 -08:00
Kenneth Graunke
9226ebfa85 iris: print second batch size separately 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f12b079c0e iris: actually init num_viewports
fixes regressions
2019-02-21 10:26:07 -08:00
Kenneth Graunke
81f899c148 iris: scissor count fixes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
92d6a70853 iris: fix VP iteration 2019-02-21 10:26:07 -08:00
Kenneth Graunke
4a94628513 iris: fix num viewports to be based on programs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
b17215800c iris: fix viewport counts and settings
seeing

   set_viewport_state 0 1
   set_viewport_state 1 15

which gives us a total of 16 viewports, updated incrementally
so keep old values around and update them...
2019-02-21 10:26:07 -08:00
Kenneth Graunke
636cf8971e iris: max VP index 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7cdc6b1173 iris: emit 3DSTATE_SBE_SWIZ 2019-02-21 10:26:07 -08:00
Kenneth Graunke
26db2ea782 iris: avoid crashing on unbound constant resources
instead, read from the workaround BO
2019-02-21 10:26:07 -08:00
Kenneth Graunke
a7770501a7 iris: fix caps so tests run again 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a6aeca9727 iris: fix major refcounting bug with resources
DONTBLOCK -> NULL was happening after taking a reference, causing those
to live forever

This resolves the OOM problems
2019-02-21 10:26:07 -08:00
Kenneth Graunke
49f9c88801 iris: support signed vertex buffer offsets 2019-02-21 10:26:07 -08:00
Kenneth Graunke
0a43c9defa iris: print refcounts in INTEL_DEBUG=submit 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7d1e6f1fa1 iris: redo VB CSO a bit 2019-02-21 10:26:07 -08:00
Kenneth Graunke
432790bacd iris: print binder utilization in INTEL_DEBUG=submit 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f8179dc760 iris: clean up some warnings so I can see through the noise 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5f3a7ee701 iris: use pipe resources not direct BOs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5619c15ecc iris: indentation 2019-02-21 10:26:07 -08:00
Kenneth Graunke
27d45eb2f2 iris: don't leak keyboxes when searching for an existing program 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7d504f3d52 iris: don't leak sampler state table resources 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8e186cef2c iris: rzalloc iris_compiled_shader so memcmp works even if padding creeps in 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5f722bf7c4 iris: remove 4 bytes of padding in iris_compiled_shader 2019-02-21 10:26:07 -08:00
Kenneth Graunke
0db86016f7 iris: pc fixes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f9f8ea7070 iris: more leak fixes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
c763ecaa65 iris: plug leaks 2019-02-21 10:26:07 -08:00
Kenneth Graunke
477ea6c39a iris: clear dirty 2019-02-21 10:26:07 -08:00
Kenneth Graunke
23987df412 iris: some dirty fixes
two scissor bits, constants not being flagged, ZeroRTA, clip not being
flagged
2019-02-21 10:26:07 -08:00
Kenneth Graunke
ccf37c7da9 iris: bindings dirty tracking 2019-02-21 10:26:07 -08:00
Kenneth Graunke
bbc6d15b59 iris: flag DIRTY_WM properly 2019-02-21 10:26:06 -08:00
Kenneth Graunke
3f863cf680 iris: fix the validation list on new batches 2019-02-21 10:26:06 -08:00
Kenneth Graunke
80dee31846 iris: save pointers to streamed state resources
will be used for cross-batch validation list fixing
2019-02-21 10:26:06 -08:00
Kenneth Graunke
daceb04bc0 iris: put back the always flush - fixes some things :( 2019-02-21 10:26:06 -08:00
Kenneth Graunke
149408a360 iris: untested SAMPLER_STATE pin BO fix 2019-02-21 10:26:06 -08:00
Kenneth Graunke
de782e5b39 iris: delete some pointless STATIC_ASSERTS
these were useful when I was patching relocs
2019-02-21 10:26:06 -08:00
Kenneth Graunke
3eebea88dc iris: untested index buffer upload 2019-02-21 10:26:06 -08:00
Kenneth Graunke
9247546181 iris: state cleaning 2019-02-21 10:26:06 -08:00
Kenneth Graunke
7c40cdc12f iris: comment about reemitting and flushing 2019-02-21 10:26:06 -08:00
Kenneth Graunke
d46c5b7c6c iris: allow mapped buffers during execution (faster) 2019-02-21 10:26:06 -08:00
Kenneth Graunke
92de0f5aa6 iris: disable __gen_validate_value in release mode 2019-02-21 10:26:06 -08:00
Kenneth Graunke
08d1f13818 iris: drop assert for now 2019-02-21 10:26:06 -08:00
Kenneth Graunke
a9e357caac iris: fix release builds 2019-02-21 10:26:06 -08:00
Kenneth Graunke
73f3c2cad0 iris: better VFI 2019-02-21 10:26:06 -08:00
Chris Wilson
2cbd42cddd iris: IndexFormat = size/2
brw uses:
  IndexFormat = index_size >> 1

anv uses:
  IndexFormat = index_type[index_size]
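
For illustration only: assuming index sizes of 1, 2 and 4 bytes and format encodings of 0, 1 and 2 (an assumption of this sketch, not taken from the commit), the shift and the table lookup give the same result. All names below are hypothetical.

```python
# Assumed encodings for this sketch: 0 = byte, 1 = word, 2 = dword indices.
INDEX_FORMAT_TABLE = {1: 0, 2: 1, 4: 2}


def index_format_shift(index_size: int) -> int:
    """brw-style: derive the format from the index size with a shift."""
    return index_size >> 1


def index_format_lookup(index_size: int) -> int:
    """anv-style: look the format up in a table keyed by index size."""
    return INDEX_FORMAT_TABLE[index_size]


for size in (1, 2, 4):
    # Both approaches agree for every valid index size.
    assert index_format_shift(size) == index_format_lookup(size)
```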
2019-02-21 10:26:06 -08:00
Kenneth Graunke
5dcf62bb43 iris: use u_transfer helpers for now 2019-02-21 10:26:06 -08:00
Kenneth Graunke
48dc8bd4b0 iris: fix pull bufs that aren't the first user upload 2019-02-21 10:26:06 -08:00
Kenneth Graunke
eed7f7253e iris: fill out pull constant buffers 2019-02-21 10:26:06 -08:00
Kenneth Graunke
90046b43cc iris: make surface states for cbufs 2019-02-21 10:26:06 -08:00
Kenneth Graunke
4e007dbb30 iris: have more than one const_offset 2019-02-21 10:26:06 -08:00
Kenneth Graunke
9ea05ccf1f iris: completely rewrite binder
now we get a new one per batch, and flush if it fills up
2019-02-21 10:26:06 -08:00
Kenneth Graunke
26cc609927 iris: better ubo handling 2019-02-21 10:26:06 -08:00
Chris Wilson
a504b98e72 iris: fix import from dri2/3 2019-02-21 10:26:06 -08:00
Kenneth Graunke
badefe50a0 iris: fix constant packet length to match i965 2019-02-21 10:26:06 -08:00
Kenneth Graunke
201a4d923c iris: maybe slightly less boats uniforms 2019-02-21 10:26:06 -08:00
Kenneth Graunke
a6dd9caf0d iris: flush always 2019-02-21 10:26:06 -08:00
Kenneth Graunke
04d1a3a7de iris: transfers 2019-02-21 10:26:06 -08:00
Kenneth Graunke
7437c28c0d iris: util_copy_framebuffer_state (ported from Rob's v3d patches) 2019-02-21 10:26:06 -08:00
Kenneth Graunke
f6017da83f iris: fix VF INSTANCING length 2019-02-21 10:26:06 -08:00
Kenneth Graunke
7fb7704b2e iris: more depth stuffs...
still missing stencil
2019-02-21 10:26:06 -08:00
Kenneth Graunke
02890c75b5 iris: fix 3DSTATE_VERTEX_ELEMENTS length 2019-02-21 10:26:06 -08:00
Kenneth Graunke
601ee4c189 iris: fix whitespace 2019-02-21 10:26:06 -08:00
Kenneth Graunke
4d24874236 iris: Lower the max number of decoded VBO lines
saint foo, vbo lines!
2019-02-21 10:26:06 -08:00
Kenneth Graunke
48ddd7212d iris: fix decoding and undo testing code 2019-02-21 10:26:06 -08:00
Kenneth Graunke
f31eea1f00 iris: fix batch chaining...
don't chain a batch just for the end
2019-02-21 10:26:06 -08:00
Kenneth Graunke
5b914a6d58 iris: caps 2019-02-21 10:26:06 -08:00
Kenneth Graunke
604a1a1614 iris: chaining not growing 2019-02-21 10:26:06 -08:00
Kenneth Graunke
053fb51125 iris: just turn batch reset_and_clear_caches into reset 2019-02-21 10:26:06 -08:00
Kenneth Graunke
ca735c5e0c iris: delete growing code and just die for now
we need proper batch chaining.  without relocations, we can't grow,
since we've only allocated so much VMA for the batch, and the mechanism
only works if we can pin it at the old address
2019-02-21 10:26:06 -08:00
Kenneth Graunke
7167c6d508 iris: blorp bug fixes
I wrote this earlier, but it got lost somehow...
2019-02-21 10:26:06 -08:00
Kenneth Graunke
3650f8dfa1 iris: properly reject formats, fixes RGB32 rendering with texture float 2019-02-21 10:26:06 -08:00
Kenneth Graunke
4510098b9c iris: proper # of uniforms
or at least closer...we were using bytes, we want 256-bit units...
2019-02-21 10:26:06 -08:00
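The unit change mentioned in 4510098b9c (bytes versus 256-bit units) amounts to a rounded-up division by 32. The following is a sketch only: the macro and function names are hypothetical, and rounding up is an assumption about the intended behaviour rather than something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

#define PUSH_CONSTANT_UNIT_BYTES 32u   /* 256 bits */

/* Convert a size in bytes to a count of 256-bit units, rounding up. */
static inline uint32_t
push_constant_length(uint32_t size_in_bytes)
{
   /* Guard against wrap-around in the rounding addition. */
   assert(size_in_bytes <= UINT32_MAX - (PUSH_CONSTANT_UNIT_BYTES - 1));
   return (size_in_bytes + PUSH_CONSTANT_UNIT_BYTES - 1) /
          PUSH_CONSTANT_UNIT_BYTES;
}
```

For example, 33 bytes of uniforms would occupy two units, whereas passing the raw byte count would claim 33 units.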
Kenneth Graunke
6091dc470f iris: proper length for VE packet? 2019-02-21 10:26:06 -08:00
Kenneth Graunke
64a3f7423a iris: uniforms for VS 2019-02-21 10:26:06 -08:00
Kenneth Graunke
d4a64e0a64 iris: bump GL version to 4.2 2019-02-21 10:26:06 -08:00
Kenneth Graunke
44993d451c iris: some depth stuff :( 2019-02-21 10:26:06 -08:00
Kenneth Graunke
eb12cc70f0 iris: assert surf init 2019-02-21 10:26:06 -08:00
Kenneth Graunke
a4a426008b iris: no more drawing rectangle in blorp
there's some bug here as Jason's patches for only emitting 3DS_DR once
got reverted by Mark later on, apparently they regressed MSAA tests.

need to sort that out.
2019-02-21 10:26:06 -08:00
Kenneth Graunke
0e3870b9de iris: blorp URB 2019-02-21 10:26:06 -08:00
Kenneth Graunke
01fe6df0ed iris: make blorp pin the binder 2019-02-21 10:26:06 -08:00
Kenneth Graunke
063fc7bbb0 iris: linear staging buffers - fast CPU access... 2019-02-21 10:26:06 -08:00
Kenneth Graunke
84abf77c67 iris: hacky flushing for now 2019-02-21 10:26:06 -08:00
Kenneth Graunke
75a1639262 iris: drop the 48b printout, we never use anything else 2019-02-21 10:26:06 -08:00
Kenneth Graunke
86d7fd71f4 iris: add INTEL_DEBUG=reemit 2019-02-21 10:26:06 -08:00
Kenneth Graunke
b8a11ad256 iris: fix blorp prog data crashes 2019-02-21 10:26:06 -08:00
Kenneth Graunke
e2ba98ba39 iris: more blorp 2019-02-21 10:26:06 -08:00
Kenneth Graunke
1bba60a4bf iris: fix sampler view crashes 2019-02-21 10:26:06 -08:00
Kenneth Graunke
e22da1e7b1 iris: drop bogus binder free
I was malloc'ing it but then I changed my mind and embedded it directly
2019-02-21 10:26:06 -08:00
Kenneth Graunke
698d45b725 iris: more blitting code to make readpixels work 2019-02-21 10:26:06 -08:00
Kenneth Graunke
c9d9e44720 iris: bits of blorp code 2019-02-21 10:26:06 -08:00
Kenneth Graunke
79466c1313 iris: move bo_offset_from_sba
for wider use
2019-02-21 10:26:06 -08:00
Kenneth Graunke
60d708bb80 iris: copy over i965's cache tracking
needed to split out vtbl so I can pipe control without ice
2019-02-21 10:26:06 -08:00
Kenneth Graunke
dbd4770397 iris: pull in newer comments 2019-02-21 10:26:06 -08:00
Kenneth Graunke
841b3b9003 iris: Defines for base addresses rather than numbers everywhere 2019-02-21 10:26:06 -08:00
Kenneth Graunke
c75a1254a4 iris: Move get_command_space to iris_batch.c
for reuse in blorp.  it's a better interface anyway.
2019-02-21 10:26:06 -08:00
Kenneth Graunke
39e795d473 iris: fix texturing! 2019-02-21 10:26:06 -08:00
Kenneth Graunke
4929f020c3 iris: better SBE 2019-02-21 10:26:06 -08:00
Kenneth Graunke
8bf167c9e9 iris: vma - fix assert 2019-02-21 10:26:06 -08:00
Kenneth Graunke
10e4f1e68c iris: vma fixes - don't free binder address 2019-02-21 10:26:06 -08:00
Kenneth Graunke
5a101e6434 iris: bo reuse 2019-02-21 10:26:06 -08:00
Kenneth Graunke
21acc00490 iris: crazy pipe control code
imported from ~kwg/mesa pcx-2, gen < 8 code dropped
2019-02-21 10:26:06 -08:00
Kenneth Graunke
87aa880795 iris: fixes 2019-02-21 10:26:06 -08:00
Kenneth Graunke
3fbf7294b1 iris: fixes from i965 2019-02-21 10:26:06 -08:00
Kenneth Graunke
999ed6e213 iris: port bug fix from i965 2019-02-21 10:26:05 -08:00
Kenneth Graunke
19d11a6df3 iris: fix index 2019-02-21 10:26:05 -08:00
Kenneth Graunke
010e845af7 iris: increase allocator alignment 2019-02-21 10:26:05 -08:00
Kenneth Graunke
35afa8c8f3 iris: better BT asserts
Probably nothing is working because texture upload isn't implemented
2019-02-21 10:26:05 -08:00
Kenneth Graunke
0148bd6839 iris: decoder fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
5d2673ba7e iris: set sampler views 2019-02-21 10:26:05 -08:00
Kenneth Graunke
34164ce622 iris: isv freeing fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
012154c20f iris: TES stash
TODO: key setup
2019-02-21 10:26:05 -08:00
Kenneth Graunke
d890aee15d iris: SBA once at context creation, not per batch
hooray!
2019-02-21 10:26:05 -08:00
Kenneth Graunke
e0eac28bd4 iris: fix a scissor bug 2019-02-21 10:26:05 -08:00
Kenneth Graunke
0707ff3f2f iris: assemble SAMPLER_STATE table at bind time
It's useless to allocate SAMPLER_STATEs in GPU memory on creation like
we do for SURFACE_STATES, because they need to be organized into a
contiguous block of memory.  But we can do that at bind time, rather
than draw time.
2019-02-21 10:26:05 -08:00
Kenneth Graunke
199c080926 iris: same treatment for sampler views 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f51204a160 iris: allocate SURFACE_STATEs up front and stop streaming them 2019-02-21 10:26:05 -08:00
Kenneth Graunke
bf90d8a125 iris: delete more trash 2019-02-21 10:26:05 -08:00
Kenneth Graunke
1398c99aff iris: canonicalize addresses.
Back to working!  Woo!
2019-02-21 10:26:05 -08:00
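The commit message for 1398c99aff does not show how addresses are canonicalized. As a generic illustration, assuming a 48-bit virtual address space, canonical form means copying bit 47 into bits 48 through 63. The function name below is hypothetical and this is not the iris implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend bit 47 of a 48-bit address into the upper 16 bits. */
static inline uint64_t
canonical_address(uint64_t address)
{
   const uint64_t sign_bit = 1ull << 47;
   const uint64_t low_mask = (1ull << 48) - 1;

   address &= low_mask;          /* keep only the low 48 bits */
   if (address & sign_bit)
      address |= ~low_mask;      /* fill bits 48..63 with ones */

   /* Post-condition: the upper 16 bits all match bit 47. */
   assert((address & sign_bit) ? (address >> 48) == 0xffff
                               : (address >> 48) == 0);
   return address;
}
```

Addresses in the lower half of the 48-bit range come back unchanged. Addresses with bit 47 set gain an upper run of ones.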
Kenneth Graunke
b69a85bc4d iris: validation dumping improvements
backported from i965.  don't bother with (pinned) because everything is.
2019-02-21 10:26:05 -08:00
Kenneth Graunke
24bcf1054b iris: update vb BO handling now that we have softpin 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9ac81f1890 iris: decoder fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9955e8334b iris: binder fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
65073c2217 iris: hook up batch decoder 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6cbd1d1692 iris: binders 2019-02-21 10:26:05 -08:00
Kenneth Graunke
209692c716 iris: include p_defines.h in iris_bufmgr.h
for PIPE_TRANSFER_WRITE and friends
2019-02-21 10:26:05 -08:00
Kenneth Graunke
1af84d345a iris: set EXEC_OBJECT_WRITE 2019-02-21 10:26:05 -08:00
Kenneth Graunke
651be7cf3d iris: rewrite to use memzones and not relocs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
68229caa38 iris: more uploaders 2019-02-21 10:26:05 -08:00
Kenneth Graunke
3861d24e23 iris: Also set SUPPORTS_48B? Not sure if necessary. 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e95ad5994a iris: dump gtt offset in dump_validation_list 2019-02-21 10:26:05 -08:00
Kenneth Graunke
d78be0188e iris: fix icache memzone 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e4aa8338c3 iris: Soft-pin the universe
Breaks everything, woo!
2019-02-21 10:26:05 -08:00
Kenneth Graunke
3693307670 iris: some thinking about binding tables 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f6be3d4f3a iris: bufmgr updates.
Drop BO_ALLOC_BUSY (best not to hand people a loaded gun...)
Drop vestiges of alignment
2019-02-21 10:26:05 -08:00
Kenneth Graunke
902a122404 iris: stop adding 9 to our varyings 2019-02-21 10:26:05 -08:00
Kenneth Graunke
a235da3e68 iris: set strides on transfers 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6891f70d87 iris: enable a few more formats 2019-02-21 10:26:05 -08:00
Kenneth Graunke
7130c43d96 iris: decode batches if they fail to submit 2019-02-21 10:26:05 -08:00
Kenneth Graunke
23367688e9 iris: NOOP pad batches correctly 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f3150e9ecd iris: warn if execbuf fails 2019-02-21 10:26:05 -08:00
Kenneth Graunke
a50a3a8edf iris: uniform bits...badly 2019-02-21 10:26:05 -08:00
Kenneth Graunke
213b70a222 iris: sample mask...not 0.
We now have a first triangle!
2019-02-21 10:26:05 -08:00
Kenneth Graunke
1a6bb266cf iris: write DISABLES are not write ENABLES...whoops 2019-02-21 10:26:05 -08:00
Kenneth Graunke
50a2596f46 iris: fix extents 2019-02-21 10:26:05 -08:00
Kenneth Graunke
ffcd84f55a iris: catastrophic state pointer mistake 2019-02-21 10:26:05 -08:00
Kenneth Graunke
1739dc0d5e iris: more SF CL VPs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
ade381fb9c iris: fix dmabuf retval comparisons
0 means success
2019-02-21 10:26:05 -08:00
Kenneth Graunke
ed42ae2f9b iris: more sketchy SBE 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9be4b3baaf iris: compctrl
oh, also run things
2019-02-21 10:26:05 -08:00
Kenneth Graunke
db15993cfd iris: actually pin the instruction cache buffers 2019-02-21 10:26:05 -08:00
Kenneth Graunke
bda9a77b47 iris: smaller blend state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f9d834d588 iris: don't do samplers for disabled stages 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e21bddeb4f iris: render targets! 2019-02-21 10:26:05 -08:00
Kenneth Graunke
8503578e82 iris: fix silly unused batch with addr macro 2019-02-21 10:26:05 -08:00
Kenneth Graunke
352ec1f378 iris: warning fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
54ba8a60d5 iris: basic SBE code 2019-02-21 10:26:05 -08:00
Kenneth Graunke
5af16f5e20 iris: alpha testing in PSB 2019-02-21 10:26:05 -08:00
Kenneth Graunke
c96132d5fd iris: blend state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
bb3c0be7a8 iris: dummy constants 2019-02-21 10:26:05 -08:00
Kenneth Graunke
538decc0de iris: URB configs. 2019-02-21 10:26:05 -08:00
Kenneth Graunke
b1115799e6 iris: actually set KSP offsets 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6f1c07d7dd iris: actually softpin at an address 2019-02-21 10:26:05 -08:00
Kenneth Graunke
acdff2f9a6 iris: actually destroy the cache 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9437e135ed iris: rewrite program cache to use u_upload_mgr 2019-02-21 10:26:05 -08:00
Kenneth Graunke
67ca2be992 iris: no NEW_SBA 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e7a729ba34 iris: shuffle comments 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6ecc93f764 iris: bits of WM key 2019-02-21 10:26:05 -08:00
Kenneth Graunke
bba13b1501 iris: move key pop to state module
shader key population needs to read state
2019-02-21 10:26:05 -08:00
Kenneth Graunke
5864c9414a iris: fix SBA 2019-02-21 10:26:05 -08:00
Kenneth Graunke
5ae278da18 iris: use vtbl to avoid multiple symbols, fix state base address 2019-02-21 10:26:05 -08:00
Kenneth Graunke
876417f9e8 iris: softpin some things 2019-02-21 10:26:05 -08:00
Kenneth Graunke
c493fee73f iris: drop const from prog data parameters
we ralloc steal things, which makes it a little bogus
2019-02-21 10:26:05 -08:00
Kenneth Graunke
cf7ba838ad iris: more comes from bits filled in
tomorrow, fix the build system to avoid symbol clashes somehow...
we're getting gen9 functions because they happen to be listed before 10
in the link list.
2019-02-21 10:26:05 -08:00
Kenneth Graunke
8dffc9b195 iris: index buffer BO 2019-02-21 10:26:05 -08:00
Kenneth Graunke
8665dfd602 iris: WM.
I could have added a dirty bit for this, but it doesn't seem worth it
2019-02-21 10:26:05 -08:00
Kenneth Graunke
bae5414594 iris: initial gpu state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
0477591355 iris: reorganize commands to match brw 2019-02-21 10:26:05 -08:00
Kenneth Graunke
3e684d0eb7 iris: don't forget about TE 2019-02-21 10:26:05 -08:00
Kenneth Graunke
d71d2028ef iris: convert IRIS_DIRTY_* to #defines
enums are SIGNED.  so IRIS_DIRTY_VS << 4 gets sign extended, making it
not equal to IRIS_DIRTY_FS.  Surprising!
2019-02-21 10:26:05 -08:00
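The pitfall described in d71d2028ef can be reproduced in isolation. This is a minimal sketch, assuming the dirty flags end up in a 64-bit mask; the helper names are hypothetical and are not the iris definitions. A flag held in a signed 32-bit value with its top bit set is sign-extended when widened, whereas a constant built from `1ull` is not.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a flag stored as a signed 32-bit value to a 64-bit mask.
 * If the top bit is set, the conversion fills bits 32..63 with ones. */
static inline uint64_t
widen_signed_flag(int32_t flag)
{
   return (uint64_t)flag;
}

/* Build the flag as an unsigned 64-bit constant from the start,
 * which is what a #define based on 1ull achieves. */
static inline uint64_t
dirty_bit(unsigned n)
{
   assert(n < 64);
   return 1ull << n;
}
```

With these helpers, `widen_signed_flag(INT32_MIN)` yields `0xffffffff80000000`, while `dirty_bit(31)` yields `0x80000000`, so a comparison between the two fails even though both were meant to be bit 31.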
Kenneth Graunke
cfd5fcc256 iris: emit shader packets 2019-02-21 10:26:05 -08:00
Kenneth Graunke
1cf21cc813 iris: actually save derived state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
57c1b71418 iris: promote iris_program_cache_item to iris_compiled_shader 2019-02-21 10:26:05 -08:00
Kenneth Graunke
581459a9fe iris: some shader bits 2019-02-21 10:26:05 -08:00
Kenneth Graunke
df401aaa11 iris: scissor slots 2019-02-21 10:26:05 -08:00
Kenneth Graunke
dc4453d886 iris: bind_state -> compute state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
2f100c6e31 iris: 3DPRIMITIVE fields 2019-02-21 10:26:05 -08:00
Kenneth Graunke
b3646e2b48 iris: fix VF instancing length so we don't get garbage in batch 2019-02-21 10:26:05 -08:00
Kenneth Graunke
317263ab11 iris: vertex packet fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
129fae5a90 iris: fix VBs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
fc5ddc64f9 iris: fix assert 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e91289908a iris: fix indentation 2019-02-21 10:26:05 -08:00
Kenneth Graunke
41b32a4eda iris: hack to stop crashing on samplers for now 2019-02-21 10:26:05 -08:00
Kenneth Graunke
dcfb06375a iris: initialize dirty bits to ~0ull 2019-02-21 10:26:05 -08:00
Kenneth Graunke
0a513d63a1 iris: actually advance forward when emitting commands 2019-02-21 10:26:05 -08:00
Kenneth Graunke
24cc627612 iris: actually flush the commands 2019-02-21 10:26:05 -08:00
Kenneth Graunke
082911409e iris: actually APPEND commands, not stomp over the top and never incr 2019-02-21 10:26:05 -08:00
Kenneth Graunke
b332ff489c iris: VB fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
50b1e01996 iris: DEBUG=bat
Deleted in the interest of making the branch compile at each step
2019-02-21 10:26:05 -08:00
Kenneth Graunke
6e01bc0637 iris: VB addresses 2019-02-21 10:26:05 -08:00
Kenneth Graunke
b574b56325 iris: reference VB BOs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
4dc683f64b iris: so, sba then. 2019-02-21 10:26:05 -08:00
Kenneth Graunke
d900a235b1 iris: try and have an iris address 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f31ae76216 iris: flag SBA updates when instruction BO changes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
7d90cc8da4 iris: bit of SBA code
genxml MOCS is stupid, addresses are hard, news at 11
2019-02-21 10:26:05 -08:00
Kenneth Graunke
ff5c886fb3 iris: move MAX defines to iris_batch.h
for SBA
2019-02-21 10:26:05 -08:00
Kenneth Graunke
7bfc8f7d7d iris: kill iris_new_batch
reset and new are too similar, and this had exactly one caller
2019-02-21 10:26:05 -08:00
Kenneth Graunke
b701096ab9 iris: make iris_batch target a particular ring 2019-02-21 10:26:05 -08:00
Kenneth Graunke
64f043570d iris: lower io 2019-02-21 10:26:05 -08:00
Kenneth Graunke
695bd55d1a iris: do the FS...asserts because we don't lower uniforms yet 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6aa15cadf3 iris: import program cache code 2019-02-21 10:26:05 -08:00
Kenneth Graunke
4525dda75f iris: reworks, FS compile pieces 2019-02-21 10:26:05 -08:00
Kenneth Graunke
628a71c2e3 iris: parse INTEL_DEBUG 2019-02-21 10:26:05 -08:00
Kenneth Graunke
d62b0b9ee8 iris: draw->restart_index is uninitialized if PR is not enabled 2019-02-21 10:26:05 -08:00
Kenneth Graunke
5fad62cef1 iris: fix bogus index buffer reference 2019-02-21 10:26:05 -08:00
Kenneth Graunke
95fe254cf2 iris: fix prim type 2019-02-21 10:26:05 -08:00
Kenneth Graunke
793276cd8b iris: msaa sample count packing problems
0 -> ffffffffffffffffffffffffffff
2019-02-21 10:26:05 -08:00
Kenneth Graunke
0252fb36e9 iris: actually save VBs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
ed6ee3e270 iris: fix/rework line stipple 2019-02-21 10:26:05 -08:00
Kenneth Graunke
231935efa2 iris: init the batch! 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9ca58ca517 iris: delete iris_pipe.c, shuffle code around 2019-02-21 10:26:05 -08:00
Kenneth Graunke
455e2d6dce iris: disable execbuf for now 2019-02-21 10:26:05 -08:00
Kenneth Graunke
86e0c08b14 iris: make an ice->render_batch field
we may want a second one for transfers
2019-02-21 10:26:05 -08:00
Kenneth Graunke
ffd7f13b4d iris: drop unused field 2019-02-21 10:26:05 -08:00
Kenneth Graunke
8097dc9dd9 iris: shader debug log 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6c7a276470 iris: maps 2019-02-21 10:26:05 -08:00
Kenneth Graunke
49896861ce iris: linear resources 2019-02-21 10:26:05 -08:00
Kenneth Graunke
c820f5a4bd iris: some program code 2019-02-21 10:26:04 -08:00
Kenneth Graunke
d48dc416fa iris: basic push constant alloc 2019-02-21 10:26:04 -08:00
Kenneth Graunke
21c016b496 iris: emit 3DSTATE_SAMPLER_STATE_POINTERS 2019-02-21 10:26:04 -08:00
Kenneth Graunke
7b80f4587d iris: sampler states 2019-02-21 10:26:04 -08:00
Kenneth Graunke
60208d12b4 iris: COLOR_CALC_STATE 2019-02-21 10:26:04 -08:00
Kenneth Graunke
9367c44639 iris: fix crash - CSO binding can be NULL (when destroying context) 2019-02-21 10:26:04 -08:00
Kenneth Graunke
efea4d96d9 iris: some draw info, vbs, sample mask 2019-02-21 10:26:04 -08:00
Kenneth Graunke
d6ad9f4732 iris: a bit of depth
still need to allocate separate stencil
2019-02-21 10:26:04 -08:00
Kenneth Graunke
7abe5aefd3 iris: fix SF_CL length 2019-02-21 10:26:04 -08:00
Kenneth Graunke
c1c6c3a18a iris: don't segfault on !old_cso 2019-02-21 10:26:04 -08:00
Kenneth Graunke
3eadb1b3a1 iris: framebuffers 2019-02-21 10:26:04 -08:00
Kenneth Graunke
e7c9bddda7 iris: stipples and vertex elements 2019-02-21 10:26:04 -08:00
Kenneth Graunke
d0aab78dc3 iris: sampler views 2019-02-21 10:26:04 -08:00
Kenneth Graunke
831d630b8b iris: Surfaces! 2019-02-21 10:26:04 -08:00
Kenneth Graunke
4ec5f8be3e iris: SF_CLIP_VIEWPORT 2019-02-21 10:26:04 -08:00
Kenneth Graunke
970836c34e iris: scissors 2019-02-21 10:26:04 -08:00
Kenneth Graunke
7c875deaf0 iris: RASTER + SF + some CLIP, fix DIRTY vs. NEW 2019-02-21 10:26:04 -08:00
Kenneth Graunke
02f583b0a0 iris: initial gpu state, merges 2019-02-21 10:26:04 -08:00
Kenneth Graunke
a13d417ac1 iris: merge pack
this lets us merge dynamic and pre-baked state, also like anv
2019-02-21 10:26:04 -08:00
Kenneth Graunke
aee39df710 iris: packing with valgrind.
borrowed macros from anv!
2019-02-21 10:26:04 -08:00
Kenneth Graunke
d3d6ef37f6 iris: initial render state upload 2019-02-21 10:26:04 -08:00
Kenneth Graunke
26fb5a8ae2 iris: port over batchbuffer updates 2019-02-21 10:26:04 -08:00
Kenneth Graunke
14ca30507f iris: viewport state, sort of 2019-02-21 10:26:04 -08:00
Kenneth Graunke
2dce0e94a3 iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.
This commit introduces a new Gallium driver for Intel Gen8+ GPUs,
named 'iris_dri.so' after the hardware.

Developed by:
- Kenneth Graunke (overall driver)
- Dave Airlie (shaders, conditional render, overflow query, Gen8 port)
- Chris Wilson (fencing, pinned memory, ...)
- Jordan Justen (compute shaders)
- Jason Ekstrand (image load store)
- Caio Marcelo de Oliveira Filho (tessellation control passthrough)
- Rafael Antognolli (auxiliary buffer fixes)
- The rest of the i965 contributors and the Mesa community
2019-02-21 10:26:04 -08:00
James Zhu
eac822eac1 gallium/auxiliary/vl: Fix transparent issue on compute shader with rgba
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646
Problems 1 and 4 are caused by an incomplete blend compute shader
implementation, so revert rgba back to the fragment shader.

Fixes: 9364d66cb7 (Add video compositor compute shader render)
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Tested-by: Bruno Milreu <bmilreu@gmail.com>
2019-02-21 13:11:53 -05:00
Lionel Landwerlin
20c370c6b1 vulkan: add an overlay layer
Just a starting point to display frame timings & drawcalls/submissions
per frame.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
+1-by: Mike Lothian <mike@fireburn.co.uk>
+1-by: Tapani Pälli <tapani.palli@intel.com>
+1-by: Eric Engestrom <eric.engestrom@intel.com>
+1-by: Yurii Kolesnykov <root@yurikoles.com>
+1-by: myfreeweb <greg@unrelenting.technology>
+1-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 18:06:05 +00:00
Lionel Landwerlin
89f03d1872 imgui: make sure our copy of imgui doesn't clash with others in the same process
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
+1-by: Mike Lothian <mike@fireburn.co.uk>
+1-by: Tapani Pälli <tapani.palli@intel.com>
+1-by: Eric Engestrom <eric.engestrom@intel.com>
+1-by: Yurii Kolesnykov <root@yurikoles.com>
+1-by: myfreeweb <greg@unrelenting.technology>
+1-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 18:06:05 +00:00
Lionel Landwerlin
3950e7c11e imgui: bump copy
Updated at:

commit f977871854af941289f2a9090dcc90f7aa3449a8
Author: omar <omarcornut@gmail.com>
Date:   Fri Feb 15 13:10:22 2019 +0100

    ImFont: Minor adjustment to the structure.
    Examples: Removed unused variable.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
+1-by: Mike Lothian <mike@fireburn.co.uk>
+1-by: Tapani Pälli <tapani.palli@intel.com>
+1-by: Eric Engestrom <eric.engestrom@intel.com>
+1-by: Yurii Kolesnykov <root@yurikoles.com>
+1-by: myfreeweb <greg@unrelenting.technology>
+1-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 18:06:05 +00:00
Lionel Landwerlin
51047cd2e8 build: move imgui out of src/intel/tools to be reused
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
+1-by: Mike Lothian <mike@fireburn.co.uk>
+1-by: Tapani Pälli <tapani.palli@intel.com>
+1-by: Eric Engestrom <eric.engestrom@intel.com>
+1-by: Yurii Kolesnykov <root@yurikoles.com>
+1-by: myfreeweb <greg@unrelenting.technology>
+1-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 18:06:05 +00:00
Jason Ekstrand
f98fd9d15a nir/lower_clip_cull: Fix an incorrect assert
Copy+paste error.  It was supposed to test cull and not clip.

Fixes: 4e69fba534 "nir: Rewrite lower_clip_cull_distance_arrays..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-21 12:05:12 -06:00
Jason Ekstrand
f9b2f10a41 nir: Fix a compile warning 2019-02-21 09:44:42 -06:00
Rob Clark
908d5ee9eb freedreno/a6xx: enable tiled images
Turns out we can write to tiled images as well as read.  This avoids
having to linearize or do the tiling in the shader.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-21 09:06:06 -05:00
Alejandro Piñeiro
0629b2a462 nir, glsl: move pixel_center_integer/origin_upper_left to shader_info.fs
On GLSL that info is set as a layout qualifier when redeclaring
gl_FragCoord, so somehow tied to a specific variable. But in practice,
they behave as a global of the shader. On ARB programs they are set
using a global OPTION (defined at ARB_fragment_coord_conventions), and
on SPIR-V using ExecutionModes, that are also not tied specifically to
the builtin.

This patch moves that info from nir variable and ir variable to nir
shader and gl_program shader_info respectively, so the map is more
similar to SPIR-V, and ARB programs, instead of more similar to GLSL.

FWIW, shader_info.fs already had pixel_center_integer, so this change
also removes some redundancy. Also, as struct gl_program also includes
a shader_info, we removed gl_program::OriginUpperLeft and
PixelCenterInteger, as it would be superfluous.

This change was needed because recently spirv_to_nir changed the order
in which execution modes and variables are handled, so the variables
didn't get the correct values. Now the info is set on the shader
itself, and we don't need to go back to the builtin variable to set
it.

Fixes: e68871f6a ("spirv: Handle constants and types before execution
                   modes")

v2: (Jason)
   * glsl_to_nir: get the info before glsl_to_nir, while all the rest
     of the info gathering is happening
   * prog_to_nir: gather the info on a general info-gathering pass,
     not on variable setup.

v3: (Jason)
   * Squash with the patch that removes that info from ir variable
   * anv: assert that OriginUpperLeft is true. It should be already
     set by spirv_to_nir.
   * blorp: set origin_upper_left on its core "compile fragment
     shader", not just in some specific places (for this we added a
     helper in a previous patch).
   * prog_to_nir: no need to specifically gather these fragcoord modes
     as the full gl_program shader_info is copied.
   * spirv_to_nir: assert that we are a fragment shader when handling
     these execution modes.

v4: (reported by failing gitlab pipeline #18750)
   * state_tracker: update as well, due to changes in ir.h/gl_program

v5:
   * blorp: minor change after change on previous patch
   * radeonsi: update due to this change.

v6: (Timothy Arceri)
   * prog_to_nir: remove extra whitespace
   * shader_info: don't use :1 on origin_upper_left
   * glsl: program.fs.origin_upper_left/pixel_center_integer can be
     moved out of the shader list loop
2019-02-21 11:47:59 +01:00
Alejandro Piñeiro
675eabb560 blorp: introduce helper method blorp_nir_init_shader
This initializes the nir shader that will be used by blorp. Right now
it doesn't do too much beyond calling nir_builder_init_simple_shader,
and setting a name. More stuff will be added on following patches.

v2: there is a case where a VERTEX_SHADER is used (Alejandro)
2019-02-21 11:47:51 +01:00
Alyssa Rosenzweig
705723e6be panfrost: Verify and print brx condition in disasm
The condition code in extended branches is repeated 8 times for unclear
reasons; accordingly, the code would be disassembled as "unknown5555",
"unknownAAAA", etc. This patch correctly masks off the lower two bits to
find the true code to print, verifying that the code is repeated as
believed to be necessary (providing some assurance for compiler quality
and an assert trip in case we encounter a shader in the wild that breaks
the convention).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:09:06 +00:00
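The check described in 705723e6be can be sketched as follows, assuming the condition field is 16 bits wide and holds a 2-bit code repeated 8 times, which matches the "unknown5555" and "unknownAAAA" output mentioned above. The function name and signature are hypothetical and are not taken from the Panfrost disassembler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the 2-bit condition code from the low bits of the field and
 * verify that all eight 2-bit slots repeat it. Returns false (leaving
 * *code_out untouched) if any slot disagrees. */
static inline bool
brx_decode_condition(uint16_t cond, unsigned *code_out)
{
   assert(code_out);

   unsigned code = cond & 0x3;
   for (unsigned i = 0; i < 8; i++) {
      if (((cond >> (i * 2)) & 0x3) != code)
         return false;
   }

   *code_out = code;
   return true;
}
```

Under this assumption 0x5555 decodes to code 1 and 0xAAAA to code 2, and a field such as 0x5556 is rejected because its slots disagree. A disassembler could turn that rejection into the assert trip the commit message mentions.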
Alyssa Rosenzweig
779e140b1a panfrost: Dynamically set discard branch targets
discard and discard_if are both implemented with the branching pipeline
on Midgard; essentially, we branch to the end of the fragment shader in
a special "discard" mode, setting the condition as necessary.
Previously, we hardcoded the form of this instruction, which worked for
very simple shaders but was incorrect for anything remotely interesting.
This patch instead emits logical branches in the IR, which are flattened
to real discard ops the same way other branches are, allowing targets to
be computed correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:08:59 +00:00
Alyssa Rosenzweig
5abb7b559e panfrost/midgard: Emit extended branches
Previously, we only emitted compact branches; however, the offset range
of these branches is too small for many real world shaders. This patch
implements support for emitting extended branches and switches to always
using them for control flow. This incurs a code size and possibly
performance penalty, but expands the range of working shaders and
provides opportunity for further optimization.

Support for emitting compact branches is retained but this code path is
presently unused. In the future, we'll want to heuristically determine
which type of branch should be emitted for optimal codegen.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:08:47 +00:00
Alyssa Rosenzweig
813bb34fd8 panfrost: Rectify doubleplusungood extended branch
Midgard features "compact branches" and "extended branches",
corresponding to short jumps and far jumps respectively. The form of the extended
branch was previously incorrect in the ISA headers; this patch corrects
it and updates the disassembler (simultaneous to preserve
bisectability).

Additionally, we fix a corner case in the disassembly of extended
branches, and we now prefix extended branches with "brx", to visually
differentiate from compact branches prefixed with "br".

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:07:39 +00:00
Alyssa Rosenzweig
2c74709517 panfrost/midgard: Fix nested/chained if-else
An if-else statement is compiled to a conditional branch (from the start
to the second block) and an unconditional branch (from the end of the
first block to the end of the else). We previously incorrectly computed
the block index of the unconditional branch to be exactly one after that
of the conditional branch, valid for a single if-else statement but
nothing fancier. This patch correctly computes the unconditional branch
target, fixing more complex if-else chains.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:06:26 +00:00
Alyssa Rosenzweig
5e55c11a1b panfrost/midgard: Refactor tag lookahead code
Each Midgard instruction is scheduled to a particular instruction type
("tag"). Presumably the hardware prefetches memory based on tag, so it
is required to report out the first tag to the command stream and the
next tag of a branch target. This procedure was implemented in two
separate parts of the compiler (one time with a slight bug relating to
empty blocks); this patch refactors to unite the two routines and solve
the bug when branching to empty blocks.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:05:59 +00:00
Alyssa Rosenzweig
396eb1440a panfrost: Implement pantrace (command stream dump)
Historically, Panfrost debugging entailed the use of the LD_PRELOADable
`panwrap` tool. This setup is a tad fragile; Panfrost can be traced
directly without the intermediate layer. pantrace implements the
equivalent functionality of panwrap within Panfrost proper, allowing dumps
to work regardless of the kernel layer in use.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:03:21 +00:00
Alyssa Rosenzweig
f611782045 panfrost: Add pandecode (command stream debugger)
The `panwrap` utility can be LD_PRELOAD'd into a GLES app, intercepting
communication between the driver and the kernel. Modern panwrap versions
do no processing of their own; instead, they create a trace directory.
This directory contains the following files:

 - control.log: a line-by-line plain text file, denoting important
   syscalls (mmaps and job submits) along with their arguments

 - memory_*.bin, shader_*.bin: binary dumps of mapped memory

Together, these files contain enough information to reconstruct the
command stream and shaders of (at minimum) a single frame.

The `pandecode` utility takes this directory structure as input,
reconstructing the mapped memory and using the job submit command as an
entrypoint. It then walks the descriptors as the hardware would, parsing
and pretty-printing. Its final output is the pretty-printed command
stream interleaved with the disassembled shaders, suitable for driver
debugging. For instance, the behaviour of two driver versions (one
working, one broken) can be compared by diff'ing their decoded logs.

pandecode/decode.c was originally a part of `panwrap`; it is the oldest
living code in the project. Its history is generally not worth
preserving.

panwrap itself will continue to live downstream for the foreseeable
future, as it is specifically written for the vendor kernel. It is
possible, however, to produce equivalent traces directly from Panfrost,
bypassing the intermediate wrapping layer for well-behaved drivers.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:01:48 +00:00
Alyssa Rosenzweig
fb3bbd0c1c panfrost: Stub out separate stencil functions
This is not yet functional, but it resolves a crash in various apps and
provides a framework for further work.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 06:58:50 +00:00
Marek Olšák
edbd2c1ff5 radeonsi: use SDMA for uploading data through const_uploader
v2: use tc.stream_uploader in si buffer_transfer_map if not called from
    the driver thread

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-02-20 21:04:29 -05:00
Marek Olšák
54f7545cd7 gallium/u_upload_mgr: allow use of FLUSH_EXPLICIT with persistent mappings
for radeonsi

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-02-20 21:04:29 -05:00
Marek Olšák
dc8a2c139d gallium/u_threaded: always unmap const_uploader
radeonsi will require this. It's a no-op for drivers supporting persistent
mappings.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-02-20 21:04:29 -05:00
Marek Olšák
8ef6f68fa5 st/mesa: always unmap the uploader in st_atom_array.c
This is a no-op for drivers supporting persistent mappings.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-02-20 21:04:29 -05:00
Jason Ekstrand
1a93fc382b nir/xfb: Handle compact arrays in gather_xfb_info
This makes us properly handle gl_ClipDistance and gl_CullDistance.

Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-21 00:08:42 +00:00
Jason Ekstrand
558c314504 nir/xfb: Work in terms of components rather than slots
We needed to better handle cases where a chunk of a variable starts at
some non-zero location_frac and rolls over into the next slot but may
not be more than 4 dwords.  For example, if gl_CullDistance is an array
of 3 things and has location_frac = 2, it will span across two vec4s but
is not, itself, bigger than a vec4.  If you ignore the clip/cull special
case, it's not allowed to happen for anything else because the only
things that can span more than one slot are dvec3 and dvec4 and they're
both bigger than a vec4.  The current code uses this attrib_slot thing
where we count attribute slots and iterate over them.  However, that
doesn't work in the case above because gl_CullDistance will have an
attrib_slot count of 1 even though it does span two slots.  We could fix
this by adjusting attrib_slot but we already have comp_mask and it's
easier to just handle it that way.
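
To make the gl_CullDistance case above concrete, here is a minimal C sketch of counting in components rather than slots. The helper names `slots_spanned` and `comp_mask_for_slot` are hypothetical and are not taken from the actual pass; they only illustrate the arithmetic, assuming 32-bit components and four components per slot.

```c
#include <assert.h>

/* Hypothetical helper: number of vec4 slots touched by `num_components`
 * 32-bit components that start at component `location_frac` (0..3) of
 * the first slot. */
unsigned slots_spanned(unsigned location_frac, unsigned num_components)
{
    assert(location_frac < 4);
    if (num_components == 0)
        return 0;
    return (location_frac + num_components + 3) / 4;
}

/* Hypothetical helper: 4-bit mask of the components that the variable
 * occupies inside slot number `slot` (0 is the first slot it touches). */
unsigned comp_mask_for_slot(unsigned location_frac, unsigned num_components,
                            unsigned slot)
{
    assert(location_frac < 4);
    unsigned first = location_frac;                 /* inclusive */
    unsigned end = location_frac + num_components;  /* exclusive */
    unsigned mask = 0;

    for (unsigned c = 0; c < 4; c++) {
        unsigned idx = slot * 4 + c;
        if (idx >= first && idx < end)
            mask |= 1u << c;
    }
    return mask;
}
```

With an array of 3 elements at location_frac = 2, `slots_spanned(2, 3)` is 2 even though the variable is smaller than a vec4: the mask for the first slot covers z and w, and the mask for the second slot covers x only.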

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-21 00:08:42 +00:00
Jason Ekstrand
4e69fba534 nir: Rewrite lower_clip_cull_distance_arrays to do a lot less lowering
Instead of going to all the work of combining them into one array, just
make two arrays and use location_frac to colocate them within CLIP0.
Then the back-end can sort things out and stack them on top of each
other.  Thanks to ef99f4c8, we also don't need to set compact anymore.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 00:08:42 +00:00
Jason Ekstrand
8f0fe71cc5 nir/xfb: Properly align 64-bit values
Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-21 00:08:42 +00:00
Jason Ekstrand
30b548fc62 compiler/types: Add a contains_64bit helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-21 00:08:42 +00:00
Rob Clark
323958908e freedreno/a6xx: samplerBuffer fixes
Use the 'UNK31' bit (which should probably be called 'BUFFER') for
samplerBuffer case, which increases the size of supported buffer
texture beyond 2^15 elements.

Also need to fix the 2nd coord injected to handle the tex instructions
that take integer coords.

Fixes dEQP-GLES31.functional.texture.texture_buffer.render.as_fragment_texture.buffer_size_131071
and similar

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
50dd773a2d freedreno/ir3/a6xx: use ldib for ssbo reads
... instead of isam.  It seems like when using isam, plus atomics, we
can have the problem of old data being in the texture cache.  Plus this
way we don't have to load a component at a time.

Note that blob still seems to use isam in some cases.  I suppose it might
be preferable in the case of loading a single component, when atomics
are not in the picture (or that the ssbo does not need to otherwise be
coherent).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
c543a2cf6f freedreno/ir3: sync instr/disasm and add ldib encoding
Resync disasm and instr header from envytools, and add ldib encoding.
This replaces an opcode from a3xx which was never seen in practice,
since that seemed easier than dealing with the same opc # meaning a
different thing on a6xx.  (Not really sure if 'sti' was actually a
real thing, I think it was only seen in fuzzing.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
cadf6def0c freedreno/ir3/a6xx: fix load_ssbo barrier type.
Silly copy/pasta bug, since load_image is actually the same instruction
but different barrier class.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
0df0fc28a5 freedreno/ir3: rename put_dst()
This was overlooked when it moved to ir3_context.c and ceased to be
static.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
7fe9e790e7 freedreno: fix crash w/ masked non-SSA dst
Fixes
dEQP-GLES3.functional.shaders.indexing.varying_array.vec3_dynamic_write_dynamic_loop_read
regression.

Fixes: c1a27ba9ba freedreno/ir3: HIGH reg w/a for a6xx
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
8c486083d0 freedreno/a6xx: 3d and cube image fixes
Fixes dEQP-GLES31.functional.image_load_store.{3d,cube}.store.*
and a bunch more

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
97479df8aa freedreno/ir3: fix crash in compile fail case
The variant will be NULL if RA failed, which isn't ideal, but at least
let's not segfault and bring down the rest of the dEQP run with us.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
f5ee8c54ed freedreno/ir3: fix legalize for vecN inputs
The wrmask is handled in regmask_get()/regmask_set(), but it wasn't
being propagated from SSA src to dst.  So for example, an SSBO read
value that is passed in as the src2.y component to an atomic op wasn't
getting the (sy) flag set, causing lots of failures.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Bas Nieuwenhuizen
688f5e456a radv: Disable depth clamping even without EXT_depth_range_unrestricted.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-20 23:24:31 +00:00
Bas Nieuwenhuizen
9f7e0523ce radv: Implement VK_EXT_depth_clip_enable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-20 23:24:31 +00:00
Timothy Arceri
03783253b1 nir: remove non-ssa support from nir_copy_prop()
Even in a very basic shader this reduces the time spent in
nir_copy_prop() by ~17%.

No shader-db changes for radeonsi NIR or i965.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 10:18:24 +11:00
Bas Nieuwenhuizen
1ef2855692 radv: Handle clip+cull distances more generally as compact arrays.
Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .

That MR keeps the clip and cull arrays split.

So we have to handle
 - compact arrays with location_frac != 0
 - VARYING_SLOT_CLIP_DIST1

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-20 22:49:52 +00:00
Eric Anholt
8cfc17bdda kmsro: Add the rest of the current set of tinydrm drivers.
While I haven't tested them all, given that they're all using the same
allocation paths and modifiers in the kernel they should be fine to use in
the same way.

v2: Rebase on other kmsro changes.
v3: Skip repeated '[with_gallium_kmsro,' in the meson build.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-20 21:49:41 +00:00
Andrii Simiklit
f4f4ec941e i965: re-emit index buffer state on a reset option change.
It seems we forget to update the index buffer (ib) state, so the
IndexedDrawCutIndexEnable or CutIndexEnable flag is left unchanged. This
leads to glEnable/glDisable of GL_PRIMITIVE_RESTART being ignored in some
cases. The index buffer (ib) state should be re-emitted after the reset
option changes to avoid this unexpected behavior.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109451
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Signed-off-by: Andrii Simiklit <asimiklit.work@gmail.com>
2019-02-20 20:27:56 +02:00
Kenneth Graunke
d6337b59f6 nir: Don't forget if-uses in new nir_opt_dead_cf liveness check
Commit 08bfd710a2 (nir/dead_cf: Stop
relying on liveness analysis) introduced a new check that iterated
through a SSA def's uses, to see if it's used.  But it only checked
normal uses, and not uses which are part of an 'if' condition.  This
led to it thinking more nodes were dead than possible.
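
As a rough illustration of the kind of check described above, the following C sketch uses a toy data model, not the real NIR structures. The `toy_def` type and the function names are hypothetical. The point is only that both kinds of use have to be consulted before a def is treated as unused outside a range of blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an SSA def; this is NOT the layout NIR uses.
 * Each array entry is the position of one use in some linear block
 * numbering of the program. */
struct toy_def {
    const unsigned *use_blocks;    /* uses as an instruction source */
    size_t num_uses;
    const unsigned *if_use_blocks; /* uses as an 'if' condition */
    size_t num_if_uses;
};

/* True if any listed position lies outside [first, last]. */
bool toy_any_outside(const unsigned *blocks, size_t n,
                     unsigned first, unsigned last)
{
    for (size_t i = 0; i < n; i++) {
        if (blocks[i] < first || blocks[i] > last)
            return true;
    }
    return false;
}

/* A def is still live outside [first, last] if EITHER kind of use
 * lives there. Checking only use_blocks reproduces the class of bug
 * described in the commit message. */
bool toy_def_used_outside(const struct toy_def *def,
                          unsigned first, unsigned last)
{
    assert(first <= last);
    return toy_any_outside(def->use_blocks, def->num_uses, first, last) ||
           toy_any_outside(def->if_use_blocks, def->num_if_uses, first, last);
}
```

A def with no ordinary uses but one 'if' use after the range is reported as used. A check that looked only at `use_blocks` would have called it dead.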

Fixes Piglit's variable-indexing/tcs-output-array-float-index-wr test
(and related tests) with the out-of-tree Iris driver.

Fixes: 08bfd710a2 nir/dead_cf: Stop relying on liveness analysis
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 09:44:06 -08:00
Kristian H. Kristensen
b9eed05e7f freedreno/a6xx: Support MSAA resolve blits on blitter
This gets stencil and depth resolves working properly.

Fixes:

  dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8
  dEQP-GLES3.functional.fbo.msaa.2_samples.depth24_stencil8
  dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8
  dEQP-GLES3.functional.fbo.msaa.4_samples.depth24_stencil8
  dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_color
  dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_color

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-20 08:56:21 -08:00
Kristian H. Kristensen
686211f4c9 freedreno/a6xx: Copy stencil as R8_UINT
Blitter does support it after all. Previous attempt to use R8_UINT
failed because we overwrote the a6xx format in emit_blit_texture(),
but some of the later setup still looked at the gallium format.

If we overwrite it in the pipe_blit_info before we even call into
emit_blit_texture() it works properly.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-20 08:56:21 -08:00
Kristian H. Kristensen
e827ea8c83 freedreno: Update headers
Add support for multisampled sources for the blitter.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-20 08:56:21 -08:00
Eric Engestrom
a16c398668 anv: use anv_shader_bin_write_to_blob()'s return value
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 16:40:13 +00:00
Eric Engestrom
d3115f34a6 anv: drop unused imports
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 14:28:55 +00:00
Eric Engestrom
8cbfcab425 anv: make sure the extensions stay sorted
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 14:28:55 +00:00
Eric Engestrom
bc76ce1033 anv: sort vendors extensions after KHR and EXT
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 14:28:55 +00:00
Eric Engestrom
427aa9d154 anv: sort extensions alphabetically
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 14:28:55 +00:00
Tapani Pälli
886cee1f96 anv: refactor error handling in anv_shader_bin_write_to_blob()
v2: blob manages error state internally, just return
    true if errors did not occur (Jason)

CID: 1442546
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 15:39:19 +02:00
Carlos Garnacho
30a01cd923 wayland/egl: Ensure EGL surface is resized on DRI update_buffers()
Fullscreening and unfullscreening a totem window while playing a video
sometimes results in the video subsurface not changing size along with it. This
is also reproducible with epiphany.

If a surface gets resized while we have an active back buffer for it, the
resized dimensions are neither applied immediately in the resize
callback nor correctly synchronized in update_buffers(), as the
(now stale) surface size and currently attached buffer size still match.

There are actually two things to synchronize here. First, the surface query
size might not yet be updated to the wl_egl_window's (i.e. resize_callback
happened while there was a back buffer). Second, the wayland buffers
need dropping if the new surface size differs from the currently attached
buffer. These are now done in separate steps.

https://bugzilla.redhat.com/show_bug.cgi?id=1650929
https://bugs.freedesktop.org/show_bug.cgi?id=109594

Fixes: a9fb331ea7 ("wayland/egl: update surface size on window resize")
Signed-off-by: Carlos Garnacho <carlosg@gnome.org>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Bastien Nocera <hadess@hadess.net>
Tested-by: Denys Kostin <denys.kostin@globallogic.com>
2019-02-20 12:04:33 +01:00
Lionel Landwerlin
f509213675 anv: implement VK_EXT_depth_clip_enable
A new extension allowing the user to explicitly specify the clipping
behavior.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-20 09:57:58 +00:00
Lionel Landwerlin
fa4e103c32 vulkan: Update the XML and headers to 1.1.101
2019-02-20 09:57:58 +00:00
Samuel Iglesias Gonsálvez
63a919a3ce isl: remove the cache line size alignment requirement
The cacheline size was a requirement for using the BLT engine, which
we don't use anymore except for a few things on old HW, so we drop it.

Fixes CTS's CL#3500 test:

dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r8g8b8_unorm

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 08:28:31 +01:00
Bas Nieuwenhuizen
572854e706 radv: Clean up a bunch of compiler warnings.
Random unused vars.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-20 03:21:09 +01:00
Bas Nieuwenhuizen
7631feaa00 radv: Sync ETC2 whitelisted devices.
Fixes: 4bb6c49375 "radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-02-20 02:55:41 +01:00
Timothy Arceri
3d7611e9a6 st/nir: use NIR for asm programs
This uses prog_to_nir to translate ARB assembly programs to NIR.

Co-authored by Tim Arceri, Dave Airlie, and Ken Graunke:

 - [Tim Arceri]: original patch
 - [Dave Airlie]: fix crashes with parameter names
 - [Ken Graunke]:
   - Rebase on SCALAR_ISA cap, lower wpos_ytransform too.
   - Rebase on streamout fixes.
   - Lower system values for fragcoord support.
   - Don't try to use prog_to_nir for ATI_fragment_shader programs.
   - Create TGSI for fixed-function or ARB vertex shaders even if the
     driver prefers NIR, so we can create draw module shaders for
     feedback/select emulation, which rely on TGSI.

Tested on:
- iris (Intel Skylake/Kabylake): Piglit & GL CTS - Ken Graunke
- radeonsi (AMD Vega 64): Piglit - Ken Graunke
- vc4/v3d - Piglit - Eric Anholt
- freedreno - dEQP - Kristian Høgsberg

Fixes lit_degenerate_case on vc4 and v3d, and vp-address-01,
vp-arl-constant-array-huge-offset-neg, and vp-arl-neg-array on v3d.
No Piglit regressions on radeonsi; no dEQP regressions on freedreno.

Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 15:56:26 -08:00
Kenneth Graunke
3b4929ec6e st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.
Even if the driver wants to use NIR shaders, we may need to have TGSI
tokens for creating draw module vertex shaders for the feedback/select
render modes.

So...if the st_vertex_program has any TGSI...copy it to the variant.

Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 15:56:19 -08:00
Kenneth Graunke
ba7519ca36 radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow
ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL
leaves it undefined).  Performing fpow lowering in NIR would break this
behavior, preventing us from using prog_to_nir.

According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common
expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>,
which presumably does a zero-wins multiply.

Lowering in NIR results in a non-legacy multiply, where:

   pow(0, 0) = 2^(log2(0) * 0)
             = 2^(-INF * 0)
             = 2^(-NaN)
             = -NaN

which isn't the desired result.
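
The difference comes down to the multiply that feeds exp2. The C sketch below contrasts an IEEE multiply with an assumed "zero-wins" multiply. The function names are hypothetical, and the zero-wins rule is modelled from the description above, not from hardware documentation. The logarithm is passed in precomputed so that log2(0) can be written directly as -INFINITY.

```c
#include <assert.h>
#include <math.h>    /* INFINITY and isnan are macros; no libm call is made */
#include <stdbool.h>

/* Ordinary IEEE multiply: -inf * 0 yields NaN. */
float mul_ieee(float a, float b)
{
    return a * b;
}

/* Assumed legacy behaviour: a zero operand forces a zero result,
 * even when the other operand is infinite. */
float mul_zero_wins(float a, float b)
{
    return (a == 0.0f || b == 0.0f) ? 0.0f : a * b;
}

/* Exponent handed to exp2 when pow(x, y) is expanded as
 * exp2(log2(x) * y). `log2x` is log2(x), computed by the caller. */
float pow_exponent(float log2x, float y, bool zero_wins)
{
    assert(!isnan(log2x)); /* negative bases are out of scope here */
    return zero_wins ? mul_zero_wins(log2x, y) : mul_ieee(log2x, y);
}
```

For x = 0 and y = 0 the IEEE path produces a NaN exponent, so exp2 also returns NaN. The zero-wins path produces an exponent of 0, and exp2(0) = 1, which is the value the ARB program extensions require.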

This reverts:
- commit d6b7539206
  (ac/nir: remove emission of nir_op_fpow)
- commit 22430224fe
  (radeonsi/nir: enable lowering of fpow)

and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir
after enabling prog_to_nir in st/mesa later in this series.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 15:56:19 -08:00
Timothy Arceri
9c4d5926aa radeonsi/nir: set shader_buffers_declared properly
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-20 10:46:19 +11:00
Timothy Arceri
94a3df62d7 radeonsi/nir: set colors_read properly
shader-db results for VEGA64:

Totals from affected shaders:
SGPRS: 1976 -> 1976 (0.00 %)
VGPRS: 1240 -> 1144 (-7.74 %)
Spilled SGPRs: 145 -> 145 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 34632 -> 34604 (-0.08 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 261 -> 285 (9.20 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-20 10:46:19 +11:00
Timothy Arceri
05cc1dd764 radeonsi/nir: set input_usage_mask properly
shader-db results for VEGA64:

Totals from affected shaders:
SGPRS: 791528 -> 792616 (0.14 %)
VGPRS: 421624 -> 410784 (-2.57 %)
Spilled SGPRs: 1639 -> 1674 (2.14 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 16103516 -> 16063696 (-0.25 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 136307 -> 137830 (1.12 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-20 10:46:19 +11:00
Timur Kristóf
9429bcc4b0 radeonsi/nir: Use uniform location when calculating const_file_max.
The nine state tracker can produce NIR uniform variables
whose location is explicitly set. radeonsi did not take that
into account when calculating const_file_max, resulting in
rendering glitches. This patch fixes that.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-20 10:37:47 +11:00
Mario Kleiner
afb15d14ca drirc: Add sddm-greeter to adaptive_sync blacklist.
This is the sddm login screen.

Fixes: a9c36dbf9c ("drirc: Initial blacklist for adaptive sync")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-02-19 18:03:05 -05:00
Marek Olšák
bff8da6c59 driconf: add Civ6Sub executable for Civilization 6
I'm getting Civ6Sub instead of Civ6.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 17:59:17 -05:00
Marek Olšák
ae21bdf47c radeonsi: always enable NIR for Civilization 6 to fix corruption
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104602

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 17:59:17 -05:00
Marek Olšák
ccbfe44e5f radeonsi: add driconf option radeonsi_enable_nir
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 17:59:17 -05:00
Kenneth Graunke
f9c835eb56 mesa: Align doubles to a 64-bit starting boundary, even if packing.
In the new Intel Iris driver, I am using Tim's new packed uniform
storage system.  It works great, with one caveat: our scalar compiler
backend assumes that uniform offsets will be aligned to the underlying
data type.  For example, doubles must be 64-bit aligned, floats 32-bit,
half-floats 16-bit, and so on.  It does not need any other padding.

Currently, _mesa_add_parameter aligns everything to 32-bit offsets,
creating doubles that have an unaligned offset.  This patch alters
that code to align doubles to 64-bit offsets.

This may be slightly less optimal for drivers which can support full
packing, and allow reads from unaligned offsets at no penalty.  We could
make this extra alignment optional.  However, it only comes into play
when intermixing double and single precision uniforms.  Doubles are
already not too common, and intermixed values (floats then doubles)
are probably even less common.  At most, we burn a single 32-bit slot
to the alignment, which is not that expensive.  So, it doesn't seem
worthwhile to add the extra complexity.
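
The alignment rule can be sketched in a few lines of C. This is an illustration only: `align_pot` and `param_start_slot` are hypothetical names, offsets are assumed to be counted in 32-bit slots, and the code is not the body of _mesa_add_parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Round `value` up to a power-of-two `alignment`. */
unsigned align_pot(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Offset, in 32-bit slots, at which the next parameter should start.
 * 64-bit types begin on an even slot; everything else is packed
 * directly after the previous value. */
unsigned param_start_slot(unsigned next_free_slot, bool is_64bit)
{
    return is_64bit ? align_pot(next_free_slot, 2) : next_free_slot;
}
```

A double that follows a single float (next free slot 1) starts at slot 2, so exactly one 32-bit slot is skipped. A double that follows two floats starts at slot 2 with no padding at all.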

Eventually, we'll likely want to update this code to allow half-float
values to be packed tighter than 32-bit offsets.  At that point, we'll
probably want to revisit what drivers ultimately want, and add options.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 13:26:58 -08:00
Kenneth Graunke
3c2c6bd1c7 compiler: Make is_64bit(GL_*) helper more broadly available
I'd like to use this in the prog_parameter.c code, so I need to move it
into C, make it non-static, and so on.  This probably isn't the ideal
place for it, but I couldn't think of a better one.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 13:26:58 -08:00
Eric Engestrom
daf8ada08d gitlab-ci: automatically run the CI on pushes to ci/* branches
Last commit limited the CI to master and MRs, but to avoid having to
manually trigger CI runs, let's add a 3rd, automatic way: by pushing to
a branch named `ci/*` (or `ci-*` or just `ci`) (which you can delete
afterwards, the pipeline results will remain).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-19 16:57:32 +00:00
Eric Engestrom
861ade7042 gitlab-ci: limit the automatic CI to master and MRs
Runs on random other branches (stable RCs, personal forks) can still be
triggered manually via the web interface, or an app using the API.

This should massively help with the current voracious state of our CI.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-19 16:57:28 +00:00
Eric Engestrom
f84f833981 tegra/autotools: add missing libdrm cflags
Fixes: f1374805a8 "drm-uapi: use local files, not system libdrm"
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109647
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-19 13:29:05 +00:00
Eric Engestrom
b787403a21 tegra/meson: add missing dep_libdrm
Fixes: f1374805a8 "drm-uapi: use local files, not system libdrm"
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109645
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-19 13:29:00 +00:00
Rhys Perry
238730daef ac/nir: implement half-float nir_op_ldexp
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:46 +00:00
Rhys Perry
6971e8d342 ac/nir: implement half-float nir_op_frsq
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:41 +00:00
Rhys Perry
2038aec22a ac/nir: implement half-float nir_op_frcp
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:35 +00:00
Rhys Perry
4261edc067 ac/nir: make ac_build_fdiv support 16-bit floats
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:29 +00:00
Rhys Perry
6790b3a8db ac/nir: make ac_build_isign work on all bit sizes
v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:20 +00:00
Rhys Perry
bbbfdef683 ac/nir: make ac_build_clamp work on all bit sizes
v2: don't use ac_get_zerof() and ac_get_onef()
v3: rename "intr" to "name"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:58 +00:00
Rhys Perry
7e5004e30a ac/nir: fix 64-bit nir_op_f2f16_rtz
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:44 +00:00
Rhys Perry
c4ea20c0a0 ac/nir: implement 8-bit nir_load_const_instr
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:33 +00:00
Rhys Perry
0ca550e01a radv: ensure export arguments are always float
So that the signature is correct and consistent, the inputs to an export
intrinsic should always be 32-bit floats.

This and the previous commit fix a large number of crashes in the
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_*
tests.

Fixes: b722b29f10 ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:22 +00:00
Rhys Perry
64065aa504 radv: bitcast 16-bit outputs to integers
16-bit outputs are stored as 16-bit floats in the outputs array, so they
have to be bitcast.

Fixes: b722b29f10 ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:18 +00:00
Eric Engestrom
23b485c920 gitlab-ci: use ccache to speed up builds
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-19 10:09:51 +00:00
Eric Anholt
dbe3af67a4 v3d: Move i2b and f2b support into emit_comparison.
This lets us save a resolve to NIR true/false for ifs and discard_if.  No
change in shader-db.
2019-02-18 18:18:37 -08:00
Eric Anholt
0bba9c8489 v3d: Emit a simpler negate for the iabs implementation.
One program affected in my shader-db.

instructions in affected programs: 110 -> 108 (-1.82%)
2019-02-18 18:13:09 -08:00
Eric Anholt
1a775d43c9 v3d: Delay emitting ldvpm on V3D 4.x until it's actually used.
For V3D 3.x, we emitted the ldvpms all at the top so that we didn't need
to do VPM setup when the load_inputs are out of order.  For V3D 4.x, we
can reduce register pressure by delaying our loads until they're actually
needed.  This also avoids a bunch of silly MOVs in the pre-opt VIR dump.

total instructions in shared programs: 6421415 -> 6419933 (-0.02%)
total uniforms in shared programs: 2393139 -> 2393140 (<.01%)
total threads in shared programs: 153864 -> 153906 (0.03%)
2019-02-18 18:09:07 -08:00
Eric Anholt
5a84d46896 v3d: Stop tracking num_inputs for VPM loads.
It's unused in the VS (since we need vattr_sizes[] anyway), so move it to
FS prog data.
2019-02-18 18:09:07 -08:00
Eric Anholt
581eba072d v3d: Add a function to describe what the c->execute.file check means.
This is what pointed out that we were misusing the check for last_thrsw in
the previous commit.
2019-02-18 18:09:07 -08:00
Eric Anholt
441294962c v3d: Fix the check for "is the last thrsw inside control flow"
The execute.file check used to be good enough, until I stopped setting up
the execute mask for uniform ifs.

No known tests fixed, noticed while doing a refactor.

Fixes: 0805060573 ("v3d: Handle dynamically uniform IF statements with uniform control flow.")
2019-02-18 18:09:07 -08:00
Eric Anholt
07d5b5a972 v3d: Fix f2b32 behavior.
Now that we don't have the vir_PF() magic, it's obvious that we were doing
the wrong thing for f2b32 by allowing -0.0 to produce true instead of
false.
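
One way a float-to-bool conversion can get -0.0 wrong is by testing the raw bits instead of comparing as a float. The C sketch below shows the two behaviours side by side. It assumes f2b32 is meant to act like a floating-point comparison against zero, and the function names are hypothetical; it does not reproduce the V3D code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");

/* Integer-style test: any set bit counts as true. -0.0f has only the
 * sign bit set (0x80000000), so it is reported as true. */
bool f2b_bitwise(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    return bits != 0;
}

/* Float-style test: -0.0f compares equal to 0.0f, so it is false. */
bool f2b_float(float f)
{
    return f != 0.0f;
}
```

Both functions agree on every input except negative zero, which is why this kind of difference is easy to miss in testing.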
2019-02-18 18:09:07 -08:00
Eric Anholt
3022b4bd82 v3d: Kill off vir_PF(), which is hard to use right.
You were allowed to pass in any old temp so that you could hopefully fold
the PF up into the def of the temp.  If we couldn't find one, it
implicitly generated a MOV(nop, reg).  However, that PF could have
different behavior depending on whether the def being folded into was a
float or int opcode, which the caller doesn't necessarily control.

Due to the fragility of the function, just switch all callers over to
vir_set_pf().  This also encourages the callers to use a _dest call for
the inst they're putting the PF on, eliminating a bunch of temps in the
pre-optimization VIR.

shader-db says the change is in the noise:

total instructions in shared programs: 6226247 -> 6227184 (0.02%)
instructions in affected programs: 851068 -> 852005 (0.11%)
2019-02-18 18:09:06 -08:00
Eric Anholt
6186a8d44e v3d: Do bool-to-cond for discard_if as well.
Turns this minimal conditional discard (glsl-fs-discard-01.shader_test):

0x3de0b086c5fe9000 fcmp.pushn  -, r1, r5; mov  r2, 0
0x3dec3086bbfc001f nop                  ; mov.ifa  r2, -1
0x3c047186bbe80000 nop                  ; mov.pushz  -, r2
0x3dea3186ba837000 setmsf.ifna  -, 0    ; nop

into:

0x3c00b186c582a000 fcmp.pushn  -, r2, r5; nop
0x3de83186ba837000 setmsf.ifa  -, 0     ; nop

total instructions in shared programs: 6229820 -> 6226247 (-0.06%)
2019-02-18 18:09:06 -08:00
Eric Anholt
718eef62cb v3d: Refactor bcsel and if condition handling.
Both were doing the same thing to try to get a condition to predicate on.
Noticed when I wanted to do this for discard_if as well.

No change in shader-db.
2019-02-18 18:09:06 -08:00
Eric Anholt
4586f9f902 v3d: Add a helper function for getting a nop register.
Just a little refactor to explain what's going on with QFILE_NULL.
2019-02-18 18:09:06 -08:00
Eric Anholt
339155122b v3d: Drop our hand-lowered nir_op_ffract.
The NIR lowering works fine, though it causes some slight noise due to
what looks like choices about propagating constants up multiply chains
changing.

total instructions in shared programs: 6229671 -> 6229820 (<.01%)
total uniforms in shared programs: 2312171 -> 2312324 (<.01%)
2019-02-18 18:09:06 -08:00
Eric Anholt
16f5085490 v3d: Drop a perf note about merging unpack_half_*, which has been implemented.
This is handled with copy-propagation now.
2019-02-18 18:09:06 -08:00
Eric Anholt
146e432b49 v3d: Fix incorrect flagging of ldtmu as writing r4 on v3d 4.x.
Fixes some stalls in 3DMMES's main vertex shader.

total instructions in shared programs: 6280751 -> 6211270 (-1.11%)
instructions in affected programs: 2935050 -> 2865569 (-2.37%)
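
The percentages in the shader-db lines above can be rechecked from the raw counts. A minimal sketch; percent_change is a hypothetical helper written for this note, not part of shader-db:

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Counts copied from the shader-db report above.
total = percent_change(6280751, 6211270)
affected = percent_change(2935050, 2865569)

# Both lines drop by the same absolute number of instructions; only the
# denominator differs.
print(f"total: {total:.2f}%  affected: {affected:.2f}%")
```

The absolute saving is identical in both lines (69481 instructions), so the "affected programs" percentage is larger only because it divides by a smaller base.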
2019-02-18 18:09:06 -08:00
Eric Anholt
cd5e0b2729 v3d: Use the early_fragment_tests flag for the shader's disable-EZ field.
Apparently we need disable-EZ flagged, not just "does Z writes".

Fixes
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
on 7278, even though it passed in simulation.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 051a41d3d5 ("v3d: Add support for the early_fragment_tests flag.")
2019-02-18 18:09:06 -08:00
Eric Anholt
332b969c4e v3d: Sync indirect draws on the last rendering.
Fixes intermittent fails in
dEQP-GLES31.functional.draw_indirect.compute_interop.separate.drawelements_compute_cmd_and_data_and_indices
and others (particularly when run as part of a CTS run)
2019-02-18 18:09:06 -08:00
Eric Anholt
32f16b0b1e v3d: Clear the GMP on initialization of the simulator.
Otherwise, we might have pages accessible that shouldn't be and miss out
on errors.  This is unlikely for most tests since v3d_hw_get_mem() is big
enough that it'll be a freshly zeroed mmap, but if screens are destroyed
and recreated then we'd be reusing the old v3d_hw_get_mem() contents.
2019-02-18 18:09:06 -08:00
Emil Velikov
ba652394a3 docs: update calendar, add news item and link release notes for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-18 18:38:14 +00:00
Emil Velikov
d7108dac73 docs: add sha256 checksums for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bfb5bdaa97)
2019-02-18 18:36:23 +00:00
Emil Velikov
a1ccff4aaf docs: add release notes for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b26488dead)
2019-02-18 18:36:21 +00:00
Ilia Mirkin
57441af8bf i965: always enable EXT_float_blend
From the table in isl_format.c, it appears that all generations
support blending on 32-bit float surfaces.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-18 12:13:54 -05:00
Ilia Mirkin
9fec653093 st/mesa: enable GL_EXT_float_blend when possible
If the driver supports PIPE_BIND_BLENDABLE on RGBA32F, flip
EXT_float_blend on (which will affect ES3 contexts).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-18 12:13:54 -05:00
Ilia Mirkin
070a5e5d92 mesa: add explicit enable for EXT_float_blend, and error condition
If EXT_float_blend is not supported, error out on blending of FP32
attachments in an ES2 context.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-18 12:13:54 -05:00
Samuel Pitoiset
47616810ed radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled
This version is better and safer.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 18:06:07 +01:00
Rob Clark
d6c43cceff freedreno/ir3: handle quirky atomic dst for a6xx
The new encoding returns a value via the 2nd src.  The legalize pass
needs to be aware of this to set the correct needs_sy flag.  Otherwise,
in cases where the atomic dst is not used, we can overwrite the register
that the hardware asynchronously loads the result into without the (sy)
flag, so it gets clobbered by the atomic result.

This fixes a whole lot of rando ssbo+atomic fails, like
dEQP-GLES31.functional.ssbo.layout.single_basic_type.packed.highp_vec4.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-18 12:01:36 -05:00
Rob Clark
28fc6733cd freedreno/a6xx: fix helper_invocation (sampler mask/id)
Since gl_HelperInvocation is lowered to:

  !((1 << sample_id) & sample_mask_in)

Not setting these enable bits was causing it to be broken.  (And probably a
bunch of other stuff too.)
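
The lowering quoted above can be written out as a small function. This is only a sketch of the expression; the function name and the Python integer model are assumptions, not driver code:

```python
def is_helper_invocation(sample_id: int, sample_mask_in: int) -> bool:
    """Mirror of !((1 << sample_id) & sample_mask_in).

    An invocation is a helper when the bit for its sample is not set in
    the incoming coverage mask.
    """
    return not ((1 << sample_id) & sample_mask_in)


print(is_helper_invocation(0, 0b0001))  # sample 0 covered
print(is_helper_invocation(1, 0b0001))  # sample 1 not covered
```

If sample_id or sample_mask_in is not actually delivered to the shader, the expression is evaluated on garbage, which is consistent with the breakage described above.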

Fixes dEQP-GLES31.functional.shaders.helper_invocation.*

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-18 10:37:54 -05:00
Samuel Pitoiset
32ab7a59bb radv: remove unused variable in gather_push_constant_info()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-18 13:30:16 +01:00
Lionel Landwerlin
8c87d029bc i965: scale factor changes should trigger recompile
Found by inspection.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3da858a6b9 ("intel/compiler: add scale_factors to sampler_prog_key_data")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-18 12:18:13 +00:00
Samuel Pitoiset
0d8f096293 radv: write the alpha channel of MRT0 when alpha coverage is enabled
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:22 +01:00
Samuel Pitoiset
2cf5433b99 ac: use new LLVM 8 intrinsic when loading 16-bit values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:20 +01:00
Samuel Pitoiset
f0223143a8 ac: add ac_build_llvm8_tbuffer_load() helper
It uses the new LLVM intrinsics.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:17 +01:00
Tapani Pälli
9762a9f893 mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachment
This fixes invalid access to Attachment array which would occur if caller
would exceed MaxColorAttachments. In practice this should not ever happen
because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be
valid and InvalidateFramebuffer will error out before but this should
make coverity happy.
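
The shape of the fix is a bounds check before indexing. A minimal sketch, with a hypothetical function name and a plain list standing in for the Attachment array:

```python
from typing import Optional, Sequence


def get_color_attachment(attachments: Sequence[str], index: int,
                         max_color_attachments: int) -> Optional[str]:
    """Return the attachment at `index`, or None when it is out of range.

    `max_color_attachments` plays the role of MaxColorAttachments; the
    names here are illustrative only.
    """
    if index < 0 or index >= max_color_attachments:
        return None
    return attachments[index]


slots = ["color0", "color1", "color2", "color3"]
print(get_color_attachment(slots, 1, max_color_attachments=4))
print(get_color_attachment(slots, 7, max_color_attachments=4))
```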

v2: const, remove _EXT (Ian)

CID: 1442559
Fixes: 0c42b5f3cb "mesa: wire up InvalidateFramebuffer"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-18 07:51:55 +02:00
Alyssa Rosenzweig
2c6a7fbeb7 panfrost: Fix clipping region
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:50 +00:00
Alyssa Rosenzweig
fa1b36ddc2 panfrost: Preserve w sign in perspective division
This fixes issues where polygons that should be culled (due to negative
w, for instance) may not be.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:34 +00:00
Alyssa Rosenzweig
49985cebea panfrost: Cleanup mali_viewport (clipping) code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:03 +00:00
Alyssa Rosenzweig
a94463732a panfrost: Swap order of tiled texture (de)alloc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:10:33 +00:00
Alyssa Rosenzweig
4a4ed53c01 panfrost: Free imported BOs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:10:06 +00:00
Alyssa Rosenzweig
b5a01296f4 panfrost: Fix various leaks unmapping resources
v2: Don't check for NULL before free()

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:09:41 +00:00
Kenneth Graunke
535251487b nir: Don't reassociate add/mul chains containing only constants
The idea here is to reassociate a * (b * c) into (a * c) * b, when
b is a non-constant value, but a and c are constants, allowing them
to be combined.

But nothing was enforcing that 'b' must be non-constant, which meant
that running opt_algebraic in a loop would never terminate if the IR
contained non-folded constant expressions like 256 * 0.5 * 2.  Normally,
we call constant folding in such a loop too, but IMO it's better for
nir_opt_algebraic to be robust and not rely on that.
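
A toy model of the guarded rule, assuming a tiny tuple-based expression format invented for this note (numbers are constants, strings are non-constant values). It only illustrates the "b must be non-constant" condition and is not the nir_opt_algebraic pattern itself:

```python
def is_const(expr) -> bool:
    """Numbers are constants; strings stand for non-constant SSA values."""
    return isinstance(expr, (int, float))


def reassociate(expr):
    """Rewrite a * (b * c) into (a * c) * b, but only if b is non-constant.

    Returns (new_expr, progress).
    """
    if isinstance(expr, tuple) and expr[0] == "mul":
        _, a, inner = expr
        if isinstance(inner, tuple) and inner[0] == "mul":
            _, b, c = inner
            if is_const(a) and is_const(c) and not is_const(b):
                return ("mul", ("mul", a, c), b), True
    return expr, False


def run_to_fixed_point(expr, limit: int = 100):
    """Apply the rule until it stops making progress; return (expr, passes)."""
    for passes in range(limit):
        expr, progress = reassociate(expr)
        if not progress:
            return expr, passes
    raise RuntimeError("rule never stopped reporting progress")


print(run_to_fixed_point(("mul", 256, ("mul", "x", 2))))
print(run_to_fixed_point(("mul", 256, ("mul", 0.5, 2))))
```

With the guard, the all-constant chain is left alone and reports no progress, so an optimization loop built around the rule can terminate without depending on constant folding.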

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109581
Fixes: 32e266a9a5 i965: Compile fp64 funcs only if we do not have 64-bit hardware support

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-16 23:36:14 -08:00
Chris Wilson
e9882b879b i965: Assert the execobject handles match for this device
Object handles are local to the device fd, so double check we are not
mixing together objects from multiple screens on execbuf submission.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-16 23:35:29 -08:00
Rob Clark
99b90ecd35 freedreno/a6xx: cache flush harder
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
1af0c5d320 freedreno/a6xx: compute support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
5118dcf8c3 freedreno/a6xx: image/ssbo state emit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
2183d9cff7 freedreno/a6xx: border-color offset helper
Soon we'll need this logic to deal w/ image/SSBO case, so split out a
helper rather than duplicate the logic.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
c1a27ba9ba freedreno/ir3: HIGH reg w/a for a6xx
It seems like some instructions (noticed this w/ cat3) cannot read HIGH
regs.  cat1 (mov/cov) can, and possibly some/all of cat2.

The blob seems to stick w/ an extra mov into low regs.  So let's do the
same.

This fixes WGID on a6xx, which unsurprisingly is related to a lot of
deqp compute fails.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
947848524d freedreno/ir3: add a6xx+ SSBO/image support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
b46d5b8a84 freedreno/ir3: add a6xx instruction encoding
For the handful of instructions that use a new encoding.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
2e0ea3f09c freedreno/ir3: add image/ssbo <-> ibo/tex mapping
Images and SSBOs don't map directly to the hw.  They end up being part
texture and part something else.  Starting with a6xx, the hack used for
a5xx to smash the image tex state into hw texture state starting from
MAX counting down won't work, because we start using tex state also for
SSBO read.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
75f3a5245e freedreno/ir3: fix ncomp for _store_image() src
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
feee3050d3 freedreno/ir3: split out a4xx+ instructions
Note that image/ssbo support is currently only implemented for a5xx.
But the instruction encoding is the same for a4xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
42af0640f6 freedreno/ir3: split out image helpers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
aefdb9bed2 freedreno/a6xx: clean up some open-coded bits
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
b51de44dea freedreno/a6xx: move stream-out emit to helper
Split out of the main fd6_emit() code, since it was already getting to
be a pretty giant function.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:26:14 -05:00
Rob Clark
c0d6be11d6 freedreno/ir3: fix varying packing vs. tex sharp edge
We probably need to rethink how we detect which instruction first
defines higher register classes.  But for now, this at least fixes
the symptom.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:26:14 -05:00
Samuel Pitoiset
52bdb043af radv: fix invalid element type when filling vertex input default values
The elements added into a vector should have the same type as the
first one, otherwise this hits an assertion in LLVM.

Fixes: 4b3549c084 ("radv: reduce the number of loaded channels for vertex input fetches")
reported-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-16 15:33:18 +01:00
Eleni Maria Stea
7188e2ba15 i965: Removed the field etc_format from the struct intel_mipmap_tree
After the previous changes to emulate the ETC/EAC formats using the
secondary shadow miptree, the etc_format field of the intel_mipmap_tree
struct became redundant and the remaining check that used it has been
replaced. (Nanley Chery)

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-15 15:54:41 -08:00
Eleni Maria Stea
248f2e7888 i965: Enabled the OES_copy_image extension on Gen 7 GPUs
The OES_copy_image extension was disabled on Gen7 due to the lack of
support for ETC2 images. Re-enabled it. (Kenneth Graunke)

v2:
  - Removed the blank lines in the comments above OES_copy_image and
  OES_texture_view extensions in intel_extensions.c (Nanley Chery)

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-15 15:54:41 -08:00
Eleni Maria Stea
db0c379c06 i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.

v2:
  - Added assertion that the miptree doesn't need update at the time we
  update the texture surface. (Nanley Chery)

v3:
  - As we now update the tree before the rendering we don't need to copy
  the data during the unmap anymore. Removed the unnecessary update from
  the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery)

v4:
  - Fixed unrelated empty line removal (Nanley Chery)
  - As now the intel_update_etc_shadow of intel_mipmap_tree.c is only
  called inside its following function, we don't need to declare it at
  the top of the file anymore. (Nanley Chery)

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-15 15:54:41 -08:00
Eleni Maria Stea
d8eb7287fe i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
compressed EAC/ETC2 images to non-compressed RGBA images. When
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering and the main miptree to store the
compressed data so that the Get functions work. Each time the main miptree
is written with compressed data, we decompress it to RGB and update the
shadow. Then we use the shadow for rendering.
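
The two-miptree scheme can be sketched as a small class. Everything here is illustrative: the class, its methods and fake_decompress are hypothetical stand-ins, and only the shadow_needs_update flag name comes from the changelog below:

```python
def fake_decompress(blocks):
    """Stand-in for ETC2 decoding: tags each block instead of decoding it."""
    return [("rgba", block) for block in blocks]


class EmulatedEtcTexture:
    def __init__(self):
        self.main = []                    # compressed data, as uploaded
        self.shadow = []                  # decompressed copy used for rendering
        self.shadow_needs_update = False
        self.decompress_count = 0

    def write_compressed(self, blocks):
        """Any write to the main tree marks the shadow as stale."""
        self.main = list(blocks)
        self.shadow_needs_update = True

    def get_compressed(self):
        """GetCompressed*-style queries read the main tree unchanged."""
        return list(self.main)

    def prepare_for_render(self):
        """Refresh the shadow lazily, right before it is sampled."""
        if self.shadow_needs_update:
            self.shadow = fake_decompress(self.main)
            self.decompress_count += 1
            self.shadow_needs_update = False
        return self.shadow


tex = EmulatedEtcTexture()
tex.write_compressed(["block0", "block1"])
print(tex.get_compressed())
print(tex.prepare_for_render())
```

Keeping a dirty flag means repeated draws without an intervening write reuse the existing shadow instead of decompressing again.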

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reversed the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unnecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery) the following change was
   necessary: Split the previous update function that was updating all
   the mipmap levels into two functions: one that updates one
   level and one that updates all of them. Used the first during unmap
   and the second before the rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time and reversed all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)
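
On the "minify instead of division" item above: a mip level's size is the base size shifted right by the level, clamped to at least 1. A sketch of that computation, assuming the usual max(1, value >> levels) definition:

```python
def minify(value: int, levels: int) -> int:
    """Size of a dimension `levels` mip levels down, never smaller than 1."""
    return max(1, value >> levels)


base_width = 4
for level in range(5):
    # Plain division reaches 0 for small textures; minify stops at 1.
    print(level, base_width // (1 << level), minify(base_width, level))
```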

v3:
  - Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always in
  rgba order. (Nanley Chery)
  - Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
  Chery)
  - Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
  - Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
  - Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
  - Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)

v4:
  - Removed the functions intel_miptree_(map|unmap)_etc and the check if
   we need to call them as with the new changes, they became unreachable.
   (Nanley Chery)
  - We'd rather calculate the level width and height using the shadow
  miptree instead of the main in intel_miptree_update_etc_shadow_levels of
  intel_mipmap_tree.c (Nanley Chery)
  - Fixed the format in the mt_surface_usage, set at the miptree creation,
   in miptree_create of intel_mipmap_tree.c (Nanley Chery)

v5:
  - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
  - Update the flag shadow_needs_update outside the function
  intel_miptree_update_etc_shadow (Nanley Chery)
  - Fixed indentation error (Nanley Chery)

v6:
  - Fixed typo in commit message (Nanley Chery)
  - Simplified the assignment of the mt_fmt in the miptree_create of the
  intel_mipmap_tree.c (Nanley Chery)
  - Combined declarations and assignments where it was possible in the
  intel_miptree_update_etc_shadow and
  intel_miptree_update_etc_shadow_levels of the intel_mipmap_tree.c
  (Nanley Chery)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-15 15:54:41 -08:00
Nanley Chery
c6dada70f0 i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea <estea@igalia.com>
2019-02-15 15:54:41 -08:00
Timothy Arceri
a801196ec9 nir: remove simple dead if detection from nir_opt_dead_cf()
This was probably useful when it was first written, however it
looks to be no longer necessary.

As far as I can tell these days dce is smart enough to remove useless
instructions from if branches. Once this is done
nir_opt_peephole_select() will end up removing the empty if.

Removing this support reduces the dolphin uber shader compilation
time spent in nir_opt_dead_cf() by a little over 7x.

No shader-db changes on i965 or radeonsi.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-02-16 10:45:31 +11:00
Alok Hota
f695e43354 swr/rast: Add translation support to streamout
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:29 -06:00
Alok Hota
a7fa0cc0a5 swr/rast: simdlib cleanup, clipper stack space fixes
Reduce stack space used by clipper, which had led to crashes in some
versions of MSVC

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:23 -06:00
Alok Hota
f9c29a301a swr/rast: convert DWORD->uint32_t, QWORD->uint64_t
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:19 -06:00
Alok Hota
c503b58878 swr/rast: Refactor scratch space variable names
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:14 -06:00
Alok Hota
0b4db43705 swr/rast: FP consistency between POSH/RENDER pipes
- Ensure all threads have optimal floating-point control state
- Disable auto-generation of fused FP ops for VERTEX shader stage
- Disable "fast" FP ops for VERTEX shader stage

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:09 -06:00
Alok Hota
dc7b3c95a4 swr/rast: Move knob defaults to generated cpp file
Reduces amount of compile churn when testing different default values

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:04 -06:00
Alok Hota
05e4ff33f5 swr/rast: Flip BitScanReverse index calculation
The intrinsic returns the number of leading zeros, not the bit number of
the first nonzero, so just flip it based on the mask size
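
The flip described above, sketched with hypothetical helper names and Python integers standing in for the fixed-width mask:

```python
def leading_zeros(mask: int, bits: int = 32) -> int:
    """Count of zero bits above the highest set bit of a nonzero mask."""
    if not 0 < mask < (1 << bits):
        raise ValueError("mask must be nonzero and fit in `bits` bits")
    return bits - mask.bit_length()


def bit_scan_reverse(mask: int, bits: int = 32) -> int:
    """Index of the highest set bit, derived from the leading-zero count."""
    return (bits - 1) - leading_zeros(mask, bits)


print(leading_zeros(0b1010), bit_scan_reverse(0b1010))
```

For a 32-bit mask of 0b1010 the leading-zero count is 28, and flipping it against the mask size gives bit index 3.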

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:53:58 -06:00
Alok Hota
ae400a9b11 swr/rast: Correctly align 64-byte spills/fills
Fixes crashes on some compute shaders when running on AVX512

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:53:54 -06:00
Alok Hota
78bab66479 swr/rast: Disable use of __forceinline by default
- Was not useful to inline in release builds
- FORCEINLINE can be used if absolutely necessary

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:52:51 -06:00
Alok Hota
20d5c88760 swr/rast: Convert system memory pointers to gfxptr_t
Fulfills an unused internal interface

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:52:32 -06:00
Bas Nieuwenhuizen
4b03a19a0b radv: Use correct num formats to detect whether we should use 1.0 or 1.
Normalized and scaled formats also return floats.

Fixes: 4b3549c084 ("radv: reduce the number of loaded channels for vertex input fetches")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-15 20:24:16 +00:00
Ian Romanick
979b43b347 nir/algebraic: Simplify comparison with sequential integers starting with 0
All of the affected shaders are Unreal4 demos.

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 15437170 -> 15437001 (<.01%)
instructions in affected programs: 21536 -> 21367 (-0.78%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.93 x̃: 4
helped stats (rel) min: 0.68% max: 1.01% x̄: 0.80% x̃: 0.80%
95% mean confidence interval for instructions value: -4.07 -3.79
95% mean confidence interval for instructions %-change: -0.83% -0.77%
Instructions are helped.

total cycles in shared programs: 383007896 -> 383007378 (<.01%)
cycles in affected programs: 158640 -> 158122 (-0.33%)
helped: 38
HURT: 4
helped stats (abs) min: 1 max: 48 x̄: 13.89 x̃: 6
helped stats (rel) min: 0.03% max: 1.01% x̄: 0.33% x̃: 0.19%
HURT stats (abs)   min: 2 max: 3 x̄: 2.50 x̃: 2
HURT stats (rel)   min: 0.06% max: 0.09% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: -16.90 -7.77
95% mean confidence interval for cycles %-change: -0.39% -0.19%
Cycles are helped.

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8213746 -> 8213745 (<.01%)
instructions in affected programs: 127 -> 126 (-0.79%)
helped: 1
HURT: 0

total cycles in shared programs: 187734146 -> 187734144 (<.01%)
cycles in affected programs: 2132 -> 2130 (-0.09%)
helped: 1
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-15 11:11:02 -08:00
Ian Romanick
ad05920258 nir/algebraic: Convert some f2u to f2i
Section 5.4.1 (Conversion and Scalar Constructors) of the GLSL 4.60 spec
says:

     It is undefined to convert a negative floating-point value to an
     uint.

Assuming that (uint)some_float behaves like (uint)(int)some_float allows
some optimizations in the i965 backend to proceed.
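
A rough model of why the substitution is safe for defined inputs. The helper names are hypothetical and Python integers do not overflow, so this only covers non-negative values that fit in a signed 32-bit integer:

```python
MASK32 = 0xFFFFFFFF


def f2u_direct(x: float) -> int:
    """Model of uint(x); negative inputs are undefined per the quoted spec text."""
    if x < 0:
        raise ValueError("undefined: negative float converted to uint")
    return int(x) & MASK32


def f2u_via_f2i(x: float) -> int:
    """Model of uint(int(x)): truncate to a signed value, then reinterpret."""
    return int(x) & MASK32


# Both paths agree on every sampled value in [0, 2**31).
for value in (0.0, 1.9, 123456.7, 2147483520.0):
    assert f2u_direct(value) == f2u_via_f2i(value)
    print(value, f2u_via_f2i(value))
```

The two paths can only differ on negative inputs, which is exactly the case the spec leaves undefined.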

This basically undoes the small amount of damage done by
"intel/compiler: Avoid propagating inequality cmods if types are
different".

v2: Replicate part of the commit message as a comment in the code.
Suggested by Jason.

shader-db results comparing *before* "intel/compiler: Avoid propagating
inequality cmods if types are different" and after this commit:

Skylake
total cycles in shared programs: 383007996 -> 383007896 (<.01%)
cycles in affected programs: 85208 -> 85108 (-0.12%)
helped: 13
HURT: 8
helped stats (abs) min: 2 max: 26 x̄: 10.77 x̃: 6
helped stats (rel) min: 0.09% max: 0.65% x̄: 0.28% x̃: 0.14%
HURT stats (abs)   min: 2 max: 12 x̄: 5.00 x̃: 3
HURT stats (rel)   min: 0.04% max: 0.32% x̄: 0.12% x̃: 0.07%
95% mean confidence interval for cycles value: -9.31 -0.21
95% mean confidence interval for cycles %-change: -0.24% <.01%
Cycles are helped.

Broadwell
total cycles in shared programs: 415251194 -> 415251370 (<.01%)
cycles in affected programs: 83750 -> 83926 (0.21%)
helped: 7
HURT: 13
helped stats (abs) min: 10 max: 12 x̄: 11.43 x̃: 12
helped stats (rel) min: 0.30% max: 0.30% x̄: 0.30% x̃: 0.30%
HURT stats (abs)   min: 2 max: 36 x̄: 19.69 x̃: 22
HURT stats (rel)   min: 0.05% max: 0.89% x̄: 0.44% x̃: 0.47%
95% mean confidence interval for cycles value: 0.76 16.84
95% mean confidence interval for cycles %-change: <.01% 0.37%
Inconclusive result (%-change mean confidence interval includes 0).

Haswell
total instructions in shared programs: 13823885 -> 13823886 (<.01%)
instructions in affected programs: 2249 -> 2250 (0.04%)
helped: 0
HURT: 1

total cycles in shared programs: 390094243 -> 390094001 (<.01%)
cycles in affected programs: 85640 -> 85398 (-0.28%)
helped: 15
HURT: 6
helped stats (abs) min: 4 max: 26 x̄: 18.53 x̃: 18
helped stats (rel) min: 0.09% max: 0.66% x̄: 0.47% x̃: 0.42%
HURT stats (abs)   min: 2 max: 14 x̄: 6.00 x̃: 2
HURT stats (rel)   min: 0.04% max: 0.37% x̄: 0.15% x̃: 0.04%
95% mean confidence interval for cycles value: -17.36 -5.69
95% mean confidence interval for cycles %-change: -0.44% -0.14%
Cycles are helped.

Ivy Bridge
total cycles in shared programs: 180986448 -> 180986552 (<.01%)
cycles in affected programs: 34835 -> 34939 (0.30%)
helped: 0
HURT: 10
HURT stats (abs)   min: 2 max: 18 x̄: 10.40 x̃: 10
HURT stats (rel)   min: 0.06% max: 0.36% x̄: 0.28% x̃: 0.30%
95% mean confidence interval for cycles value: 4.67 16.13
95% mean confidence interval for cycles %-change: 0.20% 0.35%
Cycles are HURT.

Sandy Bridge
total cycles in shared programs: 154603969 -> 154603970 (<.01%)
cycles in affected programs: 171514 -> 171515 (<.01%)
helped: 25
HURT: 14
helped stats (abs) min: 1 max: 4 x̄: 1.80 x̃: 1
helped stats (rel) min: 0.02% max: 0.10% x̄: 0.04% x̃: 0.04%
HURT stats (abs)   min: 1 max: 8 x̄: 3.29 x̃: 3
HURT stats (rel)   min: 0.03% max: 0.28% x̄: 0.10% x̃: 0.11%
95% mean confidence interval for cycles value: -0.91 0.96
95% mean confidence interval for cycles %-change: -0.02% 0.04%
Inconclusive result (value mean confidence interval includes 0).

No changes on Iron Lake or GM45.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-15 11:11:02 -08:00
Matt Turner
ac21dd4aee intel/compiler/test: Add unit test for mismatched signedness comparison
v2 (idr): Move adding the test to after adding the fix.  Reordering the
two commits prevents possible headaches for git-bisect with scripts that
always do 'ninja check'.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-15 11:11:02 -08:00
Matt Turner
2dff9a66b6 intel/compiler: Avoid propagating inequality cmods if types are different
v2: Fix silly bug in logic.  s/||/&&/

All but one of the affected shaders is in an Unreal4 demo.  The other is
in Tomb Raider.  All of the cases that Ian investigated appear to be
sequences like the following

    if (int(uint(some_float)) < 0) /* other relations too */
        ...

At least in Tomb Raider, it's not obvious that this sequence came from
the original shader.

In some of the Unreal demos, the shader contains code like

    if (int(uint(textureLod(...))) > 0)
        ...

which explicitly generates the offending sequence.
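
Why the types matter for an inequality: the same 32 bits order differently against zero when read as signed or unsigned. A small sketch with hypothetical helpers that emulate 32-bit reinterpretation:

```python
def as_u32(bits: int) -> int:
    """Interpret the low 32 bits as an unsigned integer."""
    return bits & 0xFFFFFFFF


def as_i32(bits: int) -> int:
    """Interpret the low 32 bits as a two's-complement signed integer."""
    bits &= 0xFFFFFFFF
    return bits - (1 << 32) if bits & 0x80000000 else bits


bits = 3000000000          # a uint result with the top bit set
print(as_u32(bits) < 0)    # unsigned view: never negative
print(as_i32(bits) < 0)    # signed view, as in int(uint(x)) < 0
```

A "less than zero" flag computed for the unsigned producer therefore cannot be reused for the signed comparison, which is the kind of propagation this commit avoids.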

All Gen6+ platforms had similar results (Skylake shown):
total instructions in shared programs: 15437170 -> 15437187 (<.01%)
instructions in affected programs: 4492 -> 4509 (0.38%)
helped: 0
HURT: 17
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.05% max: 0.73% x̄: 0.66% x̃: 0.73%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.57% 0.75%
Instructions are HURT.

total cycles in shared programs: 383007996 -> 383007992 (<.01%)
cycles in affected programs: 20542 -> 20538 (-0.02%)
helped: 6
HURT: 7
helped stats (abs) min: 2 max: 6 x̄: 5.33 x̃: 6
helped stats (rel) min: 0.11% max: 0.36% x̄: 0.32% x̃: 0.36%
HURT stats (abs)   min: 4 max: 4 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.27% max: 0.27% x̄: 0.27% x̃: 0.27%
95% mean confidence interval for cycles value: -3.30 2.69
95% mean confidence interval for cycles %-change: -0.19% 0.19%
Inconclusive result (value mean confidence interval includes 0).

No changes on Iron Lake or GM45.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: nagrigoriadis@gmail.com
Tested-by: Danylo Piliaiev <danylo.piliaiev@gmail.com>
2019-02-15 11:11:02 -08:00
Matt Turner
e50db60d16 intel/compiler/test: Set devinfo->gen = 7
We emit an FBL instruction which only exists since Gen7. This prevents
the test from segfaulting when run with TEST_DEBUG=1.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-15 11:11:02 -08:00
James Zhu
9364d66cb7 gallium/auxiliary/vl: Add video compositor compute shader render
Add compute shader initialization, assign and cleanup in vl_compositor API.
Set video compositor compute shader render as default when the pipe supports it.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
James Zhu
f6ac0b5d71 gallium/auxiliary/vl: Add compute shader to support video compositor render
Add compute shader to support video compositor render.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
James Zhu
299e2bc046 gallium/auxiliary/vl: Rename csc_matrix and increase its size.
Rename csc_matrix to shader_params, and increase shader_params size
to store more constants for compute shader.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
James Zhu
7b7b5f2029 gallium/auxiliary/vl: Split vl_compositor graphic shaders from vl_compositor API
Split vl_compositor graphic shaders from vl_compositor API in order to share
vl_compositor API with vl_compositor compute shader later.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
James Zhu
b34d7c5daa gallium/auxiliary/vl: Move dirty define to header file
Move dirty define to header file to share with compute shader.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
Juan A. Suarez Romero
1fb24080b7 nir: remove jump from two merging jump-ending blocks
In the opt_peel_initial_if optimization, when moving the continue list to
the end of the continue block, before the jump, it could happen that the
continue list itself also ends with a jump.

This would mean that we would have two jump instructions in a row: the
first one from the continue list and the second one from the continue
block.

As inserting an instruction after a jump is not allowed (and it does not
make sense, as it will not be executed), remove the jump from the
continue block and keep the one from continue list, as it will be
executed first.

CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-15 15:16:24 +01:00
Juan A. Suarez Romero
69be9934a7 nir: move ALU instruction before the jump instruction
opt_split_alu_of_phi moves the ALU instruction to the end of the continue
block.

But if the continue block ends with a jump instruction (an explicit
"continue" instruction) then the ALU must be inserted before the jump,
as it is illegal to add instructions after the jump.
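
As a rough illustration of the rule above (a sketch only, not the NIR code): if a block is modelled as an array of instruction kinds, the position for the moved ALU instruction is the end of the block, unless the last instruction is a jump, in which case it is the slot just before it. The enum and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction kinds; real NIR has more. */
enum example_instr_type {
   EXAMPLE_INSTR_ALU,
   EXAMPLE_INSTR_JUMP,
};

/*
 * Return the index at which a new instruction should be inserted so that
 * it ends up last in the block but never after a terminating jump.
 */
static inline size_t
example_insert_index(const enum example_instr_type *instrs, size_t count)
{
   assert(instrs != NULL || count == 0);

   if (count > 0 && instrs[count - 1] == EXAMPLE_INSTR_JUMP)
      return count - 1;   /* insert before the jump */

   return count;          /* no jump: append at the end */
}
```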

CC: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 0881e90c09 ("nir: Split ALU instructions in loops that read phis")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-15 15:14:36 +01:00
Andres Gomez
a43596df62 mesa: INVALID_VALUE for wrong type or format in Clear*Buffer*Data
Instead of generating a GL_INVALID_ENUM error when the type or format
is incorrect while using glClear{Named}Buffer{Sub}Data, generate
GL_INVALID_VALUE.

From page 72 (page 94 of the PDF) of the OpenGL 4.6 spec:

  " An INVALID_VALUE error is generated if type is not one of the
    types in table 8.2.

    An INVALID_VALUE error is generated if format is not one of the
    formats in table 8.3."

Fixes the following test:
KHR-GL45.direct_state_access.buffers_errors

v2: correct the doxygen documentation.

Cc: Pi Tabred <servuswiegehtz@yahoo.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-15 14:28:06 +02:00
Gurchetan Singh
67426ccd42 virgl: use virgl_transfer_inline_write even less
We've noticed the Team Fortress 2 engine seems to do many small
calls to glBufferSubData(..). Let's pick our heuristic based on the
resource base width, not the size of a particular upload.
This will cause transfers to be batched together in the transfer
queue.

Relevant glbench microbenchmark --

Before: buffer_upload_dynamic_element_array_131072 = 131.17 mbytes_sec
After: buffer_upload_dynamic_element_array_131072 = 6828.24 mbytes_sec
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
f0e71b1088 virgl: use transfer queue
This improves Unigine Valley benchmark by 3 to 10 fps (depending
on the scene).

It also improves the Team Fortress 2 benchmark from 6 fps to 13
fps (host: 20 fps).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
4a7857b377 virgl: introduce transfer queue
Transfers will be placed here at unmap time instead of incurring
a VM exit. There's an attempt to deduplicate intersecting 1D transfers,
which are surprisingly common.
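
A minimal sketch of the deduplication idea, assuming each queued 1D transfer can be described by a half-open byte range on the same resource. This is not the virgl implementation; `example_range1d` and both helpers are hypothetical names. Two intersecting ranges are collapsed into one range covering their union.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open byte range [start, end) within one buffer. */
struct example_range1d {
   uint32_t start;
   uint32_t end;
};

/* True if the two ranges share at least one byte. */
static inline bool
example_range1d_intersects(struct example_range1d a, struct example_range1d b)
{
   return a.start < b.end && b.start < a.end;
}

/*
 * If `incoming` intersects `*queued`, grow `*queued` to cover both and
 * return true, so the caller can drop the incoming transfer. Otherwise
 * leave `*queued` untouched and return false.
 */
static inline bool
example_range1d_try_merge(struct example_range1d *queued,
                          struct example_range1d incoming)
{
   assert(queued->start < queued->end && incoming.start < incoming.end);

   if (!example_range1d_intersects(*queued, incoming))
      return false;

   if (incoming.start < queued->start)
      queued->start = incoming.start;
   if (incoming.end > queued->end)
      queued->end = incoming.end;
   return true;
}
```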

This can also help with mipmapped texture upload and smaller
textures, where the majority of the time is spent in the guest
kernel / QEMU -- not virglrenderer.  This is shown by the GLbench
texture upload benchmark:

Before:
    texture_upload_rgba_teximage2d_32 = 64.23 mtexel_sec
After:
    texture_upload_rgba_teximage2d_32 = 367.44 mtexel_sec

v2: Split up list iteration functions (@gerddie)
v3: Support for optimizing glBufferSubData
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
9c4930946a virgl: add encoder functions for new protocol
Let's encode the new protocol with new helper functions.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
5510cc67e0 virgl: make winsys modifications for encoded transfers
The idea is to have two command buffers:

1) One for transfers
2) One for commands, which can include transfers

At flush time, (2) will be filled.  Otherwise, (1) will be
used to submit transfers if there are enough of them.

v2: Pass size directly to cmd_buf_create (@gerddie)
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
90e9650585 virgl: add extra checks in virgl_res_needs_flush_wait
This is motivated by the following scenario:

glBufferSubData(GL_ARRAY_BUFFER, ...)
glFlush(..)
glBufferSubData(GL_ARRAY_BUFFER, ...)
glBufferSubData(GL_ARRAY_BUFFER, ...)
glBufferSubData(GL_ARRAY_BUFFER, ...)

This increases @davidriley's Team Fortress 2 apitrace from
1 fps to 6 fps and helps with the Chromium glbench
microbenchmarks:

Before: texture_update_rgba_texsubimage2d_2048 = 554.96 mtexel_sec
   buffer_upload_dynamic_array_12 = 0.02 mbytes_sec
   buffer_upload_dynamic_array_576 = 1.07 mbytes_sec
After: texture_update_rgba_texsubimage2d_2048 = 612.29 mtexel_sec
   buffer_upload_dynamic_array_12 = 2.22 mbytes_sec
   buffer_upload_dynamic_array_576 = 164.89 mbytes_sec
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
ab6ea6e9ce virgl: pass virgl transfer to virgl_res_needs_flush_wait
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
d98fbd9c92 virgl: keep track of number of computations
It's good to keep track of these things.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
35515985a9 virgl: limit command length to 16 bits
Much of our logic is based around the idea that the upper 16 bits
of a command dword can encode the length of the command.

Now that the command buffer is >= 2^16 - 1, we should check for
this.
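
To illustrate the constraint, here is a hedged sketch with hypothetical names (this is not the virgl encoder itself). It assumes a header dword whose upper 16 bits hold the length, as described above, and, purely for illustration, whose lower 16 bits hold an opcode. Any length above 0xffff would not fit in the field and so has to be rejected before encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest length that fits in the upper 16 bits of a header dword. */
#define EXAMPLE_MAX_CMD_DWORDS 0xffffu

/* True if a command of this many dwords can be described by the header. */
static inline bool
example_cmd_length_fits(uint32_t num_dwords)
{
   return num_dwords <= EXAMPLE_MAX_CMD_DWORDS;
}

/* Pack an opcode (low 16 bits, assumed layout) and a length (upper 16 bits). */
static inline uint32_t
example_cmd_header(uint16_t opcode, uint32_t num_dwords)
{
   assert(example_cmd_length_fits(num_dwords));
   return (uint32_t)opcode | (num_dwords << 16);
}

/* Recover the length field from a header dword. */
static inline uint32_t
example_cmd_header_length(uint32_t header)
{
   return header >> 16;
}
```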

v2: alignment, and only check VIRGL_ENCODE_MAX_DWORDS
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
503ffe46bb virgl: use virgl_transfer in inline write
Let's define a helper function and use it.

This commit also allows resources to be emitted into different command
buffers.

Like the ioctls, send 0 for layer_stride and stride.  If we actually
send the real values, there are various assumptions in virglrenderer
for non-1D buffers that may need to be modified.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
0fcd48bac5 virgl: add protocol for resource transfers
Mostly similar to VIRGL_CCMD_RESOURCE_INLINE_WRITE.  However, this
uses the resource's already attached iovecs rather than the command
buffer to transfer the data.

v2: Used (1 << 16) not (1 << 15) [@gerddie]
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
168c3ffce3 virgl: when creating / freeing transfers, pass slab pool directly
This will allow us to destroy transfers w/o having a pointer
to the context.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
d5c2dacc15 virgl: unmap uploader at flush time
This should save some memory when allocating and freeing transfers.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
14f265b533 virgl: make alignment smaller when uploading index user buffers
Since we're just uploading to guest memory, let's just align to dword
size.

Fixes: e0f932 ("u_upload_mgr: pass alignment to u_upload_data manually")
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
7626e6e189 virgl: track level cleanliness rather than resource cleanliness
This allows a minor optimization for texture upload.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
c19aedcf1a virgl: don't mark unclean after a flush
The guest memory is still clean until host GL touches it,
which we should track elsewhere.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
5b6a2ae987 virgl: use virgl_resource_dirty helper
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
1d294ad264 virgl: add ability to do finer grain dirty tracking
There are levels to cleanliness.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Alyssa Rosenzweig
acc52fff20 panfrost: Improve logging and patch memory leaks
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:47:54 +00:00
Alyssa Rosenzweig
c70ed4ca18 panfrost: Don't align framebuffer dims
Fixes regressions with EGL clients

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:46:30 +00:00
Alyssa Rosenzweig
5155bcf099 panfrost: Implement PIPE_QUERY_OCCLUSION_COUNTER
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:46:02 +00:00
Alyssa Rosenzweig
2d22b5380c panfrost: Identify MALI_OCCLUSION_PRECISE bit
Setting this is required for desktop-style occlusion queries.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:45:56 +00:00
Tapani Pälli
595af46f0f drirc/i965: add option to disable 565 configs and visuals
We have cases where we would not like to expose these.

v2: call the option allow_rgb565_configs for consistency
    with existing allow_rgb10_configs (Eric, Jason)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-15 09:38:36 +02:00
Alyssa Rosenzweig
97aa05470a panfrost: Backport driver to Mali T600/T700
There are a few differences between Mali T860 (Panfrost's primary
reference target) and the older Midgard generations (T600/T700):

 - Miscellaneous different magic numbers. It's not clear what these
numbers mean on either the old or new configurations yet.

 - Errata fixes. T800 is the final Midgard generation and presumably the
least buggy. Older Midgard has some extra hardware errata we have to
work around.

 - SFBD vs MFBD split. Essentially, older Midgard uses a Single
FrameBuffer Descriptor (SFBD), which corresponds to single
render-target rendering. Newer Midgard (T760+) uses a Multiple
FrameBuffer Descriptor (MFBD), allowing multiple RTs. On ES 2.0, these
descriptors serve the same function, but we implement both, depending on
the version of the hardware.

 - CPU bitness. 32-bit systems generally use 32-bit GPU descriptors, and
64-bit systems use 64-bit ones. Our target T760 systems are 32-bit whereas
our target T860 systems are 64-bit. More work is needed in this area.

This patch addresses these areas to support older Midgard hardware. It is
tested on Mali T760 and Mali T860.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:22:42 +00:00
Alyssa Rosenzweig
f96e871c26 panfrost: Fix build; depend on libdrm
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:19:43 +00:00
Jason Ekstrand
08bfd710a2 nir/dead_cf: Stop relying on liveness analysis
The liveness analysis pass is fairly expensive because it has to build
large bit-sets and run a fix-point algorithm on them.  Instead of
requiring liveness for detecting if values escape a CF node, just take
advantage of the structured nature of NIR and use block indices instead.
This only requires the block index metadata, which is the fastest metadata
we have to generate.
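
The idea can be sketched as follows, under the assumption that blocks are numbered in program order, so every block inside a control-flow node has an index strictly between that of the block before the node and that of the block after it. The types and names below are hypothetical stand-ins, not the NIR data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one use of an SSA value: only the index of
 * the block that contains the use matters for this check. */
struct example_use {
   unsigned block_index;
};

/*
 * Return true if every use lies strictly between the block preceding the
 * control-flow node and the block following it, i.e. the value does not
 * escape the node.
 */
static inline bool
example_def_only_used_inside(const struct example_use *uses, size_t num_uses,
                             unsigned before_index, unsigned after_index)
{
   assert(before_index < after_index);

   for (size_t i = 0; i < num_uses; i++) {
      if (uses[i].block_index <= before_index ||
          uses[i].block_index >= after_index)
         return false;
   }
   return true;
}
```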

No shader-db changes on Kaby Lake

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-14 23:06:29 -06:00
Jason Ekstrand
b50465d197 nir/dead_cf: Inline cf_node_has_side_effects
We want to handle live SSA values differently and it's going to involve
walking the instructions.  We can make it a single instruction walk if
we combine it with cf_node_has_side_effects.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-14 23:05:28 -06:00
Jason Ekstrand
367b0ede4d intel/fs: Bail in optimize_extract_to_float if we have modifiers
This fixes a bug in RuneScape where we were optimizing x >> 16 to an
extract and then negating and converting to float.  The NIR to fs pass
was dropping the negate on the floor breaking a geometry shader and
causing it to render nothing.

Fixes: 1f862e923c "i965/fs: Optimize float conversions of byte/word..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109601
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-02-14 23:02:44 -06:00
Ilia Mirkin
8c859367df swr: set PIPE_CAP_MAX_VARYINGS correctly
Unfortunately swr was missed in the original commit. The number of
varyings should generally match up to what's reported as the shader
caps for fragment inputs.

Fixes: 6010d7b8e8 (gallium: add PIPE_CAP_MAX_VARYINGS)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Alok Hota <alok.hota@intel.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-14 20:29:36 -05:00
Jason Ekstrand
5064464931 intel/fs: Silence a compiler warning
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-14 16:04:47 -06:00
Jason Ekstrand
9b202239ba anv: Silence some compiler warnings in release builds
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-14 16:04:45 -06:00
Jason Ekstrand
cd60c995a6 anv/blorp: Delete a pointless assert
Just a little higher up in the function we assert that the aspect masks
are actually equal so there's no reason for the weaker check.  Also, the
temporary variables were causing compiler warnings in release builds.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-14 16:04:42 -06:00
Jason Ekstrand
b14d7a6b60 nir: Silence a couple of warnings in release builds
[28/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_gather_xfb_info.c.o'.
../src/compiler/nir/nir_gather_xfb_info.c: In function ‘nir_gather_xfb_info’:
../src/compiler/nir/nir_gather_xfb_info.c:171:13: warning: variable ‘max_offset’ set but not used [-Wunused-but-set-variable]
    unsigned max_offset[NIR_MAX_XFB_BUFFERS] = {0};
             ^~~~~~~~~~
[36/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_instr_set.c.o'.
../src/compiler/nir/nir_instr_set.c:502:1: warning: ‘instr_each_src_and_dest_is_ssa’ defined but not used [-Wunused-function]
 instr_each_src_and_dest_is_ssa(nir_instr *instr)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-14 16:04:35 -06:00
Kenneth Graunke
6775665e5e spirv: Eliminate dead input/output variables after translation.
spirv_to_nir can generate input/output variables which are illegal
for the current shader stage, which would cause nir_validate_shader
to balk.  After my recent commit to start decorating arrays as compact,
dEQP-VK.spirv_assembly.instruction.graphics.module.same_module started
hitting validation errors due to outputs in a TCS (not intended for the
TCS at all) not being per-vertex arrays.

Thanks to Jason Ekstrand for suggesting this approach.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109573
Fixes: ef99f4c8d1 compiler: Mark clip/cull distance arrays as compact before lowering.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-02-14 11:03:56 -08:00
Kenneth Graunke
39aee57523 anv: Put MOCS in the correct location
My patch to switch from struct-based MOCS to numeric MOCS accidentally
divided all MOCS entries by 2 in the Vulkan driver.

MOCS on Gen9+ is just an array index into a table.  But in the hardware
packets, the index starts at bit 1.  So we need to shift it.
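
A small sketch of why the shift matters, assuming only what is stated above: the hardware field holding the table index starts at bit 1. The helper names are hypothetical. Writing the index unshifted makes the hardware see half the intended value.

```c
#include <assert.h>
#include <stdint.h>

/* Place a MOCS table index into a field that starts at bit 1. */
static inline uint32_t
example_pack_mocs(uint32_t mocs_index)
{
   assert(mocs_index < 0x80000000u); /* the index must survive the shift */
   return mocs_index << 1;
}

/* What the hardware would read back as the index from the packed field. */
static inline uint32_t
example_hw_read_mocs(uint32_t field)
{
   return field >> 1;
}
```

With these helpers, `example_hw_read_mocs(example_pack_mocs(5))` gives back 5, whereas passing the raw index 5 straight to `example_hw_read_mocs` yields 2, which is the "divided by 2" effect described above.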

Fixes: 0b44644ca6 (genxml: Consistently use a numeric "MOCS" field)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-14 11:03:28 -08:00
Ian Romanick
9a918050e0 spirv: Add missing break
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: c6465fec0c ("spirv: add SpvCapabilityInt64Atomics")
CID: 1442555
2019-02-14 08:35:59 -08:00
Eric Engestrom
c2b4b46fa9 util/tests: compile to something sensible in release builds
assert()-based tests make no sense without asserts, so make sure asserts
are compiled in, even if the rest of the code has asserts turned off.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-14 12:52:34 +00:00
Eric Engestrom
f7c56475d2 anv/tests: compile to something sensible in release builds
assert()-based tests make no sense without asserts, so make sure asserts
are compiled in, even if the rest of the code has asserts turned off.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-14 12:52:34 +00:00
Eric Engestrom
4c1ca5b074 etnaviv: drop duplicate #define
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Eric Engestrom
7f68b38439 st/dri: drop duplicate #define
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Eric Engestrom
2fa165e757 gbm: drop duplicate #defines
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Eric Engestrom
f1374805a8 drm-uapi: use local files, not system libdrm
There was an issue recently caused by the system header being included
by mistake, so let's just get rid of this include path and always
explicitly #include "drm-uapi/FOO.h"

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Eric Engestrom
69e4c273c4 drm-uapi/README: remove explicit list of driver names
These headers are used by a lot more than just the intel drivers nowadays.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Samuel Pitoiset
227df98fa6 radv: fix radv_fixup_vertex_input_fetches()
We should check that num_channels is 4, otherwise that breaks
the world. Sorry for the short breakage.

Fixes: 4b3549c084 ("radv: reduce the number of loaded channels for vertex input fetches")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-14 09:44:35 +01:00
Samuel Pitoiset
4b3549c084 radv: reduce the number of loaded channels for vertex input fetches
It's unnecessary to load more channels than the vertex attribute
format. The remaining channels are filled with 0 for y and z,
and 1 for w.
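
As a plain-C illustration of the fill rule (the driver operates on shader IR values, not on float arrays, and the function name here is hypothetical): only the channels present in the format are loaded, and the rest take the defaults 0, 0, 1 for y, z and w.

```c
#include <assert.h>

/*
 * Expand `num_channels` loaded components to a full vec4. Missing y and z
 * become 0 and a missing w becomes 1, as described in the commit message.
 */
static inline void
example_expand_to_vec4(const float *loaded, unsigned num_channels,
                       float out[4])
{
   static const float defaults[4] = { 0.0f, 0.0f, 0.0f, 1.0f };

   assert(num_channels >= 1 && num_channels <= 4);

   for (unsigned i = 0; i < 4; i++)
      out[i] = i < num_channels ? loaded[i] : defaults[i];
}
```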

29077 shaders in 15096 tests
Totals:
SGPRS: 1321605 -> 1318869 (-0.21 %)
VGPRS: 935236 -> 932252 (-0.32 %)
Spilled SGPRs: 24860 -> 24776 (-0.34 %)
Code Size: 49832348 -> 49819464 (-0.03 %) bytes
Max Waves: 242101 -> 242611 (0.21 %)

Totals from affected shaders:
SGPRS: 93675 -> 90939 (-2.92 %)
VGPRS: 58016 -> 55032 (-5.14 %)
Spilled SGPRs: 172 -> 88 (-48.84 %)
Code Size: 2862740 -> 2849856 (-0.45 %) bytes
Max Waves: 15474 -> 15984 (3.30 %)

This mostly helps Croteam games (Talos/Sam2017).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:10:56 +01:00
Samuel Pitoiset
210aec3612 radv: store vertex attribute formats as pipeline keys
The formats will be used for reducing the number of loaded channels.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:10:09 +01:00
Samuel Pitoiset
45382baef6 radv: use MAX_{VBS,VERTEX_ATTRIBS} when defining max vertex input limits
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:09:51 +01:00
Samuel Pitoiset
2154fac6f3 ac: make use of ac_build_expand_to_vec4() in visit_image_store()
And make ac_build_expand() a static function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:09:48 +01:00
Eric Anholt
338d399fd0 freedreno: Use the NIR lowering for isign.
I think this will save an instruction and hopefully not increase any other
costs (possibly the immediate -1 and 1?), but I haven't actually tested.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 00:32:30 +00:00
Eric Anholt
8f3694e1ab intel: Use the NIR lowering for isign.
Drops one instruction from fs-sign-int.shader_test.  No change in
shader-db due to it having 0 instances of sign(genIType).  This may hurt
isign64 if algebraic runs before int64 lowering, but I wasn't sure how to
mark the algebraic opt as "every bit size but 64".

v2: Update commit message about shader-db.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2019-02-14 00:32:30 +00:00
Eric Anholt
3f22b35a43 v3d: Use the NIR lowering for isign instead of rolling our own.
min/max instead of comparisons saves 2 instructions on
fs-sign-int.shader_test.
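
For reference, the min/max formulation can be sketched in C as a clamp of the input to the range [-1, 1]. This is an illustration of the arithmetic only; the helper names are hypothetical and the real lowering operates on NIR instructions.

```c
#include <assert.h>
#include <stdint.h>

static inline int32_t
example_imin(int32_t a, int32_t b)
{
   return a < b ? a : b;
}

static inline int32_t
example_imax(int32_t a, int32_t b)
{
   return a > b ? a : b;
}

/* Integer sign without comparisons against zero: clamp x to [-1, 1]. */
static inline int32_t
example_isign(int32_t x)
{
   int32_t result = example_imin(example_imax(x, -1), 1);

   assert(result >= -1 && result <= 1);
   return result;
}
```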
2019-02-14 00:32:30 +00:00
Eric Anholt
42d2cae907 nir: Move panfrost's isign lowering to nir_opt_algebraic.
I wanted to reuse this from v3d.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-14 00:32:30 +00:00
Timothy Arceri
68baf96824 nir: turn an ssa check in nir_search into an assert
Everything should be in ssa form when we call this. This is a
hotpath so replace the check with an assert.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-02-14 09:35:32 +11:00
Timothy Arceri
46a4d2c867 nir: turn ssa check into an assert
Everything should be in ssa form when this is called. Checking
for it here is expensive so turn this into an assert instead.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-02-14 09:35:32 +11:00
Timothy Arceri
0a89c9779a nir: prehash instruction in nir_instr_set_add_or_rewrite()
There is no need to hash the instruction twice, especially as we
end up adding it in the majority of cases.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-14 09:35:32 +11:00
Dylan Baker
279060cd32 meson: Add dependency on genxml to anvil
Currently the Intel "anvil" driver races with the generation of genxml
files, while i965 has an explicit dependency. This patch adds the same
dependency to anvil.

Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-13 22:01:00 +00:00
Samuel Pitoiset
334da034d8 radv: always export gl_SampleMask when the fragment shader uses it
For some reason, not doing so breaks tree rendering in Project Cars.

Fixes: 85010585cd ("radv: only enable gl_SampleMask if MSAA is enabled too")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109401
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-13 23:01:30 +01:00
Alok Hota
736241892f gallium/aux: add PIPE_CAP_MAX_VARYINGS to u_screen
Allows drivers using `u_pipe_screen_get_param_defaults` to use a
fallback value for the new pipe cap. Default value of 8 based on GL 2.1
MAX_VARYING_FLOATS

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-13 15:08:14 -06:00
Kristian H. Kristensen
e8566d7098 .mailmap: Add a few more aliases for myself
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 12:03:41 -08:00
Samuel Pitoiset
5e18000d1b radv/winsys: fix BO list creation when RADV_DEBUG=allbos is set
Fixes: 50fd253bd6 ("radv/winsys: Add priority handling during submit.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-13 20:51:40 +01:00
Kristian H. Kristensen
0a41ddbd4e freedreno/a6xx: Fix point coord
Use ir3_next_varying() for iterating through varyings and unset the
global point coord invert bit.

Fixes:

  dEQP-GLES3.functional.shaders.builtin_variable.pointcoord

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 11:14:06 -08:00
Kristian H. Kristensen
2fbd2d5f58 freedreno/a6xx: Front facing needs UNK3 bit
We need to set UNK3 in GRAS_CNTL and RB_RENDER_CONTROL0 for the value
to be reliably delivered.

Fixes:

  dEQP-GLES3.functional.shaders.builtin_variable.frontfacing

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 11:14:06 -08:00
Kristian H. Kristensen
1831238c8e freedreno/a6xx: Update headers
This pulls in changes for compute shaders and a6xx ssbo/image support.
FACENESS bit moved from position 1 to 2 and there's a global invert
bit for point coord.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 11:14:06 -08:00
Kristian H. Kristensen
182e5c011f freedreno/a6xx: Clean up mixed use of swap and swizzle for texture state
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 11:03:29 -08:00
Rob Clark
61094629cb freedreno/a6xx: small compiler warning fix
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-13 13:54:05 -05:00
Dylan Baker
aff52dd2c6 get-pick-list: Add --pretty=medium to the arguments for Cc patches
Because none of them have been picked up for 19.0 due to this bug
being reintroduced.

v2: - Fix fixes tags

Fixes: e6b3a3b201
       ("bin/get-pick-list.sh: handle "typod" usecase.")
Fixes: fac10169bb
       ("bin/get-pick-list.sh: prefix output with "[stable] "")
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-13 08:59:30 -08:00
Eric Engestrom
68a9383c6f gitlab-ci: limit ninja to 4 threads max
I tried bumping the limit on make and scons instead, but that just
thrashed the runners, so let's not do that (sorry @daniels :]).

Instead, remove the automatic thread management from ninja and limit it
to 4 instead, in line with make and scons.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-13 16:15:43 +00:00
Konstantin Kharlamov
fccc9d3de6 mapi: work around GCC LTO dropping assembly-defined functions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109391

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-13 14:20:51 +00:00
Caio Marcelo de Oliveira Filho
017349997f nir: fix example in opt_peel_loop_initial_if description
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-12 20:33:20 -08:00
Karol Herbst
7e08f22a72 nir/opt_if: don't mark progress if nothing changes
If we have something like this:

loop {
   ...
   if x {
      break;
   } else {
      continue;
   }
}

opt_if_loop_last_continue returns true, marking progress although nothing
changes.

Fixes: 5921a19d4b "nir: add if opt opt_if_loop_last_continue()"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-13 00:21:35 +01:00
Oscar Blumberg
3c540e0a74 radeonsi: Fix guardband computation for large render targets
Stop using 12.12 quantization for viewports that are not contained in
the lower 4k corner of the render target as the hardware needs to keep
both absolute and relative coordinates representable.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-12 17:21:46 -05:00
Chia-I Wu
2f8734e13b egl: fix KHR_partial_update without EXT_buffer_age
EGL_BUFFER_AGE_EXT can be queried without EXT_buffer_age.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-12 19:14:34 +00:00
Kenneth Graunke
5a006b026d mesa: Advertise EXT_float_blend in ES 3.0+ contexts.
This extension simply drops a draw time restriction:

    "Furthermore, an INVALID_OPERATION error is generated by
     DrawArrays and the other drawing commands defined in section
     2.8.3 (10.5 in ES 3.1) if blending is enabled (see below) and
     any draw buffer has 32-bit floating-point format components."

We never correctly enforced this restriction anyway, so we were
basically already implementing it.  We just need to advertise it
for our behavior to be correct.

The extension requires EXT_color_buffer_float, but we already enable
that via dummy_true.  So we can dummy_true this one as well.

Found while debugging WebGL conformance tests.  Does not fix any.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-12 10:57:25 -08:00
Alok Hota
d3dfa86a30 gallium/swr: Param defaults for unhandled PIPE_CAPs
Without using this function, we fail the -Wswitch flag when compiling
the default debugoptimized mode in Meson

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-12 18:55:14 +00:00
Juan A. Suarez Romero
1ad26f9417 anv/cmd_buffer: check for NULL framebuffer
This can happen when we record a vkCmdDraw in a secondary command buffer
that was created inheriting from the primary buffer, but with the
framebuffer set to NULL in the VkCommandBufferInheritanceInfo.

The Vulkan 1.1.81 spec says that "the application must ensure (using scissor
if necessary) that all rendering is contained in the render area [...]
[which] must be contained within the framebuffer dimensions".

While this should be done by the application, commit 465e5a86 added a
clamp to the framebuffer size, in case the application does not do it.
But this requires knowing the framebuffer dimensions.

If we do not have a framebuffer at that moment, the best compromise we
can make is to just apply the scissor as it is, and let the application
ensure the rendering is contained in the render area.

v2: do not clamp to framebuffer if there isn't a framebuffer

v3 (Jason):
- clamp earlier in the conditional
- clamp to render area if command buffer is primary

v4: clamp also x and y to render area (Jason)

v5: rename used variables (Jason)

Fixes: 465e5a86 ("anv: Clamp scissors to the framebuffer boundary")
CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-12 19:19:13 +01:00
Marek Olšák
6c64413b6f radeonsi: use MEM instead of MEM_GRBM in COPY_DATA.DST_SEL
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-12 13:08:54 -05:00
Marek Olšák
f8e4c9df47 radeonsi: add AMD_DEBUG env var as an alternative to R600_DEBUG
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-12 13:08:54 -05:00
Samuel Pitoiset
1b8983c25b radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8
This fixes a critical issue.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:39:30 +01:00
Samuel Pitoiset
bd1186572f radv: add support for push constants inlining when possible
This removes some scalar loads from shaders, but it increases
the number of SET_SH_REG packets. This is currently basic but
it could be improved if needed. Inlining dynamic offsets might
also help.

Original idea from Dave Airlie.

29077 shaders in 15096 tests
Totals:
SGPRS: 1321325 -> 1357101 (2.71 %)
VGPRS: 936000 -> 932576 (-0.37 %)
Spilled SGPRs: 24804 -> 24791 (-0.05 %)
Code Size: 49827960 -> 49642232 (-0.37 %) bytes
Max Waves: 242007 -> 242700 (0.29 %)

Totals from affected shaders:
SGPRS: 290989 -> 326765 (12.29 %)
VGPRS: 244680 -> 241256 (-1.40 %)
Spilled SGPRs: 1442 -> 1429 (-0.90 %)
Code Size: 8126688 -> 7940960 (-2.29 %) bytes
Max Waves: 80952 -> 81645 (0.86 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:54 +01:00
Samuel Pitoiset
8364ffe823 radv: keep track of the number of remaining user SGPRs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:52 +01:00
Samuel Pitoiset
5f9379ca35 radv: gather if shaders load dynamic offsets separately
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:49 +01:00
Samuel Pitoiset
5806d99984 radv: gather more info about push constants
This is needed in order to inline some push constants when possible.
This also adds a new helper for initializing the pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:34 +01:00
Samuel Pitoiset
129a9f4937 radv: fix compiler issues with GCC 9
"The C standard says that compound literals which occur inside of
the body of a function have automatic storage duration associated
with the enclosing block. Older GCC releases were putting such
compound literals into the scope of the whole function, so their
lifetime actually ended at the end of containing function. This
has been fixed in GCC 9. Code that relied on this extended lifetime
needs to be fixed, move the compound literals to whatever scope
they need to accessible in."
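
A minimal sketch of the lifetime rule quoted above, not the actual radv code: `struct range` and `range_size` are hypothetical names chosen for illustration. The broken pattern is shown only in a comment because running it is undefined behavior; the function below it declares the compound literal in the scope where it is used.

```c
#include <stddef.h>

struct range { int offset; int size; };

/* Broken pattern (illustration only, do not compile this part):
 *
 *     const struct range *r = NULL;
 *     if (use_default) {
 *         r = &(struct range){ 0, 16 };   // literal belongs to this block
 *     }
 *     return r->size;                     // r may dangle with GCC 9
 */

/* Fixed pattern: the compound literal is created in the function's
 * outermost block, so it stays alive for every later use of `r`. */
int range_size(int use_default, const struct range *custom)
{
    const struct range *def = &(struct range){ 0, 16 };
    const struct range *r = use_default ? def : custom;
    return r->size;
}
```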

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109543
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 14:48:08 +01:00
Tapani Pälli
2a2e69f975 i965: add P0x formats and propagate required scaling factors
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lin Johnson <johnson.lin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-12 08:43:04 +02:00
Tapani Pälli
3da858a6b9 intel/compiler: add scale_factors to sampler_prog_key_data
Patch propagates given scale_factors to lowering options.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-12 08:42:25 +02:00
Tapani Pälli
722f96bfc8 dri: add P010, P012, P016 for 10bit/12bit/16bit YUV420 formats
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lin Johnson <johnson.lin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-12 08:42:02 +02:00
Tapani Pälli
19a85a704b nir: add option to use scaling factor when sampling planes YUV lowering
Patch adds nir_lower_tex_options as parameter to sample_plane so that
we don't need to extend nir_tex_instr for this.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-12 08:41:20 +02:00
Kenneth Graunke
3eedc8f7b1 i965: Use info->textures_used instead of prog->SamplersUsed.
prog->SamplersUsed is set by the linker when validating resource limits,
while info->textures_used is gathered after NIR optimizations, which may
have eliminated some unused surfaces.

This may let us skip some work.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:50 -08:00
Kenneth Graunke
59ae985631 i965: Drop unnecessary 'and' with prog->SamplerUnits
textures_used_by_txf is a subset of textures_used which is a subset
of prog->SamplerUnits.  This should do nothing.
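
The subset argument can be checked with a small sketch. The helper names below are hypothetical and the masks are plain 32-bit integers, which is an assumption about the field widths.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when every bit set in `sub` is also set in `super`. */
bool is_subset(uint32_t sub, uint32_t super)
{
    return (sub & ~super) == 0;
}

/* The 'and' that the commit drops: for a subset it returns `sub` unchanged. */
uint32_t and_with_superset(uint32_t sub, uint32_t super)
{
    return sub & super;
}
```

If `is_subset(a, b)` holds, then `and_with_superset(a, b) == a`, which is why the extra masking has no effect.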

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:48 -08:00
Kenneth Graunke
f5c7df4dc9 nir: Gather texture bitmasks in gl_nir_lower_samplers_as_deref.
Eric and I would like a bitmask of which samplers are used, similar to
prog->SamplersUsed, but available in NIR.  The linker uses SamplersUsed
for resource limit checking, but later optimizations may eliminate more
samplers.  So instead of propagating it through, we gather a new one.
While there, we also gather the existing textures_used_by_txf bitmask.

Gathering these bitfields in nir_shader_gather_info is awkward at best.
The main reason is that it introduces an ordering dependency between the
two passes.  If gathering runs before lower_samplers_as_deref, it can't
look at var->data.binding.  If the driver doesn't use the full lowering
to texture_index/texture_array_size (like radeonsi), then the gathering
can't use those fields.  Gathering might be run early /and/ late, first
to get varying info, and later to update it after variant lowering.  At
this point, should gathering work on pre-lowered or post-lowered code?
Pre-lowered is also harder due to the presence of structure types.

Just doing the gathering when we do the lowering alleviates these
ordering problems.  This fixes ordering issues in i965 and makes the
txf info gathering work for radeonsi (though they don't use it).

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:45 -08:00
Kenneth Graunke
120f9b8362 nir: Use sampler derefs in drawpixels and bitmap lowering.
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:44 -08:00
Kenneth Graunke
04bdc56872 program: Make prog_to_nir create texture/sampler derefs.
Until now, prog_to_nir has been setting texture_index and sampler_index
directly.  This is different than GLSL shaders, which create variable
dereferences and rely on lowering passes to reach this final form.

radeonsi uses variable dereferences for samplers rather than
texture_index and sampler_index, so it doesn't even make sense to set
them there.  By moving to derefs, we ensure that both GLSL and ARB
programs produce the same final form that the driver desires.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:40 -08:00
Kenneth Graunke
6a4be25a90 st/nir: Use sampler derefs in built-in shaders.
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:38 -08:00
Kenneth Graunke
ba9c1c8217 st/nir: Lower sampler derefs for builtin shaders.
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:36 -08:00
Kenneth Graunke
8d1646e0e1 st/nir: Pull sampler lowering into a helper function.
This will make it easier to reuse across GLSL / ARB / built-ins.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:35 -08:00
Kenneth Graunke
243c11dc16 i965: Call nir_lower_samplers for ARB programs.
An upcoming patch will start building derefs in prog_to_nir, at which
point we'll need to lower them to indexes.

This gets both GLSL and non-GLSL shaders using the same paths.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:30 -08:00
Kenneth Graunke
529a0711c1 glsl: Don't look at sampler uniform storage for internal vars
Passes like nir_lower_drawpixels add additional sampler variables,
and set an explicit binding which never changes.  These extra samplers
don't have proper uniform storage associated with them, and there is no
way to update bindings via the API.  So, for any 'hidden' variables,
just trust that there's an explicit binding set.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:28 -08:00
Kenneth Graunke
d34e434989 glsl: Allow gl_nir_lower_samplers*() without a gl_shader_program
I would like to be able to run gl_nir_lower_samplers() to turn texture
and sampler variable dereferences into indexes and offsets, even for
ARB programs, and built-in shaders.  This would make sampler handling
more consistent across the various types of shaders.

For GLSL programs, the gl_nir_lower_samplers_as_deref() pass looks up
the variable bindings in the shader program's uniform storage.  But
ARB programs and built-in shaders don't have a gl_shader_program, and
uniform storage doesn't exist.  In this case, we simply skip that
lookup, and trust var->data.binding to be set correctly by whoever
created the shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:22 -08:00
Kenneth Graunke
f45dd6d31b st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048
Piglit's vp-max-array test creates a vertex program containing a uniform
array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB.  Mesa
will then add additional state-var parameters for things like the MVP
matrix.

radeonsi currently exposes a value of 4096, derived from constant buffer
upload size.  This means the array will have 4096 elements, and the
extra MVP state-vars would get a prog_src_register::Index of over 4096.

Unfortunately, prog_src_register::Index is a signed 13-bit integer, so
values beyond 4096 end up turning into negative numbers.  Negative
source indexes are only valid for relative addressing, so this ends up
generating illegal IR.
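
The wraparound can be sketched as follows, assuming a two's-complement 13-bit field, whose representable range is -4096 to 4095. `wrap_s13` is a hypothetical helper, not a Mesa function; it computes the stored value explicitly rather than relying on bit-field conversion, which is implementation-defined for out-of-range values.

```c
#include <stdint.h>

/* Store `value` in a 13-bit two's-complement field and read it back. */
int32_t wrap_s13(int32_t value)
{
    uint32_t bits = (uint32_t)value & 0x1fffu;   /* keep the low 13 bits */

    /* Bit 12 is the sign bit: if set, the field reads back as negative. */
    return (bits & 0x1000u) ? (int32_t)bits - 0x2000 : (int32_t)bits;
}
```

With this model, an index of 4095 survives, while 4096 reads back as -4096 and 4100 as -4092.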

In prog_to_nir, this would cause an out of bounds array access.
st_mesa_to_tgsi checks for a negative value, assumes it's bogus,
and remaps it to parameter 0 in order to get something in-range.
This isn't right - instead of reading the MVP matrix, it would read
the first element of the vertex program's large array.  But the test
only checks that the program compiles, so we never noticed that it
was broken.

This patch limits the size of the program limits, with the understanding
that we may need to generate additional state-vars internally.  i965 has
exposed 1024 for this limit for years, so I don't expect lowering it to
2048 will cause any practical problems for radeonsi or other drivers.

Fixes vp-max-array with prog_to_nir.c.

Cc: "19.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:09:51 -08:00
Francisco Jerez
374eb3cd6f intel/dump_gpu: Disambiguate between BOs from different GEM handle spaces.
This fixes a rather astonishing problem that came up while debugging
an issue in the Vulkan CTS.  Apparently the Vulkan CTS framework has
the tendency to create multiple VkDevices, each one with a separate
DRM device FD and therefore a disjoint GEM buffer object handle space.
Because the intel_dump_gpu tool wasn't making any distinction between
buffers from the different handle spaces, it was confusing the
instruction state pools from both devices, which happened to have the
exact same GEM handle and PPGTT virtual address, but completely
different shader contents.  This was causing the simulator to believe
that the vertex pipeline was executing a fragment shader, which didn't
end up well.
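
One way to express the idea, as a sketch only: key each buffer by the pair of device FD and GEM handle instead of the handle alone. `struct bo_key` and `bo_key_equal` are hypothetical names and do not describe how intel_dump_gpu actually stores its BOs.

```c
#include <stdbool.h>
#include <stdint.h>

/* A GEM handle is only meaningful relative to the DRM FD that created it. */
struct bo_key {
    int fd;
    uint32_t handle;
};

/* Two keys refer to the same BO only if both the FD and the handle match. */
bool bo_key_equal(struct bo_key a, struct bo_key b)
{
    return a.fd == b.fd && a.handle == b.handle;
}
```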

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-11 12:27:22 -08:00
Kristian H. Kristensen
e404c6879d freedreno/a6xx: Fall back to masked RGBA blits for depth/stencil
The blitter doesn't seem to have a write mask, so for depth only and
stencil only blits to Z24S8 we cast the Z24S8 buffer to an RGBA UNORM8
buffer and fall back to pipeline blits with corresponding write mask.

Fixes

  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_stencil_only
  dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_depth
  dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth
  dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_depth
  dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth
  dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8
  dEQP-GLES3.functional.fbo.msaa.4_samples.stencil_index8

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
f03ba155d5 freedreno/a6xx: Add format argument to fd6_tex_swiz()
We need to allow overriding the format with that of the image or
sampler view, so we can't take it from the resource in fd6_tex_swiz().

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
bc8c813d5a freedreno/a6xx: Support y-inverted blits
The src coordinates are s24.8. For an inverted blit that ends at y=0
we need to program -1 for sy2, so we need to handle negative values
correctly.
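
A sketch of the fixed-point conversion, assuming s24.8 means a signed 32-bit value with 8 fractional bits. `to_s24_8` is a hypothetical helper; how the driver packs the result into the register is not shown here.

```c
#include <stdint.h>

/* Convert an integer coordinate to signed 24.8 fixed point.
 * Multiplying instead of shifting keeps negative inputs well-defined. */
int32_t to_s24_8(int32_t coord)
{
    return coord * 256;   /* 8 fractional bits */
}
```

For a coordinate of -1 this yields -256, i.e. the bit pattern 0xffffff00, which is lost if the value is treated as unsigned before conversion.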

Fixes

  dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y
  dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y
  dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y
  dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_color
  dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_color

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
03a01e5d23 freedreno/a6xx: Support some depth/stencil blits on blitter
We can rewrite almost all depth stencil blits to various red-only
blits.  The exception is depth-only or stencil-only blits into z24s8
combined depth stencil buffer. We can fall back for depth-only, but
stencil-only remains broken.

Fixes

  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_basic
  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_scale
  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic
  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale
  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
e9592da2b4 freedreno/a6xx: Move blit check so as to restore comment
The explanation for the compressed format check is broken across two
comments:

	/* We can blit if both or neither formats are compressed formats... */
	/* ... but only if they're the same compression format. */

but the ok_format() checks were inserted between, breaking up the flow
of the sentence.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
d2639f2eac freedreno: Don't tell the blitter what it can't do
Call ctx->blit() and let it reject blits it can't do instead of giving
up on stencil blits and blits u_blitter can't do.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
8cf1303698 freedreno: Consolidate u_blitter functions in freedreno_blitter.c
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
701d30dda8 freedreno/a6xx: Combine emit_blit and fd6_blit
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
6d1a7bdba3 freedreno/a6xx: Use the right resource for separate stencil stride
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
24b4172375 freedreno: Log number of draws for sysmem passes
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
a201cb157d freedreno/a6xx: Drop render condition check in blitter
We already check earlier in the call chain in fd_blit().
glBlitFramebuffer always sets render_condition_enable and thus we
would never try the blitter path for that.

Now that we get all of dEQP-GLES3.functional.fbo.blit.conversion.*
down this path, it turns out that the

  fail_if(info->mask != util_format_get_mask(info->src.format));
  fail_if(info->mask != util_format_get_mask(info->dst.format));

conditions weren't accurate.  util_format_get_mask() returns
PIPE_MASK_RGBA for any format with any color channels, while
info->mask is the exact set of channels to blit.  So we reject things
we could blit - for example, PIPE_FORMAT_R16G16_FLOAT where info->mask
is RG while util_format_get_mask() returns RGBA - and accept things we
can't.  It turns out that the blitter is happy to blit different
number of channels, but fails to blit formats with different numerical
formats and srgb formats.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
4f7a9c23ed freedreno/a6xx: regen headers
Update for a6xx.xml.h to incorporate a few new bits and changes to
blit src rect coordinate types.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-02-11 12:26:21 -08:00
Leo Liu
a0a52a0367 st/va/vp9: set max reference as default of VP9 reference number
If there is no information about number of render targets

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-11 14:44:16 -05:00
Leo Liu
21cdb828a3 st/va: fix the incorrect max profiles report
Add "PIPE_VIDEO_PROFILE_MAX" to the enum, so that the reported maximum
stays correct when more profiles are added in the future.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109107

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-11 14:44:16 -05:00
Guttula, Suresh
2cf2a56739 st/va: Add support for indirect manner by returning VA_STATUS_ERROR_OPERATION_FAILED
Based on the VA spec, DeriveImage() returns VA_STATUS_ERROR_OPERATION_FAILED if the
driver doesn't support the internal surface format. Currently vaDeriveImage()
fails for non-contiguous planes, and the operation-failed status is
required to support the indirect manner, i.e. vaCreateImage()+vaPutImage(),
in case vaDeriveImage() fails with VA_STATUS_ERROR_OPERATION_FAILED.

This patch notifies the client that the operation failed with the proper
error string, so that the client will fall back to vaCreateImage()+vaPutImage().

v2: updated commit message based on VA spec.

Signed-off-by: suresh guttula <suresh.guttula@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2019-02-11 14:44:16 -05:00
Marek Olšák
114a899cc8 winsys/amdgpu: cs_check_space sets the minimum IB size for future IBs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
766e920cdb winsys/amdgpu: clean up IB buffer size computation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
8c1cb393fc winsys/amdgpu: remove occurrence of INDIRECT_BUFFER_CONST
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
881ef14b32 winsys/amdgpu: use a separate fence list for syncobjs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
9f00123d51 winsys/amdgpu: unify fence list code
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
ddfe209a0d winsys/amdgpu: don't drop manually added fence dependencies
wow, it's hard to believe that fence and syncobjs dependencies were ignored.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
61c678d4bc radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:06 -05:00
Marek Olšák
4522f01d4e gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:04 -05:00
Jason Ekstrand
9e6a6ef0d4 nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks
When nir_rematerialize_derefs_in_use_blocks_impl was first written, I
attempted to optimize things a bit by not bothering to re-materialize
the sources of deref instructions figuring that the final caller would
take care of that.  However, in the case of more complex deref chains
where the first link or two lives in block A and then another link and
the load/store_deref intrinsic live in block B it doesn't work.  The
code in rematerialize_deref_in_block looks at the tail of the chain,
sees that it's already in block B and skips it, not realizing that part
of the chain also lives in block A.

The easy solution here is to just rematerialize deref sources of deref
instructions as well.  This may potentially lead to a few more deref
instructions being created, but the conditions required for that to
actually happen are fairly unlikely and, thanks to the caching, it's all
linear time regardless.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603
Fixes: 7d1d1208c2 "nir: Add a small pass to rematerialize derefs per-block"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-11 10:57:23 -06:00
Jason Ekstrand
fd77606b5b intel/fs: Use enumerated array assignments in fb read TXF setup
It's more clear and means we don't have to update the array every time
we add an optional texture instruction argument
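
For readers unfamiliar with the pattern, the following is a sketch of enumerated (designated) array initialization in C99. The enum, struct, and function names are hypothetical and are not the i965 source-index names.

```c
/* Hypothetical source slots for a texture instruction. */
enum tex_src {
    TEX_SRC_COORD,
    TEX_SRC_LOD,
    TEX_SRC_SAMPLE_INDEX,
    TEX_SRC_MCS,
    TEX_NUM_SRCS
};

struct tex_srcs {
    int v[TEX_NUM_SRCS];
};

/* Only the slots that matter are named; every other slot is
 * zero-initialized, so adding a new enum entry needs no change here. */
struct tex_srcs make_fb_read_srcs(int coord, int sample_index)
{
    struct tex_srcs s = { .v = {
        [TEX_SRC_COORD]        = coord,
        [TEX_SRC_SAMPLE_INDEX] = sample_index,
    } };
    return s;
}
```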

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-11 10:57:09 -06:00
Michel Dänzer
d6c55f6c62 gitlab-ci: Re-use docker image from the main repo in forked repos
Instead of generating it from scratch in each forked repo. This should
save time, energy and storage. (The xserver & xf86-video-amdgpu CI
scripts do basically the same)

v2:
* Hardcode "mesa" instead of using $CI_PROJECT_NAME, to avoid breakage
  if the project name is changed after forking (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-11 12:24:31 +01:00
Ilia Mirkin
cc79a1483f nvc0: we have 16k-sized framebuffers, fix default scissors
For some reason we don't use view volume clipping by default, and use
scissors instead. These scissors were set to an 8k max fb size, while
the driver advertises 16k-sized framebuffers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
2019-02-10 23:36:23 -05:00
Alyssa Rosenzweig
85e2bb58ca panfrost: Specify supported draw modes per-context
Midgard has native support for QUADS and POLYGONS; Bifrost seemingly
does not. Thus, Midgard generally skips prim_convert whereas Bifrost
needs the pass; this patch allows the setting of allowed primitives to
occur on a per-context basis (for runtime hardware selection).

v2: Use (POLYGONS + 1) instead of LINES_ADJACENCY.
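
A sketch of how a per-context mode mask of this shape could be built. The enum is a hypothetical stand-in for the driver's primitive list, and the reduced mask is an assumption about hardware that lacks native QUADS and POLYGONS, not the values panfrost actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical primitive enum; only the relative order matters here. */
enum prim {
    PRIM_POINTS, PRIM_LINES, PRIM_LINE_LOOP, PRIM_LINE_STRIP,
    PRIM_TRIANGLES, PRIM_TRIANGLE_STRIP, PRIM_TRIANGLE_FAN,
    PRIM_QUADS, PRIM_QUAD_STRIP, PRIM_POLYGON,
    PRIM_LINES_ADJACENCY
};

/* One bit per mode, covering everything up to and including POLYGON. */
uint32_t modes_up_to_polygon(void)
{
    return (1u << (PRIM_POLYGON + 1)) - 1;
}

/* The same set with QUADS and POLYGON removed, so that a conversion
 * pass would have to handle those two modes. */
uint32_t modes_without_quads_polygons(void)
{
    return modes_up_to_polygon() &
           ~((1u << PRIM_QUADS) | (1u << PRIM_POLYGON));
}

/* Test whether `mode` is set in a context's mask. */
bool mode_supported(uint32_t mask, enum prim mode)
{
    return (mask >> mode) & 1u;
}
```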

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
2019-02-11 03:23:00 +00:00
Dave Airlie
90c6880df7 radv: remove alloc parameter from pipeline init
clang points out this isn't used.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-11 10:04:40 +10:00
Dave Airlie
a523ae0cac radv/llvm: initialise passes member.
Fixes coverity warning

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-11 08:59:02 +10:00
Dave Airlie
d2e82c2682 glsl: glsl to nir fix uninit class member.
The constructor should init this to NULL

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-11 08:55:07 +10:00
Alyssa Rosenzweig
2458797256 panfrost: Elucidate texture op scheduling comment
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:51:57 +00:00
Alyssa Rosenzweig
658961aec3 panfrost: Remove speculative if 0'd format bit code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:51:51 +00:00
Alyssa Rosenzweig
b1213a3947 panfrost: Remove if 0'd dead code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:50:35 +00:00
Alyssa Rosenzweig
e91e1786c5 panfrost: Add kernel-agnostic resource management
Various methods relating to resource management were previously marked
as kernel-specific, forcing them to stay downstream in the vendor
overlay and eventually be duplicated for DRM code. This patch adds back
this code in kernel-neutral space, allowing for code sharing and
minimising the diff to downstream.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:44:32 +00:00
Alyssa Rosenzweig
4ed23b193a panfrost: Don't hardcode number of nir_ssa_defs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:42:52 +00:00
Alyssa Rosenzweig
97dcad8d3e panfrost: Clean-up one-argument passing quirk
Most Midgard instructions take two-arguments logically; there are always
two arguments at the assembly level. For the few instructions that take
only a single argument, generally the second argument slot is unused,
with a zero inline constant occupying the space. fmov/imov are the
exception, where the first argument is filled with r24 and the logical
argument is in the second slot.

Previously, these constraints were handled by a delicate, buggy series
of hacks. This commit removes these hacks. Instead, we look at the
logical number of arguments (from NIR), switching between two argument
and one-argument-one-zero style. We then introduce a quirk for the
flipped style, which applies to fmov/imov.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:41:25 +00:00
Karol Herbst
49397a3c84 glsl_type: initialize offset and location to -1 for glsl_struct_field
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-09 13:52:15 +01:00
Kenneth Graunke
55e00a2ea8 nouveau: Silence unhandled cap warnings
Nouveau apparently uses the u_screen helper but prints a warning in the
default case, so running any GL program would start grumbling.

Fixes: 8fa54bc549 gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-08 16:26:00 -08:00
Caio Marcelo de Oliveira Filho
ee670d09af intel/compiler: use 0 as sampler in emit_mcs_fetch
The sampler will be ignored since the underlying 'ld_mcs' operation
won't use it, so just fill the field with 0 instead of the texture to
make it clearer that's the case.

This will also keep is_high_sampler() from kicking in unnecessarily, in
case we are using the operation for a texture with index >= 16.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 14:51:56 -08:00
Eric Engestrom
e8e544436c wsi: query the ICD's max dimensions instead of hard-coding them
anv and radv both happened to already return 2^14 for these, but
querying the ICD is safer and will help if vdreno (or whatever it's
called) doesn't have the same max.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 18:54:57 +00:00
Ian Romanick
b031c64349 nir: Convert a bcsel with only phi node sources to a phi node
v2: Remove the original ALU instruction after all of its readers are
modified to read the new ALU instruction.

v3: Fix an issue where a bcsel that may not be executed on a loop
iteration due to a break statement is converted to a phi (and therefore
incorrectly "executed").  Noticed by Tim.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109216
Fixes: 8fb8ebfbb0 ("intel/compiler: More peephole select")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
0881e90c09 nir: Split ALU instructions in loops that read phis
A single shader in Unigine Superposition is affected by this change.
A single iadd is moved to the end of a loop.  This iadd is involved in
a complex set of logic to terminate the loop, and an extra mov
instruction is inserted.  This shader really needs the optimization
suggested by bugzilla #94747, and I expect that to make this tiny
regression go away.

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 15047543 -> 15047545 (<.01%)
instructions in affected programs: 565 -> 567 (0.35%)
helped: 0
HURT: 2

total cycles in shared programs: 369977253 -> 369978253 (<.01%)
cycles in affected programs: 127910 -> 128910 (0.78%)
helped: 0
HURT: 2

v2: Skip nir_op_vec{2,3,4} and nir_op_[fi]mov instructions to avoid
infinite optimization loops.  Remove the original ALU instruction after
all of its readers are modified to read the new ALU instruction.

v3: Extend to the more general case.  If the prev-block value from
the phi is not undef, this means the ALU instruction has to be
duplicated in both the prev-block and the continue-block.

Fixes: 8fb8ebfbb0 ("intel/compiler: More peephole select")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
0c0c69729b nir: Select phi nodes using prev_block instead of continue_block
This simplifies some changes coming later.

Fixes: 8fb8ebfbb0 ("intel/compiler: More peephole select")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
8d8f80af3a nir: Refactor code that checks phi nodes in opt_peel_loop_initial_if
This will be used in a couple more places soon.

The function name is... horribly long.  Neither Matt nor I could think
of anything that was shorter and still more descriptive than
"is_phi_foo".  I'm willing to entertain suggestions.

Fixes: 8fb8ebfbb0 ("intel/compiler: More peephole select")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
4d65d2b12e nir: Document some fields of nir_loop_terminator
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
28ef5bb74c intel/compiler: Silence warning about value that may be used uninitialized
For some reason, this warning only occurs for me in release builds.

In file included from src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:25:0:
src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c: In function ‘brw_nir_lower_mem_access_bit_sizes’:
src/compiler/nir/nir_builder.h:501:26: warning: ‘src_swiz[2]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       alu_src.swizzle[i] = swiz[i];
       ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:225:16: note: ‘src_swiz[2]’ was declared here
       unsigned src_swiz[4];
                ^~~~~~~~

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
78169870e4 nir: Silence zillions of unused parameter warnings in release builds
Fixes: cd56d79b59 "nir: check NIR_SKIP to skip passes by name"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Eric Engestrom
3dc5faf523 gitlab-ci: workaround docker bug for users with uppercase characters
CI_REGISTRY_IMAGE == lower($CI_REGISTRY/$CI_PROJECT_PATH)

Suggested-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-08 17:45:57 +00:00
Andrii Simiklit
2b7d5c3217 i965: consider a 'base level' when calculating width0, height0, depth0
I guess that when we are calculating the width0, height0, depth0
to use for function 'intel_miptree_create' we need to consider
the 'base level' like it is done in the 'intel_miptree_create_for_teximage'
function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107987
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-07 21:40:50 -08:00
Timothy Arceri
26aa460940 nir: rewrite varying component packing
There are a number of reasons for the rewrite.

1. Adding support for packing tess patch varyings in a sane way.

2. Making use of qsort allowing the code to be much easier to
   follow.

3. Fixes a bug where different interp types caused component
   packing to be skipped for all varyings in some scenarios.

4. Allows us to add a crude live range analysis for deciding
   which components should be packed together. This support can
   optionally be added in a future patch.
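
As an illustration of point 2, a qsort-based ordering might look like the sketch below. The struct, its fields, and the sort key (interpolation type first, then location) are assumptions made for the example and are not taken from the NIR pass.

```c
#include <stdlib.h>

/* Hypothetical per-component record. */
struct comp_info {
    int interp_type;
    int location;
};

/* Order by interpolation type, then by location. The (a > b) - (a < b)
 * idiom avoids the overflow that plain subtraction could cause. */
static int compare_comps(const void *a, const void *b)
{
    const struct comp_info *x = a;
    const struct comp_info *y = b;

    if (x->interp_type != y->interp_type)
        return (x->interp_type > y->interp_type) -
               (x->interp_type < y->interp_type);
    return (x->location > y->location) - (x->location < y->location);
}

/* Sort in place so components that may share a slot end up adjacent. */
void sort_comps(struct comp_info *comps, size_t count)
{
    qsort(comps, count, sizeof(*comps), compare_comps);
}
```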

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
2f53260417 nir: add is_packing_supported_for_type() helper
This will be used in the following patches to determine if we
support packing the components of a varying.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
e041123841 nir: add glsl_type_is_32bit() helper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
7b01d5c354 nir: add support for marking used patches when packing varyings
This adds support needed for marking the varyings as used but we
don't actually support packing patches in this patch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
d0af13cfb4 st/glsl_to_nir: call nir_remove_dead_variables() after lowering local indirects
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
d0abbaa528 util: move BITFIELD macros to util/macros.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Karol Herbst
cbd1ad6165 st/mesa: require RGBA2, RGB4, and RGBA4 to be renderable
If the driver does not support rendering to these formats but does
support texturing, we can end up in incompatibilities between textures
and renderbuffers that are then copied to.

Fixes KHR-GL45.copy_image.functional on nvc0

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-07 21:51:45 -05:00
Karol Herbst
6010d7b8e8 gallium: add PIPE_CAP_MAX_VARYINGS
Some NVIDIA hardware can accept 128 fragment shader input components,
but only has up to 124 varying-interpolated input components. We add a
new cap to express this cleanly. For most drivers, this will have the
same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader.

Fixes KHR-GL45.limits.max_fragment_input_components

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
[imirkin: rebased, improved docs/commit message]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-07 21:51:45 -05:00
Alyssa Rosenzweig
738346fa23 kmsro: Silence warning if missing
Regardless of whether the build uses kmsro, kmsro is the default driver
descriptor when the static loader is used. Thus, in an edge case where
the static loader is used, no static targets are loaded, and kmsro is
not compiled, a spurious warning is printed. There's no harm in
executing the stub function in this case, but it's not "an error" to not
have kmsro in the build; the driver-missing warning should not be
printed for kmsro.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-02-08 01:48:37 +00:00
Lionel Landwerlin
f1bcb9be46 radv: assert that colorAttachment is valid for CmdClearAttachment
This partially reverts a change from b7a93cbded ("radv: Handle
VK_ATTACHMENT_UNUSED in CmdClearAttachment") which fixed actual issues
but also started to accept invalid values for the colorAttachment
field.

This change asserts that the field is valid for the current pass.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b7a93cbded ("radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-08 00:18:16 +00:00
Lionel Landwerlin
a934a3d124 anv: assert that color attachments are valid
This reverts commit d76e777988.

Let's make it obvious that there is an application issue if it tries
to access an attachment that doesn't exist in the current pass.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d76e777988 ("anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 00:18:16 +00:00
Dave Airlie
3c153b3982 docs: update qbo support for virgl
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-02-08 09:06:36 +10:00
Eric Engestrom
6e0effbd34 travis: fix osx make build
This variable was removed in commit 087af992a2 "travis: remove
unused linux code path" because it looked like it was only used by the
Linux build. Turns out I was wrong, so let's restore it.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-02-07 20:14:14 +00:00
Jason Ekstrand
eaf5e4a24d README: Drop the badges from the readme
They have been added as badges directly to the GitLab project.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-07 12:46:17 -06:00
Eric Engestrom
358d0cfab2 driconf: drop unused macro
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-07 13:40:26 +00:00
Eric Engestrom
00be88aab8 meson: add script to print the options before configuring a builddir
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-07 13:22:41 +00:00
Alyssa Rosenzweig
d43ec104b7 panfrost: Include glue for out-of-tree legacy code
In addition to the DRM interface in active development, for legacy
kernels Panfrost has a small, optional, out-of-tree glue repository. For
various reasons, this legacy code should not be included in Mesa proper,
but this commit allows it to coexist peacefully with upstream Panfrost.
If the nondrm repo is cloned/symlinked to the directory
`src/gallium/drivers/panfrost/nondrm`, legacy functionality will be
built. Otherwise, the driver will build normally, though a runtime error
message will be printed if a legacy kernel is detected.

This workaround is icky, but it allows a nearly-upstream Panfrost to
work on real hardware, today. Ideally, this patch will be reverted when
the Panfrost kernel module is mature and we drop legacy support.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-07 01:58:32 +00:00
Alyssa Rosenzweig
7da251fc72 panfrost: Check in sources for command stream
This patch includes the command stream portion of the driver,
complementing the earlier compiler. It provides a base for future work,
though it does not integrate with any particular winsys.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-07 01:57:50 +00:00
Alyssa Rosenzweig
8f4485ef1a panfrost: Use u_pipe_screen_get_param_defaults
Switching to the defaults function cleans up pan_screen.h markedly and
futureproofs for when new PIPE_CAPs are added.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Eric Anholt <eric@anholt.net>
2019-02-07 01:57:19 +00:00
Alyssa Rosenzweig
8f9f99d84d kmsro: Move DRM entrypoints to shared block
As kmsro allows an essentially mix-and-match hodgepodge of display
drivers and renderonly GPUs, it doesn't make sense to couple the display
driver entrypoint definition with the driver. Instead, we move *all*
kmsro entrypoints to a shared kmsro block at the end (avoiding clutter
and distraction since this list may snowball in the future).

v2: Alphabetize driver list.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-07 01:50:16 +00:00
Rhys Perry
5b6f522fc2 nvc0: add compute invocation counter
The strategy is to keep a CPU-side counter of the direct invocations,
and a GPU-side counter of the indirect invocations, and then add them
together for queries.

The specific technique is a macro which multiplies a list of integers
together and accumulates the product into SCRATCH registers held inside
of the context. Another macro will read those values out and add them to
the passed-in cpu-side counter to be stored in a query buffer the same
way that all the other statistics are stored.

Original implementation by Rhys Perry, redone by Ilia Mirkin to use the
SCRATCH temporaries.
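
The split between the two counters can be pictured with the plain C
sketch below. It only models the arithmetic described above: the struct
`invocation_counters` and the helper names are hypothetical, and the
`gpu_indirect` field merely stands in for the SCRATCH accumulator that
the real macro updates on the GPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter state; names are illustrative only. */
struct invocation_counters {
   uint64_t cpu_direct;    /* kept on the CPU for direct invocations */
   uint64_t gpu_indirect;  /* stand-in for the GPU-side accumulator */
};

/* Multiply a list of integers together and return the product. */
static uint64_t
product(const uint32_t *values, size_t count)
{
   uint64_t p = 1;
   for (size_t i = 0; i < count; i++)
      p *= values[i];
   return p;
}

/* Direct invocation: the values are known on the CPU. */
static void
count_direct(struct invocation_counters *c, const uint32_t *values, size_t n)
{
   c->cpu_direct += product(values, n);
}

/* Indirect invocation: in the driver this accumulation happens on the
 * GPU; here it is modelled on the CPU for illustration. */
static void
count_indirect(struct invocation_counters *c, const uint32_t *values, size_t n)
{
   c->gpu_indirect += product(values, n);
}

/* A query reports the sum of both counters. */
static uint64_t
query_total(const struct invocation_counters *c)
{
   return c->cpu_direct + c->gpu_indirect;
}
```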

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-06 19:35:57 -05:00
Karol Herbst
cce4955721 gm107/ir: add fp64 rsq
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Karol Herbst
815a8e59c6 gm107/ir: add fp64 rcp
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Karol Herbst
12669d2970 gk104/ir: Use the new rcp/rsq in library
[imirkin: add a few more "long" prefixes to safen things up]
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Boyan Ding
656ad06051 gk110/ir: Use the new rcp/rsq in library
v2: (Karol Herbst <kherbst@redhat.com>)
 * fix Value setup for the builtins

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
[imirkin: track the fp64 flag when switching ops to calls]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Boyan Ding
7937408052 gk110/ir: Add rsq f64 implementation
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Boyan Ding
04593d9a73 gk110/ir: Add rcp f64 implementation
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
6adb9b38bf nvc0: stick zero values for the compute invocation counts
Not quite perfect, but at least we don't end up with random values in
the query buffer.

Fixes KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
e00799d3dc nv50,nvc0: use condition for occlusion queries when already complete
For the NO_WAIT variants, we would jump into the ALWAYS case for both
nested and inverted occlusion queries. However if the query had
previously completed, the application could reasonably expect that the
render condition would follow that result.

To resolve this, we remove the nesting distinction which unnecessarily
created an imbalance between the regular and inverted cases (since
there's no "zero" condition mode). We also use the proper comparison if
we know that the query has completed (which could happen as a result of
an earlier get_query_result call).

Fixes KHR-GL45.conditional_render_inverted.functional

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
162352e671 nvc0: fix 3d images on kepler
Looks like SUBFM.3D and SUEAU are perfectly capable of dealing with 3d
tiling, they just need the correct inputs. Supply them.

We also have to deal with the case where a 2d "layer" of a 3d image is
bound. In this case, we supply the z coordinate separately to the
shader, which has to optionally treat every 2d case as if it could be a
slice of a 3d texture.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
5de5beedf2 nvc0/ir: fix second tex argument after levelZero optimization
We used to pre-set a bunch of extra arguments to a texture instruction
in order to force the RA to allocate a register at the boundary of 4.
However with the levelZero optimization, which removes a LOD argument
when it's uniformly equal to zero, we undid that logic by removing an
extra argument. As a result, we could end up with insufficient alignment
on the second wide texture argument.

Instead we switch to a different method of achieving the same result.
The logic runs during the constraint analysis of the RA, and adds unset
sources as necessary right before being merged into a wide argument.

Fixes MISALIGNED_REG errors in Hitman when run with bindless textures
enabled on a GK208.

Fixes: 9145873b15 ("nvc0/ir: use levelZero flag when the lod is set to 0")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
4443b6ddf2 nvc0/ir: always use CG mode for loads from atomic-only buffers
Atomic operations don't update the local cache, which means that we
would have to issue CCTL operations in order to get the updated values.
When we know that a buffer is primarily used for atomic operations, it's
easier to just avoid the caching at that level entirely.

The same issue persists for non-atomic buffers, which will have to be
fixed separately.

Fixes the failing dEQP-GLES31.functional.atomic_counter.* tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
399215eb7a nvc0: add support for handling indirect draws with attrib conversion
The hardware does not natively support FIXED and DOUBLE formats. If
those are used in an indirect draw, they have to be converted. Our
conversion tries to be clever about only converting the data that's
needed. However for indirect, that won't work.

Given that DOUBLE or FIXED are highly unlikely to ever be used with
indirect draws, read the indirect buffer on the CPU and issue draws
directly.
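
A minimal sketch of that fallback, assuming the indirect buffer has
already been mapped for CPU reads: each record is copied out and handed
to a direct draw. The record layout follows the GL
DrawArraysIndirectCommand structure; `replay_indirect_draws`, the
callback type, and `sum_vertices` are hypothetical names for this
example and do not reflect the driver's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of one non-indexed indirect draw record. */
struct draw_arrays_indirect_cmd {
   uint32_t count;
   uint32_t instance_count;
   uint32_t first;
   uint32_t base_instance;
};

/* Hypothetical callback that issues one direct draw. */
typedef void (*direct_draw_fn)(const struct draw_arrays_indirect_cmd *cmd,
                               void *data);

/* Walk the mapped indirect buffer and issue each draw directly.
 * Returns the number of draws actually issued. */
static unsigned
replay_indirect_draws(const uint8_t *mapped, size_t offset, size_t stride,
                      unsigned draw_count, direct_draw_fn draw, void *data)
{
   unsigned issued = 0;

   for (unsigned i = 0; i < draw_count; i++) {
      struct draw_arrays_indirect_cmd cmd;

      /* memcpy avoids alignment assumptions about the mapping. */
      memcpy(&cmd, mapped + offset + (size_t)i * stride, sizeof(cmd));

      if (cmd.count == 0 || cmd.instance_count == 0)
         continue; /* nothing to draw for this record */

      draw(&cmd, data);
      issued++;
   }
   return issued;
}

/* Example callback: accumulate the number of vertices processed. */
static void
sum_vertices(const struct draw_arrays_indirect_cmd *cmd, void *data)
{
   uint64_t *total = data;
   *total += (uint64_t)cmd->count * cmd->instance_count;
}
```

Because the counts are known on the CPU at this point, the existing
conversion path can again convert only the data each draw needs.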

Fixes the failing dEQP-GLES31.functional.draw_indirect.random.* tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Kristian H. Kristensen
0f7a20e91e freedreno/a6xx: Use tiling for all resources
We used to restrict this to just PIPE_BIND_SAMPLER_VIEW resources, but
most resources benefit from being tiled.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-02-06 15:28:48 -08:00
Kristian H. Kristensen
357ea7da51 freedreno/a6xx: Emit blitter dst with OUT_RELOCW
We're writing to the bo and the kernel needs to know for
fd_bo_cpu_prep() to work.

Fixes: f93e431272 ("freedreno/a6xx: Enable blitter")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-06 15:22:25 -08:00
Bas Nieuwenhuizen
13ab63bb62 radv: Implement VK_EXT_buffer_device_address.
v2: Also update the release notes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:37:38 +01:00
Bas Nieuwenhuizen
3259e7b036 radv: Do not use the bo list for local buffers.
The kernel already does it for us.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:19 +01:00
Bas Nieuwenhuizen
8a15950211 amd/common: Implement global memory accesses.
Needed for VK_EXT_buffer_device_address.

The pointers are implemented as i8*, since I could not figure
out how to emulate setting struct offsets in LLVM based on the
SPIR-V offsets (and more weird stuff like row major matrices).

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:11 +01:00
Bas Nieuwenhuizen
5703ecf651 amd/common: Do not use 32-bit loads for shared memory.
We use a straight glsl->llvm type conversion so types should already be right.

Also, even though the writemasks were changed, we were not actually doing
32-bit things, so this fails miserably.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:06 +01:00
Bas Nieuwenhuizen
8d1718590b amd/common: handle nir_deref_cast for shared memory from integers.
Can happen e.g. after a phi.

Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:02 +01:00
Bas Nieuwenhuizen
830fd0efc1 amd/common: Handle nir_deref_type_ptr_as_array for shared memory.
Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:58 +01:00
Bas Nieuwenhuizen
dbdb44d575 amd/common: Fix stores to derefs with unknown variable.
Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:54 +01:00
Bas Nieuwenhuizen
3c24fc64c7 amd/common: Use correct writemask for shared memory stores.
The check was for 1 bit being set, which is clearly not what we want.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:49 +01:00
Bas Nieuwenhuizen
00253ab2c4 radv: Fix the shader info pass for not having the variable.
For example with VK_EXT_buffer_device_address or
VK_KHR_variable_pointers.

Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:45 +01:00
Bas Nieuwenhuizen
58c8dadd32 amd/common: Implement ptr->int casts in ac_to_integer.
For the implicit casts inherent in nir.

This should probably have been done for shared memory for
VK_KHR_variable_pointers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:40 +01:00
Bas Nieuwenhuizen
e00d9a9a72 amd/common: Add gep helper for pointer increment.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:36 +01:00
Bas Nieuwenhuizen
39ab4e12f7 radv: Only look at pImmutableSamples if the descriptor has a sampler.
Equivalent of ANV patch c7f4a2867c

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:32 +01:00
Eric Engestrom
40b53a7203 xvmc: fix string comparison
Fixes: 6fca18696d "g3dvl: Update XvMC unit tests."
Cc: Younes Manton <younes.m@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 18:15:43 +00:00
Eric Engestrom
110a6e1839 xvmc: fix string comparison
Fixes: c7b65dcaff "xvmc: Define some Xv attribs to allow users
                             to specify color standard and procamp"
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 18:15:43 +00:00
Eric Engestrom
ba26bc4ef0 gitlab-ci: add meson glvnd build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
5459900f38 travis: remove unused scons code path
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
087af992a2 travis: remove unused linux code path
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
73275147fe gitlab-ci: add make Gallium ST Other build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
360a7bfbe9 gitlab-ci: add make Gallium ST Clover LLVM-7 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
39315a747b gitlab-ci: add make Gallium ST Clover LLVM-6.0 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
e80f88c48a gitlab-ci: add make Gallium ST Clover LLVM-5.0 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
cc85f50029 gitlab-ci: add make Gallium ST Clover LLVM-4.0 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
984e295500 gitlab-ci: add make Gallium ST Clover LLVM-3.9 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
d0dff24cbb gitlab-ci: add make Gallium Drivers "Other" build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
055cfbc6de gitlab-ci: add make Gallium Drivers RadeonSI build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
7b26a19f31 gitlab-ci: add make Gallium Drivers SWR build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
bbdc563c11 gitlab-ci: add make loaders/classic DRI build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
f33517bda7 gitlab-ci: add meson gallium ST "Other" build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
8dab707ab8 gitlab-ci: add meson gallium ST Clover (LLVM 7.0) build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
8744ac0904 gitlab-ci: add meson gallium ST Clover (LLVM 6.0) build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
b5a70af062 gitlab-ci: add meson gallium ST Clover (LLVM 5.0) build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
d407ead204 gitlab-ci: add meson gallium "other drivers" build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
06e8f1961b gitlab-ci: add meson gallium RadeonSI build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
360c814bfe gitlab-ci: add meson gallium SWR build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
d73265e20d gitlab-ci: add meson loader/classic DRI build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
6a19ec9daa gitlab-ci: add scons SWR build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
d4c6d4d5cb gitlab-ci: add scons llvm 3.5 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
06b245b438 gitlab-ci: add a scons no-llvm build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
89a7467899 gitlab-ci: add a make vulkan build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
46d23c0a46 gitlab-ci: add a meson vulkan build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
329f5cd780 gitlab-ci: add ubuntu container
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Marek Olšák
42a1cd034d radeonsi: use local ws variable in si_need_dma_space
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Marek Olšák
2c4911c652 radeonsi: don't leak an index buffer if draw_vbo fails
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Marek Olšák
d72c319867 radeonsi: make allocator_zeroed_memory unmappable and use bigger buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Marek Olšák
5068dec5de radeonsi: clear allocator_zeroed_memory with SDMA
so that it can be used in parallel IBs.

This also removes the SO_FILLED_SIZE hack.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Marek Olšák
7d4c935654 radeonsi: initialize textures using DCC to black when possible
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Jonathan Marek
3361305f57 freedreno: a2xx: fix fast clear
Fixes: 912a9c8d

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 14:34:57 +00:00
Eric Engestrom
54fa5eceae egl: use coherent variable names
`EGLDisplay` variables (the opaque Khronos type) have mostly been
consistently called `dpy`, as this is the name used in the Khronos
specs.

However, `_EGLDisplay` variables (our internal struct) have been
randomly called `dpy` when there was no local variable clash with
`EGLDisplay`s, and `disp` otherwise.

Let's be consistent and use `dpy` for the Khronos type, and `disp`
for our struct.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-02-06 11:53:24 +00:00
Alyssa Rosenzweig
a81d5587d6 meson: Remove panfrost from default driver list
Until the kernel side matures and the full driver is upstreamed, to
avoid end-user surprises, Panfrost should only be built for the
adventurous.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-06 02:59:00 +00:00
Eric Anholt
3c08ecf147 v3d: Whitespace consistency fix.
2019-02-05 15:46:42 -08:00
Eric Anholt
940501a446 v3d: Fix copy-propagation of input unpacks.
I had a single function for "does this do float input unpacking" with two
major flaws: It was missing the most common thing to try to copy propagate
a f32 input unpack to (the VFPACK to an FP16 render target) along with
several other ALU ops, and also would try to propagate an f32 unpack into
a VFMUL which only does f16 unpacks.

instructions in affected programs: 659232 -> 655895 (-0.51%)
uniforms in affected programs: 132613 -> 135336 (2.05%)

and a couple of programs increase their thread counts.

The uniforms hit appears to be a pattern in generated code of doing (-a >=
a) comparisons, which when a is abs(b) can result in the abs instruction
being copy propagated once but not fully DCEed.
2019-02-05 15:46:04 -08:00
Eric Anholt
e5c6938590 v3d: Fix input packing of .l for rounding/fdx/fdy.
Avoids a regression in
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.* once we start
copy-propagating more input packs.
2019-02-05 15:45:23 -08:00
Eric Anholt
1a4170952d v3d: Fix pack/unpack of VFPACK operand unpacks.
We want to be able to copy propagate our texture unpacks into the vfpack.
2019-02-05 15:45:23 -08:00
Eric Anholt
d0fdbd4211 v3d: Fix dumping of shaders with alpha test.
We were trying to print a NULL entry from the table.
2019-02-05 15:42:14 -08:00
Eric Anholt
bdef17b052 v3d: Store the actual mask of color buffers present in the key.
If you only bound rt 1+, we'd still emit a write to the rt0 that isn't
present (noticed while debugging an
ext_framebuffer_multisample-alpha-to-coverage-no-draw-buffer-zero
regression in another change).
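
To illustrate the idea of a "mask of color buffers present", the sketch
below builds one bit per bound render target. The function name and the
array-of-pointers input are assumptions for this example, not the
driver's actual key layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a bitmask with bit i set when render target i is bound.
 * Unbound slots are represented by NULL pointers. */
static uint8_t
color_buffer_mask(const void *const *cbufs, unsigned nr_cbufs)
{
   uint8_t mask = 0;

   for (unsigned i = 0; i < nr_cbufs && i < 8; i++) {
      if (cbufs[i] != NULL)
         mask = (uint8_t)(mask | (1u << i));
   }
   return mask;
}
```

With only rt 1 bound the mask is 0x2, so a consumer testing bit 0 can
tell that rt0 is absent and skip the write.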
2019-02-05 15:42:04 -08:00
Eric Anholt
17a649af05 v3d: Fix precompile of FRAG_RESULT_DATA1 and higher outputs.
I was just leaving the other MRT targets than DATA0 out, by accident.
2019-02-05 15:35:49 -08:00
Kristian H. Kristensen
ba4b22011a st/nir: Use src/ relative include path for autotools
Fixes: cdc53fa81c
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-05 14:19:51 -08:00
Kenneth Graunke
8fa54bc549 gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit.
Iris would like to use compact arrays for tesslevels and clip/cull
distances.  radeonsi will likely want to switch to these at some point,
since it'll be necessary for GL_ARB_gl_spirv support, but it's not ready
for them just yet.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
cf731564e6 st/nir: Call nir_lower_clip_cull_distance_arrays().
Today, st always sets LowerCombinedClipCullDistance, causing the GLSL IR
lowering to run, giving us vec4[2] arrays.  I would like to disable this
and instead run the NIR lowering so that we get compact float[] arrays
instead.

Calling the new pass is a noop if the GLSL IR pass has already run, so
it's safe to call the pass unconditionally.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
15c6902117 nir: Avoid splitting compact arrays into per-element variables.
Compact arrays are used for special variables like clip and cull
distances, or tessellation levels.  Drivers using compact arrays
assume that these values will always be actual arrays.  We don't
want to turn a float[1] gl_CullDistance into a single float; that
would confuse drivers.

Today, i965 uses compact arrays, and Gallium drivers use
nir_lower_io_arrays_to_elements, so we haven't had any overlap
that would demonstrate the issue.  Iris will use both.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
ba9dcc80fb nir: Avoid clip/cull distance lowering multiple times.
A couple places in st/nir assume that cull distances have been lowered
away, so it will need to call this lowering pass for drivers which opt
out of the GLSL IR lowering.  The Intel backend also calls this pass,
for i965 and anv.  We need to only do it once.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
5730364d69 nir: Bail on clip/cull distance lowering if GLSL IR already did it.
We have a GLSL IR pass to convert clip/cull distance float[] arrays
into vec4[2] arrays.  In ff281e6204, we attempted to skip this pass
if the GLSL IR lowering had already run.  But, that code was not quite
right, as we forgot to strip away the per-vertex IO array layer for
geometry and tessellation shader varyings.

If the GLSL IR pass has run, the variables will not be marked as
"compact".  So we can simply check that and bail.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
ef99f4c8d1 compiler: Mark clip/cull distance arrays as compact before lowering.
nir_lower_clip_cull_distance_arrays() marks the combined clip/cull
distance array as compact.  However, when translating in from GLSL
or SPIR-V, we were not marking the original float[] arrays as compact.

We should do so.  That way, we can detect these corner cases properly.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
3327c93510 nir: Record info->fs.pixel_center_integer in lower_system_values
radeonsi uses a system value for gl_FragCoord rather than an input var.
These get translated into load_frag_coord NIR intrinsics, which lose the
pixel_center_integer and origin_upper_left decorations.  To cope with
this, Tim added a shader_info field for pixel_center_integer, and made
glsl_to_nir set it accordingly.

prog_to_nir also needs to handle these fragcoord conventions.  Instead
of duplicating the logic to set the info field, just move it to
nir_lower_system_values so it'll happen regardless of who makes the NIR.

(For what it's worth, we don't need an info flag for origin_upper_left,
because radeonsi lowers origin conventions in nir_lower_wpos_ytransform
before nir_lower_system_values destroys the variable and qualifiers.)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:51:52 -08:00
Kenneth Graunke
536abd453b program: Extend prog_to_nir to handle system values.
Some drivers, such as radeonsi, use a system value for gl_FragCoord
rather than an input variable.  In this case, our Mesa IR will have
a PROGRAM_SYSTEM_VALUE register, which we need to translate.

This makes prog_to_nir work for Gallium drivers which expose the
PIPE_CAP_TGSI_FS_POSITION_IS_SYSVAL capability bit.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:51:51 -08:00
Kenneth Graunke
fa38ca25f6 program: Use u_bit_scan64 in prog_to_nir.
We can simply iterate the bits rather than using util_last_bit and
checking each one up until that point.
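
The iteration pattern looks roughly like the sketch below. `bit_scan64`
is a portable stand-in written for this example (it returns the index
of the lowest set bit and clears it), and `collect_set_bits` is a
hypothetical caller; neither is copied from the patch.

```c
#include <stdint.h>

/* Return the index of the lowest set bit in *mask and clear that bit.
 * *mask must be non-zero. */
static int
bit_scan64(uint64_t *mask)
{
   int i = 0;

   while (!((*mask >> i) & 1))
      i++;
   *mask &= *mask - 1; /* clear the lowest set bit */
   return i;
}

/* Visit only the set bits, storing their indices in out (which must
 * have room for 64 entries). Returns the number of indices written. */
static unsigned
collect_set_bits(uint64_t mask, int *out)
{
   unsigned n = 0;

   while (mask)
      out[n++] = bit_scan64(&mask);
   return n;
}
```

The loop body runs once per set bit, rather than once per bit position
up to the highest set bit.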

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:51:50 -08:00
Kenneth Graunke
a01ad3110a st/mesa: Add NIR versions of the PBO upload/download shaders.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:42 -08:00
Kenneth Graunke
a02349b9e7 st/mesa: Add a NIR version of the OES_draw_texture built-in shaders.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:41 -08:00
Kenneth Graunke
be492affa8 st/mesa: Add NIR versions of the clear shaders.
We implement the basic VS and FS, as well as the VS that does layered
clears by writing gl_Layer from the vertex shader.  Drivers which need
a geometry shader for writing layer continue falling back to TGSI, as
I didn't need this and so didn't bother implementing it.  (We certainly
could, however, if people want to add it in the future.)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:39 -08:00
Kenneth Graunke
3f28b245b5 st/mesa: Add NIR versions of the drawpixels Z/stencil fragment shaders.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:37 -08:00
Kenneth Graunke
2d45f9fa25 st/mesa: Add a NIR version of the drawpixels/bitmap VS copy shader.
This provides a native NIR version of the DrawPixels/Bitmap passthrough
vertex shader.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:36 -08:00
Kenneth Graunke
cdc53fa81c st/nir: Make new helpers for constructing built-in NIR shaders.
The state tracker generates several built-in shaders in order to
perform scissored clears, upload/download PBOs, and so on.  These
are currently constructed using TGSI, using ureg and u_simple_shader.

I want to have NIR versions of these shaders, for my Gallium driver
that has a NIR backend but no TGSI support.  To that end, we'll want
a few helpers to help construct simple shaders.

This patch adds two new helpers:

- st_nir_finish_builtin_shader() takes a manually constructed NIR
  shader, applies lowering passes (like st_link_nir would do for GLSL),
  and constructs the pipe_shader_state.

- st_nir_make_passthrough_shader() makes a simple passthrough shader,
  which copies inputs to outputs.  This is similar to u_simple_shaders.

v2: Set info->fs.untyped_color_outputs for vc4/v3d (thanks Eric!).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:33 -08:00
Kenneth Graunke
4f799264d1 st/nir: Move varying setup code to a helper function.
I want to reuse this for built-in shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:02 -08:00
Jason Ekstrand
36734987a5 nir/deref: Drop zero ptr_as_array derefs
They are effectively (&x)[0] or *&x which does nothing.
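
As a plain-C illustration of the identity mentioned above (a sketch only; the function name is made up for this example and is not part of NIR), both forms name the original object:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when indexing a pointer-as-array at 0 and dereferencing an
 * address-of both refer to the original object. Hypothetical helper. */
static int zero_index_is_identity(int x)
{
    int *via_index = &(&x)[0];  /* (&x)[0] names x itself */
    int *via_deref = &*&x;      /* *&x also names x itself */

    assert(via_index != NULL);
    return via_index == &x && via_deref == &x && (&x)[0] == x;
}
```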

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-05 15:17:19 -06:00
Eric Anholt
aaef12702f nir: Move V3D's "the shader was TGSI, ignore FS output types" flag to NIR.
Ken's rework of mesa/st builtins to NIR means that we'll have more NIR
shaders with color output types that are mismatched with the render target
types.  Since this is behavior that GLSL doesn't require, add it as a
shader_info option so the driver can know that it needs to ignore the FS
output's base type in favor of the actual render target's.  This prevents
needing additional variants in several mesa/st paths (clear, pbo upload,
pbo download), given that the driver already has to handle the variants
for any TGSI being passed to it (from u_blitter, for example).

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-05 12:12:33 -08:00
Emil Velikov
8943eb8f03 anv: wire up the state_pool_padding test
Cc: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 927ba12b53 ("anv/tests: Adding test for the state_pool padding.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-02-05 11:39:36 -08:00
Karol Herbst
a61c388d07 nvc0/ir: replace cvt instructions with add to improve shader performance
gives me a performance boost of 0.2% in pixmark_piano on my gk106, gm204 and
gp107.

reduces the amount of generated convert instructions by roughly 30% in
shader-db.

v2: only for 32 bit operations
    move some common code out of the switch
    handle OP_SAT with modifiers
v3: only for registers and const memory
    rework if clauses
    merge isCvt into this patch
v4: merge isCvt into its use

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-05 20:35:38 +01:00
Bart Oldeman
a203eaa4f4 gallium-xlib: query MIT-SHM before using it.
When Mesa is compiled for gallium-xlib using e.g.
./configure --enable-glx=gallium-xlib --disable-dri --disable-gbm
--disable-egl
and is used by an X server (usually remotely via SSH X11 forwarding)
that does not support MIT-SHM such as XMing or MobaXterm, OpenGL
clients report error messages such as
Xlib:  extension "MIT-SHM" missing on display "localhost:11.0".
ad infinitum.

The reason is that the code in src/gallium/winsys/sw/xlib uses
MIT-SHM without checking for its existence, unlike the code
in src/glx/drisw_glx.c and src/mesa/drivers/x11/xm_api.c.
I copied the same check using XQueryExtension, and tested with
glxgears on MobaXterm.

This issue was reported before here:
https://lists.freedesktop.org/archives/mesa-users/2016-July/001183.html

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
2019-02-05 17:53:35 +00:00
Alok Hota
6e5eb4ead6 swr/rast: update SWR rasterizer shader stats
Primarily refactoring internal stats types

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-05 11:41:25 -06:00
Michel Dänzer
c0a540f320 loader/dri3: Use strlen instead of sizeof for creating VRR property atom
sizeof counts the terminating null character as well, so that also
contributed to the ID computed for the X11 atom. But the convention is
for only the non-null characters to contribute to the atom ID.
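
A minimal sketch of the difference, assuming the property name from the Fixes line below is stored in a char array (the variable and function names here are illustrative, not the loader's):

```c
#include <assert.h>
#include <string.h>

/* Illustrative storage for the atom name; not the loader's variable. */
static const char example_atom_name[] = "_VARIABLE_REFRESH";

/* sizeof on a char array counts the terminating null character. */
static size_t name_len_with_nul(void)
{
    return sizeof(example_atom_name);
}

/* strlen counts only the characters before the terminating null. */
static size_t name_len_without_nul(void)
{
    size_t len = strlen(example_atom_name);

    assert(len > 0);
    return len;
}
```

Passing the first value as a name length would include one extra byte, which is the mismatch described above.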

Fixes: 2e12fe425f "loader/dri3: Enable adaptive_sync via
                     _VARIABLE_REFRESH property"
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-05 17:18:44 +00:00
Jonathan Marek
4f0a3c9f9e nir: add missing vec opcodes in lower_bool_to_float
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-05 15:34:15 +00:00
Gert Wollny
b0b3de2be7 mesa: release references to image textures when a context is destroyed
When a texture is still bound as an image and the context it was bound in
is destroyed but not the texture, then the texture will still hold the
resource and will not be freed when it is finally destroyed. Hence, release
these references when the context is destroyed.

This leak was triggered by virglrenderer:
https://gitlab.freedesktop.org/virgl/virglrenderer/issues/86

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-05 10:53:41 +00:00
Gert Wollny
f1f3640f6f radeonsi: release tokens after creating the shader program
ureg_get_tokens clears the reference to the tokens, and create_compute_state makes
a copy, hence the tokens must be explicitly released.
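
The ownership pattern can be sketched in plain C. The helpers below are hypothetical stand-ins (they are not ureg_get_tokens or create_compute_state); they only model "the builder hands ownership to the caller, the create call keeps its own copy":

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical builder: the caller owns the returned buffer. */
static int *build_tokens(size_t count)
{
    return calloc(count, sizeof(int));
}

/* Hypothetical create call: keeps an independent copy of the tokens. */
static int *create_state_copy(const int *tokens, size_t count)
{
    int *copy = malloc(count * sizeof(*copy));

    if (copy)
        memcpy(copy, tokens, count * sizeof(*copy));
    return copy;
}

/* Returns 1 on success. The caller-owned tokens are freed once the
 * copy has been made, so nothing is leaked on either path. */
static int create_and_release(size_t count)
{
    int *tokens, *state;

    assert(count > 0);
    tokens = build_tokens(count);
    if (!tokens)
        return 0;

    state = create_state_copy(tokens, count);
    free(tokens);               /* the copy is independent of our buffer */
    if (!state)
        return 0;

    free(state);                /* stand-in for destroying the state later */
    return 1;
}
```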

Fixes: Direct leak of 256 byte(s) in 1 object(s) allocated from:
    #0 0x7ff729cf3c60 in realloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdbc60)
    #1 0x7ff721b1240c in tokens_expand ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:234
    #2 0x7ff721b1c9c0 in get_tokens ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:257
    #3 0x7ff721b1c9c0 in copy_instructions ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2040
    #4 0x7ff721b1c9c0 in ureg_finalize ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2090
    #5 0x7ff721b1e919 in ureg_get_tokens ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2167
    #6 0x7ff721f8b35a in si_create_dma_compute_shader ../../samba/mesa/src/gallium/drivers/radeonsi/si_shaderlib_tgsi.c:219
    #7 0x7ff722043ed9 in si_compute_do_clear_or_copy ../../samba/mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:156
    #8 0x7ff7220448d3 in si_clear_buffer ../../samba/mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:247
    #9 0x7ff7220350e8 in vi_dcc_clear_level ../../samba/mesa/src/gallium/drivers/radeonsi/si_clear.c:274

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-05 11:50:54 +01:00
Caio Marcelo de Oliveira Filho
8c7c543936 isl: assert that Gen8+ don't have bit6_swizzling
v2: Rewrite the condition to more clearly match the comment. (Jordan)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho
5299c9cbcc anv: skip bit6 swizzle detection in Gen8+
It is always false on Gen8+.  Also, move the variable definition near
its use.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho
60740eade3 i965: skip bit6 swizzle detection in Gen8+
It is always false on Gen8+.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho
51547bbc5a nir: keep the phi order when splitting blocks
All things being equal, it is better to keep the original order.  Since
the new block is empty, push the phis in order to the tail.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
2019-02-04 20:41:13 -08:00
Ilia Mirkin
38f542783f nv50,nvc0: add explicit settings for recent caps
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-04 23:36:46 -05:00
Alyssa Rosenzweig
e67e072637 panfrost: Implement Midgard shader toolchain
This patch implements the free Midgard shader toolchain: the assembler,
the disassembler, and the NIR-based compiler. The assembler is a
standalone inaccessible Python script for reference purposes. The
disassembler and the compiler are implemented in C, accessible via the
standalone `midgard_compiler` binary. Later patches will use these
interfaces from the driver for online compilation.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-05 01:26:28 +00:00
Alyssa Rosenzweig
61d3ae6e0b panfrost: Initial stub for Panfrost driver
This patch adds an initial stub for the Gallium driver, containing
simple screen functions and the majority of the driver headers but no
actual functionality. It further adds the winsys glue for linking in
this stub driver via kmsro on Rockchip/Amlogic boards.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-05 01:19:30 +00:00
Marek Olšák
742d6cdb42 radeonsi: fix crashing performance counters (division by zero)
Fixes: e2b9329f17 "radeonsi: move remaining perfcounter code into si_perfcounter.c"
2019-02-04 18:46:25 -05:00
Marek Olšák
a03ecbaeec radeonsi: handle render_condition_enable in si_compute_clear_render_target
2019-02-04 18:46:25 -05:00
Sonny Jiang
984fd73515 radeonsi: use compute for clear_render_target when possible
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-02-04 18:46:25 -05:00
Kenneth Graunke
dc46317d1a st/mesa: Set pipe_image_view::shader_access in PBO readpixels.
Commit 8b626a22b2 introduced a new
pipe_image_view::shader_access field, indicating the access mode
specified in the shader.  st/mesa's built-in PBO download shader
creates a write-only image buffer, so we should flag it as such.

Nobody uses this field yet (Iris will), so we don't need to backport
this fix to stable branches.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-04 11:17:56 -08:00
Rodrigo Vivi
56c3b4971d intel: Add more PCI Device IDs for Coffee Lake and Ice Lake.
Align with kernel commits:

5e0f5a58b167 ("drm/i915/cfl: Adding another PCI Device ID.")
03ca3cf8e9aa ("drm/i915/icl: Adding few more device IDs for Ice Lake")

Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-04 10:05:25 -08:00
Danylo Piliaiev
64d3b148fe anv: Fix VK_EXT_transform_feedback working with varyings packed in PSIZ
Transform feedback did not set correct SO_DECL.ComponentMask for
varyings packed in VARYING_SLOT_PSIZ:
 gl_Layer         - VARYING_SLOT_LAYER    in VARYING_SLOT_PSIZ.y
 gl_ViewportIndex - VARYING_SLOT_VIEWPORT in VARYING_SLOT_PSIZ.z
 gl_PointSize     - VARYING_SLOT_PSIZ     in VARYING_SLOT_PSIZ.w
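
The packing listed above could be expressed roughly as follows. This is a sketch under assumptions: the enum and function are invented for illustration, and the mask uses the usual x=1, y=2, z=4, w=8 bit layout rather than anything taken from the driver:

```c
#include <assert.h>

/* Hypothetical slot identifiers, used only in this sketch. */
enum example_slot {
    EXAMPLE_SLOT_LAYER,
    EXAMPLE_SLOT_VIEWPORT,
    EXAMPLE_SLOT_PSIZ,
    EXAMPLE_SLOT_OTHER
};

/* Returns the component mask to use for a stream-output declaration.
 * Bits: x = 1, y = 2, z = 4, w = 8. */
static unsigned packed_component_mask(enum example_slot slot,
                                      unsigned declared_mask)
{
    assert(declared_mask <= 0xfu);

    switch (slot) {
    case EXAMPLE_SLOT_LAYER:    return 1u << 1;  /* packed in .y */
    case EXAMPLE_SLOT_VIEWPORT: return 1u << 2;  /* packed in .z */
    case EXAMPLE_SLOT_PSIZ:     return 1u << 3;  /* packed in .w */
    default:                    return declared_mask;
    }
}
```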

Fixes: 36ee2fd61c "anv: Implement the basic form of VK_EXT_transform_feedback"

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 15:30:43 +00:00
Danylo Piliaiev
b7a93cbded radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment
From the Vulkan 1.0.98 spec for vkCmdClearAttachments:

"If any attachment to be cleared in the current subpass is VK_ATTACHMENT_UNUSED,
then the clear has no effect on that attachment."

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that
element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED,
or must be a valid color attachment."

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_DEPTH_BIT, then the current subpass' depth/stencil attachment
must either be VK_ATTACHMENT_UNUSED, or must have a depth component"

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_STENCIL_BIT, then the current subpass' depth/stencil attachment
must either be VK_ATTACHMENT_UNUSED, or must have a stencil component"
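
A rough sketch of the "no effect" rule quoted above, not the radv code itself. The constant is defined locally with the same value as VK_ATTACHMENT_UNUSED (~0U) so the example needs no Vulkan headers, and the function name is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as VK_ATTACHMENT_UNUSED, redefined for a standalone sketch. */
#define EXAMPLE_ATTACHMENT_UNUSED (~0u)

/* Counts the clears that actually touch an attachment; entries that
 * resolve to an unused attachment are skipped. */
static size_t count_effective_clears(const uint32_t *attachment_indices,
                                     size_t count)
{
    size_t cleared = 0;

    assert(attachment_indices != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (attachment_indices[i] == EXAMPLE_ATTACHMENT_UNUSED)
            continue;           /* the clear has no effect here */
        cleared++;
    }
    return cleared;
}
```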

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 14:50:43 +02:00
Danylo Piliaiev
d76e777988 anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment
From the Vulkan 1.0.98 spec for vkCmdClearAttachments:

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that
element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED,
or must be a valid color attachment."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-04 14:49:50 +02:00
Samuel Pitoiset
0d0affad3c radv: don't flush src stages when dstStageMask == BOTTOM_OF_PIPE
Original patch by Fredrik Höglund.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
9efa3405a7 radv: do not set preserveAttachments for internal render passes
We don't use that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
80e809d993 radv: drop useless checks when resolving subpass color attachments
The Vulkan spec says:
   "If pResolveAttachments is not NULL, for each resolve attachment
    that does not have the value VK_ATTACHMENT_UNUSED, the
    corresponding color attachment must not have the value
    VK_ATTACHMENT_UNUSED."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
76c17cfd8d radv: execute external subpass barriers after ending subpasses
Outgoing dependencies (i.e. external) should happen after the subpass.
This doesn't change anything for subpass resolves as we already
make sure that attachments are shader readable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
b482c030f5 radv: accumulate all ingoing external dependencies to the first subpass
In case two or more subpasses declare ingoing external dependencies.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
eaab35e5e3 radv: handle subpass dependencies correctly
The different masks should be accumulated, for example if two
subpasses declare an outgoing dependency (i.e. dst ==
VK_SUBPASS_EXTERNAL).
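
Accumulating here means OR-ing each dependency into a running total instead of overwriting it. A minimal sketch, with a made-up struct and function rather than the radv types:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical barrier description, used only for this illustration. */
struct example_barrier {
    uint32_t src_stage_mask;
    uint32_t src_access_mask;
    uint32_t dst_access_mask;
};

/* Merges one dependency into the running total. Using |= keeps the
 * bits contributed by earlier dependencies. */
static void accumulate_dependency(struct example_barrier *total,
                                  const struct example_barrier *dep)
{
    assert(total && dep);
    total->src_stage_mask  |= dep->src_stage_mask;
    total->src_access_mask |= dep->src_access_mask;
    total->dst_access_mask |= dep->dst_access_mask;
}
```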

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
6430616e77 radv: track if subpasses have color attachments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
1e810f1c53 radv: add radv_render_pass_add_subpass_dep() helper
To share common code that handles subpass dependencies.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
2472907563 radv: move some render pass things to radv_render_pass_compile()
radv_render_pass_compile() is common to vkCreateRenderPass()
and vkCreateRenderPass2().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
b509013060 radv: handle final layouts at end of every subpass and render pass
That shouldn't change anything as we check if the last
subpass id is the final subpass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:18:38 +01:00
Samuel Pitoiset
5699ac0078 radv: determine the last subpass id for every attachments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:59 +01:00
Samuel Pitoiset
e1a0a268c6 radv: use the new attachments array when starting subpasses
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:57 +01:00
Samuel Pitoiset
a20c2e38d8 radv: store the list of attachments for every subpass
This reworks how the depth stencil attachment is used for
simplicity. This also introduces radv_render_pass_compile()
helper that will be used for further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:54 +01:00
Samuel Pitoiset
a7c7d811f1 radv: move subpass image transitions to radv_cmd_buffer_begin_subpass()
Instead of doing them in radv_cmd_buffer_set_subpass().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:52 +01:00
Samuel Pitoiset
291a933786 radv: add radv_cmd_buffer_begin_subpass() helper
To unify some code in BeginRenderPass() and NextSubpass().
Based on Intel ANV driver.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:50 +01:00
Samuel Pitoiset
41199e2eeb radv: remove useless MAYBE_UNUSED in CmdBeginRenderPass()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:46 +01:00
Samuel Pitoiset
545552c9b9 radv: remove unused radv_render_pass_attachment::view_mask
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:42 +01:00
Samuel Pitoiset
0f932bbede radv: bail out when no image transitions will be performed
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:40 +01:00
Marek Olšák
1e85cfb91a meson: drop the xcb-xrandr version requirement
autotools doesn't have any requirement. This fixes meson on Ubuntu 16.04.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-03 18:39:57 -05:00
Eric Engestrom
808bf59cac wsi/display: add comment
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2019-02-02 23:08:03 +00:00
Jason Ekstrand
0aa5a97b03 relnotes: Add VK_EXT_buffer_device_address
2019-02-02 08:42:14 -06:00
Jason Ekstrand
48ed2a7bb0 anv: Implement VK_EXT_buffer_device_address
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-01 17:09:42 -06:00
Jason Ekstrand
e644ed468f intel/fs: Implement nir_intrinsic_global_atomic_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
a91f392073 intel/fs: Use SENDS for A64 writes on gen9+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
1c25bf4373 intel/fs: Implement load/store_global with A64 untyped messages
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
b4f0d062cd intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode
Previously, we only applied the fix to shaders with a dispatch mode of
SIMD8 but the code it relies on for SIMD16 mode only applies to SIMD16
instructions.  If you have a SIMD8 instruction in a SIMD16 shader,
neither would trigger and the restriction could still be hit.

Fixes: 232ed89802 "i965/fs: Register allocator shoudn't use grf127..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
79724a0756 intel/fs: Properly handle 64-bit types in LOAD_PAYLOAD
By just assigning dst.type to src[i].type, we ensure that the offset at
the end of the loop actually offsets it by the right number of
registers.  Otherwise, we'll get into a case where we copy with a Q type
and then offset with a D type and things get out of sync.
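
To see why the types must agree, consider how many registers one source occupies. The sketch below assumes 32-byte registers and uses invented names; it is an arithmetic illustration, not the fs_visitor code:

```c
#include <assert.h>

/* Assumption for this sketch: one register holds 32 bytes. */
#define EXAMPLE_REG_BYTES 32u

/* Registers covered by one source of the given channel count and
 * per-channel type size, rounded up to whole registers. */
static unsigned regs_per_source(unsigned exec_size, unsigned type_bytes)
{
    unsigned bytes;

    assert(exec_size > 0 && type_bytes > 0);
    bytes = exec_size * type_bytes;
    return (bytes + EXAMPLE_REG_BYTES - 1) / EXAMPLE_REG_BYTES;
}
```

With 8 channels, an 8-byte (Q) copy covers two registers while a 4-byte (D) offset advances by only one, so the next copy would land on data that was just written.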

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:10:57 -06:00
Jason Ekstrand
f02914a991 intel/fs/cse: Split create_copy_instr into three cases
Previously, we tried to combine all cases where the instruction being
CSE'd writes to more than one MOV worth of registers into one case with
a bit of special casing for LOAD_PAYLOAD.  This commit splits things so
that LOAD_PAYLOAD is entirely its own case.  This makes tweaking the
LOAD_PAYLOAD case simpler in the next commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:10:40 -06:00
Jason Ekstrand
f409a08e5f intel/nir: Add global support to lower_mem_access_bit_sizes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:08:29 -06:00
Oscar Blumberg
fea5b8e5ad intel/fs: Fix memory corruption when compiling a CS
Missing check for shader stage in the fs_visitor would corrupt the
cs_prog_data.push information and trigger crashes / corruption later
when uploading the CS state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 10:53:33 -08:00
Jason Ekstrand
ab940b0d97 spirv: Support LocalSizeId and LocalSizeHintId execution modes
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Jason Ekstrand
7223590c42 spirv: Handle OpExecutionModeId
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Jason Ekstrand
e68871f6a4 spirv: Handle constants and types before execution modes
We already defer handling the actual execution modes until after we've
created the shader.  This just moves it a tiny bit further so we
actually have constants and types and can handle OpExecutionModeId.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Jason Ekstrand
7d862ef530 spirv: Rework handling of spec constant workgroup size built-ins
Instead of handling it as part of the handling of constant instructions,
just stash the vtn_value when we see the decoration and handle it
explicitly later.  This will let us re-order handling of constant
instructions without breaking the Vulkan SPIR-V requirement that
decorating a specialization constant as the WorkgroupSize built-in
overrides the workgroup size set as an execution mode.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Jason Ekstrand
9b37e93e42 spirv: Replace vtn_constant_value with vtn_constant_uint
The uint version is less typing, supports different bit sizes, and is
probably a bit more safe because we're actually verifying that the
SPIR-V value is an integer scalar constant.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Samuel Pitoiset
5e7f800f32 radv: fix build
Fixes: 9b9ccee4d6 ("radv: take LDS into account for compute shader occupancy stats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-01 15:31:55 +01:00
Timothy Arceri
9b9ccee4d6 radv: take LDS into account for compute shader occupancy stats
Ported from d205faeb6c.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-01 22:25:30 +11:00
Timothy Arceri
a53d68d318 ac/radv/radeonsi: add ac_get_num_physical_sgprs() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-01 22:25:30 +11:00
Gurchetan Singh
574186f0e8 docs: add GL_EXT_texture_compression_s3tc_srgb to release notes
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-01 10:01:59 +00:00
Gurchetan Singh
dc9a15aefb st/mesa: expose EXT_texture_compression_s3tc_srgb
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-01 10:01:59 +00:00
Gurchetan Singh
a2ab400719 i965: Set flag for EXT_texture_compression_s3tc_srgb
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-01 10:01:59 +00:00
Gurchetan Singh
db24132d80 mesa/main: Expose EXT_texture_compression_s3tc_srgb
Required for the following test:

bin/compressedteximage GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT

pass when emulating GL on GLES.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-01 10:01:59 +00:00
Timothy Arceri
0f3a8e1b64 st/glsl_to_nir: remove dead local variables
Without this we do not end up with a deterministic NIR because
temporary register variables are added in random order. NIR must
be deterministic because we use it to produce a sha for the
radeonsi backends disk cache.

This fixes the shader cache for a bunch of shaders.

Another positive is that this results in a large reduction in the
size of the NIR that the state tracker stores to the disk cache.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 15:56:02 +11:00
Dylan Baker
4052142de7 meson: remove -std=c++11 from intel/tools
For meson, all C++ code is already compiled as C++11, so it's
unnecessary. It's also the wrong way to do this; if we really needed
this, the correct way is to set:

```meson
executable(
  ...
  override_options : ['cpp_std=c++11'],
)
```

Which ensures not only that the correct syntax for the current
compiler is used, but also that meson doesn't create arguments like
`-std=c++14 ... -std=c++11`

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Dylan Baker
8e49b32f63 meson: fix style in intel/tools
The `:` in options should always have one space before and after `foo
: bar`, and lists do not get spaces around the braces: `[foo]` not `[
foo ]`

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Dylan Baker
d93d53fa72 meson: remove build_by_default : true
Which is and has always been the default. This is largely an artifact
of how the building of these tools was controlled when the meson build
was originally created.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Emil Velikov
1240c3cb10 docs: update calendar, add news item and link release notes for 18.3.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-31 21:17:38 +00:00
Emil Velikov
83160c6c05 docs: add sha256 checksums for 18.3.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7475d7727f)
2019-01-31 21:15:20 +00:00
Emil Velikov
4d0732dc39 docs: add release notes for 18.3.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 190a79f462)
[Emil: drop VERSION hunk]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

 Conflicts:
	VERSION
2019-01-31 21:14:56 +00:00
Neha Bhende
69d736b17a st/mesa: Fix topogun-1.06-orc-84k-resize.trace crash
We need to initialize all fields in rs->prim explicitly while
creating a new rastpos stage.
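
One common way to guarantee that no field is left with stale contents is sketched below. The struct and its fields are hypothetical and do not mirror the real primitive structure; zeroing first and then assigning is one option, not necessarily what the patch does:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical primitive description, for illustration only. */
struct example_prim {
    unsigned mode;
    unsigned start;
    unsigned count;
    int indexed;
};

/* Puts every field into a known state before the struct is used. */
static void example_prim_init(struct example_prim *prim, unsigned mode)
{
    assert(prim);
    memset(prim, 0, sizeof(*prim));  /* clear whatever was there before */
    prim->mode = mode;
    prim->start = 0;
    prim->count = 1;
    prim->indexed = 0;
}
```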

Fixes: bac8534267 ("st/mesa: allow glDrawElements to work with GL_SELECT
feedback")

v2: Initializing all fields in rs->prim as per Ilia.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-31 12:21:59 -07:00
Dylan Baker
c812c740e6 android,autotools,i965: Fix location of float64_glsl.h
Android.mk and autotools disagree about where generated files should
go, which wasn't a problem until we wanted to build a dist
tarball. This corrects the problem by changing the output and include
paths to be the same on android and autotools (meson already has the
correct include path).

Fixes: 7d7b30835c
       ("automake: Fix path to generated source")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-31 19:04:30 +00:00
Marek Olšák
d49c16a597 gallium: allow more PIPE_RESOURCE_ driver flags
radeonsi has 8 and will probably have 9 soon.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-31 13:10:42 -05:00
Eric Anholt
ab4d5775b0 v3d: Fix image_load_store clamping of signed integer stores.
This was a copy-and-paste failure that oddly showed up in the CTS's
reinterprets of r32f, rgba8, and srgba8 to rgba8i, but not r32ui and r32i
to rgba8i or reinterprets to other signed int formats.
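
The message does not say what the mistaken clamp looked like, so the following is only a generic illustration of how signed and unsigned integer stores need different ranges. Function names are invented for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Clamps to the range of a signed integer channel of the given width,
 * e.g. [-128, 127] for 8 bits. */
static int32_t clamp_signed_store(int32_t value, unsigned bits)
{
    int32_t max, min;

    assert(bits >= 2 && bits <= 31);
    max = (int32_t)((1u << (bits - 1)) - 1u);
    min = -max - 1;
    if (value > max)
        return max;
    if (value < min)
        return min;
    return value;
}

/* Clamps to the range of an unsigned integer channel, e.g. [0, 255]. */
static uint32_t clamp_unsigned_store(uint32_t value, unsigned bits)
{
    uint32_t max;

    assert(bits >= 1 && bits <= 31);
    max = (1u << bits) - 1u;
    return value > max ? max : value;
}
```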

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")
2019-01-31 08:39:40 -08:00
Eric Anholt
db2ae51121 mesa: Skip partial InvalidateFramebuffer of packed depth/stencil.
One of the CTS cases tries to invalidate just stencil of packed
depth/stencil, and we incorrectly lost the depth contents.

Fixes dEQP-GLES3.functional.fbo.invalidate.whole.unbind_read_stencil
Fixes: 0c42b5f3cb ("mesa: wire up InvalidateFramebuffer")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-31 08:37:46 -08:00
Rob Clark
39cfdf9930 freedreno: more fixing release tarball
Fixes: aa0fed10d3 freedreno: move ir3 to common location
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-31 09:59:18 -05:00
Rob Clark
e252656d14 freedreno: fix release tarball
Fixes: b4476138d5 freedreno: move drm to common location
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-31 09:59:18 -05:00
Emmanuel Gil Peyrot
0d4dd59ae5 docs: make bugs.html easier to find
Thanks to Yann Kervran for the report and suggestions.

Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-31 14:31:48 +00:00
Dave Airlie
9279a28f07 virgl: ARB_query_buffer_object support
v1.1: fix size define.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-01-31 11:23:38 +10:00
Dave Airlie
38658c6d4d virgl: enable elapsed time queries
GL underneath always has GL_TIME_ELAPSED so always enable these.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-01-31 11:23:30 +10:00
Dylan Baker
da48cba61e automake: Add --enable-autotools to distcheck flags
Fixes: e68777c87c
       ("autotools: Deprecate the use of autotools")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-01-30 19:32:44 +00:00
Marek Olšák
ffbd37d8e9 radeonsi: fix a comment typo in si_fine_fence_set
2019-01-30 14:32:05 -05:00
Marek Olšák
f4eb746ef7 r600: add -Wstrict-overflow=0 to meson to silence the warning
same as radeonsi
2019-01-30 12:49:45 -05:00
Marek Olšák
d50bef9831 winsys/amdgpu: remove amdgpu_drm.h definitions
trivial
2019-01-30 12:38:56 -05:00
Marek Olšák
16672f16da radeonsi: unify error paths in si_texture_create_object
2019-01-30 12:35:22 -05:00
Marek Olšák
2361558eb7 radeonsi: merge & rename texture BO metadata functions
2019-01-30 12:35:22 -05:00
Marek Olšák
1c12d56e4d radeonsi: enable dithered alpha-to-coverage for better quality
same as AMDVLK.

GL_NV_alpha_to_coverage_dither_control allows controlling this behavior.
The default is implementation-dependent.
2019-01-30 12:35:22 -05:00
Dylan Baker
b4986d2e0c gallium: wrap u_screen in extern "C" for c++
Some drivers (notably SWR) are written in C++, and as such they need
access to C headers with extern "C". So let's add that.
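
The usual shape of such a wrapper is sketched here with a hypothetical header and function (not the actual u_screen declarations). The guard is a no-op for C compilers and gives C linkage when a C++ file includes the header:

```c
#include <assert.h>

/* --- example_screen.h (hypothetical header) --- */
#ifndef EXAMPLE_SCREEN_H
#define EXAMPLE_SCREEN_H

#ifdef __cplusplus
extern "C" {
#endif

/* Placeholder query; returns param + 1 in this sketch. */
int example_screen_get_param(int param);

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_SCREEN_H */

/* --- example_screen.c (hypothetical implementation) --- */
int example_screen_get_param(int param)
{
    assert(param >= 0);
    return param + 1;
}
```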
2019-01-30 15:12:27 +00:00
Gert Wollny
45903cddc3 mesa/core: Enable EXT_texture_sRGB_R8 also for desktop GL
As of Nov/30/2018 the extension is also valid for OpenGL >= 1.2, so
enable it accordingly and also add the required view class entry.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-30 11:32:40 +00:00
Samuel Pitoiset
9c762c01c8 radv/winsys: fix hash when adding internal buffers
This fixes serious stuttering in Shadow Of The Tomb Raider.

Fixes: 50fd253bd6 ("radv/winsys: Add priority handling during submit.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-30 12:29:10 +01:00
Erik Faye-Lund
3b6f95ad66 mesa: expose NV_conditional_render on GLES
The extension spec has been updated to include GLES 2 support, so let's
enable it there.

v2: fixup ABI-check as well

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-30 09:43:44 +01:00
Ernestas Kulik
90458bef54 v3d: Fix leak in resource setup error path
Reported by Coverity: in the case of unsupported modifier request, the
code does not jump to the “fail” label to destroy the acquired resource.
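
The shape of this kind of fix can be sketched as follows. This is not the
v3d code: the struct, the function and the live_resources counter are all
made up for illustration, and the counter exists only so the leak would be
observable.

```c
#include <stdbool.h>
#include <stdlib.h>

struct resource_sketch {
   int placeholder;
};

static int live_resources; /* demonstration-only leak counter */

static struct resource_sketch *
resource_create_sketch(bool modifier_supported)
{
   struct resource_sketch *rsc = calloc(1, sizeof(*rsc));
   if (!rsc)
      return NULL;
   live_resources++;

   /* A bare "return NULL" here would leak rsc; jump to the common
    * cleanup label instead. */
   if (!modifier_supported)
      goto fail;

   return rsc;

fail:
   free(rsc);
   live_resources--;
   return NULL;
}
```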

CID: 1435704
Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
2019-01-29 16:14:13 -08:00
Ernestas Kulik
f6e49d5ad0 vc4: Fix leak in HW queries error path
Reported by Coverity: in the case where there exist hardware and
non-hardware queries, the code does not jump to err_free_query and leaks
the query.

CID: 1430194
Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Fixes: 9ea90ffb98 ("broadcom/vc4: Add support for HW perfmon")
2019-01-29 16:14:13 -08:00
Eric Anholt
6053c7bb43 v3d: Fix a release build set-but-unused compiler warning. 2019-01-29 16:02:51 -08:00
Eric Anholt
0c05198d6b v3d: Always enable the NEON utile load/store code.
I can't imagine the new HW block being paired with a v6 CPU, so don't
bother with the CPU detection that vc4 had to do.

Improves 1024x1024 TexImage on my 7278 by 47.3229% +/- 0.679632%
2019-01-29 16:00:25 -08:00
Emil Velikov
385843ac3c vc4: Declare the last cpu pointer as being modified in NEON asm.
An earlier commit addressed 7 of the 8 instances available.

v2: Rebase patch back to master (by anholt)

Cc: Carsten Haitzler (Rasterman) <raster@rasterman.com>
Cc: Eric Anholt <eric@anholt.net>
Fixes: 300d3ae8b1 ("vc4: Declare the cpu pointers as being modified in NEON asm.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-29 16:00:25 -08:00
Dylan Baker
75ad254acf docs: Add relnotes stub for 19.1 2019-01-29 15:32:16 -08:00
Dylan Baker
dba0989ac1 bump version for 19.0 branch 2019-01-29 15:30:25 -08:00
Dylan Baker
90a7a9c973 automake: Add include dir for nir src directory
Fixes: 6281f26f06
       ("v3d: Add support for shader_image_load_store.")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-29 23:24:57 +00:00
Dylan Baker
82365595e9 automake: Add float64.glsl to dist tarball
Fixes: b63a1f8e40
       ("glsl: Create file to contain software fp64 functions")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-29 23:24:57 +00:00
Dylan Baker
7d7b30835c automake: Fix path to generated source
Fixes: b63a1f8e40
       ("glsl: Create file to contain software fp64 functions")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-29 23:24:57 +00:00
Matt Turner
9de90caca8 nir: Optimize double-precision lower_round_even()
Use the trick of adding and then subtracting 2**52 (52 is the number of
explicit mantissa bits a double-precision floating-point value has) to
implement round-to-even.
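
For reference, the trick can be sketched on the CPU in plain C. This is not
the NIR lowering itself. It assumes the default round-to-nearest-even FP
mode and a compiler that does not reassociate the arithmetic (no
-ffast-math); the function name is hypothetical.

```c
/* Round a double to the nearest integer, ties to even, by adding and
 * then subtracting 2**52. */
static double
round_even_sketch(double x)
{
   const double two52 = 4503599627370496.0; /* 2**52 */
   double ax = x < 0.0 ? -x : x;
   volatile double t; /* volatile: keep the intermediate at double precision */

   /* Magnitudes >= 2**52 are already integral; NaN also lands here. */
   if (!(ax < two52))
      return x;

   /* In [2**52, 2**53) adjacent doubles are exactly 1.0 apart, so the
    * addition itself rounds to an integer with ties going to even. */
   t = ax + two52;
   t = t - two52;

   /* Restore the sign (the sign of a negative zero is not preserved). */
   return x < 0.0 ? -t : t;
}
```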

Cuts the number of instructions on SKL of the piglit test
fs-roundEven-double.shader_test from 109 to 21.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-29 15:02:23 -08:00
Marek Olšák
3e249b853e ac: use the correct LLVM processor name on Raven2
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-29 17:46:55 -05:00
Eric Anholt
f7769b5121 v3d: Fix the autotools build.
Noticed while looking at the gitlab-CI MR.
2019-01-29 14:00:27 -08:00
Jonathan Marek
31a1348a66 freedreno: fix sysmem rendering being used when clear is used
This batch->cleared value is only used to decide to use sysmem rendering
or not, so it should include any buffers that are affected by a clear.

This is required because the a2xx fast clear doesn't work with sysmem
rendering. The a22x "normal" clear path doesn't work with sysmem either.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-29 20:22:33 +00:00
Jonathan Marek
c93d77431f freedreno: fix depth usage logic
Depth can be used even when there is no restore/resolve of depth. This
happens when the depth buffer is invalidated after rendering to avoid
the resolve operation.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-29 20:22:33 +00:00
Jonathan Marek
bcefa0f1cb freedreno: fix invalidate logic
Set dirty bits on invalidate to trigger invalidate logic in fd_draw_vbo.

Also, resource_written for color needs to be after the invalidate logic.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-29 20:22:32 +00:00
Jonathan Marek
786f9639d6 mesa/st: wire up DiscardFramebuffer
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-29 20:22:32 +00:00
Rob Clark
0c42b5f3cb mesa: wire up InvalidateFramebuffer
And before someone actually starts implementing DiscardFramebuffer()
let's rework the interface to something that is actually usable.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-29 20:22:32 +00:00
Jonathan Marek
e685566612 st/dri: invalidate_resource depth/stencil before flush_resource
This allows freedreno to be aware of the depth invalidate when flushing
batches on flush_resource.

AFAIK, the only other driver which might care about this change is vc4,
where I think it should help by allowing the depth invalidate to work with
GALLIUM_HUD.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-29 20:22:32 +00:00
Mario Kleiner
820dfcea43 egl/wayland-drm: Only announce formats via wl_drm which the driver supports.
Check if a pixel format is supported by the Wayland server's GPU driver
before exposing it to the client via wl_drm, so we avoid reporting formats
to the client which the server GPU can't handle.

Restrict this reporting to the new color depth 30 formats for now, as the
ARGB/XRGB8888 and RGB565 formats are probably supported by every gpu under
the sun.

At the moment this is mostly useful to allow proper PRIME render offload for
depth 30 formats on the typical Intel iGPU + NVidia dGPU "NVidia Optimus"
laptop combo.

Tested on Intel, AMD, NVidia with single-gpu setup and on an Intel + NVidia
Optimus setup.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-01-29 20:03:20 +00:00
Mario Kleiner
a34b0d68bb egl/wayland: Allow client->server format conversion for PRIME offload. (v2)
Support PRIME render offload between a Wayland server gpu and a Wayland
client gpu with different channel ordering for their color formats,
e.g., between Intel drivers which currently only support ARGB2101010
and XRGB2101010 import/display and nouveau which only supports ABGR2101010
rendering and display on nv-50 and later.

In the wl_visuals table, we also store for each format an alternate
sibling format which stores colors at the same precision, but with
different channel ordering, e.g., ARGB2101010 <-> ABGR2101010.

If a given client-gpu renderable format is not supported by the server
for import, but the alternate format is supported by the server, expose
the client-gpu renderable format as a valid EGLConfig to the client. At
eglSwapBuffers time, during the blitImage() detiling blit from the client
backbuffer to the linear buffer, the client format is converted to the
server supported format. As we have to do a copy for PRIME anyway,
this channel swizzling conversion comes essentially for free.
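
To make "channel swizzling" concrete, here is a CPU-side sketch of the
ARGB2101010 <-> ABGR2101010 conversion on one packed pixel. In the driver
the conversion happens as part of the GPU blit, not in a loop like this; the
helper name is hypothetical, and the layout assumed is alpha in bits 31:30
followed by three 10-bit channels.

```c
#include <stdint.h>

/* Swap the two outer 10-bit colour channels (R and B) of a packed
 * 2:10:10:10 pixel; alpha (bits 31:30) and green (bits 19:10) stay put. */
static uint32_t
swap_rb_2101010_sketch(uint32_t px)
{
   uint32_t keep = px & 0xc00ffc00u;    /* alpha + green */
   uint32_t hi = (px >> 20) & 0x3ffu;   /* channel in bits 29:20 */
   uint32_t lo = px & 0x3ffu;           /* channel in bits 9:0 */

   return keep | (lo << 20) | hi;
}
```

Applying the helper twice returns the original pixel, which is why one
function covers both directions.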

Note that even if a server gpu in principle does support sampling
from the client's native format, this conversion will be a performance
advantage if it allows converting to the server's preferred format
for direct scanout, as the Wayland compositor may then be able to
directly page-flip a fullscreen client wl_buffer onto the primary
plane, or onto a hardware overlay plane, avoiding an extra data copy
for desktop composition.

Tested so far under Weston with: nouveau single-gpu, Intel single-gpu,
AMD single-gpu, "Optimus" Intel server iGPU for display + NVidia
client dGPU for rendering.

v2: Implement minor review comments by Eric Engestrom: Add some
    comment and assert, and some style fixes for clarity.
    No functional change.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-01-29 20:03:20 +00:00
Jason Ekstrand
a920979d4f intel/fs: Use split sends for surface writes on gen9+
Surface reads don't need them because they just have the one address
payload.  With surface writes, on the other hand, we can put the address
and the data in the different halves and avoid building the payload all
together.

The decrease in register pressure and added freedom in register
allocation resulting from this change reduces spilling enough to improve
the performance of one customer benchmark by about 2x.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
014edff0d2 intel/fs: Add interference between SENDS sources
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
eab1c55590 intel/fs: Support SENDS in SHADER_OPCODE_SEND
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
cca199fd85 intel/disasm: Properly disassemble split sends
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
8babaa84e8 intel/eu: Add support for the SENDS[C] messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d6a6e10390 intel/inst: Indent some code
We're about to add some more if cases so let's have the giant re-indent
in its own patch to make review easier.

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d96969120d intel/inst: Fix the ia16_addr_imm helpers
These have clearly never seen any use.... On gen8, the bottom 4 bits are
missing so we need to shift them off before we call set_bits and shift
again when we get the bits.  Found by inspection.
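
A sketch of the set/get pairing described above, for a field that cannot
store the low 4 bits of the immediate. The field position and width are
invented for the example and do not match the real instruction encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 6-bit field at bits 13:8 holding bits 9:4 of a
 * 16-byte-aligned address immediate. */
#define FIELD_LOW   8
#define FIELD_WIDTH 6
#define FIELD_MASK  (((UINT32_C(1) << FIELD_WIDTH) - 1) << FIELD_LOW)

static void
set_addr_imm_sketch(uint32_t *inst, uint32_t addr_imm)
{
   assert((addr_imm & 0xf) == 0);                        /* low bits not encodable */
   assert((addr_imm >> 4) < (UINT32_C(1) << FIELD_WIDTH));

   /* Shift the missing low bits off before storing. */
   *inst = (*inst & ~FIELD_MASK) | ((addr_imm >> 4) << FIELD_LOW);
}

static uint32_t
get_addr_imm_sketch(uint32_t inst)
{
   /* Shift back up so callers get the value they originally passed in. */
   return ((inst & FIELD_MASK) >> FIELD_LOW) << 4;
}
```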

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
e46fb33143 intel/disasm: Rework SEND decoding to use descriptors
Instead of fetching the information out of the instruction directly,
fetch the descriptor and then pluck the information out of the
descriptor.  The current scheme works ok for SEND but with SENDS, it all
falls to pieces because the descriptor is completely shuffled around.

This commit doesn't actually convert everything.  One notable exception
is URB messages which don't even use descriptors in emit_urb_WRITE yet.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
13a6fabc62 intel/eu: Add more message descriptor helpers
We want to be able to extract data from descriptors as well as unify a
bit of the descriptor construction.

One of the unifications we do is to unify the read/write and dataport
descriptors.  On gen4-5, read/write are substantially different and the
read descriptors change between gen4 and gen4.x.  On gen6, they unified
layouts between read, write, and dataport.  Then, on gen8, they added
one bit to the message type field but left it reserved MBZ for
read/write messages.  This commit chooses to treat that as if they
expanded the field everywhere and just didn't have enough enum values
for read/write to bother with the extra bit.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
c3aa436bfe intel/eu/validate: SEND restrictions also apply to SENDC
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
fee6bd8d8e intel/eu: Use GET_BITS in brw_inst_set_send_ex_desc
It's a bit more readable

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
b284d222db intel/fs: Use SHADER_OPCODE_SEND for varying UBO pulls on gen7+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
8514eba693 intel/fs: Use SHADER_OPCODE_SEND for texturing on gen7+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
f547cebbe0 intel/fs: Use a logical opcode for IMAGE_SIZE
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d2d3e04501 intel/fs: Use SHADER_OPCODE_SEND for surface messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
7f1cf046cd intel/fs: Add a generic SEND opcode
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
ba3c5300f9 intel/eu: Rework surface descriptor helpers
This commit pulls the surface descriptor helpers out into brw_eu.h and
makes them no longer depend on the codegen infrastructure.  This should
allow us to use them directly from the IR code instead of the generator.
This change is unfortunately less mechanical than perhaps one would like
but it should be fairly straightforward.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
5b17379631 intel/eu: Add has_simd4x2 bools to surface_write functions
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
2ce93b88c0 intel/fs: Take an explicit exec size in brw_surface_payload_size()
Instead of magically falling back to SIMD8 for atomics and typed
messages on Ivy Bridge, explicitly figure out the exec size and pass
that into brw_surface_payload_size.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
cf42b0f9e2 intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf()
Like all the other sends, it's just mlen * REG_SIZE.

Fixes: 3cbc02e469 "intel: Use TXS for image_size when we have..."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
009c0bd840 intel/defines: Explicitly cast to uint32_t in SET_FIELD and SET_BITS
If you pass a bool in as the value to set, the C standard says that it
gets converted to an int prior to shifting.  If you try to set a bool to
bit 31, this lands you in undefined behavior.  It's better just to add
the explicit cast and let the compiler delete it for us.
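
A small sketch of the hazard and the fix. The macro below is a hypothetical
stand-in, not the actual SET_FIELD/SET_BITS definition from the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Place `value` into bits high:low of a 32-bit word. The uint32_t cast
 * happens before the shift, so a bool or int argument is never shifted
 * as a signed int. */
#define SET_BITS_SKETCH(value, high, low)                                   \
   (((uint32_t)(value) << (low)) &                                          \
    (uint32_t)((UINT64_C(1) << ((high) + 1)) - (UINT64_C(1) << (low))))

static uint32_t
pack_flag_in_top_bit(bool flag)
{
   /* Without the cast, `flag` would be promoted to int and a true value
    * shifted by 31 would overflow it, which is undefined behavior. */
   return SET_BITS_SKETCH(flag, 31, 31);
}
```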

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
077b9557a4 intel/fs: Get rid of fs_inst::equals
There are piles of fields that it doesn't check so using it is a lie.
The only reason why it's not causing problems is because it has exactly
one user which only uses it for MOV instructions (which aren't very
interesting) and only on Sandy Bridge and earlier hardware.  Just get
rid of it and inline it in the one place that it's actually used.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Rob Clark
446a14bc0a freedreno: minor cleanups
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-29 12:30:50 -05:00
Rob Clark
c3baa077bf freedreno: stop frob'ing pipe_resource::nr_samples
Previously we tried to normalize nr_samples to MAX2(1, nr_samples) to
avoid having to deal with 0 vs 1 everywhere.  But this causes problems
in mesa/st, for example st_finalize_texture() will think there is a
nr_samples mismatch and recreate the texture.  Somehow this manifests
as corrupt x11 font rendering on generations that do not support MSAA
(but apparently works fine on a5xx and a6xx which do support MSAA.)

Fixes: cf0c7258ee freedreno/a5xx: MSAA
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-29 12:30:50 -05:00
Rob Clark
1a6ddfe5ee freedreno/a6xx: fix blitter nr_samples check
nr_samples for non-MSAA case could be either zero or one.
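
One way to write checks that treat both values as "not multisampled" is
sketched below. The helper names are hypothetical and this is not the
blitter code itself.

```c
#include <stdbool.h>

/* Both 0 and 1 mean "single-sampled". */
static bool
is_msaa_sketch(unsigned nr_samples)
{
   return nr_samples > 1;
}

/* Compare sample counts without tripping over 0 vs 1, and without
 * modifying the stored nr_samples value. */
static bool
same_sample_count_sketch(unsigned a, unsigned b)
{
   unsigned na = a > 1 ? a : 1;
   unsigned nb = b > 1 ? b : 1;

   return na == nb;
}
```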

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-29 12:22:08 -05:00
Rob Clark
9106a0fe33 freedreno/a5xx: fix blitter nr_samples check
nr_samples for non-MSAA case could be either zero or one.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-29 12:21:19 -05:00
Bas Nieuwenhuizen
69edc972fc radv: Enable VK_EXT_memory_priority.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-29 15:56:56 +01:00
Bas Nieuwenhuizen
50fd253bd6 radv/winsys: Add priority handling during submit.
Switched to the raw bo list api to avoid having to use 2 arrays for
everything.

This was introduced in libdrm 2.4.97 which we already depend upon.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-29 15:56:52 +01:00
Bas Nieuwenhuizen
ead54d4a42 radv/winsys: Set winsys bo priority on creation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-29 15:56:41 +01:00
Samuel Pitoiset
3a8d6c0880 radv: re-enable fast depth clears for 16-bit surfaces on VI
This was disabled some months ago because it introduced
rendering issues with Shadow Warrior 2 (DXVK). This game is
no longer affected, I wonder if 824cfc1ee5 ("radv: rework the
TC-compat HTILE hardware bug with COND_EXEC") fixed the problem.
I checked The Forest on my Polaris, and it renders fine too.

According to Phillip, this gives +5.5% with Rise Of The Tomb
Raider and DXVK. This is because DXVK uses 16-bit depth surfaces
while the native port from Feral uses 32-bit depth surfaces.

Unfortunately, Shadow Of The Tomb Raider isn't affected because
it clears each layer of a D16 array texture individually. So it
doesn't hit the fast clear path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-29 15:20:55 +01:00
Eric Anholt
932ed9c00b vc4: Enable NEON asm on meson cross-builds.
The core Mesa with_asm_arch and USE_ARM_ASM flags are disabled for meson
cross-builds because of the need to run host binaries on the build system.
vc4 doesn't need to do that, so skip with_asm_arch to enable NEON on my
cross-builds.

Fixes: ebcb4c2156 ("meson: Enable VC4's NEON assembly support.")
2019-01-28 16:45:48 -08:00
Carsten Haitzler (Rasterman)
300d3ae8b1 vc4: Declare the cpu pointers as being modified in NEON asm.
Otherwise, the compiler is free to reuse the register containing the input
for another call and assume that the value hasn't been modified.  Fixes
crashes on texture upload/download with current gcc.

We now have to have a temporary for the cpu2 value, since outputs must be
lvalues.

(commit message by anholt)

Fixes: 4d30024238 ("vc4: Use NEON to speed up utile loads on Pi2.")
2019-01-28 16:45:45 -08:00
Carsten Haitzler (Rasterman)
522f688471 vc4: Use named parameters for the NEON inline asm.
This makes the asm code more intelligible and clarifies the functional
change in the next commit.

(commit message and commit squashing by anholt)
2019-01-28 16:40:46 -08:00
Jonathan Marek
f6292c32cc kmsro: Add freedreno renderonly support
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:25:27 -05:00
Jonathan Marek
7d458c0c69 freedreno: a2xx: add perfcntrs
Based on a5xx perfcntrs implementation.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Jonathan Marek
cccec0b457 freedreno: a2xx: minor solid_vertexbuf fixups
The big thing here is the 0x60 offset for the mem2gmem copy which I missed
in my last patch.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Jonathan Marek
912a9c8d8c freedreno: a2xx: clear fixes and fast clear path
This fixes the depth/stencil clear on a20x, and adds a fast clear path.

The fast clear path is only used for a20x, needs performance tests on a22x.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Jonathan Marek
cb2322c7c0 freedreno: a2xx: a20x hw binning
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Jonathan Marek
501c6e70d4 freedreno: update a2xx registers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Timothy Arceri
fb78a6cb72 glsl: use remap location when serialising uniform program resource data
This allows us to avoid expensive string compares since we already have
a map to the pointers.

These compares were taking ~30 seconds for a single shader compile
in Godot due to it using 64,000+ uniforms.

Fixes: c4cff5f402 ("glsl: add basic support for resource list to shader cache")

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109229
2019-01-29 09:39:54 +11:00
Vinson Lee
be5b271ea7 meson: Fix typo.
meson.build:166:21: ERROR:  Unknown method "verson_compare" for a string.

Fixes: c1efa240c9 ("meson: Add warnings and errors when using ICC")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-28 10:47:32 -08:00
Jonathan Marek
7c930d99ad freedreno: a2xx: enable early-Z testing
Enable earlyZ when alpha test is disabled.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-28 13:04:41 -05:00
Jonathan Marek
32b1d2d716 freedreno: a2xx: ir2 cleanup
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-28 13:04:41 -05:00
Rob Herring
41a0acd6a1 Switch imx to kmsro and remove the imx winsys
The kmsro winsys is equivalent to the imx winsys, so we can switch
to it and remove the imx one.

Signed-off-by: Rob Herring <robh@kernel.org>
2019-01-28 11:50:08 -06:00
Rob Herring
827e0d6654 kmsro: Add etnaviv renderonly support
Enable using etnaviv for KMS renderonly. This still needs KMS driver
name mapping to kmsro to be used automatically.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Herring <robh@kernel.org>
2019-01-28 11:45:43 -06:00
Eric Anholt
272b6cf58f kmsro: Extend to include hx8357d.
This allows vc4 to initialize on the Adafruit PiTFT 3.5" touchscreen with
the hx8357d tinydrm driver

v2: Whitespace fix noted by Eric Engestrom, update commit message for the
    driver being merged.
v3: Rebase on Rob Herring's pipe-loader changes.

Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2019-01-28 09:35:45 -08:00
Rob Herring
511e7b6f61 pipe-loader: Fallback to kmsro driver when no matching driver name found
If we can't find a driver matching by name, then use the kmsro driver.
This removes the need for a driver descriptor for every possible
KMS driver.
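
The lookup-with-fallback idea can be sketched like this. The table contents
and the function name are illustrative assumptions; the real pipe-loader
matches against its own driver descriptors.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of names that have a dedicated descriptor. */
static const char *const known_drivers_sketch[] = {
   "vc4", "etnaviv", "freedreno",
};

static const char *
pick_driver_sketch(const char *kms_name)
{
   size_t n = sizeof(known_drivers_sketch) / sizeof(known_drivers_sketch[0]);

   for (size_t i = 0; i < n; i++) {
      if (strcmp(known_drivers_sketch[i], kms_name) == 0)
         return known_drivers_sketch[i];
   }

   /* No descriptor matched by name: fall back to the generic driver. */
   return "kmsro";
}
```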

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-28 09:35:45 -08:00
Eric Anholt
ed65aeec78 pl111: Rename the pl111 driver to "kmsro".
The vc4 driver can do prime sharing to many different KMS-only devices,
such as the various tinydrm drivers for SPI-attached displays.  Rename the
driver away from "pl111" to represent what it will actually support:
various sorts of KMS displays with the renderonly layer used to attach a
GPU.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-28 09:35:45 -08:00
Samuel Pitoiset
afeef3cacf radv: set noalias/dereferenceable LLVM attributes based on param types
Instead of using this useless array_params_mask variable.
This should set these two attributes to streamout buffers too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:38 +01:00
Samuel Pitoiset
320b058d32 radv: simplify allocating user SGPRS for descriptor sets
Unnecessary to check the current stages if desc_set_used_mask
is used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:36 +01:00
Samuel Pitoiset
d1994ed229 radv: remove radv_userdata_info::indirect field
Always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:33 +01:00
Gert Wollny
212c0c630a mesa/main: Expose EXT_sRGB_write_control
Use EXT_framebuffer_sRGB to expose EXT_sRGB_write_control on GLES. Remove
the checks for desktop GL in the enable calls, since EXT_framebuffer_sRGB
now also indicates support for switching the linear-sRGB color
space conversion on GLES.

Thanks to Ilia Mirkin for all the helpful discussions that helped to rework
this series.

v2: Fix alphabetical listing of extensions (Tapani Pälli)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-01-28 12:18:40 +01:00
Gert Wollny
1013dfece1 mesa/main/version: Lower the requirements for GLES 3.0
GLES 3.0 does not actually require support for EXT_framebuffer_sRGB, it
only needs support for sRGB attachments to framebuffers and framebuffer
objects as defined in ARB_framebuffer_object.

v2: Clarify that ARB_framebuffer_object is needed.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
76c3f6fb3f mesa/main: Use flag for EXT_sRGB instead of EXT_framebuffer_sRGB where possible
All drivers that support EXT_framebuffer_sRGB also support EXT_sRGB, but
in order to keep this commit minimal, and not to break any drivers, both
flags are checked.

v2: - Use only EXT_sRGB (Ilia Mirkin)
    - Move adding the flag EXT_sRGB to gl_extensions to a separate patch

v3: use _mesa_has_EXT_framebuffer_sRGB instead of extension flag
    The _mesa_has function also checks for the correct versions and
    should be preferred over using the flags directly (Erik)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
8f9dfb7d88 mesa/st: rework support for sRGB framebuffer attachments
For GLES, sRGB framebuffer attachment support is provided in two steps:
sRGB attachments like described in EXT_sRGB (and GLES 3.0) that enable
linear to sRGB color space transformation automatically, and the ability
to switch formats of the render target surface between sRGB and linear
that introduces full support for EXT_framebuffer_sRGB.
Set the according flags to reflect these two levels of sRGB support.

As a difference between desktop GL and GLES, on desktop GL for an sRGB
framebuffer attachment the linear-sRGB conversion is turned off by default,
and for GLES it is turned on. This needs to be taken into account when
initially creating a surface, i.e. on desktop GL creation of an sRGB surface
is preferred, but on GLES sRGB surfaces are only created when explicitly
requested.

v2: - Use the new CAPS name

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
385081cd17 i965: Set flag for EXT_sRGB
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
7577c82fed mesa:main: Add flag for EXT_sRGB to gl_extensions
EXT_sRGB is an (incomplete) GLES extension that provides support for sRGB
framebuffer attachments, hence it can be used to check for this support
as an alternative to EXT_framebuffer_sRGB that provides the same
functionality but also sRGB write control support.

However, since EXT_sRGB is incomplete and superseded by GLES 3.0 it will
not be exposed as an extension.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
2845939d6a virgl: Set sRGB write control CAP based on host capabilities
v2: - Use the renamed CAPS
    - add assertions to make sure that mesa doesn't try to switch
      destination surface formats when it is not supported. (Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-01-28 12:18:40 +01:00
Gert Wollny
8021f1875e Gallium: Add new CAPS to indicate whether a driver can switch SRGB write
Add a new cap that indicates whether the driver supports
enabling/disabling the conversion from linear space to sRGB
for a framebuffer attachment. In driver terms, this CAP indicates
whether the driver can switch between a linear and an sRGB surface
format for draw destinations without changing the surface itself.

v2: rename CAP to DEST_SURFACE_SRGB_CONTROL to reflect its
    purpose better (pointed out by Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Neil Roberts
75b3719c4f spirv: Don't use special semantics when counting vertex attribute size
Under Vulkan, the double vertex attributes take up the same size
regardless of whether they are vertex inputs or any other stage
interface.

Under OpenGL (ARB_gl_spirv), from GLSL 4.60 spec, section 4.3.9
Interface Blocks:

   "It is a compile-time error to have an input block in a vertex
    shader or an output block in a fragment shader. These uses are
    reserved for future use."

So we also don't need to check if it is a vertex input or not, and
use false in any case.

v2: (changes made by Alejandro Piñeiro)
    * Update required after "spirv: Handle location decorations on
      block interface members" own updates (original patch was sent
      several months ago)
    * After Neil suggesting it, confirm that this change can be also
      done for OpenGL (ARB_gl_spirv). Expand commit message.

v3: update after changing name of main method on a previous patch

Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 11:42:46 +01:00
Neil Roberts
5c797f7354 glsl_types: Rename parameter of glsl_count_attribute_slots
glsl_count_attribute_slots takes a parameter to specify whether the
type is being used as a vertex input because on GL double attributes
only take up one slot. Vulkan doesn’t make this distinction so this
patch renames the argument to is_gl_vertex_input in order to make it
more clear that it should always be false on Vulkan.
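
A rough sketch of what the flag controls, for a single vector type. This is
not glsl_count_attribute_slots(): the signature is hypothetical, and the
slot rule shown (64-bit vectors with more than two components take two
slots unless they are GL vertex inputs) is a simplified assumption for
illustration.

```c
#include <stdbool.h>

static unsigned
count_vector_slots_sketch(unsigned components, bool is_64bit,
                          bool is_gl_vertex_input)
{
   /* dvec3/dvec4 normally span two slots; as a GL vertex input they are
    * counted as one. On Vulkan the flag is always false. */
   if (is_64bit && components > 2 && !is_gl_vertex_input)
      return 2;

   return 1;
}
```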

v2: minor variable renaming (s/member/member_type) (Tapani)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 11:42:46 +01:00
Neil Roberts
dfc3a7cb3c spirv/nir: handle location decorations on block interface members
Previously the code was taking any location decoration on the block
and using that to calculate the member locations for all of the
members. I think this was assuming that there would only be one
location decoration for the entire block. According to the Vulkan spec
it is possible to add location decorations to individual members:

   “If the structure type is a Block but without a Location, then each
    of its members must have a Location decoration. If it is a Block
    with a Location decoration, then its members are assigned
    consecutive locations in declaration order, starting from the
    first member which is initially the Block. Any member with its own
    Location decoration is assigned that location. Each remaining
    member is assigned the location after the immediately preceding
    member in declaration order.”

This patch makes it instead keep track of which members have been
assigned an explicit location. It also has a space to store the
location for the struct as a whole. Once all the decorations have been
processed it iterates over each member to fill in the missing
locations using the rules described above.

So, this commit is needed to make a case like this work, on both
Vulkan and OpenGL using SPIR-V (ARB_gl_spirv):

     out block {
            layout(location = 2) vec4 c;
            layout(location = 3) vec4 d;
            layout(location = 0) vec4 a;
            layout(location = 1) vec4 b;
     } name;
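
The fill-in rule quoted from the spec can be sketched as below. This is a
minimal illustration, not the code this commit adds: struct member and
assign_member_locations() are hypothetical names, and the number of slots
per member is assumed to be known in advance.

```c
#include <stdbool.h>

/* Hypothetical, simplified member record; not the real SPIR-V/NIR types. */
struct member {
    bool has_explicit_location; /* member carried its own Location decoration */
    int location;               /* explicit location on input, final location on output */
    int slots;                  /* number of locations the member occupies */
};

/* Fill in missing member locations. block_location is -1 when the block
 * itself has no Location decoration. Returns false when a member has neither
 * an explicit location nor a preceding location to continue from. */
bool
assign_member_locations(struct member *members, int count, int block_location)
{
    int next = block_location;

    for (int i = 0; i < count; i++) {
        if (!members[i].has_explicit_location) {
            if (next < 0)
                return false;
            members[i].location = next;
        }
        /* The following member continues right after this one. */
        next = members[i].location + members[i].slots;
    }
    return true;
}
```

With a block location of 2 and members (implicit, explicit 5 taking two
slots, implicit), the members end up at locations 2, 5 and 7.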

v2: (changes made by Alejandro Piñeiro)
   * Update after introducing struct member splitting (See commit b0c643d)
   * Update after only exposing interface_type for blocks, not to any struct
   * Update after last changes done for xfb support

v3: use "assign" instead of "add" on the new method added (Tapani)

Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 11:42:46 +01:00
Christian Gmeiner
34458c1cf6 etnaviv: add linear sampling support
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-01-28 07:36:12 +01:00
Christian Gmeiner
42ca4dda2d etnaviv: update headers from rnndb
Update to etna_viv commit 4d2f857.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-01-28 07:36:09 +01:00
Christian Gmeiner
5b4a155d2b etnaviv: extend etna_resource with an addressing mode
Defines how the sampler (and pixel pipes) need to access the data
represented by a resource. The default mode is
ETNA_ADDRESSING_MODE_TILED.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-01-28 07:36:05 +01:00
Ilia Mirkin
d1d2bb8c07 nvc0: don't put text segment into bufctx
The text segment is shared among multiple contexts, while each one has
its own bufctx. So when reallocating the text segment, some contexts may
end up with stale values in their bufctx's. Instead limit the exposure
to the bufctx to within a single draw.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-27 21:47:09 -05:00
Timothy Arceri
0907ae35ad radv/ac: fix some fp16 handling
Fixes: b722b29f10 ("radv: add support for 16bit input/output")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 10:41:48 +11:00
Eric Anholt
c496b60ed8 v3d: Create separate sampler states for the various blend formats.
The sampler border color is encoded in the TMU's blending format (half
floats, 32-bit floats, or integers) and must be clamped to the format's
unorm/snorm/int range by the driver.  Additionally, the TMU doesn't
know about how we're abusing the swizzle to support BGRA, A, and LA, so we
have to pre-swizzle the border color for those.

We don't really want to spend half a kb on sampler states in most cases,
so skip generating the variants when the border color is unused or is
0,0,0,0.
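
As a rough sketch of the clamping described above (normalized formats only;
integer clamping and the pre-swizzle are left out), assuming a hypothetical
sketch_kind enum and clamp_border_channel() helper that are not part of the
driver:

```c
/* Hypothetical classification of the sampled format. */
enum sketch_kind { SKETCH_UNORM, SKETCH_SNORM, SKETCH_OTHER };

/* Clamp one border color channel to the range its format can represent. */
float
clamp_border_channel(float value, enum sketch_kind kind)
{
    float lo, hi;

    switch (kind) {
    case SKETCH_UNORM: lo = 0.0f;  hi = 1.0f; break;
    case SKETCH_SNORM: lo = -1.0f; hi = 1.0f; break;
    default:           return value; /* float/int handling not sketched here */
    }

    if (value < lo)
        return lo;
    if (value > hi)
        return hi;
    return value;
}
```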
2019-01-27 08:30:03 -08:00
Eric Anholt
5fe4250a2c v3d: Move the sampler state to the long-lived state uploader.
Samplers are small (8-24 bytes), so allocating 4k for them is a huge
waste.
2019-01-27 08:30:03 -08:00
Eric Anholt
09472006ff v3d: Use the symbolic names for wrap modes from the XML. 2019-01-27 08:30:03 -08:00
Eric Anholt
c51d125d18 v3d: Fix stencil sampling from a separate-stencil buffer.
When the sampler view is in sample-stencil mode, we need to return uint
stencil values.  To do that, fill in the format table to return R8I, and
have the sampler view point at the separate stencil buffer.

Fixes dEQP-GLES31.functional.stencil_texturing.format.depth32f_stencil8_2d
2019-01-27 08:30:03 -08:00
Eric Anholt
8a0b0a8f37 v3d: Fix stencil sampling from packed depth/stencil.
We need to pick the 8-bit unorm value out, not the depth component.
2019-01-27 08:30:03 -08:00
Eric Anholt
fcdbd441a2 v3d: Fix release-build warning about utile_h. 2019-01-27 08:30:03 -08:00
Eric Anholt
edb1fcd963 v3d: Flush blit jobs immediately after generating them.
Fixes OOMs in the CTS's packed_pixels.varied_rectangle.* tests -- the
series of texture uploads at the start before texturing occurred would end
up all sitting around as cached jobs for reuse.  By flushing immediately,
peak active BO usage goes from 150M to 40M.

We could maybe put some limits on how many jobs we keep around, but blits
seem particularly unlikely to get reused for other drawing.
2019-01-27 08:30:03 -08:00
Eric Anholt
ac333ffa59 v3d: Fix BO stats accounting for imported buffers. 2019-01-27 08:30:03 -08:00
Eric Anholt
060575bea8 v3d: Drop maximum number of texture units down to 16.
This is the GLES 3.2 minmax, and also what the closed source driver does.
Avoids hitting OOMs in the CTS's
dEQP-GLES3.functional.texture.units.all_units.only_cube.1.
2019-01-27 08:30:03 -08:00
Eric Anholt
3e743d8cd8 v3d: Avoid duplicating limits defines between gallium and v3d core.
We don't want to pull the compiler into every include in the gallium
driver, so just make a new little header to store the limits.
2019-01-27 08:30:03 -08:00
Eric Anholt
fe6a21c867 v3d: Fix overly-large vattr_sizes structs.
We want one vector size per vector, not per component.
2019-01-27 08:30:03 -08:00
Eric Anholt
533b3f0541 v3d: Rename gallium-local limits defines from VC5 to V3D.
The compiler has its limits under V3D_* (like most V3D stuff), so sync up
with that.
2019-01-27 08:30:03 -08:00
Bas Nieuwenhuizen
b4870a15ae radv: Remove unused variable.
Trivial.
2019-01-27 13:51:35 +01:00
Niklas Haas
804cc44d09 radv: add device->instance extension dependencies
From the vulkan spec 33.3 "Extension Dependencies":

"Any device extension that has an instance extension dependency that is
not enabled by vkCreateInstance is considered to be unsupported, hence
it must not be returned by vkEnumerateDeviceExtensionProperties for any
VkPhysicalDevice child of the instance."

Therefore we need to check whether the instance-level extensions are
actually enabled when deciding to support a device-level extension or
not.

Furthermore, we need to do this for all instance-level extensions of any
(transitive) device-level extension dependency, due to the following
paragraph:

"If an extension is supported (as queried by
vkEnumerateInstanceExtensionProperties or
vkEnumerateDeviceExtensionProperties), then required extensions of that
extension must also be supported for the same instance or physical
device."

Finally, because some of these vulkan extensions may be implicitly
promoted to future vulkan core API versions, we can also satisfy the
dependency if the vulkan API version is high enough.
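
The resulting rule can be sketched as follows. This is an illustration under
assumptions, not the radv implementation: device_ext_supported() and its
parameters are hypothetical, and SKETCH_MAKE_VERSION only mirrors the bit
layout of VK_MAKE_VERSION so the sketch builds without the Vulkan headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION, redefined locally for the sketch. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
    (((uint32_t)(major) << 22) | ((uint32_t)(minor) << 12) | (uint32_t)(patch))

/* A device extension is exposed only if the instance extension it depends on
 * was enabled at instance creation, or if the instance API version already
 * contains that functionality in core. Pass UINT32_MAX as promoted_in_version
 * for a dependency that was never promoted. */
bool
device_ext_supported(bool hw_supports_it,
                     bool instance_ext_enabled,
                     uint32_t instance_api_version,
                     uint32_t promoted_in_version)
{
    bool dependency_met = instance_ext_enabled ||
                          instance_api_version >= promoted_in_version;

    return hw_supports_it && dependency_met;
}
```

For transitive dependencies, the same check would be applied to every
instance-level extension reachable from the device extension.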

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-27 13:50:35 +01:00
Niklas Haas
d12dc39396 radv: correctly use vulkan 1.0 by default
From the vulkan spec 3.2 "Instances":

"Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing an
apiVersion of 0 is equivalent to providing an apiVersion of
VK_MAKE_VERSION(1,0,0)."
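
In code form the quoted rule amounts to something like the sketch below.
struct app_info and effective_api_version() are hypothetical stand-ins for
VkApplicationInfo and the driver's logic, and SKETCH_MAKE_VERSION mirrors the
bit layout of VK_MAKE_VERSION so that no Vulkan header is needed.

```c
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION, redefined locally for the sketch. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
    (((uint32_t)(major) << 22) | ((uint32_t)(minor) << 12) | (uint32_t)(patch))

/* Hypothetical stand-in for the relevant field of VkApplicationInfo. */
struct app_info {
    uint32_t api_version;
};

/* A missing application info or an apiVersion of 0 both mean Vulkan 1.0.0. */
uint32_t
effective_api_version(const struct app_info *info)
{
    if (info == NULL || info->api_version == 0)
        return SKETCH_MAKE_VERSION(1, 0, 0);

    return info->api_version;
}
```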

Fixes: ffa15861ef "radv: UseEnumerateInstanceVersion for the default version."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-27 12:49:28 +01:00
Niklas Haas
d9bd3b1cb8 glsl: fix block member alignment validation for vec3
Section 7.6.2.2 (Standard Uniform Block Layout) of the GL spec says:

    The base offset of the first member of a structure is taken from the
    aligned offset of the structure itself. The base offset of all other
    structure members is derived by taking the offset of the last basic
    machine unit consumed by the previous member and adding one.

The current code does not reflect this last sentence. Instead, it
effectively aligns the next offset up to the alignment of the previous
member. This causes an issue in exactly one case:

layout(std140) uniform block {
    layout(offset=0) vec3 var1;
    layout(offset=12) float var2;
};

As per section 7.6.2.1 (Uniform Buffer Object Storage) and elsewhere, a
vec3 consumes 3 floats, i.e. 12 basic machine units. Therefore, `var1`
in the example above consumes units 0-11, with 12 being the first
available offset afterwards. However, before this commit, mesa
incorrectly assumes `var2` must start at offset=16 when using explicit
offsets, which results in a compile-time error. Without explicit
offsets, the shaders actually work fine, indicating that mesa is already
correctly aligning these fields internally. (Just not in the code that
handles explicit buffer offset parsing)
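
The intended rule can be sketched like this. align_up() and
min_explicit_offset() are hypothetical helpers for illustration, not the
functions in the GLSL compiler; sizes and alignments are in basic machine
units (bytes).

```c
#include <stdint.h>

/* Round offset up to a multiple of align; align must be a power of two. */
uint32_t
align_up(uint32_t offset, uint32_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Lowest legal explicit offset for a member: the first unit after the
 * previous member, rounded up to this member's own base alignment (not to
 * the previous member's alignment). */
uint32_t
min_explicit_offset(uint32_t prev_offset, uint32_t prev_size,
                    uint32_t member_align)
{
    return align_up(prev_offset + prev_size, member_align);
}
```

For the block above, var1 sits at offset 0 and consumes 12 units, and a float
has a base alignment of 4, so min_explicit_offset(0, 12, 4) gives 12. Rounding
12 up to the vec3's alignment of 16 instead is what produced the bogus error.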

This patch should fix piglit tests:
ssbo-explicit-offset-vec3.vert
ubo-explicit-offset-vec3.vert

Signed-off-by: Niklas Haas <git@haasn.xyz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-27 03:00:03 -05:00
Jason Ekstrand
86e5f76d3d spirv: Add support for SPV_EXT_physical_storage_buffer
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
fb282a68bc spirv: Implement OpConvertPtrToU and OpConvertUToPtr
This only implements the actual opcodes and does not implement support
for using them with specialization constants.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
837ed2ba51 spirv: Handle OpTypeForwardPointer
We handle forward declarations by creating the pointer type with its
storage type based on storage class and just waiting to fill out the
actual deref type until we get the OpTypePointer.  Because any
composites using the forward declared type only care about the storage
type (i.e. uint64_t, uvec2, etc.) when creating their glsl_type, this
works fine and we can defer the actual deref_type as far as we need.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
4602e705e4 spirv: Drop a bogus assert
This was valid back when the only valid types of pointers were uint32
and uvec2.  Now that we're allowing more variety, it could be just about
anything so we'll just drop the assert.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
9e34781aef nir: Allow SSBOs and global to alias
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
9839ce8bf9 nir/validate: Allow array derefs of vectors for nir_var_mem_global
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:39:18 -06:00
Jason Ekstrand
5f5503d498 nir/lower_io: Add support for nir_var_mem_global
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:39:18 -06:00
Jason Ekstrand
314d2c90c3 nir/lower_io: Add a 32 and 64-bit global address formats
These are simple scalar addresses.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 13:39:18 -06:00
Jason Ekstrand
e461926ef2 nir: Add load/store/atomic global intrinsics
These correspond roughly to reading/writing OpenCL global pointers.  The
idea is that they just take a bare address and load/store from it.  Of
course, exactly what this address means is driver-dependent.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:39:18 -06:00
Axel Davy
6380fedb60 st/nine: Enable debug info if NDEBUG is not set
We want to have debug info as well if using
meson's debugoptimized when ndebug is off.

v2: use u_debug functions that do something
even if DEBUG is not set.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2019-01-26 19:53:19 +01:00
Axel Davy
d7433c22e6 st/nine: Immediately upload user provided textures
Fixes regression caused by
42d672fa6a
st/nine: Bind src not dst in nine_context_box_upload

Before that patch, for user provided textures,
when the texture was destroyed, the safety
check for pending uploads, which according to
the code "Following condition cannot happen currently",
was flushing the queue and thus triggering the upload.

After the patch, the texture destruction was delayed until after
the upload. However, the user frees the texture buffer,
as it thinks the texture has been released.

Instead of reverting the faulty patch,
this patch instead flushes the csmt queue right away
after queuing the upload for this type of textures.
This is more future-proof, as we may want to bind the
surface for other reasons in the future.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-26 19:53:00 +01:00
Matt Turner
a7d629a590 i965: Always compile fp64 funcs when needed
Compilation of user-specified shaders with software fp64 works by
compiling on demand an "fp64-funcs" shader implementing various fp64
operations and then linking it into the "user shader".

In

   commit 64b8c86d37
   Author: Timothy Arceri <tarceri@itsqueeze.com>
   Date:   Thu Jan 17 17:16:29 2019 +1100

       glsl: be much more aggressive when skipping shader compilation

we changed the behavior of the shader cache to skip compilation earlier
when we get a cache hit.

After the aforementioned commit, compiling a user program using fp64
would store into the cache an entry for the fp64-funcs shader.
Subsequent compilations of uncached user shaders using fp64 would fail
in compile_fp64_funcs() after finding a cache entry for the fp64-funcs,
but being unprepared to read from the cache.

It's unclear to me how to retrieve the cached NIR of the fp64-funcs (if
it even is cached), so just call _mesa_glsl_compile_shader() with
force_recompile=true in order to ensure we generate the fp64-funcs
successfully.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-26 10:33:22 -08:00
Matt Turner
18b467c066 intel/compiler: Add a file-level description of brw_eu_validate.c
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-26 10:33:22 -08:00
Jonathan Marek
41ddf1d150 freedreno: add renderonly scanout
This allows creating a fd_screen with a renderonly object which will be
used to allocate scanout resources.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
[slight tweak to fix uninitialized 'prsc' in debug print]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-26 10:47:21 -05:00
Rob Clark
cd79b5e0c2 freedreno/a2xx: fix unused variable warning
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-26 10:44:31 -05:00
Timothy Arceri
8e9ad592c3 tgsi: remove culldist semantic from docs
The semantic was removed in e6d9389366.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-26 12:04:53 +11:00
Timothy Arceri
5d66f7103f ac/nir_to_llvm: fix clamp shadow reference for more hardware
Fixes the following piglit test on my VEGA and matches the behaviour in the
tgsi backend.

tests/spec/glsl-1.10/execution/samplers/glsl-fs-shadow2D-clamp-z.shader_test

Fixes: 625dcbbc45 ("amd/common: pass address components individually to ac_build_image_intrinsic")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-26 12:03:24 +11:00
Eric Anholt
08f4a904b3 gallium: Make sure we return is_unorm/is_snorm for compressed formats.
The util helpers were looking for a non-void channel in a non-mixed
format and returning its snorm/unorm state.  However, compressed formats
don't have non-void channels, so they always returned false.  V3D wants to
use util_format_is_[su]norm for its border color clamping workarounds, so
fix the functions to return the right answer for these.

This now means that we ignore .is_mixed.  I could retain the is_mixed
check, but it doesn't seem like a useful feature -- the only code I could
find that might care is freedreno's blit, which has some notes about how
things are wonky in this area anyway.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-25 13:06:50 -08:00
Eric Anholt
104c7883e7 gallium: Fix comment about possible colorspaces.
Two typos, and missing one of the colorspaces.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-25 13:06:47 -08:00
Eric Anholt
54abd2e084 gallium: Enable unit tests as actual meson unit tests.
These tests don't need swrast, so we can always enable them when
build_tests is set.  Most of them run to successful completion quickly
(.9s on my SKL).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-25 13:06:45 -08:00
Emil Velikov
3b6aaab7e9 mapi: print function declarations for shared glapi
An earlier commit aimed to remove unneeded function declarations, namely
OpenGL entrypoints which are not applicable for OpenGLES*.

However, it did not consider the shared glapi, which needs all of them,
including hidden ones. This resulted in warnings/errors like the following:

../build/src/mapi/shared-glapi/glapi_mapi_tmp.h:26014:15:
error: no previous prototype for ‘shared_dispatch_stub_1414’ [-Werror=missing-prototypes]

This patch addresses that.

Cc: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reported-by: Eric Anholt <eric@anholt.net>
Fixes: 6148cce388 ("mapi: drop unneeded gl_dispatch_stub declarations")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-01-25 13:04:04 -08:00
Rob Clark
4aa64940c6 freedreno: limit tiling to PIPE_BIND_SAMPLER_VIEW
1ce5d757d0 dropped this limit.. which is probably the right thing to
do.  But it results in an extra tiled->linear blit for glReadPixels()
(ie. dEQP/piglit) which is hitting some intermittent corruption (looks
like cache) on a6xx, causing a lot of spurious fails.

Since we are getting close to 19.0 branchpoint, re-instate this limit
for now, until the blitter problems are resolved.

Fixes: 1ce5d757d0 freedreno: core buffer modifier support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-25 10:20:05 -05:00
Samuel Pitoiset
378e2d2414 radv: fix computing number of user SGPRs for streamout buffers
Streamout buffers are emitted like push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-25 15:36:16 +01:00
Jose Fonseca
65b8d723fd appveyor: Revert commits adding Cygwin support.
This reverts commits 00ad77b9f6 and
5334dafee2.

This avoids Appveyor build breakage due to Cygwin, but more importantly,
there are several problems with these patches, as highlighted in my
recent mesa-dev mail.  So better to revert for now, and pursue Cygwin
support after these have been addressed.
2019-01-25 14:13:26 +00:00
Tapani Pälli
540939ecee android: fix build issues with libmesa_anv_gen* libraries
We need this include path to find nir/nir_xfb_info.h.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-25 15:21:06 +02:00
Andrii Simiklit
4759bb2fcf intel/batch-decoder: fix a vb end address calculation
According to the loop implementation (in 'ctx_print_buffer' function),
which advances dword by dword over vertex buffer(vb),
the vb size should be aligned by 4 bytes too.
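
A minimal sketch of such a dword walk, assuming the end is rounded down to a
multiple of 4 bytes so that a trailing partial dword is never read.
walk_dwords() is a hypothetical helper, not the decoder's function, and the
rounding direction is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Visit a buffer one dword at a time and return how many dwords were
 * visited. The end pointer is derived from the size rounded down to a
 * multiple of 4 bytes. */
size_t
walk_dwords(const uint32_t *map, size_t size_in_bytes)
{
    const uint32_t *end = map + (size_in_bytes & ~(size_t)3) / 4;
    size_t visited = 0;

    for (const uint32_t *dw = map; dw < end; dw++)
        visited++;

    return visited;
}
```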

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-25 15:12:30 +02:00
Andrii Simiklit
db39a44f10 intel/batch-decoder: fix vertex buffer size calculation for gen<8
It should be incremented by one according to
how it is calculated by 'emit_vertex_buffer_state':
  "\#if GEN_GEN < 8
      .BufferAccessType = step_rate ? INSTANCEDATA : VERTEXDATA,
      .InstanceDataStepRate = step_rate,
   \#if GEN_GEN >= 5
      .EndAddress = ro_bo(bo, end_offset - 1),
   \#endif
   \#endif"
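
Since EndAddress is programmed as end_offset - 1, it names the last valid
byte, and the size has to be computed inclusively. A small sketch, with
vb_size_from_end_address() as a hypothetical helper name:

```c
#include <stdint.h>

/* EndAddress points at the last valid byte of the vertex buffer, so the
 * size is the inclusive distance between start and end. */
uint64_t
vb_size_from_end_address(uint64_t start_address, uint64_t end_address)
{
    return end_address - start_address + 1;
}
```

For a buffer that starts at 0x1000 and whose EndAddress is 0x10ff, this gives
0x100 bytes rather than 0xff.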

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-25 15:12:07 +02:00
Eric Engestrom
69e9440367 meson/vdpau: add missing soversion
This mirrors what autotools does in src/gallium/state_trackers/vdpau/Makefile.am
and src/gallium/targets/vdpau/Makefile.am:

  VDPAU_MAJOR = 1
  VDPAU_MINOR = 0
  libvdpau_gallium_la_LDFLAGS = -version-number $(VDPAU_MAJOR):$(VDPAU_MINOR)

Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
Fixes: 68076b8747 "meson: build gallium vdpau state tracker"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-25 12:10:00 +00:00
Eric Engestrom
9af77fcf98 anv: drop always-successful VkResult
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-25 09:45:27 +00:00
Rafael Antognolli
f2ece26601 anv/allocator: Avoid race condition in anv_block_pool_map.
Accessing bo->map and then pool->center_bo_offset without a lock is
racy. One way of avoiding such race condition is to store the bo->map +
center_bo_offset into pool->map at the time the block pool is growing,
which happens within a lock.

v2: Only set pool->map if not using softpin (Jason).
v3: Move things around and only update center_bo_offset if not using
softpin too (Jason).

Cc: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: Ian Romanick <idr@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109442
Fixes: fc3f588320
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-24 17:39:40 -08:00
Dylan Baker
c1efa240c9 meson: Add warnings and errors when using ICC
ICC tries to be helpful by not erroring when it sees something that it
doesn't understand, which is completely the opposite of helpful. Meson
0.49.0 does much better at handling this by really trying to make ICC
error, but there are some things in mesa that still get ignored until
0.49.1

v2: - Fix id check, which is 'intel' not 'icc'

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
2019-01-24 19:14:50 +00:00
Dylan Baker
7cb7f35bc7 meson: Fix compiler checks for SWR with ICC
This is a bit fragile, as the way this "fixes" the check is to move the
one that we know is correct before the one that is incorrectly reported
as working. In meson 0.49.1 (which isn't out yet) this is fixed so that the
incorrect check is reported as a failure.

Fixes: e0b037d697
       ("meson: Build SWR driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109129
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-24 19:14:50 +00:00
Dylan Baker
3ba7ab8d2c meson: fix swr KNL build
There's a typo in one of the #defines that breaks compilation.

Fixes: e0b037d697
       ("meson: Build SWR driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109023
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-24 19:14:50 +00:00
Matt Turner
70a7ece035 gallivm: Return true from arch_rounding_available() if NEON is available
LLVM uses the single instruction "FRINTI" to implement llvm.nearbyint.
Fixes the rounding tests of lp_test_arit.

Bug: https://bugs.gentoo.org/665570
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-24 11:07:24 -08:00
Matt Turner
385ee7c3d0 gallium: Enable ASIMD/NEON on aarch64.
NEON (now called ASIMD) is available on all aarch64 CPUs. Our code was
missing an aarch64 path, leading to util_cpu_caps.has_neon always being
false on aarch64.
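
The idea, roughly: on aarch64 the capability can be set unconditionally at
build time. The sketch below uses hypothetical names (SKETCH_IS_AARCH64,
sketch_has_neon) and is not the util_cpu_detect code; detection for other
architectures is left out.

```c
#include <stdbool.h>

/* __aarch64__ is predefined by GCC and Clang when targeting AArch64. */
#if defined(__aarch64__)
#define SKETCH_IS_AARCH64 1
#else
#define SKETCH_IS_AARCH64 0
#endif

/* On AArch64 no runtime probing is needed; other architectures would need
 * their own detection, which this sketch omits and reports as false. */
bool
sketch_has_neon(void)
{
    return SKETCH_IS_AARCH64 != 0;
}
```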

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-24 11:07:24 -08:00
Dave Airlie
1f6b92b476 gallium: use put image shm2 path (v2)
This fixes the drisw paths to use the new shm2 interface, so that
we don't trigger the X server overflow checks when the x offset is non-zero.

This just hides the versioning in drisw, and either passes the src_x
or adds the offset fixup for the fallback path.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-01-25 04:27:45 +10:00
Dave Airlie
00af91ca46 glx: add support for putimageshm2 path (v2)
v2: pass x,0 in as the offset coords at glx level not earlier

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-01-25 04:27:45 +10:00
Dave Airlie
db83a2b40f dri_interface: add put shm image2 (v2)
This adds a new interface to the swrast interface to fix an shm put image bug.

The current code adds the x,y src offsets into the offset parameters,
however if the x offset is > 0, and the put image copies up to the height
of the image, this can trigger an X server validation check to fail and
the rendering to get BadMatch.

This patch fixes it to pass the x offset coord in as a src x.

We cannot pass the Y coordinate due to the horrible code mangling the
image w/h vs stride in swrastXPutImage.

v2: drop srcx,y from api

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-01-25 04:27:45 +10:00
Emil Velikov
281421e1bc mapi: remove machinery handling CSV files
We haven't had one in years, so just drop the code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
8a0012692a mapi: remove old, unused ES* generator code
As of earlier commit, everyone has switched to the new script for the ES
dispatch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
a41214ca3e mapi/es2api: remove no longer present entrypoints
With the previous scripts, API from the following extensions was incorrectly
exported. Drop them from the list, since they're no longer around.

GL_EXT_blend_func_extended
GL_EXT_texture_integer

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
05f8558b27 mapi/es*api: remove GL_EXT_multi_draw_arrays entrypoints
Now we use the upstream XML file and a cleaner generator. Thus the
symbols are no longer exported and we can drop them from this list.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
5661ce6c64 mapi/es*api: remove GL_OES_EGL_image entrypoints
At some point in the past we fixed the scripts, so these are no longer
exported. Drop them from the list.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
9f86f1da7c Revert "mapi/new: sort by slot number"
This reverts commit a1f5d9412cf7cacb3534635f6c2409fafbe6574e.

We no longer need to sort - it was meant only to ease comparison against
the old generated files.
2019-01-24 18:13:25 +00:00
Emil Velikov
3bf08292d2 scons: wire the new generator for es1 and es2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
0842bc879b meson: wire the new generator for es1 and es2
v2: use ${foo}_py naming (Dylan)
v3: use symbolic name for genCommon.py

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v2)
2019-01-24 18:13:25 +00:00
Emil Velikov
656845301d autotools: wire the new generator for es1 and es2
The output produced is functionally identical, with the following changes:
 - A cosmetic: swapped ABI compatible types [ GLclampf -> GLfloat, etc ]
 - B cosmetic: renamed parameters [ zNear -> n, etc ]
 - C dropped extension entrypoints - invalid/incorrect

To make things easier to validate, normalise both old/new headers: run
the sed patterns A, B and C on both sets.

A
      s/\<GLclampf\>/GLfloat/g; s/\<GLclampx\>/GLfixed/g;
      s/\<GLvoid\>/void/g;

B
      s/\ \* / */g; s/\<texture\>/target/g;
      s/\<plane\>/p/g; s/\<depth\>/d/g; s/\<modeAlpha\>/modeA/g;
      s/\<shader\>/program/g; s/\<obj\>/shaders/g; s/\<equation\>/eqn/g;
      s/\<param\>/data/g; s/\<params\>/data/g; s/\<buffers\>/buffer/g;
      s/\<src\>/mode/g; s/\<count\>/n/g; s/\<zNear\>/n/g; s/\<zFar\>/f/g;
      s/\<zfail\>/dpfail/g; s/\<zpass\>/dppass/g; s/\<buf\>/index/g;
      s/\<value\>/target/g; s/\<cap\>/target/g; s/\<maskNumber\>/index/g;
      s/\<srcRGB\>/sfactorRGB/g; s/\<dstRGB\>/dfactorRGB/g;
      s/\<srcAlpha\>/sfactorAlpha/g; s/\<dstAlpha\>/dfactorAlpha/g;
      s/\<primitiveMode\>/mode/g; s/\<primcount\>/instancecount/g;
      s/\<top\>/t/g; s/\<bottom\>/b/g; s/\<left\>/l/g; s/\<right\>/r/g;
      s/\<x\>/v0/g; s/\<y\>/v1/g; s/\<z\>/v2/g; s/\<w\>/v3/g;
      s/\<sfactor\>/mode/g; s/\<dfactor\>/dst/g; s/\<attribindex\>/bindingindex/g;
      s/\<internalFormat\>/internalformat/g; s/\<bufSize\>/bufsize/g;

C
glMultiDrawArraysEXT
glMultiDrawElementsEXT

glBindFragDataLocationEXT

glGetTexParameterIivEXT
glGetTexParameterIuivEXT
glTexParameterIivEXT
glTexParameterIuivEXT

v2:
 - gl_dispatch_stub declarations are addressed with previous patch
 - the public_entries table is no longer generated

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
389bc2bc6e mapi/new: remove duplicate GLvoid/void substitution
We already do it a few lines above - drop the duplicate.

Note that for consistency's sake, we keep the substitution since the GL
API is a mixed bag - some use GLvoid while others use a plain void.

We might want to merge this back in GLVND.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
5fa6c34949 mapi/new: fixup the GLDEBUGPROCKHR typedef to the non KHR one
This way we can reuse the latter, which is already present in the
headers that we use. Thus we can drop the manual typedef we generate.

We might want to merge this back in GLVND.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
babec55f7e mapi/new: don't print info we don't need for ES1/ES2
There is no need for the noop functions, the public_stubs and
public_entries table or table size defines. Remove those.

Pretty much all of this is applicable to GLVND, although it
requires preparatory work.

v2:
 - python style fixes (Dylan)
 - use "gldispatch" instead of not "glesv1" "glesv2"
 - remove the public_entries table/array (Erik)

v3:
 - use if == "gldispatch", instead of "in" (Kyle)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v2)
2019-01-24 18:13:25 +00:00
Emil Velikov
5b1bdce156 mapi/new: split out public_entries handling
The only instance that requires the public_entries table is the
dispatch library - split that into another function.

We have to be careful about when we undefine the guard, so split it out.

We might want to merge this back in GLVND.
Minor GLVND cleanup will be needed first.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
313f977224 mapi/new: reinstate _NO_HIDDEN suffixes in the new generator
Strictly speaking we can rework the rest of the code so we do not need
those. That said, this will require a series of its own so let's carry
this local quirk for now.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
451805f810 mapi/new: use the static_data offsets in the new generator
Otherwise the incorrect ones will be used, effectively breaking the ABI.

Note: some entries in static_data.py list a suffixed API, while (for ES*
at least) we expect the one w/o suffix.

v2:
 - rework path handling (Dylan)
 - use else if chain (Erik)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
bba375c016 mapi/new: sort by slot number
Makes it easier to compare the newly generated header against the old
one. Will be reverted after the transition.
2019-01-24 18:13:25 +00:00
Emil Velikov
06eb3fe371 mapi/new: import mapi scripts from glvnd
Currently we have over 20 scripts that generate the libGL* dispatch and
various other functionality. More importantly we're using local XML
files instead of the Khronos-provided one(s), resulting in an
increasing complexity of writing, maintaining and bugfixing.

One fairly annoying bug is handling of statically exported symbols.
Today, if we enable a GL extension for GLES1/2, we add a special tag to
the xml. Thus the ES dispatch gets generated, but also since we have no
separate notion of GL/ES1/ES2 static functions it also gets exported
statically.

This commit adds step one towards clearing and simplifying our setup.
It imports the mapi generator from GLVND.

  012fe39 ("Remove a couple of duplicate typedefs.")

v2: use local genCommon.py

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
cd0f11bac5 mapi: move genCommon.py to src/mapi/new
The helper will also be used by the new Khronos gl.xml aware generator.

v2: Move existing one, instead of duplicating it.
v3: Correct genCommon.py references in meson [Erik]
v4: Drop the file from the EGL EXTRA_DIST [Erik]

Suggested-by: Kyle Brenneman <kbrenneman@nvidia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
a08a793180 genCommon.py: Fix typo in _LIBRARY_FEATURE_NAMES.
Port glvnd commit 37fc6caa4b8 ("Fix typo in _LIBRARY_FEATURE_NAMES.")
from Michal Srb.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
cf317bf093 mapi: add all _glapi_table entrypoints to static_data.py
Currently various parts of mesa use the glapi_table differently.

Some use _glapi_get_proc_offset() to get the offset, while others
directly reference the specific offset via _gloffset_Function.

Add all static entries, to ensure things don't break as we flip to the
upstream XML + new mapi generator.

Note: the offsets are also used for the alias remap table, thus we need
to ensure we honour the correct offsets range or it will break.

Currently this is done via MAX_OFFSETS constant, although a better
solution is in the works.

v2: add FramebufferTexture2DMultisampleEXT
v3: add MAX_OFFSETS guard

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
fe9f5c0e21 mapi: sort static entrypoints numerically
A few of the entrypoints were incorrectly placed. Sort those to align
with the rest of the list.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
5a81e8d40e Revert "mesa/main: remove ARB suffix from glGetnTexImage"
This reverts commit f1998e15ff.

This changes the ABI, such that glGetnTexImageARB entry-point from the
GLAPI gets removed. Thus accessing many functions by offset (as we do)
will result in getting the wrong one.

Follow-up work will swap the by-offset handling, but for now revert
this patch.

Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Erik Faye-Lund
6148cce388 mapi: drop unneeded gl_dispatch_stub declarations
These declarations are not used anywhere - be that generated code or
otherwise.

[Emil: format the hunk from Erik into a patch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:24 +00:00
Emil Velikov
ca152234e1 mesa: correctly use os.path.join in our python scripts
With Windows in mind, using forward slash isn't the right thing to do.
Even if it just works, we might want to fix it.

While here, use __file__ instead of argv[0] and sys.path.insert over
sys.path.append, with the path tweak being reportedly faster.
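
A minimal Python sketch of the pattern described above, assuming a
hypothetical layout where helper modules sit in a directory next to the
script. The names helper_dir and prepend_to_sys_path are illustrative and
are not taken from the Mesa scripts:

```python
import os
import sys


def helper_dir(script_path, *parts):
    """Build a directory path relative to the script, using the OS separator."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    # os.path.join picks the right separator on every platform,
    # unlike a hand-written "a/b" string.
    return os.path.join(script_dir, *parts)


def prepend_to_sys_path(directory):
    """Put directory at the front of sys.path so it is searched first."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
```

A script would typically call prepend_to_sys_path(helper_dir(__file__, "new"))
before importing its helpers. __file__ names the module's own source file,
whereas argv[0] only names whatever was launched.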

Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:24 +00:00
Emil Velikov
9cc8e12505 freedreno: automake: ship ir3_nir_trig.py in the tarball
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:24 +00:00
Eric Engestrom
8ed966b506 egl/glvnd: sync egl.xml from Khronos
Fixes: 98984b7cdd "egl: add glvnd entrypoints for EGL_MESA_query_driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 16:55:21 +00:00
Eric Engestrom
d2ca270511 travis: bump libdrm to 2.4.97
Fixes: c02f761bdf "winsys/amdgpu: use the new BO list API"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-24 14:50:33 +00:00
Veluri Mithun
85edfc04b8 egl: Implementation of egl dri2 drivers for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:52 +00:00
Eric Engestrom
98984b7cdd egl: add glvnd entrypoints for EGL_MESA_query_driver
Fixes: fbdd7bde29863935106c "egl: Implement EGL API for MESA_query_driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:47 +00:00
Veluri Mithun
6afce78128 egl: Implement EGL API for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:47 +00:00
Eric Engestrom
7d9274388b egl: update headers from Khronos
Cheating a tiny bit as these headers aren't in the Khronos repo yet, but
I expect them to be within a couple days.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:44 +00:00
Eric Engestrom
381d0e753a egl: finalize EGL_MESA_query_driver
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:36 +00:00
Matt Turner
e166003cb7 intel/compiler: Reset default flag register in brw_find_live_channel()
emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its
flag_subreg set, so that the IR knows which flag is accessed. However
the flag is only used on Gen7 in Align1 mode.

To avoid setting unnecessary bits in the instruction words, get the
information we need and reset the default flag register. This allows
round-tripping through the assembler/disassembler.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-23 22:48:29 -08:00
Kenneth Graunke
74c9c906f9 gallium: Add forgotten docs for PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS.
Thanks to Ilia for catching this.
2019-01-23 17:16:22 -08:00
Mark Janes
022800a058 Revert "Implement EGL API for MESA_query_driver"
This reverts commit ff621a5055.

with default warnings configuration, this commit generates:

   ../src/egl/main/eglapi.c:2654:1: error: no previous prototype for
            ‘eglGetDisplayDriverConfig’ [-Werror=missing-prototypes]
2019-01-23 16:29:13 -08:00
Mark Janes
9e9fa13c81 Revert "Implementation of egl dri2 drivers for MESA_query_driver"
This reverts commit 2720f78ef2.
2019-01-23 16:28:47 -08:00
Veluri Mithun
2720f78ef2 Implementation of egl dri2 drivers for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-23 22:29:14 +00:00
Veluri Mithun
ff621a5055 Implement EGL API for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-23 22:29:14 +00:00
Veluri Mithun
499869908b Add extension doc for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-23 22:29:14 +00:00
Sergii Romantsov
cfca5cd958 nir: Length of boolean vtn_value now is 1
During conversion type-length was lost due to math.

v2 (Jason Ekstrand):
 - Use a size/offset of 4 bytes

Fixes: 44227453ec (nir: Switch to using 1-bit Booleans for almost everything)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109353
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-23 15:43:06 -06:00
Marek Olšák
42aea4f1a7 st/mesa: fix PRIMITIVES_GENERATED query after the "pipeline stat single" changes
When this functionality was added, the PRIMITIVES_GENERATED query was
accidentally omitted. This causes issues for drivers that support
transform feedback.

Fixes: d644698b44 ("gallium: Add the ability to query a single
pipeline statistics counter")

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-23 14:32:57 -05:00
Marek Olšák
c89e8470e5 st/mesa: purge framebuffers when unbinding a context
This fixes pipe_surface "leaks".

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-01-23 14:32:57 -05:00
Erik Faye-Lund
5c17c01815 docs: add note about sending merge-requests from forks
Sending MRs from the main Mesa repository increases clutter in the
repository, and decreases visibility of project-wide branches. So it's
better if MRs are sent from forks instead.

Let's add a note about this, in case it's not obvious to everyone.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-23 18:14:06 +01:00
Rob Clark
5a4af871e3 freedreno: set modifier when exporting buffer
Fixes an assert we start hitting with kms/gbm:

  #0  0x0000007fbf3d6e3c in raise () from /lib64/libc.so.6
  #1  0x0000007fbf3c4a68 in abort () from /lib64/libc.so.6
  #2  0x0000007fbf3d04e8 in __assert_fail_base () from /lib64/libc.so.6
  #3  0x0000007fbf3d0550 in __assert_fail () from /lib64/libc.so.6
  #4  0x0000007fbf5a73c4 in gbm_dri_bo_create (gbm=0x5820f0, width=2160, height=1440, format=875713112, usage=0, modifiers=0x695e00, count=1) at ../src/gbm/backends/dri/gbm_dri.c:1150
  #5  0x0000007fbf5a49c4 in gbm_bo_create_with_modifiers (gbm=0x5820f0, width=2160, height=1440, format=875713112, modifiers=0x695e00, count=1) at ../src/gbm/main/gbm.c:491
  #6  0x0000007fbbac3d64 in get_back_bo (dri2_surf=0x6f4cc0) at ../src/egl/drivers/dri2/platform_drm.c:258
  #7  0x0000007fbbac4318 in dri2_drm_image_get_buffers (driDrawable=0x704490, format=4098, stamp=0x6fc730, loaderPrivate=0x6f4cc0, buffer_mask=1, buffers=0x7fffffe210) at ../src/egl/drivers/dri2/platform_drm.c:409
  #8  0x0000007fbf5a5318 in image_get_buffers (driDrawable=0x704490, format=4098, stamp=0x6fc730, loaderPrivate=0x70e150, buffer_mask=1, buffers=0x7fffffe210) at ../src/gbm/backends/dri/gbm_dri.c:135
  #9  0x0000007fbe4308c4 in dri_image_drawable_get_buffers (drawable=0x6fc730, images=0x7fffffe210, statts=0x6f2660, statts_count=1) at ../src/gallium/state_trackers/dri/dri2.c:339
  #10 0x0000007fbe430c44 in dri2_allocate_textures (ctx=0x614b30, drawable=0x6fc730, statts=0x6f2660, statts_count=1) at ../src/gallium/state_trackers/dri/dri2.c:466
  #11 0x0000007fbe435580 in dri_st_framebuffer_validate (stctx=0x714160, stfbi=0x6fc730, statts=0x6f2660, count=1, out=0x7fffffe3b8) at ../src/gallium/state_trackers/dri/dri_drawable.c:85
  #12 0x0000007fbe7b2c84 in st_framebuffer_validate (stfb=0x6f2190, st=0x714160) at ../src/mesa/state_tracker/st_manager.c:222
  #13 0x0000007fbe7b4884 in st_api_make_current (stapi=0x7fbf0430d8 <st_gl_api>, stctxi=0x714160, stdrawi=0x6fc730, streadi=0x6fc730) at ../src/mesa/state_tracker/st_manager.c:1074
  #14 0x0000007fbe434f44 in dri_make_current (cPriv=0x703c20, driDrawPriv=0x704490, driReadPriv=0x704490) at ../src/gallium/state_trackers/dri/dri_context.c:301
  #15 0x0000007fbe42c910 in driBindContext (pcp=0x703c20, pdp=0x704490, prp=0x704490) at ../src/mesa/drivers/dri/common/dri_util.c:579
  #16 0x0000007fbbabab40 in dri2_make_current (drv=0x69d170, disp=0x69c6e0, dsurf=0x6f4cc0, rsurf=0x6f4cc0, ctx=0x70cb40) at ../src/egl/drivers/dri2/egl_dri2.c:1456
  #17 0x0000007fbbaa8ef4 in eglMakeCurrent (dpy=0x69c6e0, draw=0x6f4cc0, read=0x6f4cc0, ctx=0x70cb40) at ../src/egl/main/eglapi.c:862
  #18 0x0000007fbf5736ac in InternalMakeCurrentVendor (dpy=dpy@entry=0x614fb0, draw=draw@entry=0x6f4cc0, read=read@entry=0x6f4cc0, context=context@entry=0x70cb40, apiState=apiState@entry=0x6fc940, vendor=0x6975f0) at libegl.c:861
  #19 0x0000007fbf573764 in InternalMakeCurrentDispatch (dpy=0x614fb0, draw=0x6f4cc0, read=0x6f4cc0, context=0x70cb40, vendor=0x6975f0) at libegl.c:630
  #20 0x0000000000403640 in init_egl (egl=0x5805a8 <gl>, gbm=0x580528 <gbm>, samples=0) at ../common.c:263
  #21 0x0000000000403c1c in init_cube_smooth (gbm=0x580528 <gbm>, samples=0) at ../cube-smooth.c:225
  #22 0x0000000000408618 in main (argc=1, argv=0x7fffffe8d8) at ../kmscube.c:145

Fixes: 1ce5d757d0 freedreno: core buffer modifier support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-23 10:21:00 -05:00
Samuel Pitoiset
963c044c55 radv: always pass the GFX9 fence data to si_cs_emit_cache_flush()
Remove two useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:14 +01:00
Samuel Pitoiset
5f0b17d581 radv: compute the GFX9 fence VA at allocation time
Instead of doing it every time we emit cache flushes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:12 +01:00
Samuel Pitoiset
e7ac792400 radv: only allocate the GFX9 fence and EOP BOs for the gfx queue
It's invalid to emit a ZPASS_DONE event on the compute queue, and
the fence BO is unused on the compute queue (ie. we don't flush
CB or DB caches).
This saves some space in the upload BO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:09 +01:00
Samuel Pitoiset
bd098884f1 radv: remove old_fence parameter from si_cs_emit_write_event_eop()
This parameter is actually useless as the immediate value
can always be zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:07 +01:00
Samuel Pitoiset
698afa177e radv: improve gathering of load_push_constants with dynamic bindings
For example, if a pipeline has two stages, VS and FS, and only
the fragment stage needs dynamic bindings, we shouldn't allocate
an extra user SGPR for the vertex stage. Of course, if the vertex
stage loads constants, it needs a user SGPR.

This should reduce the number of SET_SH_REG packets that are emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 09:43:53 +01:00
Caio Marcelo de Oliveira Filho
e0485a1dd7 gallium: Add PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS
In the Intel backend, it makes the most sense to treat gl_TessLevelInner
and gl_TessLevelOuter as ordinary shader inputs.  For Radeon, it makes
more sense to treat them as system values which get special handling.

We already have a compiler option for this, but the Iris driver will
need a capability bit so we can set it appropriately.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-23 00:35:56 -08:00
Ilia Mirkin
8e26d534be nv50,nvc0: mark textures dirty on fb update
We may have to flush the cache if there are any textures presently bound
that refer to the outgoing framebuffer. This is only checked at
validation time.

Fixes a number of dEQP-GLES3.functional.fbo.color.repeated_clear.sample.*
tests, which would bind a texture, then clear it while the binding was
in effect, and then render to a different texture. This seems legal
under the "no feedback loops" rule.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-22 23:16:01 -05:00
Timothy Arceri
678ef2a4a5 ac/nir_to_llvm: fix interpolateAt* for structs
This fixes the arb_gpu_shader5 interpolateAt* tests that contain
structs.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-01-23 10:41:37 +11:00
Timothy Arceri
559e5b0408 ac/nir_to_llvm: add bindless support for uniform handles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-23 10:41:37 +11:00
Timothy Arceri
f0ed59076f radeonsi/nir: add missing piece for bindless image support
This fixes some piglit tests and is what TGSI does.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-23 10:41:37 +11:00
Rob Clark
1ce5d757d0 freedreno: core buffer modifier support
Split out of a patch from Fritz Koenig to decouple from a6xx UBWC
enablement, and added fd_resource_create_with_modifiers().
2019-01-22 16:33:27 -05:00
Rob Clark
c56fe4118a loader: fix the no-modifiers case
Normally modifiers take precedence over use flags, as they are more
explicit.  But if the driver supports modifiers, but the xserver does
not, then we should fall back to the old mechanism of allocating a buffer
using 'use' flags.
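
The decision described above can be sketched in Python as follows. This is
only an illustration: choose_allocation and its parameters are hypothetical
names and do not correspond to functions in the loader:

```python
from typing import Optional, Sequence


def choose_allocation(driver_has_modifiers: bool,
                      server_modifiers: Optional[Sequence[int]],
                      use_flags: int) -> dict:
    """Pick how a buffer should be allocated (hypothetical helper)."""
    if driver_has_modifiers and server_modifiers:
        # Both sides understand modifiers: the explicit list wins.
        return {"method": "modifiers", "modifiers": list(server_modifiers)}
    # The server advertised no modifiers (or the driver has none):
    # fall back to the older 'use' flags mechanism.
    return {"method": "use_flags", "use": use_flags}
```

The point of the sketch is that driver-side support alone is not enough to
take the modifier path; an empty or missing list from the server selects
the fallback.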

Fixes: 069fdd5f9f
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-01-22 16:33:27 -05:00
Fritz Koenig
7c4b9510d1 freedreno: add query for dmabuf modifiers
2019-01-22 16:33:27 -05:00
Fritz Koenig
ddbe6171e6 freedreno: drm_fourcc.h header include
Add Qualcomm modifier for UBWC
2019-01-22 16:33:27 -05:00
Brian Paul
956c219c8f svga: add new gallium formats to the format conversion table
Fixes a static assertion which broke the build.

Fixes: 3ee240890 "gallium: add SINT formats to have exact counterparts to SNORM formats"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2019-01-22 12:58:04 -07:00
Marek Olšák
d85917deaf radeonsi: rename rfence -> sfence
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:34:03 -05:00
Marek Olšák
260ff57647 radeonsi: rename rbo, rbuffer to buf or buffer
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:34:01 -05:00
Marek Olšák
63b91f25bc radeonsi: rename rsrc -> ssrc, rdst -> sdst
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:33:04 -05:00
Marek Olšák
4666f36c04 radeonsi: rename rquery -> squery
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:32:59 -05:00
Marek Olšák
501ff90a95 radeonsi: rename r600_resource -> si_resource
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:32:18 -05:00
Lionel Landwerlin
a75b12ce66 vulkan: make generated enum to strings helpers available from c++
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-22 18:20:53 +00:00
Marek Olšák
1cfbed7587 radeonsi: remove r600 from comments
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:26:45 -05:00
Marek Olšák
e0a6399eb4 winsys/amdgpu: rename rfence, rsrc, rdst -> afence, asrc, adst
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:26:45 -05:00
Marek Olšák
2792ec2cdd radeonsi: rename rview -> sview
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:26:45 -05:00
Marek Olšák
96610f625d radeonsi: rename rscreen -> sscreen
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:25:57 -05:00
Marek Olšák
86e25ed5a3 radeonsi: disable render cond & pipeline stats for internal compute dispatches
2019-01-22 12:24:35 -05:00
Sonny Jiang
1b25d340b7 radeonsi: use compute for resource_copy_region when possible
v2: marek: fix snorm8 blits

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-01-22 12:24:35 -05:00
Jiang, Sonny
8daf5bb209 radeonsi: add compute_last_block to configure the partial block fields
2019-01-22 12:22:46 -05:00
Marek Olšák
b443465fb9 gallium/util: add util_format_snorm8_to_sint8 (from radeonsi)
2019-01-22 12:21:43 -05:00
Marek Olšák
3ee240890c gallium: add SINT formats to have exact counterparts to SNORM formats
for radeonsi
2019-01-22 12:21:43 -05:00
Marek Olšák
4d5f8f39f3 radeonsi: move PKT3_WRITE_DATA generation into a helper function
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
c252273f98 radeonsi: don't use WRITE_DATA.DST_SEL == MEM_GRBM on >= CIK
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
a545415eb9 radeonsi: fix the top-of-pipe fence on SI
SI doesn't have MEM.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
e402961e1d radeonsi: correct WRITE_DATA.DST_SEL definitions
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
c605738113 radeonsi: compile clear and copy buffer compute shaders on demand
same as all other shaders
2019-01-22 11:59:27 -05:00
Marek Olšák
f139589069 radeonsi: remove redundant call to emit_cache_flush in compute clear/copy
launch_grid calls it.
2019-01-22 11:59:27 -05:00
Marek Olšák
e3d283eaca radeonsi: use buffer_store_format_x & xy
2019-01-22 11:59:27 -05:00
Marek Olšák
4c4c8bb1f0 radeonsi: fix rendering to tiny viewports where the viewport center is > 8K
This fixes an assertion failure with GL CTS when cts-runner is used.
(not a specific test)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-22 11:59:27 -05:00
Marek Olšák
caa2dcd730 radeonsi: fix a u_blitter crash after a shader with FBFETCH
This fixes an assertion failure with GL CTS when cts-runner is used.
(not a specific test)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-22 11:59:27 -05:00
Marek Olšák
c02f761bdf winsys/amdgpu: use the new BO list API
2019-01-22 11:59:27 -05:00
Jason Ekstrand
ac0f8a6ea0 anv: Implement transform feedback queries
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
7f4d9bb7b8 genxml: Add SO_PRIM_STORAGE_NEEDED and SO_NUM_PRIMS_WRITTEN
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
673f33c77d anv: Implement CmdBegin/EndQueryIndexed
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
2be89cbd82 anv: Implement vkCmdDrawIndirectByteCountEXT
Annoyingly, this requires that we implement integer division on the
command streamer.  Fortunately, we're only ever dividing by constants so
we can use the mulh+add+shift trick and it's not as bad as it sounds.
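
For readers unfamiliar with the trick, here is a Python sketch of one
textbook formulation of unsigned 32-bit division by a constant (the
Granlund and Montgomery scheme). It is an assumption that this resembles
what the driver emits. The commit only names the technique, and the helper
names below are illustrative:

```python
MASK32 = 0xFFFFFFFF


def mulh_u32(a, b):
    """High 32 bits of the 64-bit product of two unsigned 32-bit values."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def magic_for_divisor(d):
    """Precompute (multiplier, shift1, shift2) for division by constant d."""
    if not 1 <= d <= MASK32:
        raise ValueError("divisor must be a non-zero 32-bit value")
    l = (d - 1).bit_length()  # ceil(log2(d))
    m = ((1 << 32) * ((1 << l) - d)) // d + 1
    return m, min(l, 1), max(l - 1, 0)


def div_by_const(n, magic):
    """Compute n // d using only a multiply-high, add/subtract and shifts."""
    m, sh1, sh2 = magic
    t = mulh_u32(n, m)
    return (t + ((n - t) >> sh1)) >> sh2
```

Because the divisor is known ahead of time, magic_for_divisor runs once on
the CPU and only the three cheap operations in div_by_const have to be
expressed on the device doing the division.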

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
36ee2fd61c anv: Implement the basic form of VK_EXT_transform_feedback
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
39925d60ec anv: Add pipeline cache support for xfb_info
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
e3bd49eaa7 anv: Add but do not enable VK_EXT_transform_feedback
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Alejandro Piñeiro
6b50b0a4a8 nir/xfb: distinguish array of structs vs array of blocks
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
ac704e777c nir/xfb: Properly handle arrays of blocks
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Alejandro Piñeiro
5649a0a6e8 nir/xfb: don't assert when xfb_buffer/stride is present but not xfb_offset
In order to allow nir_gather_xfb_info to be used on OpenGL,
specifically ARB_gl_spirv.

So, from OpenGL 4.6 spec, section 11.1.2.1, "Output Variables":

    "outputs specifying both an *XfbBuffer* and an *Offset* are
     captured, while outputs not specifying both of these are not
     captured. Values are captured each time the shader writes to such
     a decorated object."

This implies that outputs are captured if both are present, and not if
one of those is lacking. Technically, it doesn't explicitly say that
having just one or the other is a mistake. In some cases, glslang
adds an extra XfbBuffer without an XfbOffset, and mentions
that technically this is not a bug (see issue #1526).

And for the case of Vulkan, as the same glslang issue mentions, it is
not clear if that should be a mistake or not. But even if it is a
mistake, it does not really need to be checked in the driver, and we
can let the validation layers check that.

v2: simplify explicit_xfb_buffer and explicit_offset checks (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
4f99ac9144 nir/xfb: Fix offset accounting for dvec3/4
Before, we were double-counting the component slots when we had a dvec3
or dvec4.  Instead, just add them in once and manually offset the
recorded output offset.

Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
96fa23bca5 nir: Preserve offsets in lower_io_to_scalar_early
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Samuel Pitoiset
b2bbd978d0 nir: fix lowering arrays to elements for XFB outputs
If we have a transform feedback output like:

float[2] x2_out (VARYING_SLOT_VAR1.x, 0, 0)

which is lowered by nir_lower_io_arrays_to_elements to,

float x2_out (VARYING_SLOT_VAR1.x, 0, 0)
float x2_out@5 (VARYING_SLOT_VAR2.x, 0, 0)

We have to update the destination offset to avoid overwriting
the same value.
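
As a rough illustration of the offset update, the following Python sketch
computes the byte offset each split element would need, assuming elements
are tightly packed in the transform feedback buffer. The function name and
parameters are hypothetical and are not part of NIR:

```python
def element_xfb_offsets(base_offset, length, components=1, bit_size=32):
    """Byte offset of each element once an array output is split up."""
    # Size of one array element: vector width times bytes per component.
    element_size = components * (bit_size // 8)
    return [base_offset + i * element_size for i in range(length)]
```

For the float[2] case above this yields offsets 0 and 4, so the second
variable no longer writes over the first.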

v2 (Jason Ekstrand):
 - Compute the correct offsets for arrays of vectors and/or doubles

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Samuel Pitoiset
9f4e0aa7c1 nir: do not remove varyings used for transform feedback
When an xfb buffer is explicitly declared on a varying
variable, we shouldn't remove it at link time.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
9c14440e81 spirv: Only set interface_type on blocks
Instead of setting interface_type to whatever the per-vertex type is, we
only set it on blocks.  This allows later passes to tell the difference
between variables that are in blocks and those that aren't.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
da29594636 spirv: Only split blocks
Instead of splitting every per-vertex struct, just split the ones that
are actually blocks.  The reason for the split is so that we have
separate variables for separate locations, qualifiers, and builtin
decorations.  The vulkan spec only allows these on members of blocks.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
662cfb121b spirv: Initialize struct member offsets to -1
This is the "no offset specified" value.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
b4eae8444e anv: Always emit at least one vertex element
This seems to make the simulator happier.  The early return wasn't
really protecting anything and the code that follows will happily
initialize the dummy element to STORE_0 and emit it.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Eric Engestrom
610f956fde configure: EGL requirements only apply if EGL is built
Issue was hit with this configuration:
  --disable-{egl,gbm} --with-platform=drm

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 3208fd2e46 ("configure: move platform handling further up")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-22 16:12:40 +00:00
Jonathan Marek
fc4f6b2f12 freedreno: a2xx: add partial lower_scalar pass for ir2
Some instructions can only be scalar on a2xx; lower only these.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek
9f614c74b7 freedreno: a2xx: add ir2 copy propagation
Two cases:
* replacing srcs which refer to MOV instructions
* replacing MOVs used to write to exports

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek
c7dbf0b280 freedreno: a2xx: insert scalar MOV to allow 2 source scalar
If we want to use a scalar instruction with two sources, both sources have
to be in the same register. This covers a common case by inserting a scalar
MOV into a previous instruction with only a vector alu instruction.

A better method would be to have the sources end up in the same register in
the first place, but when one source is a constant this is the only way.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek
67610a0323 freedreno: a2xx: NIR backend
This patch replaces the a2xx TGSI compiler with a NIR compiler.

It also adds several new features:
-gl_FrontFacing, gl_FragCoord, gl_PointCoord, gl_PointSize
-control flow (including loops)
-texture related features (LOD/bias, cubemaps)
-filling scalar ALU slot when possible

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Tapani Pälli
da3ca69afa nir: cleanup glsl_get_struct_field_offset, glsl_get_explicit_stride
Take away const qualifier from return type of these functions as
-Wignored-qualifiers points out it is ignored for these cases.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:09:15 +02:00
Eric Engestrom
41a0c00392 travis: fix autotools build after --enable-autotools switch addition
Fixes: e68777c87c "autotools: Deprecate the use of autotools"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 10:29:19 +00:00
Jason Ekstrand
27af1cc2a6 spirv: Update the JSON and headers from Khronos master
This corresponds to commit 79b6681aadcb53c27d1052e on GitHub.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 18:55:05 -06:00
Jason Ekstrand
ca8c6c9781 nir: Mark deref UBO and SSBO access as non-scalar
Fixes: 63b9aa2e25 "spirv: Add support for using derefs for..."
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 18:41:47 -06:00
Karol Herbst
5ee0adfb6e nir/spirv: handle ContractionOff execution mode
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Rob Clark
fa737042ad nir/vtn: add caps for some cl related capabilities
vtn supports these, so don't squawk if user is happy with enabling
these.

v2: add new members sorted

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Karol Herbst
ce08e5f39c vtn: handle SpvExecutionModelKernel
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Karol Herbst
8bb46de08b mesa: add MESA_SHADER_KERNEL
used for CL kernels

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Jason Ekstrand
2aa78e46e9 anv/pipeline: Add a pdevice helper variable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-21 11:57:00 -06:00
Jason Ekstrand
344171b9ee relnotes: Add newly added Vulkan extensions
Both the Intel and RADV people have been really bad about adding things
to the release notes.  We should start actually paying attention.

Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-21 11:46:06 -06:00
Jason Ekstrand
c7f4a2867c anv: Only parse pImmutableSamplers if the descriptor has samplers
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-21 11:45:58 -06:00
Rhys Perry
f0ba826054 radv: prevent dirtying of dynamic state when it does not change
DXVK often sets dynamic state without actually changing it.
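
The idea can be sketched as follows. This is a minimal illustration, not the radv code: the struct, the dirty flag and the function name are all hypothetical, and only a viewport is tracked. The new value is compared against the stored one and the dirty bit is set only when they differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified dynamic state: names are illustrative only. */
#define SKETCH_DIRTY_VIEWPORT (1u << 0)

struct sketch_viewport {
   float x, y, width, height;
};

struct sketch_cmd_state {
   struct sketch_viewport viewport;
   uint32_t dirty;
};

/* Store the viewport and mark it dirty only if the value really changed. */
static inline void
sketch_set_viewport(struct sketch_cmd_state *state,
                    const struct sketch_viewport *vp)
{
   assert(state != NULL && vp != NULL);

   /* Redundant update: leave the dirty bits untouched. */
   if (memcmp(&state->viewport, vp, sizeof(*vp)) == 0)
      return;

   state->viewport = *vp;
   state->dirty |= SKETCH_DIRTY_VIEWPORT;
}
```

With this shape, an application that sets the same viewport every frame pays for one comparison instead of a full state re-emit.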

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
e4c6423c5e radv: avoid context rolls when binding graphics pipelines
It's common in some applications to bind a new graphics pipeline without
ending up changing any context registers.

This makes a pipeline have two command buffers: one for setting context
registers and one for everything else. The context register command buffer
is only emitted if it differs from the previous pipeline's.

v2: ensure late scissor emission is done when radv_emit_rbplus_state() is
    called
v2: make use of cmd_buffer->state.workaround_scissor_bug
v3: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
5564a797f2 radv: add missed situations for scissor bug workaround
v2: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
5d1a29071a radv: pass radv_draw_info to radv_emit_draw_registers()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Jonathan Marek
5886c5d092 freedreno: a2xx: sysmem rendering
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:34 -05:00
Jonathan Marek
bec6e4b054 freedreno: a2xx: fix non-zero texture base offsets
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:27 -05:00
Jonathan Marek
02ab85afd8 freedreno: a2xx: fix VERTEX_REUSE/DEALLOC on a20x
On a20x, set VGT_VERTEX_REUSE_BLOCK_CNTL to 2 and don't change it. Small
rearrangement on a220 to reduce the size of draw commands.

Only set DEALLOC_CNTL on a20x because the correct a220 value is not known.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:22 -05:00
Jonathan Marek
0286a11b7e freedreno: a2xx: fix gmem2mem viewport
Fixes cases where previous viewport values might cause gmem2mem to fail.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:16 -05:00
Jonathan Marek
64b12520a2 freedreno: a2xx: cleanup REG_A2XX_PA_CL_VTE_CNTL
Doesn't change much, but reduces the size of fd2_emit_state

gmem2mem does not need to change the value: no Z clipping on resolve
mem2gmem now needs to restore the common value after rendering

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:10 -05:00
Jonathan Marek
6ef7700ac6 freedreno: a2xx: cleanup init_shader_const
Only 3 vertices are used so we can drop the data for vertex 4

It doesn't make sense to have 1.1 for some coordinates, use 1.0 instead

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:21:51 -05:00
Karol Herbst
0a793c78a3 nir: add bit_size parameter to system values with multiple allowed bit sizes
v2: add assert to verify we have at least one valid bit_size
v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:17:18 +01:00
Karol Herbst
4125211e9c nir: add legal bit_sizes to intrinsics
With OpenCL some system values match the address bits, but in GLSL we also
have some system values being 64 bit like subgroup masks.

With this it is possible to adjust the builder functions so that depending
on the bit_sizes the correct bit_size is used or an additional argument is
added in case of multiple possible values.
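
A possible way to picture the legal-bit-size check, assuming the legal sizes are stored as a bitmask formed by OR-ing the sizes themselves (for example 32 | 64 = 0x60). The helper name is hypothetical and this is a sketch, not the NIR validator:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: `legal_mask` is the OR of all allowed bit sizes,
 * which works because every bit size (1, 8, 16, 32, 64) is a power of two. */
static inline bool
sketch_bit_size_is_legal(unsigned bit_size, unsigned legal_mask)
{
   /* A bit size must be a single power of two. */
   if (bit_size == 0 || (bit_size & (bit_size - 1)) != 0)
      return false;

   return (bit_size & legal_mask) != 0;
}
```

A system value that follows the address width could then use a mask of 0x60, while a boolean-or-32-bit value such as front_face would use 1 | 32.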

v2: validate dest bit_size
v3: generate hex values in python code
    remove useless imports
    rename and move bit_sizes
v4: add 1 to legal bit_sizes for front_face

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
27bd07e230 nir/validate: allow to check against a bitmask of bit_sizes
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
b9fec2b38c nir: replace more nir_load_system_value calls with builder functions
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
987744be98 glsl/lower_output_reads: set invariant and precise flags on temporaries
fixes a couple of deqp tests (on nvc0 and potentially other drivers):
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-21 00:16:50 +01:00
Rhys Kidd
8002eaab6c nv50,nvc0: add missing CAPs for unsupported features
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-20 13:51:01 -05:00
Karol Herbst
acdad24585 nir/spirv: handle SpvStorageClassCrossWorkgroup
v2: rename nir_var_global to nir_var_mem_global

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:42 +01:00
Karol Herbst
36a76b7192 nir: rename nir_var_shared to nir_var_mem_shared
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
6fefd69724 nir: rename nir_var_ssbo to nir_var_mem_ssbo
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
3afc1e068f nir: rename nir_var_ubo to nir_var_mem_ubo
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
9b24028426 nir: rename nir_var_function to nir_var_function_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
e5daef9587 nir: rename nir_var_private to nir_var_shader_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Lionel Landwerlin
ad99c1670a intel/genxml: add missing MI_PREDICATE compare operations
Doesn't save us a great deal of lines but at least they get decoded in
aubinators.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-01-19 15:47:36 +00:00
Lionel Landwerlin
79514cc5fb anv: document cache flushes & invalidations
A little bit of explanation regarding how vkCmdPipelineBarrier()
works.

v2: Avoid referring to data port cache when it's actually sampler
    caches (Jason)
    Complete explanation for indirect draws (Jason)

v3: s/samplers/sampler/ (Jason)
    s/UBOs/data port/
    Add documentation for VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2019-01-19 15:45:41 +00:00
Lionel Landwerlin
3c4c18341a anv: narrow flushing of the render target to buffer writes
In commit 9a7b319903 ("anv/query: flush render target before
copying results") we tracked all the render target writes to apply a
flushes in the vkCopyQueryResults(). But we can narrow this down to
only when we write a buffer (which is the only input of
vkCopyQueryResults).

v2: Drop newer render target write flags introduced by 1952fd8d2c
    ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+")

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2019-01-19 15:45:41 +00:00
Timothy Arceri
6ca652faf3 glsl: be much more aggressive when skipping shader compilation
Currently we only add a cache key for a shader once it is linked.
However games like Team Fortress 2 compile a whole bunch of shaders
which are never actually linked. These compiled shaders can take
up a bunch of memory.

This patch changes things so that we add the key for the shader to
the cache as soon as it is compiled. This means on a warm cache we
can avoid the wasted memory from these shaders. Worst case scenario
is we need to compile the shaders at link time but this can happen
anyway if the shader has been evicted from the cache.

Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a
warm cache from start up to the game menu.

V2: only add key to cache when compilation is successful.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-01-19 13:12:25 +11:00
Francisco Jerez
c84ec70b3a intel/fs: Promote execution type to 32-bit when any half-float conversion is needed.
The docs are fairly incomplete and inconsistent about it, but this
seems to be the reason why half-float destinations are required to be
DWORD-aligned on BDW+ projects.  This way the regioning lowering pass
will make sure that the destination components of W to HF and HF to W
conversions are aligned like the corresponding conversion operation
with 32-bit execution data type.

Tested-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 16:09:39 -08:00
Timothy Arceri
9e669ed22b ac/nir_to_llvm: fix interpolateAt* for arrays
This builds on the recent interpolate fix by Rhys ee8488ea3b.

This fixes the arb_gpu_shader5 interpolateAt* tests that contain
arrays.

Fixes: ee8488ea3b ("ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 10:59:38 +11:00
Timothy Arceri
860a9e4849 Revert "glsl: be much more aggressive when skipping shader compilation"
This reverts commit 64b8c86d37.

Reverting for now as it was causing some segfaults.
2019-01-19 10:45:07 +11:00
Kristian H. Kristensen
5486c9d526 freedreno/a6xx: Turn on texture tiling by default
The color swap isn't available for tiled formats and it's not needed
either. We pick one channel order and use for all non-linear formats.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-18 14:27:15 -08:00
Kristian H. Kristensen
60c6778dda freedreno: Synchronize batch and flush for staging resource
Staging blit downloads would wait on the src resource instead of the
staging resource and didn't make sure to submit the blit batch first.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-18 14:27:12 -08:00
Timothy Arceri
64b8c86d37 glsl: be much more aggressive when skipping shader compilation
Currently we only add a cache key for a shader once it is linked.
However games like Team Fortress 2 compile a whole bunch of shaders
which are never actually linked. These compiled shaders can take
up a bunch of memory.

This patch changes things so that we add the key for the shader to
the cache as soon as it is compiled. This means on a warm cache we
can avoid the wasted memory from these shaders. Worst case scenario
is we need to compile the shaders at link time but this can happen
anyway if the shader has been evicted from the cache.

Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a
warm cache from start up to the game menu.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-01-19 08:24:47 +11:00
Timothy Arceri
c9d7b0f184 glsl: don't skip GLSL IR opts on first-time compiles
This basically reverts c2bc0aa7b1.

By running the opts we reduce memory use in Team Fortress 2
from 1.5GB -> 1.3GB from start-up to game menu.

This will likely increase Deus Ex start up times as per commit
c2bc0aa7b1. However currently 32bit games like Team Fortress 2
can run out of memory on low memory systems, so that seems more
important.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-19 08:24:43 +11:00
Caio Marcelo de Oliveira Filho
cd56d79b59 nir: check NIR_SKIP to skip passes by name
Passes' function names, separated by comma, listed in NIR_SKIP
environment variable will be skipped in debug mode.  The mechanism is
hooked into the _PASS macro, like NIR_PRINT.
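
The name matching could look roughly like this. It is a sketch under assumptions: the helper name is hypothetical, and the real code may parse the variable differently. It checks whether a pass name appears as a complete entry in a comma-separated list such as the value of NIR_SKIP.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `name` is a complete entry of the
 * comma-separated `list`. A NULL list (variable unset) matches nothing. */
static inline bool
sketch_name_in_comma_list(const char *list, const char *name)
{
   assert(name != NULL);
   if (list == NULL)
      return false;

   const size_t name_len = strlen(name);
   const char *entry = list;
   while (*entry != '\0') {
      /* Length of the current entry, up to the next comma or the end. */
      const size_t entry_len = strcspn(entry, ",");
      if (entry_len == name_len && strncmp(entry, name, name_len) == 0)
         return true;

      entry += entry_len;
      if (*entry == ',')
         entry++;
   }
   return false;
}
```

A caller would pass getenv("NIR_SKIP") as the list and the pass function's name as the second argument. Matching whole entries means a prefix such as "nir_opt" does not skip "nir_opt_dce".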

The extra macro NIR_SKIP is available as a developer convenience, to
skip at points other than the passes' entry points.

v2: Fix typo in NIR_SKIP macro. (Bas)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 12:31:49 -08:00
Danylo Piliaiev
1952fd8d2c anv: Implement VK_EXT_conditional_rendering for gen 7.5+
Conditional rendering affects the following functions:
- vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect
- vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR
- vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase
- vkCmdClearAttachments

Value from conditional buffer is cached into designated register,
MI_PREDICATE is emitted every time conditional rendering is enabled
and command requires it.

v2: by Jason Ekstrand
  - Use vk_find_struct_const instead of manually looping
  - Move draw count loading to prepare function
  - Zero the top 32-bits of MI_ALU_REG15

v3: Apply pipeline flush before accessing conditional buffer
 (The issue was found by Samuel Iglesias)

v4: - Remove support of Haswell due to possible hardware bug
    - Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to
       define registers in one place.

v5: thanks to Jason Ekstrand and Lionel Landwerlin
    - Workaround the fact that MI_PREDICATE_RESULT is not
      accessible on Haswell by manually calculating
      MI_PREDICATE_RESULT and re-emitting MI_PREDICATE
      when necessary.

v6: suggested by Lionel Landwerlin
    - Instead of calculating the result of predicate once - re-emit
      MI_PREDICATE to make it easier to investigate error states.

v7: suggested by Jason
    - Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL
      if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set.

v8: suggested by Lionel
    - Precompute conditional predicate's result to
      support secondary command buffers.
    - Make prepare_for_draw_count_predicate more readable.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 18:31:44 +00:00
Danylo Piliaiev
ed6e2bf263 anv: Implement VK_KHR_draw_indirect_count for gen 7+
v2: by Jason Ekstrand
  - Move out of the draw loop population of registers
    which aren't changed in it.
  - Remove dependency on ALU registers.
  - Clarify usage of PIPE_CONTROL
  - Without usage of ALU registers patch works for gen7+

v3: set pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 18:31:44 +00:00
Dylan Baker
9e989b860a bin/meson-cmd-extract: Also handle cross and native files
Native file support in command line serialization isn't present in meson
0.49, but will be for 0.49.1 and 0.50

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-18 09:37:01 -08:00
Jason Ekstrand
b54df1b6df anv: Re-sort the extensions list
I like to keep things in good order so that you can find them.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-18 10:32:23 -06:00
Jason Ekstrand
eb32dad07c intel/fs: Don't touch accumulator destination while applying regioning alignment rule
In some shaders, you can end up with a stride in the source of a
SHADER_OPCODE_MULH.  One way this can happen is if the MULH is acting on
the top bits of a 64-bit value due to 64-bit integer lowering.  In this
case, the compiler will produce something like this:

mul(8)    acc0<1>UD   g5<8,4,2>UD   0x0004UW      { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

The new region fixup pass looks at the MUL and sees a strided source and
unstrided destination and determines that the sequence is illegal.  It
then attempts to fix the illegal stride by replacing the destination of
the MUL with a temporary and emitting a MOV into the accumulator:

mul(8)    g9<2>UD     g5<8,4,2>UD   0x0004UW      { align1 1Q };
mov(8)    acc0<1>UD   g9<8,4,2>UD                 { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

Unfortunately, this new sequence isn't correct because MOV accesses the
accumulator with a different precision to MUL and, instead of filling
the bottom 32 bits with the source and zeroing the top 32 bits, it
leaves the top 32 (or maybe 31) bits alone and full of garbage.  When
the MACH comes along and tries to complete the multiplication, the
result is correct in the bottom 32 bits (which we throw away) and
garbage in the top 32 bits which are actually returned by MACH.
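
For reference, the value the MUL/MACH pair is expected to produce can be written in plain C. This sketch only models the arithmetic result (the high half of a 32x32-bit unsigned multiply), not the accumulator behaviour; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model: the top 32 bits of the 64-bit product,
 * which is the part returned by MACH in the sequences above. */
static inline uint32_t
sketch_umul_high(uint32_t a, uint32_t b)
{
   return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}
```

If the top bits of the accumulator hold garbage before MACH runs, it is this upper half that comes out wrong, which is why the broken sequence can go unnoticed when only the low 32 bits are inspected.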

This commit does two things:  First, it adds an assert to ensure that we
don't try to rewrite accumulator destinations of MUL instructions so we
can avoid this precision issue.  Second, it modifies
required_dst_byte_stride to require a tightly packed stride so that we
fix up the sources instead and the actual code which gets emitted is
this:

mov(8)    g9<1>UD     g5<8,4,2>UD                 { align1 1Q };
mul(8)    acc0<1>UD   g9<8,8,1>UD   0x0004UW      { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

Fixes: efa4e4bc5f "intel/fs: Introduce regioning lowering pass"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-18 10:18:52 -06:00
Jason Ekstrand
0a7ac6d543 intel/eu: Stop overriding exec sizes in send_indirect_message
For a long time, we based exec sizes on destination register widths.
We've not been doing that since 1ca3a94427 but a few remnants
accidentally remained.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2019-01-18 10:18:52 -06:00
Samuel Pitoiset
f682ed11c3 radv: initialize the per-queue descriptor BO only once
Totally useless to write the descriptors inside the loop.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:32 +01:00
Samuel Pitoiset
72d9745a40 radv: do not write unused descriptors to the per-queue BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:30 +01:00
Samuel Pitoiset
8c164ea8f5 radv: reduce size of the per-queue descriptor BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:28 +01:00
Samuel Pitoiset
83cc87ead4 radv: drop unused code related to 16 sample locations
The driver only supports up to 8 sample locations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:24 +01:00
Karol Herbst
80dae7022e gm107/ir: disable TEXS for tex with derivAll set
fixes deqp tests:
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isamplercube_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usamplercube_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2dshadow_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.isampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.usampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler2dshadow_vertex

Fixes: f821e80213
       "gm107/ir: use scalar tex instructions where possible"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-18 03:27:51 +01:00
Karol Herbst
30b5c9eda2 nv50/ir: disable tryCollapseChainedMULs in ConstantFolding for precise instructions
fixes dEQP-GLES2.functional.shaders.invariance.mediump.loop_3

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-18 02:03:30 +01:00
Bas Nieuwenhuizen
8424cd8fbd nir: Account for atomics in copy propagation.
Otherwise writes get propagated across atomics if no barrier is
used. Without barrier writes should still be visible in the same
invocation, so an atomic has to be considered a write.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b3c6146925 "nir: Copy propagation between blocks"
Fixes: 62332d139c "nir: Add a local variable-based copy propagation pass"
2019-01-18 00:55:35 +01:00
Rafael Antognolli
927ba12b53 anv/tests: Adding test for the state_pool padding.
Add a test that checks that we can use the extra space allocated for
padding while allocating larger anv_states.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:26 -08:00
Rafael Antognolli
731c4adcf9 anv/allocator: Add support for non-userptr.
If softpin is supported, create new BOs for the required size and add the
respective BO maps. The other main change of this commit is that
anv_block_pool_map() now returns the map for the BO that the given
offset is part of. So there's no block_pool->map access anymore (when
softpin is used).

v3:
 - set fd to -1 on softpin case (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:24 -08:00
Rafael Antognolli
643248b66a anv: Remove state flush.
We have all the state buffers snooped, so we don't need to clflush
everything anymore.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:22 -08:00
Rafael Antognolli
5d61c74f3d anv/allocator: Enable snooping on block pool and anv_bo_pool BOs.
We are not going to use userptr for anv block pool BOs anymore. However,
so far we have been relying on the fact that userptr BOs are snooped on
non-llc platforms. Let's make sure that the block pool BOs are still
snooped, and we can also remove the clflush'ing that we do on all state
buffers.

And since we plan to remove the flushes, set the anv_bo_pool BOs to
cached (snooped on non-LLC platforms) too. For LLC platforms, they are
all cached by default, so this becomes a no-op.

v5:
 - Add snooping to anv_bo_pool BOs too (Jason).
 - Remove anv_gem_set_domain.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:20 -08:00
Rafael Antognolli
dfc9ab2ccd anv/allocator: Add padding information.
It's possible that we still have some space left in the block pool, but
we try to allocate a state larger than that space. This means such state
would start somewhere within the range of the old block_pool, and end
after that range, within the range of the new size.

That's fine when we use userptr, since the memory in the block pool is
CPU mapped continuously. However, by the end of this series, we will
have the block_pool split into different BOs, with different CPU
mapping ranges that are not necessarily continuous. So we must avoid
such case of a given state being part of two different BOs in the block
pool.

This commit solves the issue by detecting that we are growing the
block_pool even though we are not at the end of the range. If that
happens, we don't use the space left at the end of the old size, and
consider it as "padding" that can't be used in the allocation. We update
the size requested from the block pool to take the padding into account,
and return the offset after the padding, which happens to be at the
start of the new address range.

Additionally, we return the amount of padding we used, so the caller
knows that this happens and can return that padding back into a list of
free states, that can be reused later. This way we hopefully don't waste
any space, but also avoid having a state split between two different
BOs.

v3:
 - Calculate offset + padding at anv_block_pool_alloc_new (Jason).
v4:
 - Remove extra "leftover".

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:19 -08:00
Rafael Antognolli
7ed0898a8d anv/allocator: Rework chunk return to the state pool.
This commit tries to rework the code that splits and returns chunks back
to the state pool, while still keeping the same logic.

The original code would get a chunk larger than we need and split it
into pool->block_size. Then it would return all but the first one, and
would split that first one into alloc_size chunks. Then it would keep
the first one (for the allocation), and return the others back to the
pool.

The new anv_state_pool_return_chunk() function will take a chunk (with
the alloc_size part removed), and a small_size hint. It then splits that
chunk into pool->block_size'd chunks, and if there's some space still
left, split that into small_size chunks. small_size in this case is the
same size as alloc_size.
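
The splitting order can be sketched like this. It is a simplified illustration with hypothetical names: pieces are recorded in an array instead of being pushed to a free list, and alignment is ignored. Full block_size pieces come first, then the remainder is cut into small_size pieces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one piece handed back to the pool. */
struct sketch_piece {
   uint32_t offset;
   uint32_t size;
};

/* Splits [offset, offset + size) into block_size pieces, then small_size
 * pieces. Returns how many pieces were written to `out` (at most max_out).
 * Any tail smaller than small_size is dropped in this sketch. */
static inline size_t
sketch_split_chunk(uint32_t offset, uint32_t size, uint32_t block_size,
                   uint32_t small_size, struct sketch_piece *out,
                   size_t max_out)
{
   assert(small_size > 0 && small_size <= block_size);
   size_t n = 0;

   while (size >= block_size && n < max_out) {
      out[n++] = (struct sketch_piece){ offset, block_size };
      offset += block_size;
      size -= block_size;
   }

   while (size >= small_size && n < max_out) {
      out[n++] = (struct sketch_piece){ offset, small_size };
      offset += small_size;
      size -= small_size;
   }

   return n;
}
```

No piece is ever larger than block_size, which matches the v4 note in this commit.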

The idea is to keep the same logic, but make it in a way we can reuse it
to return other chunks to the pool when we are growing the buffer.

v2:
 - Include Jason's suggestions to the algorithm that returns chunks.
 - Update comments.

v3:
 - Disallow returning 0 blocks (Jason).
 - fix min_size in the loop (Jason).
 - remove temporary variables (Jason)
v4:
 - return_chunk() should never return blocks larger than
 pool->block_size.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:17 -08:00
Rafael Antognolli
6a1f4c96cc anv: Remove some asserts.
They won't be true anymore once we add support for multiple BOs with
non-userptr.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:14 -08:00
Rafael Antognolli
f39dad7e4e anv: Validate the list of BOs from the block pool.
We now have multiple BOs in the block pool, but sometimes we still
reference only the first one in some instructions, and use relative
offsets in others. So we must be sure to add all the BOs from the block
pool to the validation list when submitting commands.

v2:
   - Don't add block pool BOs to the dependency list right before
   execbuf (Jason)
   - Call anv_execbuf_add_bo() to each BO in the block pools (Jason)
   - Use anv_execbuf_add_bo_set() to add surface state dependencies to
   execbuf.

v3:
   - Add comment to the non-softpin case (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:10 -08:00
Rafael Antognolli
11a5d4620b anv: Split code to add BO dependencies to execbuf.
This part of the anv_execbuf_add_bo() code is totally independent of the
BO being added. Let's split it out, so we can reuse it later.

v3: rename to anv_execbuf_add_bo_set (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:08 -08:00
Rafael Antognolli
f874604f45 anv/allocator: Add support for a list of BOs in block pool.
So far we use only one BO (the last one created) in the block pool. When
we switch to not use the userptr API, we will need multiple BOs. So add
code now to store multiple BOs in the block pool.

This has several implications, the main one being that we can't use
pool->map as before. For that reason we update the getter to find which
BO a given offset is part of, and return the respective map.
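
A minimal sketch of such a getter, assuming the BOs cover consecutive offset ranges and are kept in a fixed-size array. The types and names are hypothetical and are not the anv structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical BO: a size and its own CPU mapping. */
struct sketch_bo {
   uint32_t size;
   char *map;
};

struct sketch_block_pool {
   struct sketch_bo bos[8];
   uint32_t nbos;
};

/* Finds the BO containing `offset` and returns the matching CPU address,
 * or NULL if the offset lies beyond the last BO. */
static inline void *
sketch_block_pool_map(struct sketch_block_pool *pool, uint32_t offset)
{
   assert(pool != NULL && pool->nbos <= 8);
   uint32_t bo_start = 0;

   for (uint32_t i = 0; i < pool->nbos; i++) {
      struct sketch_bo *bo = &pool->bos[i];
      if (offset < bo_start + bo->size)
         return bo->map + (offset - bo_start);
      bo_start += bo->size;
   }
   return NULL;
}
```

Because each BO has its own mapping, callers have to go through the getter for every offset rather than caching a single base pointer.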

v3:
 - Simplify anv_block_pool_map (Jason).
 - Use fixed size array for anv_bo's (Jason)
v4:
 - Respect the order (item, container) in anv_block_pool_foreach_bo
 (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:04 -08:00
Rafael Antognolli
e3dc56d731 anv: Update usage of block_pool->bo.
Change block_pool->bo to be a pointer, and update its usage everywhere.
This makes it simpler to switch it later to a list of BOs.

v3:
 - Use a static "bos" field in the struct, instead of malloc'ing it.
 This will be later changed to a fixed length array of BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:02 -08:00
Rafael Antognolli
fc3f588320 anv/allocator: Remove pool->map.
After switching to using anv_state_table, there are very few places left
still using pool->map directly. We want to avoid that because it won't
be always the right map once we split it into multiple BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:00 -08:00
Rafael Antognolli
54e21e145e anv/allocator: Rename anv_free_list2 to anv_free_list.
Now that we removed the original anv_free_list, we can now use its name.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:58 -08:00
Rafael Antognolli
234c9d8a40 anv/allocator: Remove anv_free_list.
The next commit already renames anv_free_list2 -> anv_free_list since
the old one is gone.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:56 -08:00
Rafael Antognolli
e2179aceaf anv/allocator: Use anv_state_table on back_alloc too.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:52 -08:00
Rafael Antognolli
d18267fb48 anv/allocator: Use anv_state_table on anv_state_pool_alloc.
Use anv_state_pool_return_blocks() to return blocks to the pool, instead
of manually pushing them.

v3:
 - return blocks from the end of the chunk (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:50 -08:00
Rafael Antognolli
6a1dcfe73d anv/allocator: Add helper to push states back to the state table.
The use of anv_state_table_add() combined with anv_state_table_push(),
especially when adding a bunch of states to the table, is very verbose.
So we add this helper that makes things easier to digest.

We also already add the anv_state_table member in this commit, so things
can compile properly, even though it's not used.

v2: assert that the states are always aligned to their size (Jason)
v3: Add "table" member to anv_state_pool in this commit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:47 -08:00
Rafael Antognolli
e8b6e0a5ba anv/allocator: Add getter for anv_block_pool.
We will need the anv_block_pool_map to find the map relative to some BO
that is not at the start of the block pool.

v2: just return a pointer instead of a struct (Jason)
v4: Update comment (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:43 -08:00
Rafael Antognolli
6a2d5ae305 anv/allocator: Add anv_state_table.
Add a structure to hold anv_states. This table will initially be used to
recycle anv_states, instead of relying on a linked list implemented in
GPU memory. Later it could be used so that all anv_states just point to
the content of this struct, instead of making copies of anv_states
everywhere.

One has to call anv_state_table_add(), which returns an index for the
state in the table, and then get a pointer to such index, and finally
fill in the rest of the struct.
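
The add / get / fill-in sequence might look like the sketch below. All
sketch_* names and fields are hypothetical, and realloc() is swapped in
for whatever backing store the real table grows; this is not the actual
anv_state_table code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified state; the real anv_state has other fields. */
struct sketch_state {
   int32_t offset;
   uint32_t alloc_size;
   uint32_t idx;
};

struct sketch_state_table {
   struct sketch_state *states;
   uint32_t count;
   uint32_t capacity;
};

/* Reserve one slot and store its index in *idx.
 * Returns 0 on success, -1 if the table could not grow. */
static int
sketch_state_table_add(struct sketch_state_table *table, uint32_t *idx)
{
   if (table->count == table->capacity) {
      uint32_t new_cap = table->capacity ? table->capacity * 2 : 16;
      struct sketch_state *grown =
         realloc(table->states, new_cap * sizeof(*grown));
      if (grown == NULL)
         return -1;
      table->states = grown;
      table->capacity = new_cap;
   }
   *idx = table->count++;
   table->states[*idx].idx = *idx;
   return 0;
}

/* Look a slot up by index. In this sketch the pointer is only valid
 * until the next add, because growing may move the array. */
static struct sketch_state *
sketch_state_table_get(struct sketch_state_table *table, uint32_t idx)
{
   assert(idx < table->count);
   return &table->states[idx];
}
```

Handing out an index rather than a pointer keeps the handle stable even
if the storage moves when the table grows.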

TODO:
   1) There's a lot of common code between this table backing store
   memory and the anv_block_pool buffer, due to how we grow it. I think
   it's possible to refactor this and reuse code in both places.

   2) Add unit tests.

v3:
 - Rename state table memfd (Jason)
 - Return VK_ERROR_OUT_OF_HOST_MEMORY in more places (Jason)
 - anv_state_table_grow returns VkResult (Jason)
 - Rename variables to be more informative (Jason)
 - Return errors on state table grow.
 - Rename anv_state_table_push/pop to anv_free_list_push2/pop2
   This will be renamed again to remove the trailing "2" later.

v4:
 - Remove exit(-1) from anv_state_table (Jason).
 - Use uint32_t "next" field in anv_free_entry (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:34 -08:00
Rafael Antognolli
27478ce00e anv/tests: Fix block_pool_no_free test.
There were 2 problems with this test.

First it was comparing highest, which was -1, with a uint32_t. So the
current value would never be higher than that, and the assert would
always be false. It just never reached this point because of the next
problem.

It was always looking for the highest value of each thread and storing
it in thread_max. So a test case like this wouldn't work:

[Thread]: [Blocks]
   [0]: [0, 32, 64, 96]
   [1]: [128, 160, 192, 224]
   [2]: [256, 288, 320, 352]

Not only would that skip values and iterate only over thread number 2,
instead of walking through all of them, but thread_max was also
initialized to -1 and then compared to the unsigned blocks[i][next[i]].

We fix that by getting the smallest value of each thread, and checking
if it is lower than thread_min, which is initialized to INT32_MAX. And
then we end up walking through all the blocks of all threads. We also
change "blocks" to be int32_t instead of uint32_t, since in some places
(alloc_blocks) it was already referenced as int32_t, and that fixes the
comparison to -1.
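
The signedness problem can be reproduced in isolation. The sketch below
assumes a platform where int is 32 bits wide; the function names are
made up for illustration and are not taken from the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Buggy shape: the block is unsigned and "highest" is a signed -1.
 * The cast spells out what the implicit conversion does when an int is
 * compared with a uint32_t: -1 becomes UINT32_MAX, so no block is ever
 * greater than it. */
static bool
exceeds_unsigned(uint32_t block, int highest)
{
   return block > (uint32_t)highest;
}

/* Fixed shape: both operands are signed, so -1 really is the smallest
 * value and every non-negative block compares greater. */
static bool
exceeds_signed(int32_t block, int32_t highest)
{
   return block > highest;
}
```

With highest == -1, exceeds_unsigned() is false for every block, while
exceeds_signed() is true for every block that is zero or larger.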

v2:
 - keep highest initialized to -1, and change blocks to be int32_t.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:05:58 -08:00
Lionel Landwerlin
4149d41f2e anv: fix invalid binding table index computation
The ++ operator strikes again.

Fixes: f92c5bc8f3 ("anv/device: fix maximum number of images supported")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 11:49:10 -08:00
Eric Engestrom
c4c5c90255 docs: explain how to see what meson options exist
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-17 17:05:41 +00:00
Emil Velikov
406623f5b1 docs: update calendar, add news item and link release notes for 18.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-17 11:37:41 +00:00
Emil Velikov
9d58641bf2 docs: add sha256 checksums for 18.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8320a07221)
2019-01-17 11:32:20 +00:00
Emil Velikov
2dad014496 docs: add release notes for 18.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 95a3b709c0)
2019-01-17 11:32:19 +00:00
Iago Toral Quiroga
f92c5bc8f3 anv/device: fix maximum number of images supported
We had defined MAX_IMAGES as 8, which we used to size the array for
image push constant data. The comment there stated that this was for
gen8, but anv_nir_apply_pipeline_layout runs for all gens and writes
that array, asserting that we don't exceed that number of images,
which imposes a limit of MAX_IMAGES on all gens.

Furthermore, despite this, we are exposing up to 64 images per shader
stage on all gens, gen8 included.

This patch lowers the number of images we expose in gen8 to 8 and
keeps 64 images for gen9+ while making sure that only pre-SKL gens
use push constant space to handle images.

v2:
 - <= instead of < in the assert (Eric, Lionel)
 - Change the way the assertion is written (Eric)

v3:
 - Revert the way the assertion is written to the form it had in v1,
   the version in v2 was not equivalent and was incorrect. (Lionel)

v4:
 - gen9+ doesn't need push constants for images at all (Jason)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v3)
2019-01-17 07:59:00 +01:00
Tapani Pälli
a311aa631d anv: do not advertise AHW support if extension not enabled
Fixes following failing vk-gl-cts cases on Linux desktop:

   dEQP-VK.api.external.memory.android_hardware_buffer.suballocated.buffer.info
   dEQP-VK.api.external.memory.android_hardware_buffer.suballocated.image.info
   dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.image.info
   dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.buffer.info

Fixes: 517103abf1 "anv/android: add ahardwarebuffer external memory properties"
Reported-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-01-17 07:22:02 +02:00
Eric Anholt
99ef66c325 vc4: Don't leak the GPU fd for renderonly usage.
Noticed while debugging V3D -- the ro->gpu_fd was freshly opened in ro
setup, and it needs to stay open until screen close (since it may be used
by renderonly) and should be the same one used by the vc4 screen.

Fixes: 7029ec05e2 ("gallium: Add renderonly-based support for pl111+vc4.")
2019-01-16 16:28:41 -08:00
Eric Anholt
0605726776 v3d: Don't leak the GPU fd for renderonly usage.
The CTS was running out of fds, because of the ro->gpu_fd never being
closed.  ro->gpu_fd should match the screen (in case the caller of
v3d_drm_screen_create_renderonly() has a scanout_for_resource() that uses
gpu_fd) and the screen is expected to close its fd at the end, fixing the
resource leak.

Fixes: e113b21cb7 ("v3d: Add renderonly support.")
2019-01-16 16:28:41 -08:00
Eric Anholt
59527a36e9 v3d: Restructure RO allocations using resource_from_handle.
I had bugs in the old path where I was laying out as tiled (so we'd render
tiled) but then only allocating space in the shared object for linear
rendering.  The resource_from_handle makes it so the same layout choices
are made in both the import and export scanout cases.  Also, fixes a leak
of the fd that was tripping up the CTS.

Now that we're checking PIPE_BIND_SHARED to choose to use RO, the
DRM_FORMAT_MOD_LINEAR check wasn't needed any more.

Fixes visual corruption and MMU faults in X in renderonly mode.

Fixes: bd09bb1629 ("v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.")
2019-01-16 16:28:41 -08:00
Eric Anholt
d70eb2302b v3d: If the modifier is not known on BO import, default to linear for RO.
Part of fixing DRI3 rendering with RO on X11.

Fixes: e113b21cb7 ("v3d: Add renderonly support.")
2019-01-16 16:28:41 -08:00
Timothy Arceri
cb527d2c4c ac/nir_to_llvm: add support for structs to get_sampler_desc()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Timothy Arceri
b12316cc92 ac/nir_to_llvm: fix regression in bindless support
This wasn't ported over when deref support was implemented.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Timothy Arceri
e106e0f2dd radeonsi/nir: get correct type for images inside structs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Timothy Arceri
292887ac0d ac/nir_to_llvm: fix type handling in image code
The current code only strips off arrays and cannot find the type
for images that are struct members.

Instead of trying to get the image type from the variable, we just
get it directly from the deref instruction.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Rhys Perry
8a52e4cc4f radv: use dithered alpha-to-coverage
This matches the behaviour of AMDVLK and hides banding.
It also seems to be allowed by the Vulkan spec.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-16 20:49:23 +00:00
Alok Hota
187a6506a3 swr/rast: Store cached files in multiple subdirs
This improves cache filesystem performance, especially during CI tests
Also updated jitcache magic number due to codegen parameter changes
Removed 2 `if constexpr` to prevent C++17 requirement
2019-01-16 13:53:30 -06:00
Alok Hota
bb98be61f4 swr/rast: New execution engine per JIT
Fixes relocation errors with LLVM 7.0.0
2019-01-16 13:53:30 -06:00
Alok Hota
b135db5d58 swr/rast: Scope MEM_CLIENT enum for mem usages
Avoids confusion with other defaulted integer parameters

- fixed some unspecified usages
- removed unnecessary includes
- removed unnecessary protected access specifier in buckets framework
2019-01-16 13:53:30 -06:00
Alok Hota
c722ad7379 swr/rast: Unaligned and translations in gathers
- added graphics address translation in odd gathers
- added support for unaligned gathers in fetch shader
- changed how 2+ GB offsets are handled to make them compatible with
unaligned offsets
2019-01-16 13:53:30 -06:00
Alok Hota
9459863dfa swr/rast: partial support for Tiled Resources
- updated sample from TRTT surfaces correctly
- implemented mapped status return for TRTT surfaces
- implemented per-sample instruction minLod clamp
- updated bilinear filter weight calculation to be closer to D3D specs
- implemented "ReducedTexcoordRange" operation from D3D specs to avoid
loss of precision on high-value normalized coordinates
2019-01-16 13:53:30 -06:00
Alok Hota
9cacf9d877 swr/rast: Add annotator to interleave isa text
To make debugging simpler
2019-01-16 13:53:30 -06:00
Alok Hota
c9fa2ee343 swr/rast: Use gfxptr_t value in JitGatherVertices
Use gfxptr_t type value for stream pointer uses in gather and similar
calls
2019-01-16 13:53:30 -06:00
Gert Wollny
e68777c87c autotools: Deprecate the use of autotools
Since Meson will eventually be the only build system, deprecate autotools
now. It can still be used by invoking configure with the flag
  --enable-autotools

NAKed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2019-01-16 09:52:42 -08:00
Dylan Baker
431e9abaab meson: allow building dri driver without window system if osmesa is classic
This was already enabled for gallium based osmesa with gallium drivers
in 9d10581897, so do the same for classic
driver with classic osmesa.

Fixes: cbbd5bb889
       ("meson: build classic osmesa")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-16 17:49:51 +00:00
Bruce Cherniak
ed7673afd2 gallium/swr: Fix multi-context sync fence deadlock.
Various recreation scenarios lead to API thread getting stuck in
swr_fence_finish().  This is a multi-context issue, whereby one context
overwrites the fence read-value with a previous sync's lesser value.
The fence sync value is supposed to be always increasing.

In swr_fence_cb(), only update the "read" value if the new value is
greater.

(This may seem like we're not waiting on the other context to finish, but
had we needed it to finish there would have been a wait prior to
submitting a new sync.)
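
The "only move forward" rule can be sketched as below. The struct layout
and names are hypothetical, written in C for brevity, and the sketch
ignores any locking or atomics the real callback may need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fence: "write" is the last sync value submitted,
 * "read" is the last sync value seen as completed. */
struct sketch_fence {
   uint64_t read;
   uint64_t write;
};

/* Completion callback: never let an older sync value from another
 * context overwrite a newer one. */
static void
sketch_fence_cb(struct sketch_fence *fence, uint64_t completed)
{
   assert(fence != NULL);
   if (completed > fence->read)
      fence->read = completed;
}

/* The fence is finished once the completed value catches up. */
static int
sketch_fence_done(const struct sketch_fence *fence)
{
   return fence->read >= fence->write;
}
```

If a callback carrying 3 arrives after one carrying 5, read stays at 5,
so a waiter on value 5 is not sent back to waiting.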

cc: mesa-stable@lists.freedesktop.org
2019-01-16 09:26:36 -06:00
Samuel Pitoiset
d5d7b5e950 ac/nir: don't trash L1 caches for store operations with writeonly memory
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-16 13:57:22 +01:00
Kenneth Graunke
5b51d754d0 st/mesa: Optionally override RGB/RGBX dst alpha blend factors
Intel's blending hardware does not properly return 1.0 for destination
alpha for RGBX formats; it requires the factors to be overridden to
either zero or one.  Broadcom vc4 and v3d also could use this override.
While overriding these factors is safe in general, Nouveau and Radeon
would prefer not to.  Their blending hardware already returns correct
values for RGB/RGBX formats, and would like to avoid the resulting
per-buffer blending and independent blend factors (rgb != a) since it
can cause additional overhead.

I considered simply handling this in the driver, but it's not as nice.
pipe_blend_state doesn't have any format information, so we'd need the
hardware blend state to depend on both pipe_blend_state and
pipe_framebuffer_state.  Furthermore, Intel GPUs don't have a native
RGBX_SNORM format, so I avoid exposing one, which makes Gallium fall
back to RGBA_SNORM.  The pipe_surfaces we get in the driver have an RGBA
format, making it impossible to tell that there shouldn't be an alpha
channel.  One could argue that st not handling it in that case is a bug.
To work around this, we'd have to expose RGBX pipe formats, mapped to
RGBA hardware formats, and add format swizzling special cases.  All
doable, but it ends up being more code than I'd like.

st_atom_blend already has access to the right information and it's
trivial to accomplish there, so we just add a cap bit and do that.
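
A rough sketch of such an override follows. The enum is a hypothetical
subset with made-up names, and the mapping is inferred from the
description above (destination alpha reads as 1.0 for RGBX); it is not
copied from st_atom_blend.

```c
#include <assert.h>

/* Hypothetical subset of blend factors; names are illustrative only. */
enum sketch_blendfactor {
   SKETCH_BLEND_ZERO,
   SKETCH_BLEND_ONE,
   SKETCH_BLEND_SRC_ALPHA,
   SKETCH_BLEND_DST_ALPHA,
   SKETCH_BLEND_INV_DST_ALPHA,
};

/* For a format without alpha, destination alpha must behave as 1.0,
 * so replace factors that would read it with constants. */
static enum sketch_blendfactor
sketch_fix_xrgb_alpha(enum sketch_blendfactor factor)
{
   assert(factor <= SKETCH_BLEND_INV_DST_ALPHA);
   switch (factor) {
   case SKETCH_BLEND_DST_ALPHA:      /* dst alpha == 1.0 */
      return SKETCH_BLEND_ONE;
   case SKETCH_BLEND_INV_DST_ALPHA:  /* 1.0 - dst alpha == 0.0 */
      return SKETCH_BLEND_ZERO;
   default:                          /* does not read dst alpha */
      return factor;
   }
}
```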

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-15 20:53:44 -08:00
Marek Olšák
11735d6c9c winsys/amdgpu: fix whitespace
2019-01-15 19:10:16 -05:00
Pierre Moreau
0b736f7fd4 meson: Fix with_gallium_icd to with_opencl_icd
`with_gallium_icd` is never used throughout the different Meson build
files, whereas `with_opencl_icd` tracks whether or not `gallium-opencl`
was set to "icd".

Fixes: 42ea0631f1
         ("meson: build clover")
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-15 13:06:50 -08:00
Kenneth Graunke
d644698b44 gallium: Add the ability to query a single pipeline statistics counter
Gallium historically has treated pipeline statistics queries as a single
query, PIPE_QUERY_PIPELINE_STATISTICS, which returns a block of 11
values.  This was originally patterned after the D3D1x API.  Much later,
Brian introduced an OpenGL extension that exposed these counters - but
it exposes 11 separate queries, each of which returns a single value.

Today, st/mesa simply queries all 11 values, and returns a single value.
While pipeline statistics counters aren't typically performance
critical, this is still not a great fit.  A D3D1x->GL translator might
request all 11 counters by creating 11 separate GL queries...which
Gallium would map to reads of all 11 values each time, resulting in a
total 121 counter reads.  That's not ideal.
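
The 121 figure can be checked with a toy model. Everything here is
hypothetical: read_counter() stands in for one hardware counter read,
and the two query functions model the block and per-counter paths.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_STATS 11

/* Running total of simulated hardware counter reads. */
static unsigned counter_reads;

/* Stand-in for reading one hardware counter. */
static uint64_t
read_counter(unsigned index)
{
   assert(index < SKETCH_NUM_STATS);
   counter_reads++;
   return 100 + index;
}

/* Block-style query: fetch all 11 counters, return the one asked for. */
static uint64_t
query_via_block(unsigned index)
{
   uint64_t values[SKETCH_NUM_STATS];
   for (unsigned i = 0; i < SKETCH_NUM_STATS; i++)
      values[i] = read_counter(i);
   return values[index];
}

/* Single-style query: fetch only the requested counter. */
static uint64_t
query_single(unsigned index)
{
   return read_counter(index);
}
```

Issuing 11 queries through query_via_block() costs 11 * 11 = 121 reads,
while the same 11 queries through query_single() cost 11.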

This patch adds a new cap, PIPE_CAP_QUERY_PIPELINE_STATISTICS_SINGLE,
and corresponding query type PIPE_QUERY_PIPELINE_STATISTICS_SINGLE.
When calling create_query(), q->index should be set to one of the
PIPE_STAT_QUERY_* enums to select a counter.  Unlike the block query,
this returns the value in pipe_query_result::u64 (as it's a single
value) instead of the pipe_query_data_pipeline_statistics group.

We update st/mesa to expose ARB_pipeline_statistics_query if either
capability is set, preferring the new SINGLE variant when available.

Thanks to Roland, Ilia, and Marek for helping me sort this out.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-15 11:43:04 -08:00
Kenneth Graunke
f967273fb4 st/mesa: Rearrange PIPE_QUERY_PIPELINE_STATISTICS result fetching.
This just changes the order of the switch statements, so we only
look at target if the query type is PIPE_QUERY_PIPELINE_STATISTICS.

The next commit will introduce a new SINGLE query type which can be
used for the same GL query types, and it won't want this processing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-15 11:43:04 -08:00
Kenneth Graunke
e760be08b4 st/mesa: Make an enum for pipeline statistics query result indices.
Gallium handles pipeline statistics queries as a single query
(PIPE_QUERY_PIPELINE_STATISTICS) which returns a struct with 11 values.
Sometimes it's useful to refer to each of those values individually,
rather than as a group.  To avoid hardcoding numbers, we define a new
enum for each value.  Here, the name and enum value correspond to the
index in the struct pipe_query_data_pipeline_statistics result.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-15 11:43:04 -08:00
Dylan Baker
4a131a1330 meson: Add a script to extract the cmd line used for meson
Upstream I'm pursuing a more comprehensive solution, but this should
prove a suitable stop-gap measure in the meantime.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109325
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-15 17:38:47 +00:00
Samuel Pitoiset
7bef192018 radv: add support for VK_EXT_memory_budget
A simple Vulkan extension that allows apps to query size and
usage of all exposed memory heaps.

The different usage values are not really accurate because
they are per drm-fd, but they should be close enough.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:37 +01:00
Samuel Pitoiset
9784400a6b radv: add two small helpers for getting VRAM and visible VRAM sizes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:35 +01:00
Samuel Pitoiset
a6e5ce5130 radv: remove unnecessary returns in GetPhysicalDevice*Properties()
These functions return nothing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:17 +01:00
Bas Nieuwenhuizen
568e7a2998 radv: Set partial_vs_wave for pipelines with just GS, not tess.
Looking at -pro we need to enable it for pipelines with just a
GS too.

This seems to reduce the hangs from
https://bugs.freedesktop.org/show_bug.cgi?id=109242 on a RX 550 to
the point where I can't reproduce, after the false start with the
wd_switch_on_eop patch due to flakiness.

(but people are reporting it does not fix the issue completely for
 them on polaris 11)

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-15 10:22:30 +01:00
Marek Olšák
5183e794af radeonsi: also apply the GS hang workaround to draws without tessellation
ported from AMDVLK.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 18:55:58 -05:00
Eric Anholt
bd09bb1629 v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.
We don't have a way to talk to RO about modifiers it can do yet, so assume
the minimum.
2019-01-14 15:40:55 -08:00
Eric Anholt
f72820c851 v3d: Add support for CS barrier() intrinsics.
2019-01-14 15:40:55 -08:00
Eric Anholt
9b45b06d7c v3d: Add support for CS shared variable load/store/atomics.
CS shared variables are handled effectively as SSBO access to a temporary
buffer that will be allocated at CS dispatch time.
2019-01-14 15:40:55 -08:00
Eric Anholt
01d913cf90 v3d: Add support for CS workgroup/invocation id intrinsics.
We get a payload for the ivec3 workgroup and an int local invocation
index, and we use the core lowering to turn them into the global invocation id
and the local invocation id ivec3s.
2019-01-14 15:40:55 -08:00
Eric Anholt
6281f26f06 v3d: Add support for shader_image_load_store.
This is only exposed on V3D 4.1+, because we didn't have the TMU write
operations for images on 3.3 (To do GLES 3.1 there, you have to lower it
to SSBO load/stores, which is a problem to solve later).
2019-01-14 15:40:55 -08:00
Eric Anholt
5932c2f0b9 v3d: Add SSBO/atomic counters support.
So far I assume that all the buffers get written.  If they weren't, you'd
probably be using UBOs instead.
2019-01-14 15:40:55 -08:00
Eric Anholt
6c8edcb89c v3d: Drop the GLSL version level.
This was an arbitrary "we support lots of stuff" value when I started the
driver.  However, at 400 we expose OES_gpu_shader5, which claims support
for dynamically indexing samplers, which the driver doesn't do yet.
2019-01-14 13:18:02 -08:00
Eric Anholt
1a63227ea0 v3d: Add support for matrix inputs to the FS.
We've been relying on linking splitting up our varying matrices into
separate vectors, but with SSO that doesn't happen.  Supporting matrix
inputs isn't too hard, though.
2019-01-14 13:18:02 -08:00
Eric Anholt
49b7e26fac v3d: Add an isr to the simulator to catch GMP violations.
Otherwise, the simulator raises the GMP interrupt and waits for it to be
handled, and v3d ends up spinning in v3d_hw_tick().  Aborting right when
the violation happens gives us a chance to look at the backtrace of whatever
thread triggered the violation.
2019-01-14 13:18:02 -08:00
Eric Anholt
3790ee07e6 v3d: Fix txf_ms 2D_ARRAY array index.
We need to pass the array index through our coordinate transform
unchanged.  Fixes
dEQP-GLES31.functional.texture.multisample.samples_1.*_2d_array
2019-01-14 13:18:02 -08:00
Eric Anholt
619a28b845 v3d: Add support for GL_ARB_framebuffer_no_attachments.
Fixes
dEQP-GLES31.functional.state_query.integer.max_framebuffer_height_getboolean
when GLES3 is enabled.
2019-01-14 13:18:02 -08:00
Eric Anholt
051a41d3d5 v3d: Add support for the early_fragment_tests flag.
If this flag hasn't been set by the shader and it has some visible side
effects, then we need to disable EZ.
2019-01-14 13:18:02 -08:00
Eric Anholt
b417a9f7b2 v3d: Add support for flushing dirty TMU data at job end.
This will be needed for SSBOs and image_load_store.
2019-01-14 13:18:02 -08:00
Samuel Pitoiset
ad6ceb2872 ac: add missing 16-bit types to glsl_base_to_llvm_type()
Fix crashes with
dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.*16

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 21:18:23 +01:00
Bas Nieuwenhuizen
76b12fa564 radv: Only use 32 KiB per threadgroup on Stoney.
Causes hangs on some machines.

What works for dEQP-VK.tessellation.shader_input_output.barrier:

- running num_patches = 6 (which limits LDS to 32 KiB)
- running num_patches = 8, and artificially cutting LDS size at 32 KiB.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-14 19:58:27 +00:00
Marek Olšák
76df5e8f52 st/dri: fix dri2_format_table for argb1555 and rgb565
The bug caused rgb565 framebuffers to use argb1555.

Fixes: 433ca3127a

Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-14 14:54:19 -05:00
Jason Ekstrand
2d2737dcfe nir: Add a bool to float32 lowering pass
From @jekstrand's nir-1-bit-bool branch, with improved ior/inot lowering.

ior: fmax instead of fadd allows removing the fsat.

inot: seq(x, 0) can be better than fsub(1, x). On a2xx, it works better
with the scalar instruction set.
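
The two equivalences can be checked on the host. This sketch assumes
booleans are encoded as exactly 0.0f or 1.0f and uses plain C helpers
with made-up names in place of the NIR opcodes.

```c
#include <assert.h>

/* Clamp to [0, 1], standing in for fsat. */
static float
sketch_fsat(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* ior via add: 1 + 1 = 2, so the sum needs a saturate. */
static float
ior_fadd(float a, float b)
{
   return sketch_fsat(a + b);
}

/* ior via max: the result is already 0 or 1, no saturate needed. */
static float
ior_fmax(float a, float b)
{
   return a > b ? a : b;
}

/* inot via subtract. */
static float
inot_fsub(float x)
{
   assert(x == 0.0f || x == 1.0f);
   return 1.0f - x;
}

/* inot via "set if equal to zero". */
static float
inot_seq(float x)
{
   return x == 0.0f ? 1.0f : 0.0f;
}
```

For all four input combinations the fmax and fadd forms of ior agree,
and for both inputs the seq and fsub forms of inot agree.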

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-01-14 19:27:06 +00:00
Caio Marcelo de Oliveira Filho
09c3ff01df src/intel: use new hash table and set creation helpers
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:33 -08:00
Caio Marcelo de Oliveira Filho
9fdded0cc3 src/compiler: use new hash table and set creation helpers
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:28 -08:00
Caio Marcelo de Oliveira Filho
ee23e8b17c util: Helper to create sets and hashes with pointer keys
These combinations are common enough and deserve a shortcut.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:21 -08:00
Samuel Pitoiset
929df7afaf ac/nir: set cache policy when loading/storing buffer images
This was missing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 17:59:51 +01:00
Samuel Pitoiset
af2a85df74 ac/nir: add get_cache_policy() helper and use it
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 17:59:49 +01:00
Jason Ekstrand
5e4f9ea363 anv: Implement VK_KHR_depth_stencil_resolve
2019-01-14 10:16:52 -06:00
Jason Ekstrand
9f44088468 anv: Move resolve_subpass to genX_cmd_buffer.c
We may have to do transitions around certain kinds of resolves so it
helps to have it in genX code.
2019-01-14 10:16:52 -06:00
Jason Ekstrand
930b17161f anv/blorp: Refactor MSAA resolves into an exportable helper function
This function is modeled after the aux_op functions except that it has a
lot more parameters because it deals with two images as well as source
and destination regions.
2019-01-14 10:16:52 -06:00
Jason Ekstrand
c92c449361 anv: Rename has_resolve to has_color_resolve
2019-01-14 10:16:52 -06:00
Jason Ekstrand
4bd976e3b8 intel/blorp: Add two more filter modes
2019-01-14 10:16:52 -06:00
Andres Gomez
3ec9ab80b8 bin/get-pick-list.sh: fix redirection in sh
"&>" is bash specific.

Fixes: e0dbfc9953 ("bin/get-pick-list.sh: warn when commit lists invalid sha")
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-01-14 17:40:15 +02:00
Andres Gomez
716ed41a36 bin/get-pick-list.sh: fix the oneline printing
"--summary" will also print extended header information such as
creations, renames and mode changes.

Let's just use "--no-patch", which suppresses the diff output.

v2: Use "--no-patch" instead of the "-s" abbreviation (Eric).

Fixes: 559c32d241 ("bin/get-pick-list.sh: simplify git oneline printing")
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-01-14 17:36:56 +02:00
Michel Dänzer
1a20b56798 amd/common: Restore v4i32 suffix for llvm.SI.load.const intrinsic
It was accidentally dropped in commit e4803ab7d2 "amd/common: use
llvm.amdgcn.s.buffer.load for LLVM 8.0", breaking the universe with LLVM
7.

Trivial.
2019-01-14 12:52:52 +01:00
Nicolai Hähnle
7fbd48fdc0 amd/common/vi+: enable SMEM loads with GLC=1
Only on LLVM 8.0+, which supports the new intrinsic.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 08:30:15 +01:00
Nicolai Hähnle
e4803ab7d2 amd/common: use llvm.amdgcn.s.buffer.load for LLVM 8.0
llvm.SI.load.const is deprecated.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 08:30:12 +01:00
Iago Toral Quiroga
1c1ae6376c anv/pipeline_cache: free NIR shader cache
Fixes: f6aa9f7185 'anv/pipeline_cache: Add support for caching NIR'
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-14 07:59:27 +01:00
Danylo Piliaiev
0862929bf6 glsl: Fix copying function's out to temp if dereferenced by array
Function's out variable could be an array dereferenced by an array:
 func(v[w[i]]);
or something more complicated.

Copy index in any case.

Fixes: 76c27e47b9 ("glsl: Copy function out to temp if we don't directly ref a variable")

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-14 12:04:07 +11:00
Kenneth Graunke
04c2f12ab2 i965: Drop mark_surface_used mechanism.
The original idea was that the backend compiler could eliminate
surfaces, so we would have it mark which ones are actually used,
then shrink the binding table accordingly.  Unfortunately, it's a
pretty blunt mechanism - it can only prune things from the end,
not the middle - since we decide the layout before we even start
the backend compiler, and only limit the size.  It also basically
gives up if it sees indirect array access.

Besides, we do the vast majority of our surface elimination in NIR
anyway, not the backend - and I don't see that trend changing any
time soon.  Vulkan abandoned this plan a long time ago, and I don't
use it in Iris, but it's still been kicking around in i965.

I hacked shader-db to print the binding table size in bytes, and
observed no changes with this patch.  So, this code appears to do
nothing useful.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-13 09:35:32 -08:00
Eric Engestrom
bdf6a5c1d2 egl: fix python lib deprecation warning
DeprecationWarning: the imp module is deprecated in favour of importlib

Instead of complicated logic, just import the file directly.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-13 13:59:08 +00:00
Jason Ekstrand
b938d5fbef spirv: Emit switch conditions on-the-fly
Instead of emitting all of the conditions for the cases of a switch
statement up-front, emit them on-the-fly as we emit the code for each
case.  The original justification for this was that we were going to
have to build a default case anyway which would need them all.  However,
we can just trust CSE to clean up the mess in that case.  Emitting each
condition right before the if statement that uses it reduces register
pressure and, in one customer benchmark, reduces spilling and improves
performance by about 2x.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
821b6861ec nir/gcm: Support deref instructions
Even though no one's been brave enough to ever use this pass, I like to
keep it functionally working.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
24c8108ea6 intel/nir: Call nir_opt_deref in brw_nir_optimize
It's an optimization so we should probably be calling it in the
optimization loop.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
e57e26121a spirv: Contain the GLSLang issue #179 workaround to old GLSLang
Instead of applying the workaround universally, detect semi-old GLSLang
via the generator ID and only enable the workaround on old GLSLang.
This isn't nearly as precise as one would like it to be because the
first GLSLang generator id version bump was on October 7, 2017 which is
about 1.5 years after the bug was fixed.  However, it at least lets us
disable it for non-GLSLang and for more modern versions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
b57c1ec421 spirv: Whack sampler/image pointers to uniform
A long time ago in a galaxy far, far away, there was a GLSLang bug with how
it handled samplers passed in as function parameters.  (The bug can be
found here: https://github.com/KhronosGroup/glslang/issues/179.)
Unfortunately, that version was shipped in several apps and has been
causing heartburn for our SPIR-V parser ever since.

Recent changes to NIR uncovered a moderately old bug in how we work
around this issue.  In particular, we ended up with a deref_cast from
uniform to local which is not a no-op cast so nir_opt_deref wasn't
getting rid of the cast.  The only reason why it worked before was
because someone just happened to call nir_fixup_deref_modes which
"fixed" the cast (that shouldn't be happening) and then a later round of
copy-prop would get rid of it.  The fact that the deref_cast survived
that long without causing trouble for other parts of NIR is a bit
surprising.

Just whacking the mode of the pointer seems to fix it fairly
unobtrusively.  Currently, only apps with this bug will have a local
variable containing an image or sampler.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109304
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Kenneth Graunke
2b876bc922 st/nir: Lower TES gl_PatchVerticesIn to a constant if linked with a TCS.
If the TCS and TES are linked together, we can simply replace the TES's
gl_PatchVerticesIn system value with a constant, possibly allowing extra
optimization or letting the driver avoid uploading a special value.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-01-11 13:07:54 -08:00
Jonathan Marek
3d182601bb glsl/nir: keep bool types when native_integers=false
With the new handling of bool types, the conversion to float in glsl_to_nir
should not apply to bool types anymore.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-11 19:16:11 +00:00
Jonathan Marek
b27ad17115 glsl/nir: ftrunc for native_integers=false float to int cast
out_type in the default cast case is always GLSL_TYPE_FLOAT, so we get a
mov otherwise.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-11 19:16:11 +00:00
Jonathan Marek
d3b47e073e glsl/nir: int constants as float for native_integers=false
All alu instructions emitted with native_integers=false expect float
(or bool in some cases) constants, so this change is necessary.

This will cause changes with some intrinsics which had integer sources,
such as nir_intrinsic_load_uniform. Apparently it might cause issues with
some opt passes, but perhaps those don't apply in OpenGL ES 2.0 cases?

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-11 19:16:11 +00:00
Jason Ekstrand
1ede463b6e intel/peephole_ffma: Fix swizzle propagation
The num_components value passed into get_mul_for_src is used to only
compose the parts of the swizzle that we know will be used so we don't
compose invalid swizzle components.  However, we had a bug where we
passed the number of components of the add all the way through.  For the
given source, we need the number of components read from that source.
In the case where we have a narrow add, say 2 components, that is
sourced from a chain of wider instructions, we may not compose all the
swizzles.  All we really need to do is pass through the right number of
components at each level.

Fixes: 2231cf0ba3 "nir: Fix output swizzle in get_mul_for_src"
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-01-11 10:44:08 -06:00
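The idea in 1ede463b6e of composing only the swizzle channels that are actually read can be sketched as below. This is a minimal illustration and not the code in the Intel ffma peephole pass: the function name, the list representation of a swizzle, and the example values are all assumptions made for the sketch.

```python
# Hypothetical sketch: compose two 4-channel swizzles, but only for the
# channels that the consumer actually reads. Not the Mesa implementation.

def compose_swizzle(outer, inner, num_components):
    """Return a swizzle whose channel i selects inner[outer[i]].

    Only the first `num_components` channels are composed; the remaining
    channels are copied from `outer` untouched, since they are never read
    and may not hold valid indices into `inner`.
    """
    composed = list(outer)
    for i in range(num_components):
        composed[i] = inner[outer[i]]
    return composed

# A 2-component consumer reading channels z,w of a value that was itself
# produced through a reversing swizzle (w,z,y,x).
narrow = compose_swizzle([2, 3, 0, 0], [3, 2, 1, 0], 2)
```

The point of the commit is that `num_components` has to be the count read from the source at each level of the chain, not the width of the final add.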
Kenneth Graunke
ae683ed3bc nir: Allow a non-existent sampler deref in nir_lower_samplers_as_deref
GL_ARB_gl_spirv does not provide a sampler deref for e.g. texelFetch(),
so we can't assume that both are present and identical.  Simply lower
each if it is present.

Fixes regressions in GL_ARB_gl_spirv tests since I switched everyone to
using this pass.  Thanks to Alejandro Piñeiro for catching these.

Fixes: f003859f97 nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-11 07:54:32 -08:00
Eric Engestrom
e12b0b5c6d travis: avoid using unset llvm-config
Fixes the following errors:
  usage: which [-as] program ...
  /Users/travis/.travis/job_stages: line 110: --version: command not found

... caused by the use of an undefined $LLVM_CONFIG

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-11 14:38:35 +00:00
Eric Engestrom
c8ae891035 egl: remove unused include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-11 14:37:47 +00:00
Eric Engestrom
d75fbff667 egl: add missing includes
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-11 14:37:47 +00:00
Iago Toral Quiroga
4b1e436bc9 anv/pipeline_cache: fix incorrect guards for NIR cache
Fixes: f6aa9f7185 'anv/pipeline_cache: Add support for caching NIR'
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-11 12:45:18 +01:00
Kenneth Graunke
ad9832d17b blorp: Pass the batch to lookup/upload_shader instead of context
This will allow drivers to pin shader buffers if necessary.

i965 and anv do not need to do this today, but iris will.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-10 20:52:04 -08:00
Kenneth Graunke
084a1cdbb7 blorp: Add blorp_get_surface_address to the driver interface.
Currently, BLORP expects drivers to provide two functions for dealing
with buffers: blorp_emit_reloc and blorp_surface_reloc.  Both record a
relocation and combine the BO address and offset into a full 64-bit
address.  Traditionally, blorp_surface_reloc has written that combined
address to an implicitly-known buffer where surface states are stored.
(In contrast, blorp_emit_reloc returns the value.)

The upcoming Iris driver stores surface states in multiple buffers,
which makes it impossible for blorp_surface_reloc to write the combined
address - it only takes an offset, not the actual buffer to write to.

This commit adds a third function, blorp_get_surface_address, which
combines and returns an address, which is then passed to ISL's surface
state fill functions.  Softpin-only drivers can return a real address
here and skip writing it in blorp_surface_reloc.  Relocation-based
drivers have options.  They can simply return 0 from the new
function, and continue writing the address from blorp_surface_reloc.
Or, they can return a presumed address from blorp_get_surface_address,
and have other relocation processing write the real value later.

For now, i965 and anv simply return 0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-10 20:51:53 -08:00
Ilia Mirkin
2165636e9c docs: fix gallium screen cap docs
Make sure that the next line starts with spaces so that bullets are
maintained throughout, add `` around a few more special tokens, and fix
SAMPLE_COUNT_TEXTURE -> SAMPLE_COUNT.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-01-10 21:44:09 -05:00
Danylo Piliaiev
a2db6b4254 glsl: Make invariant outputs in ES fragment shader not to cause error
In all GLSL ES versions output variables in fragment shader are allowed
to be invariant.

 From Section 4.6.1 ("The Invariant Qualifier") GLSL ES 1.00 spec:
 "Only the following variables may be declared as invariant:
   ...
   - Built-in special variables output from the fragment shader."

 From Section 4.6.1 ("The Invariant Qualifier") GLSL ES 3.00 spec:
 "Only variables output from a shader can be candidates for invariance."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107842
2019-01-11 13:01:11 +11:00
Jason Ekstrand
eb4b1477dc anv/pipeline: Cache the pre-lowered NIR
This adds a second level of caching for the pre-lowered NIR that's only
based off of the shader module, entrypoint and specialization constants.
This is enough for spirv_to_nir as well as our first round of lowering
and optimization.  Caching at this level should allow for faster shader
recompiles due to state changes.

The NIR caching does not get serialized to disk via either the
VkPipelineCache serialization mechanism or the transparent on-disk
cache.  We could but it's usually not that expensive to fall back to
SPIR-V for the odd cache miss especially if it only happens once for
several misses and it simplifies the cache.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Jason Ekstrand
f6aa9f7185 anv/pipeline_cache: Add support for caching NIR
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Jason Ekstrand
8dfda5ebbe anv/pipeline: Hash shader modules and spec constants separately
The stuff hashed by anv_pipeline_hash_shader is exactly the inputs to
anv_shader_compile_to_nir so it can be used for NIR caching.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Jason Ekstrand
b90e55a5d5 compiler/types: Serialize/deserialize subpass input types correctly
They have glsl_sampler_dim enum values of 8 and 9 which don't work when
you & them with 0x7.  Fortunately, we have plenty of bits.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
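The truncation described in b90e55a5d5 can be illustrated with a small sketch. Only the enum values 8 and 9 and the 0x7 mask come from the commit message; the names, the helper functions, and the 4-bit replacement mask are assumptions for illustration and do not reproduce the real type serialization code.

```python
# Hypothetical sketch of packing a sampler dimensionality into a bit field.
# Values 8 and 9 and the 0x7 mask are from the commit message; everything
# else (names, the wider 0xF mask) is assumed for illustration.
SUBPASS = 8
SUBPASS_MS = 9

def pack_dim(dim, mask):
    """Keep only the bits of `dim` that fit under `mask`."""
    return dim & mask

def roundtrips(dim, mask):
    """True if the value survives packing unchanged."""
    return pack_dim(dim, mask) == dim

# With a 3-bit field both subpass values are silently truncated.
narrow = [roundtrips(d, 0x7) for d in (SUBPASS, SUBPASS_MS)]
# With one more bit they survive.
wide = [roundtrips(d, 0xF) for d in (SUBPASS, SUBPASS_MS)]
```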
Jason Ekstrand
73ddfbeb85 anv/pipeline: Move wpos and input attachment lowering to lower_nir
This lets us make anv_pipeline_compile_to_nir take a device instead of a
pipeline.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Matt Turner
32e266a9a5 i965: Compile fp64 funcs only if we do not have 64-bit hardware support
Brown bag fix...
2019-01-10 15:22:17 -08:00
Jason Ekstrand
8ea8727a87 anv/pipeline: Constant fold after apply_pipeline_layout
Thanks to the new NIR load_descriptor intrinsic added by the UBO/SSBO
lowering series, we weren't getting UBO pushing because the UBO range
detection pass couldn't see the constants it needed.  This fixes that
problem with a quick round of constant folding.  Because we're folding
we no longer need to go out of our way to generate constants when we
lower the vulkan_resource_index intrinsic and we can make it a bit
simpler.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 20:34:00 +00:00
Rob Clark
031e94dc72 freedreno/a6xx: fix 3d+tiled layout
The last round of fixing 3d layer+level layout skipped the tiled case,
since tiled texture support was not in place yet.  This finishes the
job.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Rob Clark
c92c18c70c freedreno/a6xx: move tile_mode to sampler-view CSO
This is known when the CSO is created, so no need to patch it in later.

Also, for smaller textures where the first level is small enough to be
linear, it seems like we should set linear tile mode.

See: dEQP-GLES3.functional.texture.format.unsized.rgb_unsigned_byte_3d_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Rob Clark
eb625d30b7 freedreno/a6xx: separate stencil restore/resolve fixes
Previously we'd use format/etc from the primary (z32) buffer for the
stencil (s8), due to confusion about rsc vs psurf.  Rework this to drop
extra arg and push down handling of separate stencil case (and make sure
we take the fmt from the right place).

This doesn't completely fix separate-stencil, but at least it avoids the
GPU scribbling over random other cmdstream buffers and causing a bunch
of bogus fails in dEQP.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Rob Clark
04aff7e42b freedreno: make cmdstream bo's read-only to GPU
If nothing else, this will make problems with cmdstream getting blit
over with pixels easier to track down (ie. faults when it first happens
rather than strange failures later from corrupted cmdstream when a
stateobj is later reused).

(NOTE this somewhat depends on the kernel supporting the flag, and the
iommu implementation.  But the worst case is just that the cmdstream
ends up writeable as before.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Guido Günther
286de96af8 etnaviv: fix typo in cflush_all description
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-01-10 18:46:10 +01:00
Eric Engestrom
53fbde4df3 radv: remove a few more unnecessary KHR suffixes
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2019-01-10 16:53:44 +00:00
Rhys Perry
0210243923 nir: fix copy-paste error in nir_lower_constant_initializers
Fixes: 393b59e077
    ('nir: Rework nir_lower_constant_initializers() to handle functions')
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-10 10:51:52 -06:00
Andres Gomez
6c3164cd08 docs: complete the calendar and release schedule documentation
As suggested by Emil Velikov.

Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-01-10 15:53:02 +02:00
Andres Gomez
428164d87f glsl/linker: specify proper direction in location aliasing error
The check for location aliasing was always assuming output variables
but this validation is also called for input variables.

Fixes: e2abb75b0e ("glsl/linker: validate explicit locations for SSO programs")
Cc: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-10 15:51:57 +02:00
Andres Gomez
e2e03f84f9 editorconfig: Add max_line_length property
The property is supported by most editors, but not all:
https://github.com/editorconfig/editorconfig/wiki/EditorConfig-Properties#max_line_length

Cc: Eric Engestrom <eric@engestrom.ch>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-10 15:50:34 +02:00
Tapani Pälli
864cc419eb intel/isl: move tiled_memcpy static libs from i965 to isl
Patch moves intel_tiled_memcpy[_sse41] libraries to isl, renames some
functions and types and makes the required build system changes for
meson, automake and Android. No functional changes are introduced.

v2: code cleanups, move isl_get_memcpy_type to i965 (Jason)
v3: move isl_mem_copy_fn to priv header, cleanups (Jason, Dylan)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-10 08:02:30 +02:00
Matt Turner
406f603b34 i965: Enable 64-bit GLSL extensions
Now that we have software implementations of ARB_gpu_shader_int64 and
ARB_gpu_shader_fp64 we can unconditionally enable these extensions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
613ac3aaa2 i965: Compile fp64 software routines and lower double-ops
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
18b4e87370 intel/compiler: Heap-allocate temporary storage
Shaders containing software implementations of double-precision
operations can be very large such that we cannot stack-allocate
an array of grf_count*16.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
622d429128 intel/compiler: Expand size of the 'nr' field
Shaders containing software implementations of double-precision
operations can be very large such that we have more than 2^16 virtual
registers during optimization.

Move the 'nr' field to the union containing the immediate storage and
expand it to 32-bits.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
7e4e9da90d intel/compiler: Prevent warnings in the following patch
The next patch replaces an unsigned bitfield with a plain unsigned,
which triggers gcc to begin warning on signed/unsigned comparisons.

Keeping this patch separate from the actual move allows bisectability and
generates no additional warnings temporarily.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
2b801b6668 intel/compiler: Rearrange code to avoid future problems
A follow on commit will move nr to the same union as the immediate
data, so we should assert these invariants before we overwrite the nr
field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
3b967e1724 intel/compiler: Avoid false positive assertions
A follow on patch will move the 'nr' field to the union containing the
immediate field, so prepare by checking that we're only testing these
assertions if the .file is correct.

The assertions with != ARF were kind of silly to begin with because the
<128 check is specifically only for things in the GRF.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
8534742404 intel/compiler: Split 64-bit MOV-indirects if needed
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Matt Turner
e76772af6c intel/compiler: Lower 64-bit MOV/SEL operations
2019-01-09 16:42:40 -08:00
Matt Turner
2623653126 nir: Unset metadata debug bit if no progress made
NIR metadata validation verifies that the debug bit was unset (by a call
to nir_metadata_preserve) if a NIR optimization pass made progress on
the shader. With the expectation that the NIR shader consists of only a
single main function, it has been safe to call nir_metadata_preserve()
iff progress was made.

However, most optimization passes calculate progress per-function and
then return the union of those calculations. In the case that an
optimization pass makes progress only on a subset of the functions in
the shader, metadata validation will detect the debug bit is still set on
any unchanged functions, resulting in a failed assertion.

This patch offers a quick solution (short of a larger scale refactoring
which I do not wish to undertake as part of this series) that simply
unsets the debug bit on unchanged functions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
e633fae5cb nir: Add lowering support for 64-bit operations to software
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
fe2cbcf3ee nir: Create nir_builder in nir_lower_doubles_impl()
We're going to use it more in a future patch, and this avoids a lot of
gross code.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
ecb115eb3f nir: Add and set info::uses_64bit
Will be used to communicate that a shader uses 64-bit operations to the
concerned lowering passes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
41f3e9e5f5 nir: Implement lowering of 64-bit shift operations
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
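A 64-bit left shift built from 32-bit operations, the kind of lowering 41f3e9e5f5 refers to, might look like the sketch below. This is the general textbook technique written in Python, not the NIR code; the function name and the (lo, hi) pair representation are assumptions.

```python
# Hypothetical sketch: 64-bit shift-left expressed with 32-bit halves.
# General technique only; names and representation are assumed.
MASK32 = 0xFFFFFFFF

def lower_ishl64(lo, hi, count):
    """Shift the 64-bit value (hi:lo) left by `count` bits, returning (lo, hi)."""
    count &= 63                       # shift counts wrap at the bit width
    if count == 0:
        return lo, hi                 # avoid a 32-bit shift by 32 below
    if count >= 32:
        # Every low bit moves into the high word; the low word becomes zero.
        return 0, (lo << (count - 32)) & MASK32
    new_lo = (lo << count) & MASK32
    # Bits shifted out of the low word carry into the high word.
    new_hi = ((hi << count) | (lo >> (32 - count))) & MASK32
    return new_lo, new_hi
```

The three cases (zero, below 32, 32 and above) are kept separate because a 32-bit shift by 32 is not well defined on most hardware.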
Matt Turner
62d55f1281 nir: Wire up int64 lowering functions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Jason Ekstrand
adab27e741 nir: Add some more int64 lowering helpers
[mattst88]: Found in an old branch of Jason's.

Jason implemented: inot, iand, ior, iadd, isub, ineg, iabs, compare,
                   imin, imax, umin, umax
Matt implemented:  ixor, bcsel, b2i, i2b, i2i8, i2i16, i2i32, i2i64,
                   u2u8, u2u16, u2u32, u2u64, and fixed ilt

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
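As a rough illustration of what helpers such as iadd and ineg in adab27e741 have to do, here is 64-bit addition and negation written in terms of 32-bit halves. This is a sketch of the general carry-propagation technique, not the NIR helpers themselves; all names are assumptions.

```python
# Hypothetical sketch: 64-bit integer ops built from 32-bit halves.
# General technique only; not the NIR lowering code.
MASK32 = 0xFFFFFFFF

def split64(x):
    """Split a 64-bit value into (lo, hi) 32-bit words."""
    return x & MASK32, (x >> 32) & MASK32

def join64(lo, hi):
    """Recombine (lo, hi) 32-bit words into one 64-bit value."""
    return (hi << 32) | lo

def lower_iadd64(a, b):
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    lo = (a_lo + b_lo) & MASK32
    # An unsigned wrap-around in the low word means a carry into the high word.
    carry = 1 if lo < a_lo else 0
    hi = (a_hi + b_hi + carry) & MASK32
    return join64(lo, hi)

def lower_ineg64(a):
    # Two's complement negation: invert both halves, then add one.
    lo, hi = split64(a)
    return lower_iadd64(join64(~lo & MASK32, ~hi & MASK32), 1)
```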
Matt Turner
dde73e646f nir: Tag entrypoint for easy recognition by nir_shader_get_entrypoint()
We're going to have multiple functions, so nir_shader_get_entrypoint()
needs to do something a little smarter.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
393b59e077 nir: Rework nir_lower_constant_initializers() to handle functions
Previously it assumed that only a single function (the entrypoint)
existed and attempted to lower constant initializers of shader outputs
for each function, for instance.
2019-01-09 16:42:40 -08:00
Sagar Ghuge
f998ce4111 glsl: Add "built-in" functions to do fp32_to_int64(fp32)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
2632c12477 glsl: Add "built-in" functions to do fp32_to_uint64(fp32)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
876a4b85fe glsl: Add "built-in" functions to do fp64_to_int64(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
21e9bb2b3f glsl: Add utility function to round and pack int64_t value
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
5a674fd789 glsl: Add "built-in" functions to do fp64_to_uint64(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
5a87441807 glsl: Add utility function to round and pack uint64_t value
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
c9d333a6b7 glsl: Add "built-in" functions to do int64_to_fp32(int64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
d5cf6e92b4 glsl: Add "built-in" functions to do uint64_to_fp32(uint64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
b830efb191 glsl: Add "built-in" functions to do int64_to_fp64(int64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
7c5b982b89 glsl: Add "built-in" functions to do uint64_to_fp64(uint64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Matt Turner
15757bc80b glsl: Add "built-in" functions to convert bool to double
And vice versa.

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
e213f3871f glsl: Add "built-in" functions to do ffract(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
5c9a659f50 glsl: Add "built-in" function to do ffloor(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
83762afa66 glsl: Add "built-in" functions to do fmin/fmax(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
92ac2169fb glsl: Add "built-in" functions to do ffma(fp64)
Definitely not actually a fused-multiply add.

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
3db81b5d9f glsl: Add "built-in" functions to do round(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
48891ab441 glsl: Add "built-in" functions to do trunc(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
2119094b1d glsl: Add "built-in" functions to do sqrt(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
cad58fc5e7 glsl: Add "built-in" functions to do fp32_to_fp64(fp32)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
407bd1bbf9 glsl: Add "built-in" functions to do fp64_to_fp32(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
f499942b31 glsl: Add "built-in" functions to do int_to_fp64(int)
v2: use mix
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
773190f281 glsl: Add "built-in" functions to do fp64_to_int(fp64)
v2: use mix

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
cbf090b809 glsl: Add "built-in" functions to do uint_to_fp64(uint)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
a3551ee61f glsl: Add "built-in" functions to do fp64_to_uint(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
4a93401546 glsl: Add "built-in" functions to do mul(fp64, fp64)
v2: use mix
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
f111d72596 glsl: Add "built-in" functions to do add(fp64, fp64)
v2: use mix and findMSB to optimise.
v3: [Sagar] Fix zFrac0 == 0u case in __normalizeRoundAndPackFloat64

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
c036fc97a2 glsl: Add "built-in" functions to do lt(fp64, fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
3e4d5ea7b8 glsl: Add utility function to extract 64-bit sign
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Elie Tournier
ec6e823a99 glsl: Add "built-in" functions to do eq/ne(fp64, fp64)
2019-01-09 16:42:40 -08:00
Elie Tournier
c802cdde9d glsl: Add "built-in" function to do sign(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
eac66f0248 glsl: Add "built-in" functions to do neg(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
0428951b9d glsl: Add "built-in" function to do abs(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
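The software fp64 routines in this series operate on a double stored as two 32-bit words. A minimal sketch of that representation, showing sign extraction and abs, follows. It relies only on the IEEE 754 binary64 layout (sign in the top bit of the high word); the function names are assumptions and the code is not taken from the GLSL source added by these commits.

```python
# Hypothetical sketch: a double handled as two 32-bit words (lo, hi).
# Uses only the IEEE 754 binary64 layout; names are assumed.
import struct

def double_to_words(x):
    """Return the (lo, hi) 32-bit words of a Python float."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits & 0xFFFFFFFF, bits >> 32

def words_to_double(lo, hi):
    """Rebuild a Python float from its (lo, hi) 32-bit words."""
    return struct.unpack("<d", struct.pack("<Q", (hi << 32) | lo))[0]

def extract_sign(hi):
    """The sign is the top bit of the high word."""
    return hi >> 31

def fabs64(lo, hi):
    """Absolute value: clear the sign bit, leave everything else alone."""
    return lo, hi & 0x7FFFFFFF
```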
Matt Turner
b63a1f8e40 glsl: Create file to contain software fp64 functions
The following patches will add implementations of various
double-precision operations to this file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Ian Romanick
412472da5c glsl: Add utility to convert text files to C strings
Will be used to convert the .glsl source file containing software fp64
routines to a .h file that can be included while building the compiler.

This commit contains two squashed together: the first from Ian adding
the utility (with the existing title), and the second from Dylan making
the code both python2 and python3 compatible.

This is somewhat modeled after the xxd utility that comes with Vim.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>

xxd.py: Make python2 and 3 compatible

This makes use of unicode_literals, so that undecorated strings are
considered text in both versions (python2 unicode, python3 str), rather
than bytes in python2 and text in python3. It makes use of io.open,
which provides python2
with python3's open behavior (it's an alias in python3), in particular
support for the 't' and 'b' option. Finally, it decorates all of the
string literals with the 'b' prefix, so that python interprets them as
bytes.

I've removed the stdin and stdout options, as python2 always requires
these to be bytes, but python3 always treats them as text (there is a
way to get at the underlying bytes buffer, but that's even more
complexity), and makes the input files required arguments.

In the meson we use the '@INPUT@' shorthand instead of listing each
input, as meson will expand that to [prog_python, '@INPUT0@', @INPUT1@,
..., @OUTPUT@, ...]
2019-01-09 16:42:40 -08:00
Timothy Arceri
76c27e47b9 glsl: Copy function out to temp if we don't directly ref a variable
Otherwise we can end up with IR that looks like this:

    (
      (declare (temporary ) vec4 f@8)
      (assign  (xyzw) (var_ref f@8)  (var_ref f) )
      (call f16  ((swiz y (var_ref f@8) )))

      (assign  (xyzw) (var_ref f)  (var_ref f@8) )
    ))

When we really need:

      (declare (temporary ) float inout_tmp)
      (assign  (x) (var_ref inout_tmp)  (swiz y (var_ref f) ))
      (call f16  ((var_ref inout_tmp) ))

      (assign  (y) (var_ref f)  (swiz y (swiz xxxx (var_ref inout_tmp) )))
      (declare (temporary ) void void_var)

The GLSL IR function inlining code seemed to produce correct code
even without this but we need the correct IR for GLSL IR -> NIR to
be able to understand what's going on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Matt Turner
63f6d7afd6 glsl: Add function support to glsl_to_nir
Based on a patch from Tim Arceri, but I had to substantially rewrite it
as a result of the NIR derefs rework.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Francisco Jerez
230a8a541d intel/fs: Remove FS_OPCODE_UNPACK_HALF_2x16_SPLIT opcodes.
These are broken on a future platform, but it turns out we don't need
to fix them, since they're just type-converting moves with strided
source.  Kill them.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
cbea91eb57 intel/fs: Remove nasty open-coded CHV/BXT 64-bit workarounds.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
2c99c7a56c intel/fs: Remove existing lower_conversions pass.
It's redundant with the functionality provided by lower_regioning now.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
efa4e4bc5f intel/fs: Introduce regioning lowering pass.
This legalization pass is meant to handle situations where the source
or destination regioning controls of an instruction are unsupported by
the hardware and need to be lowered away into separate instructions.
This should be more reliable and future-proof than the current
approach of handling CHV/BXT restrictions manually all over the
visitor.  The same mechanism is leveraged to lower unsupported type
conversions easily, which obsoletes the lower_conversions pass.

v2: Give conditional modifiers the same treatment as predicates for
    SEL instructions in lower_dst_modifiers() (Iago).  Special-case a
    couple of other instructions with inconsistent conditional mod
    semantics in lower_dst_modifiers() (Curro).

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
b94519971a intel/fs: Constify fs_inst::can_do_source_mods().
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
c301f447ea intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass.
Currently the visitor attempts to enforce the regioning restrictions
that apply to double-precision instructions on CHV/BXT at NIR-to-i965
translation time.  It is possible though for the copy propagation pass
to violate this restriction if a strided move is propagated into one
of the affected instructions.  I've only reproduced this issue on a
future platform but it could affect CHV/BXT too under the right
conditions.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
464e79144f intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.
I triggered this bug while prototyping code for a future platform on
IVB.  Could be a problem today though if a strided move is
copy-propagated into a type-converting move with DF destination.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
bc781a0323 intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.
This seems to be a problem in combination with the lower_regioning
pass introduced by a future commit, which can modify a SIMD-split
instruction causing its execution size to become illegal again.  A
subsequent call to lower_simd_width() would hit this bug on a future
platform.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
812ede088f intel/fs: Implement quad swizzles on ICL+.
Align16 is no longer a thing, so a new implementation is provided
using Align1 instead.  Not all possible swizzles can be represented as
a single Align1 region, but some fast paths are provided for
frequently used swizzles that can be represented efficiently in Align1
mode.

Fixes ~90 subgroup quad swap Vulkan CTS tests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
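
For readers unfamiliar with the operation in 812ede088f, the semantics of a quad swizzle can be sketched as follows. This models only what the operation computes per channel, under the assumption that channels are grouped in fours; it says nothing about how the driver encodes it as Align1 regions, and `quad_swizzle` is a hypothetical name.

```python
def quad_swizzle(values, swz):
    """Within each group of four channels, channel i reads channel swz[i % 4]."""
    assert len(values) % 4 == 0 and len(swz) == 4
    # (i & ~3) is the first channel of the quad that channel i belongs to.
    return [values[(i & ~3) + swz[i & 3]] for i in range(len(values))]


if __name__ == "__main__":
    # Horizontal swap inside each quad of a SIMD8 register.
    print(quad_swizzle(list(range(8)), (1, 0, 3, 2)))
```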
Francisco Jerez
c5f9c0009d intel/fs: Handle source modifiers in lower_integer_multiplication().
lower_integer_multiplication() implements 32x32-bit multiplication on
some platforms by bit-casting one of the 32-bit sources into two
16-bit unsigned integer portions.  This can give incorrect results if
the original instruction specified a source modifier.  Fix it by
emitting an additional MOV instruction implementing the source
modifiers where necessary.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
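
The problem fixed in c5f9c0009d can be illustrated with a small sketch. The decomposition of a 32x32-bit multiply into two 16-bit halves follows the commit message; modelling the failure as a negate modifier being applied to each 16-bit half separately is an assumption made for illustration, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def negate32(x):
    """Two's-complement negation of a 32-bit value."""
    return (-x) & MASK32


def mul32_via_16bit_halves(a, b_bits):
    """Low 32 bits of a * b, using the unsigned 16-bit halves of b."""
    lo = b_bits & 0xFFFF
    hi = (b_bits >> 16) & 0xFFFF
    return (a * lo + ((a * hi) << 16)) & MASK32


def mul_negated_b(a, b, resolve_modifier_first):
    """Compute a * (-b) with the lowering, with or without the extra MOV."""
    if resolve_modifier_first:
        # The extra MOV applies the modifier to the full 32-bit value.
        return mul32_via_16bit_halves(a, negate32(b))
    # Assumed failure mode: the modifier acts on each 16-bit half on its own.
    lo = (-(b & 0xFFFF)) & 0xFFFF
    hi = (-((b >> 16) & 0xFFFF)) & 0xFFFF
    return (a * lo + ((a * hi) << 16)) & MASK32


if __name__ == "__main__":
    a, b = 3, 0x00010002
    print(hex(mul_negated_b(a, b, True)), hex(mul_negated_b(a, b, False)))
```

Resolving the modifier before the value is split is what keeps the split halves consistent with the intended 32-bit operand.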
Andrii Simiklit
0206ffc28d anv/pipeline: remove unnecessary null-pointer check
It looks impossible for the 'last' variable to be null,
because at least get_vs_prog_data shouldn't return a null pointer.
So this check has been unnecessary since commit:
99d497c5b6 "anv/pipeline: Replace get_fs_input_map with ..."

This small issue is found by cppcheck.

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 12:29:12 -06:00
Indrajit Das
d2c170eb35 st/va: Return correct status from vlVaQuerySurfaceStatus
This ensures that during encoding, applications can get
the correct status of the surface before submitting
more operations on the same.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
2019-01-09 11:34:22 -05:00
Roland Scheidegger
0c226d40ef Revert "llvmpipe: Always return some fence in flush (v2)"
This reverts commit f6a6da8131.

With this commit we see massive amounts of asserts triggering
in lp_fence_wait(), assert(f->issued), for instance with libgl_xlib
state tracker and piglit. Not entirely sure if the assert could
just be removed.
2019-01-09 17:28:53 +01:00
Marek Olšák
e986c1ca1d st/mesa: don't leak pipe_surface if pipe_context is not current
We have found some pipe_surface leaks internally.

This is the same code as surface_destroy in radeonsi.
Ideally, surface_destroy would be in pipe_screen.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-01-09 11:08:44 -05:00
Marek Olšák
fd82a1d1d6 st/mesa: don't reference pipe_surface locally in PBO code
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-01-09 11:08:44 -05:00
Marek Olšák
5da442338b st/mesa: unify window-system renderbuffer initialization
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-01-09 11:08:44 -05:00
Mario Kleiner
5e30e54e05 radeonsi: Fix use of 1- or 2- component GL_DOUBLE vbo's.
With Mesa 18.1, commit be973ed21f, si_llvm_load_input_vs()
changed the number of source 32-bit wide dword components
used for fetching vertex attributes into the vertex shader
from a constant 4 to a variable num_channels number, depending
on input data format, with some special case handling for
input data formats like 64-Bit doubles.

In the case of a GL_DOUBLE input data format with one
or two components though, e.g., submitted via ...

a) glTexCoordPointer(1, GL_DOUBLE, 0, buffer);
b) glTexCoordPointer(2, GL_DOUBLE, 0, buffer);

... the input format would be SI_FIX_FETCH_RG_64_FLOAT,
but no special case handling was implemented for that
case, so in the default path the number of 32-bit
dwords would be set to the number of float input components
derived from info->input_usage_mask. This ends with corrupted
input to the vertex shader, because fetching a 64-bit double
from the vbo requires fetching two 32-bit dwords instead of 1,
and fetching a two double input requires 4 dword fetches
instead of 2, so in these cases the vertex shader receives
incomplete/truncated input data:

a) float v = gl_MultiTexCoord0.x;  -> v is corrupted.
b) vec2  v = gl_MultiTexCoord0.xy; -> v.x is assigned
   correctly, but v.y is corrupted.

This happens with the standard TGSI IR compiled shaders.
Under NIR with R600_DEBUG=nir, we got correct behavior
because the current radeonsi nir code always assigns
info->input_usage_mask = TGSI_WRITEMASK_XYZW, thereby
always fetches 4 dwords regardless of what the shader
actually needs.

Fix this by properly assigning 2 or 4 dword fetches for
one or two component GL_DOUBLE input.

Fixes: be973ed21f ("radeonsi: load the right number of
       components for VS inputs and TBOs")

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-01-09 11:08:44 -05:00
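
The arithmetic behind 5e30e54e05 is simple enough to sketch. This is not the radeonsi code, which derives the count from the fetch format and info->input_usage_mask; `dwords_to_fetch` is a hypothetical helper that only restates the rule from the commit message: each 64-bit double needs two 32-bit dwords.

```python
def dwords_to_fetch(num_components, is_double):
    """32-bit dwords needed to fetch a vertex attribute (illustrative rule)."""
    dwords_per_component = 2 if is_double else 1
    return num_components * dwords_per_component


if __name__ == "__main__":
    # Cases a) and b) from the commit message.
    print(dwords_to_fetch(1, True), dwords_to_fetch(2, True))
```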
Rhys Perry
ee8488ea3b ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics
Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is
enabled with DXVK.
It also fixes artifacts with Fallout 4's god rays with DXVK.
Various piglit interpolateAt*() tests under NIR are also fixed.

v2: formatting fix
    update commit message to include Fallout 4 and the Fixes tag

Fixes: f4e499ec79 ('radv: add initial non-conformant radv vulkan driver')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2019-01-09 14:57:07 +00:00
Samuel Pitoiset
b8c4f523b4 radv: skip draws with instance_count == 0
Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 14:22:38 +01:00
Samuel Pitoiset
a2b5cc3c39 radv: enable variable pointers
The Vulkan spec 1.1.97 says:
   "variablePointers specifies whether the implementation supports
    the SPIR-V VariablePointers capability. When this feature is
    not enabled, shader modules must not declare the
    VariablePointers capability."

As the SPIR-V feature is enabled, we should turn on the
extension feature as well.

All dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.*
pass with the Khronos internal repo. Note that a bunch of them
fail with the public repo, but that's expected as they violate the
specification.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 12:32:18 +01:00
Samuel Pitoiset
d58b11e709 radv: get rid of bunch of KHR suffixes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-09 12:26:48 +01:00
Maya Rashish
a2ddb710fd radeon: fix printf format specifier.
From glibc printf(3):

Z      A nonstandard synonym for z that predates the appearance of z.
       Do not use in new code.

Z may not exist on non-glibc systems. Prefer the standard symbol.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-01-09 14:15:06 +11:00
Tomasz Figa
f6a6da8131 llvmpipe: Always return some fence in flush (v2)
If there is no last fence, due to no rendering happening yet, just
create a new signaled fence and return it, to match the expectations of
the EGL sync fence API.

Fixes random "Could not create sync fence 0x3003" assertion failures from
Skia on Android, coming from the following code:

https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427

Reproducible especially with thread count >= 4.

One could make the driver always keep the reference to the last fence,
but:

 - the driver seems to explicitly destroy the fence whenever a rendering
   pass completes and changing that would require a significant functional
   change to the code. (Specifically, in lp_scene_end_rasterization().)

 - it still wouldn't solve the problem of an EGL sync fence being created
   and waited on without any rendering happening at all, which is
   also likely to happen with Android code pointed to in the commit.

Therefore, the simple approach of always creating a fence is taken,
similarly to other drivers, such as radeonsi.

Tested with piglit llvmpipe suite with no regressions and following
tests fixed:

egl_khr_fence_sync
 conformance
  eglclientwaitsynckhr_flag_sync_flush
  eglclientwaitsynckhr_nonzero_timeout
  eglclientwaitsynckhr_zero_timeout
  eglcreatesynckhr_default_attributes
  eglgetsyncattribkhr_invalid_attrib
  eglgetsyncattribkhr_sync_status

v2:
 - remove the useless lp_fence_reference() dance (Nicolai),
 - explain why creating the dummy fence is the right approach.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
2019-01-09 02:06:13 +01:00
Eric Anholt
700aeaf9c8 glsl: Fix buffer overflow with an atomic buffer binding out of range.
The binding is checked against the limits later in the function, so we
need to make sure we don't overflow before the check here.

Fixes this valgrind warning (and sometimes segfault):

==1460== Invalid write of size 4
==1460==    at 0x74C98DD: ast_declarator_list::hir(exec_list*, _mesa_glsl_parse_state*) (ast_to_hir.cpp:4943)
==1460==    by 0x74C054F: _mesa_ast_to_hir(exec_list*, _mesa_glsl_parse_state*) (ast_to_hir.cpp:159)
==1460==    by 0x7435C12: _mesa_glsl_compile_shader (glsl_parser_extras.cpp:2130)

in

dEQP-GLES31.functional.debug.negative_coverage.get_error.compute.
   exceed_atomic_counters_limit

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-01-08 15:44:58 -08:00
Eric Anholt
211b826790 nir: Make nir_deref_instr_build/get_const_offset actually use size_align.
I think this was a copy-and-paste mistake -- nir_opt_large_constants was
passing in glsl_get_natural_size_align_bytes() given brw_nir.c's arguments
to the opt pass.

I wanted to reuse this function for handling constant offsets of arrays of
images in V3D.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-08 15:40:53 -08:00
Danylo Piliaiev
9f29d90327 glsl/linker: Fix unmatched TCS outputs being reduced to local variable
Always match TCS outputs since they are shared by all invocations
within the patch and should not be converted to local variables.

This is one of the issues found in Downward.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104297
2019-01-09 10:31:13 +11:00
Eric Anholt
db3b6b6bca v3d: Enable GL_ARB_texture_gather on V3D 4.x.
This is part of GLES 3.1, and with the NIR lowering we're now passing the
GLES31 testcases.
2019-01-08 13:03:44 -08:00
Eric Anholt
6051c11d17 nir: Add nir_lower_tex support for Broadcom's swizzled TG4 results.
V3D returns the texels in a different order in the resulting vec4 from
what GLSL wants, so we need to put in a swizzle.  Fixes
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8.base_level.level_1

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 13:03:41 -08:00
Bas Nieuwenhuizen
3fcec4a550 freedreno: Move register constant files to src/freedreno.
This way they can be shared. Build tested with meson, but not too sure
on the autotools stuff though.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Rob Clark <robdclark@gmail.com>
2019-01-08 21:46:14 +01:00
Caio Marcelo de Oliveira Filho
baabfb1959 nir: fix warning in nir_lower_io.c
Initialize the variable with NULL.  Fixes the following

    In file included from ../src/compiler/nir/nir_lower_io.c:34:
    ../src/compiler/nir/nir_lower_io.c: In function ‘nir_lower_explicit_io’:
    ../src/compiler/nir/nir.h:668:11: warning: ‘addr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
        return src;
               ^~~
    ../src/compiler/nir/nir_lower_io.c:735:17: note: ‘addr’ was declared here
        nir_ssa_def *addr;
                     ^~~~

v2: Avoid using a 'default' case so we get help from the compiler when
    new deref types are added. (Lionel)

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 12:29:56 -08:00
Chia-I Wu
3cb65cf8aa freedreno/drm: sync uapi again
"pad" was missing in Mesa's msm_drm.h.  sizeof(drm_msm_gem_info)
remains the same, but now the compiler initializes the field to
zero.

Buffer allocation results in EINVAL without this for me.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Kristian Høgsberg <hoegsberg@gmail.com>
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
2019-01-08 19:55:28 +00:00
Chia-I Wu
6eeb1fe491 meson: fix EGL/X11 build without GLX
dep_xcb and others were not set under this configuration.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-08 10:58:48 -08:00
Eric Engestrom
b38a48a569 wsi: drop unneeded KHR suffix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:48:03 +00:00
Eric Engestrom
4f5a526789 anv: drop unneeded KHR suffix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:47:56 +00:00
Karol Herbst
d0c6ef2793 nir: rename global/local to private/function memory
the naming is a bit confusing no matter how you look at it. Within SPIR-V
"global" memory is memory accessible from all threads. glsl "global" memory
normally refers to shader thread private memory declared at global scope. As
we already use "shared" for memory shared across all threads of a work group,
the solution everybody could be happy with is to rename "global" to
"private" and use "global" later for memory usually stored within system
accessible memory (be it VRAM or system RAM if keeping SVM in mind).
glsl "local" memory is memory only accessible within a function, while SPIR-V
"local" memory is memory accessible within the same workgroup.

v2: rename local to function as well
v3: rename vtn_variable_mode_local as well

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:51:46 +01:00
Dylan Baker
401dca1c73 autotools: Remove tegra vdpau driver
This has never functioned and probably won't ever function, due to the
way gallium media state trackers are architected and the tegra video
decoder is architected.

Cc: Thierry Reding <thierry.reding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: 1755f608f5
       ("tegra: Initial support")
2019-01-08 09:42:56 -08:00
Pierre Moreau
ba55cb2bcd clover/meson: Ignore 'svn' suffix when computing CLANG_RESOURCE_DIR
The version exported by LLVM in its CMake configuration files can
include the “svn” suffix when building a development version (for
example “8.0.0svn”). However the exported clang headers are still found
under “lib/clang/8.0.0/”, without the “svn” suffix.
Meson takes care of removing the “svn” suffix from the version when
using the dependency’s `version()` method.

This processing is already performed in “configure.ac” when using
autotools.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-08 08:53:38 -08:00
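
The suffix handling discussed in ba55cb2bcd amounts to the following sketch. In the actual change Meson's `version()` method already does the stripping; `clang_resource_dir` is a hypothetical helper, and the "clang/<version>" layout is taken from the example path in the commit message.

```python
import posixpath


def clang_resource_dir(libdir, llvm_version):
    """Build the clang resource directory, ignoring a trailing 'svn'."""
    if llvm_version.endswith("svn"):
        llvm_version = llvm_version[:-len("svn")]
    return posixpath.join(libdir, "clang", llvm_version)


if __name__ == "__main__":
    print(clang_resource_dir("/usr/lib", "8.0.0svn"))
```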
Lionel Landwerlin
add5a2ec92 anv: flush fast clear colors into compressed surfaces
In the following scenario :

   1. Create image format R8G8B8A8_UNORM
   2. Create image view format R8G8B8A8_SRGB
   3. Clear the view through a sub pass to a particular color
   4. Barrier on the image from color attachment to source transfer
   5. Copy the image into a linear buffer to check the content

The step 4 resolve of the clear color is unaware of the SRGB format of
the view: because the blorp resolve operations operate on images, the
color associated with the resolve is interpreted as UNORM rather than
SRGB, leading to the wrong color being written into surfaces.

This change forces a clear color resolve at the end of the render pass
so following resolves won't have to deal with the clear color with a
format that doesn't match the image's format.

On gfxbench vulkan_5_normal 1280x720, this appears to cost us ~0.5fps,
from 49.316 down to 48.949.

v2: Only fast clear resolve when image & view have different formats
    (Lionel)

v3: Update warning (Jason)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2019-01-08 16:37:00 +00:00
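
Why the format matters for the clear color in add5a2ec92 can be seen from a small numeric sketch. It uses the standard sRGB transfer function and 8-bit UNORM quantization; it is not anv or blorp code, and the helper names are hypothetical.

```python
def linear_to_srgb(c):
    """Standard sRGB encoding of a linear value in [0, 1]."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055


def to_unorm8(c):
    """Quantize a value in [0, 1] to 8 bits, rounding half up."""
    c = max(0.0, min(1.0, c))
    return int(c * 255.0 + 0.5)


if __name__ == "__main__":
    clear = 0.5
    # The same clear value stored through a UNORM view and an SRGB view.
    print(to_unorm8(clear), to_unorm8(linear_to_srgb(clear)))
```

The two stored bytes differ noticeably, so resolving a clear color with the image's UNORM format when it was set through an SRGB view writes the wrong value.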
Lionel Landwerlin
366eb656ac anv: explictly specify format for blorp ccs/mcs op
Resolve operations can happen when dealing with view (begin/end
subpasses) in which case the view's format needs to apply, not the
image's format.

v2: Relayout arguments of a ccs_op() call (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Cc: mesa-stable@lists.freedesktop.org
2019-01-08 16:36:56 +00:00
Tapani Pälli
c292414765 dri3: initialize adaptive_sync as false before configQueryb
Fixes following errors from valgrind output:

   ==23388== Conditional jump or move depends on uninitialised value(s)
   ==23388==    at 0x48B4924: loader_dri3_drawable_init (loader_dri3_helper.c:381)
   ==23388==    by 0x48A97D2: dri3_create_drawable (dri3_glx.c:386)
   ==23388==    by 0x489E190: driFetchDrawable (dri_common.c:369)
   ==23388==    by 0x48A9187: dri3_bind_context (dri3_glx.c:195)
   ==23388==    by 0x488B75C: MakeContextCurrent (glxcurrent.c:220)
   ==23388==    by 0x488B8DB: glXMakeCurrent (glxcurrent.c:267)
   ==23388==    by 0x10A987: ??? (in /usr/bin/glxgears)
   ==23388==    by 0x4BEB412: (below main) (in /usr/lib64/libc-2.28.so)
   ==23388==
   ==23388== Conditional jump or move depends on uninitialised value(s)
   ==23388==    at 0x48B5A40: loader_dri3_swap_buffers_msc (loader_dri3_helper.c:923)
   ==23388==    by 0x48A9B7E: dri3_swap_buffers (dri3_glx.c:587)
   ==23388==    by 0x4887A81: glXSwapBuffers (glxcmds.c:857)
   ==23388==    by 0x10ADED: ??? (in /usr/bin/glxgears)
   ==23388==    by 0x4BEB412: (below main) (in /usr/lib64/libc-2.28.so)

Fixes: 2e12fe425f "loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2019-01-08 08:15:07 +02:00
Dave Airlie
4298a85ae8 virgl: use primconvert provoking vertex properly
This stores the raster state and calls the correct primconvert interface
using the currently bound raster state.

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-01-08 12:06:41 +10:00
Jason Ekstrand
754eff07d2 anv: Sort properties and features switch statements
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-07 18:41:15 -06:00
Jason Ekstrand
05d72d6d48 spirv: Sort supported capabilities
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-07 18:41:15 -06:00
Jason Ekstrand
34af63fa22 anv: Enable the new deref-based UBO/SSBO path
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
63b9aa2e25 spirv: Add support for using derefs for UBO/SSBO access
For now, it's hidden behind a cap.  Hopefully, we can eventually drop
that along with all the manual offset code in spirv_to_nir.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
3a7c5667c8 spirv: Make better use of vtn_pointer_uses_ssa_offset
The choice of whether or not we should use block_load/store isn't a
choice between external and not so much as a choice between deref
instructions and manually calculated offsets.  In vtn_pointer_from_ssa,
we guard the index+offset case behind vtn_pointer_uses_ssa_offset and
then branch out from there.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
adc155a815 spirv: Add explicit pointer types
Instead of baking in uvec2 for UBO and SSBO pointers and uint for push
constant and shared memory pointers, make it configurable.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
be039cb467 spirv: Choose atomic deref type with pointer_uses_ssa_offset
Previously, we hard-coded the rule about workgroup variables and the
builder lower_workgroup_access_to_offsets flag.  Instead base it on the
handy helper we have for exactly this sort of thing.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
5c3cb9c3ce spirv: Add error checking for Block and BufferBlock decorations
Variable pointers being well-defined across the block boundary requires
a couple of very specific SPIR-V validation rules.  Normally, we'd trust
the validator to catch these but since CTS tests have been found in the
wild which violate them, we'll carry our own checks.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
e90b738f20 nir/vulkan: Add a descriptor type to vulkan resource intrinsics
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
f393b10b3f nir/lower_io: Add "explicit" IO lowering
This new pass is for lowering explicitly laid out memory coming in from
SPIR-V or a similar source.  It's quite a bit more complicated than the
normal lower_io because we have to be able to handle matrices.  The
way the stride information is stored for matrices is awkward and dealing
with row-major matrices is especially painful.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
52dd43c7ef nir/validate: Allow array derefs on vectors in more modes
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
013ee5732b nir/intrinsics: Add access flags to load/store_deref
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
7755171e4c nir/intrinsics: Allow deref sources to consume anything
This commit adds a new num_components value for intrinsic sources of -1
which means that it consumes everything and the number of components
effectively isn't validated.  This is useful for deref sources which
just take the result of the deref and we leave it up to the driver to
decide what that size should be.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
d0fe52a456 nir/validate: Allow derefs in phi nodes
We added this assert when first moving derefs over to instructions to
ensure that deref chains could go all the way back to the variables.
Now that we're going to start using derefs for things that we can do
variable pointers on such as UBOs and SSBOs, we need to be able to run
derefs through phi nodes, selects, and basically anything else.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
7e85480a67 nir/remove_dead_variables: Properly handle deref casts
We already detect any incomplete deref chains (where the deref is used
for something other than another deref or a load/store) and flag the
variable as used thanks to deref_used_for_not_store.  All that's left to
do is to properly skip casts when cleaning up.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
78d80f7db2 nir/deref: Skip over casts in fixup_deref_modes
This pass is used when, for instance, we lazily change the mode of
variables rather than replacing the variable with a new one.  Since we
only do this in cases where we know we have full deref chains, it's ok
to just skip them in fixup_deref_modes.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
d8e3edb784 nir/deref: Support casts and ptr_as_array in comparisons
The code which constructs deref paths already gives you the path
starting at the nearest deref_cast or deref_var.  All we need to do for
casts is handle the case where the start of the path isn't a deref_var.
For ptr_as_array derefs, we just bail if we have any after the
divergence point between the two derefs.  We may be able to do better in
the future but this works for now.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
a1c688517d nir/opt_deref: Properly optimize ptr_as_array derefs
When handling casts, we can't blindly propagate the parent of a cast
into a ptr_as_array deref because doing so might lose the stride
information from the cast.  Instead, before we can propagate into
ptr_as_array derefs, we need to check that the cast is a cast of an
array deref and that the stride matches.  For other types of derefs, we
can continue to propagate casts as normal because they don't need the
stride.  We also add an optimization which can combine a ptr_as_array
deref with its parent if it is also an array deref of some form.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
427558a717 nir/validate: Don't allow derefs in if conditions
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
e94a027af8 nir: Add a ptr_as_array deref type
These correspond directly to SPIR-V's OpPtrAccessChain.  As such, they
treat whatever their parent gives them as if it's the first element in
some array and dereferences that array.  If the parent is, itself, an
array deref, then the two indices can just be added together to get the
final array deref.  However, it can also be used in cases where what you
have is a dereference to some random vec2 value somewhere.  In this
case, we require a cast before the ptr_as_array and use the ptr_stride
field in the cast to provide a stride for the ptr_as_array derefs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
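
The index-folding property described in e94a027af8 can be sketched with plain byte offsets. This is only an arithmetic model under the assumption that both derefs use the same stride; the function names are hypothetical and do not correspond to NIR API.

```python
def array_deref_offset(base, index, stride):
    """Byte offset of element `index` in an array starting at `base`."""
    return base + index * stride


def ptr_as_array_offset(parent_offset, index, stride):
    """Treat the parent as element 0 of an array and step `index` elements."""
    return parent_offset + index * stride


if __name__ == "__main__":
    base, stride = 64, 16
    parent = array_deref_offset(base, 2, stride)
    # An array deref followed by ptr_as_array equals one deref with summed indices.
    print(ptr_as_array_offset(parent, 3, stride), array_deref_offset(base, 2 + 3, stride))
```

When the parent is not an array deref there is no stride to reuse, which is why the commit requires a cast carrying ptr_stride in that case.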
Jason Ekstrand
fc9c4f89b8 nir: Move propagation of cast derefs to a new nir_opt_deref pass
We're going to want to do more deref optimizations going forward and
this gives us a central place to do them.  Also, cast propagation will
get a bit more complicated with the addition of ptr_as_array derefs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
bf1a1eed88 spirv: Propagate layout decorations to created glsl_types
Instead of just storing the decorations in the vtn_type, propagate them
all the way through to the glsl_type.  For array strides, this means we
need to handle them earlier so we break array stride handling into its
own function and explicitly call it for both pointer and array types.

Due to type deduplication in the SPIR-V, we may have explicit layout
decorations on all sorts of types that don't actually want them.  In
order to prevent these leaking into unfortunate places in NIR, we
explicitly strip them off before creating NIR variables and when casting
pointers to non-external memory.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
6cebeb4f71 glsl_type: Add support for explicitly laid out matrices and arrays
SPIR-V allows for matrix and array types to be decorated with explicit
byte stride decorations and matrix types to be decorated row- or
column-major.  This commit adds support to glsl_type to encode this
information.  Because this doesn't work nicely with std430 and std140
alignments, we add asserts to ensure that we don't use any of the std430
or std140 layout functions with explicitly laid out types.

In SPIR-V, the layout information for matrices is applied to the parent
struct member instead of to the matrix type itself.  However, this
gets rather clumsy when you're walking derefs trying to compute offsets
because, the moment you hit a matrix, you have to crawl back the deref
chain and find the struct.  Instead, we take the same path here as we've
taken in spirv_to_nir and put the decorations on the matrix type itself.

This also subtly adds support for strided vector types.  These don't
come up in SPIR-V directly but you can get one as the result of taking a
column from a row-major matrix or a row from a column-major matrix.
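
To see where a strided vector comes from, consider the byte offset of one element in a row-major matrix. The sketch below is an illustration only; the function name and parameters are hypothetical and not part of glsl_type.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of element (row, col) in a row-major matrix whose rows
 * start row_stride bytes apart and whose elements are elem_size bytes. */
size_t row_major_element_offset(size_t row, size_t col,
                                size_t row_stride, size_t elem_size)
{
    assert(row_stride >= elem_size);
    return row * row_stride + col * elem_size;
}
```

Walking down a single column changes only `row`, so consecutive elements of that column sit `row_stride` bytes apart rather than `elem_size` bytes apart. That column is a vector with an explicit stride.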

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
7f70b3e555 glsl_type: Simplify glsl_channel_type
This is C++ so we can just poke at the fields of glsl_type if we wish
and calling get_instance is way easier and more reliable than handling
each instance separately.  While we're at it, we re-arrange the base
type labels to match the enum order and add 8-bit type support.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
d8a11bfc08 glsl_type: Add a C wrapper to get struct field offsets
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
d34f19feba glsl_type: Drop the glsl_get_array_instance C helper
It was added in bce6f99875 even though it's completely redundant with
glsl_array_type().

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
a700a82bda nir: Distinguish between normal uniforms and UBOs
Previously, NIR had a single nir_var_uniform mode used for atomic
counters, UBOs, samplers, images, and normal uniforms.  This commit
splits this into nir_var_uniform and nir_var_ubo where nir_var_uniform
is still a bit of a catch-all but the nir_var_ubo is specific to UBOs.
While we're at it, we also rename shader_storage to ssbo to follow the
convention.

We need this so that we can distinguish between normal uniforms and UBO
access at the deref level without going all the way back to the variable and
seeing if it has an interface type.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
c9a4135e14 nir: Allow storing to shader_storage
I have no idea how shader_storage made it into the list of banned
variable modes for stores but it clearly should be allowed.  This only
doesn't cause us a problem today because we never actually use derefs on
shader_storage variables.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
cd93b0a670 nir/validate: Require array indices to match the deref bit size
This doesn't currently change anything because array indices are
required to be 32 bits and all derefs are also 32 bits.  However, we
will one day have 64-bit derefs for OpenCL.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
abfe674c54 spirv: Handle arbitrary bit sizes for deref array indices
We already had code in link_as_ssa to handle bit sizes; we just need to
use it.  While we're at it we clean up link_as_ssa a bit and add an
explicit bit_size parameter in preparation for a day when we have derefs
that aren't 32 bit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
bfe31c5e46 nir/builder: Add nir_i2i and nir_u2u helpers which take a bit size
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
639c236e74 spirv: Emit NIR deref instructions on-the-fly
This simplifies our deref handling by emitting the actual NIR deref
instructions on-the-fly instead of building up a deref chain and then
emitting them at the last moment.  In order for this to work with the
parts of the compiler that assume they can chase deref chains, we have
to run nir_rematerialize_derefs_in_use_blocks_impl to put the derefs
back in the right places.  Otherwise, in cases such as loop continues
where the SPIR-V blocks are not in the same order as the NIR blocks, we
may end up with a deref chain with a parent that does not dominate its
child and nir_repair_ssa_impl will insert phis in the deref chain.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
c59f07684c spirv: Sign-extend array indices
The SPIR-V spec was recently updated to clarify that array indices are
treated as signed integers.
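
The practical difference shows up when an index narrower than the deref is widened. A minimal sketch, assuming a 16-bit index widened to 32 bits; the helper names are hypothetical and this is not the spirv_to_nir code:

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 16-bit index to 32 bits, treating it as signed. */
int32_t index_sign_extend(uint16_t raw)
{
    int32_t wide = raw >= 0x8000u ? (int32_t)raw - 0x10000 : (int32_t)raw;
    assert(wide >= INT16_MIN && wide <= INT16_MAX);
    return wide;
}

/* Zero-extension, shown only for contrast. */
int32_t index_zero_extend(uint16_t raw)
{
    return (int32_t)raw;
}
```

The bit pattern 0xFFFF becomes -1 under sign extension but 65535 under zero extension, so the two choices address very different elements.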

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
f8992eb5ba anv/apply_pipeline_layout: Set the cursor in lower_res_reindex_intrinsic
The loop through instructions doesn't set the cursor for us so unless we
set it somewhere, we may end up emitting instructions in the wrong
place.  The only reason why we haven't been bitten by this in the past
is that it only happens in a few variable pointers cases and the CTS
tests for those don't use much control flow so things were getting
emitted in the correct order by accident.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
42b2f3e91f spirv: Handle any bit size in vector_insert/extract
This crops up both in the actual SPIR-V VectorInsert/Extract opcodes as
well as various places where we deal with vector derefs.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
a392ddb781 glsl_type: Support serializing 8 and 16-bit types
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Bas Nieuwenhuizen
70ed049cc6 spirv: Fix matrix parameters in function calls.
They can be handled exactly the same as arrays, we just need to handle
the base type correctly in the switches.

Fixes: a45b6fb452 "spirv: Pass SSA values through functions"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109204
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-08 01:30:03 +01:00
Bas Nieuwenhuizen
3cc940277a radv: Fix rasterization precision bits.
Note that these limits are exact, not a "precision is at least x",
as texel coords also get snapped to a multiple of this step size
before filtering.

This fixes CTS tests

dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat
dEQP-VK.texture.explicit_lod.2d.sizes.57x35_nearest_linear_mipmap_nearest_repeat

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109151
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:27:30 +01:00
Kenneth Graunke
f003859f97 nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref
These days, we have two sampler lowering passes.  The newer one,
gl_nir_lower_samplers_as_deref, is used by radeonsi.  It rewrites
variables to drop structures out of sampler deref chains, to make
life simpler.  It then sets var->data.binding for non-bindless
sampler and image variables based on the GL uniform storage's
opaque index values.

The older one converts sampler deref chains (nir_tex_src_texture_deref)
to a numerical offset (nir_tex_src_texture_offset).  It also stores the
constant-valued portion of that number in tex->texture_index, making
life really simple for drivers that don't support indirects.  It too
pokes at GL uniform storage's opaque index values.

Logically, we can do the first pass (simplify derefs, set bindings)
then the second (turn derefs to offsets, set texture_index).  This
patch does exactly that, eliminating some redundancy (only one pass
has to poke at GL uniform storage), and gaining proper var->data.binding
values for drivers using the full lowering.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-07 14:25:04 -08:00
Kenneth Graunke
c69f9297cf nir: Fix gl_nir_lower_samplers_as_deref's structure type handling.
We recurse to remove structures, and at each step, re-modify the
resulting type for our link in the deref chain.  For arrays, the
result of recursion is the new underlying type - so we wrap it with
the array dimensionality again.  For structs, we want to simply use
the new underlying type, skipping the struct altogether.

The correct way to do this is to do nothing at all.  Previously, we
had reset type to next->type, which is the /old/ field type, not the
new field type we obtained by recursing.  This undid our recursive work.

Fixes about 338 tests with nested structs, such as:

dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.nested_structs_arrays.sampler2D_samplerCube_fragment

Note that currently only radeonsi uses this pass, and NIR support is
disabled there by default, so the breakage was likely not seen by most
people.  The next commit uses this pass for more drivers, so this fix
prevents regressions from that change.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-07 14:25:04 -08:00
Bas Nieuwenhuizen
be6cee51c0 amd/common: Add some parentheses to silence warning.
[1/59] Compiling C object 'src/amd/common/src@amd@common@@amd_common@sta/ac_nir_to_llvm.c.o'.
../mesa/src/amd/common/ac_nir_to_llvm.c: In function ‘get_inst_tessfactor_writemask’:
../mesa/src/amd/common/ac_nir_to_llvm.c:4089:32: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses]
   writemask = ((1 << num_comps + 1) - 1) << first_component;
                      ~~~~~~~~~~^~~
../mesa/src/amd/common/ac_nir_to_llvm.c:4091:33: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses]
   writemask = (((1 << num_comps + 1) - 1) << first_component) << 4;
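
The warning stems from C operator precedence: `+` binds more tightly than `<<`, so `1 << num_comps + 1` already parses as `1 << (num_comps + 1)`. The sketch below spells out both readings with explicit parentheses. The function names are hypothetical and do not come from ac_nir_to_llvm.c.

```c
#include <assert.h>

/* How the compiler parses "(1 << n + 1) - 1": the shift count is n + 1. */
unsigned mask_as_parsed(unsigned n)
{
    assert(n < 31);
    return (1u << (n + 1)) - 1;
}

/* How a reader might misread it: shift by n first, then add 1. */
unsigned mask_as_misread(unsigned n)
{
    assert(n < 31);
    return ((1u << n) + 1) - 1;
}
```

For n = 2 the first form yields 7 and the second yields 4, which is why gcc asks for the grouping to be written out.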

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:37 +01:00
Bas Nieuwenhuizen
64c83efaee radv: Remove unused variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:33 +01:00
Bas Nieuwenhuizen
656c1c488c radv: Remove device path.
unused and gcc complains about strncpy. (from what I can see because
strncpy does not leave a 0 byte on truncate. That said we don't use
it so this does not fix a real bug).
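
For reference, the strncpy behaviour that gcc warns about can be sketched as follows. This is a generic illustration, not the removed radv code, and `copy_path` is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* Copy src into dst (capacity dst_size) and always terminate it.
 * strncpy() on its own writes no terminating 0 byte when src holds
 * dst_size or more characters, so the terminator is stored explicitly. */
void copy_path(char *dst, size_t dst_size, const char *src)
{
    assert(dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```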

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:14 +01:00
Marek Olšák
492ad9a402 ac: remove unused variable from ac_build_ddxy
trivial
2019-01-07 14:51:25 -05:00
Andres Gomez
0cc01f45e7 glsl: correct typo in GLSL compilation error message
v2: Add the "fix" tag (Erik).

Fixes: 037f68d81e ("glsl: apply align layout qualifier rules to block offsets")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-07 19:07:33 +02:00
Jason Ekstrand
027835b1da vulkan: Update the XML and headers to 1.1.97
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 10:00:01 -06:00
Andres Gomez
6decc6b1d9 docs: update 18.3 and add 19.x cycles for the release calendar
v2: replace incorrect "<td/>" with "<td>" (Eric).

Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-01-07 17:19:47 +02:00
Bas Nieuwenhuizen
110564fdec anv/android: Do not reject storage images.
We do the ImageFormatProperties check already, and rejecting a usage
flag when both ImageFormatProperties and the WSI (which is Android)
support it is not allowed.

Intel does support storage for some of the supported WSI formats, such
as R8G8B8A8_UNORM, and looking at the ISL_SURF_USAGE_DISABLE_AUX_BIT,
the imported images do not have any form of compression that would
prevent this fix.

v2: Also consider STORAGE bit for Gralloc usage bits.
     (From Kevin Strasser <kevin.strasser@intel.com>)

Fixes: 053d4c328f "anv: Implement VK_ANDROID_native_buffer (v9)"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-07 15:20:55 +01:00
Bas Nieuwenhuizen
9a45a190ad radv: Implement buffer stores with less than 4 components.
We started using it in the btoi paths for r32g32b32, and the LLVM IR
checker will complain about it because we end up with intrinsics with
the wrong type extension in the name.

Fixes: 593996bc02 ("radv: implement buffer to image operations for R32G32B32")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 14:54:14 +01:00
Jon Turney
00ad77b9f6 appveyor: Add a Cygwin build script
2019-01-07 13:40:58 +00:00
Jon Turney
5334dafee2 appveyor: put build steps in a script, rather than inline in appveyor.yml
2019-01-07 13:40:57 +00:00
Lucas Stach
d015888efb etnaviv: annotate variables only used in debug build
Some of the status variables in the compiler are only used in asserts
and thus may be unused in release builds. Annotate them accordingly
to avoid 'unused but set' warnings from the compiler.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-01-07 11:51:02 +01:00
Lucas Stach
b56d903b5a etnaviv: enable full overwrite in a few more cases
Take into account the render target format when checking if the color
mask affects all channels of the RT. This allows enabling full
overwrite in a few cases where a non-alpha format is used.
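
The idea can be sketched as a mask comparison. The bit encoding (one bit per channel, R in bit 0 through A in bit 3) and the function name are assumptions for illustration, not the etnaviv implementation.

```c
#include <assert.h>

/* Returns nonzero when the color mask enables every channel that the
 * render target format actually stores. Both arguments are 4-bit masks. */
int mask_covers_format(unsigned colormask, unsigned format_channels)
{
    assert(colormask <= 0xfu && format_channels <= 0xfu);
    return (colormask & format_channels) == format_channels;
}
```

With a mask of 0x7 (RGB), a format without alpha (channels 0x7) counts as fully overwritten, while a format with alpha (channels 0xf) does not.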

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-01-07 11:50:23 +01:00
Timothy Arceri
6dade5d534 nir: avoid uninitialized variable warning
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109231
2019-01-07 10:57:00 +11:00
Timothy Arceri
17fac39398 st/glsl: refactor st_link_nir()
The functional change here is moving the nir_lower_io_to_scalar_early()
calls inside st_nir_link_shaders() and moving the st_nir_opts() call
after the call to nir_lower_io_arrays_to_elements().

This fixes a bug with the following piglit test due to the current code
not cleaning up dead code after we lower arrays. This was causing an
assert in the new duplicate varyings link time opt introduced in
70be9afccb.

tests/spec/glsl-1.10/execution/vsfs-unused-array-member.shader_test

Moving the nir_lower_io_to_scalar_early() calls also allows us to tidy
up the code a little and merge some loops.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-07 10:54:20 +11:00
Eric Anholt
8847370424 v3d: Use the core tex lowering.
Even without any clever optimization on the unpack operations, this gives
us a useful value for the channels read field, which we can use to avoid
ldtmu instructions to the no-op register.

instructions in affected programs: 890712 -> 881974 (-0.98%)
2019-01-04 15:59:59 -08:00
Eric Anholt
f217a94542 nir: Add nir_lower_tex options to lower sampler return formats.
I've been doing this in the nir-to-vir and nir-to-qir backends of v3d and
vc4, but nir could potentially do some useful stuff for us (like avoiding
unpack/repacks) if we give it the information.

v2: Skip lowering for txs/query_levels
v3: Fix a crash on old-style shadow
v4: Rename to tex_packing, use nir_format_unpack_sint/uint helpers, pack
    the enum.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-04 15:59:57 -08:00
Eric Anholt
a74f2aeb4f nir: Allow nir_format_unpack_int/sint to unpack larger values.
For V3D, I want to unpack 4-16-bit packed integers for 8 and 16-bit
integer samplers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-04 15:59:30 -08:00
Jason Ekstrand
19c608fe43 intel/blorp: Be more conservative about copying clear colors
In 92eb5bbc68 we attempted to avoid copying clear colors whenever
we weren't doing a resolve.  However, this broke MSAA resolves because
we need the clear color in the source.  This patch makes blorp much more
conservative such that it only avoids the clear color copy if either
aux_usage == NONE or it's explicitly doing a fast-clear.

Fixes: 92eb5bbc68 "intel/blorp: Only copy clear color when doing..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107728
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-01-04 17:57:43 -06:00
Eric Anholt
81b9361b68 v3d: Stop scalarizing our uniform loads.
We can pull a whole vector in a single indirect load.  This saves a bunch
of round-trips to the TMU, instructions for setting up multiple loads,
references to the UBO base in the uniforms, and apparently manages to
reduce register pressure as well.

instructions in affected programs: 3086665 -> 2454967 (-20.47%)
uniforms in affected programs: 919581 -> 721039 (-21.59%)
threads in affected programs: 1710 -> 3420 (100.00%)
spills in affected programs: 596 -> 522 (-12.42%)
fills in affected programs: 680 -> 562 (-17.35%)

Improves 3dmmes performance by 2.29312% +/- 0.139825% (n=5)
2019-01-04 15:41:23 -08:00
Eric Anholt
f8a8de8b9a v3d: Do UBO loads a vector at a time.
In the process of adding support for SSBOs and CS shared vars, I ended up
needing a helper function for doing TMU general ops.  This helper can be
that starting point, and saves us a bunch of round-trips to the TMU by
loading a vector at a time.
2019-01-04 15:41:23 -08:00
Eric Anholt
b0e0086257 v3d: Remove dead switch cases and comments from v3d_nir_lower_io.
Moving things to NIR left this mess around.  All we lower now is uniforms.
2019-01-04 15:41:23 -08:00
Eric Anholt
f8e6b364b0 v3d: Fix up VS output setup during precompiles.
I noticed that a VS I was debugging was missing all of its output stores
-- outputs_written was for POS, VAR0, VAR3, while the shader's variables
were POS, VAR9, and VAR12.  I'm not sure what outputs_written is supposed
to be doing here, but we can just walk the declared variables and avoid
both this bug and the emission of extra stvpms for less-than-vec4
varyings.
2019-01-04 15:41:23 -08:00
Eric Anholt
e1385e879d v3d: Reinstate the new shader-db output after v3d_compile() refactor.
I misplaced it in the rebase conflicts.
2019-01-04 15:26:19 -08:00
Caio Marcelo de Oliveira Filho
bbf9ee9b18 nir: remove dead code from copy_prop_vars
When copy_prop_vars also took care of dead write handling, intrin was
used as part of store_to_entry.  Now it isn't, so this assignment
isn't used really used.  Add a comment clarifying what happens to
intrin.

Fixes: 4dfa7adc10 "nir: Remove handling of dead writes from copy_prop_vars"
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-04 15:18:41 -08:00
Lionel Landwerlin
31e4c9ce40 i965: add CS stall on VF invalidation workaround
Even with the previous commit, hangs are still happening. The problem
there is that the VF cache invalidate does happen immediately without
waiting for previous rendering to complete. What happens is that we
invalidate the cache the moment the PIPE_CONTROL is parsed but we
still have old rendering in the pipe which continues to pull data into
the cache with the old high address bits. The later rendering with the
new high address bits then doesn't have the clean cache that it
expects/needs.

v2: Update commit message/explanation with Jason's

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: a363bb2cd0 ("i965: Allocate VMA in userspace for full-PPGTT systems.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109072
2019-01-04 11:18:54 +00:00
Lionel Landwerlin
92b7407090 i965: include draw_params/derived_draw_params for VF cache workaround
These buffers are using VB slots and should be included in the
workaround decision.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: a363bb2cd0 ("i965: Allocate VMA in userspace for full-PPGTT systems.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109072
2019-01-04 11:18:54 +00:00
Lionel Landwerlin
da634a4acb intel/blorp: emit VF caching workaround before 3DSTATE_VERTEX_BUFFERS
Probably no difference but it's nice to have i965 & blorp emit things
in the same order.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-04 11:18:51 +00:00
Lionel Landwerlin
e5ed217545 i965: limit VF caching workaround to gen8/9/10
Documentation of the 3DSTATE_VERTEX_BUFFERS packet says this is only
needed before ICL.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-04 11:18:48 +00:00
Andres Gomez
f0312cfa93 glsl/linker: complete documentation for assign_attribute_or_color_locations
Commit 27f1298b9d ("glsl/linker: validate attribute aliasing before optimizations")
forgot to complete the documentation.

Cc: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-04 09:04:31 +02:00
Gurchetan Singh
6b7aea9d85 virgl: remove empty file
Fixes: 174f53 ("virgl: consolidate transfer code")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-03 20:59:29 +01:00
Gurchetan Singh
ca66457b05 virgl: don't flush an empty range
Otherwise, the gl-1.0-long-dlist Piglit test crashes.

Fixes: db7757 ("virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT")
Reported by airlied@

v2: Exit on any invalid range (Erik)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109190
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2019-01-03 20:59:29 +01:00
Eric Engestrom
393a756e6a docs: advertise distro-provided meson cross-files
Hopefully we can kick start the revolution and other distros will start
providing them as well :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-03 18:53:21 +00:00
Eric Engestrom
8b363bc42e docs: fix the meson aarch64 cross-file
`gcc-ar` is preferred over the generic `ar`, and the `arm` family is
for 32-bit ARM [1].

[1] https://mesonbuild.com/Reference-tables.html#cpu-families

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-03 18:53:21 +00:00
Jakob Bornecrantz
6a9be6fc0c virgl/vtest: Use default socket name from protocol header
No functional change as the socket name is the same,
just removing the double definition of the path.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2019-01-03 15:50:38 +00:00
Rob Clark
e869481ef3 freedreno: fix staging resource size for arrays
A 2d-array texture (for example), should get the # of array elements
from box->depth, rather than depth0 which is minified.
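
The usual mip minification rule, max(1, size >> level), shows why a minified value is the wrong source for a layer count. A minimal sketch with a hypothetical helper name:

```c
#include <assert.h>

/* Size of a dimension at a given mip level: halve per level, never
 * dropping below 1. */
unsigned minify(unsigned size, unsigned level)
{
    assert(level < 32);
    unsigned v = size >> level;
    return v ? v : 1;
}
```

Array layers do not shrink with the mip level, so a count taken through this rule would be too small at any level above 0, whereas box->depth carries the number of layers actually being transferred.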

Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray_bias_float_fragment
with tiled textures.

Reported-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:11:40 -05:00
Rob Clark
67a7f6f244 freedreno: remove blit_via_copy_region()
If we hit the memcpy() path for copy_region(), that will try to do a
transfer_map(), which goes badly for blits to/from staging triggered
by transfer_map() or transfer_unmap().

We could possibly add fd_blit2() which has allow_transfer_map param,
and call that for staging blits.  But I'm not really sure if trying
the blit via copy_region() is very useful.  At least for newer gens
that implement fd_context::blit(), it probably isn't.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:10:32 -05:00
Rob Clark
2fc17e16a3 freedreno/a6xx: rework blitter API
Switch over to using fd_context::blit(), in the same way that a5xx does.
The previous patch wires fd_resource_copy_region() up to the blitter so
a6xx no longer needs to bypass the core layer to accelerate this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:10:23 -05:00
Rob Clark
53b8eb78d5 freedreno: try blitter for fd_resource_copy_region()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:10:16 -05:00
Rob Clark
228eddd7ee freedreno: rework blit API
First step to unify the way fd5 and fd6 blitter works.  Currently a6xx
bypasses the blit API in order to also accelerate resource_copy_region()

But this approach can lead to infinite recursion:

  #0  fd_alloc_staging (ctx=0x5555936480, rsc=0x7fac485f90, level=0, box=0x7fbab29220) at ../src/gallium/drivers/freedreno/freedreno_resource.c:291
  #1  0x0000007fbdebed04 in fd_resource_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/drivers/freedreno/freedreno_resource.c:479
  #2  0x0000007fbe5c5068 in u_transfer_helper_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/auxiliary/util/u_transfer_helper.c:243
  #3  0x0000007fbde2dcb8 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47c780, src_level=0, src_box_in=0x7fbab2945c) at ../src/gallium/auxiliary/util/u_surface.c:350
  #4  0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173
  #5  0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587
  #6  0x0000007fbde2f3d0 in util_try_blit_via_copy_region (ctx=0x5555936480, blit=0x7fbab29430) at ../src/gallium/auxiliary/util/u_surface.c:864
  #7  0x0000007fbdec02c4 in fd_blit (pctx=0x5555936480, blit_info=0x7fbab29588) at ../src/gallium/drivers/freedreno/freedreno_resource.c:993
  #8  0x0000007fbdf08408 in fd6_blit (pctx=0x5555936480, info=0x7fbab29588) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:546
  #9  0x0000007fbdebdc74 in do_blit (ctx=0x5555936480, blit=0x7fbab29588, fallback=false) at ../src/gallium/drivers/freedreno/freedreno_resource.c:129
  #10 0x0000007fbdebe58c in fd_blit_from_staging (ctx=0x5555936480, trans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:326
  #11 0x0000007fbdebea38 in fd_resource_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:416
  #12 0x0000007fbe5c5c68 in u_transfer_helper_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/auxiliary/util/u_transfer_helper.c:516
  #13 0x0000007fbde2de24 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47b8e0, src_level=0, src_box_in=0x7fbab2997c) at ../src/gallium/auxiliary/util/u_surface.c:376
  #14 0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173
  #15 0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587
  ...

Instead rework the API to push the fallback back to core code, so that
we can rework resource_copy_region() to have its own fallback path,
and then finally convert fd6 over to work in the same way.

This also makes ctx->blit() optional, and cleans up some unnecessary
callers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:09:52 -05:00
Rob Clark
f1c88336e6 freedreno: skip depth resolve if not written
For multi-pass rendering, it is common to keep the same depth buffer
from previous pass, to discard geometry that would be hidden by later
draws.  In the later passes with depth-test enabled, but depth-write
disabled, there is no reason to do gmem2mem resolve.

TODO probably do something similar for stencil.. although stencil
buffer isn't used as commonly these days

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:09:24 -05:00
Timothy Arceri
4d3f6cb973 nir: merge some basic consecutive ifs
After trying multiple times to merge if-statements with phis
between them I've come to the conclusion that it cannot be done
without regressions. The problem is for some shaders we end up
with a whole bunch of phis for the merged ifs resulting in
increased register pressure.

So this patch just merges ifs that have no phis between them.
This seems to be consistent with what LLVM does so for radeonsi
we only see a change (although it's a large change) in a single
shader.
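
In source-level terms, the transformation looks roughly like the sketch below. It is a C analogy, not NIR: the stores go to memory, so nothing written in the first if needs a phi before the second one. All names are hypothetical.

```c
#include <assert.h>

/* Before: two consecutive ifs testing the same condition. */
void separate_ifs(int cond, int a, int b, int out[2])
{
    assert(out);
    if (cond)
        out[0] = a;
    if (cond)
        out[1] = b;
}

/* After: one if, so the condition is evaluated and branched on once. */
void merged_if(int cond, int a, int b, int out[2])
{
    assert(out);
    if (cond) {
        out[0] = a;
        out[1] = b;
    }
}

/* Helper: runs both forms and reports whether the results agree. */
int merge_preserves_result(int cond, int a, int b)
{
    int x[2] = {0, 0}, y[2] = {0, 0};
    separate_ifs(cond, a, b, x);
    merged_if(cond, a, b, y);
    return x[0] == y[0] && x[1] == y[1];
}
```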

Shader-db results i965 (SKL):

total instructions in shared programs: 13098176 -> 13098152 (<.01%)
instructions in affected programs: 1326 -> 1302 (-1.81%)
helped: 4
HURT: 0

total cycles in shared programs: 332032989 -> 332037583 (<.01%)
cycles in affected programs: 60665 -> 65259 (7.57%)
helped: 0
HURT: 4

The cycles estimates reported by shader-db for i965 seem inaccurate
as the only difference in the final code is the removal of the
redundant condition evaluations and jumps.

Also the biggest code reduction (~7%) for radeonsi was in a Tomb
Raider TressFX shader but for some reason this does not get merged
for i965.

Shader-db results radeonsi (VEGA):

Totals from affected shaders:
SGPRS: 232 -> 232 (0.00 %)
VGPRS: 164 -> 164 (0.00 %)
Spilled SGPRs: 59 -> 59 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 14584 -> 13520 (-7.30 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13 -> 13 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-01-03 15:17:16 +11:00
Timothy Arceri
19cafe8084 nir: add rewrite_phi_predecessor_blocks() helper
This will also be used by the if merge pass in the following commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-01-03 15:17:16 +11:00
Timothy Arceri
5122fbc4ba nir: simplify does_varying_match()
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-03 11:47:56 +11:00
Timothy Arceri
8d05ee2005 nir: make use of does_varying_match() helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-03 11:47:56 +11:00
Timothy Arceri
0016166d19 nir: make nir_opt_remove_phis_impl() static
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-03 11:47:56 +11:00
Eric Anholt
d2b899c0ec v3d: Refactor compiler entrypoints.
Before, I had per-stage entrypoints with some helpers shared between them.
As I extended for compute shaders and shader-db, it turned out that the
other common code in the middle wanted to be shared too.
2019-01-02 14:12:29 -08:00
Eric Anholt
0805060573 v3d: Handle dynamically uniform IF statements with uniform control flow.
Loops will be trickier, since we need some analysis to figure out if the
breaks/continues inside are uniform.  Until we get that in NIR, this gets
us some quick wins.

total instructions in shared programs: 6192844 -> 6174162 (-0.30%)
instructions in affected programs: 487781 -> 469099 (-3.83%)
2019-01-02 14:12:29 -08:00
Eric Anholt
5e9ee6e841 v3d: Fold comparisons for IF conditions into the flags for the IF.
total instructions in shared programs: 6193810 -> 6192844 (-0.02%)
instructions in affected programs: 800373 -> 799407 (-0.12%)
2019-01-02 14:12:29 -08:00
Eric Anholt
078dc176bc v3d: Don't try to fold non-SSA-src comparisons into bcsels.
There could have been a write of a src in between the comparison and the
bcsel that would invalidate the comparison.
2019-01-02 14:12:29 -08:00
Eric Anholt
2e0433b687 v3d: Move the "Find the ALU instruction generating our bool" out of bcsel.
This will be reused for if statements.
2019-01-02 14:12:29 -08:00
Eric Anholt
c3ae0aa264 v3d: Simplify the emission of comparisons for the bcsel optimization.
I wanted to reuse the comparison stuff for nir_ifs, but for that I just
want the flags and no destination value.  Splitting the conditions from
the destinations ended up cleaning the existing code up, anyway.
2019-01-02 14:12:29 -08:00
Eric Anholt
49d8e2aff1 v3d: Don't forget to include RT writes in precompiles.
Looking at some assembly dumps for an optimization, we were clearly
missing important parts of the shader!
2019-01-02 14:12:29 -08:00
Eric Anholt
3a81c753a3 v3d: Fix segfault when failing to compile a program.
We'll still fail at draw time, but this avoids a regression in shader-db
execution once I enable TLB writes in precompiles.

Fixes: b38e4d313f ("v3d: Create a state uploader for packing our shaders together.")
2019-01-02 14:12:29 -08:00
Marek Olšák
3ae57957be radeonsi: always unmap texture CPU mappings on 32-bit CPU architectures
Team Fortress 2 32-bit version runs out of the CPU address space.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:59 -05:00
Marek Olšák
edfca1f8dc radeonsi: remove unused variables in si_insert_input_ptr
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:58 -05:00
Marek Olšák
cba475b3e7 radeonsi: use u_decomposed_prims_for_vertices instead of u_prims_for_vertices
It seems to be the same, but this doesn't use integer division with
a variable divisor.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:56 -05:00
Marek Olšák
54bc87469a radeonsi: make si_cp_wait_mem more configurable
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:54 -05:00
Marek Olšák
9d2c3a1fe0 radeonsi: call si_fix_resource_usage for the GS copy shader as well
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:53 -05:00
Marek Olšák
d28e208213 radeonsi: don't emit redundant PKT3_NUM_INSTANCES packets
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:50 -05:00
Caio Marcelo de Oliveira Filho
7d6babf995 nir: add a way to print the deref chain
Makes debugging easier when we care about the deref chain and not the
deref instruction itself.  To make it take a const pointer, constify
some of the static functions in nir_print.c.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 10:09:04 -08:00
Dylan Baker
a2596450ac meson: Error out if building nouveau and using LLVM without rtti
Nouveau requires rtti. Often LLVM is configured without rtti, and code
with and without cannot be linked safely. Let's just error out if nouveau
is requested and llvm is built without rtti.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109202
Fixes: c5a97d658e
       ("meson: fix builds against LLVM built without rtti")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-02 09:30:12 -08:00
Alexander von Gluck IV
1b97a72328 egl/haiku: Fix reference to disp vs dpy
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 00992700c9 "egl: set the EGLDevice when creating a display"
2019-01-02 13:45:09 +00:00
Iago Toral Quiroga
ec79069856 compiler/spirv: use 32-bit polynomial approximation for 16-bit asin()
The 16-bit polynomial execution doesn't meet Khronos precision requirements.
Also, the half-float denorm range starts at 2^(-14) and with asin taking input
values in the range [0, 1], polynomial approximations can lead to flushing
relatively easily.

An alternative is to use the atan2 formula to compute asin, which is the
reference taken by Khronos to determine precision requirements, but that
ends up generating too many additional instructions when compared to the
polynomial approximation. Specifically, for the Intel case, doing this
adds +41 instructions to the program for each asin/acos call, which looks
like an undesirable trade off.

So for now we take the easy way out and fall back to using the 32-bit
polynomial approximation, which is better (faster) than the 16-bit atan2
implementation and gives us better precision that matches Khronos
requirements.
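
The fallback can be modelled roughly as follows. This is a Python
sketch, not the SPIR-V front-end code: `build_asin()`, `asin_poly()`,
`to_f16()` and `to_f32()` are hypothetical names, the polynomial
coefficients are illustrative, and because Python floats are doubles
only the input and the result are rounded to model the conversions.

```python
import math
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def asin_poly(x):
    """Polynomial asin approximation (coefficients are illustrative)."""
    ax = abs(x)
    inner = math.pi / 4 - 1 + ax * (0.086566724 + ax * -0.03102955)
    result = math.pi / 2 - math.sqrt(1.0 - ax) * (math.pi / 2 + ax * inner)
    return math.copysign(result, x)


def build_asin(x, bit_size):
    """Evaluate asin at the given bit size, recursing to 32 bits for 16."""
    if bit_size == 16:
        # Widen the input, evaluate at 32 bits, then narrow the result.
        return to_f16(build_asin(to_f32(x), 32))
    return to_f32(asin_poly(x))
```

The recursion keeps a single polynomial implementation: the 16-bit case
only adds the two conversions around the 32-bit path.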

v2:
 - Fall back to 32-bit using recursion (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:39 +01:00
Iago Toral Quiroga
fda3f6d424 compiler/spirv: implement 16-bit frexp
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:35 +01:00
Iago Toral Quiroga
7d3c34197a compiler/spirv: implement 16-bit hyperbolic trigonometric functions
v2:
 - use nir_fadd_imm and nir_fmul_imm helpers (Jason)

v3:
 - since we need to define one for fsub use it for fdiv too (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
88663ba67c compiler/spirv: implement 16-bit exp and log
v2:
 - use nir_fmul_imm helper (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
f18554e2ce compiler/spirv: implement 16-bit atan2
v2:
 - fix huge_val for 16-bit, it was meant to be 2^14 not 10^14.

v3:
 - rebase on top of new bool sized opcodes
 - use nir_b2f helper
 - use nir_fmul_imm helper

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
1c8de08ec9 compiler/spirv: implement 16-bit atan
v2:
 - use nir_fadd_imm and nir_fmul_imm helpers (Jason)
 - rebased on top of new sized boolean opcodes
 - use nir_b2f helper

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
df118535ca compiler/spirv: implement 16-bit acos
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
dbbbe24d76 compiler/spirv: implement 16-bit asin
v2:
  - use nir_fmul_imm and nir_fadd_imm helpers (Jason)

v3:
 - missed one case where we need to replace nir_imm_float
   with nir_imm_floatN_t (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
95b7c29c2c compiler/spirv: handle 16-bit float in radians() and degrees()
v2:
 - use nir_fmul_imm helper (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
aeee683780 compiler/nir: add nir_fadd_imm() and nir_fmul_imm() helpers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
5fc9ad1cb0 compiler/nir: add a nir_b2f() helper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Timothy Arceri
70be9afccb nir: link time opt duplicate varyings
If we are outputting the same value to more than one output
component rewrite the inputs to read from a single component.

This will allow the duplicate varying components to be optimised
away by the existing opts.
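
A small Python model of the rewrite, under stated assumptions:
`dedup_varyings()` is a hypothetical helper, slots are modelled as
`(location, component)` tuples, and values are identified by plain
strings. The real pass works on NIR stores and loads instead.

```python
def dedup_varyings(producer_outputs, consumer_reads):
    """Redirect consumer reads of duplicated outputs to one component.

    producer_outputs maps an output slot to the id of the value stored
    there; consumer_reads lists the slots the next stage loads from.
    Every read of a slot whose value is also stored in an earlier slot
    is rewritten to that earlier slot.
    """
    first_slot_for_value = {}
    for slot in sorted(producer_outputs):
        # Keep the lowest slot seen for each distinct value.
        first_slot_for_value.setdefault(producer_outputs[slot], slot)

    rewritten = []
    for slot in consumer_reads:
        if slot in producer_outputs:
            slot = first_slot_for_value[producer_outputs[slot]]
        rewritten.append(slot)
    return rewritten
```

After the rewrite nothing reads the duplicate components any more, which
is what lets the existing passes remove them.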

shader-db results i965 (SKL):

total instructions in shared programs: 12869230 -> 12860886 (-0.06%)
instructions in affected programs: 322601 -> 314257 (-2.59%)
helped: 3080
HURT: 8

total cycles in shared programs: 317792574 -> 317730593 (-0.02%)
cycles in affected programs: 2584925 -> 2522944 (-2.40%)
helped: 2975
HURT: 477

shader-db results radeonsi (VEGA):

SGPRS: 31576 -> 31664 (0.28 %)
VGPRS: 17484 -> 17064 (-2.40 %)
Spilled SGPRs: 184 -> 167 (-9.24 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 583340 -> 569368 (-2.40 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 6162 -> 6270 (1.75 %)
Wait states: 0 -> 0 (0.00 %)

vkpipeline-db results RADV (VEGA):

Totals from affected shaders:
SGPRS: 14880 -> 15080 (1.34 %)
VGPRS: 10872 -> 10888 (0.15 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 674016 -> 668396 (-0.83 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 2708 -> 2704 (-0.15 %)
Wait states: 0 -> 0 (0.00 %)

V2: bunch of tidy ups suggested by Jason

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
d828694b80 nir: rework nir_link_opt_varyings()
This just cleans things up a little and makes things safer for
derefs.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
c0aba8b0dc nir: add can_replace_varying() helper
This will be reused by the following patch.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
50de3f80a8 nir: rename nir_link_constant_varyings() to nir_link_opt_varyings()
The following patches will add support for an additional
optimisation so this function will no longer just optimise varying
constants.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
0a4378ce56 st/glsl_to_nir: call nir_lower_load_const_to_scalar() in the st
This will help the new opt introduced in the following patches
allowing us to remove extra duplicate varyings.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
2ef0f944f5 radeonsi: make use of ac_are_tessfactors_def_in_all_invocs()
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 10:01:31 +11:00
Timothy Arceri
2832bc972b ac/nir_to_llvm: add ac_are_tessfactors_def_in_all_invocs()
The following patch will use this with the radeonsi NIR backend
but I've added it to ac so we can use it with RADV in future.

This is a NIR implementation of the tgsi function
tgsi_scan_tess_ctrl().

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 10:01:24 +11:00
Timothy Arceri
2817a4ec0b radeonsi: remove unrequired param in si_nir_scan_tess_ctrl()
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 10:01:15 +11:00
Timothy Arceri
4dda445750 tgsi/scan: correctly walk instructions in tgsi_scan_tess_ctrl()
The previous code used a do while loop and continues after walking
a nested loop/if-statement. This means we end up evaluating the
last instruction from the nested block against the while condition
and potentially exit early if it matches the exit condition of the
outer block.

Fixes: 386d165d8d ("tgsi/scan: add a new pass that analyzes tess factor writes")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 09:53:01 +11:00
Timothy Arceri
dd061eb044 tgsi/scan: fix loop exit point in tgsi_scan_tess_ctrl()
This just happened not to crash/assert because all loops have at
least 1 if-statement and due to a second bug we end up matching
the same ENDIF to exit both the iteration over the if-statement
and the loop.

The second bug is fixed in the following patch.

Fixes: 386d165d8d ("tgsi/scan: add a new pass that analyzes tess factor writes")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 09:53:01 +11:00
Ilia Mirkin
8f98ff362c nv30: disable rendering to 3D textures
There's no way to tell the 3D engine about swizzling on such textures.
While rendering to NPOT ones may be possible, there's no great way to
expose that in gallium, nor would there be any practical benefit.

Fixes the non-compressed-format "copyteximage 3D" failures. Something
odd going on with the compressed formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-01 15:11:14 -05:00
Bas Nieuwenhuizen
8c93ef5de9 radv: Do a cache flush if needed before reading predicates.
This caused random failures for two conditional rendering tests:

dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_discard
dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_no_discard

These wrote the predicate with the vertex shader, did a barrier and then
started the conditional rendering. However the cache flushes for the barrier
only happen on first draw, so after the predicate has been read.

Fixes: e45ba51ea4 "radv: add support for VK_EXT_conditional_rendering"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-12-31 20:52:08 +01:00
Erik Faye-Lund
86089a7316 anv/autotools: make sure tests link with -msse2
Without this, I get the following error when building the tests with
autotools on i686:

---8<---
src/intel/common/gen_clflush.h: In function ‘gen_clflush_range’:
src/intel/common/gen_clflush.h:37:7: warning: implicit declaration of function ‘__builtin_ia32_clflush’; did you mean ‘__builtin_ia32_pause’? [-Wimplicit-function-declaration]
       __builtin_ia32_clflush(p);
       ^~~~~~~~~~~~~~~~~~~~~~
       __builtin_ia32_pause
src/intel/common/gen_clflush.h: In function ‘gen_flush_range’:
src/intel/common/gen_clflush.h:45:4: warning: implicit declaration of function ‘__builtin_ia32_mfence’; did you mean ‘__builtin_ia32_fnclex’? [-Wimplicit-function-declaration]
    __builtin_ia32_mfence();
    ^~~~~~~~~~~~~~~~~~~~~
    __builtin_ia32_fnclex
---8<---

The errors are generated for each of these files:
- mesa/src/intel/vulkan/tests/state_pool_no_free.c
- mesa/src/intel/vulkan/tests/state_pool.c
- mesa/src/intel/vulkan/tests/block_pool_no_free.c
- mesa/src/intel/vulkan/tests/state_pool_free_list_only.c

This is obviously because gen_clflush.h contains code that uses
intrinsics that are only available with SSE2. Since the driver already
uses SSE2, it seems reasonable to add this to the tests as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Eric Engeström <eric@engestrom.ch>
2018-12-31 17:28:21 +01:00
Erik Faye-Lund
89679e18a9 anv/meson: make sure tests link with -msse2
Without this, I get the following error when building the tests using
meson on i686:

---8<---
In file included from ../../../mesa/src/intel/vulkan/anv_private.h:46,
                 from ../../../mesa/src/intel/vulkan/tests/state_pool_no_free.c:26:
../../../mesa/src/intel/common/gen_clflush.h: In function ‘gen_clflush_range’:
../../../mesa/src/intel/common/gen_clflush.h:37:7: error: implicit declaration of function ‘__builtin_ia32_clflush’; did you mean ‘__builtin_ia32_pause’? [-Werror=implicit-function-declaration]
       __builtin_ia32_clflush(p);
       ^~~~~~~~~~~~~~~~~~~~~~
       __builtin_ia32_pause
../../../mesa/src/intel/common/gen_clflush.h: In function ‘gen_flush_range’:
../../../mesa/src/intel/common/gen_clflush.h:45:4: error: implicit declaration of function ‘__builtin_ia32_mfence’; did you mean ‘__builtin_ia32_fnclex’? [-Werror=implicit-function-declaration]
    __builtin_ia32_mfence();
    ^~~~~~~~~~~~~~~~~~~~~
    __builtin_ia32_fnclex
---8<---

The errors are generated for each of these files:
- mesa/src/intel/vulkan/tests/state_pool_no_free.c
- mesa/src/intel/vulkan/tests/state_pool.c
- mesa/src/intel/vulkan/tests/block_pool_no_free.c
- mesa/src/intel/vulkan/tests/state_pool_free_list_only.c

This is obviously because gen_clflush.h contains code that uses
intrinsics that are only available with SSE2. Since the driver already
uses SSE2, it seems reasonable to add this to the tests as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-31 17:27:33 +01:00
Ilia Mirkin
207fb558e4 nv30: fix some s3tc layout issues
s3tc layouts are a bit finicky - they're packed, but not swizzled.
Adjust logic to allow for that case:

  - Don't set a uniform pitch for POT-sized compressed textures
  - Adjust define_rect API to be less confused about block sizes
  - Only mark a texture as linear if it has a uniform pitch set

This has been tested to fix xonotic (as well as the s3tc-* piglits)
on nv3x and keeps it working on nv4x.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 23:32:21 -05:00
Ilia Mirkin
ad251330e8 nv30: use correct helper to get blocks in y direction
This doesn't matter since all compressed formats supported by this
hardware use square blocks, but best to use the correct helper.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 23:32:21 -05:00
Ilia Mirkin
b04c1907c8 nv30: add support for multi-layer transfers
This logic mirrors what we do on nv50. The relatively new
texture_subdata callback can cause this to happen with 3D textures,
which is triggered at least by xonotic, and probably many piglits.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 23:32:21 -05:00
Ilia Mirkin
b34cfd4749 nv30: fix rare issue with fp unbinding not finding the bufctx
If the last-active context gets deleted, the pushbuf doesn't have a
bufctx to reference. Then there could be a sequence of binds which would
trigger a reset on that bin before validation was done. Instead we just
pass in the bufctx in question directly.

All other instances of PUSH_RESET happen strictly after a validation is
run.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 19:44:43 -05:00
Ilia Mirkin
ef3eac9545 nv30: avoid setting user_priv without setting cur_ctx
The whole user_priv thing is a mess, but as long as it's there, it
basically has to map 1:1 to the cur_ctx. Unfortunately we were setting
user_priv to some context, then that context could get deleted without
any draws/validations in it, leading user_priv to become NULL, with
cur_ctx still pointing at some old context. Then we wouldn't run the
switch logic, which in turn led to a NULL bufctx being dereferenced.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 19:44:43 -05:00
Eric Anholt
ad1e59cf8d v3d: Add support for gl_HelperInvocation.
We can just look at the MSF flags -- if they're unset, then we're
definitely in a helper invocation.  Fixes
dEQP-GLES31.functional.shaders.helper_invocation.* with GLES3.1 enabled.
2018-12-30 08:05:11 -08:00
Eric Anholt
20021e3473 v3d: Add support for textureSize() on MSAA textures.
Fixes failures in
dEQP-GLES31.functional.shaders.builtin_functions.texture_size.samples_1_texture_2d
in the GLES3.1 suite.
2018-12-30 08:05:11 -08:00
Eric Anholt
f695d62fe5 v3d: Add support for requesting the sample offsets.
2018-12-30 08:05:11 -08:00
Eric Anholt
906fca1b4b v3d: Add support for non-constant texture offsets.
Fixes
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.clamp_to_edge_repeat
and others.
2018-12-30 08:05:11 -08:00
Eric Anholt
47caefc7b4 v3d: Force sampling from base level for tg4.
This is what the GLSL ES 310 spec tells us to do, but apparently the
"gather mode" flag doesn't imply it in the HW.  Fixes
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear
2018-12-30 08:05:11 -08:00
Eric Anholt
f9bdce9966 v3d: Add a note for a potential performance win on multop/umul24.
Noticed while debugging a testcase.
2018-12-30 08:05:11 -08:00
Eric Anholt
b36757448d v3d: Dead-code eliminate unused flags updates.
The greedy comparison folding in bcsel means that we may have left the
original bool-generating NIR ALU instruction dead, but DCE wasn't
eliminating the VIR code for it because of the flags updates.

total instructions in shared programs: 5186024 -> 5100894 (-1.64%)
instructions in affected programs: 1448695 -> 1363565 (-5.88%)
2018-12-30 08:05:11 -08:00
Eric Anholt
20e3526298 v3d: Don't generate temps for comparisons.
This was just generated work for vir_opt_dead_code and cluttered up the
dumps.
2018-12-30 08:04:54 -08:00
Eric Anholt
ebde5afb93 v3d: Move "does this instruction have flags" from sched to generic helpers.
I wanted to reuse it for DCE of flags updates.
2018-12-30 08:03:51 -08:00
Eric Anholt
39b1112189 v3d: Drop incorrect dependency for flpop.
It is just shifting probably-means-flags bits out of a value, it doesn't
actually update the flags on its own.
2018-12-30 08:03:51 -08:00
Eric Anholt
a7c9fd7573 v3d: Drop unused count_nir_instrs() helper.
This was for shader-db, but I haven't cared about NIR instruction counts
in a long time.
2018-12-30 08:03:51 -08:00
Eric Anholt
696f63f1b4 v3d: Hook up some shader-db output to GL_ARB_debug_output.
This allows the original shader-db project's run.c runner to parse things
easily, and is probably a good thing to have for GL_ARB_debug_output in
general.  I formatted it more like Intel's so I can mostly reuse their
report script.
2018-12-30 08:03:51 -08:00
Eric Anholt
87b251a940 v3d: Add a "precompile" debug flag for shader-db.
I've been using my apitrace-based shader-db so far, but it's slow
(apitrace decompression), intrusive (apitrace windows spamming the
screen), and doesn't have much coverage.  The original shader-db provides
a lot more coverage and compiles faster, at the expense of not having the
actual runtime variant key.  As v3d has a lot less runtime variation than
vc4 did, this tradeoff makes more sense.
2018-12-29 13:52:09 -08:00
Eric Anholt
9ec6a3d621 v3d: Fix uniform pretty printing assertion failure with branches.
Fixes: 248a7fb392 ("v3d: Do uniform pretty-printing in the QPU dump.")
2018-12-29 13:52:09 -08:00
Dylan Baker
133a5b8383 meson: Override C++ standard to gnu++11 when building with altivec on ppc64
Otherwise there will be symbol collisions for the vector name.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108943
Distro Bug: https://bugs.gentoo.org/673622
Fixes: 42ea0631f1
       ("meson: build clover")
Acked-by: Matt Turner <mattst88@gmail.com>
2018-12-28 11:04:57 -08:00
Lionel Landwerlin
f7bccf6ab4 intel/aub_viewer: highlight true booleans
Useful to spot PIPE_CONTROL flags.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:46 +00:00
Lionel Landwerlin
6ba61ea391 intel/aub_viewer: fold binding/sampler table items
Makes things easier to read rather than a long block of text.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:43 +00:00
Lionel Landwerlin
7ab8c80625 intel/aub_viewer: fix shader view
Not decoding the shader at the right offset.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:40 +00:00
Lionel Landwerlin
f3ed4a058d intel/aub_viewer: print address of missing shader
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:21 +00:00
Lionel Landwerlin
0382e11989 intel/aub_viewer: fixup 0x address prefix
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:18 +00:00
Lionel Landwerlin
8e2fda411a intel/aub_viewer: fix shader get_bo
Instruction addresses are always in ppgtt space.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:08 +00:00
Nicholas Kazlauskas
e260493f2a radeonsi: Enable adaptive_sync by default for radeon
It's better to let most applications make use of adaptive sync
by default. Problematic applications can be placed on the blacklist
or the user can manually disable the feature.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2018-12-28 17:08:14 +01:00
Nicholas Kazlauskas
2e12fe425f loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property
The DDX driver can be notified of adaptive sync suitability by
flagging the application's window with the _VARIABLE_REFRESH property.

This property is set on the first swap the application performs
when adaptive_sync is set to true in the drirc.

It's performed here instead of when the loader is initialized for
two reasons:

(1) The window's drawable can be missing during loader init.
    This can be observed during the Unigine Superposition benchmark.

(2) Adaptive sync will only be enabled closer to when the application
    actually begins rendering.

If adaptive_sync is false then the _VARIABLE_REFRESH property
is deleted on loader init.

The property is only managed on the glx DRI3 backend for now. This
should cover most common applications and games on modern hardware.

Vulkan support can be implemented in a similar manner but would likely
require splitting the function out into a common helper function.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2018-12-28 16:44:47 +01:00
Nicholas Kazlauskas
a9c36dbf9c drirc: Initial blacklist for adaptive sync
Applications that don't present at a predictable rate (i.e. not games)
shouldn't have adaptive sync enabled. This list covers some of the
common desktop compositors, web browsers and video players.

[ Michel Dänzer: Added entry for firefox-esr ]

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2018-12-28 16:44:27 +01:00
Nicholas Kazlauskas
7407670036 util: Add adaptive_sync driconf option
This option lets the user decide whether mesa should notify the
window manager / DDX driver that the current application is adaptive
sync capable.

It's off by default.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2018-12-28 16:38:06 +01:00
Nicholas Kazlauskas
759b940389 util: Get program name based on path when possible
Some programs start with the path and command line arguments in
argv[0] (program_invocation_name). Chromium is an example of
an application using mesa that does this.

This tries to query the real path for the symbolic link /proc/self/exe
to find the program name instead. It only uses the realpath if it
was a prefix of the invocation to avoid breaking wine programs.
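
The selection logic described above can be sketched as a pure function.
This is a Python model with hypothetical names: `invocation` stands in
for program_invocation_name and `real_path` for the resolved target of
/proc/self/exe, which may be None if the lookup fails.

```python
def program_name(invocation, real_path):
    """Pick a program name from the invocation text and the real exe path.

    The real path is trusted only when it is a prefix of the invocation,
    so an invocation that names some other file (as with wine) keeps the
    name taken from the invocation itself.
    """
    slash = invocation.rfind("/")
    if slash == -1:
        # No path component at all: use the invocation as-is.
        return invocation
    if real_path and invocation.startswith(real_path):
        # argv[0] carries extra arguments; name the binary by its real path.
        return real_path.rsplit("/", 1)[-1]
    return invocation[slash + 1:]
```

For an invocation such as "/opt/chromium/chrome --type=gpu-process" with
the real path "/opt/chromium/chrome", the sketch yields "chrome" rather
than the tail of the argument string.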

Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-28 15:41:01 +01:00
Tomeu Vizoso
bf1dfcc3e8 etnaviv: Consolidate buffer references from framebuffers
We were leaking surfaces because the references taken in
etna_set_framebuffer_state weren't being released on context destroy.

Instead of just directly releasing those references in
etna_context_destroy, use the util_copy_framebuffer_state helper.

Take the chance to remove the duplicated buffer references in
compiled_framebuffer_state to avoid confusion.

The leak can be reproduced with a client that continuously creates and
destroys contexts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reported-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-12-28 10:22:01 +01:00
Dave Airlie
d1ce7eba8b virgl/vtest: fix front buffer flush with protocol version 0.
Versions of virglrenderer older than 33da7361aec486290df0aec4ad8dfa8ff6adde2c
misrender gears in vtest mode.

Fixes: 9d81cd8e7c (virgl: Pass resource size and transfer offsets)
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-28 16:50:38 +10:00
Dylan Baker
6adbd9ac74 docs/autoconf: Mark autoconf as being replaced
I know it's not what anyone wants, but how about we start with a
message in the documentation that encourages people to try meson.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-27 09:03:20 -08:00
Dylan Baker
4c32964f49 docs/install: Update python dependency section
Note that meson requires python 3, scons requires python 2, and
autotools works with either.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-27 09:03:20 -08:00
Dylan Baker
a57dbe6971 docs/meson: Update LLVM section with information about native files
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-27 09:03:17 -08:00
Dylan Baker
40ec5fec0a docs/install: Add meson to the main install page
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-27 09:03:07 -08:00
Juan A. Suarez Romero
fe7919acad docs: update calendar, add news item and link release notes for 18.2.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-12-27 17:37:33 +01:00
Juan A. Suarez Romero
0d53451890 docs: add sha256 checksums for 18.2.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 24c31bc0e2)
2018-12-27 17:35:04 +01:00
Juan A. Suarez Romero
008478e340 docs: add release notes for 18.2.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 785e09e3b3)
2018-12-27 17:35:02 +01:00
Ilia Mirkin
2269ab8588 nv50,nvc0: add missing CAPs for unsupported features
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26 20:28:07 -05:00
Ilia Mirkin
1d10bb2025 nvc0: enable GL_NV_shader_atomic_float on pre-Maxwell
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
0dd55db10f nv50/ir: add support for converting ATOMFADD to proper ir
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
9867f2a1f7 st/mesa: expose GL_NV_shader_atomic_float when ATOMFADD is supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
4d5a6a1649 st/mesa: select ATOMFADD when source type is float
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
d139231b32 gallium: add PIPE_CAP_TGSI_ATOMFADD to indicate support
ATOMFADD is a little special -- make drivers have to specify it
explicitly.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
5574414edc tgsi: add ATOMFADD operation
This is supported by at least NVIDIA hardware, and exposable via GL
extensions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
bac8534267 st/mesa: allow glDrawElements to work with GL_SELECT feedback
Not sure if this ever worked, but the current logic for setting the
min/max index is definitely wrong for indexed draws. While we're at it,
bring in all the usual logic from the non-indirect drawing path.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109086
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-12-26 19:30:33 -05:00
Eric Anholt
7d7ecfbcbc gallium/ttn: Fix setup of outputs_written.
We need a 64-bit value, otherwise we only handle the low 32, and happen to
sign-extend to claim to write all varying slots if VARYING_SLOT_VAR2 was
used.
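
The sign-extension hazard described above can be modeled outside of C. The following is a minimal Python sketch, not the driver code: it assumes a hypothetical slot whose bit lands on bit 31, and the helper names are illustrative. It only models slots below 32, since shifting a 32-bit int by 32 or more is undefined in C.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def as_int32(value):
    """Interpret the low 32 bits of value as a signed 32-bit C int."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def widen_to_u64(signed_value):
    """Convert a signed value to unsigned 64-bit, sign-extending negatives."""
    return signed_value & MASK64


def bit_32(slot):
    """Model of `1 << slot` done in 32-bit int arithmetic, then widened.

    Only meaningful for slot < 32.
    """
    return widen_to_u64(as_int32(1 << slot))


def bit_64(slot):
    """Model of a shift done at 64-bit width from the start."""
    return (1 << slot) & MASK64


HYPOTHETICAL_SLOT = 31  # assumption: any slot whose bit is the int sign bit

narrow = bit_32(HYPOTHETICAL_SLOT)
wide = bit_64(HYPOTHETICAL_SLOT)

print(hex(narrow))
print(hex(wide))
```

With the 32-bit shift, the single requested bit turns into bit 31 plus every bit above it, which is how one slot can end up claiming all the higher ones.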

Fixes: 4d0b2c7aaa ("ttn: Update shader->info as we generate code.")
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-26 11:42:09 -08:00
Lionel Landwerlin
e2ae5f2f0a anv: don't do partial resolve on layer > 0
We've made the choice not to use fast clears on layer > 0 with
multilayer images. This is partly because we would need to store
multiple clear colors for each layer, making the existing memory
layout, already including aux surfaces, fast clear color, image state,
etc... even more complex.

Partial resolves are the operations transferring the clear colors into
the auxiliary buffers. This operation is currently implemented in
Blorp by loading the clear color from the image's BO, into a shader
that then samples from the auxiliary buffer and writes the color only
if it isn't there already.

The problem here is that we store only one clear color for all
layers and it is used for partial resolves. If you trigger a partial
clear on a layer > 0, then you're likely to deal with a color that is
not what you actually want. In the particular issues below, we have
multiple layers, each cleared with a different color but the partial
resolve just writes the wrong color into the auxiliary buffers for
layers > 0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108910
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Cc: mesa-stable@lists.freedesktop.org
2018-12-24 09:42:46 +00:00
Axel Davy
c6b37e5412 st/nine: Increase the limit of cached ff shaders
100 is too small for some games, which triggers recompilations
every frame. Increase to 1024.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-12-23 08:14:50 +01:00
Axel Davy
104681c5d5 st/nine: Add src reference to nine_context_range_upload
Just like nine_context_box_upload, nine_context_range_upload
should reference the src, which holds the ram source buffer.

Fixes: https://github.com/iXit/Mesa-3D/issues/327
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
2018-12-23 08:14:50 +01:00
Axel Davy
42d672fa6a st/nine: Bind src not dst in nine_context_box_upload
nine_context_box_upload uploads a ram buffer (from src)
to a pipe_resource (dst).
We already have a refcount on the pipe_resource,
what needs to be protected from release is the ram buffer,
thus a reference to src.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
2018-12-23 08:14:50 +01:00
Axel Davy
f91f748fab st/nine: Fix volumetexture dtor on ctor failure
The dtor is called on allocation failure,
thus we must check the volumes are allocated
before trying to release them.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
2018-12-23 08:14:50 +01:00
Axel Davy
1cc8192ad0 st/nine: Switch to presentation buffer if resize is detected
This makes it possible to match the window size
on resize in all cases, as that currently only
works with presentation buffers.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-12-23 08:14:50 +01:00
Axel Davy
c442dd7890 st/nine: Use helper to release swapchain buffers later
This patch introduces a structure to release the
present_handles only when they are fully released
by the server, thus making
"DestroyD3DWindowBuffer" actually release the buffer
right away when called.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-12-23 08:14:50 +01:00
Rob Clark
51a44c3aac freedreno/a6xx: fix 3d texture layout
Maybe not 100% perfect, but seems to be a pretty good approximation of
that.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:29:15 -05:00
Rob Clark
8f60f1381d freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:28:50 -05:00
Rob Clark
be9ec158d7 freedreno/a6xx: improve setup_slices() debug msgs
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:28:24 -05:00
Rob Clark
2b497fc507 freedreno/a6xx: simplify special case for 3d layout
This logic can be re-written as the two cases for 3d (ie. before/after
the miplevel sizes start reducing) vs everything else.  I think it is
easier to read this way.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:27:57 -05:00
Rob Clark
d71a50f831 freedreno: combine fd_resource_layer_offset()/fd_resource_offset()
We really only need this logic in one place.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:27:37 -05:00
Rob Clark
6667dde098 freedreno/ir3: don't treat all inputs/outputs as vec4
This was a hold-over from the early TGSI days, and mostly not needed
with NIR.  This avoids burning an entire 4 consecutive scalar regs
for vec3 outputs, for example.  Which fixes a few places that we were
doing worse than we should on register usage.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:27:21 -05:00
Rob Clark
3453814622 freedreno/ir3: fix fallout of extra assert
Fixes the following crash that happened after d6110d4d

The problem happens if we first compile a "vanilla" shader with nothing
lowered in NIR, which performs the final lowering passes on
so->shader->nir (including nir_lower_locals_to_regs()), and then later we
compile a shader with some lowering.  The second time through we would
have already done nir_lower_locals_to_regs().

Arguably this was already a bug, just one we hadn't noticed yet.

Fixes: d6110d4d54 intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-21 19:04:22 -05:00
Kenneth Graunke
626f2477ab st/nir: Drop unused gl_program parameter in VS input handling helper.
Nobody uses this, so let's drop it.  This makes the helper callable
from places without a gl_program.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:32 -08:00
Kenneth Graunke
3a78b46e59 st/nir: Gather info after applying lowering FS variant features
DrawPixels lowering, for example, adds new varyings that need to be
accounted for in inputs_read.  The earlier info gathering at link time
cannot account for this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:30 -08:00
Kenneth Graunke
bcb6f19947 st/mesa: Combine the DrawPixels and Bitmap passthrough VS programs.
They're now identical, so we can just compile it once.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:29 -08:00
Kenneth Graunke
80dd9dfe33 st/mesa: Don't open code the drawpixels vertex shader.
Now that we always copy color, we can just use the util function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:28 -08:00
Kenneth Graunke
ed1a356c5e st/mesa: Drop !passColor optimization in drawpixels shaders.
The glDrawPixels passthrough vertex shader copies position and texcoord
vertex attributes to varying outputs.  It also optionally copies a third
gl_Color attribute, which sometimes is unnecessary.  Until now, we've
compiled separate variants of the shader, one of which does this extra
copy, and the other of which doesn't.  We have done this since 2007.

But, the vertex shader runs for a whopping four vertices, and so the
cost of copying a single input to output is likely inconsequential.
In theory, we could bind one fewer vertex element - but we always bind
all three regardless.  So, we don't even get that savings.

This patch unifies the two, so we always copy the optional color,
and save having to compile the variant.  It also makes the VS input
interface match up with the vertex element state without any dead
(unused) input attributes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:25 -08:00
Kenneth Graunke
42d31e0516 st/mesa: Drop dead 'passthrough_fs' field.
Dead since 2015 (commit 5142564734).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:20 -08:00
Bas Nieuwenhuizen
bba5749484 radv: Fix wrongly positioned paren.
Trivial.

Fixes: 9f0bfbed11 "radv: Work around non-renderable 128bpp compressed 3d textures on GFX9."
2018-12-21 21:06:55 +01:00
Dylan Baker
1e872d1486 docs: add note about using backticks for rbs in gitlab
So that gitlab will render the < and > correctly allowing the tag to be
copy-n-pasted without additional formatting.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-12-21 17:43:56 +00:00
Alex Deucher
516160d717 pci_ids: add new VegaM pci id
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-12-21 11:51:34 -05:00
Roland Scheidegger
171983dc89 gallivm: abort when trying to use non-existing intrinsic
Whenever llvm removes an intrinsic (we're using), we're hitting segfaults
due to llvm doing calls to address 0 in the jitted code instead.
However, Jose figured out we can actually detect this with
LLVMGetIntrinsicID(), so use this to abort, so we don't have to wonder
what got broken. (Of course, someone still needs to fix the code to
no longer use this intrinsic.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-12-21 17:37:00 +01:00
Roland Scheidegger
f3b1acff48 gallivm: don't use pavg.b intrinsic on llvm >= 6.0
This intrinsic disappeared with llvm 6.0; using it ends up in segfaults
(due to llvm issuing a call to a NULL address in the jitted shaders).
Add code doing the same thing as the autoupgrade code in llvm so it
can be matched and replaced back with a pavgb.
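
For reference, the per-byte operation that pavg.b performs is a rounding average of unsigned bytes. The sketch below is a plain Python model of that arithmetic, assuming the usual widen, add, add one, shift right, truncate sequence; it is not the gallivm code, and the function names are illustrative.

```python
def pavgb(a, b):
    """Rounding average of two unsigned bytes: (a + b + 1) >> 1."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    wide = a + b + 1          # computed wider than 8 bits so the carry survives
    return (wide >> 1) & 0xFF  # shift, then truncate back to a byte


def pavgb_narrow(a, b):
    """What goes wrong if the sum is truncated to 8 bits before the shift."""
    return ((a + b + 1) & 0xFF) >> 1


def pavgb_vector(xs, ys):
    """Apply the rounding average lane by lane."""
    return [pavgb(x, y) for x, y in zip(xs, ys)]


print(pavgb_vector([0, 1, 255], [1, 2, 255]))
print(pavgb_narrow(255, 255))
```

The widening step matters: without it the carry out of bit 7 is lost and the average of two large values comes out far too small.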

While here, also improve lp_test_format, so it tests both with and without
cache (as it was, it tested the cache versions only, whereas cache is
actually disabled in llvmpipe, and in any case even with it enabled
vertex and geometry shaders wouldn't use it). (Although at least for
the unorm8 uncached fetch, the code is still quite different to what
llvmpipe is using, since that would use unorm8x16 type, whereas
the test code is using unorm8x4 type, hence disabling some intrinsic
paths.)

Fixes: 6f4083143b ("gallivm: use llvm jit code for decoding s3tc")

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-12-21 17:35:05 +01:00
Emil Velikov
a8d020c3dc travis: meson: port gallium build combinations over
This commit adds a number of build combinations:

 - Gallium Drivers {SWR, RadeonSI, Others}
Each one has different LLVM requirements. Building SWR alone is twice
as slow as all other drivers combined.

 - Gallium ST Clover LLVM {5,6,7}
Because the C++ API changes all the time. Analogous to the above, building
Clover takes as much time as building all other STs combined.

 - Gallium ST Others
Nouveau is used, instead of i915g since meson has explicit target
tracking. Meaning that a configure error is thrown if we use i915g
with say va, vdpau or others.

Note: LLVM prior to 5.0 is intentionally dropped. If needed we can add
that later.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 01:34:59 +00:00
Emil Velikov
39634f2f35 travis: meson: add explicit handling to gallium ST
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 13:52:20 +00:00
Emil Velikov
51318c32fe travis: meson: explicitly control the DRI loaders
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 13:42:36 +00:00
Emil Velikov
e890aaabed travis: meson: add unwind handling
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-12 13:33:14 +00:00
Emil Velikov
266ae2225e travis: meson: use FOO_DRIVERS directly
It makes for a shorter MESON_OPTIONS and cleaner handling.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 13:18:54 +00:00
Dylan Baker
31c162ad22 travis: meson: enable unit tests
v2: [Emil] pass the argument directly to meson

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 10:34:51 -08:00
Dylan Baker
116f0fb216 travis: Don't try to read libdrm out of configure.ac
Since we're going to delete it shortly

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 11:09:21 -08:00
Dylan Baker
ecf96413bb travis: meson: use native files to override llvm-config
This is the supported way to do this, and should be more robust and
reliable.

v2: [Emil]
 - enable backslash escapes
 - don't hardcode the path
 - pass the argument directly to meson

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-11 10:40:25 -08:00
Emil Velikov
81173fd69f travis: printout llvm-config --version
Provides quick and easy feedback.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 10:38:20 +00:00
Emil Velikov
de72c1fe6c travis: meson: print the configured state
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 17:43:07 +00:00
Emil Velikov
7c38d7b7c8 travis: flip to distro xenial, drop sudo false
The latter is the default these days and Travis will be removing sudo
soonish.

Flipping to xenial, allows us to remove a bunch of hacks we have. Plus
it prevents us from adding new ones, to work around what seems like a
gcc/binutils bug. For example (from the upcoming meson build):

FAILED: ccache c++  -o src/gallium/targets/pipe-loader/pipe_r600.so ...
  ... src/util/libmesa_util.a ... /usr/lib/x86_64-linux-gnu/libz.so ...

src/util/libmesa_util.a(disk_cache.c.o): In function `deflate_and_write_to_disk':
_build/../src/util/disk_cache.c:746: undefined reference to `deflateInit_'
_build/../src/util/disk_cache.c:765: undefined reference to `deflate'
...

As we can see, even though libz.so is explicitly passed after the
object that requires it - the linker still fails to see the symbols.
Avoid all those situations - flip the switch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 11:20:41 +00:00
Emil Velikov
12187550f9 configure: add CXX11_CXXFLAGS to LLVM_CXXFLAGS
Seemingly with LLVM7 and GCC 5.0, the former won't properly advertise
-std=c++11 and the latter will choke.

Add this temporary workaround, otherwise we'll get errors like:

In file included from /usr/include/c++/5/type_traits:35:0,
                 from /usr/lib/llvm-7/include/llvm/Support/type_traits.h:18,
                 from /usr/lib/llvm-7/include/llvm/ADT/Optional.h:22,
                 from /usr/lib/llvm-7/include/llvm/ADT/STLExtras.h:20,
                 from /usr/lib/llvm-7/include/llvm/ADT/StringRef.h:13,
                 from /usr/lib/llvm-7/include/llvm/Target/TargetMachine.h:17,
                 from ../../../src/amd/common/ac_llvm_helper.cpp:36:
/usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 11:56:40 +00:00
Emil Velikov
f331419f26 glx/test: meson: assorted include fixes
Swap '..' with the symbolic inc_glx and add glproto as dependency. That
will pull the correct include, effectively fixing the tests on macOS.

Fixes: a47c525f32 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 19:24:14 +00:00
Emil Velikov
e139d7a8a3 glx: meson: wire up the dispatch-index-check test
Accidentally dropped with an earlier commit.

Fixes: 4ccb981673 ("meson: Use consistent style for tests")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 19:07:52 +00:00
Emil Velikov
b44875e2dc glx: meson: drop includes from a link-only library
When producing the final libGL.so/libGLX_mesa.so we only link the local
static helper lib (libglx). Thus there's no reason for the includes.

Fixes: a47c525f32 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 17:55:08 +00:00
Emil Velikov
9527f9ea26 TODO: glx: meson: build dri based glx tests, only with -Dglx=dri
The library itself (libGL) is only built when -Dglx=dri, yet its
accompanying tests are built even with -Dglx=xlib.

Adjust the guards, so we don't build the tests when they are not
applicable.

v2:
 - Reword commit message (Dylan)
 - Drop build_by_default hunk (Dylan)

Fixes: a47c525f32 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 17:47:36 +00:00
Emil Velikov
2eedb79e1a pipe-loader: meson: reference correct library
The library is called libgalliumvl_stub - note singular.

Fixes: 42ea0631f1 ("meson: build clover")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 04:10:50 +00:00
Emil Velikov
9d10581897 meson: don't require glx/egl/gbm with gallium drivers
The gallium drivers do not require a DRI loader. Drop the artificial
and unnecessary restriction.

Fixes: af9d276134 ("meson: build libmesa_gallium")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 03:54:03 +00:00
Emil Velikov
e0dbfc9953 bin/get-pick-list.sh: warn when commit lists invalid sha
We had cases where people would list old/invalid sha in the commit.
Add a trivial checker to catch those and throw a warning.

CC: Juan A. Suarez <jasuarez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-12-21 14:39:52 +00:00
Emil Velikov
6b296f64af bin/get-pick-list.sh: rework handling of sha nominations
Currently our is_sha_nomination does:
 - folds any whitespace, attempting to extract sha-like information
 - checks that at least one of the shas has landed

Split it in two and do sha-like validation first.

This way, commits with mesa-stable and sha nominations will feature the
fixes/revert/etc instead of stable (a) or will be omitted if not
applicable for the respective branch (b).

Misc examples from 18.3

(a)
-[   stable ] 5bc509363b glx: make xf86vidmode mandatory for direct rendering
+[    fixes ] 5bc509363b glx: make xf86vidmode mandatory for direct rendering

(b)
-[   stable ] 9a7b319903 anv/query: flush render target before copying results
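
The two-step flow described above (validate sha-like tokens first, then check whether any of them has landed) can be sketched as follows. This is a Python illustration only: the real script is shell, every name here is hypothetical, the minimum sha length is an assumption, and the set of landed commits stands in for whatever git query the script performs.

```python
import re

# Assumption: a token counts as "sha-like" if it is 8 to 40 lowercase hex digits.
SHA_LIKE = re.compile(r"^[0-9a-f]{8,40}$")


def sha_like_tokens(text):
    """Step 1: fold whitespace and keep only tokens that look like a sha."""
    return [tok for tok in text.split() if SHA_LIKE.match(tok)]


def any_landed(shas, landed):
    """Step 2: True if at least one candidate prefixes a commit in the branch."""
    return any(full.startswith(sha) for sha in shas for full in landed)


def classify(nomination, landed):
    """Return the tag to show for a commit, or None to omit it."""
    shas = sha_like_tokens(nomination)
    if not shas:
        return "stable"   # no sha nomination at all
    if any_landed(shas, landed):
        return "fixes"    # case (a): report the sha-based nomination
    return None           # case (b): not applicable to this branch


# Made-up commit ids, for illustration only.
landed_in_branch = {"0123456789abcdef0123456789abcdef01234567"}

print(classify("0123456789", landed_in_branch))
print(classify("fedcba9876", landed_in_branch))
print(classify("no sha here", landed_in_branch))
```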

CC: Juan A. Suarez <jasuarez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-12-21 14:39:34 +00:00
Eric Anholt
17218a0406 vc4: Hook up perf_debug() output to GL_ARB_debug_output as well.
This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.
2018-12-20 11:31:25 -08:00
Rhys Kidd
acc481ad79 vc4: Wire up core pipe_debug_callback
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-20 11:31:19 -08:00
Eric Anholt
ba36312fbd v3d: Hook up perf_debug() output to GL_ARB_debug_output as well.
This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.
2018-12-20 11:31:19 -08:00
Rhys Kidd
d3991d2472 v3d: Wire up core pipe_debug_callback
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-20 11:31:16 -08:00
Eric Anholt
d80761b8f3 v3d: Drop shadow comparison state from shader variant key.
The shadow state is now in the sampler.
2018-12-20 11:29:30 -08:00
Eric Anholt
0e2758daad v3d: Fix simulator mode on i915 render nodes.
i915 render nodes refuse the dumb ioctls, so the simulator would crash on
the original non-apitrace shader-db.  Replace them with direct i915 calls
if we detect that we're on one of their gem fds.
2018-12-20 11:29:30 -08:00
Dylan Baker
0ff7eed289 docs/meson: Recommend not using CFLAGS and friends
Because of the many caveats involved, using -Dc_args instead of CFLAGS
is recommended both by meson upstream and by us.

v2: - Fix typo

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-20 11:16:40 -08:00
Samuel Pitoiset
9606310081 radv: enable shaderStorageImageMultisample feature on GFX8+
Untested on older chips.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:19 +01:00
Samuel Pitoiset
6b976024a8 radv: add support for FMASK expand
Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:17 +01:00
Samuel Pitoiset
fa16da53d8 radv: initialize FMASK for images in fully expanded mode
The value depends on the number of samples.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:15 +01:00
Samuel Pitoiset
65d82c84d2 ac/nir: restrict fmask lookup to image load intrinsics
We don't ever want to do the fmask lookup on an atomic or
store, the fmask should have been decompressed if the
surface has been moved to IMAGE_LAYOUT.

Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:11 +01:00
Samuel Pitoiset
f45e43e156 spirv: add support for SpvCapabilityStorageImageMultisample
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:09 +01:00
Samuel Pitoiset
5b1ec10e4c radv: compute optimal VM alignment for imported buffers
This fixes GPU hangs on GFX9 with
dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.*

Copied from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 17:34:04 +01:00
Bas Nieuwenhuizen
9f0bfbed11 radv: Work around non-renderable 128bpp compressed 3d textures on GFX9.
Exactly what the title says: the new addrlib does not allow the above with
certain dimensions that the CTS seems to hit. Work around it by not
allowing the app to render to it via compat with other 128bpp formats
and do not render to it ourselves during copies.

Fixes: 776b911365 "amd/addrlib: update Mesa's copy of addrlib"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-20 15:07:20 +01:00
Samuel Pitoiset
5c7935f8fc radv: fix subpass image transitions with multiviews
The driver needs to decompress all image layers if a fast
depth/color clear has been performed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 13:36:37 +01:00
Samuel Pitoiset
0a7e767e58 radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8
This workaround has been introduced by 135e4d434f for fixing
DXVK GPU hangs with many games. It is no longer needed since
LLVM r345718.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 12:09:57 +01:00
Samuel Pitoiset
576040f2e5 ac/nir: remove the bitfield_extract workaround for LLVM 8
This workaround has been introduced by 3d41757788 and it
is no longer needed since LLVM r346422.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-20 09:40:16 +01:00
Iago Toral Quiroga
d6110d4d54 intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
The former expects to see SSA-only things, but the latter injects registers.

The assertions in the lowering were not seeing this because they asserted
on the bit_size values only, not on the is_ssa field, so add that assertion
too.

Fixes: 11dc130779 "nir: Add a bool to int32 lowering pass"
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-20 08:02:44 +01:00
Ilia Mirkin
1250383e36 st/mesa: remove sampler associated with buffer texture in pbo logic
A long time ago, when this was first implemented, not having a sampler
bound would cause problems on Fermi. I didn't work out the reasons, but
the solution was simple -- just put the samplers back in.

Since then, regular texturing paths appear to have lost their associated
samplers which required a fuller investigation and fix in nouveau. Now
that this is done, this code should no longer need a sampler state for
fetching texels from a buffer texture.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-20 00:27:16 -05:00
Roland Scheidegger
6f4083143b gallivm: use llvm jit code for decoding s3tc
This is (much) faster than using the util fallback.
(Note that there are two methods here: one would use a cache, similar to
the existing code (although the cache was disabled), except the block
decode is done with jit code; the other directly decodes the required
pixels. For now don't use the cache: being direct-mapped is suboptimal,
but it's difficult to come up with something better which doesn't have
too much overhead.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-12-20 06:03:20 +01:00
Jason Ekstrand
ec1d5841fa radv/query: Use 1-bit booleans in query shaders
Fixes: 44227453ec "nir: Switch to using 1-bit Booleans for almost..."
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-19 16:36:40 -06:00
Jason Ekstrand
6896c91c10 radv/query: Add a nir_test_flag helper
This is little more than an iadd_imm right now but it will help in the
next commit where we refactor things further.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-19 16:36:26 -06:00
Eduardo Lima Mitev
c2ebc38052 freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat()
An earlier patch that introduced the function failed to handle the case
where an image format layout qualifier is not specified, which is allowed
on desktop GL profiles. In these cases, nir_variable's image format is
GL_NONE, and we don't need to print a debug message for those.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-19 22:49:05 +01:00
Eric Anholt
90818558f0 docs: Add an encouraging note about providing reviews and acks.
Across several projects I've seen new contributors say "I wasn't sure if I
should provide a review tag since I'm not really an expert in this area."
Everyone I know already applies some implicit weighting to reviews from
different people, so encourage participation.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-19 12:49:17 -08:00
Eric Anholt
463df0ffe2 docs: Add a note that MRs should still include any r-b or a-b tags.
v2: Mention "Tested-by" too

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-19 12:48:13 -08:00
Eric Anholt
fcfb7f573c v3d: Load and store aligned utiles all at once.
This calls the expensive uif offset function once per utile, but it still
gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over
calling it on each pixel.
2018-12-19 10:27:26 -08:00
Eric Anholt
7c56b7a6ea v3d: Add a fallthrough path for utile load/store of 32 byte lines.
Now that V3D has 8 byte per pixel formats exposed, we've got stride==32
utiles to load and store.  Just handle them through the non-NEON paths for
now.
2018-12-19 10:27:26 -08:00
Eric Anholt
f6a0f4f41e vc4: Move the utile load/store functions to a header for reuse by v3d.
These implementations of whole-utile load/stores would be the same for
v3d, though the layout of blocks of utiles has changed.
2018-12-19 10:27:26 -08:00
Eric Anholt
8ee752194c v3d: Implement texture_subdata to reduce teximage upload copies.
This lets us store the non-PBO glTexImage data directly into the tiled
image without making an extra untiled memcpy for the gallium transfer.
Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around
in the kernel mapping and unmapping the transfer's temporary area.
2018-12-19 10:27:26 -08:00
Eric Anholt
e09d8aecb4 v3d: Remove dead prototypes for load/store utile functions.
2018-12-19 10:27:26 -08:00
Eric Anholt
fcf881adda v3d: Don't try to create shadow tiled temporaries for 1D textures.
They're raster order anyway, so we'd assertion fail along with wasting
bandwidth.

Fixes: 6ad9e8690d ("v3d: Add support for texturing from linear.")
2018-12-19 10:27:21 -08:00
Eric Anholt
b5adc744ba v3d: Fix check for TFU job completion in the simulator.
We're waiting for the jobs-completed count to increment (with wrapping),
not to reach its starting state.  This mostly ended up working out because
the next v3d_hw_tick() for a submit CL would end up doing the TFU
operation first, but it did fail when a blit was used for glReadPixels()
at the end of a test.
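
The completion check described above amounts to waiting for a wrapping counter to advance past the value sampled at submit time. A minimal Python sketch of that idea follows; the counter width and all names are assumptions for illustration, not the simulator's actual register layout.

```python
COUNTER_BITS = 8  # assumed width; the real counter field may differ
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def job_done(count_at_submit, count_now):
    """True once the counter has advanced by one, modulo the counter width."""
    return count_now == ((count_at_submit + 1) & COUNTER_MASK)


class FakeTfu:
    """Stand-in for the hardware: each tick completes one job."""

    def __init__(self, start):
        self.count = start & COUNTER_MASK

    def tick(self):
        self.count = (self.count + 1) & COUNTER_MASK


def wait_for_job(hw, max_ticks=4):
    """Tick the fake hardware until the submitted job is reported complete."""
    start = hw.count
    ticks = 0
    while not job_done(start, hw.count):
        if ticks >= max_ticks:
            raise RuntimeError("job did not complete")
        hw.tick()
        ticks += 1
    return ticks


hw = FakeTfu(start=COUNTER_MASK)  # counter is about to wrap
print(wait_for_job(hw), hw.count)
```

Comparing against the starting value instead would report completion immediately, before any work has been done.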

Fixes: ee0549ff9a ("v3d: Add the V3D TFU submit interface to the simulator.")
2018-12-19 10:26:04 -08:00
Eric Anholt
365728dc5d v3d: Put the dst bo first in the list of BOs for TFU calls.
In the UAPI, the first BO is the destination, and the one the kernel
should do an exclusive reservation on.  Currently we only do exclusive
reservations, anyway.  However, in the simulator path I was only copying
back the "destination" BO (actually src in this case), and this caused
regressions once I fixed the simulator to actually complete TFU before
returning (since otherwise, the TFU op would happen at the start of the
next CL submit and the draw would get the right contents).

Fixes: 976ea90bdc ("v3d: Add support for using the TFU to do some blits.")
2018-12-19 10:26:04 -08:00
Caio Marcelo de Oliveira Filho
947f7b452a nir: properly find the entry to keep in copy_prop_vars
When copy propagation handles a store/copy, it iterates the current
copy entries to remove aliases, but keeps the "equal" entry (if it
exists) to be updated.

The removal step may swap the entries around (to ensure there are no
holes), invalidating pointers from earlier iterations.  The bug was saving
such a pointer to use later.  Change the code to first perform the
removals and then find the remaining right entry.

This was causing updates to be lost since they were being made to an
entry that was not part of the current copies.
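
The hazard can be sketched in isolation.  This is not the
copy_prop_vars code: the struct, the integer keys and the aliases()
rule are hypothetical stand-ins, chosen only to show why the lookup
has to happen after the removals:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a copy entry. */
struct entry {
   int key;     /* stands in for the destination deref */
   int value;   /* stands in for the tracked source */
};

struct entry_list {
   struct entry items[16];
   size_t count;
};

/* Remove items[i] by moving the last element into its slot, so the
 * array never has holes.  A pointer to the old last element is stale
 * after this call. */
static void
entry_list_swap_remove(struct entry_list *list, size_t i)
{
   assert(i < list->count);
   list->items[i] = list->items[list->count - 1];
   list->count--;
}

/* Hypothetical alias rule: different keys in the same group of ten. */
static int
aliases(int a, int b)
{
   return a != b && a / 10 == b / 10;
}

/* Pass 1 removes every alias of `key`; pass 2 looks up the surviving
 * "equal" entry.  Returns NULL if there is none. */
static struct entry *
entry_list_invalidate_and_find(struct entry_list *list, int key)
{
   for (size_t i = 0; i < list->count; ) {
      if (aliases(list->items[i].key, key))
         entry_list_swap_remove(list, i);   /* re-check slot i */
      else
         i++;
   }

   for (size_t i = 0; i < list->count; i++) {
      if (list->items[i].key == key)
         return &list->items[i];
   }
   return NULL;
}
```

With keys {11, 25, 12} and a store to key 12, removing the alias 11
moves the 12 entry from slot 2 to slot 0, so a pointer saved to slot 2
during the first pass would refer to an entry that is no longer part
of the list.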

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108624
Fixes: b3c6146925 "nir: Copy propagation between blocks"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-19 09:33:36 -08:00
Michel Dänzer
9d8395bf0e winsys/amdgpu: Pull in LLVM CFLAGS
Fixes build failure if the LLVM headers aren't in a standard include
directory.

Fixes: ec22dd34c8 "radeonsi: move SI_FORCE_FAMILY functionality to
                     winsys"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-12-19 17:54:18 +01:00
Caio Marcelo de Oliveira Filho
0ddc911f4d nir: properly clear the entry sources in copy_prop_vars
When updating a copy entry source value from a "non-SSA" (the data
come from a copy instruction) to a "SSA" (the data or parts of it come
from SSA values), it was possible to hold invalid data in ssa[0]
depending on the writemask.  Because of the union, ssa[0] could contain a
pointer to a nir_deref_instr left over from previous non-SSA usage.

Change the code to clear the array before use so that no invalid data is
left around.
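
A reduced sketch of the problem and the fix.  The struct layout and
names below are hypothetical simplifications, not the actual
copy_prop_vars types:

```c
#include <assert.h>
#include <string.h>

#define NUM_COMPONENTS 4

struct ssa_def { int id; };   /* hypothetical stand-in */
struct deref   { int id; };   /* hypothetical stand-in */

struct value {
   int is_ssa;
   union {
      struct ssa_def *ssa[NUM_COMPONENTS];
      struct deref *deref;    /* shares storage with ssa[0] */
   };
};

/* Switch a value to SSA mode and store `def` in the components
 * selected by `writemask`.  Clearing the array first ensures that
 * components outside the writemask are NULL instead of holding the
 * old deref pointer reinterpreted as an SSA def. */
static void
value_set_ssa(struct value *v, struct ssa_def *def, unsigned writemask)
{
   if (!v->is_ssa) {
      memset(v->ssa, 0, sizeof(v->ssa));
      v->is_ssa = 1;
   }

   for (unsigned i = 0; i < NUM_COMPONENTS; i++) {
      if (writemask & (1u << i))
         v->ssa[i] = def;
   }
}
```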

Fixes: 62332d139c "nir: Add a local variable-based copy propagation pass"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-19 08:35:48 -08:00
Eric Engestrom
0e4c7c3d5b docs: format code blocks a bit nicely
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-19 16:32:30 +00:00
Eric Engestrom
b0319d0768 docs: add meson cross compilation instructions
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-19 16:31:51 +00:00
Gurchetan Singh
b45aa6290b virgl: move resource creation / import / destruction to common code
We can remove some duplicated code.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
1d3d311133 virgl: move resource metadata into base resource
A resource is just a buffer with some metadata.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
db77573d7b virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT
Previously, we ignored the glUnmap(..) operation and
flushed before we flush the cbuf.  Now, let's just flush
the data when we unmap.

Neither method is optimal, for example:

glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT)
glFlushMappedBufferRange(.., 25, 30)
glFlushMappedBufferRange(.., 65, 70)

We'll end up flushing 25 --> 70.  Maybe we can fix this later.
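
The over-flush comes from keeping only one bounding range per mapping.
A sketch of that bookkeeping, with hypothetical names and with each
flush passed as explicit start/end offsets to match the numbers in the
example above (this is an assumption, not the virgl code):

```c
#include <assert.h>

/* Hypothetical per-transfer record of what needs to be flushed. */
struct flush_range {
   unsigned start;
   unsigned end;
   int valid;
};

/* Grow the single tracked range so that it covers [start, end).
 * Disjoint flushes are merged into one range, so the gap between
 * them ends up being flushed too. */
static void
flush_range_add(struct flush_range *r, unsigned start, unsigned end)
{
   assert(start <= end);

   if (!r->valid) {
      r->start = start;
      r->end = end;
      r->valid = 1;
      return;
   }

   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}
```

Tracking a list of disjoint ranges instead of one bounding range would
avoid flushing the gap, at the cost of more bookkeeping.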

v2: Add fixme comment in the code (Elie)

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
11939f6fa2 virgl: make virgl_buffers use resource helpers
We can reuse the helpers we created.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
4e2c77cd51 virgl: make transfer code with PIPE_BUFFER targets
util_format_get_blocksize returns 1 for R8 formats (all
PIPE_BUFFERs are R8).

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
174f530008 virgl: consolidate transfer code
We could allocate and destroy transfers in one place.

v2: Keep l_stride around.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
13626b46f1 virgl: store layer_stride in metadata
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
2a44acc83b virgl: move vrend_get_tex_image_offset to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
f749229a8e virgl: move virgl_resource_layout to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
a63da9c062 virgl: move texture metadata to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
6e7d396ad3 virgl: remove unnecessary code
With commit 89b479, we moved to tracking buffer cleanliness
when binding.

TEST=dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
6d13d1aadb virgl: texture_transfer_pool --> transfer_pool
It's used for all types of resources.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Nicolai Hähnle
d73a25f2c0 radeonsi: const-ify the si_query_ops
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:07 +01:00
Nicolai Hähnle
c85b0dea0a radeonsi: split perfcounter queries from si_query_hw
Remove a level of indirection to make the code more explicit -- should
make it easier to follow what's going on.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:04 +01:00
Nicolai Hähnle
e0f0d3675d radeonsi: factor si_query_buffer logic out of si_query_hw
This is a move towards using composition instead of inheritance for
different query types.

This change weakens out-of-memory error reporting somewhat, though this
should be acceptable since we didn't consistently report such errors in
the first place.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:01 +01:00
Nicolai Hähnle
0fc6e573dd radeonsi: move query suspend logic into the top-level si_query struct
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:59 +01:00
Nicolai Hähnle
e2b9329f17 radeonsi: move remaining perfcounter code into si_perfcounter.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:57 +01:00
Nicolai Hähnle
7dd289d9e4 radeonsi: track constant buffer bind history in si_pipe_set_constant_buffer
Other callers of si_set_constant_buffer don't need it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:54 +01:00
Nicolai Hähnle
829d417914 radeonsi: use si_set_rw_shader_buffer for setting streamout buffers
Reduce the number of places that encode buffer descriptors.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:52 +01:00
Nicolai Hähnle
ce785f5ffd radeonsi: add an si_set_rw_shader_buffer convenience function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:50 +01:00
Nicolai Hähnle
556c4c42b7 radeonsi: avoid using hard-coded SI_NUM_RW_BUFFERS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:48 +01:00
Nicolai Hähnle
1e49d72317 radeonsi: show the fixed function TCS in debug dumps
This is rather important for merged VS/TCS as LSHS shaders...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:45 +01:00
Nicolai Hähnle
6e67e79de4 radeonsi: const-ify si_set_tesseval_regs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:42 +01:00
Nicolai Hähnle
5c841a1b1e radeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:39 +01:00
Nicolai Hähnle
0d58dcc3cf radeonsi: don't set RAW_WAIT for CP DMA clears
There is never a read-after-write hazard because the command doesn't read.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:34 +01:00
Nicolai Hähnle
23af72af25 radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:32 +01:00
Nicolai Hähnle
f18b2ac0db radeonsi: add si_init_draw_functions and make some functions static
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:30 +01:00
Nicolai Hähnle
555cb668cc radeonsi: extract declare_vs_blit_inputs
Prepare for some later refactoring.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:27 +01:00
Nicolai Hähnle
ec22dd34c8 radeonsi: move SI_FORCE_FAMILY functionality to winsys
This helps some debugging cases by initializing addrlib with
slightly more appropriate settings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:25 +01:00
Nicolai Hähnle
0ef263d62f ac/surface: 3D and cube surfaces are never displayable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:22 +01:00
Nicolai Hähnle
8efaffa893 amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan
Allow for a unified but efficient treatment of adding a bitmask over a
wave or an entire threadgroup.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:19 +01:00
Nicolai Hähnle
300876a9a7 amd/common: scan/reduce across waves of a workgroup
Order-aware scan/reduce can trade off LDS traffic for external atomics
memory traffic in producer/consumer compute shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:17 +01:00
Nicolai Hähnle
3963402fd3 amd/common: add ac_build_ifcc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:15 +01:00
Nicolai Hähnle
3c77f26ccc amd/common: whitespace fixes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:12 +01:00
Nicolai Hähnle
76c5ad1995 amd/sid_tables: add additional python3 compatibility imports
This happened to bite me while doing some experiments.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:09 +01:00
Nicolai Hähnle
6f0322b16a r600: remove redundant semicolon
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:00:49 +01:00
Nicolai Hähnle
7230cb8f2b ddebug: always flush when requested, even when hang detection is disabled
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 11:59:18 +01:00
Nicolai Hähnle
539fdc49f1 ddebug: simplify watchdog loop and fix crash in the no-timeout case
The following race condition could occur in the no-timeout case:

  API thread               Gallium thread            Watchdog
  ----------               --------------            --------
  dd_before_draw
  u_threaded_context draw
  dd_after_draw
    add to dctx->records
    signal watchdog
                                                     dump & destroy record
                           execute draw
                           dd_after_draw_async
                             use-after-free!

Alternatively, the same scenario would assert in a debug build when
destroying the record because record->driver_finished has not signaled.

Fix this and simplify the logic at the same time by
- handing the record pointers off to the watchdog thread *before* each
  draw call and
- waiting on the driver_finished fence in the watchdog thread

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 11:59:10 +01:00
Tapani Pälli
3627c9efff anv/android: turn on VK_ANDROID_external_memory_android_hardware_buffer
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:42 +02:00
Tapani Pälli
3dc424a4f4 anv: ignore VkSamplerYcbcrConversion on non-yuv formats
This fulfills a requirement for clients that want to utilize the same
code path for images with external formats (VK_FORMAT_UNDEFINED) and
"regular" RGBA images where format is known. This is similar to how
OES_EGL_image_external works.

To support this, we allow color conversion samplers for non-YUV
formats but skip setting up conversion when format does not have
can_ycbcr flag set.

v2: add comment and bundle can_ycbcr to the existing break
    condition (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
a7b7772cfb anv: support VkSamplerYcbcrConversionInfo in vkCreateImageView
If a conversion struct was passed, then initialize view using
format from the conversion structure.

v2: use vk_format directly from the anv_format struct
v3: added some assertions (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
bb0721aea4 anv: add VkFormat field as part of anv_format
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
c070b0e25f anv: support VkExternalFormatANDROID in vkCreateSamplerYcbcrConversion
If external format is used, we store the external format identifier in
conversion to be used later when creating VkImageView.

v2: rebase to b43f955037 changes
v3: added assert, ignore components when creating external
    format conversion (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
f1654fa7e3 anv/android: support creating images from external format
Since we don't know the exact format at creation time, some initialization
is done only when bound with memory in vkBindImageMemory.

v2: demand dedicated allocation in vkGetImageMemoryRequirements2 if
    image has external format

v3: refactor prepare_ahw_image, support vkBindImageMemory2,
    calculate stride correctly for rgb(x) surfaces, rename as
    'resolve_ahw_image'

v4: rebase to b43f955037 changes
v5: add some assertions to verify input correctness (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
517103abf1 anv/android: add ahardwarebuffer external memory properties
v2: have separate memory properties for android, set usage
    flags for buffers correctly

v3: code cleanup (Jason)
    + limit maxArrayLayers to 1 for AHardwareBuffer based images

v4: rebase to b43f955037 changes

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
c79a528d2b anv/android: support import/export of AHardwareBuffer objects
v2: add support for non-image buffers (AHARDWAREBUFFER_FORMAT_BLOB)
v3: properly handle usage bits when creating from image
v4: refactor, code cleanup (Jason)
v5: rebase to b43f955037 changes,
    initialize bo flags as ANV_BO_EXTERNAL (Lionel)
v6: add assert that anv_bo_cache_import succeeds, add comment
    about multi-bo support to clarify current implementation (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
5c65c60d6c anv: refactor, remove else block in AllocateMemory
This makes it cleaner to introduce more cases where we import memory
from different types of external memory buffers.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
884fc90fde anv: add anv_ahw_usage_from_vk_usage helper function
v2: rebase to b43f955037 changes

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
1e6a44400a anv/android: add GetAndroidHardwareBufferPropertiesANDROID
Use the anv_format address in formats table as implementation-defined
external format identifier for now. When adding YUV format support this
might need to change.

v2: code cleanup (Jason)
v3: set anv_format address as identifier
v4: setup suggestedYcbcrModel and suggested[X|Y]ChromaOffset
    as expected for HAL_PIXEL_FORMAT_NV12_Y_TILED_INTEL
v5: set linear tiling for GPU_DATA_BUFFER usage, add comment
    about multi-bo support to clarify current implementation (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
aa94e01bfe anv: add from/to helpers with android and vulkan formats
v2: handle R8G8B8X8 as R8G8B8_UNORM (Jason)
v3: add HAL_PIXEL_FORMAT_NV12_Y_TILED_INTEL, we make it a define
    for now to avoid a direct dependency on minigbm headers

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
c1f15a0a1a anv: make anv_get_image_format_features public
This will be utilized later by GetAndroidHardwareBufferPropertiesANDROID.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
8a469fd335 anv: refactor make_surface to use data from anv_image
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
2a98e5bbb9 anv: add create_flags as part of anv_image
This will make it possible for next patch to rip
anv_image_create_info out from make_surface function.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Ian Romanick
96c4b135e3 nir/algebraic: Don't put quotes around floating point literals
The quotation marks around 1.0 cause it to be treated as a string
instead of a floating point value.  The generator then treats it as an
arbitrary variable replacement, so any iand involving a ('ineg', ('b2i',
a)) matches.

v2: Remove misleading comment about sized literals (suggested by
Timothy).  Add assertion that the name of a variable is entirely
alphabetic (suggested by Jason).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com> [v1]
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> [v1]
Fixes: 6bcd2af086 ("nir/algebraic: Add some optimizations for D3D-style Booleans")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109075
2018-12-18 23:28:31 -08:00
Vinson Lee
0f7ba5758b meson: Fix libsensors detection.
Fixes: 5e71efef44 ("meson: Add lmsensors support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-18 19:24:01 -08:00
Vinson Lee
84f39e5971 meson: Fix typo.
Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-18 19:14:11 -08:00
Sagar Ghuge
933c44bcc4 nir: Add a new lowering option to lower 3D surfaces from txd to txl.
Tested on gen9.

v2: Rename lower_txd_3d_surafaces flag to lower_txd_3d (Jason Ekstrand)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-18 13:44:09 -08:00
Christian Gmeiner
7ea8e54dd6 meson: add etnaviv to the tools option
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-18 21:50:58 +01:00
Adam Jackson
e36d136102 specs: Bump GLX_MESA_query_renderer to version 9
Note that we have an official GL extension number, pick the appropriate
section of the GLX spec to modify, and add changelog.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-12-18 15:46:10 -05:00
Adam Jackson
9e8332ebc2 specs: Remove GLX_RENDERER_ID_MESA from GLX_MESA_query_renderer
This has not even had an attempt at implementation. If you asked for
renderer 0 - which, the spec implies, should always work - then
dri2_convert_glx_attribs would fail, we'd silently fall back to creating
an indirect context, and xserver would also not recognize the attribute
and would throw BadValue at you.

The API would be difficult to use in any case, since there's no way to
enumerate how many renderers the screen has. I'd be tempted to add that
by defining:

    glXQueryRendererIntegerMESA(dpy, screen,
                                /* renderer = */ -1,
                                0, &value);

to return the number of renderers, but a new entrypoint might be
cleaner. Still, better to not specify it at all than to lie about it.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-12-18 15:46:10 -05:00
Adam Jackson
c63c391756 specs: Remove GLES profile interaction text from GLX_MESA_query_renderer
In one place we say, if GLES isn't supported then the profile version
will be 0.0. Then later we say, if the GLES profile extension isn't
supported then GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA is not
mentioned in the spec. A strict reading of the latter would mean that
GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA is not a recognized token,
and the query should instead return False.

The implementation does not check for the GLES profile extensions, and
the additional complexity doesn't seem worth it. Removing the
interaction text makes the spec match the implementation.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-12-18 15:46:10 -05:00
Eduardo Lima Mitev
5820e63418 freedreno/ir3: Make imageStore use num components from image format
emit_intrinsic_store_image() is always using 4 components when
collecting registers for the value. When image has less than
4 components (e.g., r32f, rg32i, etc) this results in extra mov
instructions.

This patch uses the actual number of components from the image format.

For example, in a shader like:

layout (r32f, binding=0) writeonly uniform imageBuffer u_image;
...
void main(void) {
   ...
   imageStore (u_image, some_offset, vec4(1.0));
   ...
}

instruction count is reduced by at least 3 instructions (note image
format is r32f, 1 component only).

This obviously reduces register pressure as well.
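
The format-to-component-count lookup can be sketched as follows.  The
enum and function name are hypothetical and cover only a few formats;
the fallback to 4 mirrors the v2 note below:

```c
#include <assert.h>

/* Hypothetical subset of image formats, for illustration only. */
enum image_format {
   FMT_R32F,
   FMT_RG32I,
   FMT_RGBA32F,
   FMT_UNKNOWN,
};

/* Number of components that actually need to be collected for a
 * store to an image of the given format. */
static unsigned
image_format_num_components(enum image_format fmt)
{
   switch (fmt) {
   case FMT_R32F:
      return 1;
   case FMT_RG32I:
      return 2;
   case FMT_RGBA32F:
      return 4;
   default:
      /* Unknown format: fall back to the old behaviour of 4
       * components rather than asserting. */
      return 4;
   }
}
```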

v2: - Added support for image formats from NV_image_format extension
    (Ilia Mirkin).
    - Return 4 components by default instead of asserting. (Rob Clark).

v3: Added more missing formats (Ilia Mirkin).

v4: Added a debug message for unknown image formats (Rob Clark).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-18 21:15:20 +01:00
Jason Ekstrand
5dad1abfdc nir/dead_write_vars: Get modes directly from derefs
Instead of going all the way back to the variable, just look at the
deref.  The modes are guaranteed to be the same by nir_validate whenever
the variable can be found.  This fixes clear_unused_for_modes for
derefs that don't have an accessible variable.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
fa40a58fd9 nir/copy_prop_vars: Get modes directly from derefs
Instead of going all the way back to the variable, just look at the
deref.  The modes are guaranteed to be the same by nir_validate whenever
the variable can be found.  This fixes apply_barrier_for_modes for
derefs that don't have an accessible variable.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
cf7fb39805 nir/lower_wpos_center: Look at derefs for modes
This is instead of looking all the way back to the variable which may
not exist for all derefs.  This makes this code properly ignore casts
with modes other than the mode[s] we care about (where casts aren't
allowed).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
867fe35a16 nir/lower_io_to_scalar: Look at derefs for modes
This is instead of looking all the way back to the variable which may
not exist for all derefs.  This makes this code properly ignore casts
with modes other than the mode[s] we care about (where casts aren't
allowed).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
3fe0363dda nir/lower_io_arrays_to_elements: Look at derefs for modes
This is instead of looking all the way back to the variable which may
not exist for all derefs.  This makes this code properly ignore casts
with modes other than the mode[s] we care about (where casts aren't
allowed).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
8cc0f92492 nir/linking_helpers: Look at derefs for modes
This is instead of looking all the way back to the variable which may
not exist for all derefs.  This makes this code properly ignore casts
with modes other than the mode[s] we care about (where casts aren't
allowed).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
8410cf66d7 nir/propagate_invariant: Skip unknown vars
If we can't find the variable from the deref, just assume it isn't
invariant and continue on.  This can happen if, for instance, we're
writing to a deref that points into an SSBO.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Ian Romanick
29e4b949b4 Revert "nir/lower_indirect: Bail early if modes == 0"
"There's no point in walking the program if we're never going to
    actually lower anything."

Except we might lower compacted local arrays.  In that case, modes will
be 0, but there is still lowering to be done.

This reverts commit 7f75cf2a94.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109081
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
2018-12-18 10:47:54 -08:00
Lucas Stach
433ca3127a st/dri: replace format conversion functions with single mapping table
Each time I have to touch the buffer import/export functions in the dri
state tracker I get lost in the maze of functions converting between
DRI_IMAGE_FOURCC, DRI_IMAGE_FORMAT, DRI_IMAGE_COMPONENTS and pipe format.

Rip it out and replace by a single table, which defines the correspondence
between the different representations.

Also this now stores all the known representations in the __DRIimageRec,
to avoid the loss of information we currently have when importing a buffer
with a fourcc, which doesn't have a corresponding dri format.
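
The shape of such a table can be sketched like this.  The struct,
field names and all numeric values other than the two fourcc codes are
hypothetical placeholders, not the ones used in the state tracker:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FOURCC(a, b, c, d) \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* One row ties together every representation of a format, so no
 * conversion has to go through an intermediate form and lose
 * information on the way. */
struct format_mapping {
   uint32_t fourcc;
   int dri_format;       /* placeholder value */
   int dri_components;   /* placeholder value */
   int pipe_format;      /* placeholder value */
};

static const struct format_mapping format_table[] = {
   { FOURCC('A', 'R', '2', '4'), 1, 10, 100 },
   { FOURCC('X', 'R', '2', '4'), 2, 11, 101 },
};

/* Returns the row for `fourcc`, or NULL if the format is unknown. */
static const struct format_mapping *
format_lookup_by_fourcc(uint32_t fourcc)
{
   size_t n = sizeof(format_table) / sizeof(format_table[0]);

   for (size_t i = 0; i < n; i++) {
      if (format_table[i].fourcc == fourcc)
         return &format_table[i];
   }
   return NULL;
}
```

Lookups by any other column would walk the same table, which is what
keeps the representations from drifting apart.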

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-18 19:19:45 +01:00
Lucas Stach
67174d40f1 st/dri: allow both render and sampler compatible dma-buf formats
Currently all the EGL APIs are missing a way to specify how an imported
dma-buf is intended to be used. Demanding the format to be both usable
for sampling and rendering artificially restricts the list of formats a
driver is able to import.

Looking at how the Intel driver implements those DRI2 image APIs it
doesn't distinguish between render or sampler compatible formats. So
this patch aligns behavior between Intel and Gallium based drivers.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-18 19:19:40 +01:00
Lucas Stach
a3e592e839 etnaviv: use surface format directly
There is no need to do the detour over the resource behind the
surface to get the format. Use the surface format directly.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-12-18 19:07:10 +01:00
Dylan Baker
7a90886921 meson: Add toggle for glx-direct
GNU Hurd needs to turn off glx-direct.  Rather than special-casing it,
we'll just add a toggle.

CC: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-18 09:20:53 -08:00
Dylan Baker
8c77f4c76d meson: Add support for gnu hurd
CC: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-18 09:20:49 -08:00
Dylan Baker
6cf5f25bc5 meson: remove duplicate definition
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-18 09:18:12 -08:00
Dylan Baker
e430a034b9 meson: Fix ppc64 little endian detection
Old versions of meson returned ppc64le as the cpu_family for little
endian power8 cpus; versions >=0.48 don't do this, so the check wouldn't
work in that case. This generalizes the check to work for both old and
new versions of meson.

Fixes: 34bbb24ce7
       ("meson: Add support for ppc assembly/optimizations")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-18 09:17:54 -08:00
Jason Ekstrand
3feda3cf35 anv: Bump the patch version to 96
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-18 09:40:46 -06:00
Kenneth Graunke
3c71ba3baa i965: Don't override subslice count to 4 on Gen11.
Gen9-10 have fewer than 4 subslices per slice, so they need this to be
rounded up.  Gen11 isn't documented as needing this hack, and it can
also have more than 4 subslices, so the hack actually can break things.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-12-17 14:03:45 -08:00
Ian Romanick
af07141b33 intel/compiler: More peephole_select for pre-Gen6
No shader-db changes on any Gen6+ platform.

All of the shaders with cycles hurt by more than ~2% are from Master of
Orion.  All of the shaders have instructions helped.  It looks like the
pass enables some control flow to be converted to bcsels, then the
scheduler does dumb things.  These are new shaders (just added before
doing this shader-db run), so there's probably some low-hanging fruit.

Iron Lake
total instructions in shared programs: 8214327 -> 8213684 (<.01%)
instructions in affected programs: 84469 -> 83826 (-0.76%)
helped: 114
HURT: 26
helped stats (abs) min: 2 max: 18 x̄: 7.75 x̃: 9
helped stats (rel) min: 0.17% max: 13.73% x̄: 2.52% x̃: 1.05%
HURT stats (abs)   min: 2 max: 20 x̄: 9.23 x̃: 8
HURT stats (rel)   min: 0.70% max: 2.48% x̄: 1.66% x̃: 1.61%
95% mean confidence interval for instructions value: -5.87 -3.32
95% mean confidence interval for instructions %-change: -2.32% -1.17%
Instructions are helped.

total cycles in shared programs: 187736850 -> 187749314 (<.01%)
cycles in affected programs: 506750 -> 519214 (2.46%)
helped: 104
HURT: 36
helped stats (abs) min: 2 max: 72 x̄: 21.96 x̃: 16
helped stats (rel) min: 0.02% max: 6.16% x̄: 0.97% x̃: 0.63%
HURT stats (abs)   min: 4 max: 1402 x̄: 409.67 x̃: 40
HURT stats (rel)   min: 0.33% max: 23.12% x̄: 5.79% x̃: 1.39%
95% mean confidence interval for cycles value: 28.32 149.74
95% mean confidence interval for cycles %-change: -0.07% 1.61%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 5044014 -> 5043652 (<.01%)
instructions in affected programs: 46751 -> 46389 (-0.77%)
helped: 63
HURT: 13
helped stats (abs) min: 2 max: 29 x̄: 7.65 x̃: 9
helped stats (rel) min: 0.17% max: 13.73% x̄: 2.93% x̃: 1.04%
HURT stats (abs)   min: 2 max: 20 x̄: 9.23 x̃: 8
HURT stats (rel)   min: 0.66% max: 2.35% x̄: 1.58% x̃: 1.52%
95% mean confidence interval for instructions value: -6.54 -2.99
95% mean confidence interval for instructions %-change: -3.04% -1.28%
Instructions are helped.

total cycles in shared programs: 128143042 -> 128150188 (<.01%)
cycles in affected programs: 324564 -> 331710 (2.20%)
helped: 57
HURT: 19
helped stats (abs) min: 6 max: 74 x̄: 30.70 x̃: 32
helped stats (rel) min: 0.08% max: 4.74% x̄: 1.22% x̃: 0.81%
HURT stats (abs)   min: 10 max: 1400 x̄: 468.21 x̃: 60
HURT stats (rel)   min: 0.56% max: 19.94% x̄: 5.80% x̃: 1.70%
95% mean confidence interval for cycles value: 6.90 181.15
95% mean confidence interval for cycles %-change: -0.52% 1.59%
Inconclusive result (%-change mean confidence interval includes 0).
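
As a rough illustration of how the summary lines above relate to the raw numbers, here is a minimal sketch. It is not the shader-db report script; the function names `percent_change` and `verdict` are hypothetical, and the rule used (an interval that includes 0 is inconclusive) is simply the one stated in the report text.

```python
def percent_change(before, after):
    """Relative change in percent, as printed after 'before -> after'."""
    return (after - before) / before * 100.0


def verdict(ci_low, ci_high):
    """Classify a 95% confidence interval for the mean %-change."""
    if ci_low <= 0.0 <= ci_high:
        # The interval includes 0, so no direction can be claimed.
        return "inconclusive"
    return "helped" if ci_high < 0.0 else "hurt"


# GM45 numbers quoted in this commit message.
print(round(percent_change(324564, 331710), 2))  # cycles in affected programs
print(verdict(-0.52, 1.59))                      # cycles %-change interval
print(verdict(-3.04, -1.28))                     # instructions %-change interval
```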

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
378f996771 nir/opt_peephole_select: Don't peephole_select expensive math instructions
On some GPUs, especially older Intel GPUs, some math instructions are
very expensive.  On those architectures, don't reduce flow control to a
csel if one of the branches contains one of these expensive math
instructions.

This prevents a bunch of cycle count regressions on pre-Gen6 platforms
with a later patch (intel/compiler: More peephole select for pre-Gen6).
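
A minimal sketch of the decision described above, written in Python rather than the C of the actual pass. The opcode set and the `expensive_math_ok` parameter are assumptions made for illustration; the commit message does not list which instructions count as expensive.

```python
# Hypothetical set of expensive opcodes, for illustration only.
EXPENSIVE_OPS = {"fsin", "fcos", "fpow", "fsqrt"}


def can_flatten(then_ops, else_ops, expensive_math_ok):
    """Return True if an if/else may be replaced by selects (csel)."""
    if expensive_math_ok:
        # The target does not care about the cost of these instructions.
        return True
    all_ops = list(then_ops) + list(else_ops)
    # Keep the flow control if either branch has an expensive instruction.
    return not any(op in EXPENSIVE_OPS for op in all_ops)


print(can_flatten(["fadd"], ["fmul"], expensive_math_ok=False))
print(can_flatten(["fsin"], [], expensive_math_ok=False))
```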

v2: Remove stray #if block.  Noticed by Thomas.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
8fb8ebfbb0 intel/compiler: More peephole select
Shader-db results:

The one shader hurt for instructions is a compute shader that had both
spills and fills hurt.

v2: Fix typo in comment noticed by Caio.

v3: Fix inverted condition in brw_nir.c.  Noticed by Lionel.

Skylake, Broadwell, and Haswell had similar results. (Skylake shown)
total instructions in shared programs: 15072761 -> 15047884 (-0.17%)
instructions in affected programs: 895539 -> 870662 (-2.78%)
helped: 3623
HURT: 1
helped stats (abs) min: 1 max: 181 x̄: 6.89 x̃: 4
helped stats (rel) min: 0.10% max: 25.00% x̄: 3.93% x̃: 3.20%
HURT stats (abs)   min: 92 max: 92 x̄: 92.00 x̃: 92
HURT stats (rel)   min: 1.92% max: 1.92% x̄: 1.92% x̃: 1.92%
95% mean confidence interval for instructions value: -7.10 -6.63
95% mean confidence interval for instructions %-change: -4.03% -3.82%
Instructions are helped.

total cycles in shared programs: 369738930 -> 369535732 (-0.05%)
cycles in affected programs: 68027851 -> 67824653 (-0.30%)
helped: 2609
HURT: 1035
helped stats (abs) min: 1 max: 4508 x̄: 181.44 x̃: 77
helped stats (rel) min: <.01% max: 71.31% x̄: 9.14% x̃: 5.47%
HURT stats (abs)   min: 1 max: 33336 x̄: 261.04 x̃: 20
HURT stats (rel)   min: <.01% max: 47.61% x̄: 2.93% x̃: 1.47%
95% mean confidence interval for cycles value: -96.43 -15.09
95% mean confidence interval for cycles %-change: -6.07% -5.36%
Cycles are helped.

total spills in shared programs: 10158 -> 10159 (<.01%)
spills in affected programs: 166 -> 167 (0.60%)
helped: 1
HURT: 1

total fills in shared programs: 22105 -> 22116 (0.05%)
fills in affected programs: 837 -> 848 (1.31%)
helped: 4
HURT: 1

Ivy Bridge
total instructions in shared programs: 12021190 -> 11990256 (-0.26%)
instructions in affected programs: 910561 -> 879627 (-3.40%)
helped: 3344
HURT: 18
helped stats (abs) min: 1 max: 99 x̄: 9.29 x̃: 6
helped stats (rel) min: 0.11% max: 31.18% x̄: 5.19% x̃: 3.31%
HURT stats (abs)   min: 2 max: 20 x̄: 7.89 x̃: 6
HURT stats (rel)   min: 0.70% max: 2.59% x̄: 1.63% x̃: 1.70%
95% mean confidence interval for instructions value: -9.49 -8.91
95% mean confidence interval for instructions %-change: -5.32% -4.98%
Instructions are helped.

total cycles in shared programs: 179077826 -> 178570196 (-0.28%)
cycles in affected programs: 63205667 -> 62698037 (-0.80%)
helped: 2767
HURT: 620
helped stats (abs) min: 1 max: 7531 x̄: 217.58 x̃: 88
helped stats (rel) min: <.01% max: 75.86% x̄: 9.59% x̃: 6.09%
HURT stats (abs)   min: 1 max: 31255 x̄: 152.27 x̃: 11
HURT stats (rel)   min: <.01% max: 36.36% x̄: 2.77% x̃: 0.58%
95% mean confidence interval for cycles value: -173.94 -125.81
95% mean confidence interval for cycles %-change: -7.68% -6.97%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10852569 -> 10843758 (-0.08%)
instructions in affected programs: 235803 -> 226992 (-3.74%)
helped: 800
HURT: 0
helped stats (abs) min: 1 max: 88 x̄: 11.01 x̃: 8
helped stats (rel) min: 0.11% max: 23.08% x̄: 4.69% x̃: 3.36%
95% mean confidence interval for instructions value: -11.93 -10.10
95% mean confidence interval for instructions %-change: -4.99% -4.39%
Instructions are helped.

total cycles in shared programs: 154732047 -> 154608941 (-0.08%)
cycles in affected programs: 4063110 -> 3940004 (-3.03%)
helped: 606
HURT: 253
helped stats (abs) min: 1 max: 2524 x̄: 227.93 x̃: 62
helped stats (rel) min: 0.02% max: 39.24% x̄: 4.36% x̃: 1.81%
HURT stats (abs)   min: 1 max: 1966 x̄: 59.36 x̃: 11
HURT stats (rel)   min: 0.02% max: 67.10% x̄: 3.22% x̃: 0.67%
95% mean confidence interval for cycles value: -170.49 -116.13
95% mean confidence interval for cycles %-change: -2.61% -1.65%
Cycles are helped.

No change on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
09b7e1d8e4 nir/opt_peephole_select: Don't try to remove flow control around indirect loads
That flow control may be trying to avoid invalid loads.  On at least
some platforms, those loads can also be expensive.
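
To illustrate why such flow control can matter, the following sketch (Python, hypothetical function names) contrasts a guarded indirect load with its flattened form. In the flattened form the load always executes, so an out-of-range index is dereferenced even though its result would be discarded.

```python
def guarded_load(data, index):
    """Original control flow: only load when the index is valid."""
    if 0 <= index < len(data):
        return data[index]
    return 0


def flattened_load(data, index):
    """Flattened form: load unconditionally, then select the result."""
    loaded = data[index]  # runs even when the index is out of range
    return loaded if 0 <= index < len(data) else 0


print(guarded_load([1, 2, 3], 5))
try:
    flattened_load([1, 2, 3], 5)
except IndexError:
    print("flattened load touched an invalid element")
```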

No shader-db changes on any Intel platform (even with the later patch
"intel/compiler: More peephole select").

v2: Add an 'indirect_load_ok' flag to nir_opt_peephole_select.  Suggested
by Rob.  See also the big comment in src/intel/compiler/brw_nir.c.

v3: Use nir_deref_instr_has_indirect instead of deref_has_indirect (from
nir_lower_io_arrays_to_elements.c).

v4: Fix inverted condition in brw_nir.c.  Noticed by Lionel.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
4cd1a0be76 i965/vec4: Propagate conditional modifiers from more compares to other compares
If there is a CMP.NZ that compares a single component (via a .zzzz
swizzle, for example) with 0, it can propagate its conditional modifier
back to a previous CMP that writes only that component.  The specific
case that I saw was:

    cmp.l.f0(8)     g42<1>.xF       g61<4>.xF       (abs)g18<4>.zF
    ...
    cmp.nz.f0(8)    null<1>D        g42<4>.xD       0D

In this case we can just delete the second CMP.

No changes on Broadwell or Skylake because they do not use the vec4
backend.  Also no changes on GM45 or Iron Lake.

Sandy Bridge, Ivy Bridge, and Haswell had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10856676 -> 10852569 (-0.04%)
instructions in affected programs: 228322 -> 224215 (-1.80%)
helped: 1331
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 3.09 x̃: 4
helped stats (rel) min: 0.11% max: 6.67% x̄: 1.88% x̃: 1.83%
95% mean confidence interval for instructions value: -3.19 -2.99
95% mean confidence interval for instructions %-change: -1.93% -1.83%
Instructions are helped.

total cycles in shared programs: 154788865 -> 154732047 (-0.04%)
cycles in affected programs: 2485892 -> 2429074 (-2.29%)
helped: 1097
HURT: 59
helped stats (abs) min: 2 max: 168 x̄: 51.96 x̃: 64
helped stats (rel) min: 0.12% max: 12.70% x̄: 3.44% x̃: 2.22%
HURT stats (abs)   min: 2 max: 16 x̄: 3.02 x̃: 2
HURT stats (rel)   min: 0.18% max: 0.83% x̄: 0.64% x̃: 0.71%
95% mean confidence interval for cycles value: -51.04 -47.26
95% mean confidence interval for cycles %-change: -3.40% -3.07%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
9a83c3d3b3 i965/fs: Eliminate unary op on operand of compare-with-zero
The (-abs(x) >= 0) => (x == 0) optimization is removed from the vec4 and
scalar parts. In the VS part, adding the new pattern was not
helpful. The pattern that is removed is really old, and it has been
handled by NIR for ages.

All Gen7+ platforms had similar results. (Broadwell shown)
total instructions in shared programs: 14715715 -> 14715709 (<.01%)
instructions in affected programs: 474 -> 468 (-1.27%)
helped: 6
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.12% max: 1.35% x̄: 1.28% x̃: 1.35%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.40% -1.15%
Instructions are helped.

total cycles in shared programs: 559569911 -> 559569809 (<.01%)
cycles in affected programs: 5963 -> 5861 (-1.71%)
helped: 6
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 1.45% max: 1.88% x̄: 1.73% x̃: 1.85%
95% mean confidence interval for cycles value: -18.15 -15.85
95% mean confidence interval for cycles %-change: -1.95% -1.51%
Cycles are helped.

Iron Lake and Sandy Bridge had similar results. (Iron Lake shown)
total instructions in shared programs: 7780915 -> 7780913 (<.01%)
instructions in affected programs: 246 -> 244 (-0.81%)
helped: 2
HURT: 0

total cycles in shared programs: 177876108 -> 177876106 (<.01%)
cycles in affected programs: 3636 -> 3634 (-0.06%)
helped: 1
HURT: 0

GM45
total instructions in shared programs: 4799152 -> 4799151 (<.01%)
instructions in affected programs: 126 -> 125 (-0.79%)
helped: 1
HURT: 0

total cycles in shared programs: 122052654 -> 122052652 (<.01%)
cycles in affected programs: 3640 -> 3638 (-0.05%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
440c051340 i965/vec4/dce: Don't narrow the write mask if the flags are used
In an instruction sequence like

            cmp(8).ge.f0.0 vgrf17:D, vgrf2.xxxx:D, vgrf9.xxxx:D
    (+f0.0) sel(8) vgrf1:UD, vgrf8.xyzw:UD, vgrf1.xyzw:UD

The other fields of vgrf17 may be unused, but the CMP still needs to
generate the other flag bits.
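
A simplified sketch of the rule, assuming a 4-bit xyzw write mask and a hypothetical helper name. The real pass works on vec4 IR with liveness data; this only models the decision of whether the mask may be narrowed.

```python
def narrow_writemask(writemask, live_channels, flag_written_and_live):
    """Return the write mask an instruction may keep after DCE.

    writemask and live_channels are 4-bit xyzw masks.
    """
    if flag_written_and_live:
        # The per-channel flag bits are still consumed, so every
        # originally enabled channel must keep executing.
        return writemask
    # Otherwise only channels whose results are read need to be written.
    return writemask & live_channels


# CMP writes xyzw, only .x of the destination is live, flag read by SEL.
print(bin(narrow_writemask(0b1111, 0b0001, True)))
print(bin(narrow_writemask(0b1111, 0b0001, False)))
```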

To my surprise, nothing in shader-db or any test suite appears to hit
this.  However, I have a change to brw_vec4_cmod_propagation that
creates cases where this can happen.  This fix prevents a couple dozen
regressions in that patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5df88c20 ("i965/vec4: Rewrite dead code elimination to use live in/out.")
2018-12-17 13:47:06 -08:00
Ian Romanick
111bcc8d02 i965/vec4: Silence unused parameter warnings in vec4 compiler tests
src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual brw::dst_reg* copy_propagation_vec4_visitor::make_reg_for_system_value(int)’:
src/intel/compiler/test_vec4_copy_propagation.cpp:57:51: warning: unused parameter ‘location’ [-Wunused-parameter]
    virtual dst_reg *make_reg_for_system_value(int location)
                                                   ^~~~~~~~
src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual void copy_propagation_vec4_visitor::emit_urb_write_header(int)’:
src/intel/compiler/test_vec4_copy_propagation.cpp:77:43: warning: unused parameter ‘mrf’ [-Wunused-parameter]
    virtual void emit_urb_write_header(int mrf)
                                           ^~~
src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual brw::vec4_instruction* copy_propagation_vec4_visitor::emit_urb_write_opcode(bool)’:
src/intel/compiler/test_vec4_copy_propagation.cpp:82:57: warning: unused parameter ‘complete’ [-Wunused-parameter]
    virtual vec4_instruction *emit_urb_write_opcode(bool complete)
                                                         ^~~~~~~~
src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual brw::dst_reg* register_coalesce_vec4_visitor::make_reg_for_system_value(int)’:
src/intel/compiler/test_vec4_register_coalesce.cpp:60:51: warning: unused parameter ‘location’ [-Wunused-parameter]
    virtual dst_reg *make_reg_for_system_value(int location)
                                                   ^~~~~~~~
src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual void register_coalesce_vec4_visitor::emit_urb_write_header(int)’:
src/intel/compiler/test_vec4_register_coalesce.cpp:80:43: warning: unused parameter ‘mrf’ [-Wunused-parameter]
    virtual void emit_urb_write_header(int mrf)
                                           ^~~
src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual brw::vec4_instruction* register_coalesce_vec4_visitor::emit_urb_write_opcode(bool)’:
src/intel/compiler/test_vec4_register_coalesce.cpp:85:57: warning: unused parameter ‘complete’ [-Wunused-parameter]
    virtual vec4_instruction *emit_urb_write_opcode(bool complete)
                                                         ^~~~~~~~
src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual brw::dst_reg* cmod_propagation_vec4_visitor::make_reg_for_system_value(int)’:
src/intel/compiler/test_vec4_cmod_propagation.cpp:60:51: warning: unused parameter ‘location’ [-Wunused-parameter]
    virtual dst_reg *make_reg_for_system_value(int location)
                                                   ^~~~~~~~
src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual void cmod_propagation_vec4_visitor::emit_urb_write_header(int)’:
src/intel/compiler/test_vec4_cmod_propagation.cpp:85:43: warning: unused parameter ‘mrf’ [-Wunused-parameter]
    virtual void emit_urb_write_header(int mrf)
                                           ^~~
src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual brw::vec4_instruction* cmod_propagation_vec4_visitor::emit_urb_write_opcode(bool)’:
src/intel/compiler/test_vec4_cmod_propagation.cpp:90:57: warning: unused parameter ‘complete’ [-Wunused-parameter]
    virtual vec4_instruction *emit_urb_write_opcode(bool complete)
                                                         ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Bas Nieuwenhuizen
f67dea5e19 radv: Fix multiview depth clears
We were not using the view mask for depth clears, causing only the
first view to be cleared.

Fixes: 2e86f6b259 "radv: Add multiview clears."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 20:16:26 +00:00
Bas Nieuwenhuizen
9add63a3a5 radv: Remove redundant format check.
The switch directly after the check has a default case that returns
NULL too, so the effective return value is not changed. Also this
check is wrong once we start dealing with formats introduced by an
extension (e.g. YUV formats).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 20:09:38 +00:00
Eric Anholt
708d8f4d0a nir: Fix clamping of uints for image store lowering.
I botched some copy-and-paste and clamped to signed int max instead of
uint max.  Fixes KHR-GL46.shader_image_load_store.multiple-uniforms on
skl.
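
For illustration, the difference between the two limits can be sketched as follows. The helper names are hypothetical and this is not the NIR lowering code, only the arithmetic behind the fix.

```python
def uint_max(bits):
    """Largest value of an unsigned integer with the given width."""
    return (1 << bits) - 1


def int_max(bits):
    """Largest value of a signed integer with the given width."""
    return (1 << (bits - 1)) - 1


def clamp_uint(value, bits):
    """Clamp an unsigned value to what fits in an unsigned channel."""
    return min(value, uint_max(bits))


# Clamping 200 for an 8-bit unsigned channel.
print(clamp_uint(200, 8))      # correct limit keeps the value
print(min(200, int_max(8)))    # the signed limit cuts it down too early
```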

Fixes: d3e046e76c ("nir: Pull some of intel's image load/store format
conversion to nir_format.h")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-17 20:02:22 +00:00
Eric Anholt
00e2cbc049 v3d: Fix the argument type for vir_BRANCH().
Apparently this has been spewing warnings for Jason's clang, but not my
gcc.
2018-12-17 09:52:23 -08:00
Eric Anholt
376054fff3 vc4: Reuse nir_format_convert.h in our blend lowering.
These helpers came along after and have effectively the same
implementation.
2018-12-17 09:52:23 -08:00
Samuel Pitoiset
445867c80d radv: report Vulkan version 1.1.90 for real
I thought the value was correctly propagated, but it actually was not.

Fixes: 2ac6d55f38 ("radv: bump reported version to 1.1.90")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-17 17:51:48 +01:00
Jason Ekstrand
cae373117c anv,radv: Re-enable VK_EXT_pci_bus_info
Now at version 2 with the fixed header.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 10:42:35 -06:00
Jason Ekstrand
e5b59fe6f5 vulkan: Update the XML and headers to 1.1.96
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 10:41:56 -06:00
Rhys Perry
ef198e8c6a radv: switch from nir_bcsel to nir_b32csel
Fixes: 191a1dce92 ('nir: Add 1-bit Boolean opcodes')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 14:52:39 +00:00
Rhys Perry
bba94a3d85 radv: don't set surf_index for stencil-only images
Fixes: f8d5b377c8 ('radv: set cb base tile swizzles for MRT speedups (v4)')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108116
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 14:52:10 +00:00
Ian Romanick
9dc135efa1 nir: Release per-block metadata in nir_sweep
nir_sweep already marks all metadata invalid, so it is safe to release
the memory here too.

mean soft fp64 using uint64:   1,342,759,331 => 1,010,670,475
gfxbench5 aztec ruins high 11:    63,555,571 =>    61,889,811
deus ex mankind divided 148:      62,845,304 =>    62,829,640
deus ex mankind divided 2890:     71,922,686 =>    71,922,686
dirt showdown 676:                69,238,607 =>    69,238,607
dolphin ubershaders 210:          77,822,072 =>    77,822,072

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 14:39:56 -08:00
Ian Romanick
7adafd6e1c nir: Fix holes in nir_instr
Found using pahole.
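
As a generic illustration of what a "hole" is, the sketch below uses Python's ctypes to lay out two structs with the same fields in different order. The field names are hypothetical and do not reflect the real layout of nir_instr; exact sizes depend on the platform ABI.

```python
import ctypes


class WithHoles(ctypes.Structure):
    # A wide field between two narrow ones forces padding after each
    # narrow field.
    _fields_ = [("kind", ctypes.c_uint8),
                ("payload", ctypes.c_uint64),
                ("pass_flags", ctypes.c_uint8)]


class Reordered(ctypes.Structure):
    # Same fields, widest first: the narrow fields share one padded slot.
    _fields_ = [("payload", ctypes.c_uint64),
                ("kind", ctypes.c_uint8),
                ("pass_flags", ctypes.c_uint8)]


print(ctypes.sizeof(WithHoles), ctypes.sizeof(Reordered))
```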

Changes in peak memory usage according to Valgrind massif:

mean soft fp64 using uint64:   1,343,991,403 => 1,342,759,331
gfxbench5 aztec ruins high 11:    63,619,971 =>    63,555,571
deus ex mankind divided 148:      62,887,728 =>    62,845,304
deus ex mankind divided 2890:     72,399,750 =>    71,922,686
dirt showdown 676:                69,464,023 =>    69,238,607
dolphin ubershaders 210:          78,359,728 =>    77,822,072

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 14:39:56 -08:00
Ian Romanick
8161a87b24 nir/phi_builder: Use per-value hash table to store [block] -> def mapping
Replace the old array in each value with a hash table in each value.
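
A minimal sketch of the two storage schemes, with hypothetical class and method names. The dense form pays for every block up front, while the per-value table only holds entries for blocks that actually get a def.

```python
class DenseValue:
    """Old scheme: one slot per block, allocated up front."""

    def __init__(self, num_blocks):
        self.defs = [None] * num_blocks


class SparseValue:
    """New scheme: a per-value table keyed by block index."""

    def __init__(self):
        self.defs = {}

    def set_block_def(self, block_index, definition):
        self.defs[block_index] = definition

    def get_block_def(self, block_index):
        # Blocks without an entry simply have no def recorded yet.
        return self.defs.get(block_index)


dense = DenseValue(num_blocks=1000)
sparse = SparseValue()
sparse.set_block_def(7, "ssa_1")
print(len(dense.defs), len(sparse.defs))
```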

Changes in peak memory usage according to Valgrind massif:

mean soft fp64 using uint64:   5,499,875,082 => 1,343,991,403
gfxbench5 aztec ruins high 11:    63,619,971 =>    63,619,971
deus ex mankind divided 148:      62,887,728 =>    62,887,728
deus ex mankind divided 2890:     72,402,222 =>    72,399,750
dirt showdown 676:                74,466,431 =>    69,464,023
dolphin ubershaders 210:         109,630,376 =>    78,359,728

Run-time change for a full run on shader-db on my Haswell desktop (with
-march=native) is 1.22245% +/- 0.463879% (n=11).  This is about +2.9
seconds on a 237 second run.  The first time I sent this version of this
patch out, the run-time data was quite different.  I had misconfigured
the script that ran the test, and none of the tests from higher GLSL
versions were run.  These are generally more complex shaders, and they
are more affected by this change.

The previous version of this patch used a single hash table for the
whole phi builder.  The mapping was from [value, block] -> def, so a
separate allocation was needed for each [value, block] tuple.  There was
quite a bit of per-allocation overhead (due to ralloc), so the patch was
followed by a patch that added the use of the slab allocator.  The
results of those two patches were not quite as good:

mean soft fp64 using uint64:   5,499,875,082 => 1,343,991,403
gfxbench5 aztec ruins high 11:    63,619,971 =>    63,619,971
deus ex mankind divided 148:      62,887,728 =>    62,887,728
deus ex mankind divided 2890:     72,402,222 =>    72,402,222 *
dirt showdown 676:                74,466,431 =>    72,443,591 *
dolphin ubershaders 210:         109,630,376 =>    81,034,320 *

The * denotes tests that are better now.  In the tests that are the same
in both patches, the "after" peak memory usage was at a different
location.  I did not check the local peaks.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 14:39:56 -08:00
Ian Romanick
e3043e1276 util/hash_table: Add _mesa_hash_table_init function
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 14:39:56 -08:00
Jason Ekstrand
db197fdb6c st/nir: Use nir_src_as_uint for tokens
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-16 15:07:28 -06:00
Jason Ekstrand
47e1e0692c radv: Fix a stupid if in gather_intrinsic_info
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 15:06:07 -06:00
Jason Ekstrand
6bcd2af086 nir/algebraic: Add some optimizations for D3D-style Booleans
D3D Booleans use a 32-bit 0/-1 representation.  Because this previously
matched NIR exactly, we didn't have to really optimize for it.  Now that
we have 1-bit Booleans, we need some specific optimizations to chew
through the D3D12-style Booleans.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15136811 -> 14967944 (-1.12%)
    instructions in affected programs: 2457021 -> 2288154 (-6.87%)
    helped: 8318
    HURT: 10

    total cycles in shared programs: 373544524 -> 359701825 (-3.71%)
    cycles in affected programs: 151029683 -> 137186984 (-9.17%)
    helped: 7749
    HURT: 682

    total loops in shared programs: 4431 -> 4399 (-0.72%)
    loops in affected programs: 32 -> 0
    helped: 21
    HURT: 0

    total spills in shared programs: 10290 -> 10051 (-2.32%)
    spills in affected programs: 2532 -> 2293 (-9.44%)
    helped: 18
    HURT: 18

    total fills in shared programs: 22203 -> 21732 (-2.12%)
    fills in affected programs: 3319 -> 2848 (-14.19%)
    helped: 18
    HURT: 18

Note that a large chunk of the improvement is fixing regressions caused by
switching to 1-bit Booleans.  Previously, our ability to optimize D3D
booleans was improved by using the D3D representation directly in NIR.
Now that NIR does 1-bit bools, we need a few more optimizations.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
3b30814791 nir/algebraic: Optimize 1-bit Booleans
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
44227453ec nir: Switch to using 1-bit Booleans for almost everything
This is a squash of a few distinct changes:

    glsl,spirv: Generate 1-bit Booleans

    Revert "Use 32-bit opcodes in the NIR producers and optimizations"

    Revert "nir/builder: Generate 32-bit bool opcodes transparently"

    nir/builder: Generate 1-bit Booleans in nir_build_imm_bool

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
11dc130779 nir: Add a bool to int32 lowering pass
We also enable it in all of the NIR drivers.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
191a1dce92 nir: Add 1-bit Boolean opcodes
We also have to add support for 1-bit integers while we're here so we
get 1-bit variants of iand, ior, and inot.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
615cc26b97 nir/algebraic: Generalize an optimization
This just makes it nicely scale across bit sizes.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
487514ae61 nir/large_constants: Properly handle 1-bit bools
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
3191a82372 nir: Add support for 1-bit data types
This commit adds support for 1-bit Booleans and integers.  Booleans
obviously take a value of true or false.  Because we have to define the
semantics of 1-bit signed and unsigned integers, we define uint1_t to
take values of 0 and 1 and int1_t to take values of 0 and -1.  1-bit
arithmetic is then well-defined in the usual way, just with fewer bits.
The definition of int1_t and uint1_t doesn't usually matter but we do
need something for purposes of constant folding.
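
The value ranges above fall out of ordinary two's-complement wrapping with a width of one bit. A small sketch, using hypothetical helper names rather than the generated constant-folding code:

```python
def wrap_uint(value, bits):
    """Truncate to an unsigned integer of the given width."""
    return value & ((1 << bits) - 1)


def wrap_int(value, bits):
    """Truncate to a two's-complement signed integer of the given width."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


# With bits=1 the only representable values are {0, 1} and {0, -1}.
print(sorted({wrap_uint(v, 1) for v in range(-4, 4)}))
print(sorted({wrap_int(v, 1) for v in range(-4, 4)}))
# Arithmetic wraps as usual: 1 + 1 and (-1) + (-1) both give 0.
print(wrap_uint(1 + 1, 1), wrap_int(-1 + -1, 1))
```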

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
2fe8708ffd nir/constant_expressions: Rework Boolean handling
This commit contains three related changes.  First, we define boolN_t
for N = 8, 16, and 64 and move the definition of boolN_vec to the loop
with the other vec definitions.  Second, there's no reason why we need
the != 0 on the source because that happens implicitly when it's
converted to bool.  Third, for destinations, we use a signed integer
type and just do -(int)bool_val which will give us the 0/-1 behavior we
want and neatly scales to all bit widths.
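
The third point can be sketched as follows. This Python model (hypothetical helper name) negates the Boolean as an integer and then shows the resulting bit pattern at several widths; the generated C code does the same with a signed integer type.

```python
def bool_to_mask(bool_val, bits):
    """Model -(int)bool_val stored in a signed integer of 'bits' width.

    Returns the raw bit pattern: all zeros for false, all ones for true.
    """
    signed = -int(bool_val)  # 0 or -1
    return signed & ((1 << bits) - 1)


for width in (1, 8, 16, 32, 64):
    print(width, hex(bool_to_mask(True, width)), hex(bool_to_mask(False, width)))
```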

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
80e8dfe9de nir: Rename Boolean-related opcodes to include 32 in the name
This is a squash of a bunch of individual changes:

    nir/builder: Generate 32-bit bool opcodes transparently

    nir/algebraic: Remap Boolean opcodes to the 32-bit variant

    Use 32-bit opcodes in the NIR producers and optimizations

        Generated with a little hand-editing and the following sed commands:

        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c

    Use 32-bit opcodes in the NIR back-ends

        Generated with a little hand-editing and the following sed commands:

        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c
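
The `ne32g` fix-up in the lists above is there because the `[fiu]ne` pattern also matches the start of `nir_op_fneg` and `nir_op_ineg`. The following sketch reproduces just those two substitutions with Python's `re` module to show the effect; it is an illustration, not part of the original change.

```python
import re


def rename_ne(source):
    """First pass: append 32 to fne/ine/une, as the sed command does."""
    return re.sub(r"nir_op_([fiu]ne)", r"nir_op_\g<1>32", source)


def fix_neg(source):
    """Second pass: undo the accidental hit on fneg/ineg."""
    return re.sub(r"nir_op_([fi])ne32g", r"nir_op_\1neg", source)


for name in ("nir_op_fne", "nir_op_fneg", "nir_op_ineg"):
    print(name, "->", rename_ne(name), "->", fix_neg(rename_ne(name)))
```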

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
b569093566 nir/algebraic: Make an optimization more specific
Later in this series, bool is not going to imply 32-bit.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
517099809a nir: Drop support for lower_b2f
This was originally added for the out-of-tree Mali driver but I think
we've all agreed it's easy enough for them to just do in their back-end.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
4bb1a34727 nir/algebraic: Optimize x2b(xneg(a)) -> a
Shader-db results on Kaby Lake:

    total instructions in shared programs: 15072525 -> 15072525 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

This helps prevent regressions in later commits.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
3595a0abf4 nir/constant_folding: Fix source bit size logic
Instead of looking at input_sizes[i] which contains the number of
components for each source, we look at the bit size of input_types[i].
This fixes a regression in the 1-bit boolean series though I have no
idea how we haven't seen it before now.

Fixes: 35baee5dce "nir/constant_folding: fix incorrect bit-size check"
Fixes: 9076c4e289 "nir: update opcode definitions for different bit sizes"
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
9f7bd843af nir/tgsi: Use nir_bany in ttn_kill_if
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
e17426058c nir/lower_idiv: Use ilt instead of bit twiddling
The previous code was creating a boolean by doing an arithmetic right-
shift by 31 which produces a boolean which is true if the argument is
negative.  This is the same as the expression r < 0 which is much
simpler and doesn't depend on NIR's representation of booleans.
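
The equivalence can be checked with a small model of 32-bit signed arithmetic. The helper names are hypothetical; Python's `>>` on negative integers is an arithmetic shift, which matches the behaviour described above.

```python
def to_int32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def negative_by_shift(r):
    """Old approach: arithmetic shift right by 31 gives -1 or 0."""
    return (to_int32(r) >> 31) != 0


def negative_by_compare(r):
    """New approach: a plain signed comparison."""
    return to_int32(r) < 0


samples = [0, 1, -1, 5, -5, 2**31 - 1, -2**31]
print(all(negative_by_shift(r) == negative_by_compare(r) for r in samples))
```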

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-16 21:03:02 +00:00
Eric Anholt
2977c77758 v3d: Use the original bit size when scalarizing uniform loads.
Prevents a regression in jekstrand's 1-bit series.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 21:03:01 +00:00
Eric Anholt
91a0251dbc vc4: Use the original bit size when scalarizing uniform loads.
Prevents a regression in jekstrand's 1-bit series.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 21:03:01 +00:00
Rhys Perry
bde9f482de ac: split 16-bit ssbo loads that may not be dword aligned
Fixes: 7e7ee82698 ('ac: add support for 16bit buffer loads')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108114
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-16 14:56:10 +00:00
Rhys Perry
12dc7cb202 ac: refactor visit_load_buffer
This is so that we can split different types of loads more easily.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-16 14:56:10 +00:00
Rhys Perry
ed4020fabe nir: fix constness in nir_intrinsic_align()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-16 14:56:10 +00:00
Jan Vesely
e4f9a37ace clover: Fix build after clang r348827
CodeGenOptions were moved to Basic.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
CC: mesa-stable@lists.freedesktop.org
2018-12-16 06:38:10 -05:00
Jon Turney
d512b35b62 glx: Fix compilation with GLX_USE_WINDOWSGL
Sadly, the GLX_USE_APPLEGL and GLX_USE_WINDOWSGL cases are not identical
(because GLX_USE_WINDOWSGL uses vtables rather than a maze of ifdefs)

Include <sys/time.h> again, as functions prototyped by it are used in
the GLX_USE_WINDOWSGL path.

Make the include guard around the __glxGetMscRate() definition match the
one at its declaration again, as it's referenced from dri_common.c
which is built for GLX_USE_WINDOWSGL.

Fixes: a95ec138 ("glx: mandate xf86vidmode only for "drm" dri platforms")
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-15 13:49:24 +00:00
Eric Anholt
29927e7524 v3d: Drop in a bunch of notes about performance improvement opportunities.
These have all been floating in my head, and while I've thought about
encoding them in issues on gitlab once they're enabled, they also make
sense to just have in the area of the code you'll need to work in.
2018-12-14 17:48:01 -08:00
Eric Anholt
248a7fb392 v3d: Do uniform pretty-printing in the QPU dump.
If you're trying to trace what's going on in a QPU dump, this will
definitely help you find your way.
2018-12-14 17:48:01 -08:00
Eric Anholt
a370ed76ab v3d: Use the uniform pretty-printer in v3d_write_uniforms()'s debug code.
This will be a lot easier than my usual "38400.000000?  that looks like a
viewport scale" decoding strategy.
2018-12-14 17:48:01 -08:00
Eric Anholt
532b6c5671 v3d: Move uniform pretty-printing to its own helper function.
I want to reuse it in the QPU dump.
2018-12-14 17:48:01 -08:00
Eric Anholt
78ef05bde4 v3d: Move uinfo->data[] dereference to the top of v3d_write_uniforms().
Follows 3954331aff ("vc4: Pull uinfo->data[i] dereference out to the top
of the loop.") which showed a large performance win for vc4, but also
cleans up the code a decent bit.
2018-12-14 17:48:01 -08:00
Eric Anholt
a7e15a5086 v3d: Avoid assertion failures when removing end-of-shader instructions.
After generating VIR, we leave c->cursor pointing at the end of the
shader.  If the shader had dead code at the end (for example from preamble
instructions in a shader with no side effects), we would assertion fail
that we were leaving the cursor pointing at freed memory.  Since anything
following DCE should be setting up a new cursor anyway, just clear the
cursor at the start.
2018-12-14 17:48:01 -08:00
Eric Anholt
5b2cc03852 v3d: Add support for draw indirect for GLES3.1.
In trying to enable compute shaders, I found that a bunch of deqp-gles31's
compute stuff wanted to interact with indirect dispatch.  This was easy to
do on its own.
2018-12-14 17:48:01 -08:00
Eric Anholt
ff80e58b38 v3d: Add missing flagging of SYNCB as a TSY op.
Fixes: f2e41daac5 ("broadcom/vc5: Update QPU instruction pack/unpack for v4.2.")
2018-12-14 17:48:01 -08:00
Eric Anholt
3f9bcf9136 v3d: Make sure that a thrsw doesn't split a multop from its umul24.
The thrsw will invalidate rtop, just like accumulators and flags.  Caught
by simulator assertions in CS imulextended/umulextended tests.

Fixes: 90269ba353 ("broadcom/vc5: Use THRSW to enable multi-threaded shaders.")
2018-12-14 17:48:01 -08:00
Eric Anholt
332a5cf6a5 v3d: Add safety checks for resource_create().
This should ease my debugging next time I screw it up.
2018-12-14 17:48:01 -08:00
Eric Anholt
6ad9e8690d v3d: Add support for texturing from linear.
Just like vc4, we have to support linear shared BOs for X11 on arbitrary
displays.  When we're faced with a request to texture from one of those,
make a shadow image that we copy using the TFU at the start of the draw
call.
2018-12-14 17:48:01 -08:00
Eric Anholt
976ea90bdc v3d: Add support for using the TFU to do some blits.
This will be useful in particular for blits from raster to UIF for X11.
2018-12-14 17:48:01 -08:00
Eric Anholt
e5b4d1f55f v3d: Don't forget to bump the number of writes when doing TFU ops.
generatemipmap is just filling out the rest of the mipmap that's already
been written (by a mapping or a draw call), so it didn't matter.  As I
reuse the TFU code for linear-to-UIF conversions, it'll start mattering.
2018-12-14 17:48:01 -08:00
Eric Anholt
485df2574e v3d: Set up the right stride for raster TFU.
I didn't have any raster images in the generatemipmap path, so the
pixels-vs-bytes mixup didn't matter here.
2018-12-14 17:48:01 -08:00
Eric Anholt
e731d53716 v3d: Don't forget to wait for our TFU job before rendering from it.
Otherwise we may race to read old contents.  This didn't show up in the
CTS and piglit for me, but it did once I started using the TFU to do
linear->UIF blits for X11.

Fixes: 2ebca177dc ("v3d: Use the TFU to do generatemipmap.")
2018-12-14 17:48:01 -08:00
Ilia Mirkin
153d3fc5f9 nvc0: always keep TSC slot 0 bound to fix TXF
Same as on nv50, the TXF op always uses the TSC bound to slot 0,
returning blank values if nothing is bound.

An earlier change arranges for the TSC entries list to always have valid
data at entry 0, so here we just make use of it.

Fixes arb_texture_buffer_object-subdata-sync among others.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-14 20:01:31 -05:00
Ilia Mirkin
4aeaf89aa7 nvc0: replace use of explicit default_tsc with entry 0
This was used for implementing FBFETCH. However that uses TXF, which
doesn't do much with a TSC. The only important bit is that sRGB-decoding
works as expected, which we can achieve since all samplers we ever
generate enable sRGB-decoding. Always point to entry 0 in the TSC table,
and ensure that even before it ever gets initialized, the sRGB-decoding
enable bit is set.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-14 20:01:31 -05:00
Rob Clark
5f9085638a freedreno/a6xx: fix corrupted uniforms
For older gens, fd_wfi() is used to conditionally insert a WFI if there
hasn't already been one since the last draw.  But this doesn't work out
well with stateobj, since the order in which the stateobj is evaluated
might not be what you expect.  (I.e. the stateobj might not be evaluated
until a later draw if there is no geometry from the current draw in a
given tile.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-14 15:01:30 -05:00
Alex Deucher
4db4b3447d pci_ids: add new vega20 pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-12-14 14:48:39 -05:00
Alex Deucher
56cf25a114 pci_ids: add new vega10 pci ids
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-12-14 14:48:18 -05:00
Rafael Antognolli
5c454661c6 i965/gen9: Add workarounds for object preemption.
Gen9 hardware requires some workarounds to disable preemption depending
on the type of primitive being emitted.

We implement this by adding a function that checks the primitive type
and number of instances right before the 3DPRIMITIVE.

For now, we just ignore blorp.  The only primitive it emits is
3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
safely leave preemption enabled when it happens. Or it will be disabled
by a previous 3DPRIMITIVE, which should be fine too.

v3:
 - Apply missing workarounds for instanced rendering and line loop (Ken)
 - Move workaround code to brw_draw_single_prim()

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Rafael Antognolli
d8b50e152a i965/gen10+: Enable object level preemption.
Set bit when initializing context.

v3:
 - Always toggle preemption bool to false before enabling it for the
 first time, so the state gets emitted (Chris Wilson).
 - Emit end of pipe sync with PIPE_CONTROL_RENDER_TARGET_FLUSH (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Rafael Antognolli
019a92ffa4 intel/genxml: Add register for object preemption.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Ian Romanick
a6b7d1151c util/slab: Rename slab_mempool typed parameters to mempool
Now everything with type 'struct slab_child_pool *' is named pool, and
everything with type 'struct slab_mempool *' is named mempool.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-14 07:36:05 -08:00
Ian Romanick
ba5402ec9a nir/phi_builder: Internal users should use nir_phi_builder_value_set_block_def too
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-14 07:36:05 -08:00
Christian Gmeiner
489ffaf0c1 etnaviv: drop redundant ctx function parameter
There is no need to have an extra ctx parameter as all the other
parameters carry all the needed information.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-12-14 11:23:00 +01:00
Kenneth Graunke
0b44644ca6 genxml: Consistently use a numeric "MOCS" field
When we first started using genxml, we decided to represent MOCS as an
actual structure, and pack values.  However, in many places, it was more
convenient to use a numeric value rather than treating it as a struct,
so we added secondary setters in a bunch of places as well.

We were not entirely consistent, either.  Some places only had one.
Gen6 had both kinds of setters for STATE_BASE_ADDRESS, but newer gens
only had the struct-based setters.  The names were sometimes "Constant
Buffer Object Control State" instead of "Memory", making it harder to
find.  Many had prefixes like "Vertex Buffer MOCS"...in a vertex buffer
packet...which is a bit redundant.

On modern hardware, MOCS is simply an index into a table, but we were
still carrying around the structure with an "Index to MOCS Table" field,
in addition to the direct numeric setters.  This is clunky - we really
just want a number on new hardware.

This patch eliminates the struct-based setters, and makes the numeric
setters be consistently called "MOCS".  We leave the struct definition
around on Gen7-8 for reference purposes, but it is unused.

v2: Drop bonus "Depth Buffer MOCS" fields on Gen7.5 and Gen9

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2018-12-14 00:44:54 -08:00
Timothy Arceri
a2ec78883f nir: fix opt_if_loop_last_continue()
The pass did not correctly handle loops ending in:

	if ssa_7 {
		block block_8:
		/* preds: block_7 */
		continue
		/* succs: block_1 */
	} else {
		block block_9:
		/* preds: block_7 */
		break
		/* succs: block_11 */
	}

The break will get eliminated by another opt but if this pass gets
called first (as it does on RADV) we ended up inserting
instructions after the break.

Fixes: 5921a19d4b ("nir: add if opt opt_if_loop_last_continue()")
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-12-14 17:21:35 +11:00
Rob Clark
0ac5acaeaa freedreno/a6xx: fix resource_copy_region()
pctx->resource_copy_region() needs to fall back to sw copy for
non-renderable formats.  But previously, for things that we could
not use the blitter for, we would fall back to 3d, which won't work
if 3d can't render to the dst format either.

Instead rework things to fall back to fd_resource_copy_region(),
which will try the 3d core and then fall back to memcpy().
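
As a rough illustration of the fallback order described above (blitter
first, then the 3d core, then memcpy()), here is a minimal Python sketch.
The function name, its parameters and the format sets are hypothetical
stand-ins for the driver's format checks, not the freedreno API.

```python
def pick_copy_path(fmt, blitter_formats, renderable_formats):
    """Return the first copy path able to handle `fmt`.

    Hypothetical model: the two sets stand in for the driver's
    "can the blitter do this?" and "can 3d render to this?" checks.
    """
    if fmt in blitter_formats:
        return "blitter"
    if fmt in renderable_formats:
        return "3d"
    # Nothing on the GPU can write this format: copy on the CPU.
    return "memcpy"
```

With a format that neither path supports, the sketch ends up at the
memcpy() step instead of stopping at a 3d path that cannot render to the
destination.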

Fixes (for example) dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
4ec2f6129b freedreno: move fd_resource_copy_region()
Code-motion prep for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
57b76ee2a8 freedreno/a6xx: more blitter fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
d15fc787bc freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
532f8c0043 gallium/aux: add is_unorm() helper
We already had one for is_snorm() but not unorm.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
85cd4df47f freedreno/a6xx: fix blitter crash
Fixes a crash with unsupported formats in dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot

Also fixes gpu hangs with some formats that are supported, but for which
we don't know what internal format to use for the blitter, for example
dEQP-GLES3.functional.texture.format.sized.2d_array.rgb10_a2_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
cca1e9606c freedreno/ir3: don't remove unused input components
Fixes: 0d240c2214 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
c19c4bf488 freedreno/ir3: fix crash
Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z

Fixes: 0d240c2214 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
3e8e033f4c freedreno: also set DUMP flag on shaders
If we emit shader as a pointer to a GEM object, also set the RELOC_DUMP
flag as a hint to kernel that this is a useful buffer to snapshot for
debug dumps.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
4cd016b5d6 freedreno: debug GEM obj names
With a recent enough kernel, set debug names for GEM BOs, which will
show up in $debugfs/gem

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
7ef722861b freedreno/drm: sync uapi and enable softpin
Pull in updated UAPI and use kernel API version to enable softpin.
Since MSM_SUBMIT_BO_DUMP flag was added at same time, use that to
signal to kernel that cmdstream buffers are useful to dump for
debugging/cmdstream-traces.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Eric Anholt
4407e688cd nir: Move intel's half-float image store lowering to nir_format.h.
I needed the same function for v3d.  This was originally in d3e046e76c
("nir: Pull some of intel's image load/store format conversion to
nir_format.h") before we made a mistake about simplifying the function.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:26 -08:00
Eric Anholt
3a417a044e Revert "intel: Simplify the half-float packing in image load/store lowering."
This reverts commit 06fbcd2cd5.
nir_pack_half_2x16_split *isn't* vectorizable, it's 1-component only, thus
why we had this split-scalar code in the first place.
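
To show why a 1-component pack forces per-pair scalar code, here is a
small Python sketch. It assumes the first argument lands in the low 16
bits and the second in the high 16 bits; the helper names and the 0.0
padding for an odd channel count are illustrative choices, not taken
from the Mesa sources.

```python
import struct


def pack_half_2x16_split(x, y):
    """Pack two floats as IEEE half floats into one 32-bit word.

    Assumption: x goes into the low 16 bits, y into the high 16 bits.
    """
    lo, = struct.unpack("<H", struct.pack("<e", x))
    hi, = struct.unpack("<H", struct.pack("<e", y))
    return lo | (hi << 16)


def pack_half_channels(channels):
    """Pack a colour vector two channels at a time.

    The pack works on one pair of scalars only, so a vector has to be
    split into pairs first. An odd count is padded with 0.0 here.
    """
    padded = list(channels) + [0.0] * (len(channels) % 2)
    return [pack_half_2x16_split(padded[i], padded[i + 1])
            for i in range(0, len(padded), 2)]
```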

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:24 -08:00
Eric Anholt
c2c44dba7a nir: Print the format of image variables.
This helps a lot when debugging image load/store lowering on large
testcases.  Unfortunately the Mesa enum name stuff is under src/mesa and
we can't get at it from the compiler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:12 -08:00
Eric Anholt
19ffcba161 mesa/st: Expose compute shaders when NIR support is advertised.
We have a NIR path, and V3D doesn't have TGSI input for compute (only what
TTN can handle for the various gallium-internal shaders).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-12-13 11:44:47 -08:00
Dave Airlie
b3f2b03ece radv/xfb: fix counter buffer bounds checks.
If we gave this function 0 counter buffers, we'd still try and
access pCounterBuffers[0] as this check was incorrect.
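
A minimal sketch of the kind of bounds check involved, written in Python
for brevity. The function and parameter names are hypothetical and the
logic is a generic model, not the RADV code: a slot only has a counter
buffer if its index, relative to the first counter buffer, falls inside
the array that was actually passed in.

```python
def counter_buffer_for(slot, first_counter_buffer, counter_buffers):
    """Return the counter buffer for `slot`, or None if there is none.

    `counter_buffers` models pCounterBuffers; an empty list models a
    call with 0 counter buffers.
    """
    idx = slot - first_counter_buffer
    # Reject indices before the first buffer and at or past the count.
    if idx < 0 or idx >= len(counter_buffers):
        return None
    return counter_buffers[idx]
```

With an empty list, every slot returns None and element 0 is never read.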

Fixes crash with ext_transform_feedback-pipeline-basic-primgen
on zink on radv.

Fixes: 677b496b6 (radv: fix begin/end transform feedback with 0 counter buffers.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-13 19:27:05 +00:00
Jason Ekstrand
9ebc00f32e i965: Enable nir_opt_idiv_const for 32 and 64-bit integers
The pass should work for all bit sizes but it's less clear that the
extra instructions are worth it on small integers.  Also, the hardware
doesn't do mul_high on anything other than 32-bit integers and, absent
any decent mechanism for testing the pass on 8 and 16-bit types, it's
probably best to just leave it disabled for now.

Shader-db results on Sky Lake:

    total instructions in shared programs: 15105795 -> 15111403 (0.04%)
    instructions in affected programs: 72774 -> 78382 (7.71%)
    helped: 0
    HURT: 265

Note that hurt here actually means helped because we're getting rid of
integer quotient operations (which are a send on some platforms!) and
replacing them with fairly cheap ALU ops.
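
The percentages in the table above are relative changes in instruction
count. As a quick sanity check, a sketch in Python (the helper name is
made up for illustration):

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Both rows add the same 5608 instructions; only the baseline differs.
total = percent_change(15105795, 15111403)
affected = percent_change(72774, 78382)
```

Rounded to two decimals these reproduce the 0.04% and 7.71% figures.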

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-13 17:49:48 +00:00
Jason Ekstrand
455ec7327d i965/vec4: Implement nir_op_uadd_sat
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-13 17:49:48 +00:00
Ian Romanick
e639d39faf i965/fs: Implement nir_op_uadd_sat
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 17:49:48 +00:00
Jason Ekstrand
74492ebad9 nir: Add a pass for lowering integer division by constants
It's a reasonably well-known fact in the world of compilers that integer
divisions by constants can be replaced by a multiply, an add, and some
shifts.  This commit adds such an optimization to NIR for the easiest
case, udiv.  Other division operations will be added in following commits.
In order to provide some additional driver control, the pass takes a
minimum bit size to optimize.
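
For readers unfamiliar with the trick, here is a minimal Python sketch of
unsigned division by a constant using the classic Granlund-Montgomery
"round-up" scheme. This is a textbook formulation shown as an assumption;
the NIR pass may pick its multiplier and shifts differently. The helper
names are hypothetical.

```python
def udiv_magic(d, bits=32):
    """Return (multiplier, post_shift) for unsigned division by d >= 2."""
    l = (d - 1).bit_length()                      # ceil(log2(d))
    m = ((1 << bits) * ((1 << l) - d)) // d + 1   # fits in `bits` bits
    return m, l - 1


def udiv_by_const(n, d, bits=32):
    """Compute n // d for 0 <= n < 2**bits without a divide on n."""
    if d == 1:
        return n
    m, post_shift = udiv_magic(d, bits)
    t = (m * n) >> bits                           # high half of the product
    # The add-and-shift fixup keeps the intermediate within `bits` bits.
    return (t + ((n - t) >> 1)) >> post_shift
```

The `(m * n) >> bits` step is the high half of a multiply, which is why
hardware support for mul_high at a given bit size matters for this kind
of lowering.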

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-13 17:49:48 +00:00
Ian Romanick
090e282407 nir: Add a saturated unsigned integer add opcode
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 17:49:48 +00:00
Jason Ekstrand
39198a1238 nir/lower_int64: Add support for [iu]mul_high
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-13 17:49:48 +00:00
Jason Ekstrand
9525971e2b nir: Allow [iu]mul_high on non-32-bit types
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-13 17:49:48 +00:00
Emil Velikov
a95ec13879 glx: mandate xf86vidmode only for "drm" dri platforms
Currently we have three dri "platforms" - drm, apple and windows.

Since xf86vidmode is a thing only for the drm one, adjust the
preprocessor guards and correctly check for the dependency.

v2: terminate the GLX_USE_WINDOWSGL hunk

Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Fixes: 5bc509363b ("glx: make xf86vidmode mandatory for direct rendering")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-13 17:38:19 +00:00
Alejandro Piñeiro
c7bdcd67aa nir: remove unused variable
To avoid the following warning:
./src/compiler/nir/nir_loop_analyze.c:807:16: warning: unused variable ‘ns’ [-Wunused-variable]
    nir_shader *ns = impl->function->shader;
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-13 16:35:21 +01:00
Erik Faye-Lund
e888f28d1f virgl: work around bad assumptions in virglrenderer
Virglrenderer does the wrong thing when given an instance divisor;
it tries to use the element-index rather than the binding-index as
the argument to glVertexBindingDivisor(). This worked fine as long
as there was a 1:1 relationship between elements and bindings,
which was the case until 19a91841c3 "st/mesa: Use Array._DrawVAO in
st_atom_array.c.".

So let's detect instance divisors, and restore a 1:1 relationship in
that case. This will make old versions of virglrenderer behave
correctly. For newer versions, we can consider making a better
interface, where the instance divisor isn't specified per element,
but rather per binding. But let's save that for another day.
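
The workaround can be pictured with a small Python model. The data
layout below (elements as `(binding_index, instance_divisor)` pairs and a
plain list of buffers) and the function name are hypothetical, chosen
only to illustrate the idea of restoring a 1:1 element-to-binding
mapping when any instance divisor is present.

```python
def restore_one_to_one(elements, buffers):
    """Give each vertex element its own binding if a divisor is in use.

    elements: list of (binding_index, instance_divisor) pairs.
    buffers:  list of vertex buffers, indexed by binding.
    """
    if not any(divisor for _, divisor in elements):
        # No instancing: shared bindings are harmless, keep them.
        return elements, buffers
    # Element i now uses binding i, so element-index == binding-index.
    new_elements = [(i, divisor) for i, (_, divisor) in enumerate(elements)]
    new_buffers = [buffers[binding] for binding, _ in elements]
    return new_elements, new_buffers
```

After the remap, a receiver that confuses the element index with the
binding index still applies each divisor to the intended buffer.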

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 19a91841c3 "st/mesa: Use Array._DrawVAO in st_atom_array.c."
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Erik Faye-Lund
8447b64238 virgl: wrap vertex element state in a struct
This just has one member for now; the handle. But this is about to
change.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Erik Faye-Lund
b702ff5378 virgl: simplify virgl_hw_set_index_buffer
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Erik Faye-Lund
00143a6241 virgl: simplify virgl_hw_set_vertex_buffers
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Juan A. Suarez Romero
0991085f66 docs: update calendar, add news item and link release notes for 18.2.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-12-13 15:45:20 +01:00
Juan A. Suarez Romero
e0b0995dcf docs: add sha256 checksums for 18.2.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit e90429cc6d)
2018-12-13 15:42:49 +01:00
Juan A. Suarez Romero
c8a17b45ea docs: add release notes for 18.2.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 419ee20097)
2018-12-13 15:42:46 +01:00
Samuel Pitoiset
5088ba2aeb radv: don't check if format is depth in radv_image_can_enable_htile()
This is always TRUE if htile_size is not 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:21 +01:00
Samuel Pitoiset
eb0034fe28 radv: check if addrlib enabled HTILE in radv_image_can_enable_htile()
When htile_size is 0, we can't enable HTILE. This doesn't change
anything, except not calling radv_image_alloc_htile().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:19 +01:00
Samuel Pitoiset
d8325f1f07 radv: switch on EOP when primitive restart is enabled with triangle strips
Otherwise, Yakuza hangs the GPU with DXVK. We don't know if
linestrip and pointlist are affected, so for now this is done
only for triangle strips.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:16 +01:00
Samuel Pitoiset
74cf3b627c radv: allow to skip DCC decompressions with the new predicate
Feral games aren't affected because they don't decompress DCC.
F1 2018 has one DCC decompression per frame, but I don't see
any performance improvements. This new predicate will be
probably more useful for DCC/MSAA.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:14 +01:00
Samuel Pitoiset
3a5adc2879 radv: add a predicate for reflecting DCC decompression state
It's somehow similar to the FCE predicate.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:10 +01:00
Jordan Justen
c506eae53d i965/compute: Emit GPGPU_WALKER in genX_state_upload
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 22:28:06 -08:00
Jordan Justen
1b85c605a6 i965/genX_state: Add register access functions
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 22:28:02 -08:00
Eric Anholt
06fbcd2cd5 intel: Simplify the half-float packing in image load/store lowering.
This was noted by Jason in review when I tried to make a helper for the
old path.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 16:09:48 -08:00
Eric Anholt
d3e046e76c nir: Pull some of intel's image load/store format conversion to nir_format.h
I needed the same functions for v3d.  Note that the color value in the
Intel lowering has already been cut down to image.chans num_components.

v2: Drop the half float one, since it was a 1-liner after cleanup.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 16:09:43 -08:00
Eric Anholt
19c7cba2ab nir: Add some more consts to the nir_format_convert.h helpers.
Most of the bits were constant, but a few were missed.  Avoids warnings
from v3d's upcoming static const bits declarations.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 16:09:37 -08:00
Timothy Arceri
9e6b39e1d5 nir: detect more induction variables
This allows loop analysis to detect induction variables that
are incremented in both branches of an if rather than in a main
loop block. For example:

   loop {
      block block_1:
      /* preds: block_0 block_7 */
      vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20
      vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4
      vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4
      vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21
      vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22
      vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9
      vec1 32 ssa_14 = ige ssa_8, ssa_5
      /* succs: block_2 block_3 */
      if ssa_14 {
         block block_2:
         /* preds: block_1 */
         break
         /* succs: block_8 */
      } else {
         block block_3:
         /* preds: block_1 */
         /* succs: block_4 */
      }
      block block_4:
      /* preds: block_3 */
      vec1 32 ssa_15 = ilt ssa_6, ssa_8
      /* succs: block_5 block_6 */
      if ssa_15 {
         block block_5:
         /* preds: block_4 */
         vec1 32 ssa_16 = iadd ssa_8, ssa_7
         vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000*/)
         /* succs: block_7 */
      } else {
         block block_6:
         /* preds: block_4 */
         vec1 32 ssa_18 = iadd ssa_8, ssa_7
         vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000*/)
         /* succs: block_7 */
      }
      block block_7:
      /* preds: block_5 block_6 */
      vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18
      vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4
      vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19
      /* succs: block_1 */
   }

GCM could move the addition out of the if for us (making this patch
unnecessary), but unfortunately we still cannot enable the GCM pass
without regressions.
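
The core idea can be sketched in a few lines of Python. This is a toy
model with hypothetical names, not the NIR loop analysis: a phi at the
merge block counts as an induction step if every source is an `iadd` of
the same base value and the same increment.

```python
def common_increment(phi_sources, defs):
    """Return (base, step) if all phi sources are the same increment.

    phi_sources: SSA names feeding the phi after the if.
    defs:        name -> ("iadd", base, step) for known definitions.
    """
    steps = set()
    for name in phi_sources:
        op = defs.get(name)
        if op is None or op[0] != "iadd":
            return None
        steps.add((op[1], op[2]))
    # Exactly one distinct (base, step) pair means both branches agree.
    if len(steps) != 1:
        return None
    return steps.pop()
```

Applied to the dump above, ssa_16 and ssa_18 are both `iadd ssa_8, ssa_7`,
so the phi ssa_20 would be recognised as ssa_8 advanced by ssa_7.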

This unrolls a loop in Rise of The Tomb Raider.

vkpipeline-db results (VEGA):

Totals from affected shaders:
SGPRS: 88 -> 96 (9.09 %)
VGPRS: 56 -> 52 (-7.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2168 -> 4560 (110.33 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 4 -> 4 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
2018-12-13 10:58:35 +11:00
Timothy Arceri
c03d6e61cc nir: reword code comment
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-12-13 10:58:35 +11:00
Timothy Arceri
48b40380e3 nir: in loop analysis track actual control flow type
This will allow us to improve analysis to find more induction
variables.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-12-13 10:58:35 +11:00
Danylo Piliaiev
5921a19d4b nir: add if opt opt_if_loop_last_continue()
Removing the last continue can allow more loops to unroll. Also
inserting code into the if branch can allow the various if opts
to progress further.

The insertion of some loops into the if branch also reduces VGPR
use in some shaders.

vkpipeline-db results (VEGA):

Totals from affected shaders:
SGPRS: 6552 -> 6576 (0.37 %)
VGPRS: 6544 -> 6532 (-0.18 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 481952 -> 478032 (-0.81 %) bytes
LDS: 13 -> 13 (0.00 %) blocks
Max Waves: 241 -> 242 (0.41 %)
Wait states: 0 -> 0 (0.00 %)

Shader-db results radeonsi (VEGA):

Totals from affected shaders:
SGPRS: 168 -> 168 (0.00 %)
VGPRS: 144 -> 140 (-2.78 %)
Spilled SGPRs: 157 -> 157 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 8524 -> 8488 (-0.42 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 7 -> 7 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

v2: (Timothy Arceri):
- allow for continues in either branch
- move any trailing loops inside the if as well as blocks.
- leave nir_opt_trivial_continues() to actually remove the
  continue.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
2018-12-13 10:58:35 +11:00
Timothy Arceri
721566bddb nir: rework force_unroll_array_access()
Here we rework force_unroll_array_access() so that we can reuse
the induction variable detection in a following patch.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-12-13 10:39:51 +11:00
Timothy Arceri
48135f175c nir: factor out some of the complex loop unroll code to a helper
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-12-13 10:34:48 +11:00
Jordan Justen
7fe4e0ad5d docs: Document GitLab merge request process (email alternative)
This documents a process for using GitLab Merge Requests as a second
way to submit code changes for Mesa. Only one of the two methods is
allowed for each patch series.

We will *not* require all patches to be emailed. Some code changes may
be reviewed and merged without any discussion on the mesa-dev email
list.

v2:
 * No longer require email. Allow submitter to choose email or a
   GitLab merge request.
 * Various feedback from Brian, Daniel, Dylan, Eric, Erik, Jason,
   Matt, Michel and Rob.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Rob Clark <robdclark@gmail.com>
2018-12-12 10:05:29 -08:00
Rhys Kidd
ff6f1dd0d3 meson: libfreedreno depends upon libdrm (for fence support)
Error message building freedreno Gallium driver with meson:

  ../src/gallium/drivers/freedreno/freedreno_fence.c:27:21: fatal error: libsync.h: No such file or directory
   #include <libsync.h>

Fixes: 4aa69cc425 ("meson: build freedreno")
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 09:01:06 -08:00
Jason Ekstrand
ca98902d09 nir: Document the function inlining process
This has thrown a few people off recently and it's good to have the
process and all the rationale for it documented somewhere.  A comment at
the top of nir_inline_functions seems as good a place as any.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-12-12 08:32:32 -06:00
Jason Ekstrand
5749c0ebc4 intel/blorp: Assert that we don't re-layout a compressed surface
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-12 08:32:32 -06:00
Jason Ekstrand
e4fdc650f1 anv/pipeline: Set the correct binding count for compute shaders
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-12 08:32:25 -06:00
Samuel Pitoiset
2ac6d55f38 radv: bump reported version to 1.1.90
After going through the spec changelog, it looks like RADV
is up to date. Note that ANV also reports 1.1.90.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-12 13:51:16 +01:00
Erik Faye-Lund
f856f50194 virgl: force linear texturing support
When I made sure that half-float texture-filtering was required for ES3,
I didn't realize that virgl doesn't report support for this correctly.
This regressed the GLES version available on top of several drivers,
including i965, from 3.2 to 2.0.

This is going to need protocol changes to fix properly, so let's just
restore the previous behavior by enabling floating-point filtering
unconditionally for now.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: fcf9fcee3c "mesa/main: do not require float-texture filtering for es3"
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-12-12 11:44:47 +01:00
Iago Toral Quiroga
3918943211 intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments
The implementation of these opcodes in the generator assumes that their
arguments are packed, and it generates register regions based on that
assumption.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 08:09:45 +01:00
Jason Ekstrand
a10a450db2 anv: Advertise support for MinLod on Skylake+
These are usually used for dealing with sparse resources but there's no
reason why we can't hook them up before we have sparse.  We have the
hardware; let's light it up.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
cb98e0755f intel/fs: Support min_lod parameters on texture instructions
We have to lower some shadow instructions because they don't exist in
hardware and we have to lower txb+offset+clamp because the message gets
too big and we run into the sampler message length limit of 11 regs.

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
4ef8f46fd1 nir/lower_tex: Add lowering for some min_lod cases
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
4a691cfa7e nir/lower_tex: Modify txd instructions instead of replacing them
I don't know if one is better than the other or not but this approach
has the advantage that we never forget to copy information over and
we're not hard-coding quite as many assumptions.  It's also a lot
simpler and much less code.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
5a968ae473 nir/lower_tex: Simplify lower_gradient logic
Instead of having to call two different lower_gradient functions based
on whether or not it's a cube, just make lower_gradient handle cubes.
This significantly simplifies some of the logic.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
caeffe7549 spirv: Add support for MinLod
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
e1ef6c3c29 intel/ir: Don't allow allocating zero registers
This simple check helps catch bugs early that can end up propagating
into later stages of the compile and triggering strange asserts.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Roland Scheidegger
86c45fe960 gallivm: remove unused float coord wrapping for aos sampling
AoS sampling tries to use integers for coord wrapping when possible,
as it should be faster. However, for AVX, this was suboptimal, because
only floats can use 8x32bit vectors, whereas integers have to be split
into 4x32bit vectors. (I believe part of why it was slower was also
that at least earlier llvm versions had trouble optimizing it properly,
since you can still do simple bit ops with 8x32bit vectors, so a
sequence of int add / and / int add / and with such vectors would
actually end up doing 128bit inserts/extracts between the operations
instead of just doing the cheap 128bit ands.)
Hence, a special float coord wrapping path was added to AoS sampling.
But this path was actually disabled for a long time already, since we
found that just splitting everything before entering the AoS path was
still slightly faster usually, so none of this float coord wrapping
code was used anymore (AoS sampling code, when avx2 isn't supported,
never sees vectors with length > 4). I thought it might be useful some
day again, but I'm not interested anymore in optimizing for very weird
instruction sets which have support for 256bit vectors for floats but
not for ints, so just drop it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-12-12 03:50:03 +01:00
Emil Velikov
721c296bdc docs: update calendar, add news item and link release notes for 18.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 21:25:18 +00:00
Emil Velikov
5391b65ed1 docs: add sha256 checksums for 18.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 21:21:42 +00:00
Emil Velikov
512bd8d3dd docs: add release notes for 18.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 21:21:41 +00:00
Neil Roberts
8600aa35bd freedreno: Add .dir-locals to the common directory
The commit aa0fed10d3 moved a bunch of Freedreno code to a common
directory. The previous directory had a .dir-locals file for Emacs.
This patch copies it to the new directory as well.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-11 13:14:08 -08:00
Rob Clark
cfe8220904 mesa/st/nir: fix missing nir_compact_varyings
LinkedTransformFeedback is normally populated, which had nerf'd varying
packing since the check was introduced.

Fixes: dbd52585fa st/nir: Disable varying packing when doing transform feedback.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-11 15:51:34 -05:00
Rob Clark
9e3fc0c1e0 nir: fix spelling typo
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-11 15:51:34 -05:00
Jason Ekstrand
8f401b0ce6 anv,radv: Disable VK_EXT_pci_bus_info
The Vulkan working group recently discovered that we made a mistake in
assuming that PCI domains are 16-bit even though they can potentially be
32-bit values.  To fix this, the next spec update will change the types
in the VK_EXT_pci_bus_info struct to be 32 bits which will be a
backwards-incompatible change.  Normally, Khronos tries very hard to
never make backwards incompatible changes to specs.  Hopefully, the
extension is new enough (2 months) that there are no shipping apps which
use the extension so this should be safe.

This commit disables the extension for both anv and radv in mesa and
should be back-ported to 18.3 ASAP so we avoid any potential issues with
new apps running on old drivers.  I'll send out a commit (which we can
also back-port to 18.3 if we really care) to re-enable the extension in
both drivers once this week's spec update ships.  The one known use of
this extension is internal to mesa and will continue working with the
extension disabled and will naturally update when we get a new header.

Cc: "18.3" <mesa-stable@lists.freedesktop.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-11 11:30:05 -06:00
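Illustration (not part of the commit above): a minimal sketch of why widening the PCI domain from 16 to 32 bits is a backwards-incompatible change. The `DOMAIN_16`/`DOMAIN_32` names and the `fits` helper are hypothetical and only model a single field; the real struct layout lives in the Vulkan headers and is not reproduced here.

```python
import struct

# Hypothetical single-field models of the domain value (not the real struct).
DOMAIN_16 = struct.Struct("<H")  # 16-bit unsigned, the original assumption
DOMAIN_32 = struct.Struct("<I")  # 32-bit unsigned, the corrected width


def fits(fmt, value):
    """Return True if `value` can be stored in a field described by `fmt`."""
    try:
        fmt.pack(value)
        return True
    except struct.error:
        return False
```

A domain such as 0x10000 cannot be stored in the 16-bit field at all, and widening the field changes its size, so every later field in a struct moves. An application built against the old header would therefore read the wrong bytes from a driver built against the new one.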
Juan A. Suarez Romero
fb88dcf5ca docs: extend 18.2 lifecycle
As 18.3 was published with some delay, let's extend 18.2 life for
another extra release.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-11 15:20:10 +01:00
Kristian H. Kristensen
c0de7c21a3 glapi: fixup EXT_multisampled_render_to_texture dispatch
There are a few missing and convoluted bits:

 - FramebufferTexture2DMultisampleEXT
Missing sanity check, should be desktop="false"

 - RenderbufferStorageMultisampleEXT
Missing sanity check, is aliased to RenderbufferStorageMultisample.
Thus it's set only when desktop GL or GLES2 v3.0+, while the extension
is GLES2 2.0+.

If we flip the aliasing we'll break indirect GLX, so loosen the version
to 2.0. Not perfect, yet this is the most sane thing I could think of.

v2: [Emil] Fixup RenderbufferStorageMultisampleEXT, commit message

Cc: Kristian H. Kristensen <hoegsberg@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108974
Fixes: 1b331ae505 ("mesa: Add core support for EXT_multisampled_render_to_texture{,2}")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 15:09:07 -08:00
Kristian H. Kristensen
9578dde1c8 freedreno: Fix the Makefile.am fix
Commit b028ce29f0 fixed a typo in
src/freedreno/Makefile.am, but ended up breaking the build for
freedreno.  The typo inadvertently made things work, as we were not
supposed to link with libnir or libmesautil to begin with.  Those come
in through libmesagallium and the typo prevented the duplicated
linkage.

Fixes: b028ce29f ("freedreno: add the missing _la in libfreedreno_ir3_la")
Cc: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 14:28:09 -08:00
Matt Turner
f447a13032 i965/fs: Handle V/UV immediates in dump_instructions()
2018-12-10 10:46:56 -08:00
Sagar Ghuge
694eb342a2 intel/compiler: Always print flag subregister number
While disassembling the predicate, always print the flag subregister number
to keep the grammar the same across generations for the assembler tool.

v2: Combine consecutive format calls (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-12-10 10:07:11 -08:00
Sagar Ghuge
e7598c5a62 intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region
When RepCtrl is set, the swizzle field is ignored by the hardware. In
order to ensure a 1-to-1 correspondence between the human-readable
disassembly and the binary instruction encoding, always set the swizzle
to XXXX (all zeros) when it is unused due to RepCtrl.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-12-10 10:06:55 -08:00
Dylan Baker
6d3cbbbe15 meson: Add nir_algebraic_parser_test to suites
Just to make it easier to run the nir tests together.

Fixes: a0ae12ca91
       ("nir/algebraic: Add unit tests for bitsize validation")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-10 09:14:44 -08:00
Emil Velikov
27c4fdfdf8 amd/addrlib: drop si_ci_vi_merged_enum.h from the list
Fixes: 776b911365 ("amd/addrlib: update Mesa's copy of addrlib")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:35:01 +00:00
Emil Velikov
b028ce29f0 freedreno: add the missing _la in libfreedreno_ir3_la
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:35:01 +00:00
Emil Velikov
b30e37ec64 freedreno: drop duplicate MKDIR_GEN declaration
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:35:01 +00:00
Rhys Kidd
05c7e726f7 travis: radeonsi and radv require LLVM 7.0
Fixes: 3fbdcd942f ("amd: remove support for LLVM 6.0")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Cc: Andres Gomez <agomez@igalia.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:20:12 +00:00
Kirill Burtsev
a539316485 loader: free error state, when checking the drawable type
Currently we distinguish whether the drawable is a window or a pixmap by
checking whether xcb_present_select_input throws an error.

Yet, we don't always free the error state returned by xcb.

Cc: Kirill Burtsev <kirill.burtsev@qt.io>
Cc: Boyan Ding <boyan.j.ding@gmail.com>
Fixes: 6bd9ba7d07 ("loader: Add dri3 helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil: add commit message, fixes tag]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:19:55 +00:00
Timothy Arceri
032f247921 nir: make use of new nir_cf_list_clone_and_reinsert() helper
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-10 13:59:50 +11:00
Timothy Arceri
6b961eb534 nir: add a new nir_cf_list_clone_and_reinsert() helper
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-10 13:59:50 +11:00
Timothy Arceri
03d7c65ad8 nir: clarify some nir_loop_info member names
Following commits will introduce additional fields such as
guessed_trip_count. Renaming these will help avoid confusion
as our unrolling feature set grows.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-10 13:59:50 +11:00
Timothy Arceri
de0aee7638 nir: small tidy ups for nir_loop_analyze()
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-10 13:59:50 +11:00
Kenneth Graunke
41a4a6ba6f i965: Flip arguments to load_register_reg helpers.
load_register_imm and load_register_mem take the destination as the
first argument, so I'd like load_register_reg to do the same for the sake
of consistency.  Otherwise, reading sequences of mixed LRI/LRM/LRR is
needlessly confusing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-09 18:39:16 -08:00
Kenneth Graunke
34c9dc2537 i965: Delete dead brw_meta_resolve_color prototype.
Dead since commit 09e041d61d (May 2016).
2018-12-09 18:39:16 -08:00
Karol Herbst
77944fb2b7 nv50/ir: fix use-after-free in ConstantFolding::visit
opnd() might delete the passed in instruction, but it's used through
i->srcExists() later in visit

v2: use continue instead return
v3: use brackets for the outer if/else chain

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-09 18:19:59 +01:00
Karol Herbst
d63a133082 nouveau: use atomic operations for driver statistics
multiple threads can write to those at the same time

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-09 04:43:20 +01:00
Karol Herbst
a28ff22295 nv50/ir: initialize relDegree statically
this race condition is pretty harmless, but also pretty trivial to fix

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-09 04:43:17 +01:00
Eric Anholt
cc6a5e937b shader-packing
2018-12-07 16:51:12 -08:00
Eric Anholt
09ad0d870c tfu
2018-12-07 16:49:41 -08:00
Eric Anholt
f1d98204c3 v3d: Fix a leak of the disassembled instruction string during debug dumps.
Fixes: ade416d023 ("broadcom: Add VC5 NIR compiler.")
2018-12-07 16:48:23 -08:00
Eric Anholt
7f8d8b7d27 vc4: Fix a leak of the transfer helper on screen destroy.
Fixes: d009463a65 ("vc4: Switch to using u_transfer_helper for MSAA maps.")
2018-12-07 16:48:23 -08:00
Eric Anholt
3bd73d31a8 v3d: Fix a leak of the transfer helper on screen destroy.
Fixes: 7a30517cce ("broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT.")
2018-12-07 16:48:23 -08:00
Eric Anholt
bad95bb13c v3d: Add VIR dumping of TMU config p0/p1.
I had a bit of it for V3D 3.x, but didn't update it for 4.x.
2018-12-07 16:48:23 -08:00
Eric Anholt
1fc78ff3f1 v3d: Simplify VIR uniform dumping using a temporary.
2018-12-07 16:48:23 -08:00
Eric Anholt
5932575299 v3d: Garbage collect unused uniforms code.
2018-12-07 16:48:23 -08:00
Eric Anholt
62a3192112 v3d: Split most of TEXTURE_SHADER_STATE setup out of sampler views.
For shader image load/store, we want most of this logic to be shared.
2018-12-07 16:48:23 -08:00
Eric Anholt
8cb1f3bab7 v3d: Avoid confusing auto-indenting in TEXTURE_SHADER_STATE packing
Having "v3dx_pack() {" under each #if branch would confuse emacs's
indenter.
2018-12-07 16:48:23 -08:00
Eric Anholt
ee9b758053 v3d: Fix handling of texture first_layer offsets for 3D textures.
I think this bug predated adding v3d_layer_offset().  Noticed during an
unrelated refactor.
2018-12-07 16:48:23 -08:00
Eric Anholt
acecee4c2d v3d: Return the right gl_SampleMaskIn[] value.
It's supposed to be the dispatched sample mask for this pixel, not the GL
state's sample mask.
2018-12-07 16:48:23 -08:00
Eric Anholt
6870111051 v3d: Fix a comment typo
2018-12-07 16:48:23 -08:00
Eric Anholt
ca0e4ae4bc v3d: Convert to using nir_src_as_uint() from const_value derefs.
Follows 16870de8a0 ("nir: Use nir_src_is_const and nir_src_as_* in core
code") to clean up v3d.
2018-12-07 16:48:23 -08:00
Eric Anholt
503b55c622 v3d: Don't forget to flush writes to UBOs.
If someone did TF into a UBO, we might have left the TF job un-flushed at
the point of reading.
2018-12-07 16:48:23 -08:00
Eric Anholt
504d06e4c1 v3d: Make an array for frag/vert texture state in the context.
This simplifies a bunch of our texture handling, while introducing the
slots necessary for adding new shader stages.
2018-12-07 16:48:23 -08:00
Eric Anholt
d1965344ac v3d: Re-use the wrap mode uniform on V3D 3.3.
2018-12-07 16:48:23 -08:00
Eric Anholt
e94d034a38 v3d: Put default vertex attribute values into the state uploader as well.
The default attributes are long-lived (the state struct is cached), and
only 256 bytes each.
2018-12-07 16:48:23 -08:00
Eric Anholt
b38e4d313f v3d: Create a state uploader for packing our shaders together.
Shaders are usually quite short, and are private to the context.  We can
save memory and reduce the work the kernel needs to do at exec time by
packing them together in a stream uploader for long-lived state.
2018-12-07 16:48:23 -08:00
Eric Anholt
1911888760 v3d: Update simulator cache flushing code to match the kernel better.
We were missing the invalidate between bin and render (possibly relevant
for SSBOs), and still trying to flush the nonexistent L2C on 3.3+.
2018-12-07 16:48:23 -08:00
Eric Anholt
2ebca177dc v3d: Use the TFU to do generatemipmap.
This is a separate, dedicated hardware unit for texture layout conversions
and mipmap generation.
2018-12-07 16:48:23 -08:00
Eric Anholt
ee0549ff9a v3d: Add the V3D TFU submit interface to the simulator.
The TFU lets us format raster and SAND images into formats that can be
read by the texture engine, and do mipmap generation.

The UAPI comes from drm-next e69aa5f9b97f ("Merge tag
'drm-misc-next-2018-12-06' of git://anongit.freedesktop.org/drm/drm-misc
into drm-next")
2018-12-07 16:48:23 -08:00
Eric Anholt
42652ea51e v3d: Use combined input/output segments.
The HW apparently has some issues (or at least a much more complicated VCM
calculation) with non-combined segments, and the closed source driver also
uses combined I/O.  Until I get the last CTS failure resolved (which does
look plausibly like some VPM stomping), let's use combined I/O too.
2018-12-07 16:48:23 -08:00
Eric Anholt
fb9bcf5602 v3d: Add missing OES_half_float_linear support.
We were exposing ARB_texture_float, but apparently not the OES subset
flag.  Fixes regression from GLES3 support to GLES2.

Fixes: fcf9fcee3c ("mesa/main: do not require float-texture filtering
for es3")
2018-12-07 16:48:23 -08:00
Eric Anholt
90e98295a4 v3d: Add support for RGBA_SRGB along with BGRA_SRGB.
This is the actual native format for the hardware, without swizzling.
Noticed while debugging why GLES3 disappeared.
2018-12-07 16:48:23 -08:00
Kenneth Graunke
f0d51e81c9 intel/blorp: Expand blorp_address::offset to be 64 bits.
In the softpin world, surface state base address may be a fixed 64-bit
address (with no associated BO).  It makes sense to store this in the
offset field.  But it needs to be the full size.

We also update the clear color address to be consistently uint64_t
everywhere so we can continue passing intel_miptree_get_clear_color
a pointer to the blorp_address's offset field without type mismatches.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-12-07 16:35:51 -08:00
Rob Clark
d014af98b7 freedreno/drm: fix memory leak
Fix an embarrassing memory leak with the non-softpin submit/rb
implementation.

Fixes: f3cc0d2747 freedreno: import libdrm_freedreno + redesign submit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 14:12:12 -05:00
Rob Clark
5c2c1f0a2d freedreno/ir3: track max flow control depth for a5xx/a6xx
Rather than just hard-coding BRANCHSTACK size.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
9517037bdc freedreno/ir3: code-motion
Split up ir3_compiler_nir.c a bit before starting to add new stuff for
a6xx SSBO/image instructions.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
e37351fa57 freedreno/ir3: sync instr/disasm
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
0d240c2214 freedreno/ir3: don't fetch unused tex components
Detect when a component of, for example, a texture fetch is unused and
propagate the updated wrmask back to the parent instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
b971afd19e freedreno/a6xx: blitter fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
237ae7daf2 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
e779725f0b freedreno/drm: fix relocs in nested stateobjs
If we have a reloc from stateobjA to stateobjB, we would previously
leave stateobjB's bos out of the submit's bos table.  Handle this case
by copying into stateobjA's reloc_bos table.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
9f7c6c78bc freedreno/a5xx+a6xx: remove unused fs/vs pvt mem
copy/pasta from older gens

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
c500e7b747 gallium: fix typo
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
f6ad286c80 freedreno: remove unused fd_surface fields
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Nicolai Hähnle
4275cae95c meson: link LLVM 'native' component when LLVM is available
Linking against LLVM built with BUILD_SHARED_LIBS fails otherwise,
as the component is required for the draw module.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-07 16:26:14 +01:00
Connor Abbott
2845c49218 nir: Fixup algebraic test for variable-sized conversions
b2i can now take any size boolean in preparation for 1-bit booleans, so
the error message printed is slightly different.

Fixes: dca6cd9ce6 ("nir: Make boolean conversions sized just like the others")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108961
Cc: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-07 16:07:51 +01:00
Samuel Pitoiset
e8a383ce67 gallium: add missing PIPE_CAP_SURFACE_SAMPLE_COUNT default value
Fixes: 2710c40e3c ("gallium: Add new PIPE_CAP_SURFACE_SAMPLE_COUNT")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-12-07 15:06:29 +01:00
Emil Velikov
96d4ecbb11 docs: update calendar, add news item and link release notes for 18.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-07 11:50:12 +00:00
Emil Velikov
0144bbdb98 docs: add sha256 checksums for 18.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d81beab96a)
2018-12-07 11:44:33 +00:00
Emil Velikov
b1e0336497 docs: update 18.3.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d603cd9d84)
2018-12-07 11:44:31 +00:00
Kristian H. Kristensen
3e55df4f83 freedreno: Add support for EXT_multisampled_render_to_texture
There is not much to do in freedreno - tile layout and multisample
state for gmem renderings is programmed based on the pfb sample count,
while resolve blits take the destination sample count from the resource.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-06 16:56:37 -08:00
Rob Clark
913eb7fa58 freedreno/a6xx: MSAA
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-06 16:55:59 -08:00
Kristian H. Kristensen
14ea811c67 st/mesa: Add support for EXT_multisampled_render_to_texture
In gallium, we model the attachment sample count as a new nr_samples
field in pipe_surface. A driver can indicate support for the extension
using the new pipe cap, PIPE_CAP_MULTISAMPLED_RENDER_TO_TEXTURE.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-06 16:55:46 -08:00
Kristian H. Kristensen
2710c40e3c gallium: Add new PIPE_CAP_SURFACE_SAMPLE_COUNT
This new pipe cap and the new nr_samples field in pipe_surface lets a
state tracker bind a render target with a different sample count than
the resource. This allows for implementing
EXT_multisampled_render_to_texture and
EXT_multisampled_render_to_texture2.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-06 16:55:43 -08:00
Kristian H. Kristensen
1b331ae505 mesa: Add core support for EXT_multisampled_render_to_texture{,2}
This also turns on EXT_multisampled_render_to_texture which is a
subset of EXT_multisampled_render_to_texture2, allowing only
COLOR_ATTACHMENT0.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-06 16:55:30 -08:00
Vinson Lee
b4fd59075b nir/algebraic: Make algebraic_parser_test.sh executable.
Fixes make check permission error.

../../bin/test-driver: line 107: ./nir/tests/algebraic_parser_test.sh: Permission denied
FAIL nir/tests/algebraic_parser_test.sh (exit status: 126)

Fixes: a0ae12ca91 ("nir/algebraic: Add unit tests for bitsize validation")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2018-12-06 11:48:20 -08:00
Samuel Pitoiset
3fbdcd942f amd: remove support for LLVM 6.0
Users are encouraged to switch to LLVM 7.0, released in September 2018.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-06 14:02:56 +01:00
Kristian H. Kristensen
3b2ad8b290 gallium: Android build fixes
A couple of simple fixes for building on Android with autotools.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-05 13:56:07 -08:00
Jason Ekstrand
dca6cd9ce6 nir: Make boolean conversions sized just like the others
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one of 8, 16, 32, or 64.  This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:07 -06:00
Jason Ekstrand
be98b1db38 nir/opt_algebraic: Add 32-bit specifiers to a bunch of booleans
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:03 -06:00
Jason Ekstrand
2715080d65 nir/opt_algebraic: Drop bit-size suffixes from conversions
Suffixes are dropped from a bunch of conversion opcodes when it makes
sense to do so.  Others are kept if we really do want the bit-size
restriction.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:01 -06:00
Jason Ekstrand
ff8e3d3b7b nir/opt_algebraic: Simplify an optimization using the new search ops
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:58 -06:00
Jason Ekstrand
05af952a11 nir/algebraic: Add support for unsized conversion opcodes
All conversion opcodes require a destination size but this makes
constructing certain algebraic expressions rather cumbersome.  This
commit adds support to nir_search and nir_algebraic for writing
conversion opcodes without a size.  These meta-opcodes match any
conversion of that type regardless of destination size and the size gets
inferred from the sizes of the things being matched or from other
opcodes in the expression.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:56 -06:00
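Illustration (not part of the commit above): a minimal sketch of how an unsized conversion "meta-opcode" could match any sized variant. The helper names `split_conversion` and `pattern_matches` are hypothetical, and the real matching is done by nir_search in C; this only models the naming rule described in the message.

```python
import re

# Conversion opcodes look like "<type>2<type><bits>", e.g. "b2i32" or "f2f16".
# A pattern without the numeric suffix is treated as "any destination size".
_CONV_RE = re.compile(r"^([a-z]2[a-z])(\d+)?$")


def split_conversion(opcode):
    """Split a conversion opcode into (base, bits); bits is None if unsized."""
    m = _CONV_RE.match(opcode)
    if m is None:
        return None
    base, bits = m.group(1), m.group(2)
    return base, (int(bits) if bits else None)


def pattern_matches(pattern, opcode):
    """Return True if the search pattern opcode matches a concrete opcode."""
    p = split_conversion(pattern)
    o = split_conversion(opcode)
    if p is None or o is None:
        # Not a conversion: fall back to exact name comparison.
        return pattern == opcode
    p_base, p_bits = p
    o_base, o_bits = o
    if p_base != o_base:
        return False
    # An unsized pattern matches every size; a sized one must match exactly.
    return p_bits is None or p_bits == o_bits
```

With this rule, a pattern written as `b2i` covers `b2i8` through `b2i64`, while `b2i32` stays restricted to the 32-bit form. The destination size of the replacement would then have to be inferred from the rest of the expression, which this sketch does not attempt.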
Jason Ekstrand
4925290ab1 nir/algebraic: Refactor codegen a bit
Instead of using an OrderedDict, just have a (necessarily sorted) array
of transforms and a set of opcodes.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:54 -06:00
Jason Ekstrand
d6aac618fb nir/algebraic: Clean up some __str__ cruft
Both of these things are already handled in the Value base class so we
don't need to handle them explicitly in Constant.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:52 -06:00
Jason Ekstrand
85f0ea9d8f nir/opcodes: Rename tbool to tbool32
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:49 -06:00
Jason Ekstrand
03571a7a6c nir/opcodes: Pull in the type helpers from constant_expressions
While we're at it, we rework them a bit to all use regular expressions
and assert more.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:06 -06:00
Connor Abbott
a0ae12ca91 nir/algebraic: Add unit tests for bitsize validation
The non-failure path can be tested by just compiling mesa and then
testing it, but the failure paths won't be hit unless you make a mistake,
so it's best to test them with some unit tests.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-05 17:57:40 +01:00
Connor Abbott
29a1450e28 nir/algebraic: Rewrite bit-size inference
Before this commit, there were two copies of the algorithm: one in C,
that we would use to figure out what bit-size to give the replacement
expression, and one in Python, that emulated the C one and tried to
prove that the C algorithm would never fail to correctly assign
bit-sizes. That seemed pretty fragile, and likely to fall over if we
make any changes. Furthermore, the C code was really just recomputing
more-or-less the same thing as the Python code every time. Instead, we
can just store the results of the Python algorithm in the C
datastructure, and consult it to compute the bitsize of each value,
moving the "brains" entirely into Python. Since the Python algorithm no
longer has to match C, it's also a lot easier to change it to something
more closely approximating an actual type-inference algorithm. The
algorithm used is based on Hindley-Milner, although deliberately
weakened a little. It's a few more lines than the old one, judging by
the diffstat, but I think it's easier to verify that it's correct while
being as general as possible.

We could split this up into two changes, first making the C code use the
results of the Python code and then rewriting the Python algorithm, but
since the old algorithm never tracked which variable each equivalence
class corresponds to, it would mean we'd have to add some non-trivial code which would
then get thrown away. I think it's better to see the final state all at
once, although I could also try splitting it up.

v2:
- Replace instances of "== None" and "!= None" with "is None" and
"is not None".
- Rename first_src to first_unsized_src
- Only merge the destination with the first unsized source, since the
sources have already been merged.
- Add a comment explaining what nir_search_value::bit_size now means.
v3:
- Fix one last instance to use "is not" instead of !=
- Don't try to be so clever when choosing which error message to print
based on whether we're in the search or replace expression.
- Fix trailing whitespace.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-05 17:57:40 +01:00
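Illustration (not part of the commit above): a minimal sketch of bit-size inference by unification, in the spirit of the weakened Hindley-Milner approach the message describes. The `BitSizeClasses` class and its method names are hypothetical and do not mirror the actual nir_algebraic.py code; the sketch only shows values being merged into equivalence classes that share one bit size.

```python
class BitSizeError(Exception):
    """Raised when two values with different known bit sizes are unified."""


class BitSizeClasses:
    """Union-find over value names; each class has at most one bit size."""

    def __init__(self):
        self.parent = {}
        self.size = {}  # root name -> known bit size, or None if unknown

    def add(self, name, bit_size=None):
        """Register a value, optionally with an explicit bit size."""
        if name not in self.parent:
            self.parent[name] = name
            self.size[name] = None
        if bit_size is not None:
            root = self.find(name)
            known = self.size[root]
            if known is not None and known != bit_size:
                raise BitSizeError(f"{name}: {known} vs {bit_size}")
            self.size[root] = bit_size

    def find(self, name):
        """Return the representative of the class, with path halving."""
        while self.parent[name] != name:
            self.parent[name] = self.parent[self.parent[name]]
            name = self.parent[name]
        return name

    def unify(self, a, b):
        """Require that a and b have the same bit size."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        sa, sb = self.size[ra], self.size[rb]
        if sa is not None and sb is not None and sa != sb:
            raise BitSizeError(f"{a} is {sa}-bit but {b} is {sb}-bit")
        self.parent[rb] = ra
        self.size[ra] = sa if sa is not None else sb

    def bit_size(self, name):
        """Return the inferred bit size, or None if still unconstrained."""
        return self.size[self.find(name)]
```

For an expression like an add whose two sources and destination must agree, one would call `unify` on the sources and then on the destination; a size known for any one of them then applies to all, and a mismatch is reported as an error rather than silently resolved.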
Samuel Pitoiset
49ef890733 radv: expose VK_EXT_scalar_block_layout
Nothing to do, the compiler already handles that.

All new dEQP.VK.ubo.* and dEQP.VK.ssbo.* pass, except some
16-bit tests that are quite related to fdo bug #108114.

Only enable the extension on CIK+ because it might not work on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 17:38:20 +01:00
Samuel Pitoiset
c6465fec0c spirv: add SpvCapabilityInt64Atomics
Required for VK_KHR_shader_atomic_int64.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-05 14:39:55 +01:00
Michal Srb
63c0916ada drisw: Use separate drisw_loader_funcs for shm
The original code was modifying the global drisw_lf variable, which is bad
when there are multiple contexts in a single process, each initialized with
a different loader. One may support put_image_shm and the other not.

Since there are currently only two possible combinations, let's create two
global tables, one for each. Let's make them const, since we won't change them
and they can be shared.

This fixes a crash in VLC. It used two GL contexts (each in a different
thread): one was initialized by its Qt GUI, the other by its video output
plugin. The first one set put_image_shm=drisw_put_image_shm, the second did
not, but since the same structure was used, drisw_put_image_shm was used too.
Then it crashed because the second loader did not have putImageShm set.

Downstream bug:
https://bugzilla.opensuse.org/show_bug.cgi?id=1113533

v2: Added Fixes and described the VLC bug.

Fixes: 63c427fa71 ("drisw: use putImageShm if available")
Signed-off-by: Michal Srb <msrb@suse.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-05 13:16:09 +00:00
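
The drisw fix above replaces one mutable global table with two read-only
tables and picks one per loader. The actual change is in C; the sketch below
only models the pattern in Python, and every name in it is hypothetical.

```python
from types import MappingProxyType


def put_image():
    return "put_image"


def put_image_shm():
    return "put_image_shm"


# Two read-only tables, one per supported combination (hypothetical names).
LOADER_FUNCS = MappingProxyType({"put_image": put_image})
LOADER_FUNCS_SHM = MappingProxyType(
    {"put_image": put_image, "put_image_shm": put_image_shm}
)


def select_loader_funcs(loader_has_shm):
    """Pick a shared table instead of mutating a single global one."""
    return LOADER_FUNCS_SHM if loader_has_shm else LOADER_FUNCS
```

Because neither table is ever written to, a context whose loader lacks shm
support cannot end up with put_image_shm set by another context.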
Michal Srb
c0ac038c97 gallium: Constify drisw_loader_funcs struct
The content is not expected to change.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Michal Srb <msrb@suse.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-05 13:16:09 +00:00
Samuel Pitoiset
c7ada4901a radv: wait on the high 32 bits of timestamp queries
In case we are unlucky and the low part is 0xffffffff.

Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp queries")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 13:05:58 +01:00
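
The timestamp fix above can be illustrated with plain integers. This sketch
assumes that a reset (not yet written) timestamp slot holds an all-ones 64-bit
value; that sentinel and the function names are assumptions made for
illustration, not taken from the driver.

```python
# Assumed sentinel for a timestamp slot that has not been written yet.
NOT_READY = 0xFFFFFFFFFFFFFFFF


def ready_by_low_half(value):
    """Availability check that looks only at the low 32 bits."""
    return (value & 0xFFFFFFFF) != 0xFFFFFFFF


def ready_by_high_half(value):
    """Availability check that looks only at the high 32 bits."""
    return (value >> 32) != 0xFFFFFFFF


# A valid timestamp whose low half happens to be all ones.
unlucky = 0x00000001FFFFFFFF
```

Checking the low half misreads the unlucky value as not ready, while the high
half of a real timestamp would only be all ones for a counter close to 2**64.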
Samuel Pitoiset
e899728769 radv: reset pending_reset_query when flushing caches
If the driver used a compute shader for resetting a query pool,
it should be completed when caches are flushed.

This might reduce the number of stalls if operations are done
between vkCmdResetQueryPool() and vkCmdBeginQuery()
(or vkCmdWriteTimestamp()).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
2018-12-05 13:05:55 +01:00
Lionel Landwerlin
9a7b319903 anv/query: flush render target before copying results
This change tracks render target writes in the pipeline and applies a
render target flush before copying the query results to make sure the
preceding operations have landed in memory before the command streamer
initiates the copy.

v2: Simplify logic in CopyQueryResults (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909
Fixes: 37f9788e9a ("anv: flush pipeline before query result copies")
Cc: mesa-stable@lists.freedesktop.org
2018-12-05 11:43:34 +00:00
Alex Smith
c1b6cb068c radv: Flush before vkCmdWriteTimestamp() if needed
As done for vkCmdBeginQuery() already. Prevents timestamps from being
overwritten by previous vkCmdResetQueryPool() calls if the shader path
was used to do the reset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-05 10:52:48 +00:00
Samuel Pitoiset
824cfc1ee5 radv: rework the TC-compat HTILE hardware bug with COND_EXEC
After investigating this, it appears that COND_WRITE doesn't
work correctly in some situations. I don't know exactly why it
fails to update DB_Z_INFO.ZRANGE_PRECISION, but as AMDVLK
also uses COND_EXEC I think there is a reason.

Now the driver stores a new metadata value in order to reflect
the last fast depth clear state. If a TC-compat HTILE is fast cleared
with 0.0f, we have to update ZRANGE_PRECISION to 0 in order to
work around that hardware bug.

This fixes rendering issues with The Forest and DXVK and doesn't
seem to introduce any regressions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108914
Fixes: 68dead112e ("radv: update the ZRANGE_PRECISION value for the TC-compat bug")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 09:26:31 +01:00
Dieter Nützel
2669dbf881 docs/features: Delete double nv50 entry and wrong enumeration
trivial

Fix commit d9b2234042

Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-12-04 18:51:18 -05:00
Marek Olšák
5907412d04 st/mesa: expose EXT_render_snorm on GLES
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-04 15:33:29 -05:00
Marek Olšák
1660f3aa05 mesa: expose AMD_texture_texture4
because the closed driver exposes it. Tested by piglit.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-04 15:33:29 -05:00
Marek Olšák
908f817918 mesa: expose EXT_texture_compression_bptc in GLES
tested by piglit.

v2: rebase

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-12-04 15:33:29 -05:00
Marek Olšák
34f07ddebb mesa: expose EXT_texture_compression_rgtc on GLES
The spec was modified to support GLES. Tested by piglit.

v2: rebase

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-12-04 15:33:29 -05:00
Erik Faye-Lund
91af56e383 mesa/main: fix up _mesa_has_rg_textures for gles2
rg-textures are supported in GLES 2.0 if EXT_texture_rg is supported, so
let's make sure the enums are accepted.

Fixes: 510b642460 "mesa/main: do not allow rg-textures enums before gles3"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108936
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-04 21:14:26 +01:00
Erik Faye-Lund
5bf38bfb64 mesa/main: correct validation for GL_RGB565
Technically speaking, this validation was incorrect, because GL_RGB565
is only supported in OpenGL ES 1.x if OES_framebuffer_object is
supported. This couldn't lead to any real incorrect behavior, because
all drivers support OES_framebuffer_object. But let's keep the code
self-documenting, by correcting the check as per the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-04 21:14:16 +01:00
Marek Olšák
4b218984d8 mesa: expose GL_EXT_texture_view as an alias of GL_OES_texture_view
There are no spec changes.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-04 12:50:36 -05:00
Marek Olšák
d9b2234042 st/mesa: expose GL_OES_texture_view
For format fallbacks like ETC and ASTC, switching between sRGB and linear
decoding is undefined, or at least is not bit-exact. Same as
EXT_texture_sRGB_decode on GLES.

There are no piglit or dEQP regressions.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-04 12:50:36 -05:00
Eric Engestrom
95d62baac5 loader: deduplicate logger function declaration
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-04 16:29:32 +00:00
Eric Engestrom
eade6ffeee mesa: drop unused & deprecated lib
DeprecationWarning: the imp module is deprecated in favour of importlib

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-04 16:26:21 +00:00
Eric Engestrom
919bec1c47 anv: add unreachable() for VK_EXT_fragment_density_map
This silences the -Wswitch compiler warning.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-04 16:22:55 +00:00
Eric Engestrom
a0b14c1b02 meson: skip asm check when asm is disabled
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-04 16:22:51 +00:00
Andrii Simiklit
6ae873b97d intel/tools: make sure the binary file is properly read
1. tools/i965_disasm.c:58:4: warning:
     ignoring return value of ‘fread’,
     declared with attribute warn_unused_result
     fread(assembly, *end, 1, fp);

v2: Fixed incorrect return value check.
       ( Eric Engestrom <eric.engestrom@intel.com> )

v3: Zero size file check placed before fread with exit()
       ( Eric Engestrom <eric.engestrom@intel.com> )

v4: - Title is changed.
    - The 'size' variable was moved to top of a function scope.
    - The assertion was replaced by the proper error handling.
    - The error message on a caller side was fixed.
       ( Eric Engestrom <eric.engestrom@intel.com> )

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-04 16:19:26 +00:00
Toni Lönnberg
d7b99ab947 intel/aubinator_error_decode: Get rid of warning for missing switch case
../src/intel/tools/aubinator_error_decode.c: In function ‘instdone_register_for_ring’:
../src/intel/tools/aubinator_error_decode.c:177:4: warning: enumeration value ‘I915_ENGINE_CLASS_INVALID’ not handled in switch [-Wswitch]
    switch (class) {
    ^~~~~~
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-04 12:47:49 +00:00
Ilia Mirkin
bacf8471dc nouveau: set texture upload budget
It doesn't seem like the exact number has too much effect on the
performance in "teximage". However setting it to just about anything
prevents some OOMs from getting hit. These values are not well-tuned,
but don't seem too bad.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03 23:11:29 -05:00
Ilia Mirkin
08c64fe7a1 nv50,nvc0: add explicit handling of PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET
Since the max attrib stride is 2048, the max src offset makes sense as
2047.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03 23:11:29 -05:00
Ilia Mirkin
de49e06507 nv50: always keep TSC slot 0 bound
All TXF operations implicitly use sampler 0, and fail if it's not bound
to anything. This does not happen in LINKED_TSC mode, but we don't
currently use this.

We ensure that TSC entry at id 0 has the SRGB conversion bit enabled
(and all samplers we normally generate will too). Then when the TSC at
*slot* 0 (not to be confused with entry 0 in the global TSC table) is
unbound, we bind it to entry 0. This way, TXF operations are not
dependent on there being a regular sampler bound there.

Fixes arb_texture_buffer_object-subdata-sync among others. (TBO's are
particularly susceptible to this as they don't bind a sampler.)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03 23:11:29 -05:00
Dave Airlie
1363a47c9c radv: use 3d shader for gfx9 copies if dst is 3d
This fixes some crucible 3d miptree tests I've been working on
when executed using the compute shader path.

Fixes: d08f267814 (radv/gfx9: fix 3d image to image transfers on compute queues.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 10:42:31 +10:00
Bas Nieuwenhuizen
12e35a64c0 radv: Check for shareable images in central place.
One place to put the logic makes things easier to change.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Bas Nieuwenhuizen
3bf48741e1 radv/android: Use buffer metadata to determine scanout compat.
These days we don't always allocate scanout compatible textures anymore.
That does mean we have to fix the radv android WSI though.

Fixes: b1444c9ccb "radv: Implement VK_ANDROID_native_buffer."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Bas Nieuwenhuizen
51091b3e1f radv/android: Mark android WSI image as shareable.
Fixes: b1444c9ccb "radv: Implement VK_ANDROID_native_buffer."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Matt Turner
dd53bb7e1f Revert "st/mesa: silenced unhanded enum warning in st_glsl_to_tgsi.cpp"
This reverts commit 198c50f487.

This needs to be reverted after commit 017199d2d2 ("mesa: Revert
INTEL_fragment_shader_ordering support")
2018-12-03 16:20:43 -08:00
Matt Turner
017199d2d2 mesa: Revert INTEL_fragment_shader_ordering support
This extension is not properly tested (testing for
GL_ARB_fragment_shader_interlock is not sufficient), and since this was
noted in review on August 28th no tests have been sent.

Revert "i965: Add INTEL_fragment_shader_ordering support."
Revert "mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering"

This reverts commit 03ecec9ed2.
This reverts commit 119435c877.

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Anholt <eric@anholt.net>
2018-12-03 15:37:37 -08:00
Dave Airlie
e3f075439c virgl: fix const warning on debug flags.
Fixes: 8d4bb6e5c (virgl: Add command and flags to initiate debugging on the host (v2))
2018-12-04 08:11:13 +10:00
Jason Ekstrand
71271e167b vulkan: Update the XML and headers to 1.1.95
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-03 14:27:10 -06:00
Tobias Klausmann
9401a2f2e6 amd/vulkan: meson build - use radv_deps for libvulkan_radeon
Without this the build breaks with:

FAILED: src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o
cc -Isrc/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha -Isrc/amd/vulkan
-I../src/amd/vulkan -Isrc/../include -I../src/../include -Isrc -I../src
-Isrc/mapi -I../src/mapi -Isrc/mesa -I../src/mesa -I../src/gallium/include
-Isrc/gallium/auxiliary -I../src/gallium/auxiliary -Isrc/amd -I../src/amd
-Isrc/amd/common -I../src/amd/common -Isrc/compiler -I../src/compiler
-Isrc/vulkan/util -I../src/vulkan/util -Isrc/vulkan/wsi -I../src/vulkan/wsi
-Isrc/compiler/nir -I../src/compiler/nir -I/usr/include -I/usr/include/libdrm
-fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch
-std=c99 -O2 -g '-DVERSION="18.3.0-rc5"' -DPACKAGE_VERSION=VERSION
'-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"'
-DGLX_USE_TLS -DHAVE_ST_VDPAU -DENABLE_ST_OMX_BELLAGIO=0
-DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING
-DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DENABLE_SHADER_CACHE
-DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ
-DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT
-DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT
-DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE
-DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN
-DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE
-DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT
-DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT
-DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
-DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS
-DHAVE_FUNC_ATTRIBUTE_NORETURN -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS
-DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H
-DHAVE_ENDIAN_H -DHAVE_DLFCN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP
-DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L
-DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_PTHREAD
-DHAVE_PTHREAD_SETAFFINITY -DHAVE_LIBDRM -DHAVE_LLVM=0x0600
-DMESA_LLVM_VERSION_PATCH=1 -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED
-DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -Werror=implicit-function-declaration
-Werror=missing-prototypes -Werror=return-type -fno-math-errno
-fno-trapping-math -Wno-missing-field-initializers -Wno-format-truncation -O2
-Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables
-fasynchronous-unwind-tables -fstack-clash-protection -DNDEBUG -fPIC -pthread
-D__STDC_FORMAT_MACROS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS
-D__STDC_LIMIT_MACROS -fvisibility=hidden -Wno-override-init
-DVK_USE_PLATFORM_XCB_KHR -DVK_USE_PLATFORM_XLIB_KHR
-DVK_USE_PLATFORM_WAYLAND_KHR -DVK_USE_PLATFORM_DISPLAY_KHR
-DVK_USE_PLATFORM_XLIB_XRANDR_EXT  -MD -MQ
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -MF
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o.d' -o
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -c
../src/amd/vulkan/radv_pipeline.c
In file included from ../src/vulkan/util/vk_alloc.h:29,
                 from ../src/amd/vulkan/radv_private.h:52,
                 from ../src/amd/vulkan/radv_debug.h:27,
                 from ../src/amd/vulkan/radv_pipeline.c:30:
../src/../include/vulkan/vulkan.h:54:10: fatal error: wayland-client.h: No
such file or directory
 #include <wayland-client.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.

The above command misses the include directory for wayland:
    -I/usr/include/wayland

The missing include is contained in the (until now) unused radv_deps:

if with_platform_wayland
  radv_deps += dep_wayland_client
  radv_flags += '-DVK_USE_PLATFORM_WAYLAND_KHR'
  libradv_files += files('radv_wsi_wayland.c')
endif

Fixes: 673dda8330 "meson: build "radv" vulkan driver for radeon hardware"
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-03 09:18:48 -08:00
Erik Faye-Lund
fcf9fcee3c mesa/main: do not require float-texture filtering for es3
The OpenGL ES 3.0 specification, table 3.13 lists half-float textures as
filterable, but not float textures. So we shouldn't depend on
ARB_float_texture, which requires full filtering support for both.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
43015b2a89 mesa/st: do not probe for the same texture-formats twice
This should be equivalent to what we did before.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
212d270b4e mesa/main: require EXT_texture_sRGB for gles3
sRGB textures are a requirement for OpenGL ES 3.0, so let's make sure
we don't incorrectly enable a too high version.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
487010a099 mesa/main: require EXT_texture_type_2_10_10_10_REV for gles3
OpenGL ES 3.0 requires this functionality, so we should also test for it
to avoid incorrectly exposing a too high GLES version.

On desktop, this has been required since all the way back in OpenGL 1.2
anyway.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
74eab1c62f mesa/main: split float-texture support checking in two
On OpenGL ES 2.0, there are separate extensions adding support for
half-float and float textures. So we need to validate the enums
separately as well.

This also prevents these enums from incorrectly being allowed on
OpenGL ES 1.x, where there's no extension that enables this in the
first place.

While we're at it, remove the pointless default-case, and the seemingly
stale fallthrough comment.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
c4136ed5cc mesa/main: do not allow EXT_texture_sRGB_R8 enums before gles3
ctx->Extensions.EXT_texture_sRGB_R8 is set regardless of the API
that's used, so checking for those directly will always allow the
enums from this extension when they are supported by the driver.

There's no extension adding support for this on OpenGL ES before
version 3.0, so let's tighten the check.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
d972939986 mesa/main: do not allow sRGB texture enums before gles3
ctx->Extensions.EXT_texture_sRGB is set regardless of the API that's
used, so checking for those directly will always allow the enums from
this extension when they are supported by the driver.

There's no extension adding support for this on OpenGL ES before
version 3.0, so let's tighten the check.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
3629ee025c mesa/main: do not allow snorm-texture enums before gles3
ctx->Extensions.EXT_texture_snorm is set regardless of the API
that's used, so checking for those directly will always allow the
enums from this extension when they are supported by the driver.

There's no extension adding support for this on OpenGL ES before
version 3.0, so let's tighten the check.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
52dc8b4f7b mesa/main: do not allow floating-point texture enums on gles1
ctx->Extensions.OES_texture_float is set regardless of the API
that's used, so checking for those directly will always allow the
enums from this extension when they are supported by the driver.

There's no extension enabling floating-point textures for OpenGL
ES 1.x, so we shouldn't allow those enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
167dcd59ae mesa/main: do not allow type_2_10_10_10_REV enums before gles3
ctx->Extensions.EXT_texture_type_2_10_10_10_REV is set regardless of
the API that's used, so checking for those directly will always enable
extensions when they are supported by the driver.

There's no corresponding extension for OpenGL ES 1.x/2.0, so we
shouldn't allow these enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
b112e62ba4 mesa/main: do not allow MESA_ycbcr_texture enums on gles
This extension requires OpenGL, and shouldn't be available on OpenGL ES.
So let's not allow the enums from it either.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
1b2e9aca77 mesa/main: do not allow EXT_texture_shared_exponent enums before gles3
ctx->Extensions.EXT_texture_shared_exponent is set regardless of the
API that's used, so checking for those directly will always allow the
enums from this extension when they are supported by the driver.

We also need to make sure this is enabled on OpenGL ES 3. Because the
check is repeated, let's introduce a helper.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
510b642460 mesa/main: do not allow rg-textures enums before gles3
EXT_packed_float isn't supported on OpenGL ES, so we shouldn't allow
these enums there before OpenGL ES 3.0, which also introduces support
for these enums.

Since this check is repeated a lot, let's make a helper for this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
59690bf0a3 mesa/main: do not allow EXT_packed_float enums before gles3
EXT_packed_float isn't supported on OpenGL ES, so we shouldn't allow
these enums there before OpenGL ES 3.0, which also introduces support
for these enums.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
83db9d3e3a mesa/main: do not allow ARB_depth_buffer_float enums before gles3
Floating-point depth buffers are only supported on OpenGL 3.0, OpenGL ES
3.0, or if ARB_depth_buffer_float is supported. Because we checked a
driver capability rather than using an extension-check helper, we ended
up incorrectly allowing this on OpenGL ES 1.x and 2.x.

Since this logic is repeated, let's make a helper for it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
3bbd543b6e mesa/main: do not allow integer-texture enums before gles3
Integer textures shouldn't be implicitly exposed on OpenGL ES 1.x and
2.x, but because the code checked against a driver-capability rather
than using an extension-check helper, we ended up accidentally allowing
these enums on older versions when the driver supports it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
b5a370dc25 mesa/main: do not allow ARB_texture_rgb10_a2ui enums before gles3
ARB_texture_rgb10_a2ui isn't supported on OpenGL ES, so we shouldn't
expose it there even if the driver supports it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
76b038bee7 mesa/main: do not allow stencil-texture enums on gles1
ctx->Extensions.ARB_texture_stencil8 is set regardless of the API
that's used, so checking for those directly will always allow the
enums from this extension when they are supported by the driver.

So let's instead check for both ARB_texture_stencil8 and
OES_texture_stencil8, so we support stencil textures on OpenGL and
OpenGL ES 2.0+. There's no extension enabling stencil-textures for
OpenGL ES 1.x, so we shouldn't allow those enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
19eb0bf28f mesa/main: do not allow depth-texture enums on gles1
ctx->Extensions.ARB_depth_texture is set regardless of the API that's
used, so checking for those directly will always allow the enums from
this extension when they are supported by the driver.

So let's instead check for both ARB_depth_texture and OES_depth_texture,
so we support depth textures on OpenGL and OpenGL ES 2.0+. There's no
extension enabling depth-textures for OpenGL ES 1.x, so we shouldn't
allow those enums there.

This fixes oes_packed_depth_stencil-depth-stencil-texture_gles1 on i965

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
2dfcaf7554 mesa/main: do not allow astc enums on gles1
ctx->Extensions.KHR_texture_compression_astc_ldr is set regardless of
the API that's used, so checking for those directly will always enable
extensions when they are supported by the driver.

But there's no extension enabling ASTC for OpenGL ES 1.x, so we
shouldn't allow those enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
1aa134038c mesa/main: do not allow etc2 enums on gles1
ctx->Extensions.ARB_ES3_compatibility is set regardless of the API
that's used, so checking for those directly will always enable
extensions when they are supported by the driver.

But there's no extension enabling ETC2 for OpenGL ES 1.x, so we
shouldn't allow those enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
27ca87ccca mesa/main: do not allow s3tc enums on gles1
There's no extension enabling S3TC formats on OpenGL ES 1.x, so we
shouldn't allow these even if the driver can support it. So let's check
for EXT_texture_compression_s3tc instead of ANGLE_texture_compression_dxt,
which is supported on all other OpenGL variations.

We also need to use _mesa_has_EXT_texture_compression_s3tc() instead of
checking the driver cap directly, otherwise we end up enabling this on
OpenGL ES 1.x, as the API isn't checked.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
d70cfb322a mesa/main: use _mesa_has_FOO_bar for compressed format checks
_mesa_has_FOO_bar() knows about the APIs these extensions should be
supported under, so let's use that to simplify these checks a bit.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
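
The series above repeatedly swaps a raw ctx->Extensions check for a
_mesa_has_FOO_bar() helper that also considers the API and version. The Python
sketch below models that difference only in outline; the table contents, the
version encoding (20 for 2.0) and all names are illustrative assumptions and
do not mirror Mesa's real extension table.

```python
from dataclasses import dataclass, field

# Minimum version per API at which an extension may be exposed.
# An API that is missing from an entry never exposes that extension.
# Illustrative values only.
EXTENSION_MIN_VERSION = {
    "OES_texture_float": {"gles2": 20},
}


@dataclass
class Context:
    api: str       # e.g. "gles1" or "gles2"
    version: int   # 11 for 1.1, 20 for 2.0, ...
    driver_caps: set = field(default_factory=set)


def has_extension(ctx, name):
    """True only if the driver supports it AND the API/version allows it."""
    min_version = EXTENSION_MIN_VERSION.get(name, {}).get(ctx.api)
    return (
        name in ctx.driver_caps
        and min_version is not None
        and ctx.version >= min_version
    )
```

With this shape, a GLES 1.x context on a capable driver still has the name in
driver_caps, but has_extension() reports False because no table entry allows
it on that API.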
Erik Faye-Lund
70bfd31287 mesa/main: clean up integer texture check
This makes the logic a little bit easier to follow, and reduces a bit of
repetition.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
5109742e7b mesa/main: clean up ES2_compatibility check
This makes the logic a little bit easier to follow; this is *either*
about ES2 compatibility *or* about gles. GL_RGB565 was added already in
OpenGL ES 1.0.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
2e753b77dd mesa/main: clean up OES_texture_float_linear check
Using the _mesa_has_FOO_bar helpers is generally safer and should
generally be preferred over checking driver-caps like this code did,
because the _mesa_has_FOO_bar helpers also verify the API type and
version.

This shouldn't have any practical effect here, as this function only
gets called for OpenGL ES 3.x right now. But if this was to change in
the future, this makes the function behave a lot more predictably.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
1373d117c2 mesa/main: clean up S3_s3tc check
S3_s3tc is the extension that enables this functionality on desktop, so
let's check for that one. The _mesa_has_S3_s3tc() helper already
verifies the API according to the extension-table.

As for the second hunk, we currently only expose
EXT_texture_compression_s3tc on desktop, so by using the helper instead,
we get rid of this detail here, and once we enable it for GLES we'll
automatically get the interaction right.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
e8b331ae13 mesa/main: rename format-check function
_mesa_es3_error_check_format_and_type isn't specific to OpenGL ES 3.x,
it applies to all versions of OpenGL ES. So let's rename it to reflect
this.

While we're at it, let's also rename a helper function it uses similarly.
As the helper is static, we can also remove the namespacing-prefix from
the name.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
ca8e2a5277 mesa/main: make _mesa_has_tessellation return bool
All other _mesa_has_foo functions return bool rather than GLboolean, so
let's follow that style here as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:43 +01:00
Chad Versace
3ef0ca65c9 i965: Fix -Wswitch on INTEL_COPY_STREAMING_LOAD
The warning is emitted when building without INLINE_SSE41.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-03 13:07:56 +02:00
Karol Herbst
fc0139d283 nv50,nvc0: Fix gallium nine regression regarding sampler bindings
The new approach is that samplers don't get unbound even if they won't be used
in a draw and we should just leave them be as well.

Fixes a regression in multiple Windows games using gallium nine and nouveau.

v2: adjust num_samplers to keep track of the highest sampler bound
v3: rework how to set the new value of num_samplers

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106577
Fixes: 4d6fab245e
       "cso: don't track the number of sampler states bound"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-02 00:05:04 +01:00
Andre Heider
b6f095f7ce d3dadapter9: use snprintf(..., "%s", ...) instead of strncpy
Fixes -Wstringop-truncation compiler warnings.
See f836d799f9 "intel/decoder: use snprintf(..., "%s", ...) instead of strncpy"
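
A minimal sketch of the pattern named in the subject line, not the actual
d3dadapter9 change. The helper name copy_name() is hypothetical. Unlike
strncpy, snprintf with a "%s" format always NUL-terminates the destination
(for a non-zero size), and its return value tells us whether truncation
happened.

```c
#include <stdio.h>

/*
 * Hypothetical helper: copy src into dst (capacity dst_size bytes),
 * always leaving dst NUL-terminated. Returns 1 if the string had to be
 * truncated (or formatting failed), 0 otherwise.
 */
static int copy_name(char *dst, size_t dst_size, const char *src)
{
   /* snprintf returns the length the full string would have needed. */
   int needed = snprintf(dst, dst_size, "%s", src);

   return needed < 0 || (size_t)needed >= dst_size;
}
```

With a 4-byte destination, copy_name(buf, sizeof(buf), "hello") stores "hel"
plus the terminator and reports truncation, which is the case that
-Wstringop-truncation warns about for strncpy.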

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-12-01 21:32:53 +01:00
Mauro Rossi
37a2072e97 android: st/mesa: fix building error due to sched_getcpu()
Android has the cpufeatures library, but pinning of threads is not supported.
The PIPE_OS_LINUX code path causes a build error because sched_getcpu() is
unavailable, so we need to avoid setting HAVE_SCHED_GETCPU for Android.

Fixes: 48f2160 ("st/mesa: regularly re-pin driver threads to the CCX where the app thread is")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-01 10:15:58 +01:00
Vinson Lee
4f74580d30 st/xvmc: Add X11 include path.
This patch fixes this build error.

  CC       tests/xvmc_bench.o
In file included from tests/xvmc_bench.c:35:
tests/testlib.h:38:10: fatal error: 'X11/Xlib.h' file not found
         ^~~~~~~~~~~~

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-30 22:09:43 -08:00
Mauro Rossi
eed3f1121c android: amd/addrlib: update Mesa's copy of addrlib
Needed to fix build error in addrlib in mesa for Android

Fixes: 776b911 ("amd/addrlib: update Mesa's copy of addrlib")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-12-01 01:13:53 +01:00
Gurchetan Singh
89b4798c06 virgl: don't mark buffers as unclean after a write
We can mark the buffer unclean if it's ever bound as a TBO,
SSBO, ABO, or image.

This improves

dEQP-GLES3.performance.buffer.data_upload.function_call.map_buffer_range.new_specified_buffer.flag_write_full.stream_draw

from 9.58 MB/s to 451.17 MB/s.

v2: Track buffer cleanliness as a function of bindings (Ilia).
v3: virgl_modify_clean --> virgl_dirty_res (Erik)

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-11-30 12:21:01 +01:00
Gurchetan Singh
d18492c64f virgl: avoid large inline transfers
We flush every time the command buffer (16 kB) is full, which is
quite costly.

This improves

dEQP-GLES3.performance.buffer.data_upload.function_call.buffer_data.new_buffer.usage_stream_draw

from 111.16 MB/s to 1930.36 MB/s.

In addition, I made the benchmark produce buffers from 0 --> VIRGL_MAX_CMDBUF_DWORDS * 4,
and tried ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 2), ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 4), etc.

I didn't notice any clear differences, so let's just go with the most obvious
heuristic.

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-11-30 12:20:41 +01:00
Gurchetan Singh
c0773315af virgl: quadruple command buffer size
Tested running WebGL aquarium on Nvidia host (10,000 fishes)

This moves us from 7 fps to 9 fps.  After quadrupling, performance
gains diminish.

v2: Remove change ID (Erik)

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-11-30 12:20:06 +01:00
Lionel Landwerlin
37f9788e9a anv: flush pipeline before query result copies
Pipeline state pending bits should be taken into account when copying
results.

In the particular bug below, the results of the
vkCmdCopyQueryPoolResults() command were being overwritten by the
preceding vkCmdCopyBuffer() with the same destination buffer. This is
because we copy the buffers using the 3D pipeline whereas we copy the
query results using the command streamer. Those pieces of HW work in
parallel and the results are somewhat undefined.

v2: Unconditionally flush the pipeline before copying the results
    (Jason)

v3: Wrap & expressions (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108894
Cc: mesa-stable@lists.freedesktop.org
2018-11-29 22:07:31 +00:00
Marek Olšák
39b20b7d4f Revert "winsys/amdgpu: overallocate buffers for faster address translation on Gfx9"
I didn't mean to push this. I don't think it makes any difference.

This reverts commit f737fe00a0.
2018-11-29 14:46:06 -05:00
Roland Scheidegger
fbf95ce074 draw: fix infinite loop in line stippling
The calculated length of a line may be infinite, if the coords we
get are bogus. This leads to an infinite loop in line stippling.
To prevent this, test for it explicitly (although technically
on at least x86 sse it would actually work without the explicit
test, as long as we use the int-converted length value).
While here also get rid of some always-true condition.

Note this does not actually solve the root cause, which is that
the coords we receive are bogus after clipping. This seems a difficult
problem to solve. One issue is that due to float arithmetic, clip w
may become 0 after clipping if the incoming geometry is
"sufficiently degenerate", hence x/y/z ndc (and window) coords will
be all inf (or nan). Even with w not quite 0, I believe it's possible
we produce values which are actually outside the view volume.
(Also, x=y=z=w=0 coords in clipspace would be not considered subject
to clipping, and similarly result in all NaN coords.) We just hope for
now other draw stages (and rasterizers) can handle those relatively
safely (llvmpipe itself should be sort of robust against this, certainly
conversion to fixed point will produce garbage, it might fail a couple
assertions but should neither hang nor crash otherwise).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-11-29 18:39:40 +01:00
Józef Kucia
94bfb8bf38 nir: Fix assert in print_intrinsic_instr().
Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-29 16:29:37 +00:00
Nicolai Hähnle
776b911365 amd/addrlib: update Mesa's copy of addrlib
Update to the internal master as of 2018-11-15.

This has a lot of gratuitous whitespace change, but on the plus
side it's built using the same tooling that's used for AMDVLK,
which should help going forward.
2018-11-29 13:18:24 +01:00
Nicolai Hähnle
621c107760 ac/surface/gfx9: let addrlib choose the preferred swizzle kind
Our choices here are simply redundant as long as sin.flags is set
correctly.

(v2:
- remove unused function parameter)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-29 13:18:23 +01:00
Nicolai Hähnle
729ebdf07e radv: remove dependency on addrlib gfx9_enum.h
v2:
- use SI_CONTEXT_REG_OFFSET

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-29 13:18:23 +01:00
Thomas Hellstrom
058f85d41c winsys/svga: Fix a memory leak
The ioctl.cap_3d member was never freed.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-29 10:42:06 +01:00
Thomas Hellstrom
7fce3ca375 st/xa: Fix a memory leak
Free the context after destruction.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-29 10:42:06 +01:00
Samuel Pitoiset
cc7deb749c radv: drop few useless state changes when doing color/depth decompressions
Viewport/scissor don't need to be updated for array textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:55 +01:00
Samuel Pitoiset
6d4f65deea radv: remove unused pending_clears param in the transition path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:53 +01:00
Samuel Pitoiset
4b9df824f7 radv: optimize CmdClear{Color,DepthStencil}Image() for layered textures
If all layers are bound we can perform a fast color or depth clear
instead of iterating over all layers. This has the advantage of
not trashing the framebuffer for nothing if we end up
doing a fast clear when calling radv_clear_image_layer(), and
clearing all layers in one shot is obviously faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
7484bc894b radv: refactor the fast clear path for better re-use
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
f78ee19702 radv: simplify a check in emit_fast_color_clear()
Currently only true if RADV_PERFTEST=dccmsaa is set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
eca931a726 radv: add radv_can_fast_clear_{color,depth}() helpers
For further optimisations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
93f5ce8fa7 radv: add radv_image_view_can_fast_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
aeaf8dbd09 radv: add radv_image_can_fast_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
3e718db1ff radv: remove useless check in emit_fast_color_clear()
The driver doesn't support DCC/CMASK for mipmapped textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Vinson Lee
d0c7b079d0 freedreno: Fix autotools build.
Fix build error.

  CXXLD    pipe_msm.la
../../../../src/gallium/drivers/freedreno/.libs/libfreedreno.a(freedreno_batch.o): In function `batch_init':
src/gallium/drivers/freedreno/freedreno_batch.c:54: undefined reference to `fd_device_version'
src/gallium/drivers/freedreno/freedreno_batch.c:59: undefined reference to `fd_submit_new'
src/gallium/drivers/freedreno/freedreno_batch.c:61: undefined reference to `fd_submit_new_ringbuffer'
src/gallium/drivers/freedreno/freedreno_batch.c:64: undefined reference to `fd_submit_new_ringbuffer'
src/gallium/drivers/freedreno/freedreno_batch.c:66: undefined reference to `fd_submit_new_ringbuffer'
src/gallium/drivers/freedreno/freedreno_batch.c:70: undefined reference to `fd_submit_new_ringbuffer'

Fixes: b4476138d5 ("freedreno: move drm to common location")
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-11-28 22:23:52 -08:00
Marek Olšák
075fd5d8f2 radeonsi: add memory management stress tests for GDS
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
c1d3c08699 winsys/amdgpu: add support for allocating GDS and OA resources
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
d7a4fa91f0 radeonsi: allow si_cp_dma_clear_buffer to clear GDS from any IB
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
72b2b61d8c winsys/amdgpu: use optimal VM alignment for CPU allocations
Acked-by: Christian König <christian.koenig@amd.com>
2018-11-28 20:20:27 -05:00
Marek Olšák
27f9935075 winsys/amdgpu: use optimal VM alignment for imported buffers
Window system buffers didn't use the optimal alignment.

Acked-by: Christian König <christian.koenig@amd.com>
2018-11-28 20:20:27 -05:00
Marek Olšák
6b554d863f winsys/amdgpu,radeon: pass vm_alignment to buffer_from_handle
Acked-by: Christian König <christian.koenig@amd.com>
2018-11-28 20:20:27 -05:00
Marek Olšák
f737fe00a0 winsys/amdgpu: overallocate buffers for faster address translation on Gfx9
Sadly, the 3 games I tested (DeusEx:MD, DiRT Rally, DOTA 2) are unaffected
by the overallocation, because I guess their buffers don't fall into
the small range below a power-of-two size.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
8c00f778fc winsys/amdgpu: increase the VM alignment to the MSB of the size for Gfx9
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
a2a6b06d48 winsys/amdgpu: use >= instead of > for VM address alignment
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
98f2312b4f winsys/amdgpu: clean up code around BO VM alignment
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
5f9ccf827e winsys/amdgpu: optimize slab allocation for 2 MB amdgpu page tables
- the slab buffer size increased from 128 KB to 2 MB (PTE fragment size)
- the max suballocated buffer size increased from 64 KB to 256 KB,
  this increases memory usage because it wastes memory
- the number of suballocators increased from 1 to 3 and they are layered
  on top of each other to minimize unused space in slabs

The final increase in memory usage is:
  DeusEx:MD:  1.8%
  DOTA 2:     1.75%
  DiRT Rally: 0.2%

The kernel driver will also receive fewer buffers.
2018-11-28 20:20:27 -05:00
Marek Olšák
cf6835485c radeonsi: generalize the slab allocator code to allow layered slab allocators
There is no change in behavior. It just makes it easier to change the number
of slab allocators.
2018-11-28 20:20:27 -05:00
Marek Olšák
9576266a37 winsys/amdgpu: always reclaim/release slabs if there is not enough memory 2018-11-28 20:20:27 -05:00
Marek Olšák
015061beb3 radeonsi: fix is_oneway_access_only for bindless images 2018-11-28 20:20:27 -05:00
Marek Olšák
8c25ab1a23 radeonsi/nir: parse more information about bindless usage
fill more tgsi_shader_info fields.
2018-11-28 20:20:27 -05:00
Marek Olšák
2a936f8afa tgsi/scan: add more information about bindless usage
radeonsi will use this.
2018-11-28 20:20:27 -05:00
Marek Olšák
fba91b5173 radeonsi: small cleanup for memory opcodes 2018-11-28 20:20:27 -05:00
Marek Olšák
709905cbb6 radeonsi: fix is_oneway_access_only for image stores
We need to look at the Dst for image stores.
2018-11-28 20:20:27 -05:00
Marek Olšák
648dc52367 radeonsi: use structured buffer intrinsics for image views
to stop using the workaround in si_make_buffer_descriptor.
2018-11-28 20:20:27 -05:00
Marek Olšák
442dae2693 radeonsi: clean up primitive binning enablement
no change in behavior.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Dave Airlie
8eb8be3f54 virgl: fix undefined shift to use unsigned.
Ported from virglrenderer.
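
The commit message does not say which shift was affected, so the following is
only a generic sketch of this class of fix. The helper name bit_mask() is
hypothetical. Shifting a signed 1 into the sign bit is undefined behaviour in
C, while the same shift on an unsigned operand is well defined.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a single-bit 32-bit mask.
 * "1 << 31" would overflow a signed int; an unsigned operand avoids that.
 */
static uint32_t bit_mask(unsigned bit)
{
   /* Masking the shift count keeps it below the operand width. */
   return UINT32_C(1) << (bit & 31u);
}
```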

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-11-29 09:09:31 +10:00
Dave Airlie
2ddd44d941 r600: make suballocator 256-bytes align
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108311
Cc: <mesa-stable@lists.freedesktop.org>
2018-11-29 09:09:02 +10:00
Kenneth Graunke
f11780779f intel/compiler: Use nir's info when checking uses_streams.
Vulkan and Gallium don't use Mesa's gl_program data structure, so they
can't poke at 'prog'.  But we can simply use the copy of the shader info
stored with the NIR shader, which is guaranteed to exist.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-28 13:35:29 -08:00
Jason Ekstrand
199a0353d6 nir/derefs: Add a nir_derefs_do_not_alias enum value
This makes some of the code more clear.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-11-28 14:29:25 -06:00
Gurchetan Singh
eb44c36cf1 egl: add missing #include <stddef.h> in egldevice.h
Otherwise, I get this error:

main/egldevice.h:54:13: error: ‘NULL’ undeclared (first use in this function)
       dev = NULL;
             ^~~~
with this config:

./autogen.sh --enable-gles1 --enable-gles2 --with-platforms='surfaceless' --disable-glx
             --with-dri-drivers="i965" --with-gallium-drivers="" --enable-gbm

v3: Use stddef.h (Matt)
v4: Modify commit message (Eric)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-28 11:22:47 -08:00
Matt Turner
2d48d5116b gallivm: Use nextafterf(0.5, 0.0) as rounding constant
The common truncf(x + 0.5) fails for the floating-point value just less
than 0.5 (nextafterf(0.5, 0.0)). nextafterf(0.5, 0.0) + 0.5, after
rounding, is 1.0, so truncf does not produce the desired value.

The solution is to add nextafterf(0.5, 0.0) instead of 0.5 before
truncating. This works for all values.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-11-28 11:22:47 -08:00
Juan A. Suarez Romero
e2ad94d928 docs: update calendar, add news item and link release notes for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-11-28 19:20:09 +01:00
Juan A. Suarez Romero
a53a280479 docs: add sha256 checksums for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cfd1f8b92c)
2018-11-28 19:20:09 +01:00
Juan A. Suarez Romero
f6ab6e2867 docs: add release notes for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3e741344d7)
2018-11-28 19:20:09 +01:00
Nicolai Hähnle
c02390f8fc egl/wayland: rather obvious build fix
Fixes: ce74a7bb8d ("egl/wayland: plug memory leak in drm_handle_device()")
Fixes: c59d3aa4b9 ("egl/wayland: bail out when drmGetMagic fails")
2018-11-28 18:30:36 +01:00
Nicolai Hähnle
eb94b6bd5c winsys/amdgpu: explicitly declare whether buffer_map is permanent or not
Introduce a new driver-private transfer flag RADEON_TRANSFER_TEMPORARY
that specifies whether the caller will use buffer_unmap or not. The
default behavior is set to permanent maps, because that's what drivers
do for Gallium buffer maps.

This should eliminate the need for hacks in libdrm. Assertions are added
to catch when the buffer_unmap calls don't match the (temporary)
buffer_map calls.

I did my best to update r600 for consistency (r300 needs no changes
because it never calls buffer_unmap), even though the radeon winsys
ignores the new flag.

As an added bonus, this should actually improve the performance of
the normal fast path, because we no longer call into libdrm at all
after the first map, and there's one less atomic in the winsys itself
(there are now no atomics left in the UNSYNCHRONIZED fast path).

Cc: Leo Liu <leo.liu@amd.com>
v2:
- remove comment about visible VRAM (Marek)
- don't rely on amdgpu_bo_cpu_map doing an atomic write
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-28 18:24:14 +01:00
Nicolai Hähnle
35eb81987c winsys/amdgpu: add amdgpu_winsys_bo::lock
We'll use it in the upcoming mapping change. Sparse buffers have always
had one.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-28 18:23:29 +01:00
Eric Engestrom
e0f1f74eda vulkan/wsi: fix s/,/;/ typo
Fixes: 59e58c348e "vulkan/wsi: Only wait on semaphores on the first swapchain"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-28 16:44:01 +00:00
Emil Velikov
ce74a7bb8d egl/wayland: plug memory leak in drm_handle_device()
As we fail to open the node, we leak the node/device name.

v2: Log and then free() (Eric)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-28 16:12:12 +00:00
Emil Velikov
c59d3aa4b9 egl/wayland: bail out when drmGetMagic fails
Currently, when the function fails, we pass uninitialized data to the
authentication function. Stop doing that and print a warning when
the function fails.

v2: Plug memory leak in error path (Eric)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-28 16:11:22 +00:00
Eric Engestrom
9575cd2893 wsi/display: fix mem leak when freeing swapchains
Fixes: da997ebec9 "vulkan: Add KHR_display extension using DRM [v10]"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2018-11-28 12:09:54 +00:00
Gert Wollny
f08d107054 i965: Set the FBO error state INCOMPLETE_ATTACHMENT only for SRGB_R8
Originally the driver reported GL_FRAMEBUFFER_UNSUPPORTED in all cases;
adding more specific error messages was not correct and broke many tests.
Mostly revert this and only report GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT
for MESA_FORMAT_R_SRGB8.

Fixes: ebcde34545
  i965: be more specific about FBO completeness errors

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108805

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-28 10:12:47 +01:00
Gert Wollny
d8bb88d0b4 i965: Explicitely handle swizzles for MESA_FORMAT_R_SRGB8
The format is emulated by using ISL_FORMAT_L8_SRGB, therefore we need to
force swizzles for the GBA channels. However, doing this only based on the
data type GL_RED breaks other formats, therefore, test specifically for the
format.

Fixes: c5363869d4
  i965: Force zero swizzles for unused components in GL_RED and GL_RG
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-28 10:07:02 +01:00
Gert Wollny
091295d7cb virgl: Don't try handling server fences when they are not supported
vtest doesn't implement the corresponding API and would segfault:

Program received signal SIGSEGV, Segmentation fault.
  #0  0x0000000000000000 in ?? ()
  #1  in virgl_fence_server_sync  at
       src/gallium/drivers/virgl/virgl_context.c:1049
  #2  in st_server_wait_sync  at
       src/mesa/state_tracker/st_cb_syncobj.c:155

so just don't do the call when the function pointers are not set.

Fixes dEQP:
  dEQP-GLES3.functional.fence_sync.wait_sync_smalldraw
  dEQP-GLES3.functional.fence_sync.wait_sync_largedraw

Fixes: d1a1c21e76
  virgl: native fence fd support

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
2018-11-28 10:02:31 +01:00
Gert Wollny
073fdd7382 virgl,vtest: Initialize return value
Avoids:
Conditional jump or move depends on uninitialised value(s)
  at 0x9E2B39F: virgl_vtest_winsys_resource_cache_create (virgl_vtest_winsys.c:379)
  by 0x9E2725F: virgl_buffer_create (virgl_buffer.c:169)
  by 0x9E246D5: virgl_resource_create (virgl_resource.c:60)
  by 0xA0C1B9F: bufferobj_data (st_cb_bufferobjects.c:344)
  by 0xA0C1B9F: st_bufferobj_data (st_cb_bufferobjects.c:390)
  by 0x9F4ACE3: vbo_use_buffer_objects (vbo_exec_api.c:1136)
  by 0xA0C68C3: st_create_context_priv (st_context.c:416)
  by 0xA0C707A: st_create_context (st_context.c:598)
  by 0x9F81C6B: st_api_create_context (st_manager.c:918)
  by 0x9BBE591: dri_create_context (dri_context.c:161)
  by 0x9BB6931: driCreateContextAttribs (dri_util.c:473)
  by 0x4E97A44: drisw_create_context_attribs (drisw_glx.c:630)
  by 0x4E7C591: glXCreateContextAttribsARB (create_context.c:78)
Uninitialised value was created by a stack allocation
  at 0x9E2B249: virgl_vtest_winsys_resource_cache_create (virgl_vtest_winsys.c:342)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
2018-11-28 10:02:31 +01:00
Iago Toral Quiroga
e55cbf26ea intel/compiler: fix register allocation in opt_peephole_sel
This wasn't handling 64-bit cases properly. Found by inspection.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-11-28 08:28:27 +01:00
Matt Turner
6f737b9207 glsl: Remove unused member variable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-27 22:29:53 -08:00
Matt Turner
1a210268b8 nir: Call fflush() at the end of nir_print_shader()
We normally call with stderr which is unbuffered, so this won't affect
that, but it does let me call nir_print_shader(nir, fopen("log", "w+"))
from gdb and actually get the whole shader in my file.
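
To illustrate why the flush matters, here is a small stand-in for a dump
routine. print_report() is a hypothetical name, not a NIR function. A stream
returned by fopen() is normally fully buffered, so without the fflush() a
short dump can sit in the stdio buffer and never reach the file if the stream
is not closed, which is the usual situation when calling from a debugger.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for a dump routine: write the text to fp and
 * flush, so the data is visible in the file even if fp is never closed.
 */
static void print_report(const char *text, FILE *fp)
{
   fputs(text, fp);
   fflush(fp);
}
```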

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-27 22:29:53 -08:00
Eric Anholt
e113b21cb7 v3d: Add renderonly support.
I've been using this with the kmsro series to test v3d on VKMS without my
old KMS hack in the v3d kernel driver.  KMSRO still needs some cleanup,
but v3d RO support seems reasonable.
2018-11-27 15:03:02 -08:00
Eric Anholt
55edafa73e gallium: Remove unused variable in u_tests.
Fixes: 0d17b685b1 ("gallium/u_tests: add a compute shader test that clears an image")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-27 15:02:57 -08:00
Bas Nieuwenhuizen
6569644bb6 radv: Align large buffers to the fragment size.
Improves performance in Talos by about 15% (with significant improvements
in RotR and possibly others, which I did not bench with the final patch) on
kernel 4.19 and earlier.

On 4.20+ a similar effect comes from

433ca054949a "drm/amdgpu: try allocating VRAM as power of two"

v2: Do not impact the alignment of the physical memory.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
2018-11-27 22:17:42 +01:00
Hyunjun Ko
76945e4140 freedreno: implements get_sample_position
Since 1285f71d3e landed, the driver needs to provide apps with proper sample
positions for MSAA.

There is currently no way to query this from the hw, so these are taken from
the blob driver.

Fixes: dEQP-GLES31.functional.texture.multisample.samples_#.sample_position
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Rob Clark
5973a4d0b7 freedreno/a3xx: also set FSSUPERTHREADENABLE
We set equiv bit in SP_FS_CTRL_REG0.  Somehow the hw doesn't hang with
this mismatched config, but does run slower.  It is faster with either
neither bit set, or both bits set, but both is the fastest of the three
configurations.  Worth a bit over 10% gain in glmark2.

Spotted-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
e68cd91251 freedreno: use MSM_BO_SCANOUT with scanout buffers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2018-11-27 15:44:03 -05:00
Jonathan Marek
3ed4aad524 freedreno: use GENERIC instead of TEXCOORD for blit program
blit_fp uses GENERIC as input, so blit_vp should match for linking.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
3a273a4abc freedreno: a2xx texture update
Adds all missing texture related logic. For everything to work it also
needs changes to ir2/fd2_program, which are part of the ir2 update patch.

Note: it needs rnndb update

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
[remove stray patch]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
4887aba638 freedreno/a2xx: Compute depth base in gmem correctly
Note: it needs rnndb update

Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
e7114575f7 freedreno/a2xx: set VIZ_QUERY_ID on a20x
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
a50b8a0152 freedreno: add missing a20x ids
200: 256KiB GMEM A200 (imx53)
201: 128KiB GMEM A200 (imx51)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
4e6ee033ff freedreno/a2xx: fix POINT_MINMAX_MAX overflow
As it stands, it overflows to zero.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
78fede86d9 freedreno: a2xx: fd2_draw update
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Jonathan Marek
3e7186d472 nir: add fceil lowering
lowers ceil(x) as -floor(-x)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
11593f9041 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
d47d77d49d freedreno/a6xx: set guardband clip
On older gens, the CLIP_ADJ bitfields were actually 3.6 fixed point.
Which might make more sense.  Although this formula comes up with values
pretty close to what blob does for various viewport sizes (for at least
a5xx and a6xx), and seems to work.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
2773919f06 freedreno/a6xx: disable LRZ for z32
f6131d4ec7 had the side effect of enabling LRZ w/ 32b depth buffers.
But there are some bugs with this, which aren't fully understood yet,
so for now just skip LRZ w/ z32..

Fixes: f6131d4ec7 freedreno/a6xx: Clear z32 and separate stencil with blitter
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Kristian H. Kristensen
9595be67a9 freedreno/a6xx: Clear gmem buffers at flush time
We generate an IB to clear the gmem at flush time and jump to it
before rendering each tile. This lets us get rid of the command stream
patching for gmem offsets.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Kristian H. Kristensen
b5a9bb28c6 freedreno/a6xx: Move resolve blits to an IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Kristian H. Kristensen
5f068cf3b0 freedreno/a6xx: Move restore blits to IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
09300bbe03 mesa/st: better colormask check for clear fallback
For RGB surfaces (for example) we don't really care that the colormask
is 0x7 instead of 0xf.  This should not trigger clear_with_quad()
slowpath.
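
A sketch of the idea, assuming a 4-bit RGBA channel mask. The helper name and
the surface_channels parameter are hypothetical, not the state tracker's
code. The colormask only has to cover the channels the surface actually has.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: true if the colormask enables every channel that
 * exists in the surface (bit 0 = R, bit 1 = G, bit 2 = B, bit 3 = A).
 */
static bool colormask_covers_surface(uint8_t colormask,
                                     uint8_t surface_channels)
{
   return (colormask & surface_channels) == surface_channels;
}
```

For an RGB surface (channels 0x7), a colormask of 0x7 already counts as
full, so no quad-draw fallback is needed in that case.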

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-27 15:44:02 -05:00
Rob Clark
65cee01430 mesa/st: swap order of clear() and clear_with_quad()
If we can't clear all the buffers with pctx->clear() (say, for example,
because of ColorMask), push the buffers we *can* clear with pctx->clear()
first.  Tilers want to see clears coming before draws to enable fast-
paths, and clearing one of the attachments with a quad-draw first
confuses that logic.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-27 15:44:02 -05:00
Rob Clark
aa0fed10d3 freedreno: move ir3 to common location
Move (most of) the ir3 compiler to src/freedreno/ir3 so that it can be
re-used by some future vulkan driver.  The parts that are gallium
specific have been refactored out and remain in the gallium driver.

Getting the move done now so that it can happen before further
refactoring to support a6xx specific instructions.

NOTE also removes ir3_cmdline compiler tool from autotools build since
that was easier than fixing it and I normally use meson build.  Waiting
patiently for the day that we can remove *everything* from the autotools
build.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
556eec249d freedreno/ir3: remove u_inlines usage
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
312eae45a3 freedreno/ir3: split up ir3_shader
Split the parts that are gallium specific into ir3_gallium so the rest
can move to a common location outside of gallium.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
ea4cbf601d freedreno/ir3: remove pipe_stream_output_info dependency
A bit annoying to have to copy into our own struct.  But this is
something the compiler really needs to know, at least on earlier
generations where streamout is implemented in shader.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
030e98630d freedreno/ir3: some header file cleanup
Clean up some of the low-hanging-fruit usages of freedreno_util.h

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
2482153d52 freedreno/ir3: use env_var_as_unsigned()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
a321f939f6 util: env_var_as_unsigned() helper
So I can drop env2u() helper from freedreno_util.h and get rid of one
small ir3 dependency on gallium/freedreno

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
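A helper of this kind could look roughly like the sketch below. The names `parse_unsigned_or_default()` and `env_as_unsigned_sketch()` are hypothetical, and the parsing rules (base auto-detection, falling back to the default on any malformed input) are assumptions made for illustration rather than a description of the helper added by this commit.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a string as an unsigned integer.  Returns default_value when the
 * string is missing, empty, malformed, or out of range for 'unsigned'.
 */
static unsigned
parse_unsigned_or_default(const char *str, unsigned default_value)
{
   if (str == NULL || *str == '\0')
      return default_value;

   char *end = NULL;
   errno = 0;
   /* Base 0 accepts decimal, 0x-prefixed hex and 0-prefixed octal. */
   unsigned long value = strtoul(str, &end, 0);
   if (errno != 0 || end == str || *end != '\0' || value > UINT_MAX)
      return default_value;

   return (unsigned)value;
}

/* Look the variable up in the environment and parse it as above. */
static unsigned
env_as_unsigned_sketch(const char *var_name, unsigned default_value)
{
   assert(var_name != NULL);
   return parse_unsigned_or_default(getenv(var_name), default_value);
}
```

Splitting the parser from the environment lookup keeps the parsing rules testable without touching the process environment.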
Rob Clark
bfd8d26372 freedreno/ir3: move disasm and optmsgs debug flags
Move them to IR3_SHADER_DEBUG so we can remove ir3's dependency on
fd_mesa_debug.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
424d75656f freedreno: FD_SHADER_DEBUG -> IR3_SHADER_DEBUG
Only used by ir3, so move it into ir3 to be more self contained.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
8a654f092e freedreno: remove shader_stage_name()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
c635703c50 freedreno: shader_t -> gl_shader_stage
Just massive search/replace for the most part.

Step towards removing ir3 dependency on disasm.h which is shared by
a2xx.  One step closer to being able to move ir3 out of gallium.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
388aac32ed freedreno/ir3: standalone compiler updates
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
b4476138d5 freedreno: move drm to common location
So that we can re-use at least parts of it for vulkan driver, and so
that we can move ir3 to a common location (which uses fd_bo to allocate
storage for shaders)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
6cb74eb4f1 freedreno/drm: remove dependency on gallium driver
Prep work to move drm to a common location.

Slightly hacky, but the softpin debug flag is only temporary.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Dylan Baker
88c4680b5a util: promote u_memory to src/util
as well as os_memory*
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Eric Anholt
bade179153 gallium: Fix uninitialized variable warning in compute test.
The compiler doesn't know that ny != 0, so x might be uninitialized for
the printf at the end.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-11-27 11:23:22 -08:00
Bas Nieuwenhuizen
08ea6b9d9b radv: Clamp gfx9 image view extents to the allocated image extents.
Mirrors AMDVLK. Looks like if we go over the alignment of height
we actually start to change the addressing. Seems like the extra
miplevels actually work with this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108245
Fixes: f6cc15dccd "radv/gfx9: fix block compression texture views. (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-27 10:19:52 +01:00
Iago Toral Quiroga
453570cd8c intel/compiler: fix indentation style in opt_algebraic()
2018-11-27 09:53:09 +01:00
Anuj Phogat
16e4911972 anv/icl: Set use full ways in L3CNTLREG
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Anuj Phogat
3f55fd3814 intel/icl: Set way_size_per_bank to 4
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Anuj Phogat
3ce04da5b4 i965/icl: Set use full ways in L3CNTLREG
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Anuj Phogat
3282c7be89 i965/icl: Fix L3 configurations
Use L3 configuration specified in h/w specification.

V2: Drop configs which under-allocate the L3 cache.
    Bump up the comment above table.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Eric Engestrom
c0c533767e build: stop defining unused VERSION
Scons and autotools don't define it, and as of last commit nothing
uses it.

`VERSION` is also a generic enough name that something somewhere will
eventually clash, and we don't want to repeat the LLVM `DEBUG` fiasco.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-26 22:05:02 +00:00
Eric Engestrom
bd12e02530 vulkan/utils: s/VERSION/PACKAGE_VERSION/
Everything else uses PACKAGE_VERSION, so let's be consistent, and
VERSION and PACKAGE_VERSION are currently defined to be the same in
meson and android, while VERSION is undefined in autotools and scons.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-26 22:05:02 +00:00
Eric Engestrom
56d126f8fd anv: correctly use vulkan 1.0 by default
Per chapter 3.2 "Instances":
> Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing
> an apiVersion of 0 is equivalent to providing an apiVersion of
> VK_MAKE_VERSION(1,0,0).

Reported-by: Niklas Haas <git@haasn.xyz>
Fixes: 8c048af589 "anv: Copy the appliation info into the instance"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-26 22:05:02 +00:00
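The defaulting rule quoted from the spec can be sketched as follows. The struct and function names are hypothetical stand-ins (the real types are VkInstanceCreateInfo and VkApplicationInfo), and the macro only mirrors the bit layout of VK_MAKE_VERSION so that the sketch does not need the Vulkan headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION: major << 22 | minor << 12 | patch. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
   ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Hypothetical, trimmed-down stand-ins for the Vulkan structs. */
struct sketch_app_info {
   uint32_t api_version;
};

struct sketch_instance_create_info {
   const struct sketch_app_info *app_info; /* may be NULL */
};

/*
 * A NULL application info or an api_version of 0 both mean
 * "the application targets Vulkan 1.0.0".
 */
static uint32_t
effective_api_version(const struct sketch_instance_create_info *info)
{
   assert(info != NULL);
   if (info->app_info == NULL || info->app_info->api_version == 0)
      return SKETCH_MAKE_VERSION(1, 0, 0);
   return info->app_info->api_version;
}
```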
Erik Faye-Lund
d6d35d87f1 mesa/main: fixup requirements for GL_PRIMITIVES_GENERATED
This enum is also allowed by EXT_tessellation_shader, which is supported
on older i965 HW (as opposed to OES_geometry_shader). This was missed
when narrowing this code-path, leading to dEQP regressions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108868
Fixes: f09d94fbd1 "mesa/main: fix validation of transform-feedback queries"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-11-26 22:12:07 +01:00
Erik Faye-Lund
c120dbfe4d mesa/main: fix incorrect depth-error
If glGetTexImage or glGetnTexImage is called with a level that doesn't
exist, we get an error message of this form:

Mesa: User error: GL_INVALID_VALUE in glGetTexImage(depth = 0)

This is clearly nonsensical, because these APIs don't even have a
depth-parameter. The reason is that get_texture_image_dims() returns
all-zero dimensions for non-existent texture-images, and we go on to
validate these dimensions as if they were user-input, because
glGetTextureSubImage requires checking.

So let's split this logic in two, so glGetTextureSubImage can have
stricter input-validation. All arguments that are no longer validated
are generated internally by mesa, so there's no use in validating them.

Fixes: 42891dbaa1 "gettextsubimage: verify zoffset and depth are correct"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
38af69adfa mesa/main: check cube-completeness in common code
This check is the only part of dimensions_error_check that isn't about
error-checking the offset and size arguments of
glGet[Compressed]TextureSubImage(), so it doesn't really belong in here.

This doesn't make a difference right now, apart from changing the
precedence of this error. But it will make a difference for the next
patch, where we no longer call this method from the non-sub tex-image
getters.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
42820c5727 mesa/main: factor out common error-checking
This error checking is the same for teximage and texsubimage getters, so
let's factor it out to its own function.

This will be useful when getteximage and gettexsubimage get their own
error checking routines a bit later.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
5e0a84f31c mesa/main: factor out tex-image error-checking
This will be useful when we split error-checking for getteximage and
gettexsubimage later.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
38bbb61252 mesa/main: remove bogus error for zero-sized images
The explanation quotes the spec on the following wording to justify the
error:

"An INVALID_VALUE error is generated if xoffset + width is greater than
 the texture’s width, yoffset + height is greater than the  texture’s
 height, or zoffset + depth is greater than the texture’s depth."

However, this shouldn't generate an error in the case where *all three*
of width, xoffset and the texture's width are zero. In this case, we end
up generating an unspecified error.

So let's remove this check, and instead make sure that we consider this
as an empty texture.

So let's not generate an error; there's none mandated in the spec in the
xoffset/yoffset/zoffset = 0 case. We already avoid doing any work in
this case, because of the final, non-error generating check in this
function.

Fixes: b37b35a5d2 "getteximage: assume texture image is empty for non defined levels"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
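The spec wording quoted in this commit reduces to a per-axis comparison, sketched below. `range_exceeds_extent()` is a hypothetical name used only for illustration; it is not the function touched by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True when [offset, offset + size) reaches past the texture extent on one
 * axis, i.e. the condition for which the spec mandates INVALID_VALUE.
 * The sum is widened to long long so it cannot overflow.
 */
static bool
range_exceeds_extent(int offset, int size, int extent)
{
   assert(size >= 0 && extent >= 0);
   return (long long)offset + (long long)size > (long long)extent;
}
```

When offset, size and extent are all zero the comparison is 0 > 0, which is false, so the spec text alone gives no reason to raise an error for an empty image.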
Erik Faye-Lund
f1998e15ff mesa/main: remove ARB suffix from glGetnTexImage
This function has been core since OpenGL 4.3, so naming the
implementation and reporting errors using an ARB-suffix can be
confusing.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Gert Wollny
f5d053702f glsl: free or reuse memory allocated for TF varying
When a shader program is de-serialized the gl_shader_program passed in
may actually still hold memory allocations for the transform feedback
varyings. If that is the case, free the varying names and reallocate
the new storage for the names array.

This fixes a memory leak:
Direct leak of 48 byte(s) in 6 object(s) allocated from:
 in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
 in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:875
 in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985
 ...
Indirect leak of 42 byte(s) in 6 object(s) allocated from:
  in __interceptor_strdup (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0x761c8)
  in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:887
  in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985

Fixes: ab2643e4b0
   glsl: serialize data from glTransformFeedbackVaryings

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-26 09:58:25 +01:00
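The free-before-reallocate pattern this fix describes can be sketched as below. The struct and function names are hypothetical and much simpler than the real gl_shader_program fields; the point is only that existing names are released before new storage is installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the varying-name storage on a program. */
struct varying_names {
   unsigned count;
   char **names;
};

/* Release every stored name and the array that holds them. */
static void
varying_names_clear(struct varying_names *v)
{
   assert(v != NULL);
   for (unsigned i = 0; i < v->count; i++)
      free(v->names[i]);
   free(v->names);
   v->names = NULL;
   v->count = 0;
}

/* Replace the stored names; old allocations are freed first. */
static bool
varying_names_set(struct varying_names *v, unsigned count,
                  const char *const *src)
{
   varying_names_clear(v);
   if (count == 0)
      return true;

   v->names = calloc(count, sizeof(*v->names));
   if (v->names == NULL)
      return false;

   for (unsigned i = 0; i < count; i++) {
      size_t len = strlen(src[i]) + 1;
      v->names[i] = malloc(len);
      if (v->names[i] == NULL) {
         v->count = i; /* only the first i entries are valid */
         varying_names_clear(v);
         return false;
      }
      memcpy(v->names[i], src[i], len);
   }
   v->count = count;
   return true;
}
```

Calling the setter twice on the same object no longer leaks the first set of names, which is the situation the ASAN report above points at.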
Bas Nieuwenhuizen
3c96a1e3a9 radv: Fix opaque metadata descriptor last layer.
We used the layer count which results in an off by one error.

Not sure this really affects anything.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-26 09:29:39 +01:00
Mathias Fröhlich
ff466c2d48 mesa/st: Make st_pipe_vertex_format static.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:10 +01:00
Mathias Fröhlich
2a3eae82a1 mesa/st: Use binding information from the VAO in feedback rendering.
Use VAO binding information in feedback rendering. In theory
it should reduce the number of buffer objects scheduled for rendering.
Feedback rendering is implemented in a crude way anyhow, so I do not
expect much gain here. But for the sake of code reuse we should
use the same code for the same task. And finally, if feedback rendering
gets improved, the array setup is already done properly there.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:10 +01:00
Mathias Fröhlich
a00a8fb8d1 mesa/st: Avoid extra references in the feedback draw function scope.
The change removes the reference that is held on the entries of the
vbuffers[] array. The new code no longer takes it because, following
the code into draw_set_vertex_buffers(), the draw context holds
another reference until it is reset again further down the function.
By that argument it should already be safe to remove that
additional reference count.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:10 +01:00
Mathias Fröhlich
6705188cc5 mesa/st: Factor out array and buffer setup from st_atom_array.c.
Factor out vertex array setup routines from the array state atom.
The factored functions will be used in feedback rendering in the
next change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:09 +01:00
Mathias Fröhlich
774d585d49 mesa/st: Only unmap the uploader that was actually used.
In st_atom_array, we only need to unmap the upload buffer that
was actually used.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:09 +01:00
Mathias Fröhlich
65332aff29 mesa/st: Only care about the uploader if it was used.
In st_atom_array, we only need to care for unmapping the upload buffer
if we actually used it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:09 +01:00
Ilia Mirkin
927ce66b39 nv50/ir: remove dnz flag when converting MAD to ADD due to optimizations
dnz flag only applies for multiplications (e.g. to make 0 * Infinity
become 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz
flag no longer makes sense, and upsets the GM107 emitter (since it looks
at the ftz and dnz flags together).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-24 22:15:53 -05:00
Marek Olšák
d4e7d8b7f0 winsys/amdgpu: fix a device handle leak in amdgpu_winsys_create
Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 17:08:44 -05:00
Marek Olšák
82aa07f81f winsys/amdgpu: fix a buffer leak in amdgpu_bo_from_handle
Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 17:08:42 -05:00
Samuel Pitoiset
9fc1ce258c radv: ignore subpass self-dependencies for CreateRenderPass() too
We really need to refactor this...

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 11:59:11 +01:00
Samuel Pitoiset
2951a766bd radv: remove useless sync before CmdClear{Color,DepthStencil}Image()
We don't need to flush anything before these two commands as well.
This is because they have to be externally synchronized, so the
app should have called CmdPipelineBarrier() prior to that and the
driver should have flushed the caches.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 11:59:08 +01:00
Erik Faye-Lund
a652842982 mesa/main: remove overly strict query-validation
The rules encoded in this code also apply to OpenGL ES 3.0 and up,
but the per-enum validation has already been taught about these rules.
So let's get rid of this duplicate, narrow version of the validation.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
d52be6dd29 mesa/main: fix validation of GL_TIMESTAMP
ctx->Extensions.ARB_timer_query is set based on the driver-
capabilities, not based on the context type. We need to check
against _mesa_has_ARB_timer_query(ctx) instead to figure out
if the extension is really supported. We also need to check for
EXT_disjoint_timer_query for GLES-support.

This shouldn't have any functional effect, as this entry-point is only
valid on desktop GL, or on GLES with EXT_disjoint_timer_query in the
first place. But if this gets added to the core of a future version
of ES, this should be a step in the right direction.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
7a4d74c35a mesa/main: fix validation of ARB_query_buffer_object
ctx->Extensions.ARB_query_buffer_object is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_ARB_query_buffer_object(ctx) instead to figure out if the
extension is really supported.

This turns attempts to read queries into buffer objects on ES 3 into
errors, as required by the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
75e39b59dc mesa/main: fix validation of transform-feedback overflow queries
ctx->Extensions.ARB_transform_feedback_overflow_query is set based on
the driver-capabilities, not based on the context type. We need to
check against _mesa_has_ARB_transform_feedback_overflow_query(ctx)
instead to figure out if the extension is really supported.

This turns usage of GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW and
GL_TRANSFORM_FEEDBACK_OVERFLOW into errors on ES 3, as required by the
spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
f09d94fbd1 mesa/main: fix validation of transform-feedback queries
ctx->Extensions.EXT_transform_feedback is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_EXT_transform_feedback(ctx) instead to figure out if the
extension is really supported. We also need to check for
OES_geometry_shader.

This turns usage of GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN into an
error on ES 2, as well as usage of GL_PRIMITIVES_GENERATED on ES 3, both
as required by the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
b551fe5fa7 mesa/main: fix validation of GL_TIME_ELAPSED
ctx->Extensions.EXT_timer_query is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_EXT_timer_query(ctx) instead to figure out if the extension
is really supported. We also need to check for
EXT_disjoint_timer_query, which enables the same functionality for ES.

This turns usage of GL_TIME_ELAPSED into an error on ES 3, as is
required by the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
059928e114 mesa/main: fix validation of GL_ANY_SAMPLES_PASSED_CONSERVATIVE
ctx->Extensions.ARB_ES3_compatibility is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_ARB_ES3_compatibility(ctx) instead to figure out if the
extension is really supported.

In addition, EXT_occlusion_query_boolean should also allow this
behavior.

This shouldn't cause any functional change, as all drivers that support
ES3_compatibility should in practice enable either ES3_compatibility or
EXT_occlusion_query_boolean under all APIs that export this symbol.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
8ea819dd60 mesa/main: fix validation of GL_ANY_SAMPLES_PASSED
ctx->Extensions.ARB_occlusion_query2 is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_ARB_occlusion_query2(ctx) instead to figure out if the
extension is really supported.

In addition, EXT_occlusion_query_boolean should also allow this
behavior.

This shouldn't cause any functional change, as all drivers that support
ARB_occlusion_query2 should in practice enable either
ARB_occlusion_query2 or EXT_occlusion_query_boolean under all APIs that
export this symbol.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
fff1738d57 mesa/main: fix validation of GL_SAMPLES_PASSED
ctx->Extensions.ARB_occlusion_query is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_ARB_occlusion_query(ctx) instead to figure out if the
extension is really supported. We also need to check for
ARB_occlusion_query2, as ARB_occlusion_query isn't available in core
contexts.

This turns usage of GL_SAMPLES_PASSED into an error on ES 3, as is
required by the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
9c13ad0ea4 mesa/main: simplify pipeline-statistics query validation
The _mesa_has_ARB_pipeline_statistics_query(ctx)-helper will already
check the GLES-version according to the extension-table, so if this
extension would ever be back-ported to ES, we only need to update the
table to support this.

This shouldn't have any functional effect.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
dd4241b34f mesa/main: use non-prefixed enums for consistency
These enums all have the same values as their non-prefixed versions, and
there are several aliases for some of them. So let's switch to the
non-prefixed versions for simplicity.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
ba4e8d3754 mesa/main: correct year for EXT_occlusion_query_boolean
According to the extension spec, this was initially released in 2011,
so let's set this to the correct value.

The value of 2001 could be a copy-paste mistake, as ARB_occlusion_query
which this is based on was released then.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
35555b08d7 mesa/main: correct requirement for EXT_occlusion_query_boolean
EXT_occlusion_query_boolean requires support for GL_ANY_SAMPLES_PASSED,
which ARB_occlusion_query doesn't supply. We need ARB_occlusion_query2
for this instead.

This is still not 100% accurate, as we also require support for the
GL_ANY_SAMPLES_PASSED_CONSERVATIVE target, which isn't guaranteed by
either ARB_occlusion_query or ARB_occlusion_query2. But it should be
trivial to implement for any driver supporting ARB_occlusion_query2, as
it can simply be implemented as GL_ANY_SAMPLES_PASSED.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Tapani Pälli
09adaa4b89 anv: allow exporting an imported SYNC_FD semaphore type
Fixes issues with following SkQP tests:

   unitTest_VulkanHardwareBuffer_Vulkan_EGL_Syncs
   unitTest_VulkanHardwareBuffer_Vulkan_Vulkan_Syncs

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-23 07:49:46 +02:00
Eric Engestrom
896c59d690 glapi: add missing visibility args
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108829
Fixes: 3218056e0e "meson: Build i965 and dri stack"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-22 18:21:05 +00:00
Jason Ekstrand
a24654b49d anv/nir: Rework arguments to apply_pipeline_layout
Instead of taking a whole pipeline (which could be anything!), just take
a physical device and robust_buffer_access boolean.  This makes it
easier to verify that only the things in the hash actually affect
pipeline compilation.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-22 09:17:28 -06:00
Jason Ekstrand
617e402b3d anv: Put robust buffer access in the pipeline hash
It affects apply_pipeline_layout.  Shaders compiled with the wrong value
will work but they may not be robust as requested by the app.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-22 09:17:10 -06:00
Jason Ekstrand
a845c2bc10 anv: Expose VK_EXT_scalar_block_layout
Our compiler already splits UBO loads into scalars and the untyped
surface read messages we use for SSBO reads and writes only require
dword alignment.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-22 08:16:47 -06:00
Jason Ekstrand
2ca9a4417d vulkan: Update the XML and headers to 1.1.93
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-22 08:16:40 -06:00
Samuel Pitoiset
4ff4af3d91 radv: remove useless sync after CmdClear{Color,DepthStencil}Image()
'post_flush' is only set to NULL for the normal clear path
(ie. only vkCmdClearColorImage() and vkCmdClearDepthStencilImage()
are affected commands).

Because these two operations have to be externally synchronized
with VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT,
it's useless to set those flags internally.

VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle,
while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector
caches and L2. RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2 will be superseded
by RADV_CMD_FLAG_INV_GLOBAL_L2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-22 08:56:36 +01:00
Bas Nieuwenhuizen
33b2f74e77 vulkan: Allow storage images in the WSI.
Since apps also have to follow the ImageFormatProperties query,
we can disallow formats that don't allow image stores (for AMD
that would be SRGB formats).

Note that this only affects anything if the app actually decides
to use the flag.

Had someone ask for this on IRC and at least on the AMD side we
can support it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-21 21:36:55 +01:00
Axel Davy
1f1d4d571a st/nine: Remove thread_submit warning
thread_submit can be useful even without DRI_PRIME,
as it can help avoid missed pageflips.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2018-11-21 19:55:28 +01:00
Axel Davy
d304f0aa31 st/nine: Allow 'triple buffering' with thread_submit
The path allowing triple buffering behaviour wasn't implemented
yet for thread_submit

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2018-11-21 19:55:28 +01:00
Robert Foss
19af208c7d virgl: add assert and missing function parameter
Verify the pipe_fd_type to be of PIPE_FD_TYPE_NATIVE_SYNC.

Fixes: d1a1c21e76 "virgl: native fence fd support"

Suggested-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-21 15:59:00 +01:00
Gert Wollny
61b535437e r600: clean up the GS ring buffers when the context is destroyed
This fixes two memory leaks reported by ASAN:

Direct leak of 248 byte(s) in 1 object(s) allocated from:
   in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
   in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578
   in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600
   in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265
   in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:725
   in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291
   in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1482

Direct leak of 248 byte(s) in 1 object(s) allocated from:
   in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
   in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578
   in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600
   in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265
   in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:722
   in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291
   in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1489

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Fixes: 1371d65a7f
  r600g: initial support for geometry shaders on evergreen (v2)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-11-21 10:34:17 +01:00
Samuel Pitoiset
4b9bc4791b radv: only sync CP DMA for transfer operations or bottom pipe
CP DMA can only be busy when the driver copies buffers. The
only affected Vulkan commands are vkCmdCopyBuffer() and
vkCmdUpdateBuffer() (because we fall back to a copy depending on
a threshold). Clear operations are currently not concerned
because the driver always syncs after the last DMA operation.

Per the spec, these two operations have to be externally
synchronized with VK_PIPELINE_STAGE_TRANSFER_BIT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 10:03:01 +01:00
Samuel Pitoiset
457ac6ce1e radv: ignore subpass self-dependencies
Unnecessary as they allow the app to call vkCmdPipelineBarrier()
inside the render pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 10:02:59 +01:00
Iago Toral Quiroga
8e73b57634 Revert "nir/builder: Assert that intN_t immediates fit"
This reverts commit 1f29f4db1e.

For this to work the compiler must ensure that it never puts
the values that arrive to this helper into unsigned variables
at any point in its processing, since that would not apply sign
extension to the value and it would break the expectations here.
Unfortunately, we use uint64_t extensively to pass and copy
things around, so sometimes we get to this helper with values
that are not properly sign extended to 64-bit. Here is an example
for an 8-bit value that comes from a switch case:

(gdb) p /x x
$1 = 0xffffffd6

The value seems to have been properly sign extended to 32-bit at
some point, but then copied into a uint64_t, which won't apply
sign extension, breaking the expectations of the assertion.
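
A minimal sketch of the failure mode described above, in plain C. The
helper names (widen_via_u32, widen_signed, fits_signed) are hypothetical
and are not the NIR builder code; fits_signed only illustrates the kind
of range check such an assertion performs, and assumes bit_size < 64.

```c
#include <stdint.h>

/* Hypothetical: the value passes through an unsigned 32-bit temporary.
 * It is sign extended to 32 bits, then zero extended to 64 bits. */
static uint64_t widen_via_u32(int8_t v)
{
    uint32_t tmp = (uint32_t)(int32_t)v; /* 0xffffffd6 for v == -42 */
    return tmp;                          /* 0x00000000ffffffd6 */
}

/* Hypothetical: the value stays signed all the way to 64 bits. */
static uint64_t widen_signed(int8_t v)
{
    return (uint64_t)(int64_t)v;         /* 0xffffffffffffffd6 for v == -42 */
}

/* Illustrative range check: does x, read as a signed 64-bit value,
 * fit in a signed integer of bit_size bits? Assumes bit_size < 64. */
static int fits_signed(uint64_t x, unsigned bit_size)
{
    int64_t s = (int64_t)x;
    int64_t min = -(INT64_C(1) << (bit_size - 1));
    int64_t max = (INT64_C(1) << (bit_size - 1)) - 1;
    return s >= min && s <= max;
}
```

With these helpers, the 8-bit value -42 passes the check when widened
with widen_signed() and fails it when widened with widen_via_u32(),
which matches the 0xffffffd6 value seen in gdb.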

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-21 08:12:50 +01:00
Iago Toral Quiroga
387888e3b7 nir/from_ssa: fix bit-size of temporary register
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-21 08:07:22 +01:00
Mathias Fröhlich
2d3c466add mesa: Remove unneeded bitfield widths from the VAO.
With the current VAO layout we do not need to make these
fields a bitfield. We get a tight struct layout with this change
for VAO attributes.

v2: Change unsigned char -> GLubyte.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
0a7020b4e6 mesa: Factor out struct gl_vertex_format.
Factor out struct gl_vertex_format from array attributes.
The data type is supposed to describe the type of a vertex
element. At this current stage the data type is only used
with the VAO, but actually is useful in various other places.
Due to the bitfields being used, special care needs to be
taken for the glGet code paths.

v2: Change unsigned char -> GLubyte.
    Use struct assignment for struct gl_vertex_format.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
2da7b0a2fb tnl: Use gl_array_attribute::_ElementSize.
Instead of open coding the size computation, use the
already available gl_array_attribute::_ElementSize value.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
a4c01839c2 nouveau: Use gl_array_attribute::_ElementSize.
Instead of open coding the size computation, use the
already available gl_array_attribute::_ElementSize value.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
182ed6de8c mesa: Unify glEdgeFlagPointer data type.
Use GL_UNSIGNED_BYTE as initialization data type
for the edge flag vertex attribute array. The same datatype
is used in the glEdgeFlagPointer function when setting the
array pointer.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
1b743e2966 mesa: Work with bitmasks when en/dis-abling VAO arrays.
For enabling or disabling VAO arrays it is now possible to
change a set of arrays with a single call without the need to
iterate the attributes.
Make use of this technique in the vao module.
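
A rough sketch of the idea, assuming a 32-bit enable mask. The struct
and function names here (vao_sketch, enable_arrays, disable_arrays) are
hypothetical and do not mirror the actual Mesa functions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the VAO: one bit per vertex attribute. */
struct vao_sketch {
    uint32_t enabled;
};

/* Enable every attribute in mask with one update.
 * Returns the bits that actually changed state. */
static uint32_t enable_arrays(struct vao_sketch *vao, uint32_t mask)
{
    uint32_t changed = mask & ~vao->enabled;
    vao->enabled |= changed;
    return changed;
}

/* Disable every attribute in mask with one update.
 * Returns the bits that actually changed state. */
static uint32_t disable_arrays(struct vao_sketch *vao, uint32_t mask)
{
    uint32_t changed = mask & vao->enabled;
    vao->enabled &= ~changed;
    return changed;
}
```

Returning the changed bits lets a caller skip state invalidation when
the call was a no-op.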

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
3c46fa5988 mesa: Remove gl_array_attributes::Enabled.
Now that all users go via the VAO Enabled bitfield,
get rid of the Enabled boolean.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
093aeb3565 mesa: Use gl_vertex_array_object::Enabled for glGet.
Instead of using gl_array_attributes::Enabled use the
much more compact representation stored in
gl_vertex_array_object::Enabled using the corresponding bits.
Keep the glGet changes in a separate patch at least for review.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
1217a8448c mesa: Use the gl_vertex_array_object::Enabled bitfield.
Instead of using gl_array_attributes::Enabled use the
much more compact representation stored in
gl_vertex_array_object::Enabled using the corresponding bits.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
73d2d313e9 mesa: Rename gl_vertex_array_object::_Enabled -> Enabled.
Mark the up to now derived bitfield value now as primary
value by removing the underscore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Marek Olšák
ea9f95e2a6 radeonsi: go back to using bottom-of-pipe for beginning of TIME_ELAPSED
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102597

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-20 21:18:48 -05:00
Marek Olšák
6c1a34d2e7 radeonsi: don't send data after write-confirm with BOTTOM_OF_PIPE_TS
There are no writes.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-20 21:18:46 -05:00
Marek Olšák
bc5adc27b5 st/mesa: pin driver threads to a fixed CCX when glthread is enabled
radeonsi has 3 driver threads (glthread, gallium, winsys), other drivers
may have 2 (glthread, gallium), so it makes sense to pin them to a random
CCX and keep that irrespective of the app thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-20 21:18:43 -05:00
Marek Olšák
48f2160936 st/mesa: regularly re-pin driver threads to the CCX where the app thread is
This is used when glthread is disabled.

Mesa pretty much chases the app thread on the CPU.
The performance is the same as pinning the app thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-20 21:18:30 -05:00
Marek Olšák
ce7f84eb77 drirc: enable glthread for Talos Principle
Ryzen 1700X, Vega 56, 1600x900, 4xAA: improvement +4.4%

Immediate mode was needed.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-20 21:17:42 -05:00
Marek Olšák
7f1cac7ba6 mesa/glthread: enable immediate mode
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-20 21:17:41 -05:00
Marek Olšák
247d5a8e94 mesa/glthread: pass the function name to _mesa_glthread_restore_dispatch
If you insert printf there, you'll know why glthread was disabled.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-20 21:17:38 -05:00
Marek Olšák
25d95ed535 gallium/u_tests: fix MSVC build by using old-style zero initializers
2018-11-20 19:06:40 -05:00
Kenneth Graunke
562448b75a i965: Do NIR shader cloning in the caller.
This moves nir_shader_clone() to the driver-specific compile function,
rather than the shared src/intel/compiler code.  This allows i965 to do
key-specific passes before calling brw_compile_*.  Vulkan should not
need this cloning as it doesn't compile multiple variants.

We do need to continue cloning in the compute shader code because we
lower various things in NIR based on the SIMD width.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-11-20 15:53:46 -08:00
Kenneth Graunke
6a10dd08f4 i965: Use a 'nir' temporary rather than poking at brw_program
It's shorter and will also be useful when I adjust cloning soon.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-11-20 15:53:46 -08:00
Marek Olšák
0d17b685b1 gallium/u_tests: add a compute shader test that clears an image
2018-11-20 18:50:48 -05:00
Dave Airlie
3486fe655a ac: handle cast derefs
Just give back the same value for now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:46 +10:00
Dave Airlie
baa4bdd3a6 radv: handle loading from shared pointers
We won't have a var to load from, so don't try to do the processing
required if we don't need it.

This avoids crashes in:
dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.workgroup_two_buffers

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:42 +10:00
Dave Airlie
ec9fe8abc7 ac: avoid casting pointers on bcsel and stores
For variable pointers we really don't want to cast the pointers to int
without a good reason, just add a wrapper for bcsel loading and result
storing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:25 +10:00
Dylan Baker
a999798daa meson: Add tests to suites
Meson test has a concept of suites, which allow tests to be grouped
together. This allows for only a subset of tests to be run (say only
the tests for nir). A test can be added to more than one suite, but for
the most part I've only added a test to a single suite, though I've
added a compiler group that includes nir, glsl, and glcpp tests.

To use this you'll need to invoke meson test directly, instead of ninja
test (which always runs all targets). It can be invoked as:
`meson test -C builddir --suite $suitename` (meson test has additional
options that are pretty useful).

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-20 09:09:22 -08:00
Andrii Simiklit
b787dcf57b i965/batch: avoid reverting batch buffer if saved state is an empty
There's no point reverting to the last saved point if that save point is
the empty batch, we will just repeat ourselves.

v2: Merge with new commits, changes were minimized, added the 'fixes' tag
v3: Added in to patch series
v4: Fixed the regression which was introduced by this patch
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630
    Reported-by:  Mark Janes <mark.a.janes@intel.com>
    The solution provided by: Jordan Justen <jordan.l.justen@intel.com>

CC: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 3faf56ffbd "intel: Add an interface for saving/restoring
                     the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630 (fixed in v4)
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-20 06:33:43 -08:00
Emil Velikov
982e012b3a travis: adding missing x11-xcb for meson+vulkan
Required by the x11 WSI

Fixes: df82012b2c ("travis: add meson build for vulkan drivers.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-20 11:16:46 +00:00
Emil Velikov
5bc509363b glx: make xf86vidmode mandatory for direct rendering
Currently we detect the module and if missing, the glXGetMsc* API is
effectively a stub, always returning false.

This is what effectively has been happening with our meson build :-(

Thus users have no chance of using it - they cannot even distinguish
if the failure is due to a misconfigured build.

There's no reason for keeping xf86vidmode optional - it has been
available in all distributions for years.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: a47c525f32 "meson: build glx"
2018-11-20 11:13:20 +00:00
Emil Velikov
84445a86d1 travis: drop unneeded x11proto-xf86vidmode-dev
The only place where the package is needed is for building the DRI
based libGL library.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-20 11:13:20 +00:00
Samuel Pitoiset
f4563d8f5b ac/nir: fix intrinsic name string size in visit_image_atomic()
Fixes an assertion in SoTTR.

Fixes: dd0172e865 ("radv: Use structured intrinsics instead of indexing workaround for GFX9.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-20 10:23:45 +01:00
Bas Nieuwenhuizen
dd0172e865 radv: Use structured intrinsics instead of indexing workaround for GFX9.
These force the index to be used in the instruction so we don't need the
workaround.

Totals:
SGPRS: 1321642 -> 1321802 (0.01 %)
VGPRS: 943664 -> 943788 (0.01 %)
Spilled SGPRs: 28468 -> 28480 (0.04 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 52415292 -> 52338932 (-0.15 %) bytes
LDS: 400 -> 400 (0.00 %) blocks
Max Waves: 233903 -> 233803 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 238344 -> 238504 (0.07 %)
VGPRS: 232732 -> 232856 (0.05 %)
Spilled SGPRs: 13125 -> 13137 (0.09 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 15752712 -> 15676352 (-0.48 %) bytes
LDS: 139 -> 139 (0.00 %) blocks
Max Waves: 31680 -> 31580 (-0.32 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-19 23:36:00 +01:00
Kenneth Graunke
0990168642 i965: Allow only one slot of clip distances to be set on Gen4-5.
The existing backend code assumed that if VARYING_SLOT_CLIP_DIST0
was written, then VARYING_SLOT_CLIP_DIST1 would be as well.  That's
true with the current lowering, but not necessary if there are 4 or
fewer clip distances.  Separate out the checks to allow this.
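
As a small illustration of why the second slot is optional: each clip
distance slot is a vec4, so the number of slots needed is the distance
count divided by four, rounded up. The helper name clip_dist_slots is
hypothetical and is not part of the backend.

```c
/* Hypothetical helper: number of vec4 clip distance slots needed
 * for num_clip_distances scalar distances (four per slot). */
static unsigned clip_dist_slots(unsigned num_clip_distances)
{
    return (num_clip_distances + 3) / 4;
}
```

With 4 or fewer distances only one slot is needed, so only
VARYING_SLOT_CLIP_DIST0 has to be written.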

The new NIR-based lowering will trigger this case, which would have
caused backend validation errors (src is null) without this patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:33:16 -08:00
Kenneth Graunke
5b682143da nir: Make nir_lower_clip_vs optionally work with variables.
The way nir_lower_clip_vs() works with store_output intrinsics makes a
ton of assumptions about the driver_location field.

In i965 and iris, I'd rather do this lowering early and work with
variables.  v3d may want to switch to that as well, and ir3 could too,
but I'm not sure exactly what would need updating.  For now, handle
both methods.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:33:16 -08:00
Kenneth Graunke
d0f746b645 nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.
I'll want the variables in the next patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:33:16 -08:00
Kenneth Graunke
63c8696874 nir: Inline lower_clip_vs() into nir_lower_clip_vs().
It's now called exactly once, and there's not really any distinction.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:33:14 -08:00
Kenneth Graunke
bfa789aceb nir: Use nir_shader_get_entrypoint in nir_lower_clip_vs().
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:31:20 -08:00
Dave Airlie
c8a35285f0 nir: handle shared pointers in lowering indirect derefs.
Check if the base ends up with no variable, and continue
if we see that case outside the loop.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-20 05:36:52 +10:00
Dave Airlie
760859cac2 nir: move getting deref from var after we check deref type.
I posted a load of hacks before to do this, Jason suggested this,
just check the deref mode, not the variable mode and delay getting
the variable until we know the type.

This avoids crashes when dereferencing shared memory pointers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-20 05:36:38 +10:00
Dave Airlie
2f4f5a5055 spirv/vtn: handle variable pointers without offset lowering
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-20 05:36:16 +10:00
Jason Ekstrand
dca35c598d intel/fs,vec4: Fix a compiler warning
../src/intel/compiler/brw_fs_nir.cpp:3534:46: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
       assert(nir_intrinsic_write_mask(instr) ==
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
              (1 << instr->num_components) - 1);
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This was caused by 6339aba775 which added these completely valid
checks.  However clang likes to complain about signedness mismatches.
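
One common way to silence this class of warning is to build the mask
from an unsigned literal so both sides of the comparison are unsigned.
This is only a sketch of that approach, not necessarily the exact change
made here; full_write_mask is a hypothetical helper and assumes
num_components < 32.

```c
/* Hypothetical helper: mask with the low num_components bits set.
 * Using 1u keeps the whole expression unsigned, so comparing it
 * against an unsigned write mask does not trigger -Wsign-compare.
 * Assumes num_components < 32. */
static unsigned full_write_mask(unsigned num_components)
{
    return (1u << num_components) - 1;
}
```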

Fixes: 6339aba775 "intel/compiler: Lower SSBO and shared..."
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-11-19 09:57:41 -06:00
Jason Ekstrand
060817b2fa intel,nir: Move gl_LocalInvocationID lowering to nir_lower_system_values
It's not at all intel-specific; the formula is dictated by OpenGL and
Vulkan.  The only intel-specific thing is that we need the lowering.  As
a nice side-effect, the new version is variable-group-size ready.
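
For reference, a sketch of the relationship the APIs define between the
flat local invocation index and the 3D local invocation ID, written as
plain C rather than NIR. The type and function names (uvec3,
local_id_from_index, local_index_from_id) are hypothetical.

```c
/* Hypothetical 3-component unsigned vector. */
struct uvec3 {
    unsigned x, y, z;
};

/* Recover the 3D local invocation ID from the flat index,
 * given the workgroup size. */
static struct uvec3 local_id_from_index(unsigned index, struct uvec3 size)
{
    struct uvec3 id;
    id.x = index % size.x;
    id.y = (index / size.x) % size.y;
    id.z = (index / (size.x * size.y)) % size.z;
    return id;
}

/* Inverse mapping: index = z * size.x * size.y + y * size.x + x. */
static unsigned local_index_from_id(struct uvec3 id, struct uvec3 size)
{
    return id.z * size.x * size.y + id.y * size.x + id.x;
}
```

Because the size is an ordinary runtime value in this form, nothing in
the formula depends on a compile-time workgroup size.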

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-11-19 09:57:41 -06:00
Eric Engestrom
486091bc00 gbm: add missing comma between strings
Fixes: d971a4230d "loader: Factor out the common driver
                              opening logic from each loader."
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 15:50:56 +00:00
Samuel Pitoiset
724107553c radv: implement fast HTILE clears for depth or stencil only on GFX9
This allows fast clearing of the depth part (or the stencil part)
of a depth+stencil surface when HTILE is enabled. I didn't test
on GFX8, so it's disabled currently.

This gives a very nice boost, for example when clearing the depth
aspect of a 4096x4096 D32_SFLOAT_S8_UINT image (18x faster).

BEFORE: 235 us
AFTER: 13 us

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:18 +01:00
Samuel Pitoiset
7dcddbe54d radv: rewrite the condition that checks allowed depth/stencil values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:16 +01:00
Samuel Pitoiset
9133bbf186 radv: check allowed fast HTILE clears a bit earlier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:14 +01:00
Samuel Pitoiset
193ad4748b radv: add radv_is_fast_clear_{depth,stencil}_allowed() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:12 +01:00
Samuel Pitoiset
c7e142ed78 radv: add radv_get_htile_fast_clear_value() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:10 +01:00
Samuel Pitoiset
6f3fbcc041 radv: remove unnecessary goto in the fast clear paths
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:08 +01:00
Samuel Pitoiset
36006e3cec radv/winsys: remove the max IBs per submit limit for the sysmem path
This path will be eventually improved later but as it's only
used on SI (or with RADV_DEBUG=noibs), I'm not sure if that
matters much.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:06 +01:00
Samuel Pitoiset
4d30f2c6f4 radv/winsys: remove the max IBs per submit limit for the fallback path
The chained submission is the fastest path and it should now
be used more often than before. This removes some EOP events.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:04 +01:00
Lucas Stach
8ca8a6a7b1 etnaviv: use dummy RT buffer when rendering without color buffer
At least GC2000 seems to push some dirt from the PE color cache into
the last bound render target when drawing depth only. Newer cores
seem to behave properly and don't do this, but I have found no way
to fix it on GC2000. Flushes and stalls don't seem to make any
difference.

In order to stop the core from pushing the dirt into a precious real
render target, plug in a dummy buffer when rendering without a color
buffer.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-11-19 15:48:10 +01:00
Dave Airlie
8706204074 virgl: fix vtest regression since fencing changes.
The in_fence_fd needs to be initialised to -1.

Fixes: d1a1c21e7 (virgl: native fence fd support)

Reviewed-by: Robert Foss <robert.foss@collabora.com>
2018-11-19 15:33:19 +01:00
Samuel Pitoiset
55c75d2b49 radv: always clear the FCE predicate after DCC/FMASK/CMASK decompressions
DCC and FMASK also imply a fast-clear eliminate, so it should be
safe to reset the predicate unconditionally. We still only skip
FMASK or CMASK decompressions for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 14:05:35 +01:00
Samuel Pitoiset
483a28bfd4 radv: tidy up radv_set_dcc_need_cmask_elim_pred()
This is just a small cleanup.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 14:05:33 +01:00
Nicolai Hähnle
46a59ce026 radeonsi: fix an out-of-bounds read reported by ASAN
We read 4 values out of sample_locs_8x, so make sure the array is
big enough.

Fixes: ac76aeef20 ("radeonsi: switch back to standard DX sample positions")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-19 11:16:35 +01:00
Gert Wollny
d174cbccfa r600: Only set context streamout strides info from the shader that has outputs
With 5d517a streamout info is only attached to the shader for which the
transform feedback is actually recorded, but the driver set the context info
with each state submitted, thereby always using the info data that was
attached to the vertex shader.

Pass the streamout stride info to the context only from the shader that
actually has outputs. (Thanks to Marek Olšák for pointing me in the right
direction)

Fixes regression with: dEQP-GLES31.functional.tessellation.invariance.*
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108734
Fixes: 5d517a599b
  st/mesa: Don't record garbage streamout information in the non-SSO case.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-19 11:06:56 +01:00
Gert Wollny
18a8e11aea i965:use FRAMEBUFFER_UNSUPPORTED instead of FRAMEBUFFER_INCOMPLETE_DIMENSIONS
FRAMEBUFFER_INCOMPLETE_DIMENSIONS is not supported for GLES 3.0 and later and
not defined for Desktop OpenGL. Instead use FRAMEBUFFER_UNSUPPORTED like it
was done before.

Thanks to Iago Toral and Andrey Simiklit for pointing out the problem and the
details.

Fixes:  ebcde34545
   i965: be more specific about FBO completeness errors
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-19 11:06:52 +01:00
Gert Wollny
40eca7d3e1 virgl: Use file descriptor instead of un-allocated object
The structure qdws is not allocated at this point, nor is the
file descriptor set to its member. Use the fd directly instead.

Fixes:  d1a1c21e76
    virgl: native fence fd support

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-11-19 11:03:56 +01:00
Gert Wollny
78fdc507a3 i965: Add support for and expose EXT_texture_sRGB_R8
Emulate MESA_FORMAT_R_SRGB8 by using L8_UNORM_SRGB. This is possible
because component swizzling is handled based on the mesa format and,
hence, the r001 swizzling can be used to correct the components.

Enables and makes pass (tested on Kabylake)

  dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
  dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-19 08:05:44 +01:00
Gert Wollny
c5363869d4 i965: Force zero swizzles for unused components in GL_RED and GL_RG
This makes it possible to use a hardware luminance format as RED format.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-19 08:05:44 +01:00
Gert Wollny
ebcde34545 i965: be more specific about FBO completeness errors
The driver was returning GL_FRAMEBUFFER_UNSUPPORTED for all cases of an
incomplete fbo, be a bit more specific about this following the description
of glCheckFramebufferStatus.

This helps to keep dEQP happy when adding EXT_texture_sRGB_R8 support.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-19 08:05:44 +01:00
Gert Wollny
24a02157dd i965: Correct L8_UNORM_SRGB table entry
As the name says, the format is an sRGB format.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-19 08:05:44 +01:00
Robert Foss
70692adf48 virgl: Clean up fences commit
Remove a dead variable, an int->bool conversion and some
whitespace changes.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-18 12:14:55 +01:00
Kenneth Graunke
c2e3d0f163 i915: Delete swizzling detection logic.
This is all leftover from the i965 split.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-17 10:26:31 -08:00
Ilia Mirkin
beb66d3747 nv50/ir/ra: enforce max register requirement, and change spill order
On nv50, certain operations must happen on regs below 64, due to
encoding requirements. First of all, we add infrastructure to enforce
this. Secondly we change the spill order to first spill RIG nodes that
are unconstrained, followed by ones that are.

This makes the gamecube logo shadertoy compile properly. Curiously, if
we adjust the spill order so that we first spill the constrained RIG
nodes instead, the RA also succeeds. However it seems more logical to
first spill the unconstrained ones.

While we're at it, drop the nv50 max register to reserve r127 as the
zero register of last resort (r63 is preferred).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Karol Herbst <kherbst@redhat.com>
2018-11-16 22:43:52 -05:00
Ilia Mirkin
799e021894 nv50/ir/ra: improve condition for short regs, unify with cond for 16-bit
Instead of the size restriction existing in two places, and potentially
being applied twice, we move this together. Ops with 16-bit register
addresses can only take a short reg, and ops with immediates can only
take a short reg.

Of course we leave the immediate 0 in place since we know that it will
be replaced by r63/r127 down the line, so don't treat zeroes as an
immediate.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-16 20:53:33 -05:00
Ilia Mirkin
955d943c33 nv50/ir: delete MINMAX instruction that is no longer in the BB
We removed the op from the BB, but it was still listed in its sources'
uses. This could trip up some logic down the line which analyzes all the
uses of an l-value, e.g. spilling.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-16 20:53:09 -05:00
Eric Anholt
7e9fc11ff8 egl: Print the actual message to the console from _eglError().
Previously we would print errors on the console like:

   libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize

When we had everything we needed for:

   libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize: DRI2: failed to find EGLDevice

(for a gbm error in my case)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-16 17:49:31 -08:00
Eric Anholt
d971a4230d loader: Factor out the common driver opening logic from each loader.
I copied the code from egl_dri2.c, but the functionality was equivalent
between all the loaders other than their particular environment variables.

v2: Drop the logging function equivalent to loader_default_logger()
    (requested by Eric, Emil).  Move the SCons workaround across.  Drop
    the now-unused driGetDriverExtensions() declaration that was lost in a
    rebase.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-11-16 17:49:17 -08:00
Eric Anholt
cc19815738 loader: Stop using a local definition for an in-tree header
I need other types from the header now, and "gl.h is big" is not a good
reason to duplicate definitions.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-16 15:38:18 -08:00
Eric Anholt
2bc1f5c2e7 egl: Move loader_set_logger() up to egl_dri2.c.
Everyone needs to call it, and platform_x11 forgot to.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-16 15:38:18 -08:00
Eric Anholt
c2b515379b glx: Move DRI extensions pointer loading to driOpenDriver().
The only thing you do with a dri driver handle is get the extensions
pointer, so just fold it in to simplify the callers.

v2: Add the declaration of driGetDriverExtensions() that got lost in a
    rebase.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-11-16 15:38:18 -08:00
Eric Anholt
7076e9f116 glx: Remove an old DEFAULT_DRIVER_DIR default.
You can tell by "Mesa/configs/default" how old this is.  Your build system
really has to provide the DEFAULT_DRIVER_DIR, or other loaders will break.

v2: Move the bad (non-prefix-dependent) define to the SConscript to avoid
    breaking it.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-11-16 15:37:47 -08:00
Samuel Pitoiset
d031d5c999 radv: enable primitive binning by default
After doing a bunch of benchmarks, primitive binning helps
some games like The Talos Principle (+5%) or Serious Sam 2017
(+3%). For other titles, either it doesn't change anything or
it hurts very little (less than 1%).

This only affects GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-16 17:51:15 +01:00
Samuel Pitoiset
afd834b62e radv: add a debug option for disabling primitive binning
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-16 17:51:12 +01:00
Robert Foss
d1a1c21e76 virgl: native fence fd support
Following the support for fences on the virtio driver add support
for native fence on virgl. This was somewhat based on the freedreno one.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-16 14:41:57 +01:00
Lionel Landwerlin
0db898cef2 intel/aub_viewer: Print blend states properly
Identical fix to :

commit 70de31d0c1
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Fri Aug 24 16:05:08 2018 -0500

    intel/batch_decoder: Print blend states properly

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
2018-11-16 11:40:38 +00:00
Lionel Landwerlin
ac324a6809 intel/aub_viewer: fix dynamic state printing
Identical fix to :

commit cbd4bc1346
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Fri Aug 24 16:04:03 2018 -0500

    intel/batch_decoder: Fix dynamic state printing

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
2018-11-16 11:40:14 +00:00
Lionel Landwerlin
59c1059528 intel/aubinator: fix ring buffer pointer
We can only start parsing commands from the head pointer. This was
working fine up to now because we only dealt with a "made up" ring
buffer (generated by aub_write) which always had its head at 0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
2018-11-16 11:39:54 +00:00
Lionel Landwerlin
25443cbb72 intel/decoders: read ring buffer length
Use this value to limit reading the ring buffer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
2018-11-16 11:37:08 +00:00
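
The two ring buffer changes above (start parsing at the head pointer, stop at the recorded ring length) can be sketched roughly as follows. This is a hypothetical Python illustration, not the aubinator code; the function name and parameters are assumptions made for the example.

```python
def pending_ring_bytes(ring: bytes, head: int, length: int) -> bytes:
    """Return the bytes a decoder may parse: from head up to the ring length.

    Hypothetical helper: 'head' and 'length' stand in for the values the
    decoder reads from the ring buffer state.
    """
    if not 0 <= head <= length:
        raise ValueError("head must lie within the ring buffer")
    # Never read past the backing storage, even if 'length' claims more.
    end = min(length, len(ring))
    return ring[head:end]
```

With a head of 0, as in the "made up" ring buffers generated by aub_write, this reads from the start of the buffer, which is why the old behaviour went unnoticed.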
Lionel Landwerlin
1c56d21156 egl/dri: fix error value with unknown drm format
According to the EGL_EXT_image_dma_buf_import spec, creating an EGL
image with a DRM format not supported should yield the BAD_MATCH
error :

"
       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
         attribute is set to a format not supported by the EGL, EGL_BAD_MATCH
         is generated.
"

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 20de7f9f22 ("egl/dri2: support for creating images out of dma buffers")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-11-16 10:28:06 +00:00
Daniel Stone
5e1fe240c4 gbm: Clarify acceptable formats for gbm_bo
gbm_bo_create() was presumably meant to originally accept gbm_bo_format
enums, but it's accepted GBM_FORMAT_* tokens since the dawn of time.
This is good, since gbm_bo_format is rarely used and covers a lot less
ground than GBM_FORMAT_*.

Change the documentation to refer to both; this involves removing a 'see
also' for gbm_bo_format, since we can't also use \sa to refer to a
family of anonymous #defines.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-16 09:40:46 +00:00
Connor Abbott
ba94a00c7c Revert "radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT"
This reverts commit 647c2b90e9. There was
one recently-introduced bug in ac for dvec3 loads, but the other test
failures were actually bugs in the tests. See
9429e621c4

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-16 10:32:03 +01:00
Eric Anholt
cc71bf529c vc4: Don't return a vc4 BO handle on a renderonly screen.
The handles exported need to be on the KMS device's fd, anything else is
failure.  Also, this code is assuming that the scanout resource has been
created already, so assert it.
2018-11-15 21:11:44 -08:00
Eric Anholt
cc0bc76a38 vc4: Make sure we make ro scanout resources for create_with_modifiers.
The DRI3 create_with_modifiers paths don't set tmpl.bind to SCANOUT or
SHARED, with the theory that given that you've got modifiers, that's all
you need.  However, we were looking at the tmpl.bind for setting up the
KMS handle in the renderonly case, so we'd end up trying to use vc4's
handle on the hx8357d fd.

Fixes: 84ed8b67c5 ("vc4: Set shareable BOs as T tiled if possible")
2018-11-15 21:11:44 -08:00
Danylo Piliaiev
f9fd0cf479 i965: Fix calculation of layers array length for isl_view
Handle all cases in calculation of layers count for isl_view
taking into account texture view and image unit.
st_convert_image was taken as a reference.

When u->Layered is true the whole level is taken with respect to
image view. In other case only one layer is taken.

v3: (Józef Kucia and Ilia Mirkin)
    - Rewrote patch by taking st_convert_image as a reference
    - Removed now unused get_image_num_layers function
    - Changed commit message

v4: (Jason Ekstrand)
    - Added assert

Fixes: 5a8c8903
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107856

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-15 19:59:54 -06:00
Jason Ekstrand
6339aba775 intel/compiler: Lower SSBO and shared loads/stores in NIR
We have a bunch of code to do this in the back-end compiler but it's
fairly specific to typed surface messages and the way we emit them.
This breaks it out into NIR where it's easier to do things a bit more
generally.  It also means we can easily share the code between the vec4
and FS back-ends if we wish.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:49 -06:00
Jason Ekstrand
d34fd81e76 nir: Add alignment parameters to SSBO, UBO, and shared access
This also changes spirv_to_nir and glsl_to_nir to set them.  The one
place that doesn't set them is shared memory access lowering in
nir_lower_io.  That will have to be updated before any consumers of it
can effectively use these new alignments.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
2018-11-15 19:59:42 -06:00
Jason Ekstrand
fb127f7729 nir/lower_io: Add shared to get_io_offset_src
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:31 -06:00
Jason Ekstrand
b5c48271d4 nir/glsl: Force 32-bit for UBO and SSBO Booleans
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:30 -06:00
Jason Ekstrand
44b7005581 nir/spirv: Force 32-bit for UBO and SSBO Booleans
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:29 -06:00
Jason Ekstrand
f16bd8a9fe nir/builder: Add a nir_pack/unpack/bitcast helpers
The new helpers can generate any pack/unpack operation including those
for which we do not have specific opcodes and they express a bitcast in
terms of these pack/unpack operations.  In particular, the new helpers
properly handle 8-bit types.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:28 -06:00
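
As a rough illustration of expressing a bitcast in terms of pack and unpack, here is a minimal Python sketch. It is not the nir_builder implementation: the function names are hypothetical, values are plain integers, and it assumes component 0 occupies the low bits.

```python
def pack(components, src_bits):
    """Pack components of src_bits each into one integer, component 0 lowest."""
    mask = (1 << src_bits) - 1
    value = 0
    for i, comp in enumerate(components):
        value |= (comp & mask) << (i * src_bits)
    return value


def unpack(value, total_bits, dest_bits):
    """Split a total_bits-wide integer into dest_bits-wide components."""
    mask = (1 << dest_bits) - 1
    return [(value >> shift) & mask for shift in range(0, total_bits, dest_bits)]


def bitcast(components, src_bits, dest_bits):
    """Reinterpret a vector of src_bits components as dest_bits components."""
    total_bits = len(components) * src_bits
    if total_bits % dest_bits != 0:
        raise ValueError("total size is not a multiple of the destination size")
    return unpack(pack(components, src_bits), total_bits, dest_bits)
```

Because the sketch is generic over the component width, 8-bit components need no special case, which is the property the commit message highlights.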
Jason Ekstrand
b77d68b78e nir/builder: Add iadd_imm and imul_imm helpers
The pattern of adding or multiplying an integer by an immediate is
fairly common especially in deref chain handling.  This adds a helper
for it and uses it in a few places.  The advantage to the helper is that
it automatically handles bit sizes for you.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-15 19:59:27 -06:00
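
What "automatically handles bit sizes" could mean is sketched below in Python. This models the idea only; `Value`, `imm_int`, `iadd_imm` and `imul_imm` here are hypothetical stand-ins, not the NIR builder API. The point is that the immediate is created at the bit size of the other operand, so the caller never picks a width.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    bits: int      # raw bit pattern, always below 2**bit_size
    bit_size: int  # for example 8, 16, 32 or 64


def imm_int(value: int, bit_size: int) -> Value:
    """Build an immediate, wrapped to the requested bit size."""
    return Value(value & ((1 << bit_size) - 1), bit_size)


def iadd_imm(x: Value, y: int) -> Value:
    """Add immediate y to x, using x's bit size for the immediate."""
    imm = imm_int(y, x.bit_size)
    return imm_int(x.bits + imm.bits, x.bit_size)


def imul_imm(x: Value, y: int) -> Value:
    """Multiply x by immediate y, using x's bit size for the immediate."""
    imm = imm_int(y, x.bit_size)
    return imm_int(x.bits * imm.bits, x.bit_size)
```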
Jason Ekstrand
1f29f4db1e nir/builder: Assert that intN_t immediates fit
This assert won't catch all mistakes with this helper but it will at
least ensure that the top bits are all zero or all one which should help
catch bugs.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:26 -06:00
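
The check described above (the bits above the requested size are all zero or all one) can be sketched in Python as follows. The function name and the 64-bit container width are assumptions for illustration; this is not the assert from nir_builder.

```python
def immediate_fits(value: int, bit_size: int, width: int = 64) -> bool:
    """True if the bits above bit_size are all zero or all one."""
    full_mask = (1 << width) - 1
    pattern = value & full_mask                      # two's complement view
    top_mask = full_mask & ~((1 << bit_size) - 1)    # bits at and above bit_size
    top = pattern & top_mask
    return top == 0 or top == top_mask
```

As the commit message says, this does not catch every mistake: -129 does not fit in a signed 8-bit integer, yet its upper bits are all one, so the check accepts it. A value such as 256 or -257 is rejected for an 8-bit immediate.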
Jason Ekstrand
4266932c0b nir/lower_alu_to_scalar: Don't try to lower unpack_32_2x16
It messes up when trying to lower.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:09 -06:00
Ian Romanick
425c133ab9 glsl: Refactor type checking for redeclarations
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-15 14:27:32 -08:00
Ian Romanick
61e003ce7e glsl: Omit redundant qualifier checks on redeclarations
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-15 14:27:29 -08:00
Ian Romanick
9b9f3218db glsl: prevent qualifiers modification of predeclared variables
Section 3.7 (Identifiers) of the GLSL spec says:

    However, as noted in the specification, there are some cases where
    previously declared variables can be redeclared to change or add
    some property, and predeclared "gl_" names are allowed to be
    redeclared in a shader only for these specific purposes.  More
    generally, it is an error to redeclare a variable, including those
    starting "gl_".

This patch should fix piglit tests:
clip-distance-redeclare-without-inout.frag
clip-distance-redeclare-without-inout.vert

However, this causes a regression in
clip-distance-out-values.shader_test.  A fix for that test has been sent
to the piglit list for review:

    https://patchwork.freedesktop.org/patch/255201/

As far as I understand from the following mailing list thread:
https://lists.freedesktop.org/archives/piglit/2013-October/007935.html
it looks like we agreed to remove the ability to change qualifiers
but have not done it yet (unless I missed something).

v2 (idr): Move 'earlier->data.mode != var->data.mode' test much earlier
in the function.  Add special handling for gl_LastFragData.

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-15 14:27:26 -08:00
Eric Anholt
538bca78e2 v3d: Don't try to set PF flags on a LDTMU operation
We need an ALU op in order to set PF.  Fixes a recent assertion failure in
dEQP-GLES3.functional.ubo.single_basic_type.shared.bool_vertex
2018-11-15 11:12:54 -08:00
Eric Anholt
03928dd682 v3d: Fix double-swapping of R/B on V3D 4.1
Fixes: 4018eb04e8 ("v3d: Use the TLB R/B swapping instead of recompiles when available.")
2018-11-15 11:12:54 -08:00
Eric Engestrom
2b2f790e59 egl: fix bad rebase
I screwed up a rebase over a refactor and didn't notice locally because
the uncommitted refactor hid the issue.

Fixes: c973364967 "egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-15 17:51:40 +00:00
Sagar Ghuge
6e60ff1ea9 intel/compiler: Disassemble GEN6_SFID_DATAPORT_SAMPLER_CACHE as dp_sampler
Both BRW_SFID_SAMPLER and GEN6_SFID_DATAPORT_SAMPLER_CACHE are getting
disassembled as "sampler", which is misleading for the assembler tool.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2018-11-15 09:36:55 -08:00
Eric Engestrom
c973364967 egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache
Fixes dEQP-EGL.functional.get_proc_address.extension.egl_android_blob_cache
on builds with glvnd enabled.

Fixes: 6f5b57093b "egl: add support for EGL_ANDROID_blob_cache"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 16:27:27 +00:00
Eric Engestrom
2640854399 gbm: add new entrypoint to symbols check
Fixes: 6328536ff2 "gbm: Introduce a helper function for printing GBM format names."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-15 16:25:42 +00:00
Emil Velikov
adbdfc6666 bin/get-pick-list.sh: handle reverts prior to the branchpoint
Currently we detect when a breaking commit:
 - has landed in stable, and
 - is referenced by an untagged fix in master

Yet we did not consider the case of a breaking commit that:
 - landed prior to the branchpoint, and
 - is referenced by an untagged fix in master

Addressing the latter is extremely slow, due to the size of the lookup.

That said, we can trivially use the existing is_sha_nomination() helper
to catch reverts.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 16:15:15 +00:00
Emil Velikov
c0012a0708 bin/get-pick-list.sh: use test instead of [ ]
The latter is rather picky about surrounding whitespace. The explicit `test`
doesn't have that problem, plus the statements read a bit easier.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:51 +00:00
Emil Velikov
77ff0bfb5f bin/get-pick-list.sh: handle unofficial "broken by" tag
We have a number of cases where devs will use a tag "broken by".
While it's not something officially documented or recommended, checking
for it is trivial enough.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:47 +00:00
Emil Velikov
209525aafb bin/get-pick-list.sh: handle fixes tag with missing colon
Every so often, we forget to add the colon after "fixes". Trivially
tweak the script to catch it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:44 +00:00
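
A tolerant match for a fixes tag with or without the colon might look like the Python sketch below. The actual script is shell and its exact pattern is not shown in this log, so the regular expression and the function name are assumptions for illustration only.

```python
import re

# "fixes" in any letter case at the start of a line, an optional colon,
# then a lowercase hexadecimal sha. The (?i:...) group keeps the sha
# part case-sensitive.
FIXES_RE = re.compile(r"^\s*(?i:fixes):?\s+([a-f0-9]{8,40})\b", re.MULTILINE)


def nominated_sha(commit_message: str):
    """Return the sha referenced by a fixes tag, or None if there is none."""
    match = FIXES_RE.search(commit_message)
    return match.group(1) if match else None
```

Anchoring at the start of the line keeps ordinary prose such as "This fixes ..." from being treated as a tag.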
Emil Velikov
b7418d1f3f bin/get-pick-list.sh: flesh out is_sha_nomination
Refactor is_fixes_nomination into a is_sha_nomination helper. This way
we can reuse it for more than the usual "Fixes:" tag.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:40 +00:00
Emil Velikov
533fead423 bin/get-pick-list.sh: tweak the commit sha matching pattern
Currently we match on:
 - any arbitrary length of,
 - any a-z A-Z and 0-9 characters

At the same time, a commit sha consists of lowercase hexadecimal
numbers. Any sha shorter than 8 characters is ambiguous - in some cases
even 11+ are required.

So change the pattern to a-f0-9 and adjust the length to 8-40.

As we're here we could use a single grep, instead of the grep/sed combo.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:36 +00:00
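
The tightened pattern (lowercase hexadecimal, 8 to 40 characters) can be illustrated with a short Python sketch. The script itself uses grep; the word-boundary anchors and the function name below are assumptions added for the example.

```python
import re

# Lowercase hex only, between 8 and 40 characters, delimited by word boundaries.
SHA_RE = re.compile(r"\b[a-f0-9]{8,40}\b")


def find_shas(text: str):
    """Return every substring of text that looks like a commit sha."""
    return SHA_RE.findall(text)
```

Compared with matching any run of a-z, A-Z and 0-9, this rejects uppercase strings and anything shorter than 8 characters, at the cost of still accepting ordinary words that happen to be 8 or more hex letters.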
Emil Velikov
181203f3c5 bin/get-pick-list.sh: handle the fixes tag
Having a separate script to handle the fixes tag, brings a number of
issues, so let's fold it in get-pick-list.sh.

v2:
 - pass the sha as argument to the function
 - Keep original sed pattern

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:31 +00:00
Emil Velikov
e6b3a3b201 bin/get-pick-list.sh: handle "typod" usecase.
As the comment in get-typod-pick-list.sh says, there's little point in
having a duplicate file.

Add the new pattern + tag to get-pick-list.sh and nuke this file.

v2:
 - pass the sha as argument to the function
 - grep -q instead of using a variable (Eric)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:24 +00:00
Emil Velikov
fac10169bb bin/get-pick-list.sh: prefix output with "[stable] "
With later commits we'll fold all the different scripts into one.
Add the explicit prefix, so that we know the origin of the nomination

v2:
 - pass the sha as argument to the function
 - swap $tag = none for an else statement (Juan)
 - grep -q instead of using a variable (Eric)
 - print the tag and commit oneline separately (Eric)

v3:
 - drop unused "tag=none" assignment (Juan)
 - typo nomination

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:54:48 +00:00
Emil Velikov
559c32d241 bin/get-pick-list.sh: simplify git oneline printing
Currently we force disable the pager via "|cat" where --no-pager
exists. Additionally we could use git show instead of git log -n1.

Use those for a slightly more understandable code.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:51:24 +00:00
Emil Velikov
7d9556681d docs: document the staging branch and add reference to it
A while back we agreed that having a live/staging branch is beneficial.
Sadly we forgot to document that, so here is my first attempt.

Document the caveat that the branch history is not stable.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-15 15:48:15 +00:00
Emil Velikov
4ae749acf1 docs/submittingpatches.html: correctly handle the <p> tag
As pointed out by the w3c validator.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-15 15:48:13 +00:00
Emil Velikov
19a081473f docs/releasing.html: polish cherry-picking/testing text
Reword slightly and highlight the important parts of the text.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-15 15:48:08 +00:00
Guido Günther
ab5653680e etnaviv: Make sure rs alignment checks match
etna_resource_alloc and etna_resource_from_handle currently use different checks.
This leads to

   etna_resource_from_handle:492: target=2, format=PIPE_FORMAT_B8G8R8X8_UNORM, 1080x1920x1, array_size=1, last_level=0, nr_samples=0, usage=0, bind=8000a, flags=0
   etna_resource_from_handle:541: BO stride 4320 is too small for RS engine width padding (4352, format PIPE_FORMAT_B8G8R8X8_UNORM)

since etna_resource_from_handle wants to be aligned to a 16 byte
boundary while the etna_resource_alloc does not.

Adjust the two checks by using a common function.

Broken by baff59ebf0

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-11-15 16:38:35 +01:00
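
The numbers in the error message above can be reproduced with a little arithmetic. The Python sketch below assumes 4 bytes per pixel for PIPE_FORMAT_B8G8R8X8_UNORM and, as one reading consistent with the log, that the RS padding rounds the width up to a multiple of 16 pixels; the driver's real alignment rule may be expressed differently.

```python
def align(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


WIDTH = 1080          # pixels, from the log line
BYTES_PER_PIXEL = 4   # B8G8R8X8 is a 32-bit format

bo_stride = WIDTH * BYTES_PER_PIXEL                  # 4320
padded_stride = align(WIDTH, 16) * BYTES_PER_PIXEL   # 4352 under the assumption above
```

Under these assumptions the imported buffer's stride (4320) is smaller than the padded stride (4352), which is exactly the mismatch the message reports.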
Juan A. Suarez Romero
52368ef83a docs: update calendar, add news item and link release notes for 18.2.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-11-15 13:08:58 +00:00
Juan A. Suarez Romero
aa7a419b8b docs: add sha256 checksums for 18.2.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 79be754f9a74a43b5748dc0934241e7701cb9581)
2018-11-15 13:06:12 +00:00
Juan A. Suarez Romero
e53ec08931 docs: add release notes for 18.2.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit f34bddc325)
2018-11-15 13:06:10 +00:00
Marek Olšák
9367514524 radeonsi: fix video APIs on Raven2
This was missed when I added the new enum.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-11-14 17:08:34 -05:00
Andrii Simiklit
e13dd70581 i965: avoid 'unused variable' warnings
1. brw_pipe_control.c:311:34: warning:
    unused variable ‘devinfo’
2. brw_program_binary.c:209:19: warning:
    unused variable ‘gen_size’
3. brw_program_binary.c:216:19: warning:
    unused variable ‘nir_size’

v2: Changes for unreproducible issues were removed

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-14 14:41:58 +00:00
Andrii Simiklit
7aca650122 compiler: avoid 'unused variable' warnings
1. nir/nir_lower_vars_to_ssa.c:691:21: warning:
       unused variable ‘var’
       nir_variable *var = path->path[0]->var;

v2: Changes for some part of 'may be used uninitialized'
    warnings were removed, seems like it is a compiler issue.
        ( Eric Engestrom <eric.engestrom@intel.com> )
    Possibly like this one:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46684
    This issue is flagged as duplicate but an
    original one is not closed yet.

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-14 13:35:38 +00:00
Andrii Simiklit
69ee49ac46 intel/tools: avoid 'unused variable' warnings
1. tools/aub_read.c:271:31: warning: unused variable ‘end’
    const uint32_t *p = data, *end = data + data_len, *next;

2. tools/aub_mem.c:292:13: warning: unused variable ‘res’
       void *res = mmap((uint8_t *)bo.map + map_offset, 4096, PROT_READ,
   tools/aub_mem.c:357:13: warning: unused variable ‘res’
       void *res = mmap((uint8_t *)bo.map + (page - bo.addr), 4096, PROT_READ,

v2: The i965_disasm.c changes were moved into a separate patch.
    The 'end' variable is declared separately with MAYBE_UNUSED
    so that the attribute does not affect other variables.
       ( Eric Engestrom <eric.engestrom@intel.com> )

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-14 13:35:28 +00:00
Thomas Hellstrom
25b48e3df9 st/xa: Bump minor
Bump minor to signal support for new formats and higher precision
solid pictures.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
c9085f6d3b st/xa: Support Component Alpha with trivial blending
Support Component Alpha for those composite operations that do not require
per-channel alpha blending.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
0477d17f51 st/xa: Minor renderer cleanups
constify function arguments to clean up the code a bit.

Reported-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
56aa23b146 st/xa: Fix transformations when we have both source and mask samplers
In the case when we had both source and mask samplers, transformations were
typically not applied correctly.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
e1298def9f st/xa: Support a couple of new formats
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
258d20152a st/xa: Support higher color precision for solid pictures
The only solid fill picture type we supported only had 8 bit color
channels. Add a new solid picture type that supports float channels.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-11-14 13:11:51 +01:00
Thomas Hellstrom
d86ad38205 st/xa: Render update. Better support for solid pictures
Remove unused and obsolete code for gradients and component-alpha.
Support solid source- and mask pictures using a variable number
of samplers in the composite pipeline rather than the fixed number
we used before.

Tested using rendercheck for XA.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-11-14 13:07:00 +01:00
Gert Wollny
4bba280937 nir: Allow to skip integer ops in nir_lower_to_source_mods
Some hardware supports source mods only for float operations. Make it
possible to skip lowering to source mods in these cases.

v2: use option flags instead of a boolean (Jason Ekstrand)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-14 08:59:26 +01:00
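
The move from a boolean to option flags (v2) can be pictured with the Python sketch below. The enum and function names are hypothetical and only illustrate the design; they are not the identifiers used in nir_lower_to_source_mods.

```python
from enum import Flag, auto


class SourceModOptions(Flag):
    FLOAT = auto()       # lower source modifiers on float operations
    INT = auto()         # lower source modifiers on integer operations
    ALL = FLOAT | INT


def should_lower(op_is_float: bool, options: SourceModOptions) -> bool:
    """Decide whether an operation of the given kind gets source mods."""
    wanted = SourceModOptions.FLOAT if op_is_float else SourceModOptions.INT
    return bool(options & wanted)
```

A driver for hardware that only supports float source mods would pass `SourceModOptions.FLOAT`. Flags leave room for further categories later without changing the function signature, which a single boolean would not.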
Karol Herbst
b4380cb070 nir/spirv: cast shift operand to u32
v2: fix for specialization constants as well

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-11-14 02:09:11 +01:00
Karol Herbst
099728b115 nir: replace nir_load_system_value calls with appropriate builder functions
this helps reduce the overall code changes when a bit_size parameter is
added to nir_load_system_value

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-11-14 02:09:11 +01:00
Karol Herbst
80db331c2d nir: add const_index parameters to system value builder function
this allows replacing some nir_load_system_value calls with the specific
system value constructor

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-11-14 02:09:11 +01:00
Timothy Arceri
95b513c937 radv: make use of nir_move_out_const_to_consumer()
vkpipeline-db results:

Totals from affected shaders:
SGPRS: 28400 -> 28576 (0.62 %)
VGPRS: 27916 -> 27692 (-0.80 %)
Spilled SGPRs: 140 -> 138 (-1.43 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1534456 -> 1520560 (-0.91 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3541 -> 3582 (1.16 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-14 09:41:50 +11:00
Lionel Landwerlin
ea53f76d7b anv: move helper function internally
It's only used in anv_image.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:31 +00:00
Lionel Landwerlin
8b00d3d6eb anv: use image aspects rather than computed ones
This shouldn't make any difference but I feel uneasy to use the
expanded aspects that do not represent the image in its entirety. If
we ever change the implementation of the anv_image_aspect_to_plane()
helper, this is safer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:27 +00:00
Lionel Landwerlin
465de47bad anv: associate vulkan formats with aspects
This will make it easier to associate an aspect with a plane number.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:24 +00:00
Lionel Landwerlin
fe3b7fe982 anv/lower_ycbcr: make sure to set 0s on all components
To play around with debugging, we might want to disable one or the
other component. Having 0s as default values makes this work.
Otherwise we might have NULL components, leading to crashes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:21 +00:00
Lionel Landwerlin
ee8d65c25a anv/image: remove unused parameter
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:13 +00:00
Lionel Landwerlin
352e297091 anv: simplify internal address offset
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:10 +00:00
Eric Engestrom
4fa2fb3524 meson: fix wayland-less builds
Those empty variables in the !wayland case are useless and running that
meson.build with them breaks the build:

  [287/850] Generating wayland-drm-client-protocol.h with a custom command.
  FAILED: src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h
  client-header ../src/egl/wayland/wayland-drm/wayland-drm.xml src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h
  /bin/sh: client-header: command not found
  ninja: build stopped: subcommand failed.

Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
7df80de6e6 gbm: remove unnecessary meson include
`inc_wayland_drm` is only used if wayland is built, and it's already
added in that case a few lines below.

Fixes: a29869e872 "gbm: Don't traverse backwards for includes"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
3832db275e meson: only run vulkan's meson.build when building vulkan
Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
4f1ae271e1 xmlpool: update translation po files
These files are close to 4 years out of date; a lot's changed since.
Let's just check in a recently-regenerated version.

Changes generated by running `ninja xmlpool-{pot,update-po,gmo}`.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
1e918e5bef REVIEWERS: add Vulkan reviewer group
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
59b3335496 REVIEWERS: add Emil as EGL reviewer
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
923aca84b2 REVIEWERS: add include path for EGL
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-11-13 17:25:02 +00:00
Toni Lönnberg
2af4e3345f intel/genxml: Add engine definition to render engine instructions (gen11)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added missing engine definition to MI_TOPOLOGY_FILTER.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
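
Why an engine tag helps when opcodes are shared can be shown with a small lookup sketch in Python. The instruction names, opcode values and engine labels below are invented for illustration and do not come from the genxml files.

```python
from typing import NamedTuple, Optional


class Instruction(NamedTuple):
    name: str
    opcode: int
    engines: frozenset  # engines that accept this instruction


def find_instruction(instructions, opcode: int, engine: str) -> Optional[Instruction]:
    """Return the instruction matching both the opcode and the engine."""
    for inst in instructions:
        if inst.opcode == opcode and engine in inst.engines:
            return inst
    return None


# Two hypothetical instructions that share one opcode on different engines.
TABLE = [
    Instruction("RENDER_ONLY_CMD", 0x7A00, frozenset({"render"})),
    Instruction("VIDEO_ONLY_CMD", 0x7A00, frozenset({"video"})),
]
```

Looking up by opcode alone would always return the first entry; including the engine makes the result unambiguous.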
Toni Lönnberg
1921982d3e intel/genxml: Add engine definition to render engine instructions (gen10)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added missing engine definition to MI_TOPOLOGY_FILTER.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
030fe0f981 intel/genxml: Add engine definition to render engine instructions (gen9)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added more missing engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
12e34fc7ba intel/genxml: Add engine definition to render engine instructions (gen8)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added missing engine tag for MI_TOPOLOGY_FILTER and MI_LOAD_URB_MEM.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
a883fd2277 intel/genxml: Add engine definition to render engine instructions (gen75)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
27cf6252d3 intel/genxml: Add engine definition to render engine instructions (gen7)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
ecf62a967e intel/genxml: Add engine definition to render engine instructions (gen6)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added missing engine to MEDIA_GATEWAY_STATE

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
571d6447d8 intel/genxml: Add engine definition to render engine instructions (gen5)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
6463ceca69 intel/genxml: Add engine definition to render engine instructions (gen45)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
a4ca710c96 intel/genxml: Add engine definition to render engine instructions (gen4)
Instructions meant for the render engine now have a definition specifying that
so that we can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
102dadec81 intel/decoder: tools: Use engine for decoding batch instructions
The engine to which the batch was sent is now set in the decoder context when
decoding the batch. This is needed so that we can distinguish between
instructions as the render and video pipe share some of the instruction opcodes.

v2: The engine is now in the decoder context and the batch decoder uses a local
function for finding the instruction for an engine.

v3: Spec uses engine_mask now instead of engine, replaced engine class enums
with the definitions from UAPI.

v4: Fix up aubinator_viewer (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
a6aab7e436 intel/decoder: tools: gen_engine to drm_i915_gem_engine_class
Removed the gen_engine enum and changed the involved functions to use the
drm_i915_gem_engine_class enum from UAPI instead.

v3: Wrong engine was being used for blocks in video ring

v4: Fixed aubinator_viewer.cpp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
b00bccd012 intel/decoder: Engine parameter for instructions
Preliminary work for adding handling of different pipes to gen_decoder. Each
instruction needs to have a definition describing which engine it is meant for.
If left undefined, by default, the instruction is defined for all engines.

v2: Changed to use the engine class definitions from UAPI

v3: Changed I915_ENGINE_CLASS_TO_MASK to use BITSET_BIT, change engine to
engine_mask, added check for incorrect engine and added the possibility to
define an instruction to multiple engines using the "|" as a delimiter in the
engine attribute.

v4: Fixed the memory leak.

v5: Removed an unnecessary ralloc_free().
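
The "|"-delimited engine attribute from v3 can be pictured with a minimal
sketch. All names here (parse_engine_mask, ENGINE_CLASS_*, the "render",
"copy" and "video" strings) are hypothetical stand-ins; the actual decoder
uses the UAPI engine classes and BITSET_BIT, and its parsing code may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the UAPI engine classes. */
enum engine_class {
   ENGINE_CLASS_RENDER = 0,
   ENGINE_CLASS_COPY = 1,
   ENGINE_CLASS_VIDEO = 2,
};

#define ENGINE_CLASS_TO_MASK(c) (1u << (c))
#define ENGINE_MASK_ALL (~0u)

/* Hypothetical attribute names; the real genxml strings are not shown here. */
static const struct {
   const char *name;
   enum engine_class cls;
} engine_names[] = {
   { "render", ENGINE_CLASS_RENDER },
   { "copy", ENGINE_CLASS_COPY },
   { "video", ENGINE_CLASS_VIDEO },
};

/* Parse an attribute such as "render|video" into a bit mask.
 * A NULL attribute means "defined for all engines".
 * Returns 0 when an unknown (or empty) engine name is found. */
static uint32_t
parse_engine_mask(const char *attr)
{
   if (attr == NULL)
      return ENGINE_MASK_ALL;

   uint32_t mask = 0;
   const char *p = attr;
   for (;;) {
      size_t len = strcspn(p, "|");
      int found = 0;
      for (size_t i = 0; i < sizeof(engine_names) / sizeof(engine_names[0]); i++) {
         if (strlen(engine_names[i].name) == len &&
             strncmp(p, engine_names[i].name, len) == 0) {
            mask |= ENGINE_CLASS_TO_MASK(engine_names[i].cls);
            found = 1;
            break;
         }
      }
      if (!found)
         return 0;
      if (p[len] == '\0')
         break;
      p += len + 1; /* skip the '|' delimiter */
   }
   return mask;
}

/* An instruction applies to an engine if the engine's bit is set. */
static int
instruction_matches_engine(uint32_t engine_mask, enum engine_class cls)
{
   assert((unsigned)cls < 32);
   return (engine_mask & ENGINE_CLASS_TO_MASK(cls)) != 0;
}
```

With this shape, two instructions that share an opcode can be told apart by
checking instruction_matches_engine() against the engine the batch was sent to.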

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Gert Wollny
8d4bb6e5cd virgl: Add command and flags to initiate debugging on the host (v2)
On the host VREND_DEBUG=guestallow must be set to let the guest override
the debug flags.

v2: Send flag string instead of flags, this avoids the need to keep
    the flags in sync.
v3: Only request host logging if the host actually understands the command

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-11-13 14:42:22 +01:00
Gert Wollny
caa964b422 mesa: Reference count shaders that are used by transform feedback objects
Transform feedback objects may hold a pointer to a shader program, and
at least in Gallium, this must be a valid pointer until
ctx->Driver.EndTransformFeedback in glEndTransformFeedback has been called
- which conforms to the spec's rule that any program that is part of a
current rendering state should only be flagged for deletion by glDeleteProgram.
This was not handled properly for the transform feedback objects so that
a call sequence

  glUseProgram(x)
  glBeginTransformFeedback(...)
  glPauseTransformFeedback(...)
  glDeleteProgram(x)
  glEndTransformFeedback(...)

would result in a use after free bug. With this patch the transform
feedback object also updates the reference count to the used program
thereby keeping the program valid as long as the transform feedback
object links to it.
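
The idea can be illustrated with a minimal reference-counting sketch. The
types and functions below (struct program, struct xfb_object, xfb_begin,
xfb_end, program_delete) are hypothetical and much simpler than Mesa's own
structures; they only show why holding a reference keeps the program alive
across the call sequence above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified program object. */
struct program {
   int refcount;
   bool delete_pending;
};

/* Hypothetical transform feedback object holding a program pointer. */
struct xfb_object {
   struct program *program;
};

/* For illustration only: number of programs not yet freed. */
static int live_programs;

static struct program *
program_create(void)
{
   struct program *p = calloc(1, sizeof(*p));
   if (!p)
      abort();
   p->refcount = 1; /* reference owned by the creator */
   live_programs++;
   return p;
}

static void
program_unref(struct program *p)
{
   assert(p->refcount > 0);
   if (--p->refcount == 0) {
      free(p);
      live_programs--;
   }
}

/* Begin: the feedback object takes its own reference. */
static void
xfb_begin(struct xfb_object *obj, struct program *p)
{
   p->refcount++;
   obj->program = p;
}

/* End: the feedback object drops its reference. */
static void
xfb_end(struct xfb_object *obj)
{
   program_unref(obj->program);
   obj->program = NULL;
}

/* Delete: flag the program and drop the creator's reference. */
static void
program_delete(struct program *p)
{
   p->delete_pending = true;
   program_unref(p);
}
```

In this sketch, deleting the program between begin and end only flags it; the
memory is released when the feedback object lets go of it in xfb_end().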

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108713
Fixes: 654587696b
       mesa: add end_transform_feedback() helper

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-13 10:57:25 +01:00
Samuel Pitoiset
90d68858ed radv: set optimal OVERWRITE_COMBINER_WATERMARK on GFX9
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-13 10:24:36 +01:00
Samuel Pitoiset
f70c5d31cd radv: set PA.SC_CONSERVATIVE_RASTERIZATION.NULL_SQUAD_AA_MASK_ENABLE
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-13 10:24:33 +01:00
Samuel Pitoiset
b5f213bb1d radv: binding streamout buffers doesn't change context regs
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-13 10:24:31 +01:00
Plamena Manolova
c5f3013cba nir: Don't lower the local work group size if it's variable.
If the local work group size is variable it won't be available
at compile time so we can't lower it in nir_lower_system_values().

Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com>

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-13 10:57:04 +02:00
Matt Turner
efb1ccadca util/ralloc: Make sizeof(linear_header) a multiple of 8
Prior to this patch sizeof(linear_header) was 20 bytes in a
non-debug build on 32-bit platforms. We do some pointer arithmetic to
calculate the next available location with

   ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset);

in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation
would only be 4-byte aligned.

On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of
4-byte registers to memory) requires an 8-byte aligned address. Such an
instruction is used to store to an 8-byte integer type, like intmax_t
which is used in glcpp's expression_value_t struct.

As a result of the 4-byte alignment returned by linear_alloc_child() we
would generate a SIGBUS (unaligned exception) on SPARC.

According to the GNU libc manual malloc() always returns memory that has
at least an alignment of 8-bytes [1]. I think our allocator should do
the same.

So, simple fix with two parts:

   (1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally.
   (2) Mark linear_header with an aligned attribute, which will cause
       its sizeof to be rounded up to that alignment. (We already do
       this for ralloc_header)
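
The effect of part (2) can be checked with a small sketch. The two structs
below are hypothetical 20-byte stand-ins for linear_header on a 32-bit build
(the real field layout is not reproduced here), and the attribute uses
GCC/Clang syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with five 4-byte fields: sizeof is 20. */
struct header_plain {
   uint32_t offset;
   uint32_t size;
   uint32_t parent;
   uint32_t next;
   uint32_t latest;
};

/* Same fields, but the aligned attribute rounds sizeof up to 24. */
struct header_aligned {
   uint32_t offset;
   uint32_t size;
   uint32_t parent;
   uint32_t next;
   uint32_t latest;
} __attribute__((aligned(8)));

/* Compile-time check that the padded header is a multiple of 8 bytes. */
_Static_assert(sizeof(struct header_aligned) % 8 == 0,
               "header size must be a multiple of 8");

/* Byte offset of the first allocation placed directly after one header,
 * mimicking the &latest[1] pointer arithmetic. */
static size_t
first_alloc_offset_plain(void)
{
   return sizeof(struct header_plain);
}

static size_t
first_alloc_offset_aligned(void)
{
   return sizeof(struct header_aligned);
}
```

Assuming the chunk itself starts on an 8-byte boundary, the plain header
leaves the first allocation only 4-byte aligned, while the padded header
keeps it 8-byte aligned.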

With this done, all Mesa's unit tests now pass on SPARC.

[1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html

Fixes: 47e1758692 ("glcpp: use the linear allocator for most objects")
Bug: https://bugs.gentoo.org/636326
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-12 20:54:49 -08:00
Matt Turner
7e3748c268 util/ralloc: Switch from DEBUG to NDEBUG
The debug code is all asserts, so protect it with the same thing that
controls assert.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-12 20:54:49 -08:00
Timothy Arceri
34dffcf913 nir: add support for removing redundant stores to copy prop var
For example the following type of thing is seen in TCS from
a number of Vulkan and DXVK games:

	vec1 32 ssa_557 = deref_var &oPatch (shader_out float)
	vec1 32 ssa_558 = intrinsic load_deref (ssa_557) ()
	vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float)
	vec1 32 ssa_560 = intrinsic load_deref (ssa_559) ()
	vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float)
	vec1 32 ssa_562 = intrinsic load_deref (ssa_561) ()
	intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */
	intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */
	intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */

No shader-db changes on i965 (SKL).

vkpipeline-db results RADV (VEGA):

Totals from affected shaders:
SGPRS: 7832 -> 7728 (-1.33 %)
VGPRS: 6476 -> 6740 (4.08 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 469572 -> 456596 (-2.76 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 989 -> 960 (-2.93 %)
Wait states: 0 -> 0 (0.00 %)

The Max Waves and VGPRS changes here are misleading. What is
happening is a bunch of TCS outputs are being optimised away as
they are now recognised as unused. This results in more varyings
being compacted via nir_compact_varyings() which can result in
more register pressure when they are not packed in an optimal way.
This is an existing problem independent of this patch. I've run
some benchmarks and haven't noticed any performance regressions
in affected games.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 15:19:36 +11:00
Timothy Arceri
3561108de0 anv/i965: make use of nir_link_constant_varyings()
shader-db results for SKL:

total instructions in shared programs: 13106498 -> 13091573 (-0.11%)
instructions in affected programs: 1186244 -> 1171319 (-1.26%)
helped: 6186
HURT: 0

total cycles in shared programs: 332062633 -> 331961653 (-0.03%)
cycles in affected programs: 8537165 -> 8436185 (-1.18%)
helped: 5371
HURT: 862

LOST:   6
GAINED: 14

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 14:06:32 +11:00
Eric Anholt
621b0fa892 egl: Improve the debugging of gbm format matching in DRI configs.
Previously the debug would be:

libEGL debug: No DRI config supports native format 0x20203852
libEGL debug: No DRI config supports native format 0x38385247

but

libEGL debug: No DRI config supports native format R8
libEGL debug: No DRI config supports native format GR88

is a lot easier to understand.
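
The mapping between the two outputs is the usual fourcc packing, which a
minimal sketch can show. fourcc_to_name is a hypothetical helper, not the
actual GBM function, and trimming the trailing space padding is an assumption
made here for readability.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode a fourcc code into a caller-provided buffer.
 * The four characters are stored least significant byte first. */
static const char *
fourcc_to_name(uint32_t format, char out[5])
{
   out[0] = (char)(format & 0xff);
   out[1] = (char)((format >> 8) & 0xff);
   out[2] = (char)((format >> 16) & 0xff);
   out[3] = (char)((format >> 24) & 0xff);
   out[4] = '\0';

   /* Assumption: drop trailing spaces used as padding (e.g. "R8  "). */
   for (int i = 3; i > 0 && out[i] == ' '; i--)
      out[i] = '\0';

   return out;
}
```

For the two codes above, 0x20203852 decodes to "R8" and 0x38385247 decodes to
"GR88".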

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-11-12 15:20:23 -08:00
Eric Anholt
6328536ff2 gbm: Introduce a helper function for printing GBM format names.
This requires that the caller make a little (stack) allocation to store
the string.

v2: Use gbm_format_canonicalize (suggested by Daniel)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-11-12 15:20:23 -08:00
Eric Anholt
ee7f848c00 gbm: Move gbm_format_canonicalize() to the core.
I want it for the format name debugging code.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-11-12 15:20:23 -08:00
Dylan Baker
4eab98b66e meson: fix libatomic tests
There are two problems:
1) the extra underscore in MISSING_64BIT_ATOMICS
2) we should link with libatomic if the previous test decided we needed
   it

Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
2018-11-12 13:29:00 -08:00
Marek Olšák
32a334777c mesa: mark GL_SR8_EXT non-renderable on GLES
Fixes: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.sr8_ext

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-12 16:19:43 -05:00
Marek Olšák
e0c7114eb3 st/mesa: disable L3 thread pinning
This implementation can have massive drawbacks.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2018-11-12 16:18:15 -05:00
Christian Gmeiner
c6aaafa3a1 nir: add lowering for ffloor
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-12 21:57:25 +01:00
Alyssa Rosenzweig
41c8f99137 util: Fix warning in u_cpu_detect on non-x86
regs is only set and used on x86; on other platforms (like ARM), this
code causes a trivial warning, solved by moving the regs declaration to
the architecture-dependent usage.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2018-11-12 10:28:04 -08:00
Dylan Baker
9c2a95b298 meson: Don't set -Wall
meson does this for you with its warn levels, so we don't need to set
it ourselves.

Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-12 08:55:55 -08:00
Rob Clark
4a0c2cfdd6 freedreno/drm: fix unused 'entry' warnings
Looks like importing libdrm_freedreno into mesa crossed paths with
e27902a261.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-12 10:45:48 -05:00
Lionel Landwerlin
89785e2d56 i965: add support for sampling from AYUV
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-12 13:22:54 +00:00
Lionel Landwerlin
252ca7b43f dri: add AYUV format
v2: Add an AYUV entry in the android backend (Tapani)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-12 13:22:54 +00:00
Lionel Landwerlin
8a15f06d19 nir/lower_tex: Add AYUV lowering support
Byte ordering is :

0: V
1: U
2: Y
3: A
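
A minimal sketch of that byte ordering, using a hypothetical struct ayuv and
unpack_ayuv helper (the actual lowering works on NIR texture results, not on
raw bytes like this):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unpacked pixel. */
struct ayuv {
   uint8_t a, y, u, v;
};

/* Unpack one 4-byte texel laid out as byte 0 = V, 1 = U, 2 = Y, 3 = A. */
static struct ayuv
unpack_ayuv(const uint8_t texel[4])
{
   struct ayuv px = {
      .v = texel[0],
      .u = texel[1],
      .y = texel[2],
      .a = texel[3],
   };
   return px;
}
```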

v2: Split refactoring of alpha channel (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
2018-11-12 13:22:54 +00:00
Lionel Landwerlin
0a30c33e83 nir/lower_tex: add alpha channel parameter for yuv lowering
We're about to introduce AYUV support which provides its own alpha
channel. So give alpha as a parameter and set it to 1 on existing
formats.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-12 13:22:54 +00:00
Samuel Pitoiset
97fb1a02fd radv: make use of num_good_cu_per_sh in si_emit_graphics() too
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-12 09:35:46 +01:00
Samuel Pitoiset
d9d14346c2 radv: clean up setting partial_es_wave for distributed tess on VI
Only needed when the pipeline actually uses tessellation. I don't
think that changes anything, except improving readability.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-12 09:35:44 +01:00
Samuel Pitoiset
cc4569b733 radv: cleanup and document a Hawaii bug with offchip buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-12 09:35:42 +01:00
Hanno Böck
8dc2085baf glsl/test: Fix use after free in test_optpass.
The variable state is free'd and afterwards state->error is used
as the return value, resulting in a use after free bug detected
by memory safety tools like address sanitizer.
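
The general shape of such a fix is to copy the result out before freeing. A
minimal sketch, with a hypothetical struct parse_state and finish() helper
rather than the actual test_optpass code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parser state. */
struct parse_state {
   bool error;
};

static struct parse_state *
parse_state_create(bool error)
{
   struct parse_state *state = malloc(sizeof(*state));
   if (!state)
      abort();
   state->error = error;
   return state;
}

/* Read the field first, then release the allocation. Returning
 * state->error after free(state) would be a use after free. */
static int
finish(struct parse_state *state)
{
   int ret = state->error ? 1 : 0;
   free(state);
   return ret;
}
```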

Signed-off-by: Hanno Böck <hanno@hboeck.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108636
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-12 07:42:58 +02:00
Timothy Arceri
a068958692 nir: don't pack varyings ints with floats unless flat
Fixes: 1c9c42d16b ("nir: add varying component packing helpers")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-12 15:38:56 +11:00
Timothy Arceri
9dd737bb02 nir: add glsl_type_is_integer() helper
Fixes: 1c9c42d16b ("nir: add varying component packing helpers")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-12 15:38:56 +11:00
Francisco Jerez
552642066f intel/fs: Prevent emission of IR instructions not aligned to their own execution size.
This can occur during payload setup of SIMD-split send message
instructions, which can lead to the emission of header setup
instructions with a non-zero channel group and fixed SIMD width.  Such
instructions could end up using undefined channel enable signals
except they don't care since they're always marked force_writemask_all.

Not known to affect correctness of any workload at this point, but it
would be trivial to back-port to stable if something comes up.

Reported-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>
2018-11-09 19:39:22 -08:00
Timothy Arceri
590fcb50e7 st/mesa: make use of nir_link_constant_varyings()
Shader-db results radeonsi (VEGA):

Totals from affected shaders:
SGPRS: 161464 -> 161368 (-0.06 %)
VGPRS: 86904 -> 86292 (-0.70 %)
Spilled SGPRs: 296 -> 314 (6.08 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3618596 -> 3573852 (-1.24 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 26189 -> 26276 (0.33 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-10 11:41:00 +11:00
Timothy Arceri
d40dd05553 nir: add new linking opt nir_link_constant_varyings()
This pass moves constant outputs to the consuming shader stage
where possible.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-10 11:41:00 +11:00
Andre Heider
414470854d st/nine: clean up thread shutdown sequence a bit
Just break out of the loop instead, it does the same thing.

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:27 +01:00
Andre Heider
123bf9cbe7 st/nine: plug thread related leaks
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:27 +01:00
Andre Heider
10598c9667 st/nine: fix stack corruption due to ABI mismatch
This fixes various crashes and hangs when using nine's 'thread_submit'
feature.

On 64bit, the thread function's data argument would just be NULL.
On 32bit, the data argument would be garbage depending on the compiler
flags (in my case -march>=core2).

Fixes: f3fa7e3068 ("st/nine: Use WINE thread for threadpool")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:26 +01:00
Marek Olšák
d2b2364313 radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET only
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
4bec5025ac gallium: add PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
9dc776f3f2 radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2
and add has_dcc_constant_encode.
2018-11-09 14:55:04 -05:00
Marek Olšák
832ab883e2 radeonsi: use better DCC clear codes
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
d059eae269 ac/surface: remove the overallocation workaround for Vega12
not needed anymore (probably since the tile_swizzle fix)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-09 14:55:04 -05:00
Lionel Landwerlin
959e2a5aeb intel/aub_read: remove useless breaks
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-09 18:17:30 +00:00
Erik Faye-Lund
b55af392d9 Revert "mesa: expose NV_conditional_render on GLES"
This reverts commit 5213be9fab.
2018-11-09 17:39:25 +01:00
Erik Faye-Lund
cf8b271cbe Revert "mesa/main: fixup make check after NV_conditional_render for gles"
This reverts commit cccd7a253f.
2018-11-09 17:39:22 +01:00
Erik Faye-Lund
cccd7a253f mesa/main: fixup make check after NV_conditional_render for gles
It seems I missed some details when exposing NV_conditional_render
on GLES; this fixes up "make check".

Fixes: 5213be9fab ("mesa: expose NV_conditional_render on GLES")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-09 16:47:34 +01:00
Nicolai Hähnle
8c97abc066 radv: include LLVM IR in the VK_AMD_shader_info "disassembly"
Helpful for debugging compiler backend problems: this allows us to
easily retrieve the LLVM IR from RenderDoc.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-09 14:54:37 +01:00
Erik Faye-Lund
5213be9fab mesa: expose NV_conditional_render on GLES
The extension spec has been updated to include GLES 2 support, so let's
enable it there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-09 13:03:00 +01:00
Iago Toral Quiroga
35baee5dce nir/constant_folding: fix incorrect bit-size check
nir_alu_type_get_type_size takes a type as parameter and we were
passing a bit-size instead, which did what we wanted by accident,
since a bit-size of zero matches nir_type_invalid, which has a
size of 0 too.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-09 08:22:15 +01:00
Iago Toral Quiroga
6c418dfa42 intel/compiler: fix node interference of simd16 instructions
SIMD16 instructions need to have additional interferences to prevent
source / destination hazards when the source and destination registers
are off by one register.

While we already have code to handle this, it was only running for SIMD16
dispatches. However, we can have SIMD16 instructions in a SIMD8 dispatch.
An example of this are pull constant loads since commit b56fa830c6,
but there are more cases.

This fixes a number of CTS test failures found in work-in-progress
tests that were hitting this situation for 16-wide pull constants
in a SIMD8 program.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-09 08:22:08 +01:00
Roland Scheidegger
a3c898dc97 gallivm: fix improper clamping of vertex index when fetching gs inputs
Because we only have one file_max for the (2d) gs input file, the value
actually represents the max of attrib and vertex index (although I'm
not entirely sure if we really want the max, since the max valid value
of the vertex dimension can be easily deduced from the input primitive).

Thus in cases where the number of inputs is higher than the number of
vertices per prim, we did not properly clamp the vertex index, which
would result in out-of-bound fetches, potentially causing segfaults
(the segfaults seemed actually difficult to trigger, but valgrind
certainly wasn't happy). This might have happened even if the shader
did not actually try to fetch bogus vertices, if the fetching happened
in non-active conditional clauses.

To fix simply use the correct max vertex index value (derived from
the input prim type) instead when clamping for this case.
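
A minimal sketch of that clamping, with a hypothetical enum and helpers (the
real code emits LLVM IR through gallivm and uses Gallium's own primitive
types):

```c
#include <assert.h>

/* Hypothetical subset of geometry shader input primitive types. */
enum input_prim {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_TRIANGLES,
   PRIM_LINES_ADJACENCY,
   PRIM_TRIANGLES_ADJACENCY,
};

/* Number of vertices the geometry shader sees per input primitive. */
static unsigned
vertices_per_prim(enum input_prim prim)
{
   switch (prim) {
   case PRIM_POINTS:              return 1;
   case PRIM_LINES:               return 2;
   case PRIM_TRIANGLES:           return 3;
   case PRIM_LINES_ADJACENCY:     return 4;
   case PRIM_TRIANGLES_ADJACENCY: return 6;
   }
   assert(!"unknown primitive type");
   return 1;
}

/* Clamp against the vertex count of the primitive, not against a
 * maximum shared with the attribute index. */
static unsigned
clamp_vertex_index(unsigned index, enum input_prim prim)
{
   unsigned max_index = vertices_per_prim(prim) - 1;
   return index > max_index ? max_index : index;
}
```

With, say, eight inputs and triangle primitives, an index of 7 would pass a
clamp based on the shared maximum but is clamped to 2 here.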

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-11-09 00:53:03 +01:00
Aditya Swarup
a5c39ed974 i965: Lift restriction in external textures for EGLImage support
Fixes Skqp's unitTest_EGLImageTest test.

For Intel platforms, we support external textures only for EGLImages
created with EGL_EXT_image_dma_buf_import. This restriction seems to
be Intel specific and not present for other platforms.

While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent
to the test because of this restriction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-11-08 12:33:06 -08:00
Ian Romanick
c5a4c26450 glsl: Add pragma to disable all warnings
Use #pragma warning(off) and #pragma warning(on) to disable or enable
all warnings.  This is a big hammer.  If we ever need a smaller hammer,
we can enhance this functionality.

There is one lame thing about this.  Because we parse everything, create
an AST, then convert the AST to GLSL IR, we have to treat the #pragma
like a statement.  This means that you can't do something like

'    void
'    #pragma warning(off)
'    __foo
'    #pragma warning(on)
'    (float param0);

Fixing that would, as far as I can tell, require a huge amount of work.

I did try just handling the #pragma during parsing (like we do for
state for the whole shader.

v2: Fix the #pragma lines in the commit message that git-commit ate.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-11-08 11:00:00 -08:00
Ian Romanick
011abfc963 glsl: Add warning tests for identifiers with __
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-11-08 10:59:53 -08:00
Jason Ekstrand
d28bc35ece intel/fs: Add an assert to optimize_frontfacing_ternary
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
bcc6aab065 anv: Use nir_src_is_const and friends in lowering code
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
52145070c0 intel/analyze_ubo_ranges: Use nir_src_is_const and friends
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
1413512b4c intel/vec4: Use the new nir_src_is_const and friends
As of this commit, all uses of const sources either go through a
nir_src_as_<type> helper which handles bit sizes correctly or else are
accompanied by a nir_src_bit_size() == 32 assertion to assert that we
have the size we think we have.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
61e15348c4 nir: Add a read_mask helper for ALU instructions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:22 -06:00
Jason Ekstrand
344cfe6980 intel/fs: Use the new nir_src_is_const and friends
As of this commit, all uses of const sources either go through a
nir_src_as_<type> helper which handles bit sizes correctly or else are
accompanied by a nir_src_bit_size() == 32 assertion to assert that we
have the size we think we have.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:20 -06:00
Jason Ekstrand
6b2918709a intel/fs,vec4: Clean up a repeated pattern with SSBOs
Everywhere we handle SSBO intrinsics, we have exactly the same pattern
for computing the index so we may as well make a helper for it.  We also
add a get_nir_src_imm to vec4 and use it for SSBO offsets.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:06 -06:00
Samuel Pitoiset
c472ad82e4 radv: fix GPU hangs when loading depth/stencil clear values on SI/CIK
HTILE is supported on these chips, not sure how I missed that.
This restores using PFP_SYNC_ME when LOAD_CONTEXT_REG is not used.

Fixes: f425d9ee74 ("radv: use LOAD_CONTEXT_REG when loading fast clear values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-08 11:20:03 +01:00
Samuel Pitoiset
f425d9ee74 radv: use LOAD_CONTEXT_REG when loading fast clear values
This avoids syncing the Micro Engine. This is only supported
for VI+ currently. There is probably a way for using
LOAD_CONTEXT_REG on previous chips but that could be done later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-08 10:41:45 +01:00
Samuel Pitoiset
0dcd99c687 radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+
Inclusive and exclusive scans are missing because older chips
don't have llvm.amdgcn.update.dpp.

This fixes crashes with dEQP-VK.subgroups.arithmetic.*.

CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-08 10:41:41 +01:00
Adam Jackson
16f1023037 glx: Demand success from CreateContext requests (v2)
GLXCreate{,New}Context, like most X resource creation requests, does not
emit a reply and therefore is emitted into the X stream asynchronously.
However, unlike most resource creation requests, the GLXContext we
return is a handle to library state instead of an XID. So if context
creation fails for any reason - say, the server doesn't support indirect
contexts - then we will fail in strange places for strange reasons.

We could make every GLX entrypoint robust against half-created contexts,
or we could just verify that context creation worked. Reuse the
__glXIsDirect code to do this, as a cheap way of verifying that the
XID is real.

glXCreateContextAttribsARB solves this by using the _checked version of
the xcb command, so effectively this change makes the classic context
creation paths as robust as CreateContextAttribs.

v2: Better use of Bool, check that error != NULL first (Olivier Fourdan)

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-11-07 12:38:05 -05:00
Karol Herbst
f7fae7f64e gm107/ir: fix compile time warning in getTEXSMask
In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)':
warning: control reaches end of non-void function [-Wreturn-type]

Reported-by: Moiman@freenode
Fixes: f821e80213
       "gm107/ir: use scalar tex instructions where possible"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-07 17:48:58 +01:00
Michel Dänzer
32b0eb51a3 winsys/amdgpu: Stop using amdgpu_bo_handle_type_kms_noimport
It only behaves any differently from amdgpu_bo_handle_type_kms with
libdrm 2.4.93, and it breaks if an older version is picked up.

Bugzilla: https://bugs.freedesktop.org/108096
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-07 17:37:47 +01:00
Lionel Landwerlin
792dde66f2 intel/dump_gpu: add platform option
Got tired of remembering the PCI ids.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-07 11:27:41 +00:00
Lionel Landwerlin
e262cc0353 intel/dump_gpu: move output option together
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-07 11:27:38 +00:00
Samuel Pitoiset
0a0aa2ba6c radv: disable conditional rendering for vkCmdCopyQueryPoolResults()
VK_EXT_conditional_rendering says that copy commands should not be
affected by conditional rendering.

Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 11:31:36 +01:00
Samuel Pitoiset
1e7c3379e1 radv: allocate enough space in CS when copying query results with compute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 11:31:34 +01:00
Timothy Arceri
9aa3c1915e ac/nir_to_llvm: fix b2f for f64
Fixes: d7e0d47b9d ("nir: Add a bunch of b2[if] optimizations")

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 16:35:07 +11:00
Karol Herbst
f821e80213 gm107/ir: use scalar tex instructions where possible
TEXS, TLD4 and TLD4S are variants of tex instructions which are more
scalar, which gives RA more freedom and is less likely to insert silly
MOVs to satisfy quad registers.

shader-db changes:
total instructions in shared programs : 7687265 -> 7614782 (-0.94%)
total gprs used in shared programs    : 803620 -> 798045 (-0.69%)
total shared used in shared programs  : 639636 -> 639636 (0.00%)
total local used in shared programs   : 24648 -> 24648 (0.00%)
total bytes used in shared programs   : 82103400 -> 81330696 (-0.94%)

                local     shared        gpr       inst      bytes
    helped           0           0        3648       10647       10647
      hurt           0           0         464         205         205
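
The percentages in the totals above follow directly from the before/after counts. A minimal sketch of that arithmetic, where `pct_change` is a hypothetical helper and not part of shader-db:

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0

# Before/after totals copied from the shader-db report above.
totals = {
    "instructions": (7687265, 7614782),
    "gprs": (803620, 798045),
    "bytes": (82103400, 81330696),
}

for name, (before, after) in totals.items():
    print(f"{name}: {pct_change(before, after):.2f}%")
```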

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
edd6c41751 nv50/ir: add scalar field to TexInstructions
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
8d825f78fc nv50/ra: add condenseDef overloads for partial condenses
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
a4550de434 nv50/ir: print color masks of tex instructions
v2: print the mask for TXG as well
    make the mask to be printed more mask like

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Jason Ekstrand
610061838a vulkan: Update the XML and headers to 1.1.91
The biggest change here is the rename of VK_NVX_ray_tracing to
VK_NV_ray_tracing and the total removal of VK_KHR_mir_surface.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-06 12:21:19 -06:00
Gert Wollny
c171d76b94 r600: Add support for EXT_texture_sRGB_R8
Enables on R600 and makes pass:
  dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
  dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*

v2: remove chunk for dri/radeon (Emil)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-11-06 18:49:02 +01:00
Lionel Landwerlin
421fa01d64 anv/android: mark gralloc allocated BOs as external
Allocating through Gralloc implies buffers are going to be used
outside the driver. We have special MOCS settings for external BOs and
we probably want to use them here too.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a1220e7311 ("anv/android: Set the BO flags in bo_cache_import (v2)")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-06 15:28:07 +00:00
Lionel Landwerlin
b43f955037 anv: stub internal android code
This reduces the amount of #ifdef ANDROID we'll have to have inside
the driver. Potentially offering better coverage of the android
extensions.

v2: Move anv_android.h include before anv_entrypoints.h (Tapani)
    Fix autotools android build (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-06 15:28:07 +00:00
Kristian H. Kristensen
f6131d4ec7 freedreno/a6xx: Clear z32 and separate stencil with blitter
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-11-06 08:56:38 -05:00
Rob Clark
3bbad81c80 freedreno/a6xx: fix VSC bug with larger # of tiles
At higher resolutions with the addition of MSAA, the number of tiles
can increase to the point where we use more than one VSC pipe per
tile.  Which would cause us to calculate an out-of-bounds offset for
VSC_SIZE_ADDRESS.  So don't try to be clever, just always put it at
a fixed offset assuming the max 32 VSC pipes in use.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-06 08:56:21 -05:00
Rob Clark
2d9c3a5db2 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-06 08:43:27 -05:00
Olivier Fourdan
55af17ffed wayland/egl: Resize EGL surface on update buffer for swrast
After commit a9fb331ea ("wayland/egl: update surface size on window
resize"), the surface size is updated as soon as the resize is done, and
`update_buffers()` would resize only if the surface size differs from
the attached size.

However, in the case of swrast, there is no resize callback and the
attached size is updated in `dri2_wl_swrast_commit_backbuffer()` prior
to the `swrast_update_buffers()` so the attached size is always up to
date when it reaches `swrast_update_buffers()` and the surface is never
resized.

This can be observed with "totem" using the GDK backend on Wayland (the
default) when running on software rendering:

  $ LIBGL_ALWAYS_SOFTWARE=true CLUTTER_BACKEND=gdk totem

Resizing the window would leave the EGL surface size unchanged.

To avoid the issue, partially revert the part of commit a9fb331ea for
`swrast_update_buffers()` and resize on the win size and not the
attached size.

Fixes: a9fb331ea - wayland/egl: update surface size on window resize
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
CC: Daniel Stone <daniel@fooishbar.org>
CC: Juan A. Suarez Romero <jasuarez@igalia.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-06 13:59:38 +01:00
Lionel Landwerlin
b47a69ed4c intel/decoders: fix instruction base address parsing
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 00103db04a ("intel: Fix decoding for partial STATE_BASE_ADDRESS updates.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-05 13:22:35 -08:00
Emil Velikov
b3ade65387 egl/glvnd: correctly report errors when vendor cannot be found
If the user provides an invalid display or device the ToVendor lookup
will fail.

In this case, the local [Mesa vendor] error code will be set. Thus on
sequential eglGetError(), the error will be EGL_SUCCESS.

To be more specific, GLVND remembers the last vendor and calls back
into its eglGetError, although there's no guarantee to ever have had
one.

v2:
 - Add _eglError call, so the debug callback is executed (Kyle)
 - Drop XXX comment.

Piglit: tests/egl/spec/egl_ext_device_query
Fixes: ce562f9e3f ("EGL: Implement the libglvnd interface for EGL (v3)")
Cc: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kyle Brenneman <kbrenneman@nvidia.com>
2018-11-05 20:53:05 +00:00
Emil Velikov
2a8fefdeb0 egl: add EGL_EXT_device_base entrypoints
eglQueryDevicesEXT (unlike the other three functions) does not depend
on the display. It is implemented in GLVND, which calls into each
driver collecting the list of devices and presenting it to the user.

For the other entrypoints, GLVND acts as pass through stub calling into
the vendor library. The vendor implementation calls back into GLVND to
get the vendor dispatch. Then the driver proceeds to call itself via
the said dispatch.

This design makes it possible to keep using "old" GLVND with newer
vendor drivers, since effectively all the extension code is within the
latter itself.

Without said entrypoints, any user will outright crash - as reported in
the bug report.

Note: there's a follow-up fix needed to our GLVND code, to make piglit
happy.

v2: add some beefy documentation in the commit message.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108635
Fixes: 7552fcb7b9 ("egl: add base EGL_EXT_device_base implementation")
Reported-by: kyle.devir@mykolab.com
Cc: kyle.devir@mykolab.com
Acked-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-05 20:53:05 +00:00
Emil Velikov
7e169cf2a0 docs: mention EXT_shader_implicit_conversions
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-05 20:53:05 +00:00
Marek Olšák
04298a2f24 st/va: fix incorrect use of resource_destroy
Fixes: 4373dd3215 ("st/va: Support YUV formats in vaCreateSurfaces")
Cc: Drew Davenport <ddavenport@chromium.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-11-05 15:47:50 -05:00
Sergii Romantsov
5aeee1ab15 i965/batch/debug: Allow log be dumped before assert
A message that may show the culprit of an assert is now dumped
before the assert fires, for debug purposes.

Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel G Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-05 09:24:55 -08:00
Lionel Landwerlin
4fd0ff75f3 intel/sanitize_gpu: add debug message on mmap fail
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-05 15:45:08 +00:00
Lionel Landwerlin
e400ac52e4 intel/sanitize_gpu: deal with non page multiple buffer sizes
We can only map at page aligned offsets. We got that wrong with buffer
size where (size % 4096) != 0 (anv has a WA buffer of 1024).
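
One way to read the problem: a buffer whose size is not a page multiple still occupies whole pages, so anything mapped after it has to start at the size rounded up to the next page boundary. A sketch under that reading, assuming 4096-byte pages as in the check above; `align_up` is a hypothetical helper and not the tool's actual code:

```python
PAGE_SIZE = 4096  # page size assumed from the "(size % 4096)" check above

def align_up(size: int, alignment: int = PAGE_SIZE) -> int:
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)

# The 1024-byte anv workaround buffer still spans one full page.
print(align_up(1024))
print(align_up(4096))
print(align_up(4097))
```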

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-05 15:45:07 +00:00
Lionel Landwerlin
c5fca35af1 intel/sanitize_gpu: add help/gdb options to wrapper
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-05 15:45:07 +00:00
Lionel Landwerlin
9ab5089150 intel/dump_gpu: add missing gdb option
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-05 15:43:34 +00:00
Eric Engestrom
d515ded4d9 wsi/wayland: only finish() a successfully init()ed display
Fixes: 4369102498 "vulkan/wsi/wayland: Stop caching Wayland displays"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-11-05 15:29:21 +00:00
Eric Engestrom
dcee22afed wsi/wayland: use proper VkResult type
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 14:55:05 +00:00
Sergii Romantsov
ce837a5372 autotools: library-dependency when no sse and 32-bit
Building 32-bit Mesa may fail if __SSE__ is not defined.
Add the missing dependency on libm.

v2: avoid depending on any flag, just link

v3: meson doesn't fail, but the dependency on libm has been added

CC: Dylan Baker <dylan@pnwbakers.com>
CC: Lionel G Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108560
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-05 13:21:49 +01:00
Samuel Pitoiset
f7fd0d86a9 radv: more use of radv_cp_wait_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 09:48:50 +01:00
Samuel Pitoiset
c571ca7a08 radv: replace si_emit_wait_fence() with radv_cp_wait_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 09:48:50 +01:00
Samuel Pitoiset
b1b2dd06a7 radv: add missing TFB queries support to CmdCopyQueryPoolsResults()
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Fixes: b4eb029062 ("radv: implement VK_EXT_transform_feedback")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 09:48:43 +01:00
Samuel Pitoiset
dc3419195c radv: remove useless sync after copying query results with compute
The spec says:
   "vkCmdCopyQueryPoolResults is considered to be a transfer
    operation, and its writes to buffer memory must be synchronized
    using VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT
    before using the results."

VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle,
while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector
caches and L2. So, it's useless to set those flags internally.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-05 09:47:55 +01:00
Vinson Lee
64a9ed8848 r600/sb: Fix constant logical operand in assert.
Fixes: da977ad907 ("r600/sb: start adding GDS support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-11-04 21:09:55 -08:00
Kenneth Graunke
5d517a599b st/mesa: Don't record garbage streamout information in the non-SSO case.
In the non-SSO case, where multiple shader stages are linked together,
we were recording garbage pipe_stream_output_info structures for all
but the last enabled geometry-processing stage.

Specifically, we were using the gl_transform_feedback_info from
shader_program->last_vert_prog (the stage whose outputs will be
recorded)...but were pairing it with the output varying mappings
from the current shader stage.  For example, in a program with a VS and
GS, the VS's pipe_shader_state would have a pipe_stream_output_info
based on the GS transform feedback info, but the VS output mapping.

This generally worked out okay because only the pipe_stream_output_info
for the last stage really matters - the others can be ignored.  However,
we'd like to avoid confusing the pipe driver.  In particular, my new
driver translates the stream out information to hardware packets at
bind_{vs,tes,gs}_state() time...and was hitting asserts about garbage
varyings that didn't exist.

This patch changes st/mesa to record a blank pipe_stream_output_info
with num_outputs = 0 for all stages prior to last_vert_prog.  The last
one is captured as normal.

(In the fully-SSO case, nothing should change - each program contains
a single shader stage, so last_vert_prog *is* the current shader.)

Tested with llvmpipe (piglit's gpu profile), and freedreno (a3xx,
gpu profile with -t transform.feedback).  Fixes several hundred CTS
tests on my new driver.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-03 23:34:36 -07:00
Kenneth Graunke
b6410a2d22 st/nir: Drop unused parameter from st_nir_assign_uniform_locations().
ARB programs won't have one of these, and we don't use it anyway.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-11-03 23:34:36 -07:00
Kenneth Graunke
5294d65011 st/mesa: Pull nir_lower_wpos_ytransform work into a helper function.
This will let me use it in the ARB program code as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-11-03 23:34:34 -07:00
Kenneth Graunke
424a6052df intel: Use a URB start offset of 0 for disabled stages.
There are some cases where the VS is the only stage enabled, it uses the
entire URB, and the URB is large enough that placing later stages after
the VS exceeds the number of bits for "URB Starting Address".

For example, on Icelake GT2, "varying-packing-simple mat2x4 array" from
Piglit is getting a starting offset of 128 for the GS/HS/DS.  But the
field is only large enough to hold an offset of 127.

i965 doesn't hit any genxml assertions because it's still using the old
OUT_BATCH mechanism.  128 << GEN7_URB_STARTING_ADDRESS_SHIFT (57) == 0,
with the extra bit falling off the end.  So we place the disabled stage
at the beginning of the URB (overlapping with push constants).  This is
likely okay since it's a zero size region (0 entries).

It seems like the Vulkan driver might hit this assertion, however, and
the situation seems harmless.  To work around this, always place
disabled stages at the start of the URB, so the last enabled stage can
fill the remaining space without overflowing the field.
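
The overflow described above can be reproduced with plain integer arithmetic. This sketch assumes a 7-bit "URB Starting Address" field, inferred only from the statement that 127 is the largest offset it can hold; `fits` and `truncate` are hypothetical helpers, not genxml or i965 code:

```python
URB_START_BITS = 7  # assumption: the largest value the field holds is 127

def fits(value: int, bits: int = URB_START_BITS) -> bool:
    """True if value is representable in an unsigned field of this width."""
    return 0 <= value < (1 << bits)

def truncate(value: int, bits: int = URB_START_BITS) -> int:
    """What survives when the value is packed into the field unchecked."""
    return value & ((1 << bits) - 1)

print(fits(127), fits(128))
print(truncate(128))  # the extra bit falls off the end
print(fits(0))        # a disabled stage placed at offset 0 always fits
```

An asserting packer would reject 128 outright, while an unchecked one silently wraps to 0, which matches the difference between the two drivers described in the message.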

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-11-03 23:25:57 -07:00
Mauro Rossi
5c0cff868a android: radv: add libmesa_git_sha1 static dependency
The libmesa_git_sha1 whole static dependency is added to get the git_sha1.h header
and avoid the following build error:

external/mesa/src/amd/vulkan/radv_device.c:46:10:
fatal error: 'git_sha1.h' file not found
         ^
1 error generated.

Fixes: 9d40ec2cf6 ("radv: Add support for VK_KHR_driver_properties.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-03 10:48:45 +01:00
Eric Anholt
0d78c6af0d vc4: Use the normal simulator ioctl path for CL submit as well.
The simulator no longer needs to look back into the gallium structs.
2018-11-02 14:26:38 -07:00
Eric Anholt
c80e267a0a vc4: Maintain a separate GEM mapping of BOs in the simulator.
This will let us avoid looking back into the gallium driver's vc4_bo.
2018-11-02 14:26:38 -07:00
Eric Anholt
645ca269d2 vc4: Take advantage of _mesa_hash_table_remove_key() in the simulator.
2018-11-02 14:26:38 -07:00
Eric Anholt
f32ba7abd7 v3d: Remove the special path for simulation of the submit ioctl.
Now that it doesn't need to find the struct v3d_bos, it can just take the
normal v3d_ioctl() path.
2018-11-02 14:26:38 -07:00
Eric Anholt
df9f574c13 v3d: Maintain a mapping of the GEM buffer in the simulator.
This way we don't need to reach back into the gallium driver code to get
the mapping.
2018-11-02 14:26:38 -07:00
Dylan Baker
7652931d33 meson: link gallium nine with pthreads
In some cases (not building with llvm, which automatically pulls in
pthreads) nine needs to be directly linked with pthreads. Fixes building
on x86 (32 bit) without llvm.

Distro bug: https://bugs.gentoo.org/670094
Fixes: 6b4c7047d5
       ("meson: build gallium nine state_tracker")
Tested-by: Rafal Lalik <rafallalik@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-11-02 13:10:33 -07:00
Anuj Phogat
1c140470ef anv/icl: Disable prefetching of sampler state entries
WA_1606682166:
Incorrect TDL's SSP address shift in SARB for 16:6 & 18:8 modes.
Disable the Sampler state prefetch functionality in the SARB by
programming 0xB000[30] to '1'. This is to be done at boot time and
the feature must remain disabled permanently.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-02 08:34:33 -07:00
Topi Pohjolainen
9a41a10f8a i965/icl: Disable prefetching of sampler state entries
In the same spirit as commit a5889d70f2
"i965/icl: Disable binding table prefetching". Fixes some 110+
intermittent piglit failures with tex-miplevel-selection variants.

WA_1606682166:
Incorrect TDL's SSP address shift in SARB for 16:6 & 18:8 modes.
Disable the Sampler state prefetch functionality in the SARB by
programming 0xB000[30] to '1'. This is to be done at boot time and
the feature must remain disabled permanently.

Anuj: Set SamplerCount = 0 for vs, gs, hs, ds and wm units as well.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-02 08:34:33 -07:00
Jan Vesely
9cab8ccd6c amd: Make vgpr-spilling depend on llvm version
The option was removed in LLVM r345763

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-02 10:32:47 -04:00
Timothy Arceri
769ae9fb7f nir: fix condition propagation when src has a swizzle
We cannot use nir_build_alu() to create the new alu as it has no
way to know how many components of the src we will use. This
results in it guessing the max number of components from one of
its inputs.

Fixes the following CTS tests:

dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_frag
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_geom
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_tessc
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_vert

Fixes: 2975422ceb ("nir: propagates if condition evaluation down some alu chains")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-03 00:44:01 +11:00
Mauro Rossi
b9dec214f5 android: gallium/auxiliary: add include to get u_debug.h header
To avoid build error in u_debug_stack_android.cpp
due to now missing u_debug.h header:

external/mesa/src/gallium/auxiliary/util/u_debug_stack_android.cpp:26:10:
fatal error: 'u_debug.h' file not found
#include "u_debug.h"
         ^
1 error generated.

Fixes: 37db383abb ("util: Move u_debug to utils")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-02 13:31:37 +01:00
Gert Wollny
b710680093 virgl/vtest-winsys: Use virgl version of bind flags
The bind flags defined by mesa/gallium might not always be in sync
with the ones copied to virglrenderer/gallium. Therefore, use the
flags defined in virgl like it is done for all the other calls to
create resources.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-02 11:53:09 +01:00
Gert Wollny
acd2968005 mesa/st: Add support for EXT_texture_sRGB_R8
This only adds support on the Gallium core level, for the drivers
it is likely that additional changes are needed to support the
new texture format and thereby enabling the extension.

Enables on softpipe and makes pass:
  dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*

v2: - add include for getting GL_SR8_EXT
v4: - since the extension is not required don't bother providing
      a fallback (Ilia Mirkin)
    - split patch (2/2) to separate Gallium and mesa/st parts
      (Roland Scheidegger)
    - trim commit message to only contain the history of the patch
      relevant to this part
v5: - don't include GLES headers (required enum has been added to glheader.h)
      (Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-02 11:52:44 +01:00
Gert Wollny
29f0ab2c30 Gallium: Add format PIPE_FORMAT_R8_SRGB
This format is needed to support EXT_texture_sRGB_R8. The patch adds a new
format enum, the format entries in Gallium and svga, the mapping between
sRGB and linear formats, and tests.

  v2: - add mapping to linear format for PIPE_FORMAT_R8_SRGB
  v3: - Add texture format to svga format table since otherwise building
        mesa will fail when this driver is enabled. It was not tested
        whether the extension actually works.
  v4: - svga: remove the SVGA specific format definitions and table entries
        and only correct the location of PIPE_FORMAT_R8_SRGB in the
        format_conversion_table (Ilia Mirkin)
      - Split patch (1/2) to separate Gallium part and mesa/st part.
        (Roland Scheidegger)
      - Trim the commit message to only contain the relevant parts from the
        split.
  v5: - svga: correct location of PIPE_FORMAT_SRGB_R8 (Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-02 11:52:44 +01:00
Gert Wollny
b8e9c6522d mesa/core: Add definitions and translations for EXT_texture_sRGB_R8
v2: - fix format definition line
    - disable  for desktop GL
    - don't add GL_R8_EXT to glext.h since it is already in
      GLES2/gl2ext.h in glext.h and include this header  where needed
      (all Emil)
v3: - swrast: Fill the function table for sRGB_R8
      The size of the function table is checked at compile time and must
      correspond to the number of mesa texture formats.
      dri/swrast being gles-2.0 doesn't support the extension though
v4: - correct format layout comment (Ilia Mirkin)
    - correct logic for accepting GL_RED only textures (in part Ilia Mirkin)
      EXT_texture_sRGB_R8 requires OpenGL ES 3.0 which includes
      ARB_texture_rg/EXT_texture_rg, so one only must check for the first
      when SR8_EXT is really requested.
v5: - add define for GL_SR8_EXT to glheader.h and don't include GLES
      headers  (Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-02 11:52:44 +01:00
Erik Faye-Lund
742dace825 glsl: do not allow implicit casts of unsized array initializers
The GLSL 4.6 specification (section 4.1.14. "Implicit Conversions")
says:

  "There are no implicit array or structure conversions. For
   example, an array of int cannot be implicitly converted to an
   array of float."

So let's add a check in place when assigning array initializers to
implicitly sized arrays, to avoid incorrectly allowing code of the
form:

int[] foo = float[](1.0, 2.0, 3.0)

This fixes the following dEQP test-cases:
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_uint_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_uint_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.uint_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.uint_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_uint_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_uint_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.uint_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.uint_to_float_fragment

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Erik Faye-Lund
6df922f438 mesa/glsl: add support for EXT_shader_implicit_conversions
EXT_shader_implicit_conversions adds support for implicit conversions
for GLES 3.1 and above.

This is essentially a subset of ARB_gpu_shader5, and augments
OES_gpu_shader5.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Erik Faye-Lund
ecab2d6f14 glsl: fall back to inexact function-match
In GLES, we currently either need an exact match with a local function,
or an exact match with a builtin.

However, if we add support for implicit conversions for GLES shaders,
we also need to fall back to a non-exact match in the case where there
was no builtin match either.

Luckily, we already have a variable ready with this, so let's just
return it if the builtin-search failed.
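
The lookup order described above can be sketched as a chain of fallbacks. The names below are hypothetical and the candidates are plain strings; the real matcher works on function signatures, so this only illustrates the ordering:

```python
from typing import Optional

def resolve_call(local_exact: Optional[str],
                 builtin_match: Optional[str],
                 local_inexact: Optional[str]) -> Optional[str]:
    """Prefer an exact local match, then a builtin, then an inexact local match."""
    if local_exact is not None:
        return local_exact
    if builtin_match is not None:
        return builtin_match
    # New fallback: only reached when the builtin search also failed.
    return local_inexact

# No exact local match and no builtin: the inexact candidate is returned.
print(resolve_call(None, None, "foo(float)"))
```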

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Erik Faye-Lund
e975c5b785 glsl: add has_implicit_uint_to_int_conversion()-helper
This makes the code a bit easier to read and reduces repetition,
especially when we add support for EXT_shader_implicit_conversions.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Erik Faye-Lund
12f001f013 glsl: add has_implicit_conversions()-helper
This makes the code a bit easier to read, and will reduce
repetition when we add support for EXT_shader_implicit_conversions.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Mathias Fröhlich
9f009c1a8f mesa: Remove needless indirection in some draw functions.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-02 08:42:03 +01:00
Timothy Arceri
c7bdda8aa5 nir: allow propagation of if evaluation for bcsel
Shader-db results Skylake:

total instructions in shared programs: 13109035 -> 13109024 (<.01%)
instructions in affected programs: 4777 -> 4766 (-0.23%)
helped: 11
HURT: 0

total cycles in shared programs: 332090418 -> 332090443 (<.01%)
cycles in affected programs: 19474 -> 19499 (0.13%)
helped: 6
HURT: 4

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-02 15:56:34 +11:00
Dave Airlie
677b496b6b radv: fix begin/end transform feedback with 0 counter buffers.
If the user gives 0 counterBuffers then the driver should still
enable transform feedback on all targets. This changes the
driver to always enable xfb, and use counter buffers where
one is defined for the target in question.

Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-02 04:15:07 +00:00
Dave Airlie
7f37a52a21 radv: apply xfb buffer offset at buffer binding time not later. (v2)
In order to handle pause/resume properly, the offset should
be added to the buffer binding not to the begin/end paths.

v2: don't add offset to size
Fixes ext_transform_feedback-alignment* under zink

Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-02 04:13:31 +00:00
Mark Janes
5f312e95f8 Revert "i965/batch: avoid reverting batch buffer if saved state is an empty"
This reverts commit a9031bf9b5.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630
2018-11-01 16:28:05 -07:00
Eric Anholt
43a397c580 vc4: Drop the winsys_stride relayout in the simulator
Since 0c1dd9dee0 ("broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride."), we have the vc4-side BO properly laid out
(assuming it's linear) in the winsys BO so that we can skip this extra
copy.
2018-11-01 14:34:02 -07:00
Eric Anholt
4e1b163eed v3d: Update the TLB config for depth writes on V3D 4.2.
Fixes 311 piglit cases on the simulator.
2018-11-01 13:56:30 -07:00
Eric Anholt
4018eb04e8 v3d: Use the TLB R/B swapping instead of recompiles when available.
The recompile reduction is nice, but this also makes it so that a straight
texture copy could get optimized some day to not unpack/repack the f16
values.
2018-11-01 13:56:30 -07:00
Eric Anholt
3923cf626d v3d: Take advantage of _mesa_hash_table_remove_key() in the simulator.
2018-11-01 13:54:36 -07:00
Eric Anholt
47586ab569 v3d: Respect user-passed strides for BO imports.
If the caller has passed in a stride for (linear) BO import, we should use
that stride when rendering to the BO (or, if we some day support texturing
from linear-imported BOs, when doing the linear-to-UIF shadow copy).  This
lets us remove the extra stride-changing relayout in the simulator.
2018-11-01 13:54:36 -07:00
Eric Anholt
5313fb8abd v3d: Drop #if 0-ed out v3d_dump_to_file().
This came from vc4, where we had a file format for GPU hangs.  I don't
have one of those for V3D, and I probably won't ever have the simulator
side produce dumps even if I do.
2018-11-01 13:54:36 -07:00
Eric Anholt
d3f66c385b v3d: Fix a typo in a comment in job handling.
2018-11-01 13:54:36 -07:00
Eric Anholt
b93fc160f4 v3d: Fix a copy-and-paste comment in the simulator code.
2018-11-01 13:54:36 -07:00
Anuj Phogat
13c955182f anv/icl: Set Error Detection Behavior Control Bit in L3CNTLREG
The default setting of this bit is not the desirable behavior.
WA_1406697149

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-01 12:00:23 -07:00
Anuj Phogat
b3d6937fb0 i965/icl: Set Error Detection Behavior Control Bit in L3CNTLREG
The default setting of this bit is not the desirable behavior.
WA_1406697149

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-01 12:00:23 -07:00
Emil Velikov
ac95a0e024 docs: add 19.0.0-devel release notes template
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-01 18:56:54 +00:00
Emil Velikov
97c73c9174 mesa: bump version to 19.0.0-devel
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-01 18:54:02 +00:00
Dylan Baker
1f41104b9b meson: don't install translation files
Tested-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 7834926a4f
       ("meson: add support for generating translation mo files")
2018-11-01 10:49:16 -07:00
Eric Engestrom
4da169d368 egl: use the LC_ALL hammer instead of LANG
Some environments (like Travis apparently) set LC_* vars, messing up the
sort ordering, so let's use the envvar with the highest priority to make
sure this is actually sorted in ASCII order.
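
To see why the locale matters for a sort check: a byte-wise comparison (the C locale) places uppercase letters before lowercase ones, while many locales compare case-insensitively first. The sketch below approximates a locale-aware order with a case-folded key, which is an assumption about typical collation rather than an exact model of any particular locale; the two names are EGL entrypoints used purely as sample data:

```python
names = ["eglQueryDevicesEXT", "eglQueryDeviceStringEXT"]

# Code-point order, which is what `sort` produces under LC_ALL=C.
ascii_order = sorted(names)

# Rough stand-in for a case-insensitive locale collation.
folded_order = sorted(names, key=str.lower)

print(ascii_order)
print(folded_order)
print(ascii_order == folded_order)
```

The two orders disagree at the first character after "eglQueryDevice", where "S" and "s" are compared, so a list sorted under one setting fails a sortedness check run under the other.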

Suggested-by: Michel Dänzer <michel@daenzer.net>
Fixes: b42dc50a5f "egl: fix entrypoint sorting test"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-11-01 17:25:08 +00:00
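
The reasoning in the LC_ALL commit above rests on the POSIX precedence of locale variables: LC_ALL overrides the per-category LC_* variables, which in turn override LANG. A minimal sketch of that lookup for the collation category follows; the helper name effective_collate_locale is hypothetical, and the fallback to "C" is the typical default rather than something the commit states.

```python
def effective_collate_locale(env):
    """Return the locale that governs sort order, per POSIX precedence."""
    # LC_ALL overrides every category; LC_COLLATE overrides LANG.
    for name in ("LC_ALL", "LC_COLLATE", "LANG"):
        value = env.get(name)
        if value:  # unset or empty variables are skipped
            return value
    return "C"  # assumed default when nothing is set


# An environment that sets LC_COLLATE defeats LANG=C ...
ci_env = {"LANG": "C", "LC_COLLATE": "en_US.UTF-8"}
print(effective_collate_locale(ci_env))

# ... but LC_ALL=C wins over both.
ci_env["LC_ALL"] = "C"
print(effective_collate_locale(ci_env))
```

This is why setting only LANG is not enough when the surrounding environment already exports LC_* variables.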
Eric Engestrom
b42dc50a5f egl: fix entrypoint sorting test
Fixes: 68dc591af1 "egl: Fix eglentrypoint.h sort order."
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 15:45:26 +00:00
Andrii Simiklit
fc3cecda8c intel/tools: fix resource leak
Some memory and file descriptors are not freed/closed.

v2: fixed case where we skipped the 'aub' variable initialization

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-01 13:21:07 +00:00
Jonathan Gray
ae8e81b0e3 intel/tools: include stdarg.h in error2aub
Include stdarg.h in error2aub.c otherwise it fails to build on
OpenBSD due to not finding definitions for va_list va_start va_end.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-01 10:27:26 +00:00
Mathias Fröhlich
68dc591af1 egl: Fix eglentrypoint.h sort order.
Fixes a make check failure.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108617
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 10:56:21 +01:00
Samuel Pitoiset
9cbdcc86b7 radv: set PA_SU_PRIM_FILTER_CNTL optimally
Ported from RadeonSI. It's always TRUE for CIK+ because RADV
doesn't support 16 samples.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-01 08:49:15 +01:00
Samuel Pitoiset
85010585cd radv: only enable gl_SampleMask if MSAA is enabled too
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-01 08:49:11 +01:00
Samuel Pitoiset
0c08074cef radv: use radeon_info::num_good_cu_per_sh
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-01 08:49:08 +01:00
Samuel Pitoiset
9278089d05 ac/nir: make use of i1false in few more places
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-01 08:49:05 +01:00
Samuel Pitoiset
79410b1e87 radv: add support for Raven2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-01 08:48:52 +01:00
Mathias Fröhlich
ad52e19408 mesa: Collect all the draw functions in draw.{h,c}.
Some of these functions were distributed across different
implementation and header files. Put them at a central place.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
3d64f3c795 mesa/vbo: Move _vbo_draw_indirect -> _mesa_draw_indirect
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
f726c61cc1 mesa/vbo: Move src/mesa/vbo/vbo_exec_array.c -> src/mesa/main/draw.c
The array type draw is no longer directly dependent on the vbo module.
Thus move array type draws into mesa/main/draw.c.
Rename symbols starting with vbo_* to _mesa_* and apply some
reindenting to make it consistent.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
952a5da584 vbo: Pull the _mesa_set_draw_vao calls out of the if clauses.
These calls are just the same in each if branch. So pull that
before the if.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
b00cb994ef vbo: Preserve vbo_save::no_current_update on primitive restart.
With this change we preserve the no_current_update property when we
observe a glPrimitiveRestart call. That means that we now also get the
no_current_update optimization for display lists that are made
out of indexed draws using primitive restart.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
f2a52b3c25 vbo: Make no_current_update an argument to vbo_save_NotifyBegin.
Instead of coding additional information into the primitive
mode, make the only remaining flag there a direct argument to
vbo_save_NotifyBegin.

v2: Fix incorrect no_current_update in glRectf.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
b899f5e59c vbo: Move no_current_update out of _mesa_prim.
The _mesa_prim::no_current_update flag should tell the compiled
display list if the current attributes that are placed in the dlists
vbo shall take a defined state past replay of a display list.
Immediate mode draws compiled into display lists should set the
current values. Array draws may leave the current values in
undefined state.
So finally this flag is not a property of every primitive
but it is a property of the compiled display list and there it
is a property of the last primitive compiled into the list.
So move the flag out of _mesa_prim into vbo_save.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
eae4ee9419 vbo: Remove the now unused VBO_SAVE_PRIM_WEAK define.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
873adb06fa vbo: Remove the always false branch dlist replay.
The previous patch left a constant if (0) in the code.
Clean that up now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
1387b4d533 vbo: Test for VBO_SAVE_PRIM_WEAK in _mesa_prim::mode is false.
When setting the _mesa_prim::mode field we always filter out
all non OpenGL primitive mode bits. So this tested bit cannot be
there anymore and the test evaluates to zero.
The zero is removed with the next patch to ease review.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
cee0dd8d5a vbo: Remove VBO_SAVE_PRIM_WEAK from vbo_save_NotifyBegin calls.
Now looking at the implementation of vbo_save_NotifyBegin.
The VBO_SAVE_PRIM_WEAK flag, delivered in the primitive mode
argument to vbo_save_NotifyBegin, is not evaluated anymore.
The two users of the mode argument are the primitive mode
itself, where the VBO_SAVE_PRIM_WEAK bit is masked out to
retrieve the underlying OpenGL primitive mode. The other
user is to check for the VBO_SAVE_PRIM_NO_CURRENT_UPDATE bit
which is different from VBO_SAVE_PRIM_WEAK.
So, since vbo_save_NotifyBegin does not care about
VBO_SAVE_PRIM_WEAK, we can safely remove it from the call
arguments of vbo_save_NotifyBegin.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
b632c072b2 vbo: Remove set but not used weak field from _mesa_prim.
The only reader of the weak field in _mesa_prim is the console
pretty-printing code. So remove the weak field from _mesa_prim.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
2dc951b7c3 vbo: Remove the VBO_SAVE_FALLBACK flag.
On finishing a display list playback the VBO_SAVE_FALLBACK bit
is still kept in vbo_save_context::replay_flags. But examining
replay_flags and the display list flags that feed this value shows that
the corresponding bit is never set anymore.
So, since it is nowhere set or checked, we can safely remove it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Mathias Fröhlich
5b41504f66 vbo: Remove unused vbo_save_fallback function.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 06:08:49 +01:00
Emil Velikov
075f92b2b7 docs/relnotes: add the EGL Device extensions
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-01 00:05:43 +00:00
Emil Velikov
83c7fbb4e4 meson: egl: group dri2 bits separately from haiku
One cannot have haiku and dri2 - surfaceless,x11,etc.

Group things up, which will make the addition of platform_device a bit
easier.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-01 00:05:43 +00:00
Emil Velikov
c7cc135e23 egl: enable EGL_EXT_device_{base,enumeration,query}
Now that we fully support the extensions, enable them.

The specs mandate that we always have at least one device and each dpy
has a device associated with it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 00:05:43 +00:00
Emil Velikov
00992700c9 egl: set the EGLDevice when creating a display
This is the final requirement from the base EGLDevice spec.

v2:
 - split from another patch
 - move wayland hunk after we have the fd

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 00:05:43 +00:00
Emil Velikov
dbb4457d98 egl: add EGL_EXT_device_drm support
Add an implementation based around the drmDevice API. As such, it's
only available when building with libdrm, which is already a
requirement when using !SW code paths in the platform code.

Note: the current code will work if a device is hot-plugged. Hot-unplug,
however, is not implemented, since I have no way of testing it.

v2:
 - add some _eglDeviceSupports checks
 - require DRM_NODE_RENDER
 - add _eglGetDRMDeviceRenderNode helper

v3:
 - flip inverted asserts (Mathias)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 00:05:43 +00:00
Emil Velikov
f73c5d27c1 egl: add EGL_MESA_device_software support
Add a plain software device, which is always available.

We can safely assign it as the first/initial device in _eglGlobals,
although we ensure that's the case with a handful of _eglDeviceSupports
checks throughout the code.

v2:
 - s/_eglFindDevice/_eglAddDevice/ (Eric)
 - s/_eglLookupAllDevices/_eglRefreshDeviceList/ (Eric)
 - move ^^ helpers into an earlier patch (Eric, Mathias)
 - set the SW device on _eglGlobal init. (Eric)
 - add a number of _eglDeviceSupports checks (Mathias)
 - split Device/Display attach to a separate patch

v3:
 - flip inverted asserts (Mathias)
 - s/on-stack/static/ (Mathias)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 00:05:43 +00:00
Adam Jackson
3f08e500c4 specs: Add EGL_MESA_device_software
The device extension string is expected to contain the name of the
extension defining what kind of device it is, so the caller can know
what kinds of operations it can perform with it. So that string had
better be non-empty, hence this trivial extension.

v2:
 - drop "fallback", update history and update contributor list

Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 00:05:43 +00:00
Emil Velikov
7552fcb7b9 egl: add base EGL_EXT_device_base implementation
Introduce the API for device query and enumeration. Those at the moment
produce nothing useful since zero devices are actually available.

That contradicts with the spec, so the extension isn't advertised just
yet.

With later commits we'll add support for software (always) and hardware
devices. Each one exposing the respective extension string.

v2:
 - fold API boilerplate into this patch
 - move _eglAddDevice, _eglDeviceSupports, _eglRefreshDeviceList to this
patch (Eric, Mathias)
 - make _eglFiniDevice the one called last

v3:
 - comment on the dummy _egl_device_extension enum entry (Eric)
 - annotate dev as MAYBE_UNUSED (Mathias)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-01 00:05:43 +00:00
Emil Velikov
e55c1bcb08 glx: be explicit about when mapping X <> GLX visuals
Write down both X and GLX visual types when mapping from one to the
other. Makes grepping through the code a tiny bit easier.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-11-01 00:05:43 +00:00
Emil Velikov
833e3cad19 glx: remove unused __glXPreferEGL() declaration
The function definition is no longer around, drop the useless declaration.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-11-01 00:05:43 +00:00
Emil Velikov
4428eed896 travis: use mako for python2
An earlier commit flipped the default to python2 but forgot to update the
travis file. Thanks to pip caching, things "worked" for a little while.

Fixes: f22ad5ef18 ("travis: use python3 for the autoconf builds")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-01 00:05:43 +00:00
Dave Airlie
fcf15a007d radv/xfb: don't increase offset by component mask start.
This is incorrect, the offset is into the buffer, and it's legal
to write

loc 0,0 -> buffer0, offset 0
loc 0,1 -> buffer1, offset 0

This fixes a bunch of piglits running on my zink xfb code on
radv.

Fixes: 6c21645046 (radv: emit stream outputs for vertex and tessellation stages)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-31 23:48:10 +00:00
Dylan Baker
d25179469b util/gen_xmlpool: Make use of python's foreach loop
Instead of using a while loop with indexing. This is much cleaner. This
requires some other small changes.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:46 -07:00
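
To illustrate the loop change described in the commit above, here is a small sketch contrasting an index-driven while loop with direct iteration. The data is a stand-in and is not taken from gen_xmlpool; the variable names are made up for the example.

```python
lines = ["alpha\n", "beta\n", "gamma\n"]  # stand-in data, not gen_xmlpool's input

# Index-based loop, the style the commit moves away from.
stripped_while = []
i = 0
while i < len(lines):
    stripped_while.append(lines[i].rstrip("\n"))
    i += 1

# Direct iteration: no counter to initialise or increment.
stripped_for = []
for line in lines:
    stripped_for.append(line.rstrip("\n"))

# enumerate() covers the cases where the index is still needed.
numbered = []
for number, line in enumerate(lines, start=1):
    numbered.append((number, line.rstrip("\n")))
```

Both loops produce the same list; the second form simply removes the bookkeeping that an off-by-one error could hide in.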
Dylan Baker
465cfcb266 util/gen_xmlpool: Don't use len to test for container emptiness
This is a very common python anti-pattern. Not using length allows us to
go through faster C paths, but has the same meaning.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:46 -07:00
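
The emptiness idiom mentioned in the commit above can be sketched as follows. The function describe is hypothetical and only serves to show the truthiness test; note that this form also treats None as empty, which may or may not be wanted in a given caller.

```python
def describe(items):
    # Truthiness: empty containers are falsy, so no len() call is needed
    # just to decide whether there is anything to process.
    if not items:
        return "empty"
    return "has %d item(s)" % len(items)


print(describe([]))
print(describe(["de", "fr"]))
```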
Dylan Baker
b9cd81ea31 util/gen_xmlpool: Don't write via shell redirection
Using shell redirection to write to a file is more complicated than
necessary, and has the potential to run into unicode encoding problems.
It's also less code.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108530

v2: - update commit message to say less about LANG=C
    - use flags instead of positional arguments for the script (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:46 -07:00
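
As a rough sketch of the approach in the commit above, a script can open its own output file with an explicit encoding instead of printing to stdout and letting the shell redirect it. The helper write_output and the file name are hypothetical; the real script's flags and file names are not reproduced here.

```python
import io
import os
import tempfile


def write_output(path, text):
    # Encode explicitly instead of relying on the locale's idea of the
    # output encoding, which is what stdout redirection would use.
    with io.open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(text)


out_path = os.path.join(tempfile.mkdtemp(), "options.h")  # hypothetical name
write_output(out_path, u"/* D\u00e4nzer */\n")

# Read the bytes back to confirm the non-ASCII character was UTF-8 encoded.
with io.open(out_path, "rb") as f:
    raw = f.read()
```

Writing the file directly keeps the encoding decision inside the script, so the result does not depend on LANG or LC_* in the build environment.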
Dylan Baker
1df086662a util/gen_xmlpool: use with statement to open file
Which ensures it is closed at the end of the scope.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:12 -07:00
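
The guarantee the commit above relies on can be checked with a short sketch: a file opened in a with statement is closed when the block exits, including when it exits through an exception. The temporary file and variable names are illustrative only.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(text=True)
os.close(fd)  # only the path is needed here

with open(path, "w") as f:
    f.write("hello\n")
    inside = f.closed   # still open within the block

after = f.closed        # closed once the block has exited

# The file is also closed when the block exits through an exception.
try:
    with open(path) as g:
        raise RuntimeError("simulated failure")
except RuntimeError:
    closed_after_error = g.closed

os.remove(path)
```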
Dylan Baker
bc4a7645e4 util/gen_xmlpool: use a main function
Again, just good style

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:12 -07:00
Dylan Baker
187fad5c0b util/gen_xmlpool: Use print function instead of sys.stderr.write
This ensures that stderr is flushed, unlike writing

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:12 -07:00
Dylan Baker
2c2aa98ee7 util/gen_xmlpool: Use more standard style
gen_xmlpool uses a style unlike the rest of mesa, spaces between
function/method calls and the parens, strange whitespace to force lining
up method calls, and some other whitespace stuff. Since I'm going to be
doing some work in the file, I'm going to start cleaning those up.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:12 -07:00
Dylan Baker
a8004ef03e docs/meson: Add note about update translations
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:12 -07:00
Dylan Baker
0621e91a8c util/xmlpool: Update for meson generation
Meson won't put the .gmo files in the layout that python's
gettext.translation() expects, it puts them in the build directory in a
flat layout. This modifies android and autotools to do the same (scons
doesn't work with translations at all)

v3: - Squash 4 patches into this patch

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:12 -07:00
Dylan Baker
7834926a4f meson: add support for generating translation mo files
Meson has a handy built-in module for handling gettext called i18n.
This module works a bit differently than our autotools build does:
it doesn't automatically generate translations, but instead creates
3 new top-level targets to run. These are:

xmlpool-pot
xmlpool-update-po
xmlpool-gmo

v2: - Add new files to autotools dist tarball

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 16:37:12 -07:00
Dylan Baker
2857b18991 util/gen_xmlpool: use argparse for argument handling
This is a little cleaner than just looking at sys.argv, but it's also
going to allow us to handle the differences in the way meson and
autotools handle translations more cleanly.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-31 16:37:12 -07:00
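
A minimal sketch of what argparse-based handling looks like, compared with indexing into sys.argv, is given below. The flag names (--template, --localedir, --languages) are assumptions chosen for illustration and should not be read as the script's actual interface.

```python
import argparse


def parse_args(argv=None):
    # Flag names here are illustrative, not taken from gen_xmlpool.py.
    parser = argparse.ArgumentParser(description="argparse sketch")
    parser.add_argument("--template", required=True,
                        help="path to the template header")
    parser.add_argument("--localedir", required=True,
                        help="directory holding compiled translations")
    parser.add_argument("--languages", nargs="*", default=[],
                        help="language codes to include")
    return parser.parse_args(argv)


args = parse_args(["--template", "t_options.h",
                   "--localedir", "build/po",
                   "--languages", "de", "fr"])
print(args.template, args.localedir, args.languages)
```

Named flags let two build systems pass different sets of arguments without the script having to guess from argument positions.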
Timothy Arceri
5b757b4097 nir: fix if condition propagation for alu use
We need to update the cursor before we check if the alu use is
dominated by the if condition. Previously we were checking if
the current location of the alu instruction was dominated by
the if condition which would miss some optimisation opportunities.

Fixes: a3b4cb3458 ("nir/opt_if: Rework condition propagation")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-01 09:22:55 +11:00
Vinson Lee
802ae533ab freedreno: Do not link ir3_compiler with valgrind libraries.
This patch fixes this freedreno autotools build error.

  CXXLD    ir3_compiler
/usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_main.o): In function `_start':
(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
/usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_main.o): relocation R_X86_64_32S against undefined symbol `vgPlain_interim_stack' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_trampoline.o): relocation R_X86_64_32 against `.text' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-dispatch-amd64-linux.o): relocation R_X86_64_32S against symbol `vgPlain_stats__n_xindirs_32' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status

Fixes: f3cc0d2747 ("freedreno: import libdrm_freedreno + redesign submit")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108595
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-31 15:05:28 -07:00
Emil Velikov
f22ad5ef18 travis: use python3 for the autoconf builds
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-31 19:16:00 +00:00
Emil Velikov
986033a275 configure: allow building with python3
Pretty much all of the scripts are python2+3 compatible.
Check and allow using python3, while adjusting the PYTHON2 refs.

Note:
 - python3.4 is used as it's the earliest supported version
 - python2 chosen prior to python3

v2: use python2 by default

Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-31 19:15:50 +00:00
Juan A. Suarez Romero
6d7d3dbda5 docs: update calendar, add new item and link release notes for 18.2.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-10-31 19:58:00 +01:00
Juan A. Suarez Romero
5b074c756e docs: add sha256 checksums for 18.2.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 624e384ea8)
2018-10-31 19:55:28 +01:00
Juan A. Suarez Romero
7c2239aa55 docs: add release notes for 18.2.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 1cdef5e70c)
2018-10-31 19:55:25 +01:00
Eric Engestrom
091da79bb0 meson: hide warnings from external project gtest
gtest is an external project that is copied in this tree for technical
reasons, but isn't maintained by us, so its warnings are irrelevant.

Cc: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-31 18:20:25 +00:00
Eric Engestrom
455a3cd515 tools/imgui: disable all warnings
This is an external project we have no control over, and will not be
fixing (other than by sometimes pulling the latest sources), so warnings
serve no purpose here.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-31 16:28:33 +00:00
Alejandro Piñeiro
95b8da22cf glspirv: no need to force entrypoint name to "main"
Since commit "intel/compiler: Stop assuming the entrypoint is called
"main"" there is no need to force the entrypoint name to be "main".

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-31 15:57:23 +01:00
Tapani Pälli
27f1298b9d glsl/linker: validate attribute aliasing before optimizations
Patch does a 'dry run' of assign_attribute_or_color_locations before
optimizations to catch cases where we have aliasing of unused attributes
which is forbidden by the GLSL ES 3.x specifications.

We need to run this pass before unused attributes may be removed and with
attribute binding information from program, therefore we re-use existing
pass in linker rather than attempt to write another one.

This fixes WebGL2 test 'gl-bindAttribLocation-aliasing-inactive' and
Piglit test 'gles-3.0-attribute-aliasing'.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106833
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-31 14:53:47 +02:00
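
The kind of check described in the commit above can be sketched independently of the linker. The idea is to look at every attribute binding, including attributes that optimization would later remove, and report locations claimed by more than one attribute. The function name, the (location, slots) representation, and the attribute names are all hypothetical; this is not the linker's code.

```python
from collections import defaultdict


def find_aliased_locations(bindings):
    """bindings maps attribute name -> (first location, number of slots).

    Returns a dict of location -> names for every location that is
    claimed by more than one attribute.
    """
    owners = defaultdict(list)
    for name, (location, slots) in sorted(bindings.items()):
        for slot in range(location, location + slots):
            owners[slot].append(name)
    return {slot: names for slot, names in owners.items() if len(names) > 1}


# a_unused is never read by the shader, but it still aliases a_position.
bindings = {"a_position": (0, 1), "a_unused": (0, 1), "a_matrix": (1, 4)}
aliased = find_aliased_locations(bindings)
print(aliased)
```

Running such a check before dead attributes are eliminated is what allows the unused-but-aliased case to be caught at all.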
Eric Engestrom
a96749b13c egl: drop EGL driver name
This is a revert of Marek's 2cb9ab53dd revert.
It was needed to revert the previous commit, and didn't have any issue
itself.
--

The "DRI2" name was reported as confusing when printing EGL infos (one
user reported thinking DRI3 was not working on his X server), and the
only alternative is Haiku, which can only be used on a Haiku machine.

The name therefore doesn't add any information that the user wouldn't
know already, so let's just drop it.

Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Related-to: b174a1ae72 ("egl: Simplify the "driver" interface")
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 11:01:54 +00:00
Eric Engestrom
cb0980e69a egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}
This is a revert of Marek's 84f3afc2e1 revert, with a missing
line added back. I failed a rebase and dropped that crucial line, and
didn't do a runtime test after my rebase, and as a result broke EGL for
everyone.
This commit has been tested by Intel's CI and I re-read it once more, so
it should be good this time.
--

Note: dropping the EGL_BAD_ALLOC in egl_haiku because it's
overwritten by the EGL_NOT_INITIALIZED in eglInitialize().

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-31 11:01:54 +00:00
Christian Gmeiner
21d9b78289 Revert "imx: make use of loader_open_render_node(..) helper"
This reverts commit 773d6ea6e7.

Since kernel 4.17 (drm/etnaviv: remove the need for a gpu-subsystem DT
node) the etnaviv DRM driver doesn't have an associated DT node
anymore. This is technically correct, as the etnaviv device is a
virtual device driving multiple hardware devices.

Before 4.17 the userspace had access to the following information:
DRIVER=etnaviv
OF_NAME=gpu-subsystem
OF_FULLNAME=/gpu-subsystem
OF_COMPATIBLE_0=fsl,imx-gpu-subsystem
OF_COMPATIBLE_N=1
MODALIAS=of:Ngpu-subsystemT<NULL>Cfsl,imx-gpu-subsystem
DRIVER=imx-drm
OF_NAME=display-subsystem
OF_FULLNAME=/display-subsystem
OF_COMPATIBLE_0=fsl,imx-display-subsystem
OF_COMPATIBLE_N=1

After 4.17:
DRIVER=etnaviv
MODALIAS=platform:etnaviv

The OF node has never been part of the etnaviv UABI, simply due to the
fact that it's still possible to instantiate the etnaviv driver from a
platform file, instead of a devicetree node.

A patch set to fix this problem was sent out [1], but it looks like
a proper solution needs more time to bake.

[1] https://lists.freedesktop.org/archives/dri-devel/2018-October/194651.html

Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-10-31 09:41:26 +01:00
Samuel Pitoiset
9ef8ea1451 radv: use WAIT_REG_MEM_GREATER_OR_EQUAL instead of a magic value
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Samuel Pitoiset
a9a56f47f8 radv: use pool->stride when calling radv_query_shader()
Not needed to recompute the stride.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Samuel Pitoiset
e60ab66e33 radv: rename some parameters in Cmd{Begin,End}TransformFeedbackEXT()
To match latest spec.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Samuel Pitoiset
57982b683b radv/winsys: do not assign last submission when chained path failed
I don't think we want to wait for something that hasn't been
correctly submitted. This is similar to the fallback path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Samuel Pitoiset
ae3aecd07f radv/winsys: fix buffer deletion in the sysmem path
In case we failed to submit the CS correctly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Samuel Pitoiset
72877865d9 radv/winsys: cleanup the chained submission path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Samuel Pitoiset
d12dd16a97 radv/winsys: remove unused surface_best()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Jason Ekstrand
d3a0d8b750 intel/compiler: Stop assuming the entrypoint is called "main"
This isn't true for Vulkan so we have to whack it to "main" in anv which
is silly.  Instead of walking the list of functions and asserting that
everything is named "main" and hoping there's only one function named
"main", just use the nir_shader_get_entrypoint() helper which has better
assertions anyway.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-30 20:14:52 -05:00
Timothy Arceri
31596836fc st/glsl_to_nir: fix next_stage gathering
ffs() just returns the bit that is set, we need to know what
stage that bit represents so use u_bit_scan() instead.

Fixes: 2ca5d9548f ("st/glsl_to_nir: gather next_stage in shader_info")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-31 09:33:17 +11:00
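
One way to read the distinction drawn in the commit above: C's ffs() reports a 1-based bit position, whereas a bit-scan helper in the style of u_bit_scan() yields the 0-based index of the lowest set bit and clears it from the mask. The Python sketch below mimics both; the stage table is a hypothetical numbering for illustration, and the exact failure in the original code is not reproduced here.

```python
def ffs(mask):
    """Like C ffs(): 1-based position of the lowest set bit, 0 if none."""
    return (mask & -mask).bit_length()


def bit_scan(mask):
    """Return (index, remaining): the 0-based index of the lowest set bit
    and the mask with that bit cleared. The C helper updates the mask
    through a pointer instead of returning it."""
    index = ffs(mask) - 1
    return index, mask & ~(1 << index)


# Hypothetical stage numbering for illustration only.
STAGES = ["vertex", "tess_ctrl", "tess_eval", "geometry", "fragment"]

mask = (1 << 3) | (1 << 4)        # geometry and fragment present
index, rest = bit_scan(mask)
next_stage = STAGES[index]
print(ffs(mask), index, next_stage)
```

Using the raw ffs() result as a stage index would be off by one here, which is the sort of mismatch a 0-based scan avoids.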
Timothy Arceri
9ec4a5ef29 st/mesa: calculate buffer size correctly for packed uniforms
Fixes: edded12376 ("mesa: rework ParameterList to allow packing")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-31 09:32:41 +11:00
Dylan Baker
fb02bd3d1c util: move u_cpu_detect to util
CC: vlee@freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107870
Fixes: 80825abb5d
       ("move u_math to src/util")
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
37db383abb util: Move u_debug to utils
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
2fd5dff7e7 util: Move os_misc to util
this is needed by u_debug

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
f1f104e548 gallium/util: remove u_inlines.h from u_debug.c
It's not used, and I'm not pulling u_inlines into src/util.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
59d494c1cc gallium/util: remove p_format.h from u_debug.h
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
314777e86a gallium/util: move memory debug declarations into u_debug_gallium
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
68074dfa0e gallium/util: move debug_print_transfer_flags to u_debug_gallium
This also appears to be unused.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
fc39dc9841 gallium/util: move debug_print_bind_flags to u_debug_gallium
This also appears to be unused.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
e4f1fea821 gallium/util: move debug_print_usage_enum to the u_debug_gallium
This isn't used in mesa, maybe vmware uses this in a closed source state
tracker?

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
078b3cdb34 gallium/util: start splitting u_debug into generic and gallium specific components
In order to pull u_debug into src/util we need to break the generically
useful bits from the bits that are tightly coupled to gallium.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
389d59c72a gallium: split u_prim_name out of u_debug.h
This allows us to pull u_prim.h out of u_debug.h

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Andre Heider
25a3ce97d5 gallium/hud: fix power sensor readings for amdgpu users
amdgpu doesn't use the INPUT but the AVERAGE subfeature:

$ sensors -u
amdgpu-pci-0100
Adapter: PCI adapter
power1:
  power1_average: 17.233
  power1_cap: 180.000

Signed-off-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 16:30:32 -04:00
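
The fallback described in the commit above can be sketched against the quoted `sensors -u` output: prefer the _input subfeature and fall back to _average when it is absent. This is a text-parsing illustration only; the HUD itself does not parse text like this, and the function read_power is hypothetical.

```python
SAMPLE = """\
amdgpu-pci-0100
Adapter: PCI adapter
power1:
  power1_average: 17.233
  power1_cap: 180.000
"""


def read_power(text, feature="power1"):
    """Return the reading, preferring <feature>_input and falling back
    to <feature>_average; None if neither subfeature is present."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Only subfeature lines such as "power1_average: 17.233" qualify.
        if sep and key.startswith(feature + "_"):
            values[key] = float(value)
    for suffix in ("_input", "_average"):
        if feature + suffix in values:
            return values[feature + suffix]
    return None


watts = read_power(SAMPLE)
print(watts)
```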
Rhys Perry
5172eb231d glsl_to_tgsi: don't create 64-bit integer MAD/FMA
TGSI has no I64MAD/U64MAD opcode.

Fixes: 278580729a ('st/glsl_to_tgsi: add support for 64-bit integers')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 20:27:12 +00:00
Marek Olšák
26cb93e229 radeonsi: add support for Raven2 (v2)
v2: fix enabling primitive binning

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-30 16:03:02 -04:00
Marek Olšák
0dea85928e radeonsi: clean up decompress flags in fast color clear 2018-10-30 16:03:02 -04:00
Marek Olšák
99835fff08 radeonsi/gfx9: set optimal OVERWRITE_COMBINER_WATERMARK 2018-10-30 16:03:02 -04:00
Marek Olšák
8ad12c8bec gallium: rework PIPE_HANDLE_USAGE_* flags
Only radeonsi uses them, so adjust them to match its needs.
2018-10-30 16:03:02 -04:00
Danylo Piliaiev
00fc56a68d anv: Disable dual source blending when shader doesn't support it on gen8+
Dual source blending behaviour is undefined when shader doesn't
have second color output.

 "If SRC1 is included in a src/dst blend factor and
  a DualSource RT Write message is not used, results
  are UNDEFINED. (This reflects the same restriction in DX APIs,
  where undefined results are produced if “o1” is not written
  by a PS – there are no default values defined)."

Dismissing fragment in such situation leads to a hang on gen8+
if depth test is enabled.

Since blending cannot be gracefully fixed in such case and the result
is undefined - blending is simply disabled.

v2 (Jason Ekstrand):
 - Apply the workaround to each individual entry
 - Emit a warning through debug_report

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-30 12:59:53 -07:00
Danylo Piliaiev
eca4a6548d i965: Disable dual source blending when shader doesn't support it on gen8+
Dual source blending behaviour is undefined when shader doesn't
have second color output, dismissing fragment in such situation
leads to a hang on gen8+ if depth test is enabled.

Since blending cannot be gracefully fixed in such case and the result
is undefined - blending is simply disabled.

v2 (Kenneth Graunke):
 - Listen to BRW_NEW_FS_PROG_DATA in 3DSTATE_PS_BLEND
 - Also whack BLEND_STATE[] to keep the two in sync, since we're not
   sure exactly which copy of the redundant info the hardware will use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107088
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-30 12:59:53 -07:00
Kenneth Graunke
337a808062 i965: Respect GL_TEXTURE_SRGB_DECODE_EXT in GenerateMipmaps()
Apparently, we're supposed to look at the texture object's built-in
sampler object's sRGB decode setting in order to decide whether to
decode/downsample/re-encode, or simply downsample as-is.  Previously,
I had always done the decoding/encoding.

Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-30 12:59:53 -07:00
Andrii Simiklit
e4e0fd5ffe i965/batch: don't ignore the 'brw_new_batch' call for a 'new batch'
If we restore the 'new batch' using 'intel_batchbuffer_reset_to_saved'
function we must restore the default state of the batch using
'brw_new_batch' function because the 'intel_batchbuffer_flush'
function will not do it for the 'new batch' again.
At least the following fields of the batch
'state_base_address_emitted','aperture_space', 'state_used'
should be restored to default values to avoid:
1. the aperture_space overflow
2. the missed STATE_BASE_ADDRESS command in the batch
3. the memory overconsumption of the 'statebuffer'
   due to uncleared 'state_used' field.
etc.

v2: merged with new commits, changes were minimized, added the 'Fixes' tag
v3: added to the patch series

Fixes: 3faf56ffbd "intel: Add an interface for saving/restoring
                     the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-30 12:09:17 -07:00
Andrii Simiklit
a9031bf9b5 i965/batch: avoid reverting batch buffer if saved state is empty
There's no point reverting to the last saved point if that save point is
the empty batch, we will just repeat ourselves.

CC: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 3faf56ffbd "intel: Add an interface for saving/restoring
                     the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-30 12:09:09 -07:00
Eric Engestrom
ea738a91de egl: add messages to a few assert() and turn a couple into unreachable()
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
d0d6ec549d util: s/0/NULL/ for pointer
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
5c64847322 i965: add missing case to fix -Wswitch
While at it, turn "unreachable" assert() into unreachable().

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
2894e278cf mesa: fix struct/class mismatch
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
6000895e2d mesa: fix memcpy() and memset(0) of non-trivial structs
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
69eb6d58e8 nouveau: remove unused class member
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-30 18:10:59 +00:00
Eric Engestrom
6f9309d5d4 scons: drop unused HAVE_STDINT_H macro
This was required back when MSVC didn't support C99 and was missing this
header, but since MSVC 2013 (or maybe earlier?) it does, and this code
isn't doing anything anymore.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
a18d726621 aub_viewer: show vertex buffer pitch
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
0bbee28a3b meson: add note about intel tools build options
Fixes: ea83a1d304 "intel: tools: import ImGui"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
4a266d01a7 vl: drop left-over variable
Fixes: 6ccc435e7a "pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Anholt
68657d76b9 vc4: Fix unused variable warning.
Fixes: bb84fa146f ("util: use C99 declaration in the for-loop hash_table_foreach() macro")
2018-10-30 10:46:52 -07:00
Eric Anholt
cc54e1acf9 v3d: Use nir_remove_unused_io_vars to handle binner shader output DCE
We were doing this late after nir_lower_io, but we can just reuse the core
code.  By doing it at this stage, we won't even set up the VS attributes
as inputs, reducing our VPM size.
2018-10-30 10:46:52 -07:00
Eric Anholt
c152c79d5e v3d: Only add output slot tracking for the current varying slot.
We always emit 4 slots per slot because things like color output and
position processing in the epilogue will potentially look up more values
than the variable declaration had.  However, when we get a .location_frac
!= 0, we don't want to overwrite components of the following
.driver_location.
2018-10-30 10:46:52 -07:00
Eric Anholt
17c8198952 v3d: Use nir_lower_io_to_scalar_early to DCE unused VS input components.
This lets us trim unused trailing components in the vertex attributes,
reducing the size of our VPM allocations.
2018-10-30 10:46:52 -07:00
Eric Anholt
fc85f7cfdc v3d: Don't rely on sorting input vars for VPM read setup.
For supporting scalar VPM i/o at the NIR level, we need to do a pass over
the vars to figure out how big each attribute is after DCE.  Once we've
done that, we can just walk over c->vattr_sizes[] instead of bothering
with vars.
2018-10-30 10:46:52 -07:00
Eric Anholt
cc78676030 v3d: Split out NIR input setup between FS and VPM.
They don't share much code, and I'm about to rewrite the remaining shared
code for the VPM case.
2018-10-30 10:46:52 -07:00
Eric Anholt
8265dfaa87 nir: Allow using nir_lower_io_to_scalar_early on VS input vars.
This will be used on V3D to cut down the size of the VS inputs in the VPM
(memory area for sharing data between shader stages).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-30 10:46:52 -07:00
Jason Ekstrand
f48b742289 anv: Bump the advertised patch version to 90
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-30 11:43:43 -05:00
Emil Velikov
29283921b7 m4: add Werror when checking for compiler flags
Seemingly, at some point clang started accepting _any_ flags,
whereas previously it would error out.

These days, you can give it -Whamsandwich and it will succeed, while
at the same time throwing an annoying warning.

Add -Werror so that everything gets flagged and set accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108082
Cc: Vinson Lee <vlee@freedesktop.org>
Reported-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-30 16:41:05 +00:00
Dylan Baker
a8bed38b54 docs/calendar: Add 18.3 plan and expand 18.2
Emil will be helping out with 18.3, while Juan finalises 18.2

v2: [Emil] add Emil for 18.3, fix typos

CC: Emil Velikov <emil.velikov@collabora.com>
CC: Juan A. Romero Suarez <jasuarez@igalia.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-10-30 16:35:58 +00:00
Emil Velikov
c210d0c3b7 vulkan/wsi: use the drmGetDevice2() API
On older kernels, the drmGetDevice() call will wake up all the GPUs
on the system, while fetching the PCI revision.

Use the 2 version of the API and pass flags == 0, so we don't fetch the
device PCI revision, since we don't need that information.

Fixes: baa38c144f ("vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-30 16:35:50 +00:00
Jason Ekstrand
a45b6fb452 spirv: Pass SSA values through functions
Previously, we would create temporary variables and fill them out.
Instead, we create as many function parameters as we need and pass them
through as SSA defs.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-10-30 11:22:44 -05:00
Mauro Rossi
bfe0e32913 android: i965/tiled_memcpy: fix build for x86 generic target
The x86 32-bit generic target does not enable ARCH_X86_HAVE_SSE4_1;
for this reason all Android library modules using SSE4_1 in mesa
are built conditionally on ARCH_X86_HAVE_SSE4_1.

The same approach is now applied to libmesa_intel_tiled_memcpy_sse41
in order to avoid the following building errors:

external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:574:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
      __m128i val = _mm_stream_load_si128((__m128i *)src);
              ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:578:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
      __m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
              ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:579:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
      __m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
              ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:580:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
      __m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
              ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:581:15: error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
      __m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
              ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 errors generated.

Fixes: 11b1afdc92 ("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-30 14:45:16 +02:00
Toni Lönnberg
50e952840f intel: tools: Add handling for video pipe
Preliminary work for adding handling of different pipes to gen_decoder. We
need to be able to distinguish between different pipes in order to decode
the packets correctly due to opcode re-use.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-30 12:43:00 +00:00
Toni Lönnberg
d5a938c58d intel/decoder: Use 'DWord Length' and 'bias' fields for packet length.
Use the 'DWord Length' and 'bias' fields from the instruction definition to
parse the packet length from the command stream when possible. The hardcoded
mechanism is used whenever an instruction doesn't have this field.
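
A minimal sketch of the length rule described above, not the actual
gen_decoder code: the struct layout, the field names and the bit positions
are assumptions made for illustration only. The raw 'DWord Length' value is
read from the first DWord and the bias is added; the caller-supplied
fallback stands in for the hardcoded mechanism.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of an instruction's length field. */
struct length_field {
   bool     present; /* instruction defines a 'DWord Length' field */
   unsigned start;   /* first bit of the field in DWord 0 */
   unsigned end;     /* last bit of the field in DWord 0 (inclusive) */
   unsigned bias;    /* value added to the raw field */
};

/* Extract bits [start, end] of a 32-bit DWord. */
static uint32_t
field_value(uint32_t dw, unsigned start, unsigned end)
{
   unsigned width = end - start + 1;
   uint32_t mask = width >= 32 ? UINT32_MAX : ((1u << width) - 1);
   return (dw >> start) & mask;
}

/* Packet length in DWords: field + bias when the field exists,
 * otherwise whatever the hardcoded path (fallback) computed. */
static unsigned
packet_length(const struct length_field *f, uint32_t dw0, unsigned fallback)
{
   if (!f->present)
      return fallback;
   return field_value(dw0, f->start, f->end) + f->bias;
}
```

With an 8-bit field at bits 0..7 and a bias of 2, a header whose low byte
is 3 describes a 5-DWord packet.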

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-30 12:43:00 +00:00
Marek Olšák
a09cbaffbf mesa: expose EXT_texture_compression_s3tc on GLES
The spec was modified to support GLES.

Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-30 13:31:00 +01:00
Michał Janiszewski
2734baa9e2 mesa: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-10-30 06:19:10 -06:00
Michał Janiszewski
ec994ca0fc glx: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-10-30 06:19:10 -06:00
Michał Janiszewski
8ebd7039c4 svga: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-10-30 06:19:09 -06:00
Michał Janiszewski
0654450911 glsl: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-10-30 06:19:09 -06:00
Eric Engestrom
fddf384d1d intel/batch-decoder: remove never-used function
This function was there when the file was introduced in commit
38f10d5a03 "intel: tools: add aubinator viewer", but was
never actually used.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-30 10:59:43 +00:00
Eric Engestrom
e9fb81375a st/dri: remove leftover local variable
Left over from the cleanup in 6ccc435e7a "pipe-loader: move dup(fd)
within pipe_loader_drm_probe_fd"

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 10:20:58 +00:00
Vadym Shovkoplias
7d66eddbbd glsl/linker: Fix out variables linking during single stage
Since out variables are copied from the shader objects' instruction
streams to the linked shader's instruction stream, they should be cloned
first to keep the source instruction streams unaltered.

Fixes: 966a797e43 ("glsl/linker: Link all out vars from a shader
objects on a single stage")

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
2018-10-30 10:19:17 +11:00
Marek Olšák
8676af12c8 ac: fix ac_build_fdiv for f64
trivial

Fixes: a5f35aa742
2018-10-29 17:24:21 -04:00
Brian Paul
9007c0ed26 nir: fix yet another MSVC build break
Trivial.
2018-10-29 11:15:12 -06:00
Eric Engestrom
f3a5757eba vulkan/wsi: simplify meson file tracking
Meson already automatically tracks included headers, so there's no need
to add them everywhere; cleans up the code a bit.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-29 16:39:47 +00:00
Eric Engestrom
1df0c1e8fb clover: add missing meson build dependency
Fixes: 42ea0631f1 "meson: build clover"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-29 16:39:42 +00:00
Eric Engestrom
98e7c3e7a7 svga: add missing meson build dependency
Fixes: a537231b22 "meson: build svga driver on linux"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-29 16:39:38 +00:00
Eric Engestrom
912cd0ce3b radv: add missing meson build dependency
Fixes: 9d40ec2cf6 "radv: Add support for VK_KHR_driver_properties."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-29 16:39:34 +00:00
Eric Engestrom
2be1f9ceba anv: add missing meson build dependency
Fixes: e4538b93f5 "anv: Implement VK_KHR_driver_properties"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-29 16:39:07 +00:00
Samuel Pitoiset
b4eb029062 radv: implement VK_EXT_transform_feedback
This implementation should work and potential bugs can be
fixed during the release candidates window anyway.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:10:58 +01:00
Samuel Pitoiset
f8d0337299 radv: add multiple streams support for the GS copy shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
6c21645046 radv: emit stream outputs for vertex and tessellation stages
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
19f1b49236 radv: declare streamout SGPRs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
f4fa8de794 radv: gather stream output info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
fe551ec122 radv: allow to emit a vertex to a specified stream
This is required for GS multiple streams support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
a59f1b06ef radv: allow to use up to 4 GSVS ring buffers
For all streams. We basically just need to update the
base address and compute a stride for every stream.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
98c09c3fcd radv: adjust the number of output components per stream
Same as the previous patch, except that it is only the number of
components.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
4649471a9e radv: adjust the GSVS ring sizes based on the number of components
For multiple streams support we have to set the different ring
buffer sizes correctly. This relies on the number of output
components per stream.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
8e428e24a8 radv: gather which GS stream is used for every outputs
To only emit outputs for the given stream.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
dd996d1885 radv: gather the number of output components per stream
This will be also used for splitting the GS->VS ring buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Samuel Pitoiset
87e6866b04 radv: gather the number of streams used by geometry shaders
This will be used for splitting the GS->VS ring buffer. The
stream ID is always 0 for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:09:08 +01:00
Jason Ekstrand
19064b8c3a nir: Add a pass for gathering transform feedback info
This is different from the GL_ARB_spirv pass because it generates a much
simpler data structure that isn't tied to OpenGL and mtypes.h.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-29 17:09:08 +01:00
Jason Ekstrand
e8a5fa054d vulkan: Update the XML and headers to 1.1.90
This doesn't include any new features but it does include an XML and
header typo fix for modifiers.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-29 10:17:19 -05:00
Samuel Pitoiset
9e56ffb0b4 radv: remove wrong comment in calculate_gs_ring_sizes() about streams
The computation seems correct compared to RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 12:33:58 +01:00
Rob Clark
a61952e737 freedreno: don't flush when new and old pfb is identical
In the 'inorder' case (ie. FD_MESA_DEBUG=inorder, or old kernel), if the
u_blitter clear path is used (a3xx, a4xx, and some fallback cases on
newer gens), util_blitter_restore_fb_state() will set_framebuffer_state()
to something that is identical to the current fb state, which triggers
an unnecessary flush, and then eventually an assert:

  (gdb) bt
  #0  0x0000007fbf24a078 in kill () from /lib64/libc.so.6
  #1  0x0000007fbe061278 in _debug_assert_fail (expr=0x7fbe93a820 "!batch->flushed", file=0x7fbe93a628 "../src/gallium/drivers/freedreno/freedreno_batch.c", line=491, function=0x7fbe93a990 <__func__.17380> "fd_batch_check_size") at ../src/gallium/auxiliary/util/u_debug.c:322
  #2  0x0000007fbe1ccb8c in fd_batch_check_size (batch=0x55556d5a70) at ../src/gallium/drivers/freedreno/freedreno_batch.c:491
  #3  0x0000007fbe1d0e08 in fd_clear (pctx=0x55555c61e0, buffers=5, color=0x55556e388c, depth=1, stencil=0) at ../src/gallium/drivers/freedreno/freedreno_draw.c:463
  #4  0x0000007fbe57afa4 in st_Clear (ctx=0x55556e17b0, mask=18) at ../src/mesa/state_tracker/st_cb_clear.c:452

The assert was introduced in 4b847b38ae, so from a functionality
standpoint this patch fixes that commit.  But it should also avoid an
unnecessary flush in the 'inorder' case, fixing a performance bug.

Fixes: 4b847b38ae freedreno: make fd_batch a one-shot thing
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-28 14:03:38 -04:00
Rob Clark
32dd75b927 freedreno: dependency tracking for z/s depends on ZSA state
ZSA state can change whether depth or stencil is enabled

This plus previous patch fix stk, and various things w/
FD_MESA_DEBUG=inorder

Fixes: ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-28 14:03:38 -04:00
Rob Clark
05e868925c freedreno: mark all state dirty after switching batch
The problem isn't directly with ec717fc629 but rather that commit
exposes the problem.  When we switch batch we cannot assume previous
state is clean so we should mark all state dirty.

Fixes: ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-28 14:03:38 -04:00
Jason Ekstrand
1bd4f8fefc anv: Use absolute timeouts in wait_for_bo_fences
We were previously using relative timeouts and decrementing the
user-provided timeout as we waited.  Instead, this commit refactors
things to use absolute timeouts throughout.  This should fix a subtle
bug in the waitAll case where we aren't decrementing the timeout after a
successful GPU wait.  Since pthread_cond_timedwait already takes an
absolute timeout, it's also significantly simpler.
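
A rough sketch of the idea, not the anv implementation: the helper names
are hypothetical and the current time is passed in as a parameter so the
example stays self-contained. The relative timeout is converted once into
an absolute deadline (saturating on overflow), and the remaining time is
recomputed from that deadline before every wait, so no wait path can forget
to decrement it.

```c
#include <stdint.h>

/* Convert a relative timeout into an absolute deadline, clamping to
 * UINT64_MAX so that "wait forever" style values do not wrap around. */
static uint64_t
absolute_timeout(uint64_t now_ns, uint64_t rel_ns)
{
   if (rel_ns > UINT64_MAX - now_ns)
      return UINT64_MAX;
   return now_ns + rel_ns;
}

/* Time left until the deadline, or 0 if it has already passed.
 * Called before each individual wait with a fresh clock reading. */
static uint64_t
remaining_timeout(uint64_t now_ns, uint64_t abs_ns)
{
   return abs_ns > now_ns ? abs_ns - now_ns : 0;
}
```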

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-27 16:18:33 -05:00
Jason Ekstrand
cbd4468695 anv: Flag semaphore BOs as external
It probably doesn't actually break anything but it does cause some
assertions in debug builds.

Fixes: 7a89a0d9ed "anv: Use separate MOCS settings for external BOs"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-27 00:02:32 -05:00
Jason Ekstrand
663a113700 anv: Improve the asserts in anv_buffer_get_range
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-27 00:02:32 -05:00
Rob Clark
c41772d17a freedreno/a6xx: inline draw_impl()
Now that it is just called once per draw (instead of once for binning
and once for draw), let's just inline it.  If nothing else, it makes
perf-annotate easier to look at.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Rob Clark
604b5f1dca freedreno/a6xx: small cleanup
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Rob Clark
2a74d9ae8d freedreno/a6xx: move where we handle dirty vbo state
Historically this wasn't in fdN_emit_state(), because prior to addition
of blitter in a5xx, fdN_emit_state() was also used in the clear path.
These days that is only true for a2xx (a3xx and a4xx use u_blitter).  So
the reason for it not to be in fd6_emit_state() no longer exists.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Rob Clark
ddb7fadaf8 freedreno: avoid no-op flushes by re-using last-fence
Noticed that with webgl (in chromium, at least) we end up generating a
lot of no-op submits just to get a fence.  Tracking the last fence and
returning that if there is no rendering since last flush avoids this.
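
The caching can be sketched roughly as below. All types and names here are
hypothetical stand-ins rather than the freedreno structures, and real code
would also have to reference-count the fence it hands out.

```c
#include <stdbool.h>
#include <stddef.h>

struct fence { int id; };

/* Hypothetical context: tracks whether anything was rendered since
 * the last flush and remembers the fence that flush produced. */
struct context {
   bool needs_flush;
   struct fence *last_fence;
   struct fence storage;   /* backing storage for the sketch only */
   int submits;            /* number of kernel submits issued */
};

/* Stand-in for the real submit path. */
static struct fence *
submit(struct context *ctx)
{
   ctx->submits++;
   ctx->storage.id = ctx->submits;
   ctx->needs_flush = false;
   return &ctx->storage;
}

/* Return a fence for all rendering so far, without issuing a no-op
 * submit when nothing has been rendered since the last flush. */
static struct fence *
flush(struct context *ctx)
{
   if (!ctx->needs_flush && ctx->last_fence)
      return ctx->last_fence;
   ctx->last_fence = submit(ctx);
   return ctx->last_fence;
}
```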

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
01194cd582 freedreno/a6xx: Move stencil/depth/alpha state to IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
a664dc2d59 freedreno/a6xx: Move stencil mask emit to FD_DIRTY_ZSA group
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
3073926512 freedreno/a6xx: Rename FD6_GROUP_ZSA to FD6_GROUP_LRZ
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
edc0f1b10f freedreno/a6xx: Move rasterizer state to state object
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
3264eb691a freedreno/a6xx: Fix set_blit_scissor helper
The scissor maxx/maxy are non-inclusive, so don't subtract one from
framebuffer width and height.
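
To illustrate the convention (a sketch with made-up types, not the driver
helper itself): with exclusive maxx/maxy, a scissor covering the whole
framebuffer uses width and height directly, and the last covered pixel is
(width - 1, height - 1).

```c
/* Hypothetical scissor rectangle: min is inclusive, max is exclusive. */
struct scissor {
   unsigned minx, miny;
   unsigned maxx, maxy;
};

/* Scissor covering the whole framebuffer: no "- 1" on the max values. */
static struct scissor
full_fb_scissor(unsigned width, unsigned height)
{
   struct scissor s = { 0, 0, width, height };
   return s;
}

/* A pixel is inside if min <= coord < max on both axes. */
static int
scissor_contains(const struct scissor *s, unsigned x, unsigned y)
{
   return x >= s->minx && x < s->maxx &&
          y >= s->miny && y < s->maxy;
}
```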

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
4222fe8af2 freedreno/a2xx: Squash a compiler warning
We get a warning here for assigning a const char * pointer to
char *swizzle in struct ir2_src_register.  The constructor strdups a 4
byte string here, so just memcpy to that instead.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
4fd6265f42 freedreno/a6xx: Use fd6_emit_ib from a6xx
Move it to a header and use it where possible to avoid vfunc call.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Rob Clark
f3cc0d2747 freedreno: import libdrm_freedreno + redesign submit
In the pursuit of lowering driver overhead, it became clear that some
amount of redesign of how libdrm_freedreno constructs the submit ioctl
would be needed.  In particular, as the gallium driver is starting to
make heavier use of CP_SET_DRAW_STATE state groups/objects, the over-
head of tracking cmd buffers and relocs becomes too much.  And for
"streaming" state, which isn't ever reused (like uniform uploads) the
overhead of allocating/freeing ringbuffer[1] objects is too high.

This redesign makes two main changes:

 1) Introduces a fd_submit object for tracking bos and cmds table
    for the submit ioctl, making ringbuffer objects more light-
    weight.  This was previously done in the ringbuffer.  But we
    have many ringbuffer instances involved in a submit (gmem +
    draw + potentially 1000's of state-group rbs), and only need
    a single bos and cmds table.  (Reloc table is still per-rb)

    The submit is also a convenient place for a slab allocator for
    ringbuffer objects.  Other options would have required locking
    because, while we can guarantee allocations will only happen on
    a single thread, free's could happen either on the application
    thread or the flush_queue thread.  With the slab allocator in
    the submit object, any frees that happen on the flush_queue
    thread happen after we know that the application thread is done
    with the submit.

 2) Introduce a new "softpin" msm_ringbuffer_sp implementation that
    does not use relocs and only has cmds table entries for IB1 (ie.
    the cmdstream buffers that kernel needs to CP_INDIRECT_BUFFER
    to from the RB).  To do this properly will require some updates
    on the kernel side, so whether you get the softpin or legacy
    submit/ringbuffer implementation at runtime depends on your
    kernel version.

To make all these changes in libdrm would basically require adding a
libdrm_freedreno2, so this is a good point to just pull the libdrm code
into mesa.  Plus it allows for using mesa's hashtable, slab allocator,
etc.  And it lets us have asserts enabled for debug mesa builds but
omitted for release builds.  And it makes life easier if further API
changes become necessary.

At this point I haven't tried to pull in the kgsl backend.  Although
I left the level of vfunc indirection which would make it possible
to have other backends.  (And this was convenient to keep to allow
for the "softpin" ringbuffer to coexist.)

NOTE: if bisecting a build error takes you here, try a clean build.
There are a bunch of ways things can go wrong if you still have
libdrm_freedreno cflags.

[1] "ringbuffer" is probably a bad name, the only level of cmdstream
    buffer that is actually a ring is RB managed by kernel.  User-
    space cmdstream is all IB1/IB2 and state-groups.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Jason Ekstrand
aa02d7e878 Revert "anv/skylake: disable ForceThreadDispatchEnable"
This reverts commit 0fa9e6d7b3.  The real
issue appears to have been that HiZ ops don't like having WM thread
dispatch force-enabled.  The previous commit fixes that problem so we
can go back to using the ForceThreadDispatchEnable bit even on SKL+.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-26 16:39:47 -05:00
Jason Ekstrand
b6b2b27809 blorp: Emit a dummy 3DSTATE_WM prior to 3DSTATE_WM_HZ_OP
Cc: mesa-stable@lists.freedesktop.org
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-26 16:39:35 -05:00
Axel Davy
2318ca68bb st/nine: Handle window resize when a presentation buffer is used
Usually when a window is resized, the app calls d3d to resize the back
buffer to the window size. In some cases this is not done,
and the app expects the output to resize to the window size, even if
the back buffer size is unchanged.

This patch introduces the behaviour when a presentation buffer
is used.

ID3DPresent_GetWindowInfo is a function available with
D3DPresent v1.0, and thus we don't need to check if the
function is available.
The function had been introduced to implement this very
feature.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
e50d374b61 d3dadapter: Fix wrong naming in header file
GetWindowInfo used to be GetWindowSize before gallium
nine was merged. A left-over remained...

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
3d975e98e4 st/nine: Reduce MaxSimultaneousTextures to 8
Windows drivers don't set this flag (which affects ff) to more than 8.

Do the same in case some games check for 8.

v2: Remove any dependence on MaxSimultaneousTextures. For non-ff
the number of textures is 16 when the device is capable of vs/ps3.
Add this requirement of 16 textures to the driver requirements.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
739c700950 st/nine: Enable shadow mapping for ps 1.X
We didn't implement shadow textures for ps 1.X,
assuming the case couldn't happen...
Well it does.

Fixes: https://github.com/iXit/Mesa-3D/issues/261

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
847861aab4 st/nine: Do not set unused states for stateblocks
A lot of these states are used only for the context,
and are unused for stateblocks (which just use the
changed.* fields instead for a lot of them).

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
6f373b9b74 st/nine: Fix aliasing states for stateblocks
If NINE_STATE_FF_MATERIAL is set, the stateblock will upload
its recorded materials matrix.
If NINE_STATE_FF_LIGHTING is set, the lighting set is uploaded.

These flags could be set by a NineDevice9_SetTransform call
or by setting some states related to ff, but that shouldn't trigger
these stateblock behaviours.

We don't need to follow the context states dirtied by render states.
NINE_STATE_FF_VSTRANSF is exactly the state controlling stateblock
updates of transformation matrices, NINE_STATE_FF is too broad.

These two changes avoid setting the two mentioned states when we
shouldn't.

Fixes: https://github.com/iXit/Mesa-3D/issues/320

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
454201b452 st/nine: Never update device changed.* fields
The device state changed.* fields are never used.
These fields are used only for stateblocks.

Avoid setting them at all for clarity.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
2594b2efdc st/nine: Capture also default matrices for D3DSBT_ALL
We avoid allocating space for never-used matrices.
However, we must behave as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
bbeddb801e st/nine: Mark transform matrices dirty for D3DSBT_ALL
D3DSBT_ALL stateblocks capture the transform matrices.

Fixes some d3d test programs not displaying properly.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
a4e9bbb8f8 st/nine: Don't update unused world matrices
While we have to track all 256 world matrices
accurately for the application (including
in stateblocks), hw vertex processing lets us
limit the number of world matrices the hardware
can access via the advertised caps,
which is 8 for nine.

Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
2e51c4c7cc st/nine: Remove two unused states.
NINE_STATE_MATERIAL was used incorrectly at one location.
Replace it with the correct state.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
cb8ea21e1c st/nine: Remove commented nine_context_apply_stateblock
At some point the project was to adapt the
commented version to csmt.

The csmt rework made it possible to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Brian Paul
7e64e39f8b nir: Fix array initializer
Empty initializer is not standard C.  This fixes MSVC build.

Trivial.
2018-10-26 12:35:48 -06:00
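Note on 7e64e39f8b above: the fix turns on a small C detail. An empty brace initializer (`= {}`) is not valid C before C23, while `{0}` is. Below is a minimal sketch of the portable form; the array and function names are hypothetical and are not taken from the NIR sources.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_NUM_SLOTS 4

/* Returns the sum of a freshly zero-initialized array, which is always 0. */
static int example_sum_zero_initialized(void)
{
    /* `int slots[EXAMPLE_NUM_SLOTS] = {};` is the non-standard form that
     * the commit message reports as breaking the MSVC build.
     * `{0}` sets the first element explicitly; the remaining elements are
     * implicitly zero-initialized. This is valid standard C. */
    int slots[EXAMPLE_NUM_SLOTS] = {0};
    int sum = 0;

    for (size_t i = 0; i < EXAMPLE_NUM_SLOTS; i++)
        sum += slots[i];

    /* Every element must have been zeroed. */
    assert(sum == 0);
    return sum;
}
```

The same `{0}` form also works for structs, where it zero-initializes every member.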
Jason Ekstrand
07eb8e7466 anv: Return VK_ERROR_DEVICE_LOST from anv_device_set_lost
This lets us get rid of a bunch of duplicated error messages.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 13:27:21 -05:00
Jason Ekstrand
ade22ae1ac anv/util: Split a vk_errorv helper out of vk_errorf
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 13:27:21 -05:00
Brian Paul
d6be0b5556 scons/svga: remove opt from the list of valid build types
This reverts commit a5fd54f8bf.

The whole point was to add a way to pass -DVMX86_STATS to the build,
but we can do that with a command line argument when we invoke scons.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2018-10-26 12:09:00 -06:00
Nanley Chery
5bcf479524 intel/blorp: Define the clear value bounds for HiZ clears
Follow the restriction of making sure the clear value is between the min
and max values defined in CC_VIEWPORT. Avoids a simulator warning for
some piglit tests, one of them being:

./bin/depthstencil-render-miplevels 146 d=z32f_s8

Jason found this to fix incorrect clearing on SKL.

Fixes: 09948151ab
       ("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-26 10:34:07 -07:00
Eric Engestrom
285ebc84c7 radv: remove duplicate brackets in version string
MESA_GIT_SHA1 resolves to either an empty "" string if not built from git,
or " (git-DEADBEEF)" if it is. No need to wrap it in additional "()".

Fixes: 9d40ec2cf6 "radv: Add support for VK_KHR_driver_properties."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 18:33:11 +01:00
Eric Engestrom
738f0f789b vulkan: drop always-true param
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-26 18:33:11 +01:00
Boyuan Zhang
f4126cfaab radeon/vcn: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using an array.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Boyuan Zhang
55cf565698 radeon/vce: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using an array.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Boyuan Zhang
b15d0200a9 vl: get h264 profile idc
Adding a function for converting h264 pipe video profile to profile idc

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
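Note on b15d0200a9 above: a profile-to-profile_idc conversion of this kind can be sketched as a plain switch. The enum and function names below are hypothetical stand-ins, not the names used in the vl code. The numeric values are the profile_idc values from the H.264 specification.

```c
#include <assert.h>

/* Hypothetical stand-in for the pipe video profile enum. */
enum example_h264_profile {
    EXAMPLE_H264_BASELINE,
    EXAMPLE_H264_MAIN,
    EXAMPLE_H264_EXTENDED,
    EXAMPLE_H264_HIGH,
    EXAMPLE_H264_HIGH10,
    EXAMPLE_H264_HIGH422,
    EXAMPLE_H264_HIGH444
};

/* Maps a profile to its H.264 profile_idc value.
 * Returns 0 (after asserting in debug builds) for an unexpected value. */
static unsigned example_h264_profile_idc(enum example_h264_profile profile)
{
    switch (profile) {
    case EXAMPLE_H264_BASELINE: return 66;
    case EXAMPLE_H264_MAIN:     return 77;
    case EXAMPLE_H264_EXTENDED: return 88;
    case EXAMPLE_H264_HIGH:     return 100;
    case EXAMPLE_H264_HIGH10:   return 110;
    case EXAMPLE_H264_HIGH422:  return 122;
    case EXAMPLE_H264_HIGH444:  return 244;
    }

    assert(!"unexpected H.264 profile");
    return 0;
}
```

Compared with a lookup array indexed by the enum, a switch does not depend on the enum values being contiguous or starting at a particular offset.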
Jason Ekstrand
5cdeefe057 intel/nir: Use the OPT macro for more passes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
18fb2c5d92 spirv: Initialize subgroup destinations with the destination type
Instead of initializing them manually, just use the type that we already
have sitting there.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
8fa70cfcfd spirv: Use the right bit-size for spec constant ops
Previously, we would always pull the bit size from the destination which
is wrong for opcodes like nir_ilt where the sources are variable-sized
but the destination is a fixed size.  We were getting lucky before
because nir_op_ilt returns a 32-bit value and basically everyone who
uses spec constants uses 32-bit ones.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
1d2ed694c1 nir/prog: Use nir_bany in kill handling
We have a helper that does exactly what the bany_inequal was doing.  It
emits the same code but is a bit higher level and is designed to operate
on a bvec4.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
2fe3031440 glsl/nir: Use i2b instead of ine for fixing UBO/SSBO Booleans
They do the same thing in the end but i2b is a bit simpler.  Also, let's
clean up the mess of code for SSBO handling with one line of builder.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
5bfce5fcc2 nir/system_values: Use the bit size from the load_deref
This isn't a great solution for bit-sizes but we don't have a
particularly convenient way to get a bit size from the system value enum
and this keeps the lowering pass from changing it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
a3b4cb3458 nir/opt_if: Rework condition propagation
Instead of doing our own constant folding, we just emit instructions and
let constant folding happen.  This is substantially simpler and lets us
use the nir_imm_bool helper instead of dealing with the const_value's
ourselves.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
4cd8a58595 nir/search: Use the nir_imm_* helpers from nir_builder
This requires that we rework the interface a bit to use nir_builder but
that's a nice little modernization anyway.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
6e32115bd6 nir/builder: Handle 16-bit floats in nir_imm_floatN_t
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
ff45649bc2 nir/builder: Add a nir_imm_true/false helpers
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
249e32ab17 nir/constant_folding: Use nir_src_as_bool for discard_if
Missed one while converting to the nir_src_as_* helpers.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
6de1869e86 nir/constant_folding: Add an unreachable to a switch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
28bb6abd1d nir/validate: Print when the validation failed
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
292ebdbf98 anv: Handle the device loss abort in anv_device_set_lost
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 08:40:23 -05:00
Jason Ekstrand
cd0960b430 anv: Add helpers for setting/checking device lost
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 08:40:21 -05:00
Jason Ekstrand
319ff6f1ad anv: Provide an error message with a DEVICE_LOST
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 08:40:10 -05:00
Alex Smith
3bd239f71d anv: Fix sanitization of stencil state when the depth test is disabled
When depth testing is disabled, we shouldn't pay attention to the
specified depthCompareOp, and just treat it as always passing. Before,
if the depth test is disabled, but depthCompareOp is VK_COMPARE_OP_NEVER
(e.g. from the app having zero-initialized the structure), then
sanitize_stencil_face() would have incorrectly changed passOp to
VK_STENCIL_OP_KEEP.

v2: Roll the depthTestEnable check into the ds_aspect check below since
    they now both do the same thing.

Fixes: 028e1137e6 "anv/pipeline: Be smarter about depth/stencil state"
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-26 10:25:40 +01:00
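Note on 3bd239f71d above: the rule described in the message (a disabled depth test behaves as "always pass", whatever depthCompareOp says) can be sketched as follows. The enum and helper names are hypothetical simplifications and do not reproduce the anv code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the Vulkan compare op enum. */
enum example_compare_op {
    EXAMPLE_COMPARE_NEVER,
    EXAMPLE_COMPARE_LESS,
    EXAMPLE_COMPARE_ALWAYS
};

/* The compare op that is actually in effect: the application-provided
 * value only matters when depth testing is enabled. */
static enum example_compare_op
example_effective_depth_compare(bool depth_test_enable,
                                enum example_compare_op depth_compare_op)
{
    assert(depth_compare_op <= EXAMPLE_COMPARE_ALWAYS);
    return depth_test_enable ? depth_compare_op : EXAMPLE_COMPARE_ALWAYS;
}

/* True if a fragment can ever pass the depth test. Only when this is
 * false is it safe to treat the stencil passOp as unreachable. */
static bool example_depth_can_pass(bool depth_test_enable,
                                   enum example_compare_op depth_compare_op)
{
    return example_effective_depth_compare(depth_test_enable,
                                           depth_compare_op)
           != EXAMPLE_COMPARE_NEVER;
}
```

With a zero-initialized structure (depth test disabled, compare op equal to the "never" value), `example_depth_can_pass()` returns true, so passOp must be left alone.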
Samuel Pitoiset
79bbdf8e45 radv: implement image to image operations for R32G32B32
This should address the remaining failures in Batman Arkham City.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:08 +02:00
Samuel Pitoiset
6198245775 radv: fix a comment in radv_meta_buffer_to_image_cs_r32g32b32()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:05 +02:00
Samuel Pitoiset
02ccef7874 radv: add get_image_stride_for_r32g32b32() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:03 +02:00
Samuel Pitoiset
468c33e2f7 radv: add create_bview_for_r32g32b32() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:00 +02:00
Samuel Pitoiset
e60e3e1b3f radv: add create_buffer_from_image() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:49:58 +02:00
Sagar Ghuge
416abe809a intel/compiler: Print message descriptor as immediate source
While disassembling a send(c) instruction, print the message descriptor as
an immediate source operand along with the message descriptor. This allows
the assembler to read the immediate source operand and set bits accordingly.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-26 06:42:14 +02:00
Sagar Ghuge
d15fa24860 intel/compiler: Print hex representation along with floating point value
While encoding immediate floating point values in an instruction we use
values up to precision 9, but while disassembling, we print only
6 places, which rounds the value and gives a wrong interpretation of the
encoded immediate constant.

To avoid misinterpretation of encoded immediate values in instruction
and disassembled output, print hex representation along with floating
point value which can be used by assembler in future.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-26 06:41:08 +02:00
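Note on d15fa24860 above: the problem is that two different float encodings can print identically at 6 decimal places. The sketch below shows the idea of printing the raw bits next to the decimal value. The output format and function names are hypothetical and are not the disassembler's actual format; the code assumes a 32-bit IEEE-754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns the raw bit pattern of a float (assumes 32-bit IEEE-754). */
static uint32_t example_float_bits(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof(bits));
    return bits;
}

/* Writes "0x<hex bits> / * <decimal> * /" style text into buf and returns
 * the snprintf() result. The hex part is exact; the decimal part is only
 * a reading aid. */
static int example_format_float_imm(char *buf, size_t size, float f)
{
    assert(size > 0);
    return snprintf(buf, size, "0x%08x /* %f */",
                    (unsigned)example_float_bits(f), f);
}
```

For example, 1.0f and 1.0000001f both print as 1.000000 with `%f`, but their bit patterns are 0x3f800000 and 0x3f800001, so only the hex form lets an assembler reproduce the original encoding.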
David McFarland
07a00a8729 util: Change remaining uint32 cache ids to sha1
After discussion with Timothy Arceri. disk_cache_get_function_identifier
was using only the first byte of the sha1 build-id.  Replace
disk_cache_get_function_identifier with implementation from
radv_get_build_id.  Instead of writing a uint32_t it now writes to a
mesa_sha1.  All drivers using disk_cache_get_function_identifier are
updated accordingly.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: 83ea8dd99b ("util: add disk_cache_get_function_identifier()")
2018-10-26 14:49:22 +11:00
Hyunjun Ko
3d198926a4 freedreno: use fd_bc_alloc_batch instead of fd_batch_create.
Following commits 2385d7b066 and 8e798e28f7, for resource dependency
tracking.

Fixes: dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
with FD_MESA_DEBUG=inorder

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-25 18:46:19 -04:00
Hyunjun Ko
703271c22a freedreno/ir3: take reg->num out of union in ir3_register
To avoid a wrong result when identifying the type of register,
i.e. if the reg is an array, it might be identified as an address or
predicate register.

Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-25 18:45:45 -04:00
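Note on 703271c22a above: the hazard of keeping a register number in the same union as array data can be sketched like this. The struct layouts, the flag, and the register number are hypothetical simplifications and do not match the real ir3_register definition.

```c
#include <assert.h>
#include <stdbool.h>

#define EXAMPLE_REG_ARRAY 0x1u
#define EXAMPLE_REG_A0    61   /* hypothetical address register number */

/* Before: num shares storage with the array id. */
struct example_reg_old {
    unsigned flags;
    union {
        int num;
        struct { int id; } array;
    };
};

/* After: num lives outside the union and is always meaningful. */
struct example_reg_new {
    unsigned flags;
    int num;
    union {
        int imm;
        struct { int id; } array;
    };
};

static struct example_reg_old example_array_reg_old(int array_id)
{
    struct example_reg_old r = { .flags = EXAMPLE_REG_ARRAY };

    assert(array_id >= 0);
    r.array.id = array_id;   /* also overwrites r.num */
    return r;
}

static struct example_reg_new example_array_reg_new(int array_id)
{
    struct example_reg_new r = { .flags = EXAMPLE_REG_ARRAY, .num = -1 };

    assert(array_id >= 0);
    r.array.id = array_id;   /* r.num is untouched */
    return r;
}

/* Identify the address register purely by its number. */
static bool example_is_addr_old(struct example_reg_old r)
{
    return r.num == EXAMPLE_REG_A0;
}

static bool example_is_addr_new(struct example_reg_new r)
{
    return r.num == EXAMPLE_REG_A0;
}
```

In the old layout an array whose id happens to equal the address register number is misidentified; in the new layout it is not.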
Rob Clark
3c402d0dc2 freedreno/a6xx: disable unused groups
Don't leave vsconst/fsconst group enabled if we switch to a shader with no
uniforms.

Fixes: abcdf5627a freedreno/a6xx: move const emit to state group
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-25 18:38:53 -04:00
Rob Clark
d53074d3f1 freedreno: add useful assert
Would have been useful to catch the problem fixed in
8e798e28f7

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-25 18:38:53 -04:00
Alok Hota
edf38019a0 swr/rast: ignore CreateElementUnorderedAtomicMemCpy
This function's API changed between LLVM 5 and 6. Compile errors occur
when building with LLVM 6+ if LLVM 5 was used for a dist tarball

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107865
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-25 11:05:59 -05:00
Alok Hota
8c872ac2e3 swr/rast: fix intrinsic/function for LLVM 7 compatibility
Converted from x86 VFMADDPS intrinsic to generic LLVM intrinsic, and
removed createInstructionSimplifierPass, which were both removed in LLVM
7.0.0

These changes combine patches we received from the community and our own
internal patches

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
2018-10-25 10:32:27 -05:00
Rhys Perry
26ed0f0234 nvc0: increase NOUVEAU_TRANSFER_PUSHBUF_THRESHOLD to 1024 on Kepler+
Gives a +3.89% to +5.27% FPS improvement with Hitman and +2.73% to +2.82%
FPS improvement with Dirt Rally on my GTX 1060.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-25 15:25:10 +01:00
Bas Nieuwenhuizen
d41c3cc013 radv: Emit enqueued pipeline barriers on event write.
Since the CPU can read them we need to execute any GPU->CPU
flushes before the event is written.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108524
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-25 16:17:54 +02:00
Bas Nieuwenhuizen
9d40ec2cf6 radv: Add support for VK_KHR_driver_properties.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-25 16:14:43 +02:00
Eric Engestrom
e27902a261 util: use C99 declaration in the for-loop set_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Eric Engestrom
bb84fa146f util: use C99 declaration in the for-loop hash_table_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
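Note on the two foreach changes above (e27902a261 and bb84fa146f): declaring the iterator inside the `for` statement keeps it scoped to the loop. A minimal sketch with a hypothetical linked-list macro, not the actual set or hash table macros:

```c
#include <assert.h>
#include <stddef.h>

struct example_node {
    int value;
    struct example_node *next;
};

/* C99 style: the macro declares `entry` itself, so callers do not need
 * a separate variable declaration before the loop. */
#define example_list_foreach(entry, head) \
    for (struct example_node *entry = (head); entry != NULL; \
         entry = entry->next)

static int example_list_sum(struct example_node *head)
{
    int sum = 0;
    int count = 0;

    example_list_foreach(node, head)
        sum += node->value;

    /* `node` went out of scope with the loop above, so the same name
     * can be reused without clashing. */
    example_list_foreach(node, head)
        count++;

    /* An empty list must sum to zero. */
    assert(count > 0 || sum == 0);
    return sum;
}
```

One consequence of this style is that code which relied on reading the iterator after the loop no longer compiles, which surfaces such uses at build time.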
Dylan Baker
3d261cf77b gen: Add AMD_gpu_shader_int64.xml to tarball
CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes: b3c17330e6
       ("mesa: expose AMD_gpu_shader_int64")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-10-24 11:29:30 -07:00
Dylan Baker
6d5fa65c74 gen: Add EXT_vertex_attrib_64bit.xml to dependency lists
Which is also required to put it in the tarball, a requirement for
building with meson from the tarball.

CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes: 263c962cfd
       ("mesa: expose EXT_vertex_attrib_64bit")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-10-24 11:29:29 -07:00
Eric Engestrom
edc06dd533 anv: move variable to proper scope and mark as MAYBE_UNUSED
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-24 18:16:20 +01:00
Eric Engestrom
ed5d65a6a1 anv: use snprintf() instead of memset()+strcpy()
snprintf() guarantees that it will not write more chars than allowed,
and that the string will be null-terminated, without the need to fill
the whole thing with zeroes to begin with.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-24 18:15:56 +01:00
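Note on ed5d65a6a1 above: the guarantee the message relies on is that snprintf() never writes more than the given size and always NUL-terminates when the size is non-zero. A minimal sketch, with a hypothetical helper name and buffer size:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Copies name into dst, always NUL-terminating it.
 * Returns true if the name did not fit and was truncated. */
static bool example_copy_name(char *dst, size_t dst_size, const char *name)
{
    int needed;

    assert(dst_size > 0);

    /* snprintf() writes at most dst_size bytes including the terminator
     * and returns the length the full string would have had. */
    needed = snprintf(dst, dst_size, "%s", name);
    return needed < 0 || (size_t)needed >= dst_size;
}
```

Unlike memset() followed by strcpy(), this needs no up-front zero fill, and an over-long source string is truncated rather than overflowing the buffer. Bytes after the terminator are left untouched, which matters only if the whole buffer is later copied or compared.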
Eric Engestrom
33d757096d anv: drop unused includes
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-24 18:15:05 +01:00
Dylan Baker
c4de8ba036 autotools: include intel_tiled_memcpy.c
There are two problems with the fixed patch. First, it fails to create a
dependency on the sourced .c file, so changes to intel_tiled_memcpy.c
won't trigger a rebuild. It also doesn't get included in the dist
tarball.

Fixes: 11b1afdc92
       ("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-10-24 09:22:15 -07:00
Dylan Baker
43b0d5fa04 meson: fix formatting and add extra_files to i965
extra_files is just a nice way to tell certain IDEs (and those
reading the file) that this file is also a dependency. Meson will use
the .d file generated by the compiler to figure out what the target
actually depends on.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-10-24 09:22:13 -07:00
Eduardo Lima Mitev
b0c427043b ir3_compiler/nir: fix imageSize() for buffer-backed images
GL_EXT_texture_buffer introduced texture buffers, which can be used
in shaders through a new type imageBuffer.

Because of how image access is implemented in freedreno, calling
imageSize on an imageBuffer returns the size in bytes instead of texels,
which is incorrect.

This patch adds a division of imageSize result by the bytes-per-pixel
of the image format, when image is buffer-backed.

Fixes all tests under
dEQP-GLES31.functional.image_load_store.buffer.image_size.*

v2: Pre-compute and submit the log2 of the image format's bpp as shader
    constant instead of emitting the LOG2 instruction in code. (Rob Clark)

v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin)

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-24 18:18:35 +02:00
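Note on b0c427043b above: the v2/v3 approach (precompute log2 of the bytes-per-pixel value with a find-first-set helper, then divide by shifting) can be sketched on the CPU side as follows. The function names are hypothetical, `example_ffs()` is a portable stand-in for ffs(), and the sketch assumes the bytes-per-pixel value is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for ffs(): 1-based index of the lowest set bit,
 * or 0 if no bit is set. */
static int example_ffs(uint32_t v)
{
    int index = 1;

    if (v == 0)
        return 0;

    while ((v & 1u) == 0) {
        v >>= 1;
        index++;
    }
    return index;
}

/* Converts a buffer size in bytes to a size in texels.
 * For a power-of-two bpp, ffs(bpp) - 1 equals log2(bpp), so the
 * division becomes a right shift. */
static uint32_t example_bytes_to_texels(uint32_t size_bytes,
                                        uint32_t bytes_per_pixel)
{
    uint32_t shift;

    assert(bytes_per_pixel != 0 &&
           (bytes_per_pixel & (bytes_per_pixel - 1)) == 0);

    shift = (uint32_t)example_ffs(bytes_per_pixel) - 1;
    return size_bytes >> shift;
}
```

For instance, a 64-byte buffer with a 4-byte format gives a shift of 2 and a size of 16 texels.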
Jose Fonseca
d9a04196d9 nir: Fix array initializer.
Empty initializer is not standard C.  This fixes MSVC build.

Trivial.
2018-10-24 11:37:09 +01:00
Liviu Prodea
d99fda17c8 scons: Put to rest zombie texture_float build option.
I found a remnant of texture_float build option that wasn't removed in
commit 66673bef94

This patch removes it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-10-24 11:10:17 +01:00
Alex Smith
6c56c1fbd4 anv: Allow presenting via a different GPU
anv_GetPhysicalDeviceSurfaceSupportKHR will already return success for
this, but anv_GetPhysicalDevice{Xcb,Xlib}PresentationSupportKHR do not.
Apps which check for presentation support via the latter (all Feral
Vulkan games at least) will therefore fail.

This allows me to render on an Intel GPU and present to a display
connected to an AMD card (tested HD 530 + Vega 64).

v2: Rebase on current master.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-24 09:40:02 +01:00
Juan A. Suarez Romero
3112da346b nir: fix nir_copy_propagation test
Use nir_src_comp_as_uint() to read the proper second component, as
nir_src_as_uint() returns the first one.

v2: Use nir_src_comp_as_uint() [Jason]

Fixes: 16870de8a0 ("nir: Use nir_src_is_const and nir_src_as_* in core
                     code")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108532
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-24 09:13:24 +02:00
Timothy Arceri
0ff1ccca25 radv: call nir_link_xfb_varyings()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-24 08:21:29 +11:00
Timothy Arceri
c769ed10de radv: move nir_lower_io_to_scalar_early() to radv_link_shaders()
nir_lower_io_to_scalar_early() is really part of the link time
optimisations. Moving it here allows the code to be simplified
and also keeps the code easy to follow in the next patch.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-24 08:21:29 +11:00
Samuel Pitoiset
7c694cbfa4 nir: add linking helper nir_link_xfb_varyings()
The linking opts shouldn't try removing or compacting XFB varyings
in the consumer. To avoid this we copy the always_active_io flag
from the producer.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-24 08:21:29 +11:00
Sagar Ghuge
0a7664fe8c intel/compiler: Change src1 reg type to unsigned doubleword
To have uniform behavior while disassembling a send(c) instruction, use a
register type of unsigned doubleword for src1 when the message descriptor is
an immediate value. Bspec does not specify anything for the src1 immediate
default type.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2018-10-23 12:44:24 -07:00
Eduardo Lima Mitev
22ddd4988e mesa/glformats: Remove redundant helper _mesa_base_format_component_count
There exists _mesa_components_in_format() which already includes
all cases handled in _mesa_base_format_component_count().

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-23 21:29:15 +02:00
Jason Ekstrand
ecb7775e1c nir/algebraic: Fix a typo in the bit size validation code
The canon_bit_class and canon_var_class variables got switched.

Fixes: 932c650e0b "nir/algebraic: Loosen a restriction on variables"
Reported-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-23 12:22:29 -05:00
Leo Liu
b75fb8ee36 amd/common: check DRM version 3.27 for JPEG decode
JPEG was added after DRM version 3.26

Signed-off-by: Leo Liu <leo.liu@amd.com>
Fixes: 4558758c51749(amd/common: add vcn jpeg ip info query)
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-23 13:12:05 -04:00
Juan A. Suarez Romero
a8c2a6b0ac docs: update calendar
I'll take care of the 18.2 release series on Andres' behalf.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
2018-10-23 18:40:09 +02:00
Lionel Landwerlin
a8594887bc intel/decoders: fix end of batch limit
Pointer arithmetic...

v2: s/4/sizeof(uint32_t)/ (Eric)

v3: Give bytes to print_batch() in error_decode (Lionel)
    Make clear what values we're dealing with in error_decode (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-23 14:49:33 +01:00
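Note on a8594887bc above: the message only says "Pointer arithmetic...", so the following is a sketch of the general pitfall rather than of the actual fix. Adding a byte count to a `uint32_t *` advances by that many dwords, not bytes. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a pointer one past the last dword of a batch whose size is
 * given in bytes. */
static const uint32_t *example_batch_end(const uint32_t *batch,
                                         size_t size_bytes)
{
    assert(size_bytes % sizeof(uint32_t) == 0);

    /* `batch + size_bytes` would land four times too far, because
     * pointer arithmetic is scaled by sizeof(uint32_t). */
    return batch + size_bytes / sizeof(uint32_t);
}

/* Walks the batch dword by dword and returns how many were visited. */
static size_t example_count_dwords(const uint32_t *batch, size_t size_bytes)
{
    const uint32_t *end = example_batch_end(batch, size_bytes);
    size_t count = 0;

    for (const uint32_t *p = batch; p < end; p++)
        count++;

    return count;
}
```

Keeping the unit in the parameter name (`size_bytes` here) is one way to make clear which values are being dealt with, in the spirit of the v3 note.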
Boyuan Zhang
55e7de7b19 radeonsi: enable vcn jpeg decode for raven
Enable vcn jpeg decode for raven.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
97c473bb29 winsys/amdgpu: add vcn jpeg cs support
Add vcn jpeg cs support, align cs by no-op.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
4558758c51 amd/common: add vcn jpeg ip info query
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
6d2d910653 radeon/vcn: implement jpeg target buffer cmd
Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
0ee5630cfc radeon/vcn: implement jpeg bitstream buffer cmd
Implement jpeg bitstream buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
9b478b0c7a radeon/uvd: remove get mjpeg slice header
Move the previous get_mjpeg_slice_header function and eoi from
"radeon/vcn" to "st/va".

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
4fc2368e3b st/va: get mjpeg slice header
Move the previous get_mjpeg_slice_header function and eoi from
"radeon/vcn" to "st/va".

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
c7a5ef26ad radeon/vcn: add jpeg decode implementation
Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg
specific cmd sending function in end_frame call.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
40fceb55f3 radeon/vcn: separate send cmd call from end frame
Use function pointer for sending cmd in end_frame call. By doing this, we can
assign different cmd sending logic for Jpeg decode later.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
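Note on 40fceb55f3 above: routing the command submission through a function pointer lets end_frame stay codec-agnostic. A minimal sketch with hypothetical types and names, which do not mirror the radeon_decoder struct:

```c
#include <assert.h>
#include <stddef.h>

struct example_decoder;

/* Signature shared by every command-sending implementation. */
typedef int (*example_send_cmd_fn)(struct example_decoder *dec);

enum {
    EXAMPLE_CMD_GENERIC = 1,
    EXAMPLE_CMD_JPEG = 2
};

struct example_decoder {
    example_send_cmd_fn send_cmd;  /* chosen once at creation time */
    int last_cmd;                  /* records what was sent, for the demo */
};

static int example_send_cmd_generic(struct example_decoder *dec)
{
    dec->last_cmd = EXAMPLE_CMD_GENERIC;
    return 0;
}

static int example_send_cmd_jpeg(struct example_decoder *dec)
{
    dec->last_cmd = EXAMPLE_CMD_JPEG;
    return 0;
}

/* end_frame does not know which codec it serves; it only calls the
 * function that was installed on the decoder. */
static int example_end_frame(struct example_decoder *dec)
{
    assert(dec != NULL && dec->send_cmd != NULL);
    return dec->send_cmd(dec);
}
```

A JPEG decoder would then be created with `send_cmd` set to the JPEG variant, and every other codec with the generic one, without any branching inside end_frame.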
Boyuan Zhang
4f1f128f8e radeon/vcn: create cs based on ring type
Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
f7116e4ff8 radeon/winsys: add vcn jpeg ring type
Add a new ring type for vcn jpeg.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
e7e68d15b5 radeon/vcn: add vcn jpeg decode interface
Add VCN Jpeg decode interfaces and register defines.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
6bc0a3a834 radeon/vcn: move radeon decoder define to header file
Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
0f59e3f088 meson: update required amdgpu version to 2.4.95
VCN jpeg requires new hw ip

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
2e768ade61 configure.ac: update libdrm amdgpu version to 2.4.95
VCN jpeg requires new hw ip

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Samuel Pitoiset
69c44de798 radv: fix btoi for R32G32B32 when the dest offset is not 0
Fixes: 593996bc02 ("radv: implement buffer to image operations for R32G32B32")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-23 14:29:26 +02:00
Scott D Phillips
54c823ec79 i965/miptree: Use cpu tiling/detiling when mapping
Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.

Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.

v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)

v3: Add units to parameter names of tile_extents (Nanley Chery)
    Use _mesa_align_malloc for the shadow copy (Nanley)
    Continue using gtt maps on gen4 (Nanley)

v4: Use streaming_load_memcpy when detiling

v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it
    takes precedence.  Add intel_miptree_access_raw, needed after
    rebasing on commit b499b85b0f.

v6: refactor to changes done for sse41 separation (Tapani)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-23 14:08:05 +03:00
Scott D Phillips
11b1afdc92 i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear
The reference for MOVNTDQA says:

    For WC memory type, the nontemporal hint may be implemented by
    loading a temporary internal buffer with the equivalent of an
    aligned cache line without filling this data to the cache.
    [...] Subsequent MOVNTDQA reads to unread portions of the WC
    cache line will receive data from the temporary internal
    buffer if data is available.

This hidden cache line sized temporary buffer can improve the
read performance from wc maps.

v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)
v3: add Android build support (Tapani)
v4: squash 'fix i915: Fix streaming loads for intel_tiled_memcpy'
    separate sse41 to own static library (Tapani)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-23 14:08:05 +03:00
Tapani Pälli
91d3a5d1a8 i965: expose type of memcpy instead of memcpy function itself
There is currently no use of the returned memcpy functions outside
intel_tiled_memcpy. This patch changes intel_get_memcpy to return the
memcpy type instead of the actual function. This makes it easier to
later separate the streaming load copy into its own static library.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-10-23 14:08:05 +03:00
Eric Engestrom
bc021be78d util: use *unsigned* ints for bit operations
Fixes errors thrown by GCC's Undefined Behaviour sanitizer (ubsan) every
time this macro is used.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-23 11:44:02 +01:00
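
As a minimal sketch of the class of problem ubsan reports here: shifting a signed 1 into the sign bit is undefined behaviour, while the unsigned form is well defined. The names EXAMPLE_BIT and example_bit are hypothetical and are not the macro touched by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro for illustration only; not the Mesa macro itself.
 * With a signed literal, (1 << 31) shifts into the sign bit of a 32-bit
 * int, which is undefined behaviour in C. The unsigned literal is well
 * defined for every bit index from 0 to 31. */
#define EXAMPLE_BIT(b) (1u << (b))

static inline uint32_t
example_bit(unsigned b)
{
   assert(b < 32); /* shifting by the type width or more is also undefined */
   return EXAMPLE_BIT(b);
}
```

Calling example_bit(31) yields 0x80000000 without tripping the sanitizer.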
Eric Engestrom
17b03b5320 radv: s/abs/fabsf/ for floats
Fixes: a4c4efad89 "radv: Rework guard band calculation"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-23 11:43:51 +01:00
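
The difference between the two functions can be sketched as follows. The helper names are hypothetical; the point is only that abs() takes an int, so a float argument is truncated before the absolute value is taken.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration only. */

/* abs() operates on int: the float is converted (truncated toward zero)
 * first, so any fractional magnitude below 1 collapses to 0. The cast is
 * written out here to make the implicit conversion visible. */
static inline float
magnitude_with_abs(float x)
{
   return (float)abs((int)x);
}

/* fabsf() operates on float and keeps the fractional part. */
static inline float
magnitude_with_fabsf(float x)
{
   return fabsf(x);
}
```

For an input of -0.75f the first helper returns 0.0f and the second returns 0.75f.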
Eric Engestrom
8629d807aa meson: drop option description relic
`platforms` is no longer a comma-separated string, and some of our
option descriptions are way too long already. Just drop the incorrect
bit.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-23 11:43:51 +01:00
Jason Ekstrand
8b626a22b2 st/mesa: Record shader access qualifiers for images
They're not required to be the same as the access flag on the image
unit.  For hardware that does shader image lowering based on the
qualifier (Intel), it may be required for state setup.

v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák)
 - Reduce both access and shader_access to uint16_t to avoid making
   the pipe_image_view structure larger.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-23 02:36:24 -07:00
Jason Ekstrand
bf441d22a7 nir/algebraic: Provide descriptive asserts for bit size checks
This will hopefully make debugging opt_algebraic bit-size compile
failures easier.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
932c650e0b nir/algebraic: Loosen a restriction on variables
Previously, we would fail if a variable had an assigned but unknown bit
size X and we tried to assign it an actual bit size.  However, this is
ok because, at the time we do the search, the variable does have an
actual bit size and it will match X because of the NIR rules.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
ea9e651423 nir/algebraic: A bit of validation refactoring
We rename some local variables in validate() to be more readable and
plumb the var through to get/set_var_bit_class instead of the var index.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
641f4be8e8 nir/algebraic: Make internal classes str-able
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
6068be543b nir/algebraic: Generalize an optimization
There's nothing boolean about (a | ~a) ~> -1

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
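
The identity holds for integers of any width, which can be checked with a small sketch. The function names below are hypothetical and only demonstrate the arithmetic, not the opt_algebraic rule syntax.

```c
#include <stdint.h>

/* a | ~a sets every bit, so the result is all-ones (-1 when read as a
 * signed value) regardless of the bit size or the value of a. */
static inline uint8_t  or_not_u8(uint8_t a)   { return (uint8_t)(a | ~a); }
static inline uint32_t or_not_u32(uint32_t a) { return a | ~a; }
static inline uint64_t or_not_u64(uint64_t a) { return a | ~a; }
```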
Jason Ekstrand
69618a8678 nir/algebraic: Use bool internally instead of bool32
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Kenneth Graunke
00103db04a intel: Fix decoding for partial STATE_BASE_ADDRESS updates.
STATE_BASE_ADDRESS only modifies various bases if the "modify" bit is
set.  Otherwise, we want to keep the existing base address.

Iris uses this for updating Surface State Base Address while leaving the
others as-is.

v2: Also update aubinator_viewer_decoder (caught by Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-22 13:38:44 -07:00
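
The decoding rule described above can be sketched as follows. The struct and function are hypothetical simplifications for illustration; they are not the decoder's actual types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one base-address field in a
 * STATE_BASE_ADDRESS packet: an address plus its "modify" bit. */
struct example_base_field {
   bool     modify_enable;
   uint64_t address;
};

/* Return the base the decoder should track after seeing the packet.
 * The stored base changes only when the modify bit is set; otherwise
 * the existing base is kept. */
static inline uint64_t
example_update_base(uint64_t current, struct example_base_field field)
{
   return field.modify_enable ? field.address : current;
}
```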
Jason Ekstrand
16870de8a0 nir: Use nir_src_is_const and nir_src_as_* in core code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Jason Ekstrand
ce36f412c9 nir/search_helpers: Use nir_src_is_const and friends
This not only makes them safe for more bit sizes but it also fixes a bug
in is_zero_to_one where it would return true for constant NaN.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
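
How a range check can accept NaN is sketched below. This assumes the old check was written as a pair of rejections; the commit message does not show the original code, and both function names are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

/* Every ordered comparison with NaN is false, so a check that only
 * rejects "x < 0" and "x > 1" lets NaN through. */
static inline bool
in_unit_range_naive(float x)
{
   return !(x < 0.0f || x > 1.0f);
}

/* Requiring both bounds to hold rejects NaN, because "NaN >= 0" is false. */
static inline bool
in_unit_range(float x)
{
   return x >= 0.0f && x <= 1.0f;
}
```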
Jason Ekstrand
7bae7828aa nir/search: Use nir_src_is_const and friends
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Jason Ekstrand
bca5c2c688 nir: Add some new helpers for working with const sources
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Alyssa Rosenzweig
e0c267c752 mesa/st: Only call nir_lower_io_to_scalar_early on scalar ISAs
On scalar ISAs, nir_lower_io_to_scalar_early enables significant
optimizations. However, on vector ISAs, it is counterproductive and
impedes optimal codegen. This patch only calls
nir_lower_io_to_scalar_early for scalar ISAs. It appears that at present
there are no upstreamed drivers using Gallium, NIR, and a vector ISA, so
for existing code, this should be a no-op. However, this patch is
necessary for the upcoming Panfrost (Midgard) and Lima (Utgard)
compilers, which are vector.

With this patch, Panfrost is able to consume NIR directly, rather than
TGSI with the TGSI->NIR conversion.

For how this affects Lima, see
https://www.mail-archive.com/mesa-dev@lists.freedesktop.org/msg189216.html

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-10-22 20:37:07 +02:00
Dylan Baker
4e785fb383 meson: don't require libelf for r600 without LLVM
r600 doesn't have a hard requirement on LLVM, and therefore doesn't have
a hard requirement on libelf. Currently the logic doesn't allow that
however.

Distro-bug: https://bugs.gentoo.org/669058
Fixes: 5060c51b6f
       ("meson: build r600 driver")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-10-22 11:29:55 -07:00
Jason Ekstrand
ca4e465f7d anv,radv: Trivially expose two new VK_GOOGLE extensions
This patch exposes support for the following two extensions:

 * VK_GOOGLE_decorate_string
 * VK_GOOGLE_hlsl_functionality1

There's nothing for the driver to do; it's all handled in spirv_to_nir.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107971
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:50:20 -05:00
Jason Ekstrand
891886da2f spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1
This extension adds two new decorations which carry meaning only for
HLSL shaders.  They are expected to be handled by higher level layers
and can be ignored by implementations.  However, it does save the client
a bit of work if the implementation safely ignores them instead of the
client having to strip them out of the SPIR-V in order for it to be
valid.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:49:53 -05:00
Jason Ekstrand
5f0322d5c3 spirv: Add support for SPV_GOOGLE_decorate_string
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:49:53 -05:00
Rob Herring
2bb05d70af android: Build kms_swrast for the Android platform
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-22 13:08:17 +01:00
Connor Abbott
27fe3f5b5a ac: Fix loading a dvec3 from an SSBO
The comment was wrong, since the loop above casts to a type with the
correct bitsize already.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Connor Abbott
59535b05cf ac: Introduce ac_build_expand()
And implement ac_build_expand_to_vec4() on top of it.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Eduardo Lima Mitev
fdd926d5b2 ir3/nir: Set up image_dims consts for image_deref_size intrinsic too
`nir_intrinsic_image_deref_size` is not being considered during the scan
for driver constants, so image constants are not emitted if a shader
only ever queries the size of an image (no load, store, atomic op, etc).
This is unlikely, but possible.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-21 21:29:18 +02:00
Karol Herbst
2d235d69c8 nv50/ir: fix ConstantFolding::createMul for 64 bit muls
Fixes: 2f52925f5c
       "nv50/ir: move a * b -> a << log2(b) code into createMul()"

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-10-20 03:00:04 +02:00
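
The a * b -> a << log2(b) rewrite for 64-bit operands can be sketched as follows. The commit message does not say what was wrong, so this only shows the transformation with every intermediate kept at 64 bits; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when v has exactly one bit set. */
static inline bool
is_pow2_u64(uint64_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Index of the highest set bit, computed on the full 64-bit value. */
static inline unsigned
log2_u64(uint64_t v)
{
   unsigned n = 0;
   assert(v != 0);
   while (v >>= 1)
      n++;
   return n;
}

/* Multiply by a power-of-two immediate using a shift. */
static inline uint64_t
mul_pow2_u64(uint64_t a, uint64_t b)
{
   assert(is_pow2_u64(b));
   return a << log2_u64(b);
}
```

Truncating the immediate or the shift result to 32 bits would give a wrong answer for multipliers such as 1 << 40.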
Sonny Jiang
bfb2b90246 radeonsi: Disable clear_state with radeon kernel driver
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-10-19 16:16:57 -04:00
Kenneth Graunke
f91f9bab83 meson: Add -Werror=return-type when supported.
This warning detects non-void functions with a missing return statement,
return statements with a value in void functions, and functions with a
bogus return type that ends up defaulting to int.  It's already enabled
by default with -Wall.  Generally, these are fairly serious bugs in the
code, which developers would like to notice and fix immediately.  This
patch promotes it from a warning to an error, to help developers catch
such mistakes early.

I would not expect this warning to change much based on the compiler
version, so hopefully it won't become a problem for packagers/builders.

See the GCC documentation or 'man gcc' for more details:
https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc/Warning-Options.html#index-Wreturn-type

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-19 10:16:57 -07:00
Jason Ekstrand
0d380af809 anv: Define trampolines as the weak functions
Instead of having weak references to the anv functions and separate
trampoline functions with their own dispatch table, just make the
trampoline functions weak.  This gets rid of a dispatch table and
potentially lets the compiler delete the unused weak function.  The
end result is a reduction in the .text section of 5.7K and a reduction
in the .data section of 1.4K.

Before:

   text	   data	    bss	    dec	    hex	filename
3190329	 282232	   8960	3481521	 351fb1	_install/lib64/libvulkan_intel.so

After:

   text	   data	    bss	    dec	    hex	filename
3184548	 280792	   8960	3474300	 35037c	_install/lib64/libvulkan_intel.so

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-19 11:52:00 -05:00
Juan A. Suarez Romero
f8e789d2ac docs: fix typo in 18.2.3 release notes link
Fixes: 86b4bd52dc ("docs: update calendar, add news item and link
release notes for 18.2.3")

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-10-19 18:48:12 +02:00
Juan A. Suarez Romero
86b4bd52dc docs: update calendar, add news item and link release notes for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-10-19 18:45:41 +02:00
Juan A. Suarez Romero
01f5d37d3e docs: add sha256 checksums for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 27fd12857b)
2018-10-19 18:43:49 +02:00
Juan A. Suarez Romero
e30970e2cd docs: add release notes for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit d219361b42)
2018-10-19 18:43:48 +02:00
Jose Fonseca
45bacc4b63 scons: Remove gles option.
It's broken, and WGL state tracker is always built with GLES support
nowadays.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-19 16:50:26 +01:00
Bas Nieuwenhuizen
68c7833540 radv: Fix WSI & PCI bus info initialization order.
Trying to access the bus info before it is initialized is not going
to work.

Fixes: baa38c144f "vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108491
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2018-10-19 13:24:19 +02:00
Marek Olšák
69a87b5d47 radeonsi: fix a typo in a comment in emit_guardband
2018-10-18 18:01:22 -04:00
Marek Olšák
2a26b1c045 radeonsi: fix gnome-shell crash
I wasn't expecting to get viewports with the center having
negative coordinates.

Broken by: 6cc79e4411
2018-10-18 17:55:44 -04:00
Jason Ekstrand
8c0b9fdfa1 Revert "anv: Stop generating weak references for instance entrypoints"
This reverts commit 00bb42105d.  It was
not as well thought out as I had intended and broke the build when
VK_KHR_display is disabled in the build.
2018-10-18 15:36:26 -05:00
Marek Olšák
77bcbe712e radeonsi: clamp point size to the limit
This fixes dEQP-GLES2.functional.rasterization.limits.points.
Broken by: ea039f789d

Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2018-10-18 16:08:56 -04:00
Marek Olšák
eae8f49fc6 radeonsi: fix a VGT hang with primitive restart on Polaris10 and later
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2018-10-18 16:08:56 -04:00
Marek Olšák
165817d47f radeonsi: fix a deadlock due to partially-initialized context on CI
2018-10-18 16:08:56 -04:00
Jan Vesely
06bf56725d radeonsi: Bump number of allowed global buffers to 32
Fixes assertion failure/crash when running luxmark/luxball on clover.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108272
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-18 16:02:42 -04:00
Andres Rodriguez
e71a87775e radv: fix check for perftest options size
It was using the debug options array size.

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:42:20 -04:00
Marek Olšák
6cc79e4411 radeonsi: fix incorrect hw screen offset and guardband computation
It resulted in assertion failures or incorrect rendering.

Broken by: 9e182b8313
2018-10-18 14:42:42 -04:00
Jason Ekstrand
baa38c144f vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching
This lets us avoid passing the DRM fd around all over the place and gets
us closer to layer utopia.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-18 11:29:00 -05:00
Michel Dänzer
c20ba1be18 loader/dri3: Also wait for front buffer fence if we triggered it
In that case, we have to wait for the fence to synchronize with the
corresponding drawing we triggered in the X server.

Fixes incorrect display with the i965 driver and some applications, e.g.
solvespace.

Bugzilla: https://bugs.freedesktop.org/108097
Fixes: aefac10fec "loader/dri3: Only wait for back buffer fences in
                     dri3_get_buffer"
Tested-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
2018-10-18 16:52:06 +02:00
Jason Ekstrand
00bb42105d anv: Stop generating weak references for instance entrypoints
We don't need weak references to instance entrypoints because we never
have more than one of each so we don't need the NULL fall-back.  This
also helps us avoid forgetting things because we now get link errors for
missing instance entrypoints.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-18 09:17:39 -05:00
Jason Ekstrand
7c65cf9844 vulkan/wsi: Implement GetPhysicalDevicePresentRectanglesKHR
This got missed during 1.1 enabling because it was defined as an
interaction between device groups and WSI and it wasn't obvious it was
in the delta.

The idea behind it is that it's supposed to provide a hint to the
application in a multi-GPU setup to indicate which regions of the screen
are being scanned out by which GPU so a multi-device split-screen
rendering application can render each part of the screen on the GPU that
will be presenting it and avoid extra bus traffic between GPUs.  On a
single-GPU setup or one which doesn't support this present mode, we need
to do something.  We choose to return the window size (or a max-size
rect) if the compositor, X server, or crtc is associated with the given
physical device and zero rectangles otherwise.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-18 09:17:39 -05:00
Jason Ekstrand
7629c00557 vulkan/wsi: Store the instance allocator in wsi_device
We already have wsi_device and we know the instance allocator at
wsi_device_init time so there's no need to pass it into the physical
device queries.  This also fixes a memory allocation domain bug that can
occur if CreateSwapchain gets called prior to any queries (not likely)
in which case the cached connection gets allocated off the device
instead of the instance.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-18 09:17:39 -05:00
Michał Janiszewski
0ef50ecc69 st/xlib: Use more appropriate include guard
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-18 11:03:04 +01:00
Michał Janiszewski
bcc613acc1 gallium: Fix mismatched ifdef-guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-18 11:03:03 +01:00
Gert Wollny
74adc624b6 softpipe: dynamically allocate space for immediate constants
The number of immediate constants was fixed and the size check was
only done by means of an assertion. As a result, a shader that emits
more immediate constants would cause memory corruption when
mesa is built in release mode.

Instead of using this fixed limit allocate the space dynamically, let it
grow as needed, and also remove the unused ImmArray.

Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-18 10:59:51 +02:00
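
A grow-as-needed store of the kind described above can be sketched as follows. The struct, the function, and the doubling policy are assumptions made for illustration; they are not the softpipe code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical store of vec4 immediates that grows on demand. */
struct example_imm_store {
   float (*data)[4];
   unsigned count;
   unsigned capacity;
};

/* Append one vec4. Returns 0 on success and -1 if allocation fails,
 * in which case the store is left unchanged. */
static int
example_imm_push(struct example_imm_store *s, const float v[4])
{
   if (s->count == s->capacity) {
      /* Double the capacity (assumed policy), starting at 16 entries. */
      unsigned new_cap = s->capacity ? s->capacity * 2 : 16;
      float (*grown)[4] = realloc(s->data, new_cap * sizeof(*grown));
      if (!grown)
         return -1;
      s->data = grown;
      s->capacity = new_cap;
   }
   memcpy(s->data[s->count++], v, sizeof(float[4]));
   return 0;
}
```

A zero-initialized store is valid to push into, and the caller frees data when done. Unlike an assert-only bound, the check stays active in release builds.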
Timothy Arceri
3a95396f3c radv: use nir_shrink_vec_array_vars()
Totals from affected shaders:
SGPRS: 1096 -> 1096 (0.00 %)
VGPRS: 1192 -> 1056 (-11.41 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 100940 -> 94384 (-6.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 100 -> 112 (12.00 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
8086fa1bcd radv: use nir_split_array_vars()
We call in the opt loop in case another pass results in an
array with indirect access being turned into direct access.

Totals from affected shaders:
SGPRS: 512 -> 496 (-3.12 %)
VGPRS: 456 -> 452 (-0.88 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 40040 -> 39664 (-0.94 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 41 -> 43 (4.88 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
06675711e7 radv: use nir_opt_find_array_copies()
Totals from affected shaders:
SGPRS: 1112 -> 1112 (0.00 %)
VGPRS: 1492 -> 1196 (-19.84 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 112172 -> 101316 (-9.68 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 93 -> 98 (5.38 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from "Batman: Arkham City" over DXVK.

The pass detects that the temporary array created by DXVK for
storing TCS inputs is a copy of the input arrays. This allows
us to avoid copying all of the input data and then indirecting
on it with if-ladders; instead we just do indirect indexing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
9d5b106b2e radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars
Totals from affected shaders:
SGPRS: 2856 -> 2856 (0.00 %)
VGPRS: 3236 -> 3248 (0.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 236560 -> 233548 (-1.27 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 277 -> 283 (2.17 %)
Wait states: 0 -> 0 (0.00 %)

Even in the cases where we have increased VGPR use it appears
the NIR is improved significantly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Keith Packard
67a2c1493c vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]
Offers three clocks, device, clock monotonic and clock monotonic
raw. Could use some kernel support to reduce the deviation between
clock values.

v2:
	Ensure deviation is at least as big as the GPU time interval.

v3:
	Set device->lost when returning DEVICE_LOST.
	Use MAX2 and DIV_ROUND_UP instead of open coding these.
	Delete spurious TIMESTAMP in radv version.

	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
	Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

v4:
	Add anv_gem_reg_read to anv_gem_stubs.c

	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

v5:
	Adjust maxDeviation computation to max(sampled_clock_period) +
	sample_interval.

	Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-17 20:10:15 -07:00
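
The v5 formula, max(sampled_clock_period) + sample_interval, can be sketched as follows. The function name, the nanosecond units, and the reading of sample_interval as the time between the start and end of sampling are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: maxDeviation is the coarsest period among the
 * sampled clocks plus the time spent taking the samples. */
static inline uint64_t
example_max_deviation(const uint64_t *clock_periods_ns, unsigned count,
                      uint64_t begin_ns, uint64_t end_ns)
{
   uint64_t max_period = 0;
   for (unsigned i = 0; i < count; i++) {
      if (clock_periods_ns[i] > max_period)
         max_period = clock_periods_ns[i];
   }
   return max_period + (end_ns - begin_ns);
}
```

With clock periods of 1 ns, 52 ns and 1 ns, and sampling that took 300 ns, the reported deviation is 352 ns.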
Topi Pohjolainen
a11cafbd7a intel/compiler/icl: Use invocation id bits 22:16 instead of 23:17
Identifier bits in the dispatch header have changed. See Bspec:

SINGLE_PATCH Payload:

3D Pipeline Stages - 3D Pipeline Geometry -
Hull Shader (HS) Stage IVB+ - Payloads IVB+

Fixes: KHR-GL46.tessellation_shader.tessellation_shader_tc_barriers.barrier_guarded_read_write_calls

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-10-17 21:19:57 +03:00
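
The effect of reading the wrong bit range can be sketched with a generic field extractor. The helper is hypothetical and is not the compiler's code; it only shows that bits 22:16 and bits 23:17 of the same dword give different values.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the inclusive bit range [high:low] from a 32-bit value. */
static inline uint32_t
example_bits(uint32_t value, unsigned high, unsigned low)
{
   assert(high < 32 && low <= high);
   unsigned width = high - low + 1;
   uint32_t mask = width == 32 ? UINT32_MAX : ((1u << width) - 1u);
   return (value >> low) & mask;
}
```

If an ID of 5 is stored in bits 22:16, reading bits 23:17 returns 2 instead.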
Neil Roberts
a9475d9337 Fix setting indent-tabs-mode in the Emacs .dir-locals.el files
Some of the .dir-locals.el had the wrong name for the truthy value so
it wasn’t setting indent-tabs-mode.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-17 19:03:08 +02:00
Rob Clark
d27b1c83b9 freedreno/a6xx: don't allocate binning rb
Now that a single cmdstream is used for both binning and draw passes, we
can skip allocation of cmdstream buffer for binning.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00
Rob Clark
24d57a6d8f freedreno/a6xx: single cmdstream for draw+binning
Now that state which is different for draw vs binning pass is split out
into different state-groups with appropriate enable_mask (so the
appropriate one is chosen for draw vs binning), switch over to using a
single cmdstream for both passes.

This should significantly lower draw overhead for CPU bound benchmarks.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00
Rob Clark
72f6164fef freedreno/a6xx: split binning vs draw program stateobj's
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00
Rob Clark
3313d693af freedreno/a6xx: split VBO state into binning/draw variants
Blob seems to manage to use same input registers for BS (binning pass)
vs VS (draw pass) shaders, so it can use the same VBO state for both.
We can't quite do that yet, so split them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00
Rob Clark
b23fc4cacb freedreno/a6xx: move VBO state to stateobj
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00
Rob Clark
e194056832 freedreno/a6xx: move ZSA state to stateobj
Step towards single cmdstream, where we need different state-group-id's
for binning vs draw ZSA state.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
a50a9a44e8 freedreno/a6xx: remove vismode param
We don't need to keep this IGNORE_VISIBILITY in binning pass.  Prep work
for using single cmdstream for both draw and binning passes.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
d9dbc9c21f freedreno/ir3: move binning-pass fixup for a6xx+
Move this to after ir3_cp (which can add lowered immediates to the const
state) for a6xx+, to ensure the uniform state matches between binning
and vertex shaders.  This way we can emit just a single VS_CONST state-
group when we re-use single cmdstream for both binning and draw passes.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
1a51c4a87e freedreno/a6xx: a bit more state emit cleanup
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
2ffc79c7d1 freedreno/a6xx: move framebuffer state emit to emit_mrt()
No point in checking this per-draw, since framebuffer change means new
batch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
5894f37b85 freedreno/a6xx: small emit_mrt() cleanup
On a6xx, this is only used for pfb->cbufs so we can just directly pass
the pfb state.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
b4e94af37d freedreno/a6xx: use program cache
Use the in-memory cache to construct shader program state and re-use it
on subsequent draws, to lower driver overhead.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
1d7fbe2cd1 freedreno/ir3: shader variant cache
Cache that maps gallium hwcso (in this case, 'struct ir3_shader') plus
shader variant key to a generation specific state object.

This could eventually replace the linked list of shader variants, but
for now it lets us re-use the work currently done in fdN_program_emit()

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
2e9c08c0bc freedreno/ir3: move binning_pass out of shader variant key
Prep work for a following patch, that introduces a cache to map from
program state (all shader stages) plus variant key to pre-baked hw
state (which could be emit'd via CP_SET_DRAW_STATE, for example).
To do that, we really want the variant key to be immutable, and to
treat the binning pass shader as an extra shader stage, rather than
as a VS variant.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
8b1a3b5dde freedreno/ir3: track # of samplers used by shader
This is useful for a6xx to avoid program state from depending on bound
tex/samp state.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
1b9d69410c freedreno/a6xx: texture state obj
Unfortunately gallium doesn't match what the hw wants perfectly here, in
using a separate CSO for each texture/sampler.  So we have to use a hash
table to map the collection of texture/samplers to hw state object.

We probably could use separate hw state objects for texture and sampler
state, but mesa/st tends to update the tex and samp state together.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
e8606b11dd freedreno: add resource seqno
Intended to be something more compact than a 64b pointer, which could be
used as a key into hashtables.  Prep work for texture state objects.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
abcdf5627a freedreno/a6xx: move const emit to state group
Eventually we want to move nearly everything, but no other state depends
on const state, so this is the easiest one to move first.

For webgl aquarium, this reduces GPU load by about 10%, since for each
fish it does a uniform upload plus draw.. fish frequently are visible in
only a single tile, so this skips the uniform uploads for other tiles.

The additional step of avoiding WFI's when using CP_SET_DRAW_STATE seems
to be worth an additional 10% gain for aquarium.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
a398d26fd2 freedreno/a6xx: add infrastructure for CP_DRAW_STATE
Add helper to add state-groups to emit, and code to emit CP_DRAW_STATE
packet if we have any state-groups.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Neil Roberts
ee61790daf freedreno: Remove the Emacs mode lines
These are not necessary because the corresponding settings are set via
the .dir-locals.el file anyway. Most of them were missing a ‘:’ after
“tab-width” which was making Emacs display an annoying warning
whenever you open the file.

This patch was made with:

sed -ri '/-\*- mode:/,/^$/d' \
    $(find src/gallium/{drivers,winsys} -name \*.\[ch\] \
               -exec grep -l -- '-\*- mode:' {} \+)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Neil Roberts
afe640b360 freedreno: Fix the Emacs indentation configuration file
The .dir-locals.el had the wrong name for the truthy value so it
wasn’t setting indent-tabs-mode.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Hyunjun Ko
8e798e28f7 freedreno: allocate batches from the cache in launch_grid
Batches need to be allocated from the cache so that they get a
valid index and resource dependency tracking works correctly.

In addition, this fixes an assertion on debug builds since commit
1a40faa8 landed.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Hyunjun Ko
2385d7b066 freedreno: adds nondraw param to fd_bc_alloc_batch
The nondraw flag needs to be specified when creating a batch through
fd_bc_alloc_batch, since it is better to create a batch through
it rather than fd_batch_create.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
9e6019bd46 freedreno/a6xx: remove fd6_emit_render_cntl()
It was dead code carried over from a5xx

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
835cb06965 freedreno/ir3: fix broken texcoord inputs
TODO not sure if this is best solution, but current logic is broken for
texcoord inputs.  It is definitely the simplest solution.

Fixes: 1a24f51966 freedreno/ir3: ignore unused inputs
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Rob Clark
cbf9fe50b5 freedreno: fix off-by-one error in BEGIN_RING()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:48 -04:00
Marek Olšák
669dd22983 util: document a limitation of util_fast_udiv32
trivial
2018-10-17 12:27:58 -04:00
Matt Turner
58a51d0a67 i965/fs: Add 64-bit int immediate support to dump_instructions()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-10-16 17:48:17 -07:00
Marek Olšák
fcc70e4855 radeonsi: track context rolls better for the Vega scissor bug workaround
We should get fewer context rolls with the SET_CONTEXT_REG optimization,
but it would have been for nothing if the scissor state rolled the context
anyway. Don't emit the scissor state if there is no context roll.
2018-10-16 17:23:25 -04:00
Marek Olšák
25ddb15cfe radeonsi: emit sample locations for 1xAA only when the hw bug is present
2018-10-16 17:23:25 -04:00
Marek Olšák
9b331e462e radeonsi: use compute shaders for clear_buffer & copy_buffer
Fast color clears should be much faster. Also, fast color clears on
evicted buffers should be 200x faster on GFX8 and older.
2018-10-16 17:23:25 -04:00
Marek Olšák
5030adcbe0 radeonsi: use copy_buffer in buffer_do_flush_region directly
2018-10-16 17:23:25 -04:00
Marek Olšák
0b40fbc879 radeonsi: use faster integer division for instance divisors
We know the divisors when we upload them, so instead we can precompute
and upload division factors derived from each divisor.

This fast division consists of add, mul_hi, and two shifts,
and we have to load 4 dwords instead of 1.
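
As an illustration of precomputing division factors for a known divisor, here
is a minimal sketch in C. It is not the radeonsi or ac code: it uses the
textbook Granlund-Montgomery round-up scheme (multiply-high, subtract, add,
two shifts), which stores three values per divisor rather than the four dwords
mentioned above. The names udiv_magic, udiv_magic_init and udiv_fast are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the factors precomputed from one divisor. */
struct udiv_magic {
    uint32_t multiplier; /* m' = floor(2^32 * (2^l - d) / d) + 1 */
    uint32_t shift1;     /* min(l, 1) */
    uint32_t shift2;     /* max(l - 1, 0) */
};

/* Precompute the factors once, when the divisor becomes known. */
static struct udiv_magic udiv_magic_init(uint32_t d)
{
    assert(d != 0);

    /* l = ceil(log2(d)) */
    uint32_t l = 0;
    while (((uint64_t)1 << l) < d)
        l++;

    struct udiv_magic m;
    /* 2^l - d is smaller than d, so the quotient plus one fits in 32 bits. */
    m.multiplier = (uint32_t)(((((uint64_t)1 << l) - d) << 32) / d + 1);
    m.shift1 = l < 1 ? l : 1;
    m.shift2 = l > 0 ? l - 1 : 0;
    return m;
}

/* n / d using only a multiply-high, a subtract, an add and two shifts. */
static uint32_t udiv_fast(uint32_t n, struct udiv_magic m)
{
    uint32_t t = (uint32_t)(((uint64_t)m.multiplier * n) >> 32);
    /* t <= n, so neither the subtraction nor the addition can wrap. */
    return (t + ((n - t) >> m.shift1)) >> m.shift2;
}
```

The per-divisor cost moves to upload time, which matches the trade-off
described in the commit: more data to load per divisor, but no hardware
division at run time.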

This probably won't affect any apps.
2018-10-16 17:23:25 -04:00
Marek Olšák
bfc795670e ac: add helpers for fast integer division by a constant
2018-10-16 17:23:25 -04:00
Marek Olšák
ea039f789d radeonsi: use higher subpixel precision (QUANT_MODE) for smaller viewports
2018-10-16 15:28:22 -04:00
Marek Olšák
4fd8d2df9c radeonsi: move emission of PA_SU_VTX_CNTL into emit_guardband
We'll modify the quant mode there, which also affects the guardband
computation.
2018-10-16 15:28:22 -04:00
Marek Olšák
41a6c3de1f radeonsi: don't re-upload the sample position constant buffer repeatedly
2018-10-16 15:28:22 -04:00
Marek Olšák
b94824c787 radeonsi: set PA_SU_PRIM_FILTER_CNTL optimally
2018-10-16 15:28:22 -04:00
Marek Olšák
9e182b8313 radeonsi: center viewport to improve guardband clipping for high resolutions
This will be more useful when we change the quant mode to increase subpixel
precision and decrease the viewport range (which might not be possible
if the viewport is not centered in the viewport range).
2018-10-16 15:28:22 -04:00
Marek Olšák
fedc1fda30 radeonsi: save raster config in screen, add se_tile_repeat
2018-10-16 15:28:22 -04:00
Marek Olšák
ac76aeef20 radeonsi: switch back to standard DX sample positions
Apps may rely on them.
2018-10-16 15:28:22 -04:00
Marek Olšák
67f02cf810 radeonsi: add GDS support to CP DMA
2018-10-16 15:28:22 -04:00
Marek Olšák
0d05581578 radeonsi: rename si_gfx_* functions to si_cp_*
and write_event_eop -> release_mem
2018-10-16 15:28:22 -04:00
Marek Olšák
6e1cf6532d radeonsi: make si_gfx_write_event_eop more configurable
2018-10-16 15:28:22 -04:00
Sergii Romantsov
0fa9e6d7b3 anv/skylake: disable ForceThreadDispatchEnable
On Skylake, enabling ForceThreadDispatchEnable causes a GPU hang.

-v2: Enable ForceThreadDispatchEnable only on gen8; for gen9 and
     higher, revert to enabling PixelShaderHasUAV.

-v3 (Jason Ekstrand): Rework the comments a bit.

CC: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107941
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760
Fixes: 79270d2140 (anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-16 13:20:51 -05:00
Lionel Landwerlin
322a919a41 anv: Implement VK_EXT_pci_bus_info
Even though Intel GPUs are always at the same PCI location, all the
info we need is already provided by libdrm. Let's be future proof.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-16 12:47:55 +01:00
Jose Fonseca
8550be7a2f appveyor: Cache pip's cache files.
It should speed up the Python packages installation.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-16 09:41:14 +01:00
Jose Fonseca
bfb8afb14d appveyor: Update to newer Mako/winflexbison versions.
As that's what most people are bound to use.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-16 09:41:12 +01:00
Jose Fonseca
b94f9cd8f9 appveyor: Update to MSVC 2017.
That's what we (and I suppose most people out there) are using now.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-16 09:41:07 +01:00
Samuel Pitoiset
647c2b90e9 radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT
This feature isn't used for now, so disable it until
wwm is fixed in LLVM.

Fixes dEQP-VK.subgroups.vote.graphics.subgroupallequal*

https://bugs.freedesktop.org/show_bug.cgi?id=108115
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-16 10:24:19 +02:00
Samuel Pitoiset
593996bc02 radv: implement buffer to image operations for R32G32B32
This should fix rendering issues with Batman Arkham City.
We will probably need to implement itob and itoi at some
point, but currently nothing hits these paths.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-16 09:22:38 +02:00
Alex Smith
ca83d51cfb ac/nir: Use context-specific LLVM types
LLVMInt*Type() return types from the global context and therefore are
not safe for use in other contexts. Use types from our own context
instead.

Fixes frequent crashes seen when doing multithreaded pipeline creation.

Fixes: 4d0b02bb5a "ac: add support for 16bit load_push_constant"
Fixes: 7e7ee82698 "ac: add support for 16bit buffer loads"
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-16 08:18:24 +01:00
Vadym Shovkoplias
ad558408ff glsl: Check the subroutine associated functions names
Add a compile-time check for subroutine functions with
the same name. A similar check for intrastage linking already
landed in commit 5f0567a4f6.

From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification

    "A program will fail to compile or link if any shader
     or stage contains two or more functions with the same
     name if the name is associated with a subroutine type."

Fixes:
    * no-overloads.vert

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-16 08:15:21 +03:00
Vadym Shovkoplias
d2ea3d4a76 glsl/linker: Change the format of spec quotation
Also there is no "OpenGL ES Shading Language 4.00" spec,
so change it to GLSL 4.00 spec.

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-16 08:15:21 +03:00
Dave Airlie
ff281e6204 nir: fix clip cull lowering to not assert if GLSL already lowered.
If GLSL has already done the lowering, we'd rather not crash in this pass.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-15 18:53:48 -07:00
Kenneth Graunke
5bd8369681 i965: Add PCI IDs for new Amberlake parts that are Coffeelake based
See commit c0c46ca461f136a0ae1ed69da6c874e850aeeb53 in the Linux kernel,
where José Roberto de Souza added this new PCI ID there.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-10-15 18:10:27 -07:00
Kenneth Graunke
8f8111646c intel: disable FS IR validation in release mode.
We probably don't need to iterate, fprintf, and abort in release mode.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-10-15 18:10:27 -07:00
Caio Marcelo de Oliveira Filho
b3c6146925 nir: Copy propagation between blocks
Extend the pass to propagate the copies information along the control
flow graph.  It performs two walks, first it collects the vars
that were written inside each node. Then it walks applying the copy
propagation using a list of copies previously available.  At each node
the list is invalidated according to results from the first walk.
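
The invalidation step described above can be sketched in C as follows. This is
a toy model under loud assumptions, not the NIR pass: variables are plain ids
below 32, the result of the first walk is a bitmask of written variables per
branch, and all names (copy_entry, copy_list, copies_after_if) are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_COPIES = 16 };

/* "dst currently holds the same value as src"; ids must be below 32. */
struct copy_entry {
    unsigned dst;
    unsigned src;
};

struct copy_list {
    struct copy_entry items[MAX_COPIES];
    size_t count;
};

static void copy_list_add(struct copy_list *list, unsigned dst, unsigned src)
{
    assert(list->count < MAX_COPIES && dst < 32 && src < 32);
    list->items[list->count++] = (struct copy_entry){ dst, src };
}

/* Drop every entry that mentions a variable contained in written_mask. */
static void copy_list_invalidate(struct copy_list *list, uint32_t written_mask)
{
    /* Walk backward so swapping in the last element never skips an entry. */
    for (size_t i = list->count; i-- > 0; ) {
        uint32_t used = (1u << list->items[i].dst) | (1u << list->items[i].src);
        if (used & written_mask)
            list->items[i] = list->items[--list->count];
    }
}

/* Copies still valid after an if: those known before it that neither
 * branch wrote to, according to the masks gathered by the first walk. */
static struct copy_list copies_after_if(const struct copy_list *before,
                                        uint32_t written_then,
                                        uint32_t written_else)
{
    struct copy_list after = *before; /* clone; the caller's list is untouched */
    copy_list_invalidate(&after, written_then | written_else);
    return after;
}
```

Cloning by value is what makes the branch handling cheap here, which is the
same motivation the commit gives for moving the real list to util_dynarray.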

This approach is simpler than a full data-flow analysis, but covers
various cases.  If derefs are used for operating on more memory
resources (e.g. SSBOs), the difference from a regular pass is expected
to be more visible -- as the SSA copy propagation pass won't apply to
those.

A full data-flow analysis would handle more scenarios: conditional
breaks in the control flow and merge equivalent effects from multiple
branches (e.g. using a phi node to merge the source for writes to the
same deref).  However, as previous commentary in the code stated, its
complexity 'rapidly get out of hand'.  The current patch is a good
intermediate step towards more complex analysis.

The 'copies' linked list was modified to use util_dynarray to make it
more convenient to clone it (to handle ifs/loops).

Annotated shader-db results for Skylake:

    total instructions in shared programs: 15105796 -> 15105451 (<.01%)
    instructions in affected programs: 152293 -> 151948 (-0.23%)
    helped: 96
    HURT: 17

        All the HURTs and many HELPs are one instruction.  Looking
        at pass by pass outputs, the copy prop kicks in removing a
        bunch of loads correctly, which ends up altering which
        other optimizations kick in.  In those cases the copies would
        be propagated after lowering to SSA.

        In a few HELPs we actually do more than was
        possible previously, e.g. consolidating load_uniforms from
        different blocks.  Most of those are from
        shaders/dolphin/ubershaders/.

    total cycles in shared programs: 566048861 -> 565954876 (-0.02%)
    cycles in affected programs: 151461830 -> 151367845 (-0.06%)
    helped: 2933
    HURT: 2950

        A lot of noise on both sides.

    total loops in shared programs: 4603 -> 4603 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total spills in shared programs: 11085 -> 11073 (-0.11%)
    spills in affected programs: 23 -> 11 (-52.17%)
    helped: 1
    HURT: 0

        The shaders/dolphin/ubershaders/12.shader_test was able to
        pull a couple of loads from inside if statements and reuse
        them.

    total fills in shared programs: 23143 -> 23089 (-0.23%)
    fills in affected programs: 2718 -> 2664 (-1.99%)
    helped: 27
    HURT: 0

        All from shaders/dolphin/ubershaders/.

    LOST:   0
    GAINED: 0

The other generations follow the same overall shape.  The spills and
fills HURTs are all from the same game.

shader-db results for Broadwell:

    total instructions in shared programs: 15402037 -> 15401841 (<.01%)
    instructions in affected programs: 144386 -> 144190 (-0.14%)
    helped: 86
    HURT: 9

    total cycles in shared programs: 600912755 -> 600902486 (<.01%)
    cycles in affected programs: 185662820 -> 185652551 (<.01%)
    helped: 2598
    HURT: 3053

    total loops in shared programs: 4579 -> 4579 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total spills in shared programs: 80929 -> 80924 (<.01%)
    spills in affected programs: 720 -> 715 (-0.69%)
    helped: 1
    HURT: 5

    total fills in shared programs: 93057 -> 93013 (-0.05%)
    fills in affected programs: 3398 -> 3354 (-1.29%)
    helped: 27
    HURT: 5

    LOST:   0
    GAINED: 2

shader-db results for Haswell:

    total instructions in shared programs: 9231975 -> 9230357 (-0.02%)
    instructions in affected programs: 44992 -> 43374 (-3.60%)
    helped: 27
    HURT: 69

    total cycles in shared programs: 87760587 -> 87727502 (-0.04%)
    cycles in affected programs: 7720673 -> 7687588 (-0.43%)
    helped: 1609
    HURT: 1416

    total loops in shared programs: 1830 -> 1830 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total spills in shared programs: 1988 -> 1692 (-14.89%)
    spills in affected programs: 296 -> 0
    helped: 1
    HURT: 0

    total fills in shared programs: 2103 -> 1668 (-20.68%)
    fills in affected programs: 438 -> 3 (-99.32%)
    helped: 4
    HURT: 0

    LOST:   0
    GAINED: 1

v2: Remove the DISABLE prefix from tests we now pass.

v3: Add comments about missing write_mask handling. (Caio)
    Add unreachable when switching on cf_node type. (Jason)
    Properly merge the component information in written map
    instead of replacing. (Jason)
    Explain how removal from written arrays works. (Jason)
    Use mode directly from deref instead of getting the var. (Jason)

v4: Register the local written mode for calls. (Jason)
    Prefer cf_node instead of node. (Jason)
    Clarify that remove inside iteration only works in backward
    iterations. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
dc349f07b5 nir: Take call instruction into account in copy_prop_vars
Calls are not used yet (functions are inlined), but since new code is
already taking them into account, do it here too.  The convention here
and in other places is that neither writable memory nor global
variables are assumed to remain unchanged across a call.

Also, explicitly state the modes affected (instead of using the
reverse logic) in one of the apply_for_barrier_modes calls.

Suggested by Jason.

v2: Consider local vars used by a call to be conservative, SPIR-V has
    such cases. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
797f01c220 nir: Add tests for copy propagation of derefs
Also tests for removal of redundant loads, that we currently handle as
part of the copy propagation.

Note some tests involve multiple blocks and are currently DISABLED
because they (expectedly) fail.

v2: Add missing DISABLED prefix to "multi block" tests. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
4dfa7adc10 nir: Remove handling of dead writes from copy_prop_vars
These are covered by another pass now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
c20dd1f77c intel/nir, freedreno/ir3: Use the separated dead write vars pass
No changes to shader-db for intel.
No changes to shader-db expected for freedreno.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
cb126cf67a nir: Separate dead write removal into its own pass
Instead of doing this as part of the existing copy_prop_vars pass.

Separation makes it easier to expand the scope of both passes to be more
than per-block.  For copy propagation, the information about valid
copies comes from previous instructions, while dead write removal
depends on information from later instructions ("has any instruction
used this deref before it is overwritten?").

Also change the tests to use this pass (instead of copy prop vars).
Note that the disabled tests continue to fail, since the standalone
pass is still per-block.

v2: Remove entries from dynarray instead of marking items as
    deleted.  Use foreach_reverse. (Caio)

    (all from Jason)
    Do not cache nir_deref_path.  Not worth it for this patch.
    Clear unused writes when hitting a call instruction.
    Clean up enumeration of modes for barriers.
    Move metadata calls to the inner function.

v3: For copies, use the vector length to calculate the mask.

    (all from Jason)
    Use nir_component_mask_t when applicable.
    Rename functions for clarity.
    Consider local vars used by a call to be conservative (SPIR-V has
    such cases).
    Comment and assert the assumption that stores and copies are
    always to a deref that ends with a vector or scalar.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
a02fd7000d nir: Add tests for dead write elimination
Note at the moment the pass called is nir_opt_copy_prop_vars, because
dead write elimination is implemented there.

Also added tests that involve identifying dead writes in multiple
blocks (e.g. the overwrite happens in another block).  Those currently
fail as expected, so are marked to be skipped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
bbda2a17f7 nir: Add test file for vars related passes
Add basic helpers for doing tests on the vars related optimization
passes.  The main goal is to lower the barrier to create tests during
development and debugging of the passes.  Full coverage is not a
requirement.

v2: Make find_next_intrinsic() skip blocks before 'after'. (Jason)
    Move nir_imm_ivec2() to nir_builder.h. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
c869646b7d nir: Add nir_imm_ivec2 helper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
3966f053a1 util: Add foreach_reverse for dynarray
Useful to walk the array removing elements by swapping them with the
last element.
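
A minimal sketch of that removal pattern on a plain C array, assuming nothing
about the util_dynarray API itself; remove_even is a hypothetical helper.
Walking backward is what makes the swap safe: the element moved into slot i
comes from the tail, which has already been visited.

```c
#include <assert.h>
#include <stddef.h>

/* Remove every even value from arr[0..*len) in place.
 * Element order is not preserved. */
static void remove_even(int *arr, size_t *len)
{
    assert(len != NULL && (arr != NULL || *len == 0));

    /* "i-- > 0" tests before decrementing, so the unsigned index is never
     * used after wrapping below zero. */
    for (size_t i = *len; i-- > 0; ) {
        if (arr[i] % 2 == 0) {
            arr[i] = arr[*len - 1]; /* overwrite with the last element */
            (*len)--;               /* and shrink the array by one */
        }
    }
}
```

A forward walk with the same swap would skip the element that was just moved
into the current slot.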

v2: Change iteration to make sure we never underflow. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Eric Anholt
8ec83dc51e v3d: Add support for hardware pack/unpack of half floats.
Cuts the formerly 7-minute simulation time of fs-packHalf2x16.shader_test
in half.
2018-10-15 17:16:44 -07:00
Eric Anholt
7d77fe1bcc nir: Expose nir_remove_unused_io_vars().
For gallium drivers where you want to do some linking at variant compile
time, you don't have the other producer/consumer shader on hand to modify.
By exposing the inner function, the driver can have the used varyings in
the compiled shader cache key and still do linking.

This is also useful for V3D, where the binning shader wants to only output
position and TF varyings.  We've been removing those after nir_lower_io,
but this will be less driver-specific code and let more of the shader get
DCEed early in NIR.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-15 17:16:44 -07:00
Eric Anholt
b788ab6d5c nir: Be sure to fix deref modes after demoting shader i/o vars to global.
Fixes assertion failures when calling nir_remove_unused_varyings() or
nir_remove_unused_io_vars().

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-15 17:16:44 -07:00
Eric Anholt
dda1ae9b3c gallium/ttn: Convert inputs and outputs to derefs of variables.
This means that TTN shaders more closely resemble GTN shaders: they have
inputs and outputs as variable derefs, with the variables having their
.driver_location already set up for you.

This will be useful for v3d to do input variable DCE in NIR, which we
can't do when the TTN shaders never have a pre-nir_lower_io stage.

Acked-by: Rob Clark <robdclark@gmail.com>
2018-10-15 17:16:43 -07:00
Eric Anholt
da15a0d88e gallium/ttn: Fix the type of gl_FragDepth.
In TGSI we have a vec4 of which only .z is used, but for NIR we should be
using a float the same as other NIR IR.  We were already moving TGSI's .z
to the .x channel.

Acked-by: Rob Clark <robdclark@gmail.com>
2018-10-15 17:16:43 -07:00
Kristian H. Kristensen
f93e431272 freedreno/a6xx: Enable blitter
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-15 15:22:38 -07:00
Kristian H. Kristensen
47bc9fad3e freedreno/a6xx: Update headers
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-15 15:22:35 -07:00
Kristian H. Kristensen
421863412c freedreno/a6xx: Remove unnecessary GRAS_2D_BLIT_INFO write
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-15 15:20:28 -07:00
Jason Ekstrand
e4c9bcd037 anv: Don't advertise ASTC support on BSW
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-10-15 16:55:25 -05:00
Samuel Pitoiset
26a2ce35ab radv: do not force the flat qualifier for clip/cull distances
This fixes some new CTS tests that read clip/cull distances
from the fragment shader stage:

dEQP-VK.clipping.user_defined.clip_*

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-15 21:55:28 +02:00
Samuel Pitoiset
80c84bdba9 radv: bump discreteQueuePriorities to 2
It's the minimum value required by the spec.

This fixes dEQP-VK.api.info.device.properties.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-15 21:55:25 +02:00
Jason Ekstrand
ae18c53ba6 anv: Split dispatch tables into device and instance
There's no reason why we need to generate trampoline functions for instance
functions or carry N copies of the instance dispatch table around for
every hardware generation.  Splitting the tables and being more
conservative shaves about 34K off .text and about 4K off .data when
built with clang.

Before splitting dispatch tables:

   text	   data	    bss	    dec	    hex	filename
3224305	 286216	   8960	3519481	 35b3f9	_install/lib64/libvulkan_intel.so

After splitting dispatch tables:

   text	   data	    bss	    dec	    hex	filename
3190325	 282232	   8960	3481517	 351fad	_install/lib64/libvulkan_intel.so

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-15 13:30:24 -05:00
Kenneth Graunke
18cc65edf8 i965: Drop assert about number of uniforms in ARB handling.
My recent prog_to_nir patch started making new sampler uniforms, which
apparently increased the number of parameters.  We used to poke at the
one parameter directly, making it important that there was only one,
but we haven't done that in a while.  It should be safe to just delete
the assertion.

Fixes: 1c0f92d8a8 "nir: Create sampler variables in prog_to_nir."
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 10:56:12 -07:00
Jason Ekstrand
2241be1d1b vulkan: Add the fuchsia headers
These were missing in the last couple of spec updates.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-15 10:20:31 -05:00
Bas Nieuwenhuizen
6ed0fd24d4 radv: Implement VK_EXT_pci_bus_info.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-15 12:27:49 +02:00
Kenneth Graunke
38a23517fd gallium/u_transfer_helper: Add support for separate Z24/S8 as well.
u_transfer_helper already had code to handle treating packed Z32_S8
as separate Z32_FLOAT and S8_UINT resources, since some drivers can't
handle that interleaved format natively.

Other hardware needs depth and stencil as separate resources for all
formats.  For example, V3D3 needs this for 24-bit depth as well.

This patch adds a new flag to lower all depth/stencil formats, and
implements support for Z24_UNORM_S8_UINT.  (S8_UINT_Z24_UNORM is left
as an exercise to the reader, preferably someone who has access to a
machine that uses that format.)

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-14 23:36:28 -07:00
Kenneth Graunke
c3d219837a gallium/format: Add a helper to combine separate Z24 and S8 stencil.
This new function takes separate Z24 depth and S8 stencil sources,
and packs them into a single combined Z24S8 buffer.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-14 23:36:28 -07:00
Kenneth Graunke
5849e0612c gallium/auxiliary: Add util_format_get_depth_only() helper.
This will be used by u_transfer_helper.c shortly, in order to split
packed depth-stencil into separate resources.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-14 23:36:28 -07:00
Kenneth Graunke
1c0f92d8a8 nir: Create sampler variables in prog_to_nir.
This is needed for nir_gather_info to actually count the textures,
since it operates solely on variables.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-14 23:35:47 -07:00
Kenneth Graunke
ed169c9ad2 nir: Create sampler2D variables in nir_lower_{bitmap,drawpixels}.
This is needed for nir_gather_info to actually count the new textures,
since it operates solely on variables.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-14 23:35:35 -07:00
Jason Ekstrand
b7397b09d5 spirv: Update SPIR-V json and headers to Khronos master
This corresponds to commit 801cca8104245c07e8cc532 on GitHub.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-13 09:56:18 -05:00
Samuel Pitoiset
13fd4e601c vulkan: Update the XML and headers to 1.1.88
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-13 09:56:18 -05:00
Vinson Lee
cc33621e3b r600/sb: Fix constant-logical-operand warning.
sb/sb_bc_parser.cpp:620:27: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
        if (cf->bc.op_ptr->flags && FF_GDS)
                                 ^  ~~~~~~
sb/sb_bc_parser.cpp:620:27: note: use '&' for a bitwise operation
        if (cf->bc.op_ptr->flags && FF_GDS)
                                 ^~
                                 &
sb/sb_bc_parser.cpp:620:27: note: remove constant to silence this warning
        if (cf->bc.op_ptr->flags && FF_GDS)
                                ~^~~~~~~~~
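
The difference the warning points at can be shown with a small C sketch. The
bit value below is hypothetical, since the real FF_GDS definition does not
appear in this log; the two helper names are also made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value standing in for FF_GDS. */
#define EXAMPLE_FF_GDS (1u << 3)

/* What "flags && FF_GDS" reduces to: the constant is always true, so the
 * result only says whether any flag at all is set. */
static bool any_flag_set(uint32_t flags)
{
    return flags != 0;
}

/* What "flags & FF_GDS" tests: whether this particular bit is set. */
static bool has_gds_flag(uint32_t flags)
{
    return (flags & EXAMPLE_FF_GDS) != 0;
}
```

With the logical operator, any op that has some unrelated flag set would be
treated as a GDS op.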

Fixes: da977ad907 ("r600/sb: start adding GDS support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-12 10:58:58 -07:00
Rafael Antognolli
ca168ec008 i965/miptree: Use enum instead of boolean.
ISL_AUX_USAGE_NONE happens to be the same as "false", but let's do the
right thing and use the enum.

v2: fix intel_miptree_finish_depth too (Caio)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-12 10:14:20 -07:00
Samuel Pitoiset
2c139e2cdf radv: do not support blitting surfaces for R32G32B32 formats
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-12 15:28:21 +02:00
Jose Fonseca
7c5aececda scons: Allow building with custom MSVC_USE_SCRIPT script.
SCons MSVC support relies on vcvarsall.bat to extract the PATH, CPP
includes, library paths, etc.

SCons also has a build env var named MSVC_USE_SCRIPT, which one can
use to point to an alternative vcvarsall.bat script.

This change exposes this MSVC_USE_SCRIPT build env variable as a SCons
command line variable.  This will enable using MSVC outside Program
Files (e.g., network shares).

This change also links advapi32 library, necessary for the Windows
Registry API used by WGL state tracker, avoiding missing symbols.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-12 07:45:53 +01:00
Samuel Pitoiset
416013b4f5 radv: emit the GLC bit for SSBO loads/stores when needed
This fixes some new memory model tests:
dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108112
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-12 08:42:08 +02:00
Samuel Pitoiset
4b74f05f6b spirv/nir: handle memory access qualifiers for SSBO loads/stores
v2: - change how the access qualifiers are accumulated
v3: - duplicate members in struct_member_decoration_cb()
    - handle access qualifiers on variables
    - remove access qualifiers handling in _vtn_variable_load_store()
    - fix setting access qualifiers on type->array_element

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-12 08:42:08 +02:00
Tapani Pälli
26a10e3844 anv/android: we need git_sha1.h in include paths
Fixes: e4538b9 "anv: Implement VK_KHR_driver_properties"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-12 07:29:03 +03:00
Nanley Chery
0ee0e0b6b9 anv: Clear WM_HZ_OP overrides in init_device_state
This is basically a port of commit 3ade766684
("i965: Disable 3DSTATE_WM_HZ_OP fields.")

The BDW+ docs describe how to use the 3DSTATE_WM_HZ_OP instruction in
the section titled, "Optimized Depth Buffer Clear and/or Stencil Buffer
Clear." It mentions that the packet overrides GPU state for the clear
operation and needs to be reset to 0s to clear the overrides. Depending
on the kernel, we may not get a context with the GPU state for this
packet zeroed. Do it ourselves just in case.

Prevents a number of GPU hangs when running crucible on ICL. I tried to
get the exact number of hangs that occurs without this patch, but was
unsuccessful. The test machine became unresponsive before completing the
full run.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-11 16:31:08 -07:00
Jordan Justen
494d2ec277 i965/gen10+: Initialize new fields in STATE_BASE_ADDRESS
Ref: 263b584d5e "i965/skl: Emit extra zeros in STATE_BASE_ADDRESS on Skylake."
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-10-11 15:16:04 -07:00
Jordan Justen
d18a0d955e anv/gen9+: Initialize new fields in STATE_BASE_ADDRESS
Ref: 263b584d5e "i965/skl: Emit extra zeros in STATE_BASE_ADDRESS on Skylake."
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-10-11 15:16:00 -07:00
Jason Ekstrand
d7e0d47b9d nir: Add a bunch of b2[if] optimizations
The b2f and b2i conversions always produce zero or one which are both
representable in every type and size.  Since b2i and b2f support all bit
sizes, we can just get rid of the conversion opcode.

total instructions in shared programs: 15089335 -> 15084368 (-0.03%)
instructions in affected programs: 212564 -> 207597 (-2.34%)
helped: 896
HURT: 0

total cycles in shared programs: 369831123 -> 369826267 (<.01%)
cycles in affected programs: 2008647 -> 2003791 (-0.24%)
helped: 693
HURT: 216

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-11 15:21:19 -05:00
Jason Ekstrand
0e0dc596a2 intel/vec4: Fix nir_op_b2[fi] with 64-bit result
This is valid NIR but you can't actually hit this case today.  GLSL IR
doesn't have a bool to double opcode; it does f2d(b2f(x)).  In SPIR-V we
don't have any to/from bool conversion opcodes at all.  However, the
next commit will make us start generating it so we should be ready.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-11 15:21:19 -05:00
Jason Ekstrand
497675c21e intel/fs: Fix nir_op_b2[fi] with 64-bit result on Gen8 LP and Gen9 LP
Several of the Atom GPUs have additional restrictions on alignment when
moving < 64-bit source to a 64-bit destination.  All of the nir_op_*2*64
code generation paths respected this, but nir_op_b2[fi] did not.

Previous to commit a68dd47b91 it was not possible to generate such an
instruction from the GLSL path.  It may have been possible from SPIR-V,
but it's not clear.  The aforementioned patch converts a 64-bit
nir_op_fsign into a sequence of operations including a nir_op_b2f with a
64-bit result.  This "just works" everywhere except these Atom parts.

This problem was not detected during normal CI testing because the Atom
parts are not included in developer builds.

v2 (idr): Make the patch compile, and make some cosmetic changes.  Add a
commit message.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108319
Fixes: a68dd47b91 "nir/algebraic: Simplify fsat of fsign"
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-11 15:21:19 -05:00
Vinson Lee
4ece6aa552 egl: Use correct shared libraries suffix on macOS.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-11 11:30:00 -07:00
Illia Iorin
b18f8e63ef mesa: Fix pack_uint_Z_FLOAT32()
Fixed pack_uint_Z_FLOAT32 by casting row data to float instead of uint.
Removed the duplicate function pack_uint_Z_FLOAT32_X24S8.
Edited the case in "_mesa_get_pack_uint_z_func".
Now it looks like "_mesa_get_pack_float_z_func".
Removed the _mesa_problem call, which was added for debugging this issue.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91433
Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-10-11 10:15:09 -07:00
Rodrigo Vivi
24db1c7fcc intel: Introducing Whiskey Lake platform
Whiskey Lake uses the same gen graphics as Coffee Lake, including some
ids that were previously marked as reserved on Coffee Lake, but that
have now moved to the WHL page.

This follows the ids and approach used on kernel's commit
b9be78531d27 ("drm/i915/whl: Introducing Whiskey Lake platform")
and commit c1c8f6fa731b ("drm/i915: Redefine some Whiskey Lake SKUs")

v2: Lionel noticed that GT{1,2,3} on kernel wasn't following
spec when looking to number of EUs, so kernel has been updated.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-11 10:02:40 -07:00
Boyuan Zhang
d76c277421 st/va: use provided sizes and coords for vlVaGetImage
vlVaGetImage should respect the width, height, and coordinates x and y that
are passed in. Therefore, pipe_box should be created with the passed-in values
instead of surface width/height.
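
A minimal sketch of the kind of bounds check and box construction described
above, written in Python rather than the driver's C. The name `make_box` and
the dictionary layout are hypothetical and only illustrate the idea; this is
not the st/va code.

```python
def make_box(surface_w, surface_h, x, y, width, height):
    """Return a box for the requested region, or None if it is out of bounds."""
    if x < 0 or y < 0 or width <= 0 or height <= 0:
        return None
    if x + width > surface_w or y + height > surface_h:
        return None
    # Use the caller's region, not the full surface size.
    return {"x": x, "y": y, "width": width, "height": height}
```

A caller would treat a `None` result as the "size out of bounds" error case.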

v2: add input size check, return error when size out of bounds
v3: fix the size check for vaimage
v4: add size adjustment for x and y coordinates

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Christian König <christian.koenig@amd.com>
2018-10-11 09:00:18 -04:00
Samuel Pitoiset
229803b66a radv: implement clear operations for R32G32B32
This fixes crashes for some CTS:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.*.linear_*_*
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.*.*_linear_*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-11 14:49:16 +02:00
Samuel Pitoiset
c3ba3c2611 radv: disallow 3D images and mipmaps/layers for R32G32B32 linear formats
R32G32B32 are weird formats and we are only going to support
some basic operations for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-11 14:49:14 +02:00
Samuel Pitoiset
d179312b53 radv: add a workaround for a VGT hang with prim restart and strips
Otherwise, Yakuza and The Evil Within hang the GPU with DXVK.
This apparently only works on Polaris.

Suggested by Marek.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-11 10:16:11 +02:00
Timothy Arceri
3bc012a34e glsl: remove redundant es_shader checks
The es check is already covered by the is_version() check.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-11 14:45:43 +11:00
Dave Airlie
cc2fe57922 st/glsl_to_tgsi: initialise need_uarl in constructor
Found by coverity

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-11 10:20:37 +10:00
Dave Airlie
c5c3da6c90 glspirv: drop pointless assert (size_t is unsigned)
Found by coverity

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-10-11 10:19:48 +10:00
Dave Airlie
600d8ecb57 radv: remove unsigned comparison against 0
The value is always >= 0 here.

Found by coverity

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-11 10:19:20 +10:00
Dave Airlie
6e1d294804 radv: remove dead code for master_fd close
We have never opened master_fd at this point, so remove the code to
close it.

Found by coverity.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-11 10:19:16 +10:00
Dave Airlie
7c04b96f03 radv: don't pass shader key by copy
Coverity pointed out we were copying 168 bytes here unnecessarily.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-11 10:18:43 +10:00
Dave Airlie
29a7631986 anv: add missing unlock in error path.
Not going to matter, but be consistent.

Found by coverity

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: caf41c78c (anv/allocator: Support softpin in the BO cache)
2018-10-11 09:50:27 +10:00
Jason Ekstrand
4ba445e011 intel: Don't propagate conditional modifiers if a UD source is negated
This fixes a bug uncovered by my NIR integer division by constant
optimization series.

Fixes: 19f9cb72c8 "i965/fs: Add pass to propagate conditional..."
Fixes: 627f94b72e "i965/vec4: adding vec4_cmod_propagation..."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-10 13:13:12 -05:00
Jason Ekstrand
328d4d080b util: Add tests for fast integer division by constants
While I generally trust ridiculousfish to have done his homework, we've
made some adjustments to suit the needs of mesa and it'd be good to
test those.  Also, there's no better place than unit tests to clearly
document the different edge cases of the different methods.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-10 13:13:12 -05:00
Marek Olšák
a9be8dddfe util: Add power-of-two divisor support to compute_fast_udiv_info
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-10 13:13:12 -05:00
Jason Ekstrand
7cde4dbcd7 util: Generalize fast integer division to be variable bit-width
There's nothing inherently fixed-width in the code.  All that's required
to generalize it is to make everything internally 64-bit and pass
UINT_BITS in as a parameter to util_compute_fast_[us]div_info.  With
that, it can now handle 8, 16, 32, and 64-bit integer division by a
constant.

We also add support for division by 1 and by other powers of 2.  This is
useful if you want to divide by a uniform value in a shader where you
have the opportunity to adjust the uniform on the CPU before passing it
in.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-10 13:13:12 -05:00
Marek Olšák
64eb0738d4 util: Add fast division helpers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-10 13:13:12 -05:00
Marek Olšák
2940c257a6 util: import public domain code for integer division by a constant
Compilers can use this to generate optimal code for integer division
by a constant.

Additionally, an unsigned division by a uniform that is constant but not
known at compile time can still be optimized by passing 2-4 division
factors to the shader as uniforms and executing one of the fast_udiv*
variants. The signed division algorithm doesn't have this capability.
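
As a rough illustration of division by a constant via multiply and shift, the
sketch below uses the textbook round-up method in Python. It is not the
imported code and not Mesa's fast_udiv* helpers; `udiv_magic` and `fast_udiv`
are hypothetical names. The multiplier computed here can need one bit more
than the operand width, which fixed-width implementations have to handle with
extra steps.

```python
def udiv_magic(d, bits=32):
    """Return (multiplier, shift) such that x // d == (x * multiplier) >> shift
    for every 0 <= x < 2**bits. Requires d >= 1."""
    if d < 1:
        raise ValueError("divisor must be positive")
    log2_ceil = (d - 1).bit_length()        # ceil(log2(d))
    shift = bits + log2_ceil
    multiplier = -(-(1 << shift) // d)      # ceil(2**shift / d)
    return multiplier, shift


def fast_udiv(x, multiplier, shift):
    """Divide using only a multiply and a right shift."""
    return (x * multiplier) >> shift
```

The pair returned by `udiv_magic` plays the role of the "division factors" that
would be computed once on the CPU and then reused for many divisions.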

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-10 13:13:12 -05:00
Jason Ekstrand
0dca6730b4 util: Add a simple big math library
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-10 13:13:12 -05:00
Dylan Baker
b8521704ed meson: Don't allow building EGL on Windows or MacOS
Currently Mesa only supports EGL on Unix-like systems, Cygwin, and
Haiku. Meson should actually enforce this. This fixes the default build
on macOS.

v2: - invert the condition, mark darwin and windows as not supported
      instead of trying to mark what is supported.
v3: - add missing )
v3: - Update comment to reflect condition change in v2

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-10 11:02:36 -07:00
Timothy Arceri
0346ad3774 glsl: ignore trailing whitespace when define redefined
The Nvidia/AMD binary drivers allow this, as does GCC.

This fixes shader compilation issues in the latest update of
No Mans Sky.
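
A small sketch of the comparison this implies, assuming a redefinition is
accepted when the two replacement texts differ only in trailing whitespace.
`same_definition` is a hypothetical helper, not the glcpp implementation.

```python
def same_definition(old_body, new_body):
    """Compare macro replacement text, ignoring trailing whitespace only."""
    return old_body.rstrip() == new_body.rstrip()
```

With this rule, redefining a macro as `1.0` followed by stray spaces or tabs is
treated as the same definition, while a different value is still a conflict.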

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-10 15:08:32 +11:00
Ian Romanick
b44c9292b7 intel/compiler: Don't handle fsign.sat
No shader-db or CI changes on any Intel platform.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-10-09 13:56:42 -07:00
Ian Romanick
a68dd47b91 nir/algebraic: Simplify fsat of fsign
This allows us to not support fsign.sat in the Intel compiler backend,
and that will simplify some later changes.

No shader-db changes on any Intel platform.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-10-09 13:56:42 -07:00
Ian Romanick
1546204cdd nir/algebraic: sign(x)*x*x is abs(x)*x
shader-db results:

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 15106023 -> 15105981 (<.01%)
instructions in affected programs: 300 -> 258 (-14.00%)
helped: 6
HURT: 0
helped stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7
helped stats (rel) min: 14.00% max: 14.00% x̄: 14.00% x̃: 14.00%
95% mean confidence interval for instructions value: -7.00 -7.00
95% mean confidence interval for instructions %-change: -14.00% -14.00%
Instructions are helped.

total cycles in shared programs: 566050327 -> 566050075 (<.01%)
cycles in affected programs: 2826 -> 2574 (-8.92%)
helped: 6
HURT: 0
helped stats (abs) min: 40 max: 44 x̄: 42.00 x̃: 42
helped stats (rel) min: 8.89% max: 8.94% x̄: 8.92% x̃: 8.92%
95% mean confidence interval for cycles value: -44.30 -39.70
95% mean confidence interval for cycles %-change: -8.95% -8.88%
Cycles are helped.

No changes on Gen6 or earlier.
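
The identity in the subject line can be checked numerically. A quick Python
sketch, assuming finite inputs (NaN is deliberately left out); `fsign` here is
a local stand-in, not the NIR opcode.

```python
def fsign(x):
    """Return -1.0, 0.0 or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def before(x):
    """The original expression: sign(x) * x * x."""
    return fsign(x) * x * x


def after(x):
    """The simplified expression: abs(x) * x."""
    return abs(x) * x
```

Since sign(x) * x equals abs(x) for finite x, both forms multiply the same two
values and should agree exactly.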

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-10-09 13:56:42 -07:00
Ian Romanick
10f4a8871e nir: Add helper functions to get the instruction that generated a nir_src
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-10-09 13:56:42 -07:00
Brian Paul
797e34f658 svga: change svga_destroy_shader_variant() to return void
svga_destroy_shader_variant() itself flushes and retries the command
if there's a failure.  So no need for the callers to do it.  Other
callers of the function were already ignoring the return value.

This also fixes a corner-case double-free reported by Coverity
(and reported by Dave Airlie).

Tested with various OpenGL apps.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-10-09 11:17:14 -06:00
Dylan Baker
b781688636 meson: Don't build glsl compiler tests unless OpenGL is enabled
Since there are no other users of the glsl compiler.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-09 08:56:00 -07:00
Dylan Baker
d84f003b95 meson: Only build gallium state tracker tests with shared_glapi
This has always been a requirement, it's just somehow been missed in the
meson build.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-09 08:55:56 -07:00
Dylan Baker
0fa6a8271a meson: only build clapi tests when OpenGL is being built
Otherwise building just vulkan (among other things) will build these
tests, pull in a bunch of stuff they shouldn't, and potentially fail to
compile.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-09 08:55:48 -07:00
Ilia Mirkin
92f56fbd89 nvc0: fix blitting red to srgb8_alpha
For some reason the 2d engine can't handle this. Red formats get special
treatment there, so perhaps related.

Fixes dEQP-GLES3 tests of the form:

  dEQP-GLES3.functional.fbo.blit.conversion.r{8,16f,32f}_to_srgb8_alpha8

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
2018-10-09 10:33:11 -04:00
Ilia Mirkin
9bf0614116 nv50,nvc0: guard against zero-size blits
The current state tracker can generate these sometimes. Fixing this is
more involved, and due to some integer math we can generate
divisions-by-zero.
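
The guard amounts to an early return before any scale factors are computed. A
minimal Python sketch under that assumption; `blit_scale` is a hypothetical
function and does not mirror the nv50/nvc0 code.

```python
def blit_scale(src_w, src_h, dst_w, dst_h):
    """Return (x_scale, y_scale), or None for a zero-size blit."""
    if src_w == 0 or src_h == 0 or dst_w == 0 or dst_h == 0:
        return None  # nothing to copy; also avoids dividing by zero below
    return src_w / dst_w, src_h / dst_h
```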

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
2018-10-09 10:33:11 -04:00
Ilia Mirkin
78d3640e49 nv50,nvc0: mark RGBX_UINT formats as renderable
This helps st/mesa avoid some (apparently) buggy fallbacks. Specifically
the CopyTexSubImage fallback tries to read texture A as RGBA_FLOAT and
write back that data into the target format, which fails for integer
formats which have no appropriate logic to do the conversion.

Since integer formats don't blend, there's no harm in the fact that the
"A" component gets written anyways.

Fixes, among others:
  https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/canvas/tex-2d-rgb8ui-rgb_integer-unsigned_byte.html

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2018-10-09 10:33:11 -04:00
Eric Engestrom
976188737d radv: add missing meson c++ visibility arguments
Fixes: 6f3aee40f9 "radv: using tls to store llvm related info
                             and speed up compiles (v10)"
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-09 14:22:24 +01:00
Michel Dänzer
9d3fefdc41 gbm: Add GBM_FORMAT_ARGB1555 support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-09 10:32:51 +02:00
Michel Dänzer
e7e033ed8a st/dri: Handle BGRA5551 format
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-09 10:32:50 +02:00
Rob Clark
fa52ff856d freedreno/a5xx+a6xx: fix LRZ pitch alignment
Both RB_2D_DST_SIZE.PITCH (a6xx) and RB_MRT[n].PITCH (a5xx) need
alignment to 64.
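
For illustration, rounding a value up to a power-of-two alignment such as 64
is usually done with a mask. The `align` helper below is a generic Python
sketch, not the freedreno code.

```python
def align(value, alignment):
    """Round value up to the next multiple of a power-of-two alignment."""
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (value + alignment - 1) & ~(alignment - 1)
```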

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-08 19:05:14 -04:00
Rob Clark
82c3b6fe49 freedreno/a6xx: add LRZ support
As with a5xx, hidden behind FD_MESA_DEBUG=lrz due to being paranoid
about z-fighting issues with some games (in particular, this was
observed with 0ad on a5xx.. but I think the proper solution to enable
this by default is to figure out how to do driver specific driconf
options).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-08 19:05:14 -04:00
Rob Clark
a877451a41 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-08 18:03:35 -04:00
Rob Clark
bf79a7cc25 freedreno/a6xx: add helper for various CP_EVENT_WRITE
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-08 17:50:26 -04:00
Rob Clark
60af89815e freedreno/a6xx: remove unused fxns
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-08 17:50:26 -04:00
Rob Clark
d5bd3ce89c freedreno/a6xx: remove fd6_shader_stateobj
Earlier gens already got this cleanup, but a6xx was still off on a
branch then.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-08 17:50:26 -04:00
Ilia Mirkin
1bb1c03d61 glsl: fix array assignments of a swizzled vector
This happens in situations where we might do

  vec.wzyx[i] = ...

The swizzle would get effectively ignored because of the interaction
between how ir_assignment->set_lhs works and overwriting the write_mask.
There are two cases, one where i is a constant, and another where i is
variable. We have to be extra-careful in both cases.
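
To make the expected semantics concrete, here is a small Python model of
writing through a swizzle with an index. It only models the language-level
behaviour; `write_swizzled` is a hypothetical helper and says nothing about how
ir_assignment implements it.

```python
COMPONENTS = "xyzw"


def write_swizzled(vec, swizzle, i, value):
    """Model `vec.<swizzle>[i] = value` on a 4-element list."""
    target = COMPONENTS.index(swizzle[i])  # component the swizzle selects
    result = list(vec)
    result[target] = value
    return result
```

For `vec.wzyx[0] = 9` the write must land in the w component; ignoring the
swizzle would wrongly write x instead.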

Fixes the following WebGL test:

  https://www.khronos.org/registry/webgl/sdk/tests/conformance2/glsl3/vector-dynamic-indexing-swizzled-lvalue.html

And the new piglit tests:

  swizzled-writemask-indexing-nonconst.shader_test
  swizzled-writemask-indexing.shader_test

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2018-10-08 14:29:14 -04:00
Samuel Pitoiset
d3682766f6 radv: tidy up radv_pipeline_init_multisample_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-08 14:17:43 +02:00
Samuel Pitoiset
b38228ccb0 radv: always set PA_SC_MODE_CNTL_1.OUT_OF_ORDER_WATER_MARK
It probably has no effect without out of order rasterization
anyway.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-08 14:17:40 +02:00
Samuel Pitoiset
937986ca1d radv: set DB_EQAA.INCOHERENT_EQAA_READS
My attempt was to set this field instead of duplicating one.

Fixes: 6cfa321c39 ("radv: add potential missing fields for DB_EQAA")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-08 14:17:33 +02:00
Chystiakov, Dmytro
47e3338b04 i965: fallback RGBX to RGBA in glEGLImageTargetRenderbufferStorageOES
In the same fashion as is done for glEGLImageTextureTarget2D.

v2: share the fallback which sets baseformat and internalformat correctly
    which makes both of the tests pass (Tapani)

Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests:

   #SingleLayer_ColorTest_GpuColorOutputCpuRead_R8G8B8X8_UNORM
   #SingleLayer_ColorTest_GpuColorOutputIsRenderable_R8G8B8X8_UNORM

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-10-08 08:03:45 +03:00
Tapani Pälli
d1fa69ed61 glsl: do not attempt assignment if operand type not parsed correctly
v2: check types of both operands (Ian)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108012
2018-10-08 08:02:50 +03:00
Marek Olšák
d877451b48 util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY
Initial version discussed with Rob Clark under a different patch name.
This approach leaves his driver unaffected.
2018-10-06 22:05:58 -04:00
Marek Olšák
066aa44fc5 radeonsi: fix a typo at CS_PARTIAL_FLUSH
harmless
2018-10-06 21:50:52 -04:00
Marek Olšák
77903c8cfb ac: add ac_build_round
2018-10-06 21:50:09 -04:00
Marek Olšák
fa023f293e ac: correct PKT3_COPY_DATA definitions
2018-10-06 21:50:09 -04:00
Marek Olšák
82f5f89bf6 ac: simplify LLVM alloca helpers
2018-10-06 21:50:09 -04:00
Marek Olšák
a668c8d6ba ac: define all address spaces properly
2018-10-06 21:50:09 -04:00
Gert Wollny
8f77156c26 gallivm: Make it possible to disable some optimization shortcuts in release builds
For testing it is of interest that all tests of dEQP pass, e.g. to test
virglrenderer on a host only providing software rendering like in a CI.
Hence make it possible to disable certain optimizations that make tests fail.

While we are there also add some documentation to the flags to make it clear
that this is opt-out.

Setting the environment variable "GALLIVM_PERF=no_filter_hacks" can be used to make
the following tests pass in release mode:

  dEQP-GLES2.functional.texture.mipmap.2d.affine.*_linear_*
  dEQP-GLES2.functional.texture.mipmap.cube.generate.*
  dEQP-GLES2.functional.texture.vertex.2d.filtering.*_mipmap_linear_*
  dEQP-GLES2.functional.texture.vertex.2d.wrap.*

Related:
  https://bugs.freedesktop.org/show_bug.cgi?id=94957

v2: rename optimization disabling flag to 'safemath' and also move the
    nopt flag to the perf flags.

v3: rename flag "safemath" to "no_filter_hacks" since safemath is usually
    associated with floating point operations (Roland)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-06 13:12:48 +02:00
Tomeu Vizoso
9d81cd8e7c virgl: Pass resource size and transfer offsets
Pass the size of a resource when creating it so a backing can be kept in
the other side.

Also pass the required offset to transfer commands.

This moves vtest closer to how virtio-gpu works, making it more useful
for testing.

v2: - Use new messages for creation and transfers, as changing the
      behavior of the existing messages would be messy given that we don't
      want to break compatibility with older servers.

v3: - Use correct strides: The resource corresponding to the output display
      might have a different line stride than the IOVs, so when reading back
      to this resource take the resource stride and the IOV stride
      into account.

v4: Fix transfer size calculation (Andrey Simiklit)

v5: Add comment about transfer size value in the PUT command (Gurchetan).
    Add a comment about the size correction for transfers for reading and
    writing the resource. Fixing this by correctly evaluating the size
    upfront will need some work also on the virglrenderer side.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> (v2)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-10-06 13:12:44 +02:00
Gert Wollny
5d7858f151 virgl, vtest: Correct the transfer size calculation
The transfer size used in virglrenderer refers to uint32_t, so one
must add 3 and then divide by 4 instead of adding 3/4 which is a no-op
with integers.
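
The difference is purely one of operator precedence and integer division. A
short Python sketch of both forms, using `//` for integer division; the
function names are made up for this example.

```python
def dwords_wrong(size_bytes):
    # 3 // 4 evaluates to 0 first, so nothing is added and nothing is divided.
    return size_bytes + 3 // 4


def dwords_right(size_bytes):
    # Add 3, then divide by 4: rounds up to whole 32-bit words.
    return (size_bytes + 3) // 4
```

For a 10-byte transfer the first form returns 10, while the second returns 3,
the number of uint32_t units needed to hold 10 bytes.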

Fixes: b3b82fe8ea virgl/vtest: add vtest driver

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-10-06 13:12:44 +02:00
Alan Coopersmith
066850edad util: Make xmlconfig.c build on Solaris without d_type in dirent (v2)
v2: check for lstat() failing

Fixes: 04bdbbcab3 "xmlconfig: read more config files from drirc.d/"
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-05 17:30:45 -07:00
Sonny Jiang
084cf3b966 radeonsi:optimizing SET_CONTEXT_REG for shaders vgt_vertex_reuse
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang
ce1d72609d radeonsi:optimizing SET_CONTEXT_REG for shaders Tessellation
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang
4de328da07 radeonsi:optimizing SET_CONTEXT_REG for shaders PS
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang
f243980f2c radeonsi:optimizing SET_CONTEXT_REG for shaders VS
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang
4052624398 radeonsi:optimizing SET_CONTEXT_REG for shaders GS
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Marek Olšák
86f004bdfc radeonsi: optimize and allow reg > 31 in radeon_opt_set_context_reg functions
reg_saved will have 64 bits, and (1 << reg) where reg > 31 has undefined
behavior. (1ull << reg) would be correct for 64 bits.

This commit shifts the other way in order to merge the conditions.
2018-10-05 19:04:13 -04:00
Sonny Jiang
eeb9170599 radeonsi: optimizing SET_CONTEXT_REG for shaders ES
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 17:53:52 -04:00
Samuel Pitoiset
a1bc152340 spirv: mark variables decorated with XfbBuffer as always active
Otherwise, they are removed during NIR linking or in some
lowering passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-05 18:13:25 +02:00
Juan A. Suarez Romero
5bd03d02c1 docs: update calendar, add news and link release notes to 18.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-10-05 12:51:34 +02:00
Juan A. Suarez Romero
c565eeee0b docs: add sha256 checksums for 18.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cb63a4e114)
2018-10-05 12:46:33 +02:00
Juan A. Suarez Romero
3537465059 docs: add release notes for 18.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit abaeb79eb2)
2018-10-05 12:46:31 +02:00
Jason Ekstrand
dd553bc67f nir/alu_to_scalar: Use ssa_for_alu_src in hand-rolled expansions
The ssa_for_alu_src helper will correctly handle swizzles and other
source modifiers for you.  The expansions for unpack_half_2x16,
pack_uvec2_to_uint, and pack_uvec4_to_uint were all broken with regards
to swizzles.  The brokenness of unpack_half_2x16 was causing rendering
errors in Rise of the Tomb Raider on Intel ever since c11833ab24
which added an extra copy propagation to the optimization pipeline and
caused us to start seeing swizzles where we hadn't seen any before.
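
For reference, the operation being lowered can be modelled in a few lines.
This Python sketch follows the GLSL definition of unpackHalf2x16 (first
component from the low 16 bits); it does not model the NIR lowering or its
swizzle handling.

```python
import struct


def unpack_half_2x16(packed):
    """Split a 32-bit value into two half floats and widen them to float."""
    low = packed & 0xFFFF
    high = (packed >> 16) & 0xFFFF

    def to_float(bits):
        # Reinterpret 16 raw bits as an IEEE 754 half-precision value.
        return struct.unpack("<e", struct.pack("<H", bits))[0]

    return to_float(low), to_float(high)
```

A lowering that ignores a swizzle on the source would read the packed value
from the wrong component before this unpacking step ever runs.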

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926
Fixes: 9ce901058f "nir: Add lowering of nir_op_unpack_half_2x16."
Fixes: 9b8786eba9 "nir: Add lowering support for packing opcodes."
Tested-by: Alex Smith <asmith@feralinteractive.com>
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-10-04 12:43:59 -05:00
Vadym Shovkoplias
5f0567a4f6 glsl/linker: Check the subroutine associated functions names
From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification

    "A program will fail to compile or link if any shader
     or stage contains two or more functions with the same
     name if the name is associated with a subroutine type."

v2:
  - error out earlier (Tapani)
  - style fixes (Iago)

Fixes:
    * no-overloads.vert

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-04 17:41:19 +02:00
Tomeu Vizoso
ed53a79cf8 virgl: Negotiate version with vtest server
Check if server supports version negotiation by sending a PING_PROTOCOL_VERSION
message right before a dummy RESOURCE_BUSY_WAIT. If we don't get a reply
for the first, we know the server doesn't support it.

If it does support it, we can query the max protocol version supported
by the server and fall back if needed.
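
The decision logic can be sketched as follows, in Python and with made-up
names. It assumes the client inspects the replies that arrive after sending
the ping followed by the busy wait, and that "falling back" means using the
lower of the two maximum versions; both are assumptions for illustration, not
the vtest wire format.

```python
def supports_negotiation(replies):
    """replies: reply tags received after sending the ping, then the busy wait."""
    return len(replies) > 0 and replies[0] == "ping"


def pick_version(client_max, server_max):
    """Use the highest protocol version both sides understand."""
    return min(client_max, server_max)
```

An older server would answer only the busy wait, so the first reply is not a
ping reply and the client stays on the original protocol.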

v2: - Send a new message to negotiate the protocol version, checking if
      the server supports this message by immediately sending a busy wait
      message. (Dave Airlie)

v3: - Send a zero-arg command PING_PROTOCOL_VERSION so we actually keep
      compatibility with older servers. (Code by Dave Airlie)

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-10-04 16:18:36 +02:00
Sagar Ghuge
0c70e11206 intel: aubinator: Fix memory leaks
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-04 10:01:56 +01:00
Sagar Ghuge
29a2eaf3db intel/decoder: construct correct xml filename
Construct the correct gen xml filename when we try to load the hardware xml
description from a given path.

v2: remove temporary variable (Francesco Ansanelli)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-04 10:01:56 +01:00
Sagar Ghuge
f9c8468c82 intel/decoder: Avoid freeing invalid pointer
v2: Free ctx.spec if error while reading genxml (Lionel Landwerlin)

v3: Handle case where genxml is empty (Lionel Landwerlin)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-04 10:01:56 +01:00
Sagar Ghuge
ba3304e764 intel/decoder: add gen_spec_init method
Initialize gen_spec instance properly when loading hardware xml
description from a specific directory to avoid a segmentation fault.

v2: correct function definition (Lionel Landwerlin)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-04 10:01:56 +01:00
Samuel Pitoiset
2b34985d93 radv: fix resetting the pool for timestamp queries
Since the driver no longer uses the availability bit for
timestamp queries it shouldn't reset it. Instead, it should
reset the query values to UINT32_MAX. This fixes VM faults.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108164
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-04 10:56:25 +02:00
Guido Günther
b2a876a42b etnaviv: Use write combine instead of uncached mappings for shader bo
The latter are sensitive to unaligned accesses on arm64[1] and we don't
need an uncached mapping here.

[1]: https://lists.freedesktop.org/archives/etnaviv/2018-September/001956.html

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-10-04 10:33:25 +02:00
Marek Olšák
8e0b4cb8a1 drirc: add a workaround for ARMA 3
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
2018-10-04 01:01:54 -04:00
Jason Ekstrand
f5bab06428 anv/batch_chain: Don't start a new BO just for BATCH_BUFFER_START
Previously, we just went ahead and emitted MI_BATCH_BUFFER_START as
normal.  If we are near enough to the end, this can cause us to start a
new BO just for the MI_BATCH_BUFFER_START which messes up chaining.  We
always reserve enough space at the end for an MI_BATCH_BUFFER_START so
we can just increment cmd_buffer->batch.end prior to emitting the
command.

Fixes: a0b133286a "anv/batch_chain: Simplify secondary batch return..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-03 09:03:12 -05:00
Jason Ekstrand
7a89a0d9ed anv: Use separate MOCS settings for external BOs
On Broadwell and above, we have to use different MOCS settings to allow
the kernel to take over and disable caching when needed for external
buffers.  On Broadwell, this is especially important because the kernel
can't disable eLLC so we have to do it in userspace.  We very badly
don't want to do that on everything so we need separate MOCS for
external and internal BOs.

In order to do this, we add an anv-specific BO flag for "external" and
use that to distinguish between buffers which may be shared with other
processes and/or display and those which are entirely internal.  That,
together with an anv_mocs_for_bo helper lets us choose the right MOCS
settings for each BO use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-03 09:03:03 -05:00
Emil Velikov
08bff097e1 meson: remove invalid "opencl" llvm component
Seemingly a copy/paste mistake from configure.ac, which uses $2 for the
component and $3 for the fancy name printing.

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
fe8be81b4a Revert "mesa: remove unnecessary 'sort by year' for the GL extensions"
This reverts commit 3d81e11b49.

As reported by Federico, some games require the 'sort by year' since
they truncate the extensions which do not fit the fixed size string
array.
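
Why the ordering matters can be shown with a toy model in Python. The
extension names and years below are placeholders invented for the example, and
both functions are hypothetical; they do not reproduce Mesa's extension table
or any particular game's code.

```python
def extension_string(extensions):
    """extensions: list of (name, year). Oldest first, then alphabetical."""
    ordered = sorted(extensions, key=lambda e: (e[1], e[0]))
    return " ".join(name for name, _ in ordered)


def truncated_copy(ext_string, buffer_size):
    """Model an application copying the string into a fixed-size buffer."""
    return ext_string[:buffer_size]
```

With the oldest extensions first, a truncating application loses only the
newest entries, which it predates and is unlikely to depend on.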

Seemingly I did not consider that, as the documentation (both Mesa and
Nvidia) mentions program crashes ... which are worked around by
setting the env. variable.

This commit reinstates the workaround and enhances the documentation.

Cc: Marek Olšák <maraeo@gmail.com>
Cc: Ian Romanick <idr@freedesktop.org>
Reported-by: Federico Dossena <info@fdossena.com>
Fixes: 3d81e11b49 ("mesa: remove unnecessary 'sort by year' for the GL
extensions")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Federico Dossena <info@fdossena.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
91ff8b1dd9 mesa: reorder and document the tokens in glheader.h
Split into different sections, document each one as well as strange
cases like GL_ATI_texture_compression_3dc.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
5f70964b1d mesa: remove duplicate declarations from glheader.h
Remove all the desktop GL and GLX entries from the list.
The former are pulled in by the gl.h and glext.h includes at the top while the
latter are no longer needed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
01b92916af i965: reference __DRI_ATTRIB_SWAP_COPY token over the GLX one
Earlier commit updated the code to use the DRI tokens, yet forgot to
update the comment.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
e04b2c0376 i915: reference __DRI_ATTRIB_SWAP_COPY token over the GLX one
Earlier commit updated the code to use the DRI tokens, yet forgot to
update the comment.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
d26b122ee8 dri/common: move the required GLX_* token definitions locally
Will allow us to remove an even bigger hack elsewhere. But more
importantly, we should not be using _any_ GLX tokens in DRI.

Document the gory details about the current side-effects.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
4ef53669af dri/common: use __DRI_ATTRIB_SWAP* instances when describing db_modes
Somewhat recently Thomas Hellstrom added the respective DRI tokens
and updated the drivers. Update the documentation to match reality.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
d6a6760139 egl/x11: remove eglSwap* surface check
Already handled further up in eglapi.c.

To make things a tiny bit strange, X11+DRI3 was doing the wrong thing by
returning EGL_FALSE (+ no error), while X11+DRI2 was returning EGL_TRUE.

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
8030741996 egl/surfaceless: remove eglSwap* stubs
The API validation in eglapi.c already returns if the surface type is
!window.

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Gurchetan Singh <gurchetansingh@chromium.org>
Cc: Chad Versace <chadversary@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
a370e278d3 egl/drm: remove eglSwap* surface check
Already handled further up in eglapi.c

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
91ccb59ff4 egl/android: remove eglSwap* surface check
Already handled further up in eglapi.c

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-03 13:38:06 +01:00
Emil Velikov
8f66743ca2 egl: make eglSwapBuffers* a no-op for !window surfaces
Analogous to the previous commit - the spec says the function is a
no-op when a pbuffer or pixmap surface is used.

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
64b4ccde0c egl: make eglSwapInterval a no-op for !window surfaces
As the spec says, the function is a no-op when the surface is not a
window one.

The spec implies that EGL_TRUE should be returned in that case, yet
the ARM driver seems to return EGL_FALSE + EGL_BAD_SURFACE.

The Nvidia driver returns EGL_TRUE. We follow that behaviour until a
decision is made.

https://gitlab.khronos.org/egl/API/merge_requests/17

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
c231b49c53 freedreno: add the a6xx sources to the Android build
Add the files otherwise things just won't build.
Haven't actually tested it, but it's a small step in the right
direction.

Fixes: de3b34df97 ("freedreno: Add a6xx backend")
Cc: Kristian H. Kristensen <hoegsberg@chromium.org>
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
7419b22413 pipe-loader: add a dup() in pipe_loader_sw_probe_kms
The pipe_loader_release API closes the fd given, even if the pipe-loader
should _not_ take ownership of it.

With an earlier commit we fixed pipe_loader_drm_probe_fd, and now we
cover the final piece.

Note that unlike the DRM case, here the caller _did_ forget to dup
before using it ... most likely leading to all sorts of fun.

Don't forget the close in the error path. Seems like the things are a
bit leaky/asymmetrical with the semi-recent config work. But we can shave
that yak another day ;-)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
6ccc435e7a pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd
Currently pipe_loader_drm_probe_fd takes ownership of the fd given.
To match that, pipe_loader_release closes it.

Yet we have many instances which do not want the change of ownership,
and thus duplicate the fd before passing it to the pipe-loader.

Move the dup() within pipe-loader, explicitly document that and document
all the cases through the codebase.

A trivial git grep -2 pipe_loader_release makes things as obvious as it
gets ;-)

Cc: Leo Liu <leo.liu@amd.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Axel Davy <davyaxel0@gmail.com>
Cc: Patrick Rudolph <siro@das-labor.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com> (for nine)
2018-10-03 13:38:05 +01:00
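
Illustration (not part of the commit): a minimal C sketch of the ownership
rule described above, where the probe function takes its own dup() of the fd
and release closes only that duplicate. The demo_* names are hypothetical and
do not reflect the real pipe-loader structures.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for a loader device; not Mesa's real struct. */
struct demo_loader {
    int fd; /* private duplicate, owned by the loader */
};

/* Probe duplicates the caller's fd, so the caller keeps ownership of its own. */
struct demo_loader *demo_probe_fd(int caller_fd)
{
    struct demo_loader *ldr;
    int fd = dup(caller_fd);

    if (fd < 0)
        return NULL;

    ldr = calloc(1, sizeof(*ldr));
    if (ldr == NULL) {
        close(fd); /* do not leak the duplicate on the error path */
        return NULL;
    }

    ldr->fd = fd;
    return ldr;
}

/* Release closes only the duplicate taken in demo_probe_fd(). */
void demo_release(struct demo_loader *ldr)
{
    close(ldr->fd);
    free(ldr);
}

/* Returns 1 if fd still refers to an open file description. */
int demo_fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}
```

With this shape the caller's fd stays valid after demo_release(), so callers
no longer need their own dup() before probing.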
Emil Velikov
7b8d1b313c st/nine: do not double-close the fd on teardown
As the newly introduced comment says:
 The pipe loader takes ownership of the fd

Thus, there's no need to close it again.

Cc: Patrick Rudolph <siro@das-labor.org>
Cc: Axel Davy <davyaxel0@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
fa9df82f67 mesa: fold _glapi_check_multithread() back into _mesa_make_current
With commit c6c0f94714, back in 2006 Brian removed the
_glapi_check_multithread() call from core mesa - _mesa_make_current.

It was done to remove fairly awkward #ifdef guard which caused subtle
differences in core mesa.

Since that guard is long gone, we can drop the duplication and
reintroduce the call in core.

Note that the function was missing when using EGL + classic dri HW
drivers. Yet on TLS builds it's a no-op, so we're safe.

Any non-TLS users (more or less anything !Linux, or even musl on Linux
up until semi-recently) may have experienced problems.

v2: don't remove the call from swrast - move it to core (Eric)

Cc: Eric Anholt <eric@anholt.net>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
d081ad2aa2 vl/dri3: do full teardown on screen_destroy
Earlier commit added support for 'front_buffers', erroneously adding a
return in vl_dri3_screen_destroy. Effectively leaking a lot of state.

Fixes: 8d7ac0a4e4 ("vl/dri3: implement DRI3 BufferFromPixmap")
Cc: Leo Liu <leo.liu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
1301674c39 st/dri: make swrast_no_present member of dri_screen
Just like the dri2 options, this is better suited in the dri_screen
struct.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
80b62e2d6d st/dri: inline dri2_buffer.h within dri2.c
The header was used only by dri2.c, containing a two-member struct and cast wrapper.
Just inline it where it's used/needed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
89c2c386c0 st/xa: remove unused xa_screen::d[s]_depth_bits_last
Unused since the initial import.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-03 13:38:05 +01:00
Emil Velikov
5ade4b10e2 mesa: use C99 initializer in get_gl_override()
The overrides array contains entries indexed on the gl_api enum.
Use a C99 initializer to make it a bit more obvious.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-03 13:38:05 +01:00
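
Illustration (not part of the commit): a minimal C sketch of an array indexed
by an enum and filled with C99 designated initializers, as the message above
describes. The enum, struct and environment variable names are hypothetical,
not the ones used in get_gl_override().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical API enum; the real code indexes on gl_api. */
enum demo_api {
    DEMO_API_COMPAT,
    DEMO_API_ES1,
    DEMO_API_ES2,
    DEMO_API_CORE,
    DEMO_API_COUNT
};

struct demo_override {
    const char *env_var; /* made-up variable names, for illustration only */
    int version;         /* -1 means "not queried yet" */
};

/* Each entry names its index, so reordering the enum cannot silently
 * shift the table contents. */
static struct demo_override demo_overrides[DEMO_API_COUNT] = {
    [DEMO_API_COMPAT] = { "DEMO_GL_VERSION_OVERRIDE", -1 },
    [DEMO_API_ES1]    = { "DEMO_GLES_VERSION_OVERRIDE", -1 },
    [DEMO_API_ES2]    = { "DEMO_GLES_VERSION_OVERRIDE", -1 },
    [DEMO_API_CORE]   = { "DEMO_GL_VERSION_OVERRIDE", -1 },
};

const struct demo_override *demo_get_override(enum demo_api api)
{
    assert((unsigned)api < (unsigned)DEMO_API_COUNT);
    return &demo_overrides[api];
}
```

Compared with a positional initializer, the mapping from enum value to entry
is visible at the point of definition.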
Gabriel Majeri
f0b987646a anv: Ensure discreteQueuePriorities is at least 2
This is the minimum value according to the spec.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-10-03 07:57:37 +02:00
Timothy Arceri
2b5f42068d r600: use build-id when available for disk cache
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-03 09:49:21 +10:00
Timothy Arceri
397f2603eb nouveau: use build-id when available for disk cache
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-03 09:49:21 +10:00
Timothy Arceri
2169acbf34 radeonsi: use build-id when available for disk cache
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-03 09:49:21 +10:00
Timothy Arceri
83ea8dd99b util: add disk_cache_get_function_identifier()
This can be used as a drop in replacement for
disk_cache_get_function_timestamp().

Here we use build-id to generate a driver-id rather than build
timestamp if available. This should resolve issues such as
distros using reproducible builds and flatpak not having
real build timestamps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-03 09:49:21 +10:00
Timothy Arceri
6a884014e4 util: rename timestamp param in disk_cache_create()
Only some drivers use a timestamp here. Others use things such
as build-id, or even a combination of build-ids from Mesa and
LLVM.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-03 09:49:21 +10:00
Józef Kucia
e24a4e05c7 radeonsi: avoid sending GS_EMIT in shaders without outputs
Fixes GPU hangs.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107857
Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-02 17:13:52 -04:00
Fritz Koenig
08f97407fb i965: Replace checks for rb->Name with FlipY (v2)
In the GL_MESA_framebuffer_flip_y implementation
_mesa_is_winsys_fbo checks were replaced with
FlipY checks.  rb->Name is also used to determine
if a buffer is winsys.

v2: Fixes annotation [for emil]

Fixes: ab05dd183c ("i965: implement GL_MESA_framebuffer_flip_y [v3]")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-10-02 11:28:46 -07:00
Marek Olšák
2fd58d8eb2 radeonsi: initialize ac_gpu_info::name when using SI_FORCE_FAMILY
so that it's not NULL when loading radeonsi and a GCN GPU is not
present in the system.
2018-10-02 12:21:49 -04:00
Marek Olšák
0b062f0419 radeonsi: don't set the VS prolog key for the blit VS 2018-10-02 12:21:49 -04:00
Jason Ekstrand
58360ca09d spirv: Move function call handling to vtn_cfg
It makes way more sense for it to live there with the rest of function
handling.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-10-02 10:24:56 -05:00
Jason Ekstrand
00f385e6d4 nir/from_ssa: Don't rewrite derefs destinations to registers
We already call nir_rematerialize_derefs_in_use_blocks_impl prior to
calling nir_lower_ssa_defs_to_regs_block so the assertion that all deref
uses in the block should hold.  This fixes the following CTS test when
dEQP is run with SPIR-V optimization recipe 1:

dEQP-VK.glsl.struct.local.loop_nested_struct_array_vertex

Fixes: 606eb56ab9 "intel/nir: Only lower load/store derefs"
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-10-02 10:24:56 -05:00
Jason Ekstrand
bfc89c668e nir/cf: Remove phi sources if needed in nir_handle_add_jump
If the block in which the jump is inserted is the predecessor of a phi
then we need to remove phi sources otherwise the phi may end up with
things improperly connected.  This fixes the following CTS test when
dEQP is run with SPIR-V optimization recipe 1:

dEQP-VK.glsl.functions.control_flow.return_in_nested_loop_vertex

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-10-02 10:24:56 -05:00
Eric Engestrom
7b0752fb10 anv: suppress warning about unhandled image layout
Let's just be explicit that VK_NV_shading_rate_image is not supported.

Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 6ee1709170 "vulkan: Update the XML and headers to 1.1.86"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2018-10-02 15:09:29 +01:00
Rob Clark
ae78489d3e freedreno/a6xx: hwbinning
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-02 10:08:18 -04:00
Rob Clark
8ff349e564 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-02 10:08:18 -04:00
Jason Ekstrand
7e7959fcb7 intel/fs: Fix a typo in need_matching_subreg_offset
This fixes a bunch of Vulkan subgroup tests on little core platforms.

Fixes: 4150920b95 "intel/fs: Add a helper for emitting scan operations"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-02 07:44:25 -05:00
Timothy Arceri
ea66bfda88 util: disable cache if we have no build-id and timestamp is zero
Timestamp can be zero for example when Flatpak is used. In this
case just disable the cache rather than segfaulting when
incompatible cache items are loaded.

V2: actually return false when mtime is 0.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-02 22:07:55 +10:00
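
Illustration (not part of the commit): a minimal C sketch of the selection
logic described above, preferring a build-id, falling back to a timestamp,
and refusing to produce an identifier when neither is usable. The function
name, signature and string format are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a cache identifier into out and returns true,
 * or returns false when the caller should disable the cache. */
bool demo_cache_identifier(const char *build_id, unsigned long mtime,
                           char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);

    if (build_id != NULL && build_id[0] != '\0') {
        snprintf(out, out_size, "build-id:%s", build_id);
        return true;
    }

    if (mtime == 0)
        return false; /* no build-id and no real timestamp: disable the cache */

    snprintf(out, out_size, "mtime:%lu", mtime);
    return true;
}
```

Returning false instead of keying on a zero timestamp avoids two different
builds sharing one cache namespace.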
Eric Engestrom
0bdf7b1d0f include: sync eglext.h from Khronos
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-02 12:10:46 +01:00
Timothy Arceri
0e6cdfd561 radeonsi: add a workaround for bitfield_extract when count is 0
This ports the fix from 3d41757788. Both LLVM 7 & 8 continue
to have this problem.

It fixes rendering issues in some menu and loading screens of
Civ VI which can be seen in the trace from bug 104602.

Note: This does not fix the black triangles on Vega for bug
104602.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104602
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276
2018-10-02 08:39:51 +10:00
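
Illustration (not part of the commit): the message above does not show the
workaround itself. This C sketch only models the GLSL rule that a zero-width
bitfieldExtract yields 0, with the guard written out explicitly. demo_ubfe is
a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned bitfield extract with an explicit guard for a zero-width field. */
uint32_t demo_ubfe(uint32_t value, unsigned offset, unsigned bits)
{
    /* GLSL leaves offset + bits > 32 undefined; reject it here. */
    assert(offset <= 32 && bits <= 32 - offset);

    if (bits == 0)
        return 0; /* zero-width field is defined to be 0 */
    if (bits == 32)
        return value; /* implies offset == 0; avoids a 32-bit shift */

    return (value >> offset) & ((UINT32_C(1) << bits) - 1);
}
```

Without the bits == 0 guard, the result would depend on what the underlying
extract instruction does for a zero count.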
Jason Ekstrand
e4538b93f5 anv: Implement VK_KHR_driver_properties
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-01 13:21:12 -05:00
Jason Ekstrand
6ee1709170 vulkan: Update the XML and headers to 1.1.86
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-01 11:43:20 -05:00
Samuel Pitoiset
c2867e4c2a radv: do not try to set DCC_CONTROL when image doesn't use DCC
Unnecessary. While we are at it, remove the check for pre-VI
because it's already checked earlier.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-01 12:13:12 +02:00
Samuel Pitoiset
f622ab889a radv: add a sanity check for mutable formats and TC-compat HTILE
If apps use the MUTABLE bit and the same formats as the image one
in the list, we can still enable TC-compat HTILE. I don't think
this happens often but given the fact that TC-compat HTILE allows
a nice boost in some situations, it's worth checking.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-01 12:13:09 +02:00
Samuel Pitoiset
dc91c4d40a radv: disable HTILE for very small depth surfaces
Like we disable DCC/CMASK for small color surfaces as well.
Serious Sam 2017 creates a 1x1 depth surface and I think
it should be faster to do slow clears on the graphics queue
instead of fast clears on compute, and eventually a depth
expand if the surface isn't TC-compatible HTILE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-01 10:16:33 +02:00
Samuel Pitoiset
6cfa321c39 radv: add potential missing fields for DB_EQAA
Other drivers set these two as well, just apply the same rule.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-01 10:16:30 +02:00
Samuel Pitoiset
bd6df2f923 radv: disable complicated point clipping against user clip planes
I don't think this is required by Vulkan either.

Ported from RadeonSI (AMDVLK doesn't set it either).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-01 10:16:25 +02:00
Michel Dänzer
cb863de626 gallium/util: Clarify comment in util_init_thread_pinning
As discussed in the review of the patch which added the comment:

Nothing happens when a thread is created, because pthread_atfork doesn't
affect creating threads. However, spawning a child process will likely
crash.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-28 17:52:11 +02:00
Samuel Pitoiset
3fb4adae83 radv: do not sync CP DMA when copying buffers
We already track if the DMA engine is busy/idle with a flag,
and we emit a packet that waits for all CP DMA operations
to be complete. This is done at end of command buffer because
the kernel doesn't wait for them, and also when emitting
barriers, so it should be safe.

This improves small copies for both aligned and unaligned sizes.

Aligned sizes:
BEFORE:
1 KB: 59.840000 ms
2 KB: 71.200000 ms
AFTER:
1 KB: 31.200000 ms
2 KB: 31.040000 ms

Unaligned sizes:
BEFORE:
2 KB: 68.3200 ms
3 KB: 79.3600 ms
5 KB: 76.6400 ms
9 KB: 90.8800 ms
17 KB: 116.0000 ms
AFTER:
2 KB: 31.0400 ms
3 KB: 32.0000 ms
5 KB: 30.8800 ms
9 KB: 30.5600 ms
17 KB: 29.6000 ms

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-28 09:08:52 +02:00
Samuel Pitoiset
621e70dd40 radv: adjust the CmdUpdateBuffer threshold for optimal performance
According to my benchmark results, it appears that we should
reduce the threshold to 1024.

BEFORE:
1 KB: 68.656000 ms
2 KB: 118.368000 ms

AFTER:
1 KB: 31.760000 ms
2 KB: 29.840000 ms

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-28 09:08:44 +02:00
Samuel Pitoiset
5d6a560a29 radv: do not use the availability bit for timestamp queries
It's unnecessary because we can just check if the timestamp
is different from the default value written when a pool is created
or reset. Instead of waiting for the availability bit to
be 1, we have to emit a not-equal WAIT_REG_MEM for checking
if the timestamp is ready.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-28 09:08:03 +02:00
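
Illustration (not part of the commit): a minimal CPU-side C sketch of the
idea above, where a slot counts as ready once it no longer holds the value
written at pool creation or reset. The sentinel value and the demo_* names
are assumptions; the commit does not state which default value radv uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed "not written yet" marker; the real default value may differ. */
#define DEMO_TIMESTAMP_NOT_READY UINT64_MAX

/* Creating or resetting a pool fills every slot with the marker. */
void demo_pool_reset(uint64_t *slots, size_t count)
{
    for (size_t i = 0; i < count; i++)
        slots[i] = DEMO_TIMESTAMP_NOT_READY;
}

/* A "not equal to the default" test replaces a separate availability bit. */
bool demo_timestamp_ready(const uint64_t *slots, size_t index)
{
    return slots[index] != DEMO_TIMESTAMP_NOT_READY;
}
```

The same comparison is what a not-equal wait on the GPU side would express.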
Kristian H. Kristensen
3e90505224 freedreno/a6xx: Build up draw dword0 outside visibility if statement
Pulling this logic out means we can share the logic and avoid a couple
of temporary variables that helped make things clearer before. Note
that in either vismode case, we always program vismode 0.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-09-27 16:08:52 -04:00
Kristian H. Kristensen
74a87cdaa6 freedreno/a6xx: Simplify draw_emit() branches a bit
Now that we've copied the emit logic into each branch of the
if (info->index_size) statement, we can simplify the logic a bit
according to which case we're in.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-09-27 16:08:52 -04:00
Kristian H. Kristensen
2516073cb6 freedreno/a6xx: Copy OUT_RING() part into each branch of the index if
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-09-27 16:08:52 -04:00
Kristian H. Kristensen
c3d58d9ffc freedreno/a6xx: Split fd6_draw_emit into direct and indirect paths
This splits the two code paths into separate functions and moves the
"if (info->indirect)" test into draw_impl().

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-09-27 16:08:52 -04:00
Kristian H. Kristensen
adcd83fb22 freedreno/a6xx: Inline fd6_draw()
Simplify the code a bit by inlining this helper.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-09-27 16:08:52 -04:00
Kristian H. Kristensen
fb1c6b89a2 freedreno/a6xx: Move emit_marker and wfi to draw_impl()
This way the markers clearly bracket the draw call and aren't
duplicated for both direct and indirect draw code.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-09-27 16:08:52 -04:00
Kristian H. Kristensen
0559050557 freedreno/a6xx: Move inline functions out of fd6_draw.h
Only used in fd6_draw.c so put them there.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-09-27 16:08:52 -04:00
Hyunjun Ko
1a40faa864 freedreno: fix a typo in launch_grid 2018-09-27 16:06:19 -04:00
Hyunjun Ko
aef410f31e freedreno/ir3: fix the param order of cmpxchg
According to the following definition,
int AtomicCompSwap(inout int mem, uint compare, uint data);

NIR's atomic_comp_swap takes compare first and data second, while src0
for cmpxchg needs vec2(data, compare).
So for ssbo/image deref comp_swap, the order must be reversed.

Fixes: dEQP-GLES31.functional.image_load_store.*.atomic.comp_swap*
2018-09-27 16:05:49 -04:00
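
Illustration (not part of the commit): a minimal C sketch of the operand swap
described above. Arguments arrive in (compare, data) order and are packed as
vec2(data, compare). The demo_* names and the reference compare-and-swap are
hypothetical and only model the semantics.

```c
#include <assert.h>
#include <stdint.h>

struct demo_vec2 {
    uint32_t x, y;
};

/* Arguments arrive in (compare, data) order and leave packed as
 * (data, compare). */
struct demo_vec2 demo_pack_cmpxchg_src(uint32_t compare, uint32_t data)
{
    struct demo_vec2 src0 = { .x = data, .y = compare };
    return src0;
}

/* Reference compare-and-swap on plain memory, returning the old value. */
uint32_t demo_comp_swap(uint32_t *mem, struct demo_vec2 src0)
{
    uint32_t old = *mem;

    if (old == src0.y)  /* .y holds compare */
        *mem = src0.x;  /* .x holds data */
    return old;
}
```

Packing the operands in the wrong order would compare against data and store
compare, which is the kind of failure the dEQP comp_swap tests catch.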
Rob Clark
49d22c2dfc freedreno/a6xx: fix shaders w/ >= 24 regs
Possibly these bits mean something else now.  Blob always seems to use
FOUR_QUADS, and changing to TWO_QUADS seems to cause different threads
to overlap registers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:49:14 -04:00
Rob Clark
6530fcc4a7 freedreno/a6xx: fix gl_FragCoord.w
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:45:44 -04:00
Rob Clark
919741b8d5 freedreno: handle invalidated buffers harder
Do a better job of skipping mem2gmem/gmem2mem..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:41:46 -04:00
Rob Clark
19e9d28646 freedreno/a6xx: fix constlen
Fix a few bits of confusion: as with previous gens, constlen is aligned
to 4, and the value in the bitfield is right-shifted by 2 (i.e. divided by 4).
But this is done by the CONSTLEN() accessor/builder fxn, so don't do it
twice.  Also HLSQ_FS_CNTL.CONSTLEN is not special.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:33:10 -04:00
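
Illustration (not part of the commit): a minimal C sketch of the "align to 4,
then store the length divided by 4" encoding described above, with the
division done in exactly one place. demo_constlen_field is a hypothetical
stand-in for the CONSTLEN() builder; the real field layout is not shown in
the message.

```c
#include <assert.h>
#include <stdint.h>

/* Round a length up to the next multiple of 4. */
unsigned demo_align4(unsigned n)
{
    return (n + 3u) & ~3u;
}

/* Hypothetical builder: takes the aligned length and performs the divide
 * by 4 itself, so callers must not pre-divide. */
uint32_t demo_constlen_field(unsigned constlen)
{
    assert(constlen % 4 == 0);
    return constlen >> 2;
}
```

A caller that divided by 4 before calling the builder would encode a value
four times too small, which is the double application the commit removes.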
Rob Clark
12de415ad1 freedreno: fix inorder rendering case
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:32:39 -04:00
Rob Clark
b65b6f7606 freedreno/a6xx: backface stencil state
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:31:56 -04:00
Rob Clark
93db15d300 freedreno/a6xx: fix gpu crash with separate-stencil
Fixes a crash in (of all things) dEQP-GLES2.info.vendor with
--deqp-surface-type=fbo..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:31:34 -04:00
Rob Clark
a52ef80d24 freedreno/a6xx: fix MRT config
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:30:36 -04:00
Rob Clark
8930e83642 freedreno: fix potential hang when destroying batch
batch_flush_reset_dependencies() expects to be called unlocked, and can
call fd_batch_reference() which can try to acquire the screen lock again.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:29:45 -04:00
Rob Clark
ef6d15f8a8 freedreno: fix corrupted fb state
In c3d9f29b we allowed ctx->batch to be null, and started tracking the
current framebuffer state in fd_context.  But the existing logic in
fd_blitter_pipe_begin() would, if !ctx->batch, set null fb state to be
restored after blit.  Which broke the world of deqp (and probably other
things)

Fixes: c3d9f29b78 freedreno: allocate ctx's batch on demand
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:27:38 -04:00
Rob Clark
5bb96bf73a freedreno: simplify pctx->clear()
This is defined to always clear the entire surface(s) specified,
regardless of scissor state.. mesa/st will turn scissored clears
into a draw.  So rip out a bunch of unnecessary machinery.

Also remove a comment that was obsolete since using u_blitter to
turn clear into draw (for the cases where there isn't a hw blitter
fast-path).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:26:32 -04:00
Rob Clark
a7fa44cd33 freedreno: fix FD_MESA_DEBUG=flush
The logic to force a flush every draw was short-circuited with newer
kernels.  Also it should apply to clears as well.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:25:49 -04:00
Rob Clark
83c5c026ee freedreno: fix scissor state emit
The effective scissor changes based on rasterizer->scissor flag, so we
need to re-emit scissor state when rasterizer state changes.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:25:24 -04:00
Rob Clark
106f18258a freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-27 15:25:01 -04:00
Erik Faye-Lund
c3486cd8c9 st/mesa: do not call update_framebuffer_size with NULL pointer
In st_renderbuffer_alloc_storage, we avoid allocating storage for
zero-sized buffers, leading to this pointer being NULL. We already
take care to avoid dereferencing these pointers for color-buffers,
but not for depth/stencil-buffers.

So let's tread a bit more carefully here.

This avoids a crash while running Piglit's glx/glx-visuals-stencil
test, both on virgl and r600g.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-27 10:33:44 +02:00
Maxime
dd333c66bd vulkan: Disable randr lease for libxcb < 1.13
Since the Randr lease code was added, compiling against libxcb 1.12 no
longer works.

CC: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108024
Tested-By: Maxime <berillions@gmail.com>
Fixes: 7ab1fffcd2 "vulkan: Add EXT_acquire_xlib_display [v5]"
2018-09-27 16:31:42 +10:00
Bas Nieuwenhuizen
40585ddb48 radv: Remove garbage comment.
Trivial.
2018-09-27 02:04:06 +02:00
Bas Nieuwenhuizen
0207ebcbf1 radv: Do not use multiple draws for multisample copies.
Use sample rate shading instead, should give better locality.

Makes Nier with 8x msaa on a Raven go 5 fps -> 7 fps in the menu.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-27 02:04:00 +02:00
Jordan Justen
ca1d3fc538 anv: If softpin is supported, use it with the hiz clear value bo
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-09-26 10:21:23 -07:00
Jordan Justen
2a97390552 anv: s/batch/value_bo/ on anv_device_init_hiz_clear_batch
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-09-26 10:21:23 -07:00
Dylan Baker
e9bd071f49 docs: update calendar, add news and link release notes for 18.1.9 2018-09-26 09:44:40 -07:00
Dylan Baker
d4bdcf5d22 docs: Add sha256 sums to 18.1.9 2018-09-26 09:41:53 -07:00
Dylan Baker
4769f49455 docs: Add 18.1.9 release notes 2018-09-26 09:40:56 -07:00
Jason Ekstrand
b3f477ef7a intel/isl: Add unit suffixes to some struct fields and variables
I was about to make the claim to someone that every field in isl_surf
is either an enum or has explicit units.  Then I looked at isl_surf and
discovered this claim was wrong.  We should fix that.  This commit does
a few refactors:

 * Add _B suffixes to some struct fields
 * Add _B to some variables and parameters
 * Rename row_pitch_tiles -> row_pitch_tl

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-09-26 08:52:26 -05:00
Axel Davy
0d495bec25 radeonsi: NaN should pass kill_if
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105333
Fixes: https://github.com/iXit/Mesa-3D/issues/314

For this application, NaN is passed to KILL_IF and is expected to
pass.

v2: Explain in the code why UGE is used.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

CC: <mesa-stable@lists.freedesktop.org>
2018-09-25 22:05:24 +02:00
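
Illustration (not part of the commit): a minimal C sketch of why an unordered
comparison (UGE) lets NaN survive a "kill if negative" test while an ordered
one does not. The demo_* functions are hypothetical and model only the
comparison, not the LLVM IR the driver emits.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Ordered >=: false when x is NaN, so a NaN fragment would be killed. */
bool demo_survives_ordered(float x)
{
    return x >= 0.0f;
}

/* Unordered-or-greater-equal: true when x is NaN or x >= 0. */
bool demo_survives_unordered(float x)
{
    return !(x < 0.0f);
}
```

Both functions agree for ordinary values and differ only for NaN, which is
the case the application above relies on.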
Axel Davy
46814e771a st/nine: Do not mark both ff vs and ps updated
Previously if only ff vs or only ff ps was used,
the constants for both were marked as updated,
while only the constants of the used ff shader
were updated.

Now that NINE_STATE_FF_VS and
NINE_STATE_FF_PS do not intersect anymore,
we can mark the correct set of constants
as updated.

Fixes: https://github.com/iXit/Mesa-3D/issues/319

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
8e0526555d st/nine: Split NINE_STATE_FF_OTHER
NINE_STATE_FF_OTHER was mostly ff vs states.

Rename it to NINE_STATE_FF_VS_OTHER and
move common states with ps to
NINE_STATE_FF_PS_CONSTS (renamed from
NINE_STATE_FF_PSSTAGES).

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
5f7a41c33b st/nine: Add dummy ff shader state
Some states only affect the ff shader,
not its constants.
Currently we don't check anything and
always recompute the ff shader key.

However we do check for NINE_STATE_FF_OTHER
and if set we reupload some constants.

Thus for those states which had NINE_STATE_FF_OTHER
set but didn't need it,
replace by a dummy ff shader state (which is
easier to understand for an external reader than
just setting 0 and more future proof).

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
f6bf1d2db0 st/nine: Mark pointsize states as ff states
The pointsize states were missing the ff
NINE_STATE_FF_OTHER flag, and thus might
miss state updates when using ff.

Fixes some wine tests.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
89beea100f st/nine: Minor refactor of a few NINE_STATE_* flags
Rename NINE_STATE_FOG_SHADER,
NINE_STATE_POINTSIZE_SHADER and NINE_STATE_PS1X_SHADER
into
NINE_STATE_VS_PARAMS_MISC and NINE_STATE_PS_PARAMS_MISC.

The behaviour is unchanged, except one minor change:
D3DRS_FOGTABLEMODE doesn't need to affect VS.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
7ae2509ce0 st/nine: Increase maximum number of temp registers
With some test app I hit the limit.
As we allocate on demand (up to the maximum),
it is free to increase the limit.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
2018-09-25 22:05:24 +02:00
Axel Davy
dc4b53e129 st/nine: Lock the entire buffer in some cases.
Previously we had already found that for
MANAGED buffers the buffer started dirty
(which meant all writes out of bound
before the first draw call using the
buffer have to be taken into account).

Possibly it is the same for the other types of buffers.
For now always lock the entire buffer (starting from the offset)
for these (except for DYNAMIC buffers, where doing so might hurt
performance too much).

Fixes: https://github.com/iXit/Mesa-3D/issues/301

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
0eeb583650 st/nine: Don't call SetCursor until a cursor is set
The previous code was ignoring the input
until a cursor is set inside d3d
(with SetCursorProperties), as expected
by wine tests.

However it did still make a call to ID3DPresent_SetCursor,
which would result in a SetCursor(NULL) call, thus
hiding any cursor set outside d3d, which we shouldn't do.

Add comment about not avoiding redundant ID3DPresent_SetCursor
calls once a cursor has been set in d3d, as it has been tested to
cause regressions.

Fixes: https://github.com/iXit/Mesa-3D/issues/197

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
dcfde02bb0 st/nine: Avoid redundant SetCursorPos calls
For some applications SetCursorPosition
is called when a cursor event is received.

Our SetCursorPosition was always calling
wine SetCursorPos which would trigger
a cursor event.

The infinite loop is avoided by not calling
SetCursorPos when the position hasn't changed.
Found thanks to wine tests.
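
A minimal sketch of the guard described above, not the actual nine code:
the cursor_state struct, set_cursor_position() and the counting
backend_set_cursor_pos() stub are hypothetical names standing in for the
device state and the wine SetCursorPos call.

```c
#include <stdbool.h>

/* Hypothetical cursor state; the real device structure differs. */
struct cursor_state {
    int x, y;
};

/* Stand-in for the window-system call (wine's SetCursorPos in the text).
 * In this sketch it only counts how often it is invoked. */
static int backend_calls;

static void backend_set_cursor_pos(int x, int y)
{
    (void)x;
    (void)y;
    backend_calls++;
}

/* Returns true if the backend was called, false if the call was skipped. */
static bool set_cursor_position(struct cursor_state *c, int x, int y)
{
    if (c->x == x && c->y == y)
        return false; /* unchanged: skip the call, so no new cursor event */

    c->x = x;
    c->y = y;
    backend_set_cursor_pos(x, y);
    return true;
}
```

An application that reacts to a cursor event by setting the same position
again then hits the early return instead of triggering another event.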

Fixes unresponsive GUI for some applications.

Fixes: https://github.com/iXit/Mesa-3D/issues/173

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
2018-09-25 22:05:24 +02:00
Axel Davy
112c770597 st/nine: Init cursor position at device creation
This is only useful for software cursor,
but at least now we won't start it at (0, 0).

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
62ea55ec8b st/nine: Initialize manually cursor structure
Initialize manually the cursor structure fields
for more clarity on its content.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
110950318c st/nine: Check if format is DS before retrieving flags
d3d9_get_pipe_depth_format_bindings assumes the input format
is a depth stencil format.
Previously the user could hit this function with an invalid format.
Protect the last non protected call with a depth_stencil_format check.

Another solution is to have d3d9_get_pipe_depth_format_bindings
support non depth stencil format, but we don't want the user
to create depth buffers with d3d formats that can't be one,
it's better to check if the format can be depth buffer with d3d.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
af60fbc0a4 st/nine: Remove clamping when mul_zero_wins
Tests show the clamping can be removed
when mul_zero_wins is supported.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
a0afa80889 st/nine: Implement predicated instructions
Most of the work was already there, just not implemented.

Fixes: https://github.com/iXit/Mesa-3D/issues/318

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
e7e82bcdc9 st/nine: Fix aliased read in ff
Fix aliasing of colorarg_b4 with
colorarg_b5.

Fixes: https://github.com/iXit/Mesa-3D/issues/302

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
9fc6aa1bbe st/nine: Fix ff assignment with aliasing
"tex_stage[s][D3DTSS_COLORARG0] >> 4" could be a two bit
number, thus colorarg_b4 was incorrectly set.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
8c35fb0280 st/nine: Clarify some ff assignments
colorarg0, etc are 3 bits wide.
Make the code more readable by adding an & 0x7
to make it explicit that we keep only the first 3 bits.

The 4th bit is always 0,
and colorarg_b4, colorarg_b5, etc are used to store
the 5th and 6th bits.
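
As an illustration only (the struct and function names below are
hypothetical, not the real ff key layout), the split described above
could be sketched like this:

```c
/* Hypothetical key layout: 3 bits for the argument selector plus one bit
 * each for the 5th and 6th bits of a D3DTSS_COLORARGn value. */
struct colorarg_key {
    unsigned int colorarg : 3;    /* bits 0-2 of the state value */
    unsigned int colorarg_b4 : 1; /* bit 4 (the 5th bit) */
    unsigned int colorarg_b5 : 1; /* bit 5 (the 6th bit) */
};

static struct colorarg_key pack_colorarg(unsigned int arg)
{
    struct colorarg_key key;

    key.colorarg = arg & 0x7;           /* keep only the first 3 bits */
    key.colorarg_b4 = (arg >> 4) & 0x1; /* mask down to a single bit */
    key.colorarg_b5 = (arg >> 5) & 0x1;
    return key;
}
```

With this layout, pack_colorarg(0x32) yields colorarg = 2, colorarg_b4 = 1
and colorarg_b5 = 1.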

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
59aaeeb730 st/nine: Print transform matrices in debug
This is useful to see the matrices content
in the log to debug.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
d9da0a1f6d st/nine: Add ff key hash to help debug
This is very useful to find in the log
the ff shader source of a given call.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
fcbb00a502 st/nine: Avoid RefToBind calls in ff
When using csmt, ff shader creation happens on the csmt
thread. Creating the shaders, then calling RefToBind causes
the device ref to be increased then decreased.

However the device dtor assumes that no work pending on the
csmt thread could increase the device ref, leading to a hang.

The issue is avoided by creating the shaders with a bind
count directly.

Fixes: https://github.com/iXit/Mesa-3D/issues/295

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
e83b15cba0 st/nine: Add new helper for object creation with bind
Add a new helper to create objects starting with a bind
count instead of a ref count.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
fd86ce7c14 st/nine: Add parameter to start with bind
Add a parameter to start new object with a bind
instead of a refcount.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
a9bf82ecf4 st/nine: Use perspective correction for ps depth fog
Emulate perspective interpolation of depth for programmable ps fog

ff ps fog uses position z, or 1/w depending on the ff
projection matrix set. This is according to public documents
found describing the algorithm and tests we made.

In the case of programmable ps, we used position's z,
which was sufficient to pass wine tests (whose test shaders
don't set w).

Issue https://github.com/iXit/Mesa-3D/issues/315 showed
that this calculation was wrong.
Using perspective interpolation on z, that is using z * 1/w
seems to satisfy both this application and wine tests.

Fixes: https://github.com/iXit/Mesa-3D/issues/315

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-09-25 22:05:24 +02:00
Axel Davy
7ee5e5e239 st/nine: Clamp RCP when 0*inf!=0
Tests done on several devices of all 3 vendors and
of different generations showed that there are several
ways of handling infs and NaN for d3d9.

Tests showed Intel on windows does always clamp
RCP, RSQ and LOG (thus preventing inf/nan generation),
for all shader versions (some vendor behaviours vary
with shader versions).
Doing this in nine avoids 0*inf issues for drivers
that can't generate 0*inf=0 (which is controlled by
TGSI's MUL_ZERO_WINS).

For now clamp for all drivers. A later optimization
would be to avoid clamping for drivers with MUL_ZERO_WINS
for the specific shader versions where NV or AMD don't
clamp.

LOG and RSQ being already clamped, this patch only
clamps RCP.
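
A rough CPU-side sketch of the idea, assuming the clamp bound is the
largest finite float (rcp_clamped() is a hypothetical helper, not the
shader code emitted by nine):

```c
#include <float.h>

/* Reciprocal clamped to the finite float range, so the result is never
 * +inf or -inf. NaN inputs pass through unchanged in this sketch. */
static float rcp_clamped(float x)
{
    float r = 1.0f / x; /* +-inf for x == +-0 under IEEE 754 */

    if (r > FLT_MAX)
        r = FLT_MAX;
    if (r < -FLT_MAX)
        r = -FLT_MAX;
    return r;
}
```

Because the result is always finite, 0 * rcp_clamped(0) evaluates to 0
even where 0*inf would otherwise give NaN.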

Fixes: https://github.com/iXit/Mesa-3D/issues/316

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
2018-09-25 22:05:23 +02:00
Jan Vesely
1f3fe4aaeb .travis: Drop note about Clover builds being slow
SWR takes 17+ minutes to build. Clover builds take ~6-7 minutes.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-09-25 14:08:06 -04:00
Jan Vesely
cb1b109733 .travis: Add LLVM-7 Clover build
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-09-25 14:08:06 -04:00
Caio Marcelo de Oliveira Filho
3cf07361ac intel/compiler: Export TCS passthrough creation
Move create_passthrough_tcs() from i965 so it can be used in other
contexts.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-25 09:16:31 -07:00
Gert Wollny
47a6f98e15 mesa/st: In the presence of integer buffers enable per buffer blending
Since blending will be disabled later for integer formats we have to
consider that in the case of a mixed set of integer/non-integer format
buffers blending must be handled on a per buffer basis.

Fixes on r600:
  dEQP-GLES31.functional.draw_buffers_indexed.random.
      max_required_draw_buffers.13

Fixes:  8fb966688b
  st/mesa: Disable blending for integer formats.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-25 15:54:38 +02:00
Eric Engestrom
97ae5a858d meson+autotools: get rid of spammy GCC warning -Wformat-truncation
That warning fires every time a string function takes an argument that
could possibly be longer than its max output, which triggers all over
the place, especially when working with file paths ("what if every file
path is MAX_PATH long?" is what GCC is saying, which is really annoying
when we *know* that "/dev/dri/cardN" is not gonna be 4096 char long and
it's safe to store it in a 32-char array).

Anyway, we either add a ton of dead code all over the place to make GCC
happy, or we get rid of its spam. I chose the latter.
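
For illustration, the kind of per-call check that would otherwise be
needed might look like the sketch below. card_path() is a hypothetical
helper; only the "/dev/dri/cardN" pattern and the 32-char buffer come
from the text above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Formats "/dev/dri/cardN" into buf and reports whether it fit.
 * snprintf() returns the length it wanted to write, so a value >= size
 * means the output was truncated. */
static bool card_path(char *buf, size_t size, int n)
{
    int len = snprintf(buf, size, "/dev/dri/card%d", n);

    return len >= 0 && (size_t)len < size;
}
```

With a 32-char buffer the check never fails for realistic card numbers,
which is why it amounts to dead code at such call sites.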

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-09-25 11:40:08 +01:00
Eric Engestrom
1a37a80bf6 meson: make it trivial to add other -Wno-foo CFLAGS
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-25 11:39:56 +01:00
Eric Engestrom
f5b41f9121 gallivm: ensure string is null-terminated instead of assert()ing
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-25 11:39:30 +01:00
Topi Pohjolainen
1cc17fb731 intel/compiler/icl: Use barrier id bits 24:30 instead of 24:27,31
Fixes gpu hangs with Carchase and Manhattan.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-09-25 09:59:59 +03:00
Andres Rodriguez
ec1fcf92ae radv: only emit ZPASS_DONE for timestamp queries on gfx queues
A ZPASS_DONE packet doesn't make sense for the compute queue. It will
result in a gpu hang.

This change resolves a gpu hang for SteamVR+Vega.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 1f616a840e "radv: emit a dummy ..."
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-25 02:30:34 -04:00
Timothy Arceri
72e4287e8f radv: make use of nir_lower_load_const_to_scalar()
This allows NIR to CSE more operations. LLVM does this also so the
impact is limited, however doing this in NIR allows other opts to
make progress. For example in radeonsi more loops are unrolled in
Civilization Beyond Earth.

The actual pipeline-db stats are not overwhelming but even in the
negatively affected shaders the NIR is clearly better. It just
happens that the code shuffling and in some cases calls to max
rather than a flt result in the final output from LLVM not
giving as good numbers.

However this is an incremental opt that further passes build off
so the change should be made IMO.

Totals from affected shaders:
SGPRS: 20192 -> 20184 (-0.04 %)
VGPRS: 19516 -> 19524 (0.04 %)
Spilled SGPRs: 437 -> 444 (1.60 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1527444 -> 1522276 (-0.34 %) bytes
LDS: 6 -> 6 (0.00 %) blocks
Max Waves: 1018 -> 1016 (-0.20 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-25 09:31:22 +10:00
Dylan Baker
f03a160592 meson: de-duplicate LLVM check
By adding `_llvm == 'true'` to the required argument we can check the
'auto' and 'true' case in one path.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-24 13:02:07 -07:00
Eric Engestrom
f2519e3493 vulkan/wsi/display: wsi_display_select_crtc() doesn't need to modify the connector
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-24 17:38:11 +01:00
Eric Engestrom
bde3102c0d vulkan/wsi/display: check if wsi_swapchain_init() succeeded
Fixes: da997ebec9 "vulkan: Add KHR_display extension using DRM [v10]"
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-24 17:37:43 +01:00
Leo Liu
3e7b5e5db2 radeon/uvd: use bitstream coded number for symbols of Huffman tables
Signed-off-by: Leo Liu <leo.liu@amd.com>
Fixes: 130d1f456(radeon/uvd: reconstruct MJPEG bitstream)
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-09-24 09:12:49 -04:00
Rhys Perry
6ca1402c11 nv50/ir: fix link-time build failure
Seems this fixes linking problems that occur in some situations.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-23 18:20:08 +01:00
Rhys Perry
b473fcc9a3 nvc0: fix bindless multisampled images on Maxwell+
NVC0_CB_AUX_BINDLESS_INFO isn't written to on Maxwell+ and it's too small
anyway.

With these changes, TXQ is used to determine the number of samples and
the coordinate adjustment information looked up in a small array in the
driver constant buffer.

v2: rework to use TXQ and a small array instead of a larger array with an
    entry for each texture
v3: get rid of the small array and calculate the adjustments in the shader

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: c2ae9b4052 ('nvc0: implement multisampled images on Maxwell+')
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-22 20:13:17 +01:00
Eric Engestrom
ed797f6597 docs: fix couple typos/outdated info
`git-branch` doesn't exist, and mesa3d-dev hasn't been used in a great
many years :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-22 17:23:18 +01:00
Eric Engestrom
ae2694efe0 docs: update repo URLs after GitLab move
I also updated the developer instructions; presumably someone who's been
given commit rights already knows how to clone a repository :)

A more useful thing is to show how to update the pushurl, and how to use
access tokens to push over HTTPS (especially for us at Intel, where
non-http traffic is a pain).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-22 17:23:18 +01:00
Stuart Young
c95dd966c4 docs: Update FAQ with respect to s3tc support
It's just over 10 months since 17.3.0 was released with s3tc support enabled.
Probably a good idea to update the FAQ page.

v2: Incorporate feedback from Adam Jackson <ajax@redhat.com>

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 04396a134f ("mesa: Import libtxc_dxtn sources")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-09-22 17:23:18 +01:00
Rhys Perry
f580a895b1 nvc0: warn about changing NVC0_CB_AUX_MP_INFO and NVC0_CB_AUX_DRAW_INFO
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-22 16:50:39 +01:00
Rhys Perry
01fa76b707 nvc0: Update counter reading shaders to new NVC0_CB_AUX_MP_INFO
Fixes: 66ca7e400b ('nvc0: add support for programmable sample locations')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-22 16:50:22 +01:00
Eric Anholt
cd667edecc vc4: Remove dead i == 0 code from the cos() implementation.
The loop starts at 1.
2018-09-21 17:16:43 -07:00
Eric Anholt
10d5d2d527 vc4: Fix sin(0.0) and cos(0.0) accuracy to fix SDL rendering rotation.
SDL has some shaders that compute sin(angle) and cos(angle) for a rotation
matrix in the VS, and angle is usually 0.0.  Our previous implementation
had quite a bit of error around 0.0, causing single-pixel rotations at
typical window sizes.  SDL2 has changed as of August 28th (commit
12156:e5a666405750) to not need sin/cos in the VS, but we should still fix
this for existing implementations or similar patterns that other programs
may have.

glsl-cos goes from 32 instructions to 36, but 9 uniforms to 7.
glsl-sin goes from 32 instructions to 34, but 8 uniforms to 7.

This seems like a fine impact to have for the bugfix.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Fixes: https://github.com/anholt/mesa/issues/110
2018-09-21 17:16:43 -07:00
Anuj Phogat
a0baedb638 intel/icl: Fix URB size for different SKUs
Different ICL SKUs have different URB sizes.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-21 14:40:04 -07:00
Anuj Phogat
fa1ff71a0f i965/icl: Set Enabled Texel Offset Precision Fix bit
h/w specification requires this bit to be always set.

V2: Fix bit mask (Chris Wilson)

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-21 14:40:04 -07:00
Anuj Phogat
5eb173304b anv/icl: Set Enabled Texel Offset Precision Fix bit
h/w specification requires this bit to be always set.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-21 14:40:04 -07:00
Alex Deucher
afb7c6b301 pci_ids: add new polaris pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-09-21 14:33:13 -05:00
Marek Olšák
f0cd7dbcd7 glsl_to_tgsi: invert gl_SamplePosition.y for the default framebuffer
Fixes dEQP-GLES31.functional.shaders.sample_variables.sample_pos.correctness.default_framebuffer
with --deqp-gl-config-name=rgba8888d24s8ms4

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
2018-09-21 13:39:00 -04:00
Caio Marcelo de Oliveira Filho
b29ec31854 util: Add macro to get number of elements in dynarray
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-09-21 10:12:51 -07:00
Dylan Baker
be56f8a788 docs/meson: Add note about llvm-config$version and llvm-config-$version
v2: - fix typo

These are how FreeBSD and Debian handle multiple versions of LLVM
installed at the same time, respectively.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-21 10:03:15 -07:00
Dylan Baker
e0829f9c1a docs/meson: Update notes on using CFLAGS and -Dc_args
v2: - Use ${} to denote variables instead of just $
    - fix spelling error

bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107313
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-21 10:03:15 -07:00
Dylan Baker
1da60667b5 docs: update meson docs to reflect the current status
v2: - minor grammar changes
    - fix typo

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-21 10:03:15 -07:00
Dylan Baker
509ea4649a meson: Don't force libva to required from auto
We already correctly handle va being auto, but we force it to being
true, which is bad.

Fixes 94cf397092
      ("meson: Fix auto option for va")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-21 10:03:15 -07:00
Dylan Baker
5dcb77e491 meson: Don't compile pipe loader with dri support when not using dri
Corrects building glx as gallium-xlib without any dri targets.

v2: - fix ugly formatting

Fixes: 66c94b9313
       ("meson: build gallium winsys for dri, null, and wrapper")

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-21 10:03:15 -07:00
Samuel Pitoiset
fe3f13cc5a radv: use the resolve compute path if dest uses multiple layers
The hardware path doesn't support resolving layers, for both
source and destination images.

This fixes a reflection issue when MSAA is enabled which
affects GTA V and probably DIRT3.

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107786
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Gregor Münch <gr.muench_at_gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-21 16:35:59 +02:00
Jason Ekstrand
ab80889e92 anv,radv: Implement vkAcquireNextImage2
This was added as part of 1.1 but it's very hard to track exactly what
extension added it.  In any case, we should implement it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <Airlied@redhat.com>
2018-09-21 07:02:35 -05:00
Juan A. Suarez Romero
24bacaddef docs: update calendar, add news and link release notes to 18.2.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-09-21 13:09:21 +02:00
Juan A. Suarez Romero
eefc77e691 docs: add sha256 checksums for 18.2.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 686eab6642)
2018-09-21 13:06:14 +02:00
Juan A. Suarez Romero
17fbb1ef74 docs: add release notes for 18.2.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3c8c851fe4)
2018-09-21 13:06:12 +02:00
Samuel Pitoiset
674fcfaecc radv: only enable shaderInt16 on GFX9+ and LLVM7+
The throughput is similar to 32-bit integers on GFX8 and
AMDVLK does not expose 16-bit integers on pre Vega as well.
On GFX9+, only LLVM 7+ has support.

This fixes a bunch of CTS crashes on GFX9/LLVM 6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-21 10:56:17 +02:00
Marek Olšák
945e9cdb2b docs/features: add EXT_direct_state_access features
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-09-21 03:01:58 -04:00
Bas Nieuwenhuizen
0a77e70d10 radv: Fix driver UUID SHA1 init.
Was missing the init, found by Emil.

Fixes: d17443a459 "radv: Use build ID if available for cache UUID."
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-20 23:38:38 +02:00
Charmaine Lee
64731e7c5e svga: fix uninitialized fields in DefineDepthStencilView/DefineStreamOutput
This patch fixes uninitialized fields in DefineDepthStencilView and
DefineStreamOutput commands that are not relevant in SM4 device.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-20 13:20:10 -06:00
Brian Paul
7f4e6f4c97 r300g: add PIPE_SHADER_CAP_SCALAR_ISA switch case to silence warning
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-09-20 13:20:10 -06:00
Brian Paul
198c50f487 st/mesa: silenced unhandled enum warning in st_glsl_to_tgsi.cpp
Add ir_intrinsic_begin_fragment_shader_ordering switch case to
silence warning

Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-09-20 13:20:10 -06:00
Brian Paul
35ea66a68e mesa: use GLsizeiptrARB, GLintptrARB in bufferobj.c
The function pointer declarations in dd.h for the BufferData() and
BufferSubData() use the ARB-suffixed datatypes.  This patch changes
the buffer_data_fallback() and buffer_sub_data_fallback() functions
to use those datatypes too.

This fixes a build warning when building 32-bit libraries.  Evidently,
GLsizeiptrARB and GLsizeiptr are defined differently in that situation.

All implementations of these driver hooks use the ARB-suffixed
types.

Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-09-20 13:20:10 -06:00
Neha Bhende
708d34d41a svga: Enable Opengl 3.3 compatibility profile
With this patch, svga driver will start advertising OpenGL 3.3
compatibility profile.

Tested with some mesa demos, piglit and glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-20 13:20:10 -06:00
Neha Bhende
ede805dd19 svga: Apply texcoord scale factors only if there is sampler view
We need to convert unnormalized texcoords to normalized texcoords
when we are sampling from texture. We don't need this conversion
if there is no sampler view.

Tested with piglit, glretrace

Fixes vmware bug 2101970

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-20 13:20:10 -06:00
Charmaine Lee
1dcf377a76 svga: fix texture array layer index in transfer map
In gallium, the layer index of a texture array to be mapped
is specified in the z component, whereas in svga device, the
index is specified in a separate argument.
Currently in svga_texture_transfer_map(), we explicitly modify
the z value in the base transfer map to 0 so the layer offset will not be
applied twice, but this causes problem when state tracker later
refers to the base transfer map and expects the slice index to be
specified in z (commit 463b0ea1f6).

To fix the problem, this patch makes a local copy of the box in
svga_transfer and modifies the z value in this copy instead.

Fixes spec@khr_texture_compression-astc piglit test crashes.
Fixes regression in the dma path with commit 1fdd3dd94a.

Tested with mtt glretrace, piglit on Windows VM and Linux VM.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-20 13:20:10 -06:00
Dylan Baker
18a6e426f3 Revert "utils/u_math: break dependency on gallium/utils"
This reverts commit 0abce6d770.

Which broke the windows build.
2018-09-20 10:36:33 -07:00
Caio Marcelo de Oliveira Filho
2567ad28bb i965: remove outdated comment about TCS passthrough
Since commit 75881bed9e "i965: Rework the TCS passthrough shader to
use NIR." the created nir_shader is not dummy, and it is compiled by
the backend like the others.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-20 09:58:55 -07:00
Christoph Haag
b01834b56c meson: add option to statically link llvm
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-20 06:08:50 -07:00
Dylan Baker
0abce6d770 utils/u_math: break dependency on gallium/utils
Currently u_math needs gallium utils for cpu detection.  Most of what
u_math uses out of u_cpu_detection is duplicated in src/mesa/x86
(surprise!), so I've just reworked it as much as possible to use the
x86/common_x86_features.h macros instead of the gallium ones. The mesa
implementation is a header only approach, with no external dependencies.
There is one small function that was copied over, as promoting
u_cpu_detection is itself a fairly hefty undertaking, as it depends on
u_debug, and this fixes the bug for now.

bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107870
Tested-by: Vinson Lee <vlee@freedesktop.org>
2018-09-20 05:52:23 -07:00
Emil Velikov
b8b3517a49 egl/android: rework device probing
Unlike the other platforms, here we aim to guess if the device that we
somewhat arbitrarily picked, is supported or not.

In particular: when a vendor is _not_ requested we loop through all
devices, picking the first one which can create a DRI screen.

When a vendor is requested - we use that and do _not_ fall-back to any
other device.

The former seems a bit fiddly, but considering EGL_EXT_explicit_device and
EGL_MESA_query_renderer are MIA, this is the best we can do for the
moment.

With those (proposed) extensions userspace will be able to create a
separate EGL display for each device, query device details and make the
conscious decision which one to use.

v2:
 - update droid_open_device_drm_gralloc()
 - set the dri2_dpy->fd before using it
 - return a EGLBoolean for droid_{probe,open}_device*
 - do not warn on droid_load_driver failure (Tomasz)
 - plug mem leak on dri2_create_screen failure (Tomasz)
 - fixup function name typo (Tomasz, Rob)

v3:
 - add forward declaration for droid_load_driver()
   Fixes the HAVE_DRM_GRALLOC build (Mauro)
 - split dup() assignment and check in separate lines (Tomasz, Eric)
 - make droid_load_driver() static (Tomasz)
 - drop unused prop_set variable (Tomasz)

v4:
 - rebase
 - fwd declaration should be for droid_probe_device()

Cc: Robert Foss <robert.foss@collabora.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
2018-09-20 10:15:38 +01:00
Danylo Piliaiev
18be7403a1 glsl: Add an assert when cloning ir_dereference_record with invalid field
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-09-20 08:30:11 +10:00
Danylo Piliaiev
6f3c7374b1 glsl: Avoid propagating incompatible type of initializer
do_assignment validated assignment but when rhs type was not compatible
it proceeded without issues and returned error_emitted = false.
On the other hand process_initializer expected do_assignment to always
return compatible type and never fail.

As a result when variable was initialized with incompatible type
the type of variable changed to the incompatible one.
This manifested in unnecessary error messages and in one case in crash.

Example GLSL:
 vec4 tmp = vec2(0.0);
 tmp.z -= 1.0;

Past error messages:
 initializer of type vec2 cannot be assigned to variable of type vec4
 invalid swizzle / mask `z'
 type mismatch
 operands to arithmetic operators must be numeric

After this patch:
 initializer of type vec2 cannot be assigned to variable of type vec4

In the other case when we initialize variable with incompatible struct,
accessing variable's field led to a crash. Example:
 uniform struct {float field;} data;
 ...
 vec4 tmp = data;
 tmp.x -= 1.0;

After the patch there is only error line without a crash:
 initializer of type #anon_struct cannot be assigned to variable of
  type vec4

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107547
2018-09-20 08:30:11 +10:00
Michal Srb
194bf0a2e0 st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it
This is equivalent to commit a65db0ad1c, but for dri_kms_init_screen. Without
this gbm_dri_is_format_supported always returns false.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104926
Fixes: e14fe41e0b ("st/dri: implement createImageFromRenderbuffer(2)")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Tested-by: Adam Williamson <adamwill@fedoraproject.org>
2018-09-19 15:20:04 -04:00
Jason Ekstrand
c811af767e anv/so_memcpy: Don't consider src/dst_offset when computing block size
The only thing that matters is the size since we never specify any
offsets in terms of blocks.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-19 09:38:04 -05:00
Jakob Bornecrantz
09171705d5 Revert "mesa: only update framebuffer-state for clears"
This reverts commit fb86365148.
2018-09-19 15:21:26 +01:00
Samuel Pitoiset
121f226471 radv: use a 64-bit unsigned integer when allocating a descriptor pool
pool->size is a 64-bit unsigned integer too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-19 13:36:12 +02:00
Samuel Pitoiset
35656823b9 radv: enable VK_SUBGROUP_FEATURE_ARITHMETIC_BIT
All CTS pass on Polaris/Vega with LLVM 6, 7 and master, so
I think it's safe to enable the feature.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-19 13:36:10 +02:00
Samuel Pitoiset
febdc13a6c radv: do not support blitting surfaces with depth and stencil
Fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_optimal_nearest

And all friends that try to blit a surface with different
depth and stencil formats.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-19 13:36:07 +02:00
Erik Faye-Lund
fb86365148 mesa: only update framebuffer-state for clears
If we update the program-state etc, we risk compiling needless shaders,
which can cost quite a bit of performance.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-19 11:52:53 +02:00
Juan A. Suarez Romero
0c82e3603e nir: add initializer data to fix MSVC compile error
CC: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 82799a5d1b8 ("nir: Add a small pass to rematerialize derefs
per-block")
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-09-19 11:46:44 +02:00
Jason Ekstrand
976046a8d8 nir: Add some asserts that we don't put derefs in phis
The lcssa and phis_to_regs passes are used by various NIR optimizations
that modify the CFG.  Putting a couple of asserts will help ensure that
we don't accidentally put derefs in phis as part of an optimization
pass.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-09-19 02:00:49 -05:00
Jason Ekstrand
864c780566 nir/opt_if: Re-materialize derefs in use blocks before peeling loops
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107879
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-09-19 02:00:49 -05:00
Jason Ekstrand
0796c3934e nir/loop_unroll: Re-materialize derefs in use blocks before unrolling
When we're about to re-arrange a bunch of blocks, it's a good idea to
make sure that we don't have deref uses crossing block boundaries.
Otherwise we may end up with a deref going through a phi and that would
be bad.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-09-19 01:59:40 -05:00
Jason Ekstrand
7d1d1208c2 nir: Add a small pass to rematerialize derefs per-block
This pass re-materializes deref instructions on a per-block basis to
ensure that every use of a deref occurs in the same block as the
instruction which uses it.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-09-19 01:59:40 -05:00
Kenneth Feng
4490fce166 amd: Add Picasso device id
No changes here compared to Raven.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
2018-09-18 18:05:17 -04:00
Bas Nieuwenhuizen
95bb7d82ca Revert "radv: fix descriptor pool allocation size"
This reverts commit 90819abb56.

This logic was wrong, the original code is correct. The direct
impact is that we allocate up to approximately a squared amount
of memory compared to what we should allocate.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-18 22:51:42 +02:00
Samuel Pitoiset
c9dbe52f84 radv: implement VK_EXT_conservative_rasterization
Only supported by GFX9+.

The conservativeraster Sascha demo seems to work as expected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-18 13:28:01 +02:00
Samuel Pitoiset
450a325858 radv: do not re-create the sampler for every blit in CmdBlitImage()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-18 13:27:59 +02:00
Samuel Pitoiset
3871dd7a92 radv: allow to force anisotropy via RADV_TEX_ANISO
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-18 13:27:58 +02:00
Timothy Arceri
b54a2311a9 mesa: enable EXT_framebuffer_object in core profile
Since user defined names are not allowed in core profile
we remove the allow_user_names bool and just check if
we have a core profile like all other buffer/texture
object handling code does.

This extension is required by "Wolfenstein: The Old Blood"
and is exposed in core in the Nvidia binary driver.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:58:24 +10:00
Timothy Arceri
02843ed768 mesa: move legacy dri config option texture_depth
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
f958ea6eff mesa: move legacy dri config option fthrottle_mode
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
4b1a81ef9d mesa: move legacy dri config option def_max_anisotropy
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
6164d59bcc mesa: move legacy dri config option no_neg_lod_bias
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
6d1890fa07 mesa: move legacy dri config option round_mode
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
3a1d09fd55 mesa: remove unused dri option float_depth
This seems to have only been used by DRI1 drivers which were
removed with e4344161bd.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
91e76ce493 mesa: move legacy dri config option dither_mode
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
2d7dc9591d mesa: move legacy dri config option color_reduction
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
408d41a413 mesa: move legacy TCL dri config options
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:43:05 +10:00
Timothy Arceri
024abd3534 util: use force_compat_profile for Wolfenstein The Old Blood
This game is looking for some odd extensions after creating a core
context, such as ARB_vertex_program and EXT_framebuffer_object.

Rather than enabling these in core, this forces the game to use
compat. This allows the game to run and seems to work without
issues. All other id tech games/engines use a compat profile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:34:54 +10:00
Timothy Arceri
64ec50d52f mesa/st: add force_compat_profile option to driconfig
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-18 19:34:54 +10:00
Timothy Arceri
7a992fcfa0 Revert "radeonsi: avoid syncing the driver thread in si_fence_finish"
This reverts commit bc65dcab3b.

This was manually reverted. Reverting stops the menu hanging in
some id tech games such as RAGE and Wolfenstein The New Order.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107891
2018-09-18 19:21:32 +10:00
Eric Anholt
4e1af6808c v3d: Switch from FLUSH_ALL_STATE to FLUSH for ending our bin CLs.
The HW for FLUSH_ALL_STATE isn't validated, since the closed driver only
uses FLUSH.  Now that we don't have any new state at the end of our bin
CLs, follow their lead.
2018-09-17 16:35:45 -07:00
Eric Anholt
0b8007b523 v3d: Stop clearing the OQ state at the end of the job.
Ever since we added OQ support, we've been clearing OQ state at the start
of the job anyway.  We're intentionally breaking old-and-new-driver-mix
systems, because we need to stop using the unvalidated FLUSH_ALL_STATE.
2018-09-17 16:35:45 -07:00
Eric Anholt
350cb79045 v3d: Always emit a TF disable at the start of drawing on V3D 4.x.
The HW's FLUSH_ALL_STATE is not validated, so we probably shouldn't use
it, meaning that we need to reset state at the start.  By doing this, we
also make ourselves more resilient to another client leaving the TF state
enabled at the end of their batch (as we now do, ourselves).

However, we still need to emit a single TF disable at the end of the
frame, for SWVC5-718.
2018-09-17 16:35:45 -07:00
Dylan Baker
7f08bcb73f build: Don't overlink gallium xlib target
Currently gallium's xlib target will fail to link due to multiple
definitions of all the symbols in libmesautil. This only shows up in
autotools, and not in meson, due to differences in the way that meson and
autotools handle linking static archives into static archives: autotools
uses -Wl,--whole-archive implicitly, while meson requires this behavior to
be opted into. The solution is just to remove libmesautil from the
libgl-xlib target, since it will get all of those symbols from
libmesagallium.

I've dropped the link from meson as well, it doesn't seem to hurt
anything and should make linking just a little faster.

Fixes: 8396043f30
       ("Replace uses of _mesa_bitcount with util_bitcount")
bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107923
Tested-by: Brian Paul <brianp@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>
2018-09-17 13:21:01 -07:00
Dylan Baker
3acc18fcf7 move pthread_setaffinity_np check to the build system
Rather than trying to encode all of the rules in a header, let's just put
them in the build system where they belong. This fixes the build on
FreeBSD, which does have pthread_setaffinity_np, but it's in
pthread_np.h, not behind _GNU_SOURCE. FreeBSD also implements cpu_set
slightly differently, so additional changes would be required to get it
working right there anyway.

v2: - fix #define in autotools

Fixes: 9f1bbbdbbd
       ("util: try to fix the Android and MacOS build")
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-17 13:16:46 -07:00
Fritz Koenig
60d0c0d062 mesa: FramebufferParameteri parameter checking
Missing break; causes parameter checking to
never pass GL_FRAMEBUFFER_FLIP_Y_MESA parameters.
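
A minimal sketch of this class of bug, assuming made-up parameter names
rather than the real GL enums or the actual Mesa function: without the
break, a valid case falls through into the error path.

```c
#include <stdbool.h>

/* Hypothetical parameter names; these are not GL enum values. */
enum fb_param { PARAM_WIDTH, PARAM_FLIP_Y, PARAM_UNKNOWN };

/* Buggy variant: PARAM_FLIP_Y runs into the default (error) case. */
static bool validate_buggy(enum fb_param p)
{
    bool ok = true;
    switch (p) {
    case PARAM_WIDTH:
        break;
    case PARAM_FLIP_Y:
        /* falls through - this is the bug, a break is missing */
    default:
        ok = false;
    }
    return ok;
}

/* Fixed variant: the break keeps PARAM_FLIP_Y out of the error case. */
static bool validate_fixed(enum fb_param p)
{
    bool ok = true;
    switch (p) {
    case PARAM_WIDTH:
        break;
    case PARAM_FLIP_Y:
        break;
    default:
        ok = false;
    }
    return ok;
}
```

Only the fixed variant accepts PARAM_FLIP_Y; both still reject
PARAM_UNKNOWN.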

Fixes: 318c265160 ("mesa: GL_MESA_framebuffer_flip_y extension [v4]")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-17 11:48:00 -07:00
Fritz Koenig
ba6cc32cf9 mesa: Additional FlipY applications
Instances where direction was determined based on
winsys or user fbo and should be determined based on
FlipY.

Key STATE_FB_WPOS_Y_TRANSFORM off of FlipY instead of
_mesa_is_user_fbo.  This corrects gl_FragCoord usage
when applying GL_MESA_framebuffer_flip_y.

Fixes: ab05dd183c ("i965: implement GL_MESA_framebuffer_flip_y [v3]")
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-17 11:48:00 -07:00
Bas Nieuwenhuizen
d17443a459 radv: Use build ID if available for cache UUID.
To get a useful UUID for systems that have a non-useful mtime
for the binaries.

I started using SHA1 to ensure we get reasonable mixing in the
various possibilities and the various build id lengths.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-09-17 20:19:52 +02:00
Samuel Pitoiset
08103c5f65 radv: enable shaderInt16 capability
Not sure if this is all wired up. CTS does pass and the Tangrams
demo works fine on Vega. There are corruption issues on Polaris
but not sure if that is related to 16-bit support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:39 +02:00
Samuel Pitoiset
cd76ce0078 ac: add 16-bit support to ac_build_bitfield_reverse()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:37 +02:00
Samuel Pitoiset
fc398f4d67 ac: add 16-bit support to ac_build_bit_count()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:34 +02:00
Samuel Pitoiset
94dd08eb7c ac: add 16-bit support to ac_find_lsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:32 +02:00
Samuel Pitoiset
5a6c8ca3e8 ac: add 16-bit support to ac_build_umsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:30 +02:00
Samuel Pitoiset
3e7f3e2cd1 ac: add 16-bit support to ac_build_isign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:28 +02:00
Samuel Pitoiset
cfd6314cfe ac: add 16-bit constant values for zero and one
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:26 +02:00
Samuel Pitoiset
074e29183c ac: add ac_build_bitfield_reverse() helper
Are we missing 64-bit support?

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:23 +02:00
Samuel Pitoiset
371c35e5bb ac: add ac_build_bit_count() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:20 +02:00
Samuel Pitoiset
aec9151464 radv: fix use of unreachable() in the meta blit path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-09-17 11:29:25 +02:00
Samuel Pitoiset
6521d4a659 Revert "radv: Optimize rebinding the same descriptor set."
This introduces random GPU hangs on Vega, at least.

This reverts commit 02a43edf18.
2018-09-17 11:20:57 +02:00
Samuel Pitoiset
90819abb56 radv: fix descriptor pool allocation size
The size has to be multiplied by the number of sets.

This gets rid of the OUT_OF_POOL_KHR error and fixes
a crash with the Tangrams demo.

CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 10:18:01 +02:00
Jason Ekstrand
67094e11e9 anv/query: Add an emit_srm helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-17 02:57:21 -05:00
Jason Ekstrand
40149441b8 anv: Add a mi_memset and use it for zeroing queries
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-17 02:57:21 -05:00
Jason Ekstrand
b11e9b5ffe anv/query: Use anv_address everywhere
Instead of passing around BOs and offsets, use addresses which are anv's
GPU equivalent of pointers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-17 02:57:21 -05:00
Jason Ekstrand
07e214f1ce anv/query: Write both dwords in emit_zero_queries
Each query slot is a uint64_t and we were only zeroing half of it.
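
For illustration only (plain CPU memory stands in for the GPU writes, and
both function names are hypothetical), zeroing a 64-bit slot through
32-bit stores takes two of them:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GPU store of one 32-bit dword. */
static void store_dword(uint32_t *dwords, size_t index, uint32_t value)
{
    dwords[index] = value;
}

/* Zero one 64-bit query slot, viewed as two consecutive dwords. */
static void zero_query_slot(uint32_t *dwords, size_t slot)
{
    store_dword(dwords, 2 * slot, 0);     /* first dword of the slot  */
    store_dword(dwords, 2 * slot + 1, 0); /* second dword of the slot */
}
```

Dropping the second store would leave half of the slot holding stale
data.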

Fixes: 7ec6e4e689 "anv/query: implement multiview interactions"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-17 02:57:21 -05:00
Jason Ekstrand
c0420a62c9 anv/query: Increment an index while writing results
Instead of computing an index at the end which we hope maps to the
number of things written, just count the number of things as we go.
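
The pattern can be sketched as follows, assuming hypothetical flag names
and a plain array as the destination (this is not the anv code):

```c
#include <stdint.h>

/* Hypothetical flags selecting which results get written. */
#define WRITE_VALUE        (1u << 0)
#define WRITE_AVAILABILITY (1u << 1)

/* Write the requested results and return how many were written.
 * The index advances with each store, so it always matches the
 * number of elements actually emitted. */
static uint32_t write_results(uint64_t *dst, uint32_t flags,
                              uint64_t value, uint64_t available)
{
    uint32_t idx = 0;

    if (flags & WRITE_VALUE)
        dst[idx++] = value;
    if (flags & WRITE_AVAILABILITY)
        dst[idx++] = available;

    return idx;
}
```

Because the count is derived from the stores themselves, adding another
optional result cannot leave the index out of step.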

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-17 02:57:21 -05:00
Ian Romanick
df9dbc03d3 i965/fs: Don't propagate conditional modifiers from integer compares to adds
No shader-db changes on any Intel platform... which probably explains
why no bugs have been bisected to this problem since it landed in Mesa
18.1. :( The commit mentioned below is in 18.2, so 18.1 would need a
slightly different fix (due to code refactoring).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 77f269bb56 "i965/fs: Refactor propagation of conditional modifiers from compares to adds"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (reviewed the original patch)
Cc: Matt Turner <mattst88@gmail.com> (reviewed the original patch)
2018-09-17 00:38:22 -07:00
Bas Nieuwenhuizen
0dd8189f15 radv: Only allow 16 user SGPRs for compute on GFX9+.
Apparently for compute there are only 16 instead of the 32 for the
graphics path.

Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.0

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-16 12:50:58 +02:00
Bas Nieuwenhuizen
d97c892584 radv: Set the user SGPR MSB for Vega.
Otherwise using 32 user SGPRs would be broken.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-16 12:50:58 +02:00
Bas Nieuwenhuizen
02a43edf18 radv: Optimize rebinding the same descriptor set.
This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.

Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-16 12:50:19 +02:00
Gert Wollny
14976817f4 r600/sb: use safe math optimizations when TGSI contains precise operations
Fixes:
  dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3
  dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3
  dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-09-15 20:44:53 +02:00
Mauro Rossi
cc3b99bb48 android: broadcom/cle: export the broadcom top level path headers
Fixes the following building error in vc4 build:

In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34:
In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56:
gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10:
fatal error: 'cle/v3d_packet_helpers.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2018-09-15 09:14:46 +02:00
Mauro Rossi
9158e0bd82 android: broadcom/cle: add gallium include path
Fixes the following building error:

In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38:
In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29:
external/mesa/src/gallium/auxiliary/util/u_math.h:42:10:
fatal error: 'pipe/p_compiler.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2018-09-15 09:14:42 +02:00
Mauro Rossi
3341429d74 android: broadcom/genxml: fix collision with intel/genxml header-gen macro
Fixes the following building error, happening when building both intel and broadcom:

Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h
FAILED: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h
/bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \
external/mesa/src/broadcom/cle/v3d_packet_v21.xml \
> gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h"
Traceback (most recent call last):
  File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module>
    p = Parser(sys.argv[2])
IndexError: list index out of range

The header-gen macro is already defined by the Intel genxml build rules,
and the existing header-gen does not take the $(PRIVATE_VER) argument.
In fact, the bash command line logged in the build error is missing
exactly the $(PRIVATE_VER) argument.

Renaming the macro as pack-header-gen in src/broadcom/Android.genxml.mk
solves the building error, another possible way is to keep the gen rules
commands expanded and not use the macros.

Fixes: 7f80a9ff13 ("vc4: Introduce XML-based packet header generation like Intel's.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2018-09-15 09:14:33 +02:00
Caio Marcelo de Oliveira Filho
f9d25f630c anv/memcpy: fix build after starting to use addresses
The offsets now come from the anv_address; these references were not
updated and were still using the old variable.

Fixes: e1ab834557 "anv/memcpy: Use addresses instead of bo+offset"
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-09-14 21:45:50 -07:00
Jason Ekstrand
d6a73824bd anv/cmd_buffer: Take an address in emit_lrm
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-14 22:12:11 -05:00
Jason Ekstrand
e1ab834557 anv/memcpy: Use addresses instead of bo+offset
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-14 22:12:11 -05:00
Jason Ekstrand
90b46f6c17 anv/so_memcpy: Use the correct SO_BUFFER size on gen8+
This shouldn't matter as we'll never write OOB anyway but we may as well
get it right.  It's supposed to be in dwords - 1.
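
As a small sketch of that encoding (the helper name is hypothetical, and
the assumption is a buffer size that is a non-zero multiple of 4 bytes):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode a byte size as "number of dwords - 1". */
static uint32_t size_in_dwords_minus_one(uint32_t size_bytes)
{
    /* Assumed precondition: a non-zero, dword-aligned size. */
    assert(size_bytes >= 4 && size_bytes % 4 == 0);
    return size_bytes / 4 - 1;
}
```

For example, a 64-byte buffer is 16 dwords and would be encoded as 15.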

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-09-14 22:12:11 -05:00
Timothy Arceri
e29f0ede75 ac: fix get_image_coords() for radeonsi
Because this was setting image to true we would end up calling
si_load_image_desc() when we should be calling
si_load_sampler_desc().

This fixes an assert() in Deus Ex: MD

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-15 12:23:32 +10:00
Marek Olšák
914bd3014f gallium/util: don't let child processes inherit our thread affinity
v2: corrected the comment
2018-09-14 21:15:39 -04:00
Marek Olšák
7d41a7593a gallium/util: start with a random L3 cache index for AMD Zen
2018-09-14 21:05:37 -04:00
Josh Pieper
936e0dcd61 st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)
When using Freecad, I was getting intermittent segfaults inside of
mesa.  I traced it down to this path in st_cb_drawpixels.c where the
result of pipe_transfer_map wasn't being checked.  In my case, it was
returning NULL because nouveau_bo_new returned ENOENT.  I'm by no
means a mesa developer, but this patch solves the problem for me and
seems reasonable enough.

v2: Marek - also unmap the PBO and release the texture, and call
    the make_texture function sooner for less cleanup

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
2018-09-14 21:05:37 -04:00
Samuel Pitoiset
c79aad30ae radv: emit the initial config only once in the preambles
It shouldn't be needed to emit the initial graphics or compute
state when beginning a new command buffer. Emitting them in
the preamble should be enough and this will reduce IB sizes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
9de062ef20 radv: fix setting global locations for indirect descriptors
Indirect descriptors only need one entry; we don't have to
emit a location for every descriptor.

Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
748f4cce18 radv: fix flushing indirect descriptors
Let's say we first bind a graphics pipeline that needs indirect
descriptor sets. The userdata pointers will be emitted at draw
time. Then, if we bind a compute pipeline that doesn't need any
indirect descriptors, the driver will re-emit them for all
graphics stages.

To avoid this, just check the bind point type.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
063264db5b radv: fix GPU hangs with 32-bit indirect descriptors
LLVM 6 isn't affected.

Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
aa30205929 radv: handle loc->indirect correctly for the first descriptor
This was wrong for descriptor #0 when all of them are indirect.
This is because indirect_offset was 0 and we emitted a
"normal" descriptor pointer for nothing.

While we are at it remove
radv_userdata_info::indirect_offset which is useless.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
b9f6521157 radv: bump the maximum number of arguments to 64
Bumping to 64 should be safe enough.

Fixes some crashes with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
c28ea92947 radv: tidy up ac_setup_rings() for the GSVS rings
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
40fb8c7fca radv: fix setting the number of entries for GSVS on VI+
According to RadeonSI, it's unnecessary to multiply by
the stride. That field seems to always be 64.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
a006c24237 radv: always compute the number of components from the output mask
That removes two special cases for clip/cull distances.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
9447e91329 radv: emit data contiguously in the GS->VS ring buffer
Instead of having holes. The other ring parameters like
offset and stride can be updated later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
fbc064a5b4 radv: make use of the output usage mask in GS copy shader
This is just for consistency because LLVM can detect and
remove unused loads.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
f398595dca radv: improve a comment in si_emit_set_predication_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
abdf396cbe radv: fix VK_EXT_conditional_rendering visibility
It's actually just the opposite.

This fixes the new Sascha conditionalrender demo.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset
18464d298b radv: make use of ac_unpack_param() instead of ac_build_bfe()
Same code is generated because LLVM ends up by using bfe, but
that seems cleaner to me.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Timothy Arceri
21e34bab09 nir: add loop unroll support for complex wrapper loops
In GLSL IR we cheat with switch statements and simply convert them
into loops with a single iteration. This allowed us to make use of
the existing jump instruction handling provided by the loop handling
code, it also allows dead code to be cleaned up once we have
wrapped the code in a loop.

However using loops in this way created previously unrollable loops
which limits further optimisations. Here we provide a way to unroll
loops that end in a break and have multiple other exits.

All shader-db changes are from the dolphin uber shaders. There is a
small amount of HURT shaders but in general the improvements far
exceed the HURT.

shader-db results IVB:

total instructions in shared programs: 10018187 -> 10016468 (-0.02%)
instructions in affected programs: 104080 -> 102361 (-1.65%)
helped: 36
HURT: 15

total cycles in shared programs: 220065064 -> 154529655 (-29.78%)
cycles in affected programs: 126063017 -> 60527608 (-51.99%)
helped: 51
HURT: 0

total loops in shared programs: 2515 -> 2308 (-8.23%)
loops in affected programs: 903 -> 696 (-22.92%)
helped: 51
HURT: 0

total spills in shared programs: 4370 -> 4124 (-5.63%)
spills in affected programs: 1397 -> 1151 (-17.61%)
helped: 9
HURT: 12

total fills in shared programs: 4581 -> 4419 (-3.54%)
fills in affected programs: 2201 -> 2039 (-7.36%)
helped: 9
HURT: 15

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-14 16:07:36 +10:00
Timothy Arceri
2975422ceb nir: propagates if condition evaluation down some alu chains
v2:
 - only allow nir_op_inot or nir_op_b2i when alu input is 1.
 - use some helpers as suggested by Jason.

v3:
 - evaluate alu op for single input alu ops
 - add helper function to decide if to propagate through alu
 - make use of nir_before_src in another spot

shader-db IVB results:

total instructions in shared programs: 9993483 -> 9993472 (-0.00%)
instructions in affected programs: 1300 -> 1289 (-0.85%)
helped: 11
HURT: 0

total cycles in shared programs: 219476091 -> 219476059 (-0.00%)
cycles in affected programs: 7675 -> 7643 (-0.42%)
helped: 10
HURT: 1

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-14 16:07:36 +10:00
Timothy Arceri
ef4ad7baf1 nir: evaluate if condition uses inside the if branches
Since we know what side of the branch we ended up on we can just
replace the use with a constant.

All the spill changes in shader-db are from Dolphin uber shaders,
despite some small regressions the change is clearly positive.

V2: insert new constant after any phis in the
    use->parent_instr->type == nir_instr_type_phi path.

v3:
 - use nir_after_block_before_jump() for inserting const
 - check dominance of phi uses correctly

v4:
 - create some helpers as suggested by Jason.

v5 (Jason Ekstrand):
 - Use LIST_ENTRY to get the phi src

shader-db results IVB:

total instructions in shared programs: 9999201 -> 9993483 (-0.06%)
instructions in affected programs: 163235 -> 157517 (-3.50%)
helped: 132
HURT: 2

total cycles in shared programs: 231670754 -> 219476091 (-5.26%)
cycles in affected programs: 143424120 -> 131229457 (-8.50%)
helped: 115
HURT: 24

total spills in shared programs: 4383 -> 4370 (-0.30%)
spills in affected programs: 1656 -> 1643 (-0.79%)
helped: 9
HURT: 18

total fills in shared programs: 4610 -> 4581 (-0.63%)
fills in affected programs: 374 -> 345 (-7.75%)
helped: 6
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-14 16:07:36 +10:00
Erik Faye-Lund
fa5e9f1f73 virgl: adjust strides when mapping temp-resources
When we're mapping temp-resources, we clip the resource to the
transfer-box, which means the stride might not be correct any more.

So let's update the stride from the temp-resource, and recompute the
layer-stride.

This fixes crashes when running dEQP with --deqp-gl-config-name=rgba8888d24s8ms4

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: a8987b88ff "virgl: add driver for virtio-gpu 3D (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-14 10:59:02 +10:00
Pierre Moreau
21b92b3464 nvir: Always split 64-bit IMAD/IMUL operations
Those operations do not map to actual hardware instructions, therefore
those should always be lowered to 32-bit instructions.

Fixes: 009c54aa7a "nv50/ir: Split 64-bit integer MAD/MUL operations"

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-09-13 20:49:38 +02:00
Leo Liu
cb63e5d1eb st/vdpau: Use output buffer as back buffer with 24-bit color only
Using an output buffer with 8-bit video RGB as the back buffer
certainly does not work for a 30-bit color depth visual.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-09-13 14:28:32 -04:00
Leo Liu
4d8ec12f03 vl/dri: add color depth to vl winsys
For VDPAU use later

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-09-13 14:28:32 -04:00
Leo Liu
cd77d49ecf vl/dri3: add support for 10 bits format
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-09-13 14:28:32 -04:00
Leo Liu
902358de4b vl/dri: add 10 bits format supports
v2: Distinguish B10G10R10X2 and R10G10B10X2 formats for different HW.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-09-13 14:28:32 -04:00
Kristian H. Kristensen
aaafae4f55 egl/android: Declare droid_load_driver() static
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-09-13 11:12:35 -07:00
Samuel Pitoiset
d4bf954fe6 radv: fix function names for VK_EXT_conditional_rendering
Otherwise they are not exported.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-13 16:03:18 +02:00
Jason Ekstrand
1a263b377c anv: Silence a couple compiler warnings
[63/93] Compiling C object 'src/intel/vulkan/...intel@vulkan@@anv_common@sta/anv_device.c.o'.
../src/intel/vulkan/anv_device.c:685:30: warning: passing 'const char *' to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
   vk_free(&instance->alloc, instance->app_info.app_name);
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/vulkan/util/vk_alloc.h:62:51: note: passing argument to parameter 'data' here
vk_free(const VkAllocationCallbacks *alloc, void *data)
                                                  ^
../src/intel/vulkan/anv_device.c:686:30: warning: passing 'const char *' to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
   vk_free(&instance->alloc, instance->app_info.engine_name);
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/vulkan/util/vk_alloc.h:62:51: note: passing argument to parameter 'data' here
vk_free(const VkAllocationCallbacks *alloc, void *data)
                                                  ^
[65/93] Compiling C object 'src/intel/vulkan/...ommon@sta/anv_nir_apply_pipeline_layout.c.o'.
../src/intel/vulkan/anv_nir_apply_pipeline_layout.c:519:13: warning: unused variable 'image_uniform' [-Wunused-variable]
   unsigned image_uniform;

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-09-12 21:20:27 -05:00
Michel Dänzer
e34dd4f508 loader/dri3: Don't wait for fence of old buffer when re-allocating it
We only need to wait for the fence before drawing to a buffer, not
before reading from it.

This might avoid hangs when re-allocating the fake front buffer, similar
to the previous change. But I haven't seen any evidence that this was
actually happening in practice.

Tested-by: Olivier Fourdan <ofourdan@redhat.com>
2018-09-12 16:55:09 +02:00
Michel Dänzer
aefac10fec loader/dri3: Only wait for back buffer fences in dri3_get_buffer
We don't need to wait before drawing to the fake front buffer, as front
buffer rendering by definition is allowed to produce artifacts.

Fixes hangs in some cases when re-using the fake front buffer, due to it
still being busy (i.e. in use for presentation).

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/106404
Bugzilla: https://bugs.freedesktop.org/107757
Tested-by: Olivier Fourdan <ofourdan@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2018-09-12 16:53:58 +02:00
Vadym Shovkoplias
9b5c0c520f glsl/linker: Check the invariance of built-in special variables
From Section 4.6.4 (Invariance and Linkage) of the GLSL ES 1.0 specification

    "The invariance of varyings that are declared in both the vertex and
     fragment shaders must match. For the built-in special variables,
     gl_FragCoord can only be declared invariant if and only if
     gl_Position is declared invariant. Similarly gl_PointCoord can only
     be declared invariant if and only if gl_PointSize is declared
     invariant. It is an error to declare gl_FrontFacing as invariant.
     The invariance of gl_FrontFacing is the same as the invariance of
     gl_Position."
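
The quoted rules could be checked along these lines. This is a sketch under assumptions: the struct and function names are hypothetical, and the rules are read one-directionally (a fragment built-in may be invariant only when its vertex counterpart is), which may not match the linker's exact interpretation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of which built-ins were declared invariant. */
struct builtin_invariance {
   bool gl_position;     /* vertex shader */
   bool gl_point_size;   /* vertex shader */
   bool gl_frag_coord;   /* fragment shader */
   bool gl_point_coord;  /* fragment shader */
   bool gl_front_facing; /* fragment shader */
};

/* Returns true when the declarations satisfy the quoted rules. */
bool builtin_invariance_ok(const struct builtin_invariance *v)
{
   assert(v != NULL);

   /* gl_FragCoord may be invariant only if gl_Position is. */
   if (v->gl_frag_coord && !v->gl_position)
      return false;

   /* gl_PointCoord may be invariant only if gl_PointSize is. */
   if (v->gl_point_coord && !v->gl_point_size)
      return false;

   /* Declaring gl_FrontFacing invariant is always an error. */
   if (v->gl_front_facing)
      return false;

   return true;
}
```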

Fixes:
    * glsl-pcoord-invariant.shader_test
    * glsl-fcoord-invariant.shader_test
    * glsl-fface-invariant.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107734
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-09-12 11:43:21 +03:00
Tapani Pälli
30580640f2 intel/tools: fix initial position of window in aubinator viewer
Currently the position is set before the widgets are sized by GTK, so
the calculation can produce wrong results and place the window
offscreen. This patch fixes that by setting the aubfile window position
to 0,0 only once size_allocate has been called on the widget.

Now window is always positioned to 0,0 if imgui.ini is missing.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-12 11:43:21 +03:00
Erik Faye-Lund
eaa718588e winsys/virgl: avoid unintended behavior
If we end up never taking the loop that writes ret, we can end up with
an uninitialized value, and if we're *really* unlucky, that value can
be -1, causing us to go down an error-path instead of a success path.

This was obviously not intended, so let's just initialize this to zero.
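
The pattern can be sketched in isolation as follows. The function names and the status convention (1 = usable, 0 = keep looking, -1 = error) are hypothetical and only loosely modelled on the description above; this is not the virgl winsys code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-entry check: 1 = usable, 0 = keep looking, -1 = error. */
int check_entry(int entry)
{
   if (entry < 0)
      return -1;
   return entry > 0 ? 1 : 0;
}

/* Returns the last status seen, or 0 if the list was empty. */
int scan_entries(const int *entries, size_t count)
{
   assert(entries != NULL || count == 0);

   /* Initialised up front: if the loop body never runs, the caller must
    * not see an indeterminate value that might happen to be -1. */
   int ret = 0;

   for (size_t i = 0; i < count; i++) {
      ret = check_entry(entries[i]);
      if (ret != 0)
         break; /* stop on the first hit or the first error */
   }
   return ret;
}
```

With `count == 0` the loop is skipped entirely, which is exactly the case where an uninitialised `ret` would be read.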

Noticed by Valgrind:

Conditional jump or move depends on uninitialised value(s)
   at 0xBA640A0: virgl_drm_winsys_resource_cache_create (virgl_drm_winsys.c:348)
   by 0xBA62FCF: virgl_buffer_create (virgl_buffer.c:170)
   by 0xBA605AC: virgl_resource_create (virgl_resource.c:60)
   by 0xBCF816F: bufferobj_data (st_cb_bufferobjects.c:344)
   by 0xBCF816F: st_bufferobj_data (st_cb_bufferobjects.c:390)
   by 0xBB7E836: vbo_use_buffer_objects (vbo_exec_api.c:1136)
   by 0xBCFCC6E: st_create_context_priv (st_context.c:414)
   by 0xBCFD3CD: st_create_context (st_context.c:590)
   by 0xBBB30CA: st_api_create_context (st_manager.c:896)
   by 0xB981E76: dri_create_context (dri_context.c:155)
   by 0xB97BDCE: driCreateContextAttribs (dri_util.c:473)
   by 0x5288331: dri3_create_context_attribs (dri3_glx.c:309)
   by 0x5264D64: glXCreateContextAttribsARB (create_context.c:78)

Fixes: a8987b88ff ("virgl: add driver for virtio-gpu 3D (v2)")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-09-12 10:14:43 +02:00
Juan A. Suarez Romero
d631916f29 travis: use python3.5 for meson
Newer Meson versions require python >=3.5. But in Trusty default python3
version is 3.4.x.

Install python3.5 and make it the default version for Meson using the
update-alternatives method.

CC: Jan Vesely <jano.vesely@gmail.com>
CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Jon Turney <jon.turney@dronecode.org.uk>
CC: Eric Engestrom <eric.engestrom@intel.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Fixes: 3824c8e7cd "meson: disable asserts by default on release builds"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-09-11 14:27:58 +01:00
Samuel Pitoiset
3d08631fe5 radv: adjust ESGS ring buffer size computation on VI+
Noticed while working in this area. Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-11 11:30:19 +02:00
Gert Wollny
47e01e77d8 mesa/texture: Also check for LA texture when querying intensity component size
Gallium may pick L16A16_FLOAT to represent GL_INTENSITY16F if no intensity
format is provided by the driver. However, when calling

   glGetTexLevelParameteriv(..., GL_TEXTURE_INTENSITY_SIZE, ...)

mesa will return a zero size because the actually used format has no
intensity channel and as a fallback only the sizes of the red/green
channels are checked.

Also checking for LA sizes in the allocated texture resolves this problem.

v2: Only check alpha channel size and return it (Marek)
    L and A size are always the same in this case.
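
A rough sketch of the fallback chain described above. The struct, the function name, and the exact order of the fallbacks are assumptions made for illustration; the real query goes through Mesa's format helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-channel bit sizes of the format actually allocated. */
struct format_bits {
   unsigned intensity;
   unsigned red;
   unsigned green;
   unsigned luminance;
   unsigned alpha;
};

/* Size to report for GL_TEXTURE_INTENSITY_SIZE. */
unsigned intensity_size(const struct format_bits *f)
{
   assert(f != NULL);

   /* A true intensity format reports its own size. */
   if (f->intensity)
      return f->intensity;

   /* Intensity stored in an RGB-style format: use the red/green size. */
   unsigned rg = f->red < f->green ? f->red : f->green;
   if (rg)
      return rg;

   /* Intensity stored as luminance-alpha: L and A have the same size,
    * so returning the alpha size is enough. */
   return f->alpha;
}
```

For an L16A16_FLOAT-backed texture the first two checks yield zero and the alpha size, 16, is returned.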

Fixes (on virgl):
  ext_framebuffer_multisample-fast-clear GL_ARB_texture_float *

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107832

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-11 09:07:05 +02:00
Ilia Mirkin
133e12fb69 nv50,nvc0: warn on not-explicitly-handled caps
Not handling caps explicitly means that we're likely getting incorrect
values -- these need to be reviewed and set appropriately.

While we're at it, add in some missing caps, and set all the subpixel
stuff to 8 as that seems to be what the blob reports.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-11 01:25:19 -04:00
Timothy Arceri
e66c2158f8 mesa: remove duplicate dispatch sanity tests
This removes duplicate tests from gl_core_functions_possible
that are already covered by common_desktop_functions_possible.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-11 10:13:31 +10:00
Timothy Arceri
355a5ef761 mesa: tidy up init_matrix_stack()
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-09-11 09:26:04 +10:00
Christopher Egert
51995f6920 radeon: fix ColorMask
Since commit af3685d149 various OpenGL applications regressed
on the classic mesa radeon driver.

Signed-off-by: Christopher Egert <cme3000@gmail.com>
CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-09-10 16:57:20 -04:00
Elie Tournier
9179c745f6 gallium: Correctly handle no config context creation
This patch fixes the following Piglit test:
spec@egl_mesa_configless_context@basic
It also fixes a few tests in a virgl guest.

v2: Evaluate the value of no_config (Ilia)

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-09-10 15:30:17 -04:00
Bas Nieuwenhuizen
f6e09db2e6 radv: Support v3 of VK_EXT_vertex_attribute_divisor.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.2 <mesa-stable@lists.freedesktop.org>
2018-09-10 21:26:17 +02:00
Marek Olšák
867f7aaed2 radeonsi/nir: port some bindless and sampler code from TGSI
These might be all missing changes for bindless textures.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:23:21 -04:00
Marek Olšák
b00deed66f radeonsi: adjust and simplify max_alloc_size determination
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
203ef19f48 radeonsi: split si_copy_buffer
compute and SDMA will be added into it.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
986d6f12fb radeonsi: don't call VBO prefetch with size=0
for the next commit.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
1119fe5c25 radeonsi: merge SI and CI dma_clear_buffer and remove the callback
also use assertions for the requirements that offset and size are a multiple
of 4.
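
The kind of assertion meant here can be sketched with a CPU stand-in for a dword-granular clear. The function name and signature are hypothetical; the real function programs the DMA engine rather than writing memory directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPU stand-in for a dword-granular buffer clear. offset and size are in
 * bytes; returns the number of dwords written. */
size_t clear_buffer_dwords(uint32_t *buf, size_t offset, size_t size,
                           uint32_t value)
{
   assert(buf != NULL);
   assert(offset % 4 == 0); /* requirement: dword-aligned start */
   assert(size % 4 == 0);   /* requirement: whole number of dwords */

   size_t first = offset / 4;
   size_t count = size / 4;
   for (size_t i = 0; i < count; i++)
      buf[first + i] = value;
   return count;
}
```

Asserting instead of silently rounding makes a misaligned caller fail loudly in debug builds.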

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
be0bd95abf radeonsi: fix GPU hangs with bindless textures and LLVM 7.0
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
fa595e3d0c ac: remove deprecated use of LLVMInt1Type()
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
cc36ebbdc3 ac: use iN_0/1 constants
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
bc09c3d59e ac: add radeon_info::num_good_cu_per_sh
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
a5f35aa742 ac: revert new LLVM 7.0 behavior for fdiv
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
662db03577 radeonsi: fix printing a BO list into ddebug reports
important for debugging

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
da72b6296c r600: fix HTILE for NPOT textures with mipmapping
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
d4e52281aa winsys/radeon: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
a1b9a00f82 radeonsi: fix HTILE for NPOT textures with mipmapping on SI/CI
VI uses addrlib so it's unaffected.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Brian Paul
5162735957 docs: document new features/extensions in driver for WS 15 / Fusion 11
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
7baf45dfc7 svga: assorted fixes/changes in svga_pipe_blit.c
To align the code with VMware's in-house copy.

Signed-off-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
25fceccf72 svga: set buffer bind_flags in svga_buffer_add_host_surface()
To match the in-house VMware code.

Signed-off-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
337a74aa40 svga: add format conversion for legacy formats
This patch extends the format_conversion table to support
different view formats on texture buffer.
For legacy image formats such as INTENSITY, LUMINANCE, LUMINANCE_ALPHA,
special swizzle masks will be used on the red or RG channels.

This fixes piglit test arb_texture_buffer_object-formats fs|vs arb

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
389450a271 svga: remove obsolete code to reemit gs binding
The svga_reemit_gs_bindings function is no longer needed. Remove it.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
c174ee9f9d svga: move variant->fs_shadow_compare_units assignment
Fixes a crash since the variant object isn't allocated until later
in the function.  Not sure how this got through.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
cb70474b20 svga: fix resource checking in is_blending_enabled()
This patch makes sure a valid color buffer is bound before
checking its resource. This fixes Unigine Valley running in SM41 device.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Neha Bhende
c6103328ab svga: Use texture_copy_region instead of texture_copy_handle for multisampling
This fixes some of the test cases in arb_copy_image-formats and also fixes
SurfaceCopy-related errors in vmware.log when multisampled surfaces are
used.

Tested with piglit, glretrace on windows and linux VM.

v2: As per Brian's comment

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
fdf5885183 svga: add missing devcap check for texture array support
The patch checks DXFMT_ARRAY devcap for texture array support.

Tested with MTT-piglit. No regressions.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
3069581260 svga: no need to check MULTISAMPLE devcap for view format
According to the current SVGA contract, any view format can be
used on the underlying resource that is multisample. So there
is no need to check the MULTISAMPLE devcap for the view format.

Fixes black rendering issue with Tropics running with 4xMSAA.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
6f254ad9b4 svga: sync devcap name changes in svga3d_devcaps.h
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
49428c8d61 svga: explicit set DXFMT_SHADER_SAMPLE for DS format for pre-SM41 device
Explicitly set the DXFMT_SHADER_SAMPLE bit for depth stencil formats
for pre-SM41 devices only. This bit is now set by the SM41 device.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
379a2f265f svga: remove unused variable
Trivial.
2018-09-10 13:07:30 -06:00
Brian Paul
cbcc416a58 svga: draw round points when msaa is enabled
See comments for details.  This allows the piglit
ext_framebuffer_multisample-point-smooth test to pass.

Also, test the pipe_rasterizer_state::point_quad_rasterization field
to see if sprite point rasterization is needed because it's possible
for no sprite_coord_enable bits to be set when drawing sprites.

Finally, remove old, stale comments.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
6b039c7d7c svga: check number of samples before emitting MSAA decls/opcodes
If real MSAA is not available, we only support 1 sample/pixel.  In that
case, we must not declare MSAA resources or emit MSAA opcodes.  Do that
by checking the sample count.

Fixes several piglit MSAA tests, such as
arb_texture_multisample-sample-depth (when the hard-coded sample count
of 4 is fixed in that test).

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
cf2fb6813c svga: remove obsolete comment on format_cap_table[]
We removed the special cases referred to in this comment in the commit
"svga: add a separate function to get dx format capabilities from
vgpu10 device".

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
0fc6c17bf2 svga: allow TGSI_TEXTURE_CUBE_ARRAY in emit_tg4()
Technically, SM4.1 doesn't support cube map arrays, but our backend
renderers actually do.  This allows the Piglit textureGather cube
map array tests to pass.

Tested with GLrenderer, DX11renderer and SWrenderer.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
3467a274e0 svga: no dma on multisample surface
Force direct map on multisample surface.

Fixes SVGA Driver Errors running multisample piglit tests on Linux VM

v2: use texture for the check.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
5f14444184 svga: src surface for IntraSurfaceCopy cannot be multisample
Fixes SVGA Driver Errors with piglit test arb_copy_image-targets

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
026e1ad7bb svga: fix missing format multisample devcap check
In commit e4048f6cd1, svga_is_dx_format_supported() is supposed to
also check the SVGA3D_DXFMT_MULTISAMPLE bit for multisample
support of a format. Somehow that code is not included in that commit.
This patch fixes it.

Fixes piglit test spec@ext_framebuffer_multisample@formats all_samples.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
285d8b47b1 svga: fix incorrect multisample support in VGPU9 device
Commit e4048f6cd1 unintentionally allows multisample support for VGPU9 device.
This patch fixes this regression.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
59a56ca1c8 svga: fix the missing devcap for SVGA3D_BC3_UNORM_SRGB
Set the devcap to SVGA3D_DEVCAP_DXFMT_BC3_UNORM_SRGB

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
16666eb470 svga: add a separate function to get dx format capabilities from vgpu10 device
Currently we have one function to get format capabilities and
we convert DX10 devcaps back to DX9. This can be confusing.
Going forward we will have a separate function for dealing with dx formats.

This patch also fixes the depth stencil devcap. Instead of hardcoding
the capabilities for the depth stencil formats, we will inquire the
device for the capabilities. Note: we will still need to explicitly set
the SVGA3D_DXFMT_SHADER_SAMPLE bit for SVGA3D_R32_FLOAT_X8X24 and
SVGA3D_R24_UNORM_X8 since this bit is not advertised but supported
by the device.

v2: reapply the patch after svga_is_format_supported is moved to svga_format.c

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
b1aee7ff05 svga: assign a separate function for is_format_supported() for vgpu10 device
This patch adds a new function svga_is_dx_format_supported() to check
for format support in a VGPU10 device.

v2: reapply the patch after svga_is_format_supported is moved to svga_format.c

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
1ea9c80d6d svga: add some devcap debugging code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
96ef81e39e svga: fix depth and coverage mask output declaration
Set the component mask to zero for both registers.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
7187a2f7ff svga: add sample positions for 2 samples
Fixes piglit tests spec@arb_sample_shading@builtin-gl-sample-position 2
                   spec@arb_texture_multisample@fb-completeness@2

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
73c850fb9a svga: check sample count devcaps
Check sample count devcaps from the svga device to determine the
supported sample counts.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
afacde3553 svga: fix 1-element cube map array issue
As with 1D and 2D array textures, if there's only one array element
(one cubemap in this case) we have to issue different shader code.

This fixes a number of Piglit cubemap array tests.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
767c1eb436 svga: simplify array test in svga_init_shader_key_common()
Also squash in a patch to silence a compiler warning (add a
default case to the switch statement).

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
45517f492b winsys/drm: check for CAPS2/SM41 support if VGPU10 is enabled
No need to check for HW_CAPS2 or SM4_1 support if VGPU10 is not
enabled or is explicitly disabled via the environment variable
SVGA_VGPU10.

Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-09-10 13:07:30 -06:00
Deepak Rawat
159e706c4c winsys/drm: Add support for quality level in surface ioctl
A new argument "quality level" is added in surface define v3 which
represents precision settings for surface. This commit adds support
for quality level in DRM_VMW_GB_SURFACE_CREATE_EXT and
DRM_VMW_GB_SURFACE_REF_EXT.

Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
b343c6915c svga: sync svga3d_types.h with upstream changes
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
b5827db2ea winsys/drm: enable intra_surface_copy if HW_CAP2 is supported
With drm version 2_15, we can inquire for support of HW_CAP2.
If it is supported, we can enable intra_surface_copy support.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
7448bb0089 svga: add git version logging at init time
Before we can log the git version in the host log,
we'll add the git version in the init debug message.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
4669ffd29b svga: fix a typo in svga_texture_copy_region()
Trivial.
2018-09-10 13:07:30 -06:00
Charmaine Lee
3233d05390 svga: use helper function to do copy region
Use the common helper function svga_texture_copy_region
for copy region command.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
74791b80b9 svga: fix cubemap array rendering with backed surface view
This patch fixes the layer index when rendering to a
backed surface view of a cubemap array.

Fixes piglit test fbo-generatemipmap-cubemap array.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
2d39e6d0c8 svga: add a helper function to send ResolveCopy command
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
9a24b08a49 svga: sync svga3d header files
This is a squash of what was originally three commits.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
f3eda3e5e1 svga: add SM4_1 enable debug print
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
ccd895db76 svga: fix swizzling for texture gather
Texture swizzling for texture gather needs to be done to the selected texels
rather than to the returned vector. This patch has special cases
for the different swizzles in emit_tg4().

Fixes a lot of piglit texture gather tests.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
be1993d6ed svga: fix starting index for system values
Currently, the starting index for system values is assigned to
the next index after the highest index of the tgsi declared input registers.
But the tgsi index might be different from the actual assigned index, hence
this might cause overlap of indices.
With this patch, the shader linker keeps track of the highest index of the
translated input registers, and the next index will be used for the
starting index for system values.
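
A small sketch of the index computation described above. The function name and the flat array of translated indices are hypothetical; the actual bookkeeping lives in the shader linker.

```c
#include <assert.h>
#include <stddef.h>

/* translated[i] is the register index actually assigned to declared
 * input i. Returns the first index that is free for system values. */
unsigned first_system_value_index(const unsigned *translated,
                                  size_t num_inputs)
{
   assert(translated != NULL || num_inputs == 0);

   unsigned next = 0;
   for (size_t i = 0; i < num_inputs; i++) {
      /* Track the highest translated index, not the declared one. */
      if (translated[i] >= next)
         next = translated[i] + 1;
   }
   return next;
}
```

For example, if declared inputs 0, 1 and 2 are translated to registers 0, 3 and 1, starting after the highest declared index would pick 3 and collide with an input; starting after the highest translated index picks 4.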

Fixes SHIM errors running arb_copy_image-formats on SM4_1 device.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Deepak Rawat
569f838987 winsys/svga: Add support for new surface ioctl, multisample pattern
Kernel driver version 2.15 added new surface ioctl named:
DRM_VMW_GB_SURFACE_CREATE_EXT
DRM_VMW_GB_SURFACE_REF_EXT

The new ioctl has support for 64-bit svga3d_flags if
DRM_VMW_PARAM_SM4_1 is available.

Multisampling surface mob size calculation is added. Also synced the
relevant header update.

The svga device extended the surface define command V3 with a new
multisampling pattern parameter. Add support for that in winsys.

Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
3f55425ee6 svga: enable MSAA for SM4_1 device
The SVGA device is deprecating the DX9 MSAA support.
This patch enables MSAA for SM4_1 device by explicitly
setting the SVGA3D_SURFACE_MULTISAMPLE bit.
For SM4_1 devices, only 4 samples are supported.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
8088cb6f53 svga: add sample count to the surface_can_create interface
With this patch, sample count is also taken into account
when determining if a resource can be created.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
4a1976bfcf svga: implement support for GL_ARB_texture_query_lod
Just translate the TGSI LODQ instruction to the VGPU10 LOD instruction.
All (4) Piglit GL_ARB_texture_query_lod tests pass.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-09-10 13:07:30 -06:00
Neha Bhende
252e97ecdf svga: Add support for arb_texture_gather
With sm4_1, we can support single channel 2D or CubeMap textures.
This patch exercises this feature.

Tested with piglit

v2: As per Brian's comment

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
36c84bcd77 svga: add support for interpolation at sample position
Vs. sampling at the centroid or the fragment center.

Note that this does not fix failures with the Piglit
arb_sample_shading-interpolate-at-sample-position or
arb_sample_shading-ignore-centroid-qualifier.exe tests at this time.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
bcf7aaa9f7 svga: clarify sys value -> input register mapping
We translate TGSI system value registers to VGPU10 input registers.
Add a comment and set file = TGSI_FILE_INPUT.  That's not strictly
necessary since we map both TGSI_FILE_INPUT and TGSI_FILE_SYSTEM_VALUE
to VGPU10_OPERAND_TYPE_INPUT, but this makes the code a bit more
understandable.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
9de5bdb341 svga: add support for FS sample mask output
This, with the previous work for sample position/id query, allows
us to enable per-sample shading for VGPU 10.1.

Note that quite a few Piglit arb_sample_shading tests still do not
pass, but many do.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
0a219dd918 svga: add support for sample id, sample position
Sample ID is just a system value.  Sample position must be implemented
with the VGPU10_OPCODE_SAMPLE_POS instruction.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
ac4a0c0e82 svga: implement no-op svga_set_min_samples()
This is part of the per-sample shading feature (PIPE_CAP_SAMPLE_SHADING).

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
3c3fc7154e svga: add support for independent blend function per render target
This patch adds support for GL_ARB_draw_buffers_blend extension
for SM4_1 device.

Fixes piglit test fbo-draw-buffers-blend.

This patch is squashed with a subsequent patch which fixed a
regression.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
5512f943b8 svga: emit shader version as 4.0 or 4.1 depending on device support
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
1d806b6f13 svga: restructure nested if's in emit_src_register()
To make it cleaner for subsequent changes.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
16439085f5 svga: sync VGPU10ShaderTokens.h with upstream changes
This includes new DX 10.1 opcodes and tokens.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
22e8099711 svga: add support for shadow cubemap array
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
f929247d24 svga: add support for rendering to cubemap array
Fixes piglit test arb_texture_cube_map_array-fbo-cubemap-array

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
1df17fc697 svga: add support for TXL2 opcode
This patch adds support for cubemap array texture lookup with
explicit LOD.

Fixes piglit test arb_texture_cube_map_array-cubemap-lod

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Charmaine Lee
62402be407 svga: add support for cubemap array
This patch adds support for cubemap array for SM4_1.

Fixes piglit test arb_texture_cube_map_array-cubemap

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Brian Paul
018ff0112f svga: add have_sm4_1 flag, helper function
Signed-off-by: Brian Paul <brianp@vmware.com>
2018-09-10 13:07:30 -06:00
Marek Olšák
d211679017 gallium/u_inlines: remove the destroy variable in pipe_reference_described
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 14:53:01 -04:00
Marek Olšák
ed880fe192 gallium/u_inlines: improve pipe_reference_described perf for debug builds
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-09-10 14:53:01 -04:00
Marek Olšák
c042a34b14 gallium/auxiliary: don't dereference counters twice needlessly
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 14:52:32 -04:00
Marek Olšák
61767c059e gallium/u_inlines: normalize naming, use dst & src, style fixes (v2)
v2: update comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 14:52:32 -04:00
Marek Olšák
9f1bbbdbbd util: try to fix the Android and MacOS build
Bionic does not have pthread_setaffinity_np.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107869
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-09-10 14:49:07 -04:00
Jason Ekstrand
6f00785765 anv: Support v3 of VK_EXT_vertex_attribute_divisor
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-10 13:45:32 -05:00
Jason Ekstrand
34a17a48d4 vulkan: Update the XML and headers to 1.1.84
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-10 13:30:21 -05:00
Sergii Romantsov
bbe551f3ea mesa/meson: 32bit xmlconfig linkage
Building 32-bit Mesa with Meson causes a linkage issue:
"undefined reference to `util_get_process_name'"
Fixed by adding link-with mesa_util for xmlconfig primary.

v2: Removed '[]', commit message corrected.

v3: Reverted changes in gbm and glx libraries.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843
Fixes: 2e1e6511f7 "util: extract get_process_name from xmlconfig.c"
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-10 08:57:42 -07:00
Jose Fonseca
52ca32121b Require Visual Studio 2015.
We no longer need or use Visual Studio 2013.

https://ci.appveyor.com/project/jrfonseca/mesa/build/52

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-09-10 10:10:16 +01:00
Jose Fonseca
d5f934522d util: Make util_context_thread_changed a no-op on Windows.
Despite using thrd_t types, these functions are wed to pthreads, and break
Windows builds, because thrd_current() is not implemented there, as it's
impossible to have an efficient thrd_current() implementation on
Windows.

Trivial.
2018-09-10 10:10:16 +01:00
Erik Faye-Lund
c4017106bb virgl: do not map zero-sized resource
When creating textures, we avoid creating backing-store for all
multisampled textures, not just depth buffers.

So we can't try to map them later. That's just going to fail. So
let's take the blit-based code-path that seems to avoid this problem.

This makes this piglit test-case no longer crash (although it still
fails):

bin/copyteximage 2D -samples=2 -auto

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-10 10:35:42 +02:00
Erik Faye-Lund
8083464013 virgl: remove dead code
We don't use the size we calculate in this function, so let's just
drop the calculation

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-10 10:35:32 +02:00
Erik Faye-Lund
b9c40e492d virgl: drop needless return-code
We always return TRUE, and we never check the return-value. Let's
just drop the return value instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-10 10:35:20 +02:00
Erik Faye-Lund
9635869d73 virgl: free trans on map-error
When we fail to map memory, we should also free trans to avoid
leaking memory.

Noticed while reading code.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-10 10:35:02 +02:00
Chris Wilson
44e3e6a9b4 i965: Bump aperture tracking to u64
As a prelude to handling large address spaces, first allow ourselves the
luxury of handling the full 4G.

Reported-by: Andrey Simiklit <asimiklit.work@gmail.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-10 09:14:46 +01:00
Mathias Fröhlich
2fece204c0 etnaviv: Reduce max offset to available hardware bits.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-09-10 07:59:31 +02:00
Mathias Fröhlich
4569bc6ad0 gallium: New cap PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET.
Introduce a new capability for the maximum value of
pipe_vertex_element::src_offset. Initially just every driver
backend returns the value previously set from _mesa_init_constants.
So this shall end up in no functional change.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-09-10 07:59:31 +02:00
Dave Airlie
240af61494 virgl: don't send a shader create with no data. (v2)
This fixes the situation where we'd send a shader with just the
header and no data.

piglit/glsl-max-varyings test was causing this to happen, and
the renderer fix was breaking it.

v2: drop fprintf

Fixes: a8987b88ff "virgl: add driver for virtio-gpu 3D (v2)"
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-09-10 12:23:30 +10:00
Timothy Arceri
14fe9fa11b mesa: enable ARB_vertex_buffer_object in core profile
This extension is required by "Wolfenstein: The Old Blood"
and is exposed in core in the Nvidia binary driver.

All the functions are just aliases of the core functions so
there should be nothing more to do.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-09-08 14:35:09 +10:00
Marek Olšák
21ca322e63 st/mesa: throttle texture uploads if their memory usage goes beyond a limit
This prevents radeonsi from running out of memory. It also increases
texture upload performance by being nice to the kernel memory manager.
2018-09-07 17:59:02 -04:00
Marek Olšák
9ce2cef68f gallium: add PIPE_CAP_MAX_TEXTURE_UPLOAD_MEMORY_BUDGET
2018-09-07 17:59:02 -04:00
Andres Gomez
ecfe41e690 docs: update calendar, add news item and link release notes for 18.2.0
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-09-08 00:40:43 +03:00
Andres Gomez
5382a90cb2 docs: add sha256 checksums for 18.2.0
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit cb1ddf48e2)
2018-09-08 00:28:23 +03:00
Andres Gomez
65f3327db6 docs: update 18.2.0 release notes
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 7378180e7a)
2018-09-08 00:28:21 +03:00
Marek Olšák
7ac52c2e38 Revert "gallium/os_thread: simplify helper pipe_current_thread_get_time_nano"
This reverts commit 6d477bc546.

It fixes the Windows build hopefully.
2018-09-07 16:52:36 -04:00
Jason Ekstrand
465e5a868c anv: Clamp scissors to the framebuffer boundary
The Vulkan 1.1.81 spec says:

    "It is legal for offset.x + extent.width or offset.y + extent.height
    to exceed the dimensions of the framebuffer - the scissor test still
    applies as defined above. Rasterization does not produce fragments
    outside of the framebuffer, so such fragments never have the scissor
    test performed on them."

Elsewhere, the Vulkan 1.1.81 spec says:

    "The application must ensure (using scissor if necessary) that all
    rendering is contained within the render area, otherwise the pixels
    outside of the render area become undefined and shader side effects
    may occur for fragments outside the render area. The render area
    must be contained within the framebuffer dimensions."

Unfortunately, there's some room for interpretation here as to what the
consequences are of having the render area set to exactly the
framebuffer dimensions and having a scissor that is larger than the
framebuffer.  Given that GL and other APIs provide automatic clipping to
the framebuffer, it makes sense that applications would assume that
Vulkan does this as well.  It costs us very little to play it safe and
just clamp client-provided scissors to the framebuffer dimensions.
Fortunately, the user is required to provide us with at least one
scissor so we don't need to handle the case where they don't.

Fixes: fb2a5ceb32 "anv: Emit DRAWING_RECTANGLE once at driver..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-07 15:19:02 -05:00
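
The clamping described in the scissor commit above can be sketched as
follows. This is only an illustration, not the anv code: the names
sketch_scissor and sketch_clamp_scissor are hypothetical, and the sums are
computed in 64 bits on the assumption that offset + extent may not fit in
32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scissor rectangle: signed offset, unsigned extent. */
struct sketch_scissor {
   int32_t x, y;
   uint32_t width, height;
};

static int64_t clamp_i64(int64_t v, int64_t lo, int64_t hi)
{
   assert(lo <= hi);
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamp a client-provided scissor so it lies inside the framebuffer.
 * The result may be empty (width or height of 0). */
static struct sketch_scissor
sketch_clamp_scissor(struct sketch_scissor s,
                     uint32_t fb_width, uint32_t fb_height)
{
   /* Work on half-open ranges [x0, x1) and [y0, y1) in 64 bits. */
   int64_t x0 = clamp_i64(s.x, 0, fb_width);
   int64_t y0 = clamp_i64(s.y, 0, fb_height);
   int64_t x1 = clamp_i64((int64_t)s.x + s.width, x0, fb_width);
   int64_t y1 = clamp_i64((int64_t)s.y + s.height, y0, fb_height);

   struct sketch_scissor out = {
      (int32_t)x0, (int32_t)y0,
      (uint32_t)(x1 - x0), (uint32_t)(y1 - y0),
   };
   return out;
}
```

For example, a 1000x1000 scissor at offset (10, 20) on a 640x480
framebuffer would come back as 630x460 at the same offset.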
Jason Ekstrand
b08b4b2b25 anv: Disable the vertex cache when tessellating on SKL GT4
I have no idea if I'm correct about what's going wrong or if this is the
correct fix.  However, in my multiple weeks of banging my head on this
hang, a VUE reference counting bug seems to match all the symptoms and
it definitely fixes the hang.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107280
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-07 15:19:02 -05:00
Jason Ekstrand
5dee89438a anv: Implement a VF cache invalidate workaround
Known to fix nothing whatsoever but it's in the docs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-07 15:19:02 -05:00
Jason Ekstrand
c643c5e18d anv: Re-emit vertex buffers when the pipeline changes
Some of the bits of VERTEX_BUFFER_STATE such as access type, instance
data step rate, and pitch come from the pipeline.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-07 15:19:02 -05:00
Marek Olšák
25ffb84016 radeonsi: pin the winsys thread to the requested L3 cache (v2)
v2: rebase

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-07 16:03:36 -04:00
Marek Olšák
8016639f63 gallium/u_threaded: implement set_context_param for thread pinning (v2)
v2: - use set_context_param
    - set set_context_param even if the driver doesn't implement it

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-07 16:03:36 -04:00
Marek Olšák
8d473f555a st/mesa: pin driver threads to a specific L3 cache on AMD Zen (v2)
v2: use set_context_param

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-07 16:03:30 -04:00
Marek Olšák
e5e3b5cdcc gallium: add pipe_context::set_context_param for tuning perf on AMD Zen (v2)
State trackers will not use the new param directly, but will instead use
a helper in MakeCurrent that does the right thing.

v2: rework the interface

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-07 15:48:31 -04:00
Marek Olšák
6d477bc546 gallium/os_thread: simplify helper pipe_current_thread_get_time_nano
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-07 15:48:31 -04:00
Marek Olšák
15fa2c5e35 gallium/u_cpu_detect: get the number of cores per L3 cache for AMD Zen
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-07 15:48:31 -04:00
Marek Olšák
ce432e259d gallium/u_cpu_detect: fix parsing the CPU family
According to:
https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf

Also Intel:
https://www.microbe.cz/docs/CPUID.pdf

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-07 15:48:31 -04:00
Marek Olšák
a84fd58f48 gallium/u_cpu_detect: fix a race condition on initialization
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-07 15:48:31 -04:00
Dylan Baker
8396043f30 Replace uses of _mesa_bitcount with util_bitcount
and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem
in nir for platforms that don't have popcount or popcountll, such as
32bit msvc.

v2: - Fix additional uses of _mesa_bitcount added after this was
      originally written

Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-09-07 10:21:26 -07:00
Dylan Baker
80825abb5d move u_math to src/util
Currently we have two sets of functions for bit counts, one in gallium
and one in core mesa. The ones in core mesa are header only in many
cases, since they reduce to "#define _mesa_bitcount popcount", but they
provide a fallback implementation. This is important because 32bit msvc
doesn't have popcountll, just popcount; so when nir (for example)
includes the core mesa header it doesn't (and shouldn't) link with core
mesa. To fix this we'll promote the version out of gallium util, then
replace the core mesa uses with the util version, since nir (and other
non-core mesa users) can and do link with mesautils.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-09-07 10:21:26 -07:00
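
To illustrate the kind of fallback the u_math commit above refers to, here
is a minimal sketch of a bit count that uses a compiler builtin where one
is known to exist and a portable loop elsewhere. The names sketch_bitcount
and sketch_bitcount_64 are hypothetical; this is not the util_bitcount
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Count the set bits in a 32-bit value. */
static unsigned sketch_bitcount(uint32_t n)
{
   unsigned bits;
#if defined(__GNUC__)
   /* GCC and Clang provide a popcount builtin. */
   bits = (unsigned)__builtin_popcount(n);
#else
   /* Portable fallback: clear the lowest set bit until none remain. */
   for (bits = 0; n != 0; bits++)
      n &= n - 1;
#endif
   assert(bits <= 32);
   return bits;
}

/* The 64-bit variant is built from two 32-bit counts, so it does not
 * depend on a 64-bit popcount being available. */
static unsigned sketch_bitcount_64(uint64_t n)
{
   return sketch_bitcount((uint32_t)n) +
          sketch_bitcount((uint32_t)(n >> 32));
}
```

Building the 64-bit count from two 32-bit counts is one way to avoid
needing popcountll on platforms that lack it.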
Dylan Baker
aa4386ebfe docs: update calendar, add news item and link release notes for X.Y.Z
Signed-off-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-07 10:19:33 -07:00
Dylan Baker
d514f55611 docs/relnotes: Add sha256 sums for mesa 18.1.8
2018-09-07 10:17:38 -07:00
Dylan Baker
f6a9f44529 docs: Add release notes for 18.1.8
2018-09-07 10:17:36 -07:00
Jason Ekstrand
f9e630e23d i965: Workaround the gen9 hw astc5x5 sampler bug
gen9 hardware has a bug in the sampler cache that can cause GPU hangs
whenever a texture with aux compression enabled is in the sampler cache
together with an ASTC5x5 texture.  Because we can't control what the
client binds at any given time, we have two options: resolve the CCS or
decompress the ASTC.  Doing a CCS or HiZ resolve is far less drastic
and will likely have a smaller performance impact.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-09-07 10:42:40 -05:00
Eric Anholt
a91b158bd9 v3d: Fix setup of the VCM cache size.
There were two bugs working together to make things mostly work: I wasn't
dividing the VPM output size available by the size of a batch (vertex),
but I also had the size of the VPM reduced by a factor of 8.

Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and it
seems also my intermittent varying failures.

Fixes: 1561e4984e ("v3d: Emit the VCM_CACHE_SIZE packet.")
2018-09-07 08:11:38 -07:00
Eric Anholt
f73f748323 v3d: Fix SRC_ALPHA_SATURATE blending for RTs without alpha.
Fixes
dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.dst.src_alpha_saturate_src_alpha_saturate
and friends with --deqp-egl-config-name=rgb565d0s0

Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-09-07 08:11:05 -07:00
Lionel Landwerlin
69874e9a6a intel/genxml: turn SLM Enable bit into boolean
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-07 14:46:20 +01:00
Sergii Romantsov
97fcccb25e i965/tools: 32bit compilation with meson
Building 32-bit Mesa with Meson causes an error:
"implicit declaration of function ‘__builtin_ia32_clflush’".
Fixed by adding the -msse2 compilation flag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843
Fixes: 314879f7fe (i965: Fix asynchronous mappings on !LLC platforms.)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-07 13:46:48 +01:00
Sergii Romantsov
d709f12792 intel: compiler option msse2 and mstackrealign
For 32-bit libraries, building with -msse2 seems to cause stack
corruption or incorrect instructions.
Adding -mstackrealign fixes that case.

v2: Fixed meson.

v3: Definition of c_sse2_args moved on the top (L.Landwerlin).
    Added mstackrealign for Android's mks where msse4.1 is used.

v4: Added for Vulkan also.

v5: Commit message correction.

CC: <mesa-stable@lists.freedesktop.org>
Fixes: 6b05c080f2 (i965: Compile with -msse3)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107779
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-07 13:45:46 +01:00
Rob Clark
5404e0637f freedreno: fix rast->depth_clip_near/far
Fixes: daa19363de gallium: split depth_clip into depth_clip_near & depth_clip_far
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-07 07:41:43 -04:00
Marek Olšák
fda7683726 gallium: enable GL_AMD_depth_clamp_separate on r600, radeonsi
2018-09-06 21:53:00 -04:00
Marek Olšák
daa19363de gallium: split depth_clip into depth_clip_near & depth_clip_far
for AMD_depth_clamp_separate.
2018-09-06 21:53:00 -04:00
Jason Ekstrand
7b26741806 anv/pipeline: Only consider double elements which actually exist
The brw_vs_prog_data::double_inputs_read field comes directly from
shader_info::double_inputs which may contain inputs which are not
actually read.  Instead of using it directly, AND it with inputs_read,
which contains only the inputs that are read.  Otherwise, we may end up
subtracting too many elements when computing elem_count.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103241
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-06 16:07:50 -05:00
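
The masking described in the anv/pipeline commit above can be sketched
like this. It is a simplified illustration under stated assumptions, not
the driver's code: it assumes each double input sets two bits in both
bitfields and counts as a single vertex element, and the names
sketch_popcount64 and sketch_elem_count are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 64-bit population count. */
static unsigned sketch_popcount64(uint64_t v)
{
   unsigned bits = 0;
   for (; v != 0; bits++)
      v &= v - 1;
   return bits;
}

/* Compute the number of vertex elements. double_inputs may name inputs
 * that exist but are never read, so it is masked with inputs_read before
 * the double elements are subtracted. */
static unsigned sketch_elem_count(uint64_t inputs_read,
                                  uint64_t double_inputs)
{
   uint64_t double_inputs_read = double_inputs & inputs_read;
   unsigned total = sketch_popcount64(inputs_read);
   /* Assumption: two bits per double input, one element each. */
   unsigned elements_double = sketch_popcount64(double_inputs_read) / 2;

   assert(elements_double <= total);
   return total - elements_double;
}
```

With inputs_read = 0x07 and double_inputs = 0x1e, only bits 1 and 2 are
both double and read, so one element is subtracted. Without the mask, the
unread double input in bits 3 and 4 would be subtracted as well.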
Jason Ekstrand
44ec31cd75 nir: Drop the vs_inputs_dual_locations option
It was very inconsistently handled; the only things that made use of it
were glsl_to_nir, glspirv, and nir_gather_info.  In particular,
nir_lower_io completely ignored it so anyone using nir_lower_io on
64-bit vertex attributes was going to be in for a shock.  Also, as of
the previous commit, it's set by every driver that supports 64-bit
vertex attributes.  There's no longer any reason to have it be an option
so let's just delete it.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-09-06 16:07:50 -05:00
Jason Ekstrand
0909a57b63 radeonsi/nir: Set vs_inputs_dual_locations and let NIR do the remap
We were going out of our way to disable dual-location re-mapping in NIR
only to then do the remapping in st_glsl_to_nir.cpp.  Presumably, this
was so that double_inputs would be correct for the core state tracker.
However, now that we've moved it to gl_program::DualSlotInputs, which is
unaffected by NIR lowering, we can let NIR lower things for us.  The one
tricky bit here is that we have to remap the inputs_read bitfield back
to the single-slot convention for the gallium state tracker to use.

Since radeonsi is the only NIR-capable gallium driver that also supports
GL_ARB_vertex_attrib_64bit, we only have to worry about radeonsi when
making core gallium state tracker changes.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-09-06 16:07:50 -05:00
Jason Ekstrand
25efd787cf compiler: Move double_inputs to gl_program::DualSlotInputs
Previously, we had two fields in shader_info: double_inputs_read and
double_inputs.  Presumably, the one was for all double inputs that are
read and the other is all that exist.  However, because nir_gather_info
regenerates these two values, there is a possibility, if a variable gets
deleted, that the value of double_inputs could change over time.  This
is a problem because double_inputs is used to remap the input locations
to a two-slot-per-dvec3/4 scheme for i965.  If that mapping were to
change between glsl_to_nir and back-end state setup, we would fall over
when trying to map the NIR outputs back onto the GL location space.

This commit changes the way slot re-mapping works.  Instead of the
double_inputs field in shader_info, it adds a DualSlotInputs bitfield to
gl_program.  By having it in gl_program, we more easily guarantee that
NIR passes won't touch it after it's been set.  It also makes more sense
to put it in a GL data structure since it's really a mapping from GL
slots to back-end and/or NIR slots and not really a NIR shader thing.

Tested-by: Alejandro Piñeiro <apinheiro@igalia.com> (ARB_gl_spirv tests)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-09-06 16:07:50 -05:00
Marek Olšák
1285f71d3e gallium: add PIPE_CAP_RASTERIZER_SUBPIXEL_BITS
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-09-06 16:07:40 -04:00
Eric Engestrom
3824c8e7cd meson: disable asserts by default on release builds
By the time Mesa 18.3 comes out (probably December '18), Meson 0.45 will
be 9 months old (March '18), so I think this is reasonable.

(btw, the currently-required Meson 0.44.1 was released less than 12 days
 before 0.45, so we're really not bumping by much.)

Currently, the Meson versions in the major distributions are:
Arch:     ships 0.47.2
CentOS:   7 ships 0.47.1
Debian:   stable ships 0.37.1, so it hasn't been usable in a long time.
          everything more recent ships 0.47.2
Fedora:   28 ships 0.45.1
FreeBSD:  ships 0.46.1 (ports)
Gentoo:   ships 0.46.1
OpenSUSE: 15 ships 0.46
Ubuntu:   18.04 ships 0.45.1

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-06 18:16:31 +01:00
Andrii Simiklit
2930b76cfe mesa/util: add missing va_end() after va_copy()
MSDN:
"va_end must be called on each argument list that's initialized
 with va_start or va_copy before the function returns."

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107810
Fixes: c6267ebd6c "gallium/util: Stop bundling our snprintf implementation."
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2018-09-06 17:33:27 +01:00
Andrii Simiklit
65cfe698b0 mesa/util: don't ignore NULL returned from 'malloc'
We should exit from the function 'util_vasprintf'
with error code -1 in the case where 'malloc'
returns NULL.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 864148d69e "util: add util_vasprintf() for Windows (v2)"
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2018-09-06 17:33:27 +01:00
Andrii Simiklit
570cacba7a mesa/util: don't use the same 'va_list' instance twice
The first usage of the 'va_list' instance could change it.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 864148d69e "util: add util_vasprintf() for Windows (v2)"
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2018-09-06 17:33:27 +01:00
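
The three util_vasprintf fixes above (pairing va_copy() with va_end(),
checking malloc() for NULL, and not reusing a va_list after it has been
consumed) can be shown together in one sketch. This is not Mesa's
util_vasprintf(); sketch_vasprintf and sketch_asprintf are hypothetical
names, and the code assumes a C99 vsnprintf() that returns the required
length when given a NULL buffer of size 0.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated string.
 * Returns the string length, or -1 on failure (with *out set to NULL). */
static int sketch_vasprintf(char **out, const char *fmt, va_list args)
{
   va_list args_copy;
   char *buf;
   int len;

   assert(out != NULL && fmt != NULL);
   *out = NULL;

   /* Measure using a copy: the first vsnprintf() may consume the list,
    * so 'args' itself is kept untouched for the second pass. */
   va_copy(args_copy, args);
   len = vsnprintf(NULL, 0, fmt, args_copy);
   va_end(args_copy);   /* every va_copy() needs a matching va_end() */
   if (len < 0)
      return -1;

   buf = malloc((size_t)len + 1);
   if (buf == NULL)     /* do not ignore allocation failure */
      return -1;

   vsnprintf(buf, (size_t)len + 1, fmt, args);
   *out = buf;
   return len;
}

/* Variadic convenience wrapper around sketch_vasprintf(). */
static int sketch_asprintf(char **out, const char *fmt, ...)
{
   va_list args;
   int len;

   va_start(args, fmt);
   len = sketch_vasprintf(out, fmt, args);
   va_end(args);
   return len;
}
```

The caller owns the returned buffer and releases it with free().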
Andrii Simiklit
267ed29288 apple/glx/log: added missing va_end() after va_copy()
Each invocation of va_copy() must be matched by a
corresponding invocation of va_end()

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 51691f0767 "darwin: Use ASL for logging"
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2018-09-06 17:33:27 +01:00
Eric Engestrom
6daba55aa1 meson: drop unnecessary llvm version hacks
The current minimum meson version supported is 0.44.1, so we have met
both the 0.43 and 0.44 requirement to not need these hacks anymore :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-06 17:16:58 +01:00
Danylo Piliaiev
2b98a023d9 mesa: add missing return statement for GL_RG_SNORM case
Fixes: 0d356cf478 "mesa: enable EXT_render_snorm extension"
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-09-06 17:24:53 +03:00
Eric Engestrom
e67dadd3a9 meson: consolidate langs lists
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-09-06 15:22:24 +01:00
Eric Engestrom
07ff56791d intel/compiler: remove unused get_image_base_type()
Unused since 09f1de97a7 "anv,i965: Lower away image derefs in
the driver".

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-06 15:22:24 +01:00
Mathias Fröhlich
a6232b6932 tnl: Fix green gun regression in xonotic.
Fix another regression of
mesa: Make gl_vertex_array contain pointers to first order VAO members.
The regression showed up with drivers using the tnl module and
was reproducible using xonotic-glx -benchmark demos/the-big-keybench.dem.

Fixes: 64d2a20480
    mesa: Make gl_vertex_array contain pointers to first order VAO members.
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-09-06 14:35:12 +02:00
Lionel Landwerlin
2dce1175c1 Revert "i965/tools: 32bit compilation with meson"
This reverts commit 4aec44c0d9.

Unfortunately this patch needed another one to be committed first.
2018-09-06 12:25:07 +01:00
Sergii Romantsov
4aec44c0d9 i965/tools: 32bit compilation with meson
Building 32-bit Mesa with Meson causes an error:
"implicit declaration of function ‘__builtin_ia32_clflush’".
Fixed by adding the -msse2 compilation flag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843
Fixes: 314879f7fe (i965: Fix asynchronous mappings on !LLC platforms.)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-06 11:55:57 +01:00
Timothy Arceri
b9fe8ff23d glsl: fix lexer for unreachable defines
If we have something like:

   #ifdef NOT_DEFINED
   #define A_MACRO(x) \
	if (x)
   #endif

The # on the #define is not skipped but the define itself is so
this then gets recognised as #if.

Until 28a3731e3f this didn't happen because we ended up in
<HASH>{NONSPACE} where BEGIN INITIAL was called stopping the
problem from happening.

This change makes sure we never call RETURN_TOKEN_NEVER_SKIP for
if/else/endif when processing a define.

Cc: Ian Romanick <idr@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107772
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-09-06 10:13:21 +10:00
Hyunjun Ko
2454742a84 freedreno/ir3: insert mov if same instruction in the outputs.
For example,

    result0 = texture(sampler[indexBase + 5], coords);
    result1 = texture(sampler[indexBase + 0], coords);
    result2 = texture(sampler[indexBase + 0], coords);
    out_result0 = result0;
    out_result1 = result1;
    out_result2 = result2;

In this kind of case we need to insert an extra mov to the outputs
so that the result could be assigned to each register respectively.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Hyunjun Ko
b4da2f6667 freedreno/ir3: make immediates array dynamic
Since most shaders don't need such a large array of immediates, making
the array dynamic avoids wasting space.

In addition, sometimes we can potentially have a much larger array
of immediates to be lowered, which might be more than 64.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
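
A dynamically sized array like the one the ir3 immediates commit above
describes is commonly implemented with realloc() and geometric growth.
The sketch below shows that general pattern only; the struct layout, the
initial capacity of 4, and the names sketch_immediates and sketch_imm_add
are assumptions, not the ir3 data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of immediate values. */
struct sketch_immediates {
   uint32_t *vals;   /* heap storage, NULL while empty */
   unsigned count;   /* number of values in use */
   unsigned size;    /* allocated capacity, in values */
};

/* Append a value, growing the storage when it is full.
 * Returns the index of the new value, or -1 if allocation fails. */
static int sketch_imm_add(struct sketch_immediates *imm, uint32_t val)
{
   assert(imm != NULL && imm->count <= imm->size);

   if (imm->count == imm->size) {
      /* Start small and double, so small shaders stay cheap. */
      unsigned new_size = imm->size ? imm->size * 2 : 4;
      uint32_t *p = realloc(imm->vals, new_size * sizeof(*p));
      if (p == NULL)
         return -1;   /* the old storage is still valid */
      imm->vals = p;
      imm->size = new_size;
   }

   imm->vals[imm->count] = val;
   return (int)imm->count++;
}
```

Starting from a zero-initialized struct, nothing is allocated until the
first value is added, and the array can grow past any fixed limit such as
64 entries.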
Rob Clark
c3d9f29b78 freedreno: allocate ctx's batch on demand
Don't fall over when app wants more than 32 contexts.  Instead allocate
contexts on demand.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Rob Clark
a122118c14 freedreno: add fd_context_batch() accessor
For cases in which (after the following commit) ctx->batch may be null.
Prep work for following commit.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
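
An accessor that tolerates a null batch, as the fd_context_batch() commit
above describes, typically allocates on first use. The sketch below shows
that lazy-allocation pattern in isolation; the types and the function
sketch_context_batch are hypothetical stand-ins and do not reflect the
freedreno structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a batch and the context that owns it. */
struct sketch_batch {
   int num_draws;
};

struct sketch_context {
   struct sketch_batch *batch;   /* NULL until first needed */
};

/* Return the context's batch, creating it on first access.
 * Returns NULL only if the allocation fails. */
static struct sketch_batch *
sketch_context_batch(struct sketch_context *ctx)
{
   assert(ctx != NULL);

   if (ctx->batch == NULL)
      ctx->batch = calloc(1, sizeof(*ctx->batch));

   return ctx->batch;
}
```

Callers go through the accessor instead of reading ctx->batch directly, so
a context that never draws never pays for a batch.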
Rob Clark
a45e1802db freedreno/a6xx: fix mem2gmem for zsbuf
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Rob Clark
c77e0948c7 freedreno/batch: fix crash in !reorder case
We aren't using the batch-cache if reorder==false.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Rob Clark
2c623e7071 freedreno/ir3: better compile_error() printing
Try to show the error at the appropriate line of nir

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Rob Clark
ca758251ba freedreno/a6xx: bordercolor fixes
Port fixes from a5xx (f0715442)

TODO maybe this should move to shared code, since it seems to be the
same.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Rob Clark
73378013d7 freedreno: fix context teardown harder
The border_color_uploaders need to be torn down before the transfer_pool
is destroyed.

Fixes: e11e9d6394 freedreno: fix context teardown race
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Rob Clark
1a24f51966 freedreno/ir3: ignore unused inputs
We could end up w/ inputs larger than vec4, simply because unused inputs
are not split.

Fixes things like dEQP-GLES31.functional.separate_shader.random.77 (and
probably a handful of others)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Rob Clark
6b4397feab freedreno/a6xx: fix debug build crash
Porting 0c8d9e923a to a6xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-09-05 13:38:43 -04:00
Dylan Baker
d25a27ec56 meson: Print a message about why a libdrm version was selected
We require a single version of libdrm for all of our libdrm
dependencies (core and driver), but the way this is structured can make
the error message less than helpful, as one driver might be the one
setting the libdrm requirement, while another might be the one that
generates the version failure.

This adds a simple message to the output announcing which libdrm module
set the version, which might be more helpful.

v2: - Use message suggested by Eric Engestrom

Fixes: c445b1d56f
       ("meson: Use the same version for all libdrm checks")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-05 10:32:51 -07:00
Charmaine Lee
af104ad799 svga: rename face to layer_face
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-05 11:22:42 -06:00
Brian Paul
e334e104d0 svga: encode sample count in resource declarations
No regressions before the corresponding host-side change.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-09-05 11:22:42 -06:00
Charmaine Lee
49678e9e49 svga: sync with upstream changes to surface flags
The SVGA device now supports 64-bit surface flags. This patch
updates the winsys interface to allow 64-bit surface flags.
The Linux winsys layer will for now only honor the lower 32 bits of
the surface flags.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-05 11:22:42 -06:00
Neha Bhende
4310649ccb svga: avoid try_blit() for some depth formats on non vgpu10.
On non-vgpu10, the driver doesn't support util_blitter_blit for
SVGA3D_Z_D16, SVGA3D_Z_D24x8, SVGA3D_Z_D24S8. This patch fixes the
following piglit test regressions on hwv8 caused by commit 27bf35caea5e:
spec@arb_depth_texture@fbo-depth-gl-depth-component16-blit
spec@arb_depth_texture@fbo-depth-gl-depth-component24-blit
spec@arb_depth_texture@fbo-depth-gl-depth-component32-blit

Tested with mtt-piglit on hw 8,9,10,11,13 and mtt-glretrace on windows and linux.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-05 11:22:42 -06:00
Neha Bhende
53091a0312 svga: convert dst format to linear when blending is enabled.
When blending is enabled, framebuffer colorspace has to be linear.
Previously, we never hit this case because we did not support sRGB
drawables. The previous patch added that support.

Tested with mtt glretrace, viewperf, piglit, conform.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-05 11:22:42 -06:00
Neha Bhende
dfab1289e8 winsys/svga: Avoid cap2 code path for now
CAP2 functionality is not yet part of vmwgfx. This is causing unnecessary
dmesg error messages.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-09-05 11:22:42 -06:00
Neha Bhende
8449c33a27 svga: start using SVGA3dCmdIntraSurfaceCopy command for svga_blit.
Basically, the SVGA3dCmdIntraSurfaceCopy command allows copying when
the source and destination are the same.

Tested with MTT piglit, glretrace, viewperf, conform

v2: changes as per Charmaine's comment
v3: changes as per Charmaine's comment

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-05 11:22:42 -06:00
Neha Bhende
4639ef3763 svga/winsys: Add cap2 support in winsys
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-05 11:22:42 -06:00
Neha Bhende
6b3627da08 svga: Add SVGA3dCmdIntraSurfaceCopy command support in OpenGL driver
v2: changes as per Charmaine's comment

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-09-05 11:22:42 -06:00
Brian Paul
bac94dfefa svga: update device header files from upstream
This is a squash commit of several earlier patches.

Signed-off-by: Brian Paul <brianp@vmware.com>
2018-09-05 11:22:42 -06:00
Charmaine Lee
f4f39fa5d9 winsys/drm: Fix assert when try to accumulate an invalid fd
This patch makes sure there is a valid fd before merging it
to the context's fd in vmw_svga_winsys_fence_server_sync().

This fixes the assert running webot.
No regression running kmscube.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-09-05 11:22:42 -06:00
Eric Anholt
16f17e3a3c loader: Drop unused argument from dri3_update_drawable().
The argument has never been used since the function was added.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-09-05 10:11:27 -07:00
Alejandro Piñeiro
4e1f8d82c2 i965/fs: include multisamplers on image_intrinsic_coord_components
This is the second patch needed to fix the following piglit tests:

   tests/spec/arb_gl_spirv/linker/uniform/multisampler.shader_test
   tests/spec/arb_gl_spirv/linker/uniform/multisampler-array.shader_test

In this case, though, it doesn't affect as many borrowed tests, as
there aren't many tests using multisamplers on Intel.

It is worth noting that this patch is also needed when those tests
are run in GLSL mode (using the --glsl option). Although most Intel
drivers would not be able to run/execute tests using multisamplers, as
GL_MAX_IMAGE_SAMPLES is zero, technically those tests are expected to
link correctly, so linking tests should pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-05 17:02:28 +02:00
Alejandro Piñeiro
8969777686 i965: move brw_nir_lower_gl_images call
At this moment that lowering is using info coming from the
UniformStorage, so for the ARB_gl_spirv codepath, it needs to be done
after calling gl_nir_link_uniforms. As for the GLSL codepath it can
also be called later, we just move the call on both cases, to avoid
adding several shader->spirv_data checks, and keep the patch as small
as possible.

This is the first patch needed to fix the following piglit tests:

  tests/spec/arb_gl_spirv/linker/uniform/multisampler.shader_test
  tests/spec/arb_gl_spirv/linker/uniform/multisampler-array.shader_test

but fixes thousands of tests when borrowing the tests from other specs
(that needs to be done manually right now).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-05 17:02:28 +02:00
Alejandro Piñeiro
2a6182fe06 intel/compiler: rename brw_nir_lower_glsl_images
To brw_nir_lower_gl_images, as it will also be used on the
ARB_gl_spirv codepath, which doesn't involve GLSL at all. So the
lowering is about images following the OpenGL semantics. In any case
"brw_nir_lower_opengl_images" seemed too long to me, so I just used
gl. That shortening is already used on other parts of the code.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-05 17:02:28 +02:00
Alejandro Piñeiro
960f6459be intel/compiler: remove unused variable num_images
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-05 17:02:28 +02:00
Gert Wollny
218ff0d510 winsys/virgl/vtest: Correct off-by-one error in resource allocation
The resource bo array must already be extended when the target index is
equal to the current size of the array.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-09-05 13:54:01 +02:00
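A minimal sketch of the off-by-one described in 218ff0d510, assuming a
growable pointer array: the array has to grow as soon as the target index
equals the allocated size, not only once it exceeds it. The names below
(res_list, res_list_add) are hypothetical and are not taken from the virgl
winsys code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of resource pointers. */
struct res_list {
    void   **items;     /* storage for the resource pointers */
    unsigned count;     /* number of slots in use */
    unsigned capacity;  /* number of slots allocated */
};

/* Append item to the list.  Returns 0 on success, -1 if growing failed. */
static int res_list_add(struct res_list *list, void *item)
{
    assert(list->count <= list->capacity);

    /* ">=" and not ">": index == capacity is already one past the end. */
    if (list->count >= list->capacity) {
        unsigned new_capacity = list->capacity ? list->capacity * 2 : 4;
        void **grown = realloc(list->items, new_capacity * sizeof(*grown));
        if (!grown)
            return -1;  /* the old array is untouched and still owned by list */
        list->items = grown;
        list->capacity = new_capacity;
    }

    list->items[list->count++] = item;
    return 0;
}
```

Keeping the result of realloc() in a temporary means a failed allocation
does not lose the original pointer.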
Gert Wollny
5341260f62 winsys/virgl: Initialize value to silence valgrind
Silences:

  Conditional jump or move depends on uninitialised value(s)
  at 0xB72F2C0: virgl_drm_winsys_create (virgl_drm_winsys.c:854)
  by 0xB72F2C0: virgl_drm_screen_create (virgl_drm_winsys.c:926)
  by 0xB21C885: pipe_virgl_create_screen (drm_helper.h:275)
  by 0xB7201F0: pipe_loader_create_screen (pipe_loader.c:137)
  by 0xB639C91: dri2_init_screen (dri2.c:2112)
  by 0xB634F68: driCreateNewScreen2 (dri_util.c:153)
  by 0x63023E6: dri3_create_screen (dri3_glx.c:893)
  by 0x62D35BD: AllocAndFetchScreenConfigs (glxext.c:820)
  by 0x62D35BD: __glXInitialize (glxext.c:946)
  by 0x62CECB3: GetGLXPrivScreenConfig (glxcmds.c:174)
  by 0x62CF69C: glXQueryExtensionsString (glxcmds.c:1304)
  by 0x60AA7D9: ??? (in /usr/lib/x86_64-linux-gnu/libwaffle-1.so.0.5.2)
  by 0x4F81450: wfl_checked_display_connect (piglit-util-waffle.h:74)
  by 0x4F829E0: piglit_wfl_framework_init (piglit_wfl_framework.c:627)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-09-05 13:54:01 +02:00
Gert Wollny
9b0e8d8723 winsys/virgl: correct resource and handle allocation (v2)
Fixes crash with
  piglit/bin/map_buffer_range-invalidate CopyBufferSubData \
                               increment-offset -auto -fbo

* Resize the resource storage already when the count is equal to the
  allocated size, fixes:

  Invalid write of size 8
  at 0xB72E4CF: virgl_drm_add_res (virgl_drm_winsys.c:629)
  by 0xB72E4CF: virgl_drm_emit_res (virgl_drm_winsys.c:663)
  by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776)
  by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585)
  by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940)
  by 0x109A1E: upload (invalidate.c:169)
  by 0x109C2F: piglit_display (invalidate.c:215)
  by 0x4F80FBE: run_test (piglit_fbo_framework.c:52)
  by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229)
  by 0x10949D: main (invalidate.c:47)
  Address 0xbe07d30 is 0 bytes after a block of size 4,096 alloc'd
  at 0x4C31B25: calloc (in
       /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  by 0xB72DAAF: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:567)

* Also resize the space allocated for the handles, fixes:

  Invalid write of size 4
  at 0xB72E4F0: virgl_drm_add_res (virgl_drm_winsys.c:631)
  by 0xB72E4F0: virgl_drm_emit_res (virgl_drm_winsys.c:663)
  by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776)
  by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585)
  by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940)
  by 0x109A1E: upload (invalidate.c:169)
  by 0x109C2F: piglit_display (invalidate.c:215)
  by 0x4F80FBE: run_test (piglit_fbo_framework.c:52)
  by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229)
  by 0x10949D: main (invalidate.c:47)
  Address 0xbe08570 is 0 bytes after a block of size 2,048 alloc'd
  at 0x4C2FB0F: malloc (in
       /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  by 0xB72DAC8: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:572)

Fixes: 4b15b5e803 ("virgl: resize resource bo allocation if we need to.")

v2: - Use REALLOC macro and avoid memory leak when re-allocation fails
    - add Fixes tag (both Emil Velikov)
    - reorder commit message

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-09-05 13:54:01 +02:00
Tomeu Vizoso
f13de57edb virgl: use hw-atomics instead of in-ssbo ones
Emulating atomics on top of ssbos can lead to too small max SSBO count,
so let's use the hw-atomics mechanism to expose atomic buffers instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-09-05 05:46:58 +01:00
Erik Faye-Lund
1bd927d997 virgl: update minor differences to upstream header
virgl_protocol.h is considered to have its upstream in the
virglrenderer repository, and somehow these minor differences have
crept in.

Let's sync with the upstream to avoid this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-09-05 05:46:52 +01:00
Erik Faye-Lund
5a587d18d5 gallium: add PIPE_CAP_MAX_COMBINED_HW_ATOMIC_COUNTER{S,_BUFFERS}
This moves the evergreen-specific max-sizes out as a driver-cap, so
other drivers with less strict requirements also can use hw-atomics.

Remove ssbo_atomic as it's no longer needed.

We should now be able to use hw-atomics for some stages and not for
others, if needed.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-09-05 05:46:46 +01:00
Erik Faye-Lund
d641d3f48b gallium: add PIPE_CAP_MAX_COMBINED_SHADER_BUFFERS
This gets rid of a r600 specific hack in the state-tracker, and prepares
for other drivers to be able to use hw-atomics.

While we're at it, clean up some indentation in the various drivers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-09-05 05:46:37 +01:00
Erik Faye-Lund
84795f8c64 st/mesa: simplify MaxAtomicBufferSize-logic
MaxAtomicCounters has already been assigned in the loop above in the
ssbo_atomic = true case, so this will calculate the same value as the
default.

While we're at it, fixup indentation on the MaxAtomicBufferBindings
assign.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-09-05 05:46:33 +01:00
Erik Faye-Lund
38f0c078de st/mesa: clean up atomic vs ssbo code
This makes the code a bit easier to follow; we first set up
MaxShaderStorageBlocks, then we either set up a dedicated
MaxAtomicBuffers, or we split MaxShaderStorageBlocks in two.

While we're at it, also make the SSBO-splitting code tolerate the
hypothetical case of having an odd number of SSBOs without incorrectly
dropping the last SSBO.

This has the nice result that the SSBOs and atomic buffers are dealt
with almost completely orthogonally, easing some upcoming patches.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-09-05 05:46:27 +01:00
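One way to split a pool of buffer slots in two without dropping the last
slot when the total is odd, as 38f0c078de describes, is to give one side
half (rounded down) and the other side the remainder. This is a sketch
only; the struct and function names are hypothetical and the state
tracker's real fields are not shown in this log.

```c
#include <assert.h>

/* Hypothetical result of splitting a shared pool of buffer slots. */
struct buffer_limits {
    unsigned max_ssbos;           /* slots left for shader storage blocks */
    unsigned max_atomic_buffers;  /* slots carved out for atomic buffers */
};

static struct buffer_limits split_ssbo_slots(unsigned total_slots)
{
    struct buffer_limits limits;

    /* Half of the pool, rounded down, goes to atomic buffers ... */
    limits.max_atomic_buffers = total_slots / 2;
    /* ... and the remainder stays with SSBOs, so an odd slot is kept. */
    limits.max_ssbos = total_slots - limits.max_atomic_buffers;

    assert(limits.max_ssbos + limits.max_atomic_buffers == total_slots);
    return limits;
}
```

Assigning total_slots / 2 to both sides would instead lose one slot
whenever the total is odd.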
Erik Faye-Lund
a805e4e9de st/mesa: use real bool for can_ubo
We're doing full c99 now, so there's no point in using the old boolean
type.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-09-05 05:46:09 +01:00
Marek Olšák
28e542dcdb gallium/u_threaded: increase batch size to increase performance
This reduces mutex overhead.

radeonsi: +4.4% performance with piglit/drawoverhead, DrawElements, Ryzen X1700
iris_dri.so: +14% with piglit/drawoverhead, DrawArrays, i7 7700HQ.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-04 14:31:56 -04:00
Marek Olšák
ebd5806e0f st/vdpau: silence an unitialized-variable warning
2018-09-04 14:01:43 -04:00
Marek Olšák
725e8ad559 st/mesa: help fix stencil border color for GL_DEPTH_STENCIL textures
GL_STENCIL_INDEX uses GL_INTENSITY for the border color, which is nicer
to hardware that doesn't read the stencil border value from the X channel.

This fixes a bunch of dEQP tests on Vega & Raven.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
2018-09-04 14:01:43 -04:00
Ernestas Kulik
d49904085a glsl_to_tgsi: Fix potential leak
Reported by Coverity: arr_live_ranges is freed in a different branch
than the one in which it was allocated.

Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-09-04 14:01:43 -04:00
Ernestas Kulik
ea1e50cc16 u_vbuf: Fix leak
Reported by Coverity: data is heap-allocated, but only freed in the
info->index_size != 0 branch.

Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
2018-09-04 14:01:43 -04:00
Eric Anholt
2e59b88903 freedreno: Drop a bunch of duplicated gallium PIPE_CAP default code.
Now that we have the util function for the default values, we can get rid
of the boilerplate.

v2: Rebase on new gallium caps

Reviewed-by: Rob Clark <robdclark@gmail.com> (v1)
2018-09-04 08:08:22 -07:00
Eric Anholt
492b74b445 v3d: Drop a bunch of duplicated gallium PIPE_CAP default code.
Now that we have the util function for the default values, we can get rid
of the boilerplate.

v2: Rebase on new gallium caps
2018-09-04 08:08:18 -07:00
Eric Anholt
c311e00000 vc4: Drop a bunch of duplicated gallium PIPE_CAP default code.
Now that we have the util function for the default values, we can get rid
of the boilerplate.

v2: drop GLSL level in favor of defaults.
v3: Rebase on new gallium caps
2018-09-04 08:08:10 -07:00
Eric Anholt
ad782a7020 gallium: Add a helper for implementing PIPE_CAP_* default values.
One of the pains of implementing a gallium driver is filling in a million
pipe caps you don't know about yet when you're just starting out.  One of
the pains of working on gallium is copy-and-pasting your new PIPE_CAP into
each driver.  We can fix both of these by having each driver call into the
default helper from their default case, so that both sides can ignore each
other until they need to.

v2: fix i915g build, revert swr change to avoid breaking scons build
    (https://travis-ci.org/anholt/mesa/jobs/419739857)
v3: Rebase on 3 new gallium caps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
2018-09-04 08:07:52 -07:00
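The pattern ad782a7020 describes, where each driver handles the caps it
cares about and forwards everything else to a shared default helper from
its default case, can be sketched as follows. The enum, the cap values and
the function names are all made up for illustration; they are not the
gallium definitions.

```c
#include <assert.h>

/* Hypothetical capability identifiers. */
enum cap {
    CAP_MAX_TEXTURE_SIZE,
    CAP_NPOT_TEXTURES,
    CAP_COMPUTE,
    CAP_NEW_FEATURE,   /* imagine this one was added after the driver was written */
};

/* Shared helper: conservative default for every known cap. */
static int cap_default(enum cap c)
{
    switch (c) {
    case CAP_MAX_TEXTURE_SIZE: return 2048;
    case CAP_NPOT_TEXTURES:    return 1;
    case CAP_COMPUTE:          return 0;
    case CAP_NEW_FEATURE:      return 0;
    }
    assert(!"unknown cap");
    return 0;
}

/* Driver callback: override what differs, defer the rest to the helper. */
static int my_driver_get_cap(enum cap c)
{
    switch (c) {
    case CAP_MAX_TEXTURE_SIZE: return 8192;
    case CAP_COMPUTE:          return 1;
    default:                   return cap_default(c);
    }
}
```

With this layout, adding a new cap only requires touching the helper and
the drivers that actually support it.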
Jason Ekstrand
67571ae796 intel/compiler: Remove redundant nir_remove_dead_variables call
As of 07a2098a70, brw_nir_optimize calls nir_remove_dead_variables as
the last optimization.  Doing it again is just pointless.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-09-04 09:03:16 -05:00
Lionel Landwerlin
07a2098a70 intel: compiler: remove dead local variables at optimization pass
We're hitting an assert in gfxbench because one of the local variables
is a sampler (according to Jason this isn't valid):

testfw_app: ../src/compiler/nir_types.cpp:551: void glsl_get_natural_size_align_bytes(const glsl_type*, unsigned int*, unsigned int*): Assertion `!"type does not have a natural size"' failed.

Since this particular variable isn't used, it can be eliminated by
removing unused local variables at the end of the optimization loop.
This makes sense also for valid local variables.

v2: Move additional local variable removal out of optimization loop,
    but before large constant removal (Jason/Lionel)

v3: Move the removal at the end of brw_nir_optimize()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107806
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-09-03 17:24:19 +01:00
Andrii Simiklit
095600dad6 intel/decoder: fix the possible out of bounds group_iter
The "gen_group_get_length" function can return a negative value,
which can lead to an out-of-bounds group_iter.

v2: printing of "unknown command type" was added
v3: just the asserts are added

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-03 11:14:30 +01:00
Bas Nieuwenhuizen
233718a199 radv: Fix CMASK dimensions.
Mirrors

1e40f69483 "ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VI"

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-03 09:24:30 +02:00
Bas Nieuwenhuizen
ab64891f4c radv: Use a lower max offchip buffer count.
No clue what gets fixed by this but both radeonsi and amdvlk do it.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-03 09:24:30 +02:00
Bas Nieuwenhuizen
4dc244eb44 radv: Add VEGA20 support.
Just mirror the radeonsi bits. Since this is just adding the extra
switch entries for new HW I think this should be fine for stable.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-03 09:24:30 +02:00
Dave Airlie
c1ba33c34b radv: don't expose linear depth surfaces on SI/CIK/VI either.
ac_surface.c: gfx6_compute_surface says
/* DB doesn't support linear layouts. */

Now if we expose linear depth and create a linear depth image
and use CmdCopyImage to copy into it, we can't map the underlying
memory and read it linearly which I think should work.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-03 11:38:00 +10:00
Mauro Rossi
ac0856ae41 egl/android: do not indent HAVE_DRM_GRALLOC preprocessor directive
Fixes: 3f7bca44d9 ("egl/android: #ifdef out flink name support")
Fixes: c7bb82136b ("egl/android: Add DRM node probing and filtering")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2018-09-02 11:27:08 +02:00
Jason Ekstrand
2ad9917e18 anv/blorp: Fix a comment as per Nanley's review feedback
This accidentally didn't make it into 62378c5e9e
2018-09-01 09:12:08 -05:00
Jason Ekstrand
62378c5e9e anv/blorp: Do more flushing around HiZ clears
We make the flush after a HiZ clear unconditional and add a flush/stall
before the clear as well.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-09-01 09:08:36 -05:00
Ian Romanick
82530ce1b5 i965/vec4: Clamp indirect tes input array reads with 0x0fffffff
Page 190 of "Volume 7: 3D Media GPGPU Engine (Haswell)" says the valid
range of the offset is [0, 0FFFFFFFh].

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-09-01 00:23:45 -07:00
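As a scalar illustration of the range quoted in 82530ce1b5, the sketch
below assumes that "clamp" means an unsigned minimum against the documented
upper bound of 0FFFFFFFh. The helper name is hypothetical, and the
instructions the vec4 backend actually emits are not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound of the valid offset range [0, 0FFFFFFFh] quoted above. */
#define MAX_INDIRECT_OFFSET UINT32_C(0x0fffffff)

/* Hypothetical helper: force an untrusted offset into the valid range. */
static uint32_t clamp_indirect_offset(uint32_t offset)
{
    uint32_t clamped = offset > MAX_INDIRECT_OFFSET ? MAX_INDIRECT_OFFSET
                                                    : offset;
    assert(clamped <= MAX_INDIRECT_OFFSET);
    return clamped;
}
```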
Ian Romanick
75666605c9 i965/vec4: Correctly handle uniform sources in generate_tes_add_indirect_urb_offset
Fixes failure in the new piglit test
tes-patch-input-array-vec2-index-invalid-rd.shader_test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-09-01 00:23:43 -07:00
Andres Gomez
adad7e3aa8 docs: update calendar to extended the 18.1 cycle by one more release
Due to having 2 additional RCs for 18.2.

Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-09-01 02:23:14 +03:00
Rodrigo Vivi
e8c42ed4ab intel: Introducing Amber Lake platform
Amber Lake uses the same gen graphics as Kaby Lake, including an ID
that was previously marked as reserved on Kaby Lake but has now
moved to the AML page.

This follows the ids and approach used on kernel's commit
e364672477a1 ("drm/i915/aml: Introducing Amber Lake platform")

Reported-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-31 13:57:52 -07:00
Rodrigo Vivi
886a048feb intel: aubinator: Adding missed platforms to the error message.
Many new platforms got added to gen_device_name_to_pci_device_id()
but the error message inside aubinator didn't reflect those
changes. So sync them in the same order to be sure that we are not
missing any now.

Cc: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-31 13:57:41 -07:00
Nanley Chery
904c2a617d i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9
According to internal docs, some gen9 platforms have a pixel shader push
constant synchronization issue. Although not listed among said
platforms, this issue seems to be present on the GeminiLake 2x6's we've
tested.

We consider the available workarounds to be too detrimental on
performance. Instead, we mitigate the issue by applying part of one of
the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
(as suggested by Ken).

Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
following options:
* 6 depth_draw small depthstencil
* 8 stencil_draw small depthstencil
* 6 stencil_draw small depthstencil
* 8 depth_resolve small
* 6 stencil_resolve small depthstencil
* 4 stencil_draw small depthstencil
* 16 stencil_draw small depthstencil
* 16 depth_draw small depthstencil
* 2 stencil_resolve small depthstencil
* 6 stencil_draw small
* all_samples stencil_draw small
* 2 depth_draw small depthstencil
* all_samples depth_draw small depthstencil
* all_samples stencil_resolve small
* 4 depth_draw small depthstencil
* all_samples depth_draw small
* all_samples stencil_draw small depthstencil
* 4 stencil_resolve small depthstencil
* 4 depth_resolve small depthstencil
* all_samples stencil_resolve small depthstencil

v2: Include more platforms in WA (Ken).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93355
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-31 13:19:17 -07:00
Christian Gmeiner
773d6ea6e7 imx: make use of loader_open_render_node(..) helper
Gets rid of hard-coded gpu device path.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-31 21:47:13 +02:00
Christian Gmeiner
b05a8f4f41 tegra: make use loader_open_render_node(..) helper
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-31 21:46:32 +02:00
Christian Gmeiner
ab348885eb loader: add loader_open_render_node(..)
This helper is almost a 1:1 copy of tegra_open_render_node().

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-31 21:46:03 +02:00
Christian Gmeiner
d0b09e2dfe tegra: fix memory leak
Fixes: 1755f608f5 ("tegra: Initial support")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-31 21:45:16 +02:00
Daniel Stone
01c0aa9f05 st/dri: Don't expose sRGB formats to clients
Though the SARGB8888 format is used internally through its FourCC value,
it is not a real format as defined by drm_fourcc.h; it cannot be used
with KMS or other interfaces expecting drm_fourcc.h format codes.

Ensure we don't advertise it through the dmabuf format/modifier query
interfaces, preventing us from tripping over an assert.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 8c1b9882b2 ("egl/dri2: Guard against invalid fourcc formats")
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2018-08-31 18:02:42 +01:00
Samuel Pitoiset
686ec97cfb radv: add missing support for protected memory properties
Fixes Vulkan CTS CL#2849. Similar to the ANV driver.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-31 17:35:13 +02:00
Samuel Pitoiset
7355e9326b radv: remove dead code in scan_shader_output_decl()
Never used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-31 17:34:41 +02:00
Samuel Pitoiset
e9acf069b2 radv: remove radv_shader_context::num_output_{clips,culls}
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-31 17:34:41 +02:00
Samuel Pitoiset
a6a6441c75 radv: adjust the cull dist mask in scan_shader_output_decl()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-31 17:34:41 +02:00
Samuel Pitoiset
ea778e760c radv: get length of the clip/cull distances array from usage mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-31 17:34:41 +02:00
Samuel Pitoiset
732679c25e radv: do not recompute the output usage mask for clipdist twice
The shader info pass takes care of this now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-31 17:34:41 +02:00
Samuel Pitoiset
730c704f86 radv: gather the output usage mask for clip/cull distances correctly
It's a special case because both are combined into a single array.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-31 17:34:41 +02:00
Samuel Pitoiset
ffe3a2a298 radv: add set_output_usage_mask() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-31 17:34:41 +02:00
Samuel Pitoiset
6f47df3129 radv: fix passing clip/cull distances from VS to PS
CTS doesn't test input clip/cull distances for the fragment
shader stage, which explains why this was totally broken. I
wrote a simple test locally that works now.

This fixes a crash with GTA V and DXVK.

Note that we are exporting unused parameters from the vertex
shader now, but this can't be optimized easily because we don't
keep the fragment shader info...

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107477
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-31 17:34:36 +02:00
Juan A. Suarez Romero
54a9622dd5 egl/wayland: do not leak wl_buffer when it is locked
If the color buffer is locked, do not set its wayland buffer to NULL;
otherwise it cannot be freed later.

Rather, flag it in order to destroy it later on the release event.

v2: instruct release event to unlock only or free wl_buffer too (Daniel)

This also fixes dEQP-EGL.functional.swap_buffers_with_damage.* tests.

CC: Daniel Stone <daniel@fooishbar.org>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-08-31 16:29:36 +02:00
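The deferred destruction described in 54a9622dd5 can be pictured roughly
as below: a buffer that is still locked is only flagged, and the release
handler either just unlocks it or also frees it. This is a sketch under
assumptions; the struct, its fields and both functions are hypothetical,
and a heap allocation stands in for the real wl_buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-slot state for one color buffer. */
struct buffer_slot {
    void *wl_buffer;          /* stand-in for the wl_buffer handle, or NULL */
    bool  locked;             /* the compositor still holds the buffer */
    bool  destroy_on_release; /* we wanted to drop it while it was locked */
};

/* Called when the slot's buffer is no longer wanted. */
static void slot_discard(struct buffer_slot *slot)
{
    assert(slot != NULL);
    if (!slot->wl_buffer)
        return;
    if (slot->locked) {
        /* Keep the pointer so it can still be freed later; just flag it. */
        slot->destroy_on_release = true;
        return;
    }
    free(slot->wl_buffer);
    slot->wl_buffer = NULL;
}

/* Called when the compositor releases the buffer. */
static void slot_on_release(struct buffer_slot *slot)
{
    assert(slot != NULL);
    slot->locked = false;
    if (slot->destroy_on_release) {
        free(slot->wl_buffer);
        slot->wl_buffer = NULL;
        slot->destroy_on_release = false;
    }
}
```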
Dave Airlie
2c1f249f2b ac/radeonsi: fix CIK copy max size
While adding transfer queues to radv, I started writing some tests,
and the first test I wrote fell over copying a buffer larger than this
limit.

Checked AMDVLK and found the correct limit.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-31 15:11:49 +10:00
Dave Airlie
c9f5448695 radeonsi: fix regression in indirect input swizzles.
This fixes:
tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3.shader_test
since I reworked the 64-bit swizzles.

Fixes: bb17ae49ee (gallivm: allow to pass two swizzles into fetches.)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-31 06:08:24 +01:00
Dave Airlie
750b829daf radeonsi: fix tess/gs fetchs for new swizzle.
I have piglit results from my machine, but I must have messed up,
and not built mesa in between properly.

Fixes: bb17ae49ee (gallivm: allow to pass two swizzles into fetches.)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-31 06:08:21 +01:00
Marek Olšák
355ed029b0 mesa: ignore VAO IDs equal to 0 in glDeleteVertexArrays
This fixes a firefox crash.

Fixes: 781a78914c

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-30 22:30:28 -04:00
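The behaviour fixed in 355ed029b0 matches the OpenGL rule that
glDeleteVertexArrays silently ignores unused names and the value zero. A
minimal sketch of a delete loop that honours this, using a hypothetical
fixed-size table in place of Mesa's real object lookup:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 8

/* Hypothetical name table: live[id] != 0 means id names an object.
 * Slot 0 is never used because 0 is not a valid object name. */
static int live[MAX_OBJECTS];

/* Delete the named objects and return how many were actually deleted. */
static unsigned delete_objects(size_t n, const unsigned *ids)
{
    unsigned deleted = 0;

    assert(ids != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        unsigned id = ids[i];

        if (id == 0)
            continue;               /* zero is silently ignored */
        if (id < MAX_OBJECTS && live[id]) {
            live[id] = 0;           /* unused names are ignored too */
            deleted++;
        }
    }
    return deleted;
}
```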
Kenneth Graunke
b147254d36 Revert "intel/tools/aubwrite: Always use physical addresses for traces."
This reverts commit f8cfc77660.

This appears to break intel_dump_gpu for Gen9 systems - I can load them
in the simulator, but nothing happens.  Reverting the patch makes the
simulator properly execute our commands and shaders again.
2018-08-30 14:36:28 -07:00
Jason Ekstrand
a0f18f2142 intel/nir: Lowering image loads and stores trashes all metadata
This fixes the GL_ARB_fragment_shader_interlock piglit test on gen8
platforms where the lack of metadata dirtying was causing another pass
to accidentally delete a much needed loop.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107745
Fixes: 37f7983bcc "intel/compiler: Do image load/store lowering..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-30 14:06:31 -05:00
Jason Ekstrand
d9cf4308ce i965/screen: Allow modifiers on sRGB formats
This effectively reverts a266934935 which
was a misguided attempt at protecting intel_query_dma_buf_modifiers from
invalid formats.  Unfortunately, in some internal EGL cases, we can get
an SRGB format validly in this function.  Rejecting such formats caused
us to not allow CCS in some cases where we should have been allowing it.
This regressed the performance of some SynMark tests as well as GfxBench
ALU2, Tessellation and Manhattan 3.0 tests.

There's some question of whether or not we really should be using SRGB
"fourcc" formats that aren't actually in drm_fourcc.h but there's not
much harm in allowing them through here.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107223
Fixes: a266934935 "i965/screen: Return false for unsupported..."
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-30 11:41:50 -05:00
Jason Ekstrand
8c1b9882b2 egl/dri2: Guard against invalid fourcc formats
We already reject attempts to import images with invalid fourcc formats
but don't really guard the queries all that well.  This makes us error
out in any calls to eglQueryDmaBufModifiersEXT if the given format is
not a valid fourcc format.  We also add an assert to ensure that drivers
don't advertise any non-fourcc formats.

Cc: mesa-stable@lists.freedesktop.org
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-30 11:41:50 -05:00
Jason Ekstrand
b95896f492 egl/dri2: Add a helper for the number of planes for a FOURCC format
This also serves as a convenient "is this a fourcc format" check,
which we'll take advantage of in the next commit.

Cc: mesa-stable@lists.freedesktop.org
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-30 11:41:50 -05:00
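A plane-count helper that doubles as a validity check, in the spirit of
b95896f492 and 8c1b9882b2, might look like the sketch below. The macro and
function names are hypothetical and only three formats are listed; the
byte packing and the three codes follow the usual drm_fourcc.h convention,
while returning 0 for anything unknown is an assumption about how such a
helper would signal "not a fourcc format".

```c
#include <assert.h>
#include <stdint.h>

/* Pack four characters into a 32-bit code, least significant byte first. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define FMT_XRGB8888 FOURCC('X', 'R', '2', '4')  /* packed RGB, 1 plane */
#define FMT_NV12     FOURCC('N', 'V', '1', '2')  /* Y plane + CbCr plane */
#define FMT_YUV420   FOURCC('Y', 'U', '1', '2')  /* Y, Cb and Cr planes */

/* Number of planes for a known format, or 0 if the code is not known. */
static unsigned fourcc_num_planes(uint32_t format)
{
    switch (format) {
    case FMT_XRGB8888: return 1;
    case FMT_NV12:     return 2;
    case FMT_YUV420:   return 3;
    default:           return 0;
    }
}

/* Query-side guard: reject codes the helper does not know about. */
static int is_known_fourcc(uint32_t format)
{
    return fourcc_num_planes(format) != 0;
}

/* Driver-side guard: advertised formats must all be known codes. */
static void check_advertised_format(uint32_t format)
{
    assert(fourcc_num_planes(format) != 0);
}
```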
Jason Ekstrand
19bdc7dd0f radv/meta: Set num_components on image_store intrinsics
Now that image load/store intrinsics are variable-width, we need to set
num_components accordingly.  In 15d39f474b, both glsl_to_nir and
spirv_to_nir were updated to properly set num_components but radv meta
was left behind.

Fixes: 15d39f474b "nir: Make image load/store intrinsics..."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-30 08:26:14 -05:00
Vicki Pfau
8c0e3f3822 gallivm: Detect VSX separately from Altivec
Previously gallivm would attempt to use VSX instructions on all systems
where it detected that Altivec is supported; however, VSX was added to
POWER long after Altivec, causing lots of crashes on older POWER/PPC
hardware, e.g. PPC Macs. By detecting VSX separately from Altivec we can
automatically disable it on hardware that supports Altivec but not VSX.

Signed-off-by: Vicki Pfau <vi@endrift.com>
2018-08-30 06:09:49 +02:00
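The point of 8c0e3f3822 is that VSX gets its own capability bit instead of
being implied by Altivec. A small sketch of that idea, assuming the two
bits are later turned into a "+feature"/"-feature" string for a code
generator; the struct, the function and the string layout are
illustrative and are not gallivm's actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical CPU capability bits, detected independently of each other. */
struct cpu_caps {
    bool has_altivec;
    bool has_vsx;
};

/* Write e.g. "+altivec,-vsx" into buf; returns the snprintf() result. */
static int format_cpu_features(const struct cpu_caps *caps,
                               char *buf, size_t size)
{
    assert(caps != NULL && buf != NULL && size > 0);

    /* VSX is enabled only when it was detected itself, never just
     * because Altivec is present. */
    return snprintf(buf, size, "%caltivec,%cvsx",
                    caps->has_altivec ? '+' : '-',
                    caps->has_vsx ? '+' : '-');
}
```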
Ilia Mirkin
3e04c67950 nv50: bump compat glsl level to same as core
Passes the compat piglits. I'm sure that there will be odd issues that
aren't caught by them, but at least it should basically work.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-08-29 20:51:40 -04:00
Ilia Mirkin
a608e5cc9f nvc0: bump compat GLSL version to match core
This passes the handful of tests in piglit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-08-29 20:51:40 -04:00
Ilia Mirkin
52a7297dc6 glsl: avoid lowering texcoord array except in simple cases
With compat creeping up to geometry and tess shaders, lowering texcoord
accesses/writes becomes more complicated. Since it's an optimization
anyways, just avoid the complication for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-29 20:51:23 -04:00
Andres Gomez
3731233cba docs: update calendar 18.2.0-rc5 is out, extend to 18.2.0-rc6
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-08-30 03:33:08 +03:00
Timothy Arceri
9c47c39687 st/mesa, gallium: add a workaround for No Mans Sky
The spec seems clear this is not allowed but the Nvidia binary driver
forces apps to add layout qualifiers, so this works around the
issue for No Mans Sky until the CTS can be sorted out.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-30 09:54:40 +10:00
Timothy Arceri
9ce7d79cdc glsl: add a mechanism to allow layout qualifiers on function params
The spec is quite clear this is not allowed:

    From Section 4.4. (Layout Qualifiers) of the GLSL 4.60 spec:

       "Layout qualifiers can appear in several forms of declaration.
       They can appear as part of an interface block definition or
       block member, as shown in the grammar in the previous section.
       They can also appear with just an interface-qualifier to establish
       layouts of other declarations made with that qualifier:

          layout-qualifier interface-qualifier ;

       Or, they can appear with an individual variable declared with
       an interface qualifier:

          layout-qualifier interface-qualifier declaration ;"

    From Section 4.10 (Memory Qualifiers) of the GLSL 4.60 spec:

       "Layout qualifiers cannot be used on formal function parameters,
       and layout qualification is not included in parameter matching."

However, on the Nvidia binary driver shaders actually fail to compile
if image function params don't have a layout qualifier. This results
in applications such as No Mans Sky using layout qualifiers on params.

I've submitted a CTS test to expose this problem in the Nvidia driver
but until that is resolved this patch will help Mesa drivers work
around the issue.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-30 09:54:40 +10:00
Timothy Arceri
28a3731e3f glsl: skip stringification in preprocessor if in unreachable branch
This fixes compilation of some "No Mans Sky" shaders where the stringification
happens in branches intended for DX12.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-30 09:51:57 +10:00
Bas Nieuwenhuizen
4738b6ac81 radv: Add missing checks in radv_get_image_format_properties.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-30 01:21:20 +02:00
Dave Airlie
bb17ae49ee gallivm: allow to pass two swizzles into fetches.
This hijacks the top 16-bits of swizzle, to pass in the swizzle
for the second channel.

This fixes handling .yx swizzles of 64-bit values.

This should fixup radeonsi and llvmpipe.
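
As a rough illustration of the packing described above, here is a minimal
sketch. The helper names and the exact bit layout are assumptions made for
this example; the message only says that the top 16 bits of the swizzle
carry the swizzle for the second channel.

```python
from typing import Tuple

# Assumed layout: low 16 bits = first channel swizzle,
# high 16 bits = second channel swizzle (hypothetical helpers).
SWIZZLE_MASK = 0xFFFF


def pack_swizzles(first: int, second: int) -> int:
    """Combine two channel swizzles into a single integer."""
    assert 0 <= first <= SWIZZLE_MASK and 0 <= second <= SWIZZLE_MASK
    return first | (second << 16)


def unpack_swizzles(packed: int) -> Tuple[int, int]:
    """Split a packed value back into (first, second) swizzles."""
    return packed & SWIZZLE_MASK, (packed >> 16) & SWIZZLE_MASK
```

A 64-bit component spans two 32-bit channels, so a fetch needs both
swizzles at once; packing them lets the existing single-integer argument
carry the pair.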

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107524
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-30 00:15:40 +01:00
Timothy Arceri
3bcec6cf1c radeonsi: enable radeonsi_zerovram for No Mans Sky
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-30 07:57:38 +10:00
Timothy Arceri
5566dd8a61 radeonsi: add radeonsi_zerovram driconfig option
More and more games seem to require this so let's make it a config
option.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-30 07:57:38 +10:00
Timothy Arceri
406c3d748d radeonsi: enable GL 4.5 in compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-30 07:57:38 +10:00
Timothy Arceri
781a78914c mesa: enable ARB_direct_state_access in compat for GL3.1+
We could enable it for lower versions of GL but this allows us
to just use the existing version/extension checks that are already
used by the core profile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-30 07:57:38 +10:00
Marek Olšák
93b8b987d0 radeonsi: add a thorough clear/copy_buffer benchmark
2018-08-29 15:31:42 -04:00
Marek Olšák
5914f5bd4a radeonsi: let internal compute dispatches tune WAVES_PER_SH
2018-08-29 15:31:42 -04:00
Marek Olšák
c5442c1165 radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI
2018-08-29 15:31:42 -04:00
Marek Olšák
d7250e4304 radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA_SI for measuring DMA on SI
DMA on SI doesn't support the timestamp packet, so it's emulated.
2018-08-29 15:31:42 -04:00
Marek Olšák
c359880d8b radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA for measuring SDMA performance
2018-08-29 15:31:42 -04:00
Marek Olšák
0c5429cc73 radeonsi: add flag L2_STREAM for minimal cache usage
2018-08-29 15:31:41 -04:00
Marek Olšák
8f6e06d160 gallium: add TGSI_MEMORY_STREAM_CACHE_POLICY
For internal radeonsi shaders.
2018-08-29 15:31:41 -04:00
Jason Ekstrand
d8033d4083 intel/compiler: Remove surface_idx from brw_image_param
Now that the drivers are lowering to surface indices themselves, we no
longer need to push the surface index into the shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:03 -05:00
Jason Ekstrand
3cbc02e469 intel: Use TXS for image_size when we have a typed surface
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:03 -05:00
Jason Ekstrand
09f1de97a7 anv,i965: Lower away image derefs in the driver
Previously, the back-end compiler turned image access into magic uniform
reads and there was a complex contract between back-end compiler and
driver about setting up and filling out those params.  As of this
commit, both drivers now lower image_deref_load_param_intel intrinsics
to load_uniform intrinsics controlled by the driver and lower the other
image_deref_* intrinsics to image_* intrinsics which take an actual
binding table index.  There are still "magic" uniforms but they are now
added and controlled entirely by the driver and that contract no longer
spans components.

This also has the side-effect of making most image use compile-time
binding table indices.  Previously, all image access pulled the binding
table index from a uniform.  Part of the reason for this was that the
magic uniforms made it difficult to decouple binding table indices from
the uniforms and, since they are indexed completely differently
(especially in Vulkan), it was hard to pull them apart.  Now that the
driver is handling both, it's trivial to decouple the two and provide
actual binding table indices.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166872 -> 15164293 (-0.02%)
    instructions in affected programs: 115834 -> 113255 (-2.23%)
    helped: 191
    HURT: 0

    total cycles in shared programs: 571311495 -> 571196465 (-0.02%)
    cycles in affected programs: 4757115 -> 4642085 (-2.42%)
    helped: 73
    HURT: 67

    total spills in shared programs: 10951 -> 10926 (-0.23%)
    spills in affected programs: 742 -> 717 (-3.37%)
    helped: 7
    HURT: 0

    total fills in shared programs: 22226 -> 22201 (-0.11%)
    fills in affected programs: 1146 -> 1121 (-2.18%)
    helped: 7
    HURT: 0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:03 -05:00
Jason Ekstrand
0de003be03 nir: Add handle/index-based image intrinsics
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
3942943819 nir: Use a bitfield for image access qualifiers
This commit expands the current memory access enum to contain the extra
two bits provided for images.  We choose to follow the SPIR-V convention
of NonReadable and NonWriteable because readonly implies that you *can*
read so readonly + writeonly doesn't make as much sense as NonReadable +
NonWriteable.
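
A minimal sketch of why the negative form composes better as a bitfield.
The class name and numeric values are illustrative assumptions, not the
actual NIR definitions:

```python
from enum import IntFlag


class ImageAccess(IntFlag):
    # Illustrative values only; the real enum layout is not shown
    # in the commit message.
    COHERENT = 1
    RESTRICT = 2
    VOLATILE = 4
    NON_READABLE = 8
    NON_WRITEABLE = 16


def can_read(access: ImageAccess) -> bool:
    return not (access & ImageAccess.NON_READABLE)


def can_write(access: ImageAccess) -> bool:
    return not (access & ImageAccess.NON_WRITEABLE)
```

With these bits, "readonly" is NON_WRITEABLE, "writeonly" is NON_READABLE,
and setting both simply means neither reads nor writes are allowed, with no
contradiction between the flags.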

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
48e4fa7dd8 glsl/link,i965: Make ImageAccess four-state
The GLSL spec allows you to set both the "readonly" and "writeonly"
qualifiers on images to indicate that it can only be used with
imageSize.  However, we had no way of representing this in the linked
shader and flagged it as GL_READ_ONLY.  This is good from a "does it use
this buffer?" perspective but not from a format and access lowering
perspective.  By using GL_NONE when "readonly" and "writeonly" are
both set, we can detect this case in the driver and handle it correctly.

Nothing currently relies on the type of surface in the "readonly" +
"writeonly" case but that's about to change.  i965 is the only driver
which uses the ImageAccess field and gl_bindless_image::access is
currently unused.
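
The four states can be sketched as follows. The function name is
hypothetical; the GL enum values are the standard ones:

```python
GL_NONE = 0x0000
GL_READ_ONLY = 0x88B8
GL_WRITE_ONLY = 0x88B9
GL_READ_WRITE = 0x88BA


def image_access(readonly: bool, writeonly: bool) -> int:
    """Map the two GLSL qualifiers to a four-state access value."""
    if readonly and writeonly:
        return GL_NONE  # only imageSize() is usable
    if readonly:
        return GL_READ_ONLY
    if writeonly:
        return GL_WRITE_ONLY
    return GL_READ_WRITE
```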

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
4289143899 intel/compiler: Use two components for 1D array image sizes
Having the array length component stored in .z was a small convenience
for the ISL image param filling code and an annoyance in the NIR
lowering code.  The only convenience of treating 1D arrays like 2D
arrays in the lowering code is in the address calculation code so let's
put all the complexity there as well.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
b1c414ef28 isl: Use the view array length for the image size
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
37f7983bcc intel/compiler: Do image load/store lowering to NIR
This commit moves our storage image format conversion codegen into NIR
instead of doing it in the back-end.  This has the advantage of letting
us run it through NIR's optimizer which is pretty effective at shrinking
things down.  In the common case of rgba8, the number of instructions
emitted after NIR is done with it is half of what it was with the
lowering happening in the back-end.  On the downside, the back-end's
lowering is able to directly use predicates and the NIR lowering has to
use IFs.
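
For reference, the kind of conversion being lowered looks roughly like
this for rgba8.  This is a plain-Python sketch of UNORM packing under the
usual definition, with hypothetical helper names, not the NIR code itself:

```python
def pack_unorm_4x8(rgba):
    """Pack four floats into one 32-bit word, 8 bits per channel."""
    packed = 0
    for i, c in enumerate(rgba):
        c = min(max(c, 0.0), 1.0)           # clamp to [0, 1]
        packed |= int(c * 255.0 + 0.5) << (8 * i)  # round to nearest
    return packed


def unpack_unorm_4x8(packed):
    """Inverse of pack_unorm_4x8: four floats in [0, 1]."""
    return tuple(((packed >> (8 * i)) & 0xFF) / 255.0 for i in range(4))
```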

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166910 -> 15166872 (<.01%)
    instructions in affected programs: 5895 -> 5857 (-0.64%)
    helped: 15
    HURT: 0

Clearly, we don't have that much image_load_store happening in the
shaders in shader-db....

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
b217705dec nir/types: Add a wrapper for coordinate_components
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
f2d0a2b110 anv/pipeline: Remove dead image loads in lower_input_attacnments
Dead code will get rid of them eventually but it's better if they're
just gone so we guarantee they won't trip up later passes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
15d39f474b nir: Make image load/store intrinsics variable-width
Instead of requiring 4 components, this allows them to potentially use
fewer.  Both the SPIR-V and GLSL paths still generate vec4 intrinsics so
drivers which assume 4 components should be safe.  However, we want to
be able to shrink them for i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
7cdf8f9339 nir/format_convert: Fix a bitmask in unpack_11f11f10f
Fixes: 4e337b42f9 "nir/format_convert: Add pack/unpack for R11F_G11F_B10F"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
1f7be4968f nir/format_convert: Rename pack_r11g11b10f to pack_11f11f10f
This matches the unpack function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
7bd0363d6f nir/format_convert: Add [us]norm conversion helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
152fdeddbb nir/format_convert: Rename nir_format_bitcast_uint_vec
We have a name for that, it's called a uvec.  This just makes the
function name a bit shorter.  While we're here, we also add an assert
for one of the assumptions this function makes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
7c5df52bdc nir/format_convert: Add vec mask and sign-extend helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
ea4f200864 nir/format_convert: Add support for unpacking signed integers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
80c424148b nir/opcodes: Make unpack_half_2x16_split_* variable-width
There is nothing inherent about these opcodes that requires them to only
take scalars.  It's very convenient if we let them take vectors as well.
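
A small sketch of the vector form, assuming split_x reads the low 16 bits
and split_y the high 16 bits of each 32-bit component.  Lists stand in for
NIR vectors here:

```python
import struct


def _half_bits_to_float(bits):
    # Reinterpret a 16-bit pattern as an IEEE 754 half float.
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def unpack_half_2x16_split_x(src):
    """Per component: convert the low 16 bits to a float."""
    return [_half_bits_to_float(v & 0xFFFF) for v in src]


def unpack_half_2x16_split_y(src):
    """Per component: convert the high 16 bits to a float."""
    return [_half_bits_to_float((v >> 16) & 0xFFFF) for v in src]
```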

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
d448fa3ae3 nir/algebraic: Add some max/min optimizations
Found by inspection.  This doesn't help much now but we'll see this
pattern with images if you load UNORM and then store UNORM.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166916 -> 15166910 (<.01%)
    instructions in affected programs: 761 -> 755 (-0.79%)
    helped: 6
    HURT: 0

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
4dd5263663 nir/algebraic: Add more extract_[iu](8|16) optimizations
This adds the "(a << N) >> M" family of mask or sign-extensions.  Not a
huge win right now but this pattern will soon be generated by NIR format
lowering code.
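
To make the pattern concrete, here is a sketch in plain Python of the
32-bit identities involved.  The helper names mimic NIR opcodes but are
local to this example:

```python
MASK32 = 0xFFFFFFFF


def shl32(a, n):
    return (a << n) & MASK32


def ushr32(a, n):
    return (a & MASK32) >> n


def ishr32(a, n):
    # Arithmetic shift: reinterpret as signed, shift, wrap back to 32 bits.
    a &= MASK32
    if a & 0x80000000:
        a -= 1 << 32
    return (a >> n) & MASK32


def extract_u8(a, i):
    return ((a & MASK32) >> (8 * i)) & 0xFF


def extract_i8(a, i):
    b = extract_u8(a, i)
    return (b - 0x100 if b & 0x80 else b) & MASK32
```

For byte i, shifting left by 24 - 8*i and then right by 24 yields
extract_u8 with a logical shift and extract_i8 with an arithmetic shift.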

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166918 -> 15166916 (<.01%)
    instructions in affected programs: 36 -> 34 (-5.56%)
    helped: 2
    HURT: 0

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
116b47fe3c nir/algebraic: Be more careful converting ushr to extract_u8/16
If it's not the right bit-size, it may not actually be the correct
extraction.  For now, we'll only worry about 32-bit versions.
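
A quick counter-example in plain Python shows the problem; the helpers
are local stand-ins for the opcodes:

```python
def ushr(a, n, bit_size):
    """Logical right shift of an unsigned value of the given width."""
    return (a & ((1 << bit_size) - 1)) >> n


def extract_u8(a, i):
    return (a >> (8 * i)) & 0xFF


a32 = 0xAABBCCDD
a64 = 0x11223344AABBCCDD

same_for_32 = ushr(a32, 24, 32) == extract_u8(a32, 3)
same_for_64 = ushr(a64, 24, 64) == extract_u8(a64, 3)
```

For the 32-bit value the shift by 24 leaves exactly the top byte, so the
two agree; for the 64-bit value the shift leaves five bytes, so they do not.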

Fixes: 905ff86198 "nir: Recognize open-coded extract_u16"
Fixes: 76289fbfa8 "nir: Recognize open-coded extract_u8"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Sagar Ghuge
40fc4b5acd intel/tools: new i965_disasm tool
Adds a new i965 instruction disassembler tool.

v2: 1) fix a few nits (Matt Turner)
    2) Remove i965_disasm header (Matt Turner)

v3: 1) Redirect output to correct file descriptors (Matt Turner)
    2) Refactor code (Matt Turner)
    3) Use better formatting style (Matt Turner)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2018-08-29 11:19:55 -07:00
Kenneth Graunke
8fb966688b st/mesa: Disable blending for integer formats.
Blending isn't valid for integer formats.  Rather than having drivers
worry about this, just disable blending in this case.  This hopefully
will increase hits in the CSO cache as well, by eliminating most of the
meaningless fields in this case.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-29 10:51:11 -07:00
Brian Paul
18e9b4791b svga: add missing switch cases for shadow textures
This doesn't seem to make any difference in testing, but it fixes a
failed assertion when dumping sm3 shaders.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-08-29 11:29:07 -06:00
Brian Paul
fb7e462c97 svga: fix vgpu9 sprite coordinate bug
Setting GL_POINT_SPRITE_COORD_ORIGIN to GL_LOWER_LEFT did not work for
vgpu9.  We can use the rasterizer sprite_coord_enable bitfield as-is.
We need to index into it using the TGSI semantic index, not the
register index.

This fixes the Piglit fbo-gl_pointcoord and glsl-fs-pointcoord tests.

Testing done: Piglit, Mesa sprite demos

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-08-29 11:29:07 -06:00
Brian Paul
8331d69a87 svga: fix PIPE_TEXTURE_RECT/BUFFER const buffer issue
The flag_rect and flag_buffer fields didn't sufficiently capture
the state changes needed for those resource types.  For example,
if a texture binding was changed from a 500x500 rect texture to a
400x400 rect texture we didn't set SVGA_NEW_TEXTURE_CONSTS.  But
we need to do that to emit the new texcoord scale factors to the
constant buffers.  Rather than track the sizes of all bound
resources, just set the flag if the resource is a rect.  Same
story with texture buffers.

Also, since rect/buffer textures are usable with VS/GS shaders,
add SVGA_NEW_TEXTURE_CONSTS to the flags we check for emitting
VS/GS constants.

This seems to help with XFCE / xfwm4 desktop scaling.
VMware issue 2156696.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-08-29 11:29:07 -06:00
Brian Paul
46c7433da8 svga: minor improvements in svga_state_constants.c
Add const qualifiers.  Add 'f' suffix on floats to avoid double
promotion.

Remove unneeded shader type assertion since the switch statement
handled it already.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-08-29 11:29:07 -06:00
Jason Ekstrand
cdea5d996e anv: Free the app and engine name
Fixes: 8c048af589 "anv: Copy the appliation info into the instance"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-29 11:24:57 -05:00
Rhys Kidd
f7d0c112cb nv50/ir: silence partitionLoadStore() unused function warning
Move this now-unused function into the existing comment block, which was its only prior use.

../../../../../src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:2645:1: warning:
      unused function 'partitionLoadStore' [-Wunused-function]
partitionLoadStore(uint8_t comp[2], uint8_t size[2], uint8_t mask)

Fixes: 86e4440361 ("nouveau: codegen: Disable more old resource handling code")
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-08-29 08:59:27 -04:00
vadym.shovkoplias
966a797e43 glsl/linker: Link all out vars from a shader objects on a single stage
During intra stage linking some out variables can be dropped because
they are not used in a shader with the main function. But these out vars
can be referenced in later stages, which can lead to further linking
errors.

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
2018-08-29 20:03:56 +10:00
Lionel Landwerlin
5a1c23d150 anv: blorp: support multiple aspect blits
Newer blit tests are enabling depth & stencil blits. We currently don't
support them but can do so by iterating over the aspect masks (copying
some logic from the CopyImage function).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9f44745eca ("anv: Use blorp to implement VkBlitImage")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 10:31:06 +01:00
Tapani Pälli
a72dbc461b mesa: allow GL_UNSIGNED_BYTE type for SNORM reads
OpenGL ES spec states:
   "For normalized fixed-point rendering surfaces, the combination format
    RGBA and type UNSIGNED_BYTE is accepted."

This fixes following failing VK-GL-CTS tests:

   KHR-GLES3.packed_pixels.pbo_rectangle.rgba8_snorm
   KHR-GLES3.packed_pixels.rectangle.rgba8_snorm
   KHR-GLES3.packed_pixels.varied_rectangle.rgba8_snorm

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107658
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Andres Gomez <agomez@igalia.com>
2018-08-29 09:26:23 +03:00
Timothy Arceri
5db981952a nir: add loop unroll support for wrapper loops
This adds support for unrolling the classic

    do {
        // ...
    } while (false)

that is used to wrap multi-line macros. GLSL IR also wraps switch
statements in a loop like this.

shader-db results IVB:

total loops in shared programs: 2515 -> 2512 (-0.12%)
loops in affected programs: 33 -> 30 (-9.09%)
helped: 3
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 16:02:05 +10:00
Timothy Arceri
0f450b57a1 nir/opt_loop_unroll: Remove unneeded phis if we make progress
Now that SSA values can be derefs and they have special rules, we have
to be a bit more careful about our LCSSA phis.  In particular, we need
to clean up in case LCSSA ended up creating a phi node for a deref.
This avoids validation issues with some CTS tests with the following
patch, but it's possible we could also see the same problem with
the existing unrolling passes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 16:02:05 +10:00
Timothy Arceri
5a6b04d94b nir: add complex_loop bool to loop info
In order to be sure loop_terminator_list is an accurate
representation of all the jumps in the loop we need to be sure we
didn't encounter any other complex behaviour such as continues,
nested breaks, etc during analysis.

This will be used in the following patch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 16:02:05 +10:00
Timothy Arceri
fef6325e58 nir: always attempt to find loop terminators
This will help later patches with unrolling loops that end with a
break, i.e. loops that always exit on their first iteration.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 16:02:05 +10:00
Marek Olšák
1e40f69483 ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VI
This fixes VM faults and corruption.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-28 19:51:51 -04:00
Ian Romanick
c836326a29 i965/vec4: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1
No shader-db changes on any Intel platform.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-28 15:35:50 -07:00
Ian Romanick
c856403868 i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for imageAtomicAdd of +1 or -1
v2: Refactor selection of atomic opcode to a separate function.
Suggested by Jason.

No changes on any other Intel platforms.

Skylake
total instructions in shared programs: 14304261 -> 14304241 (<.01%)
instructions in affected programs: 1625 -> 1605 (-1.23%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 5.00 x̃: 5
helped stats (rel) min: 1.01% max: 14.29% x̄: 5.86% x̃: 4.07%
95% mean confidence interval for instructions value: -10.66 0.66
95% mean confidence interval for instructions %-change: -15.91% 4.19%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 527531226 -> 527531194 (<.01%)
cycles in affected programs: 92204 -> 92172 (-0.03%)
helped: 2
HURT: 0

Haswell and Broadwell had similar results. (Broadwell shown)
total instructions in shared programs: 14615730 -> 14615710 (<.01%)
instructions in affected programs: 1838 -> 1818 (-1.09%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 5.00 x̃: 5
helped stats (rel) min: 0.89% max: 13.04% x̄: 5.37% x̃: 3.78%
95% mean confidence interval for instructions value: -10.66 0.66
95% mean confidence interval for instructions %-change: -14.59% 3.85%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-28 15:35:46 -07:00
Ian Romanick
b6e247cf0e i965/fs: Refactor image atomics to be a bit more like other atomics
This greatly simplifies the next patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-28 15:35:46 -07:00
Ian Romanick
fabe3ead57 i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1
Funny story... a single shader was hurt for instructions, spills, fills.
That same shader was also the most helped for cycles.  #GPUsAreWeird

No changes on any other Intel platform.

v2: Refactor selection of atomic opcode to a separate function.
Suggested by Jason.
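
The selection logic amounts to something like the sketch below.  The
function name and the use of None for a non-constant operand are
assumptions for illustration; only the opcode names come from the driver:

```python
def select_atomic_add_op(operand):
    """Pick an atomic opcode for atomicAdd.

    operand is the compile-time constant addend, or None when the
    addend is not a constant.
    """
    if operand == 1:
        return "BRW_AOP_INC"
    if operand == -1:
        return "BRW_AOP_DEC"
    return "BRW_AOP_ADD"
```

INC and DEC take no data operand, which is presumably where the
instruction savings in the results below come from.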

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14304116 -> 14304261 (<.01%)
instructions in affected programs: 12776 -> 12921 (1.13%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 2.32 x̃: 1
helped stats (rel) min: 0.05% max: 7.27% x̄: 0.92% x̃: 0.55%
HURT stats (abs)   min: 189 max: 189 x̄: 189.00 x̃: 189
HURT stats (rel)   min: 4.87% max: 4.87% x̄: 4.87% x̃: 4.87%
95% mean confidence interval for instructions value: -12.83 27.33
95% mean confidence interval for instructions %-change: -1.57% 0.31%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 527552861 -> 527531226 (<.01%)
cycles in affected programs: 1459195 -> 1437560 (-1.48%)
helped: 16
HURT: 2
helped stats (abs) min: 2 max: 21328 x̄: 1353.69 x̃: 6
helped stats (rel) min: 0.01% max: 5.29% x̄: 0.36% x̃: 0.03%
HURT stats (abs)   min: 12 max: 12 x̄: 12.00 x̃: 12
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -3699.81 1295.92
95% mean confidence interval for cycles %-change: -0.94% 0.30%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8025 -> 8033 (0.10%)
spills in affected programs: 208 -> 216 (3.85%)
helped: 1
HURT: 1

total fills in shared programs: 10989 -> 11040 (0.46%)
fills in affected programs: 444 -> 495 (11.49%)
helped: 1
HURT: 1

Ivy Bridge
total instructions in shared programs: 11709181 -> 11709153 (<.01%)
instructions in affected programs: 3505 -> 3477 (-0.80%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 23 x̄: 9.33 x̃: 4
helped stats (rel) min: 0.11% max: 1.16% x̄: 0.63% x̃: 0.61%

total cycles in shared programs: 254741126 -> 254738801 (<.01%)
cycles in affected programs: 919067 -> 916742 (-0.25%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 2144 x̄: 775.00 x̃: 160
helped stats (rel) min: 0.03% max: 0.90% x̄: 0.32% x̃: 0.03%

total spills in shared programs: 4536 -> 4533 (-0.07%)
spills in affected programs: 40 -> 37 (-7.50%)
helped: 1
HURT: 0

total fills in shared programs: 4819 -> 4813 (-0.12%)
fills in affected programs: 94 -> 88 (-6.38%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-28 15:35:38 -07:00
Ian Romanick
41399f4bc7 intel/compiler: Silence unused parameter warnings in brw_eu.h
All of the other brw_*_desc functions take a devinfo parameter, and all
of the others at least have an assert that uses it.  Keep the parameter,
but mark it as unused.

Silences 37 warnings like:

In file included from src/intel/common/gen_disasm.c:27:0:
src/intel/compiler/brw_eu.h: In function ‘brw_pixel_interp_desc’:
src/intel/compiler/brw_eu.h:377:53: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_pixel_interp_desc(const struct gen_device_info *devinfo,
                                                     ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-28 15:35:38 -07:00
Sagar Ghuge
56574f4df3 i965: enable AMD_depth_clamp_separate
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-28 12:57:27 -07:00
Sagar Ghuge
e6adea0dc0 i965: add functional changes for AMD_depth_clamp_separate
Gen >= 9 has the ability to control clamping of depth values separately at
the near and far planes.

z_w is clamped to the range [min(n,f), 0] if clamping at the near plane is
enabled, [0, max(n,f)] if clamping at the far plane is enabled and [min(n,f),
max(n,f)] if clamping at both planes is enabled.

v2: 1) Use better coding style (Ian Romanick)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-28 12:57:27 -07:00
Sagar Ghuge
2765749e0f mesa: add EXTRA_EXT for AMD_depth_clamp_separate
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-28 12:57:27 -07:00
Sagar Ghuge
2770446740 mesa: add support for GL_AMD_depth_clamp_separate tokens
_mesa_set_enable() and _mesa_IsEnabled() extended to accept two new
tokens, GL_DEPTH_CLAMP_NEAR_AMD and GL_DEPTH_CLAMP_FAR_AMD.

v2: Remove unnecessary parentheses (Marek Olsak)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-28 12:57:27 -07:00
Sagar Ghuge
5650d39978 mesa: Add support for AMD_depth_clamp_separate
Enable _mesa_PushAttrib() and _mesa_PopAttrib() to handle
GL_DEPTH_CLAMP_NEAR_AMD and GL_DEPTH_CLAMP_FAR_AMD tokens.

Remove DepthClamp, because DepthClampNear + DepthClampFar replaces it,
as suggested by Marek Olsak.

A driver that enables AMD_depth_clamp_separate will only ever look at
DepthClampNear and DepthClampFar, as suggested by Ian Romanick.

v2: 1) Remove unnecessary parentheses (Marek Olsak)
    2) if AMD_depth_clamp_separate is unsupported, TEST_AND_UPDATE
       GL_DEPTH_CLAMP only (Marek Olsak)
    3) Clamp against near and far plane separately (Marek Olsak)
    4) Clip point separately for near and far Z clipping plane (Marek
       Olsak)

v3: Clamp raster position zw to the range [min(n,f), 0] for near plane
    and [0, max(n,f)] for far plane (Marek Olsak)

v4: Use MIN2 and MAX2 instead of CLAMP (Marek Olsak)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-28 12:57:27 -07:00
Sagar Ghuge
379949b967 mesa: Add types for AMD_depth_clamp_separate.
Add some basic types and storage for the AMD_depth_clamp_separate
extension.

v2: 1) Drop unnecessary definition (Marek Olsak)
    2) Expose extension in compatibility profile (Marek Olsak)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-28 12:57:27 -07:00
Sagar Ghuge
f663fb5487 glapi: define AMD_depth_clamp_separate
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-28 12:57:27 -07:00
Jason Ekstrand
c92a463d23 anv: Claim to support depthBounds for ID games
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-28 13:05:54 -05:00
Jason Ekstrand
8c048af589 anv: Copy the appliation info into the instance
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-28 13:05:54 -05:00
Jason Ekstrand
4ffb575da5 vulkan/alloc: Add a vk_strdup helper
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-28 13:05:54 -05:00
Dylan Baker
7c00db9527 meson: Actually load translation files
Currently we run the script but don't actually load any files, even in a
tarball where they exist.

Fixes: 3218056e0e
       ("meson: Build i965 and dri stack")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-28 08:51:05 -07:00
Caio Marcelo de Oliveira Filho
f172a77dd8 nir: Remove outdated comment
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-28 08:11:03 -07:00
Kevin Rogovin
03ecec9ed2 i965: Add INTEL_fragment_shader_ordering support.
Adds support for INTEL_fragment_shader_ordering. We achieve
the fragment ordering by using the same instruction as for
beginInvocationInterlockARB(), namely by issuing a memory
fence via sendc.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-08-28 17:15:10 +03:00
Kevin Rogovin
119435c877 mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering
This extension provides a new GLSL built-in function,
beginFragmentShaderOrderingINTEL(), that guarantees
(taking the wording of the GL_INTEL_fragment_shader_ordering
extension) that any memory transactions issued by
shader invocations from previous primitives mapped to
the same xy window coordinates (and the same sample when
per-sample shading is active) complete and are visible
to the shader invocation that called
beginFragmentShaderOrderingINTEL().

One advantage of INTEL_fragment_shader_ordering over
ARB_fragment_shader_interlock is that it provides a
function that operates as a memory barrier (instead
of defining a critical section) that can be called
under arbitrary control flow from any function (in
contrast, the begin/end of ARB_fragment_shader_interlock
may only be called once, from main(), under no control
flow).

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-08-28 17:15:10 +03:00
Andrii Simiklit
1b0df8a460 i965/gen6/xfb: handle case where transform feedback is not active
When SVBI Payload Enable is false, I guess register R1.4, which
contains the Maximum Streamed Vertex Buffer Index, is filled with zero,
so the GS stops writing transform feedback when transform feedback
is not active.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107579
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-08-28 15:32:45 +02:00
Rhys Perry
743e11c10b docs: add forgotten features to 18.2.0 release notes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 18.2: <mesa-stable@lists.freedesktop.org>
2018-08-28 13:50:51 +01:00
Erik Faye-Lund
a4e60ccb56 virgl: add debug-switch to output TGSI
This is quite useful for debugging shader-transpiling issues in
virglrenderer.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-08-28 14:13:43 +02:00
Erik Faye-Lund
4ab06cc56e virgl: introduce $VIRGL_DEBUG=verbose
This adds an environment variable that can be used for driver-specific
flags, as well as a flag for it to enable verbose output.

While we're at it, quiet some overly chatty debug-output by default.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-08-28 14:13:43 +02:00
Erik Faye-Lund
1b2444dffc virgl: replace fprintf-call with debug_printf
This is the only direct call-site for fprintf in virgl; all other
call-sites call debug_printf instead. So let's follow in style here.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-08-28 14:13:43 +02:00
Erik Faye-Lund
2ebfa90abe virgl: delete commented out fprintf-call
This is just debug-cruft left over. Let's just get rid of it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-08-28 14:13:43 +02:00
Guido Günther
9de34b4dde meson: Don't enable any vulkan drivers on arm, aarch64
There's no Vulkan support for arm atm.

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-27 11:32:04 -07:00
Guido Günther
05e2fc6860 meson: Be a bit more helpful when arch or OS is unknown
V2: Add one missing @0@

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-27 11:31:52 -07:00
Sagar Ghuge
a1e3305f75 intel/eu: print bytes instead of 32 bit hex value
INTEL_DEBUG=hex prints 32 bit hex value and due to endianness of CPU
byte order is reversed. In order to disassemble binary files, print
each byte instead of 32 bit hex value.
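
The byte-order point can be sketched in C as follows. This is only an
illustration, not the code from this commit: format_word_bytes is a
hypothetical helper, and the four-byte word size is an assumption made
for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format one 32-bit word as four bytes in memory
 * order. Printing the word with "%08x" instead would show the bytes
 * reversed on a little-endian CPU. */
static int format_word_bytes(char *buf, size_t len, uint32_t word)
{
    uint8_t bytes[sizeof word];

    /* Copy out the in-memory representation of the word. */
    memcpy(bytes, &word, sizeof word);
    return snprintf(buf, len, "%02x %02x %02x %02x",
                    bytes[0], bytes[1], bytes[2], bytes[3]);
}
```

Output produced this way matches the byte order of a binary file, whatever
the endianness of the machine that printed it.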

v2: Print blank spaces in order to vertically align output of compacted
    instructions hex value with uncompacted instructions hex value.
    (Matt Turner)

v3: Fix line wrap at correct length

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-08-27 11:07:39 -07:00
Lionel Landwerlin
440a988bd1 intel: decoder: handle 0 sized structs
Gen7.5 has a BLEND_STATE of size 0 which includes a variable length
group. We did not deal with that very well, leading to an endless
loop.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-27 18:33:18 +01:00
Rhys Perry
e56e600bd3 nv50/ir,nvc0: use constant buffers for compute when possible on Kepler+
Gives a +7.79% increase in FPS with Hitman on lowest quality settings on
my GTX 1060.

total instructions in shared programs : 5787979 -> 5748677 (-0.68%)
total gprs used in shared programs    : 669901 -> 669373 (-0.08%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21068 -> 21064 (-0.02%)

                local     shared        gpr       inst      bytes
    helped           1           0         152         274         274
      hurt           0           0           0           0           0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 14:23:42 +01:00
Rhys Perry
d27c791891 nv50/ir: optimize multiplication by 16-bit immediates into two xmads
Rather than the usual three that would be created.
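
The arithmetic behind the two-versus-three count can be sketched in plain
C. This models only the 16-bit partial products; it does not model the
actual XMAD instruction semantics, and both function names are
hypothetical.

```c
#include <stdint.h>

/* General 32x32 -> low 32 bits multiply from 16-bit halves:
 * three 16x16 products are needed (ah*bh only affects bits >= 32). */
static uint32_t mul_via_16bit(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xffff, ah = a >> 16;
    uint32_t bl = b & 0xffff, bh = b >> 16;

    return al * bl + ((al * bh + ah * bl) << 16);
}

/* With a 16-bit immediate the high half of b is zero, so the al*bh
 * product disappears and two products remain. */
static uint32_t mul_imm16(uint32_t a, uint16_t imm)
{
    uint32_t al = a & 0xffff, ah = a >> 16;

    return al * imm + ((ah * imm) << 16);
}
```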

total instructions in shared programs : 5796385 -> 5786560 (-0.17%)
total gprs used in shared programs    : 670103 -> 669968 (-0.02%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21164 -> 21068 (-0.45%)

                local     shared        gpr       inst      bytes
    helped           1           0          64        1040        1040
      hurt           0           0          27           0           0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:57:11 +01:00
Rhys Perry
400a4eb964 nv50/ir: optimize near power-of-twos into shladd
total instructions in shared programs : 5819319 -> 5796385 (-0.39%)
total gprs used in shared programs    : 670571 -> 670103 (-0.07%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21164 -> 21164 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0           0         318        1758        1758
      hurt           0           0          63           0           0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:57:01 +01:00
Rhys Perry
2f52925f5c nv50/ir: move a * b -> a << log2(b) code into createMul()
With this commit, OP_MAD is handled on nv50 too. This commit is also
useful for later commits.

Also, instead of creating a shladd, it relies on LateAlgebraicOpt to
create one. This simplifies the code and helps shader-db slightly overall.
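
As a rough C model of the a * b -> a << log2(b) rewrite (the helper names
are hypothetical and this is not the createMul() code itself):

```c
#include <stdint.h>

/* True when b has exactly one bit set. */
static int is_pow2(uint32_t b)
{
    return b != 0 && (b & (b - 1)) == 0;
}

/* Index of the highest set bit; b must be non-zero. */
static unsigned log2_u32(uint32_t b)
{
    unsigned n = 0;

    while (b >>= 1)
        n++;
    return n;
}

/* Multiply, using a shift when the second operand is a power of two. */
static uint32_t mul_pow2(uint32_t a, uint32_t b)
{
    return is_pow2(b) ? a << log2_u32(b) : a * b;
}
```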

total instructions in shared programs : 5820882 -> 5819319 (-0.03%)
total gprs used in shared programs    : 670595 -> 670571 (-0.00%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21164 -> 21164 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0           0          18         230         230
      hurt           0           0           8         263         263

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:56:47 +01:00
Rhys Perry
b60bc7a4ab nv50/ir: optimize imul/imad to xmads
This hits the shader-db numbers a good bit, though a few xmads is way
faster than an imul or imad and the cost is mitigated by the next commit,
which optimizes many multiplications by immediates into shorter and less
register heavy instructions than the xmads.

total instructions in shared programs : 5768871 -> 5820882 (0.90%)
total gprs used in shared programs    : 669919 -> 670595 (0.10%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21068 -> 21164 (0.46%)

                local     shared        gpr       inst      bytes
    helped           0           0          38           0           0
      hurt           1           0         365        3076        3076

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:56:44 +01:00
Rhys Perry
bcbcdf8448 gm107/ir: add support for OP_XMAD on GM107+
v4: make the immediate field 16 bits
v5: don't ever emit h1 flags for immediates

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:56:41 +01:00
Rhys Perry
5d6952d2de nv50/ir: add preliminary support for OP_XMAD
v4: remove uint16_t(...)
v4: don't allow immediates outside [0,65535] in insnCanLoad()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:56:36 +01:00
vadym.shovkoplias
4a8444d5bc glsl/linker: Allow unused in blocks which are not declared on previous stage
From Section 4.3.4 (Inputs) of the GLSL 1.50 spec:

    "Only the input variables that are actually read need to be written
     by the previous stage; it is allowed to have superfluous
     declarations of input variables."

Fixes:
    * interstage-multiple-shader-objects.shader_test

v2:
  Update comment in ir.h since the usage of "used" field
  has been extended.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-27 12:13:53 +02:00
Jason Ekstrand
07a227f543 nir: Pull block_ends_in_jump into nir.h
We had two different implementations in different files.  May as well
have one and put it in nir.h.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-27 02:15:38 -05:00
Samuel Iglesias Gonsálvez
59a8e0dbf8 anv: Add support for protected memory properties on anv_GetPhysicalDeviceProperties2()
VkPhysicalDeviceProtectedMemoryProperties structure is new on Vulkan 1.1.

Fixes Vulkan CTS CL#2849.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-27 09:07:52 +02:00
Jason Ekstrand
aad501f15e intel/tools: Add 0x in front of a couple of hex values
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 18:47:08 -05:00
Jason Ekstrand
76b0e4d8c9 anv: Fill holes in the VF VUE to zero
This fixes a GPU hang in DOOM 2016 running under wine.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104809
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 18:47:08 -05:00
Kai Wasserbäch
b2313ef4a8 intel: tools: Fix aubinator_error's fprintf call (format-security)
The recent commit 4616639b49 introduced
the new function aubinator_error() which is a trivial wrapper around
fprintf() to STDERR. The call to fprintf() however is passed the message
msg directly:
  fprintf(stderr, msg);

This is a format-security violation and leads to an FTBFS with
-Werror=format-security (GCC 8):
  ../../../src/intel/tools/aubinator.c: In function 'aubinator_error':
  ../../../src/intel/tools/aubinator.c:74:4: error: format not a string literal and no format arguments [-Werror=format-security]
      fprintf(stderr, msg);
      ^~~~~~~

This patch fixes this trivially by introducing a catch-all "%s" format
argument.
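
A minimal sketch of the same pattern, using snprintf() so the result can
be inspected. format_error is a hypothetical stand-in, not the
aubinator_error() function itself; the equivalent call to stderr is
fprintf(stderr, "%s", msg).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the wrapper described above. */
static int format_error(char *buf, size_t len, const char *msg)
{
    /* "%s" is the format; msg is only data, so any '%' characters in
     * it are copied literally instead of being parsed as conversions. */
    return snprintf(buf, len, "%s", msg);
}
```

Passing msg as the format string instead would make a message such as
"100%d" read a nonexistent int argument, which is undefined behaviour.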

Fixes: 4616639b49 ("intel: tools: split aub parsing from aubinator")
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 16:52:12 +01:00
Jason Ekstrand
70de31d0c1 intel/batch_decoder: Print blend states properly
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 07:50:45 -05:00
Jason Ekstrand
cbd4bc1346 intel/batch_decoder: Fix dynamic state printing
Instead of printing addresses like everyone else, we were accidentally
printing the offset from state base address.  Also, state_map is a void
pointer so we were incrementing in bytes instead of dwords and every
state other than the first was wrong.
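
The stride problem can be illustrated as below. nth_state and its
parameters are hypothetical names for this sketch, not the decoder's
actual code. ISO C does not define arithmetic on void pointers; GCC
accepts it as an extension and steps in bytes, which is how a dword count
turns into a byte offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the i-th state in a packed array of states, where each state
 * is state_dwords dwords long. */
static const uint32_t *nth_state(const void *state_map, size_t i,
                                 size_t state_dwords)
{
    /* Cast before adding: arithmetic on uint32_t * advances in dwords. */
    return (const uint32_t *)state_map + i * state_dwords;
}
```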

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 07:50:43 -05:00
Jason Ekstrand
d1971be6ea intel/decoder: Print ISL formats for vertex elements
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 07:50:40 -05:00
Jason Ekstrand
2abd7ae189 intel/decoder: Clean up field iteration and fix sub-dword fields
First of all, setting iter->name in advance_field is unnecessary because
it gets set by gen_decode_field which gets called immediately after
advance_field in the one call-site.  Second, we weren't properly
initializing start_bit and end_bit in the initial condition of
gen_field_iterator_next so the first field of a struct would get printed
wrong if it doesn't start on the first bit.  This is fixed by adding an
iter_start_field helper which sets the field and also sets up the other
bits we need.  This fixes decoding of 3DSTATE_SBE_SWIZ.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 07:50:36 -05:00
Kenneth Graunke
1281608849 gallium: Split out PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE.
Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not
PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER.

Drivers for such hardware would like to advertise support for
ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp.

This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit,
changes the extension enable to be based on that, and enables it
in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP
(so they continue supporting this mode).
2018-08-24 17:25:36 -07:00
Lionel Landwerlin
f430a37fa7 intel: decoder: unify MI_BB_START field naming
The batch decoder looks for a field with a particular name to decide
whether an MI_BB_START leads into a second batch buffer level. Because
the names are different between Gen7.5/8 and the newer generation we
fail that test and keep on reading (invalid) instructions.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-24 23:10:08 +01:00
Dylan Baker
7f745c19c1 docs: Update calendar, news, relnotes for 18.1.7
2018-08-24 09:35:24 -07:00
Dylan Baker
82c2e7bf9e docs: Add mesa 18.1.7 notes
2018-08-24 09:34:03 -07:00
Dylan Baker
2d8569073e docs: Add mesa 18.1.7 docs
2018-08-24 09:33:59 -07:00
Andres Gomez
0d3bb146a8 docs: update calendar 18.2.0-rc4 is out, extend to 18.2.0-rc5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-08-24 18:58:00 +03:00
Kevin Rogovin
e345247092 docs/relnotes: Mark NV_fragment_shader_interlock support in i965
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-24 08:59:54 -05:00
Emil Velikov
081395e99d egl/drm: use gbm_dri_bo() wrapper
Remove the explicit cast, using the appropriate wrapper instead.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-08-24 11:53:24 +01:00
Emil Velikov
7b4269a5e0 egl/drm: use gbm_dri_surface() wrapper
Remove the explicit cast, using the appropriate wrapper instead.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-08-24 11:53:20 +01:00
Emil Velikov
7eb4a28d41 egl/drm: use gbm_dri_device() wrapper
Remove the explicit cast, using the appropriate wrapper instead.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-08-24 11:52:48 +01:00
Emil Velikov
2c049384b1 egl/android: simplify device open/probe
Currently droid_probe_device does not do any 'probing'; it only filters
out a device if it doesn't match the vendor string given.

Rename the function, straighten the return type and call it only as
needed - an actual vendor string is provided.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:44 +01:00
Emil Velikov
2f8403a4ca egl/android: remove drmVersion::name NULL check
The name string is guaranteed to be non-NULL.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:41 +01:00
Emil Velikov
d1211f3112 egl/android: remove droid_probe_driver()
The function name is misleading - it effectively checks if
loader_get_driver_for_fd fails. Which can happen only on strdup
error - a close to impossible scenario.

Drop the function - we call the loader API at a later stage.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:39 +01:00
Emil Velikov
9b5bf7afce egl/android: use strcmp with drmVersion::name
The name string is guaranteed to be NULL terminated. Drop the explicit
length check that comes with strncmp().
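
A sketch of the resulting comparison, with hypothetical names (the real
call site is not reproduced here). It relies on both strings being
NUL-terminated, as stated above.

```c
#include <string.h>

/* Hypothetical helper: does the DRM driver name equal the requested
 * vendor string? strcmp() compares whole strings, so no explicit
 * length has to be passed or kept in sync. */
static int vendor_matches(const char *name, const char *vendor)
{
    return strcmp(name, vendor) == 0;
}
```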

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:37 +01:00
Emil Velikov
3827966643 egl/android: use drmDevice instead of the manual /dev/dri iteration
Replace the manual handling of /dev/dri in favor of the drmDevice API.
The latter provides a consistent way of enumerating the devices,
providing device details as needed.

v2:
 - Use ARRAY_SIZE (Frank)
 - s/famour/favor/ typo (Frank)
 - Make MAX_DRM_DEVICES a macro - fix vla errors (RobF)
 - Remove left-over dev_path instance (RobF)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com> (v1)
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:50:36 +01:00
Emil Velikov
cff80b6c15 Revert "configure: allow building with python3"
This reverts commit ae7898dfdb.

Turns out the python scripts are _not_ fully python 3 compatible.
As Ilia reported using get_xmlpool.py with LANG=C produces some weird
output - see the link for details.

Even though the issue was spotted with the autoconf build, it exposes a
genuine problem with the script (and lack of lang handling of the meson
build.)

https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
2018-08-24 11:14:15 +01:00
Emil Velikov
7a4d2d1fdf Revert "travis: use python3 for the autoconf builds"
This reverts commit 855af9a5a2.

Turns out the python scripts are _not_ fully python 3 compatible.
As Ilia reported using get_xmlpool.py with LANG=C produces some weird
output - see the link for details.

Even though the issue was spotted with the autoconf build, it exposes a
genuine problem with the script (and lack of lang handling of the meson
build.)

https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
2018-08-24 11:10:24 +01:00
Kenneth Graunke
93e8e17fa4 Revert "mesa: bump GL_MAX_ELEMENTS_INDICES and GL_MAX_ELEMENTS_VERTICES"
This reverts commit 095515e16c.

This breaks KHR-GL46.map_buffer_alignment.functional on i965.

This code was apparently not reviewed and I don't know why we would
move from a driver configurable constant to a hardcoded value for all
drivers.  This really looks like an accidental hack push.
2018-08-24 00:36:01 -07:00
Kenneth Graunke
9d670fd86c Revert recent changes about not including compute in combined limits.
As far as I can tell, no one reviewed these changes, they made i965
assert fail on driver load, and I am not certain they are correct.
(Hopefully reverting these does not break radeonsi too badly...)

The uniform related changes seem fine and reasonable, but the texture
image units change is possibly incorrect.  According to the
OES_tessellation_shader spec issue 5:

   (5) How are aggregate shader limits computed?

    RESOLVED: Following the GL 4.4 model, but we restrict uniform
    buffer bindings to 12/stage instead of 14, this results in

        MAX_UNIFORM_BUFFER_BINDINGS = 72
            This is 12 bindings/stage * 6 shader stages, allowing a static
            partitioning of the bindings even though at most 5 stages can
            appear in a program object).
        MAX_COMBINED_UNIFORM_BLOCKS = 60
            This is 12 blocks/stage * 5 stages, since compute shaders can't
            be mixed with other stages.
        MAX_COMBINED_TEXTURE_IMAGE_UNITS = 96
            This is 16 textures/stage * 6 stages.

which definitely is including compute shaders in that last limit.
Not including compute shaders breaks the following test:
dEQP-GLES31.functional.state_query.integer.max_combined_texture_image_units_getinteger
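
The arithmetic in the quoted spec issue can be checked with a few lines
of C. The names below are made up for this sketch; the numbers are the
ones from the quote.

```c
/* Stage counts from the quoted issue: at most 5 stages in one program
 * object, 6 shader stages in total once compute is counted. */
enum { PROGRAM_STAGES = 5, ALL_STAGES = 6 };

/* An aggregate limit is the per-stage limit times the stage count. */
static unsigned combined_limit(unsigned per_stage, unsigned stages)
{
    return per_stage * stages;
}
```

With these, combined_limit(12, ALL_STAGES) is 72,
combined_limit(12, PROGRAM_STAGES) is 60 and
combined_limit(16, ALL_STAGES) is 96. The last value is only reachable if
the compute stage is counted.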

There was enough breakage that I figured we should just send this back
to the drawing board.

Revert "i965: don't include compute resources in "Combined" limits"
Revert "st/mesa: don't include compute resources in "Combined" limits"
Revert "mesa: don't include compute resources in MAX_COMBINED_* limits"

This reverts commit b03dcb1e5f.
This reverts commit cff290df4c.
This reverts commit 45f87a48f9.
2018-08-24 00:36:01 -07:00
Roland Scheidegger
8e1be9a34a gallivm: don't use saturated unsigned add/sub intrinsics for llvm 8.0
These have been removed. Unfortunately auto-upgrade doesn't work for
jit. (Worse, it seems we don't get a compilation error anymore when
compiling the shader, rather llvm will just do a call to a null
function in the jitted shaders making it difficult to detect when
intrinsics vanish.)

Luckily the signed ones are still there. I helped convince llvm that
removing them is a bad idea for now, since while the unsigned ones have
sort of agreed-upon simplest patterns to replace them with, this is not
the case for the signed ones, and they require _significantly_ more
complex patterns - to the point that the recognition is IMHO probably
unlikely to ever work reliably in practice (due to other optimizations
interfering). (Even for the relatively trivial unsigned patterns, llvm
already added test cases where recognition doesn't work, unsaturated
add followed by saturated add may produce atrocious code.)
Nevertheless, it seems there's a serious quest to squash all
cpu-specific intrinsics going on, so I'd expect patches to nuke them as
well to resurface.

Adapt the existing fallback code to match the simple patterns llvm uses
and hope for the best. I've verified with lp_test_blend that it does
produce the expected saturated assembly instructions. Our
cmp/select build helpers don't use boolean masks, but that doesn't seem
to interfere with llvm's ability to recognize the pattern.
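
For reference, one common scalar formulation of unsigned saturating
add/sub as compare-and-select, written in C. This is an assumption about
the general shape of such patterns, not a copy of the gallivm fallback or
of the exact IR that llvm matches.

```c
#include <stdint.h>

/* Unsigned saturating add: if the wrapped sum is smaller than an
 * operand, the addition overflowed, so clamp to the maximum. */
static uint8_t uadd_sat_u8(uint8_t a, uint8_t b)
{
    uint8_t sum = (uint8_t)(a + b); /* wraps modulo 256 */

    return sum < a ? UINT8_MAX : sum;
}

/* Unsigned saturating subtract: clamp to zero instead of wrapping. */
static uint8_t usub_sat_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}
```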

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106231
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-08-24 07:50:13 +02:00
Marek Olšák
45b5f5fa25 st/mesa: expose KHR_texture_compression_astc_sliced_3d
This is ASTC 2D LDR allowing texture arrays and 3D, compressing each
slice as a separate 2D image. Tested by piglit. Trivial.
2018-08-24 00:36:18 -04:00
Marek Olšák
dae4cf397d st/mesa: expose EXT_disjoint_timer_query
same cap as ARB_timer_query, no changes needed, tested by piglit
2018-08-24 00:36:18 -04:00
Marek Olšák
263c962cfd mesa: expose EXT_vertex_attrib_64bit
because the closed driver exposes it.
It's the same as the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák
5c90091036 mesa: expose AMD_query_buffer_object
it's a subset of the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák
056b9a5a36 mesa: expose AMD_multi_draw_indirect
because the closed driver exposes it.
This is equivalent to the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák
b3c17330e6 mesa: expose AMD_gpu_shader_int64
because the closed driver exposes it.

It's equivalent to ARB_gpu_shader_int64.
In this patch, I did everything the same as we do for ARB_gpu_shader_int64.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák
1cf3631b9c mesa: expose ARB_post_depth_coverage in the Compatibility profile
It only contains GLSL changes.

v2: allow the layout qualifier on GLSL <= 1.30
2018-08-24 00:36:18 -04:00
Jason Ekstrand
8d8222461f intel/nir: Enable nir_opt_find_array_copies
We have to be a bit careful with this one because we want it to run in
the optimization loop but only in the first brw_nir_optimize call.
Later calls assume that we've lowered away copy_deref instructions and
we don't want to introduce any more.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15176942 -> 15176942 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

In spite of the lack of any shader-db improvement, this patch completely
eliminates spilling in the Batman: Arkham City tessellation shaders.
This is because we are now able to detect that the temporary array
created by DXVK for storing TCS inputs is a copy of the input arrays and
use indirect URB reads instead of making a copy of 4.5 KiB of input data
and then indirecting on it with if-ladders.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:47:51 -05:00
Jason Ekstrand
53072582dc nir: Add an array copy optimization
This peephole optimization looks for a series of load/store_deref or
copy_deref instructions that copy an array from one variable to another
and turns it into a copy_deref that copies the entire array.  The
pattern it looks for is extremely specific but it's good enough to pick
up on the input array copies in DXVK and should also be able to pick up
the sequence generated by spirv_to_nir for an OpLoad of a large composite
followed by OpStore.  It can always be improved later if needed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:47:47 -05:00
Jason Ekstrand
a4a9c07549 intel/nir: Use nir_shrink_vec_array_vars
Shader-db results on Kaby Lake:

    total instructions in shared programs: 15177605 -> 15176765 (<.01%)
    instructions in affected programs: 4259 -> 3419 (-19.72%)
    helped: 1
    HURT: 0

    total spills in shared programs: 10954 -> 10855 (-0.90%)
    spills in affected programs: 295 -> 196 (-33.56%)
    helped: 1
    HURT: 0

    total fills in shared programs: 22222 -> 22117 (-0.47%)
    fills in affected programs: 417 -> 312 (-25.18%)
    helped: 1
    HURT: 0

The helped shader is from the OglCSDof synmark test.  On my Kaby Lake
laptop, the actual framerate of the benchmark didn't appear to improve
beyond the noise.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:46:56 -05:00
Jason Ekstrand
be8d009908 nir: Add a array-of-vector variable shrinking pass
This pass looks for variables with vector or array-of-vector types and
narrows the type to only the components used.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:46:56 -05:00
Jason Ekstrand
02a5442dd7 intel/nir: Use the new structure and array splitting passes
We call structure splitting once because it is guaranteed to split all
the structures in the entire shader in one go.  We call array splitting
in the loop in case future optimizations turn indirects into direct
dereferences and we can split more arrays.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15177605 -> 15177605 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

This is unsurprising because nir_lower_vars_to_ssa already effectively
does structure and array splitting internally.  It doesn't actually
split the variables but its ability to reason about aliasing in the
presence of arrays and structures and pick out scalars or vectors to be
lowered to SSA values is fairly advanced.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:44:14 -05:00
Jason Ekstrand
fa6417495c nir: Add an array splitting pass
This pass looks for array variables where at least one level of the
array is never indirected and splits it into multiple smaller variables.

This pass doesn't really do much now because nir_lower_vars_to_ssa can
already see through arrays of arrays and can detect indirects on just
one level or even see that arr[i][0][5] does not alias arr[i][1][j].
This pass exists to help other passes more easily see through arrays of
arrays.  If a back-end does implement arrays using scratch or indirects
on registers, having more smaller arrays is likely to have better memory
efficiency.

v2 (Jason Ekstrand):
 - Better comments and naming (some from Caio)
 - Rework to use one hash map instead of two

v2.1 (Jason Ekstrand):
 - Fix a couple of bugs that were added in the rework including one
   which basically prevented it from running

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:44:14 -05:00
Jason Ekstrand
26eb077ec4 nir: Add a structure splitting pass
This pass doesn't really do much now because nir_lower_vars_to_ssa can
already see through structures and considers them to be "split".  This
pass exists to help other passes more easily see through structure
variables.  If a back-end does implement arrays using scratch or
indirects on registers, having more smaller arrays is likely to have
better memory efficiency.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:44:14 -05:00
Jason Ekstrand
b489998e63 nir/types: Add array_or_matrix helpers
Reviewed-by: Thomas Helland<thomashelland90@gmail.com>
2018-08-23 21:44:14 -05:00
Kenneth Graunke
b03dcb1e5f i965: don't include compute resources in "Combined" limits
The combined limits should only include shader stages that can be active
at the same time.  We don't need to include compute.

See also cff290df4c for st/mesa.

Unbreaks i965 from assert failing on driver load since Marek's
45f87a48f9, which dropped the core
Mesa capabilities before adjusting driver limits down to match.
2018-08-23 17:27:27 -07:00
Marek Olšák
9176703788 radeonsi: increase the maximum UBO size to 2 GB
Same as the closed driver.

This causes a failure in GL45-CTS.compute_shader.max, which has a trivial
bug.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
5693ca865d radeonsi: bump MAX_GS_INVOCATIONS
same as the closed driver

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
d3c1b212bc gallium: add PIPE_CAP_MAX_SHADER_BUFFER_SIZE
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
f6ccd594e7 gallium: add PIPE_CAP_MAX_GS_INVOCATIONS
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
8c71b70f07 tgsi/ureg: don't call tgsi_sanity when it's too slow
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
80aecad0ca st/mesa: fix up uniform limits to be able to expose large UBOs
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
cff290df4c st/mesa: don't include compute resources in "Combined" limits
The combined limits should only include shader stages that can be active
at the same time.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
d36af3a9d9 st/mesa: set ctx->Const.SubPixelBits
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
3867af39f9 glsl: fix error checking against MAX_UNIFORM_LOCATIONS
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
f01338118c mesa: make MaxCombinedUniformComponents 64-bit to allow large UBOs
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
a8b71f2db8 mesa: add ctx->Const.MaxGeometryShaderInvocations
radeonsi wants to report a different value

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
45f87a48f9 mesa: don't include compute resources in MAX_COMBINED_* limits
5 is the maximum number of shader stages that can be used by 1 execution
call at the same time (e.g. a draw call). The limit ensures that each
stage can use all of its binding points.

Compute is separate and doesn't need the 5x multiplier.
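
The arithmetic described above can be sketched as follows. This is only an illustration: the stage names, the helper and the per-stage value of 12 are assumptions and are not taken from the Mesa sources.

```python
# Hypothetical illustration of the MAX_COMBINED_* arithmetic; the names and
# the per-stage value are assumptions, not Mesa identifiers.
GRAPHICS_STAGES = ("vertex", "tess_ctrl", "tess_eval", "geometry", "fragment")


def combined_limit(per_stage_limit):
    """Combined limit for one draw call: each of the graphics stages may
    use all of its own binding points at the same time."""
    return len(GRAPHICS_STAGES) * per_stage_limit


per_stage_ubos = 12                             # assumed example value
combined_ubos = combined_limit(per_stage_ubos)  # 5 stages * 12
compute_ubos = per_stage_ubos                   # compute runs alone: no 5x multiplier
```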

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
095515e16c mesa: bump GL_MAX_ELEMENTS_INDICES and GL_MAX_ELEMENTS_VERTICES
same number as our closed GL driver

v2: don't use MaxArrayLockSize

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
356ff963ec mesa: remove incorrect change for EXT_disjoint_timer_query
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
37eee90df7 glapi: actually implement GL_EXT_robustness for GLES
The extension was exposed but not the functions.

This fixes:
    dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.readn_pixels
    dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformfv
    dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformiv

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-23 16:54:30 -04:00
Kenneth Graunke
578e45ab7b intel/decoder: Decode SFIXED values.
This lets us examine SAMPLER_STATE's LOD Bias field, among other things.
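
As a rough sketch of what decoding a signed fixed-point field involves, assuming a two's-complement encoding with one sign bit: the helper name and the 4.8 bit split in the example are assumptions for illustration, not a statement about the decoder's code or the real SAMPLER_STATE layout.

```python
def decode_sfixed(raw, int_bits, frac_bits):
    """Decode a two's-complement fixed-point field made of one sign bit,
    int_bits integer bits and frac_bits fractional bits."""
    total_bits = 1 + int_bits + frac_bits
    raw &= (1 << total_bits) - 1          # keep only the bits of the field
    if raw & (1 << (total_bits - 1)):     # sign bit set: value is negative
        raw -= 1 << total_bits
    return raw / float(1 << frac_bits)    # scale by the fractional width


# Assumed S4.8 layout, used here purely as an example.
negative_bias = decode_sfixed(0x1F00, 4, 8)
positive_bias = decode_sfixed(0x0180, 4, 8)
```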

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-23 13:04:53 -07:00
Emil Velikov
855af9a5a2 travis: use python3 for the autoconf builds
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 17:00:28 +01:00
Emil Velikov
ae7898dfdb configure: allow building with python3
Pretty much all of the scripts are python2+3 compatible.
Check and allow using python3, while adjusting the PYTHON2 refs.

Note:
 - python3.4 is used as it's the earliest supported version
 - python3 chosen prior to python2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 17:00:13 +01:00
Emil Velikov
c51e7486d9 bin/git_sha1_gen.py: remove execute bit/shebang
The script is executed explicitly via the build system, that uses
PYTHON/prog_python and equivalent.
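
A minimal sketch of why the shebang and execute bit are unnecessary when the interpreter is named explicitly. The script name and contents are hypothetical, and `sys.executable` stands in for whatever interpreter the build system selected.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical generator script: no shebang line, no execute bit needed.
    script = os.path.join(tmp, "gen.py")
    with open(script, "w") as f:
        f.write("print('generated')\n")

    # The caller picks the interpreter and passes the script as an argument.
    output = subprocess.check_output([sys.executable, script])
    output = output.decode("utf-8").strip()
```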

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 17:00:04 +01:00
Eric Engestrom
993a456360 vk/wsi: avoid reading uninitialised memory
It will be ignored by x11_swapchain_result() anyway (because reaching
the `fail` label without setting `result` means the swapchain status was
already a hard error), but the compiler still complains about reading
uninitialised memory.

While at it, drop the unused assignment right before returning.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 14:47:59 +01:00
Eric Engestrom
a0f6a11944 egl: drop unused _EGL_BUILT_IN_DRIVER_DRI2
Unused since b174a1ae72 "egl: Simplify the "driver" interface".

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-08-23 14:47:59 +01:00
Samuel Pitoiset
87fbc16e34 radv/gfx9: implement coherent shaders for VK_ACCESS_SHADER_READ_BIT
Single-sample color and single-sample depth (not stencil)
are coherent with shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-23 15:42:56 +02:00
Mathieu Bridon
6027d354d1 bin/install_megadrivers.py: Remove shebang and executable bit
Since the script is never executed directly, but launched by Meson as an
argument to the Python interpreter, those are not needed any more.

In addition, they are the reason this script was missed when I moved the
Meson build system to Python 3, so removing them helps avoid future
confusion.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-23 12:12:06 +01:00
Mathieu Bridon
8c8fd0bb8e meson: Run the install script with Python 3
The script was being run directly as an executable, and it has a
Python 2 shebang.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-23 12:12:06 +01:00
Emil Velikov
48820ed8da glsl: remove execute bit and shebang from python tests
Just like the rest of the tree - these should be run either as part of
the build system check target, or at the very least with an explicitly
versioned python executable.

Fixes: db8cd8e367 ("glcpp/tests: Convert shell scripts to a python script")
Fixes: 97c28cb082 ("glsl/tests: Convert optimization-test.sh to pure python")
Fixes: 3b52d29227 ("glsl/tests: reimplement warnings-test in python")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 12:02:45 +01:00
Emil Velikov
e39b916d0c docs: update required mako version
The requirement was bumped a while back, but we forgot to update the
docs.

Fixes: ed871af91c ("configure.ac: raise Mako required version to
0.8.0")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 12:02:45 +01:00
Emil Velikov
e7149369bd configure: use distutils in ax_check_python_mako_module
Handling the version comparison by hand is a bad idea. Python has a handy
module distutils for that - use it.
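
To illustrate why comparing versions by hand is error-prone, here is a small sketch. It deliberately uses plain tuples as a stand-in rather than distutils, since recent Python releases no longer ship that module; the helper name and version strings are assumptions.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


required = "0.8.0"
found = "0.10.1"

# Plain string comparison goes character by character, so "0.1..." sorts
# before "0.8..." and a newer release looks too old.
naive_ok = found >= required

# Comparing the numeric components gives the intended answer.
proper_ok = parse_version(found) >= parse_version(required)
```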

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-23 11:59:48 +01:00
Emil Velikov
df2042d99a configure: enforce python 2.7 with AM_PATH_PYTHON
Currently we use AC_CHECK_PROGS looking for python2.7, python2 and
finally python. That is due to the varying names used across the
different OS.

Use the handy AM_PATH_PYTHON which finds the correct name and checks for
the version.

Note: python2.7 has been an unofficial requirement for quite some time.
Update the docs to reflect that.

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-23 11:55:55 +01:00
Ian Romanick
c7c0b391ef i965: Enable INTEL_shader_atomic_float_minmax on Gen9+
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
59c17dbc6c i965: Sort Gen9+ extension enables
This is a strictly alphabetic sort, as is done in extensions_table.h.
There are other options.  We should pick one and document it.  Right
now, this file is chaos.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
d515c75463 intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages
v2: Split changes to the message type field to another patch.  Suggested
by Caio.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
f347348f8a intel/compiler: Expand untyped atomic message type field by a bit
This is necessary for a new Gen9 message type that will be added in the
next patch.  There are also Gen8 message types that need the extra bit
(mostly for bindless).

v2: Split off from the next patch.  Suggested by Caio.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
d628642a34 intel/compiler: Silence unused parameter warnings
src/intel/compiler/brw_disasm_info.c: In function ‘nir_print_instr’:
src/intel/compiler/brw_disasm_info.c:30:61: warning: unused parameter ‘instr’ [-Wunused-parameter]
 __attribute__((weak)) void nir_print_instr(const nir_instr *instr, FILE *fp) {}
                                                             ^~~~~
src/intel/compiler/brw_disasm_info.c:30:74: warning: unused parameter ‘fp’ [-Wunused-parameter]
 __attribute__((weak)) void nir_print_instr(const nir_instr *instr, FILE *fp) {}
                                                                          ^~
src/intel/compiler/brw_disasm.c: In function ‘src_ia1’:
src/intel/compiler/brw_disasm.c:850:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter]
         unsigned _reg_file,
                  ^~~~~~~~~
src/intel/compiler/brw_fs_surface_builder.cpp: In function ‘void brw::surface_access::emit_byte_scattered_write(const brw::fs_builder&, const fs_reg&, const fs_reg&, const fs_reg&, unsigned int, unsigned int, unsigned int, brw_predicate)’:
src/intel/compiler/brw_fs_surface_builder.cpp:193:57: warning: unused parameter ‘size’ [-Wunused-parameter]
                                 unsigned dims, unsigned size,
                                                         ^~~~

v2: Update commit message.  brw_fs_generator.cpp warnings were already
fixed by another patch.  Noticed by Caio.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
0842655ac6 nir: Add floating point atomic min, max, and compare-swap intrinsics
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
69ce7baa9e nir: Add floating point atomic add intrinsics
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
a390158d10 glsl: Add support for lowering shared-variable float atomics
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
39bf3100ac glsl: Add support for lowering SSBO float atomics
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
280ab4afa8 glsl: Add built-in functions for INTEL_shader_atomic_float_minmax
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
c9d52c83a4 mesa: Extension boilerplate for INTEL_shader_atomic_float_minmax
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
346321a836 docs: Initial version of INTEL_shader_atomic_float_minmax spec
v2: Describe interactions with the capabilities added by
SPV_INTEL_shader_atomic_float_minmax

v3: Remove 64-bit float support.

v4: Explain NaN issues.  Explain issues with atomicMin(-0, +0) and
atomicMax(-0, +0).

v5: Fix whitespace issues noticed by Caio.
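
The kind of NaN and signed-zero ambiguity mentioned in v4 can be seen with any comparison-based minimum. The sketch below uses Python's builtin `min` purely as a stand-in; it shows the general IEEE 754 comparison behaviour and does not reproduce the wording or the rules of the spec.

```python
import math

nan = float("nan")

# Every ordered comparison against NaN is false, so a comparison-based
# min() returns whichever operand happened to come first.
min_nan_first = min(nan, 1.0)
min_nan_last = min(1.0, nan)

# -0.0 and +0.0 compare equal, so the sign of the result also depends on
# operand order.
zeros_equal = (-0.0 == 0.0)
sign_neg_first = math.copysign(1.0, min(-0.0, 0.0))
sign_pos_first = math.copysign(1.0, min(0.0, -0.0))
```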

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
88b6c7bc14 glsl: Add built-in functions for NV_shader_atomic_float
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
9527bb4e70 mesa: Extension boilerplate for NV_shader_atomic_float
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Gurchetan Singh
c731508b98 meson: fix egl build for android
Haven't tested this, but we do include loader.h
in platform_android.c

Fixes: c5ec155685 ("meson: wire up egl/android")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-22 16:47:19 -07:00
Gurchetan Singh
ec6cb01e21 meson: fix egl build for surfaceless
Without this, I get:

 > platform_surfaceless.c:38:10: fatal error: 'loader.h' file not found
 > #include "loader.h"
 >      ^~~~~~~~~~
 > 1 error generated.

Fixes: 108d257a16 ("meson: build libEGL")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

v2: Split up patches, modify commit message (Dylan)
2018-08-22 16:47:09 -07:00
Caio Marcelo de Oliveira Filho
410de0e3f1 nir: Give end_block its own index
Since there's no particular reason for the index to be 0, choose an
index that is not used by any other block.  This is convenient when we
store "per-block" data in an array AND look up the successors'
data (e.g. any kind of backwards data-flow analysis).
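
A minimal sketch of the convenience, using a toy liveness analysis. The CFG, variable names and data layout are assumptions for illustration and do not mirror NIR's data structures; the point is only that the end block has its own slot in the per-block array.

```python
# Toy CFG: block 0 -> block 1 -> end block. Index 2 belongs to the end block
# alone, so it never collides with a real block's slot.
successors = {0: [1], 1: [2], 2: []}
defs = {0: {"x"}, 1: set(), 2: set()}
uses = {0: set(), 1: {"x"}, 2: set()}

# One slot of per-block data for every block, the end block included.
live_in = [set() for _ in successors]

changed = True
while changed:
    changed = False
    for block in sorted(successors, reverse=True):   # backwards analysis
        live_out = set()
        for succ in successors[block]:
            live_out |= live_in[succ]                # no end-block special case
        new_in = uses[block] | (live_out - defs[block])
        if new_in != live_in[block]:
            live_in[block] = new_in
            changed = True
```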

v2: Add a note about end_block's index. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 14:41:26 -07:00
Caio Marcelo de Oliveira Filho
8364ec3fce nir: Skip common instructions when comparing deref paths
Deref paths may share the same deref instructions in their chains,
e.g.

    ssa_100 = deref_var A
    ssa_101 = deref_struct "array_field" of ssa_100
    ssa_102 = deref_array "[1]" of ssa_101
    ssa_103 = deref_struct "field_a" of ssa_102
    ssa_104 = deref_struct "field_a" of ssa_103

when comparing the two last deref instructions, their paths will share
a common sequence ssa_100, ssa_101, ssa_102.  This patch skips to next
iteration if the deref instructions are the same.  Path[0] (the var)
is still handled specially, so in the case above, only ssa_101 and
ssa_102 will be skipped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 14:41:26 -07:00
Caio Marcelo de Oliveira Filho
5196041e93 nir: Export deref comparison functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 14:41:26 -07:00
Caio Marcelo de Oliveira Filho
7f8ecedced util/dynarray: add a clone function
v2: Fix mem_ctx parameter type. (Thomas)

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 14:41:26 -07:00
Mariusz Ceier
61b84b8c14 amd/addrlib: Fix include path for c99_compat.h
Without this patch mesa doesn't compile:

In file included from ../mesa-9999/src/amd/addrlib/addrinterface.cpp:39:
../mesa-9999/src/util/macros.h:29:10: fatal error: c99_compat.h: No such file or directory
 #include "c99_compat.h"
          ^~~~~~~~~~~~~~
compilation terminated.

Fixes: 15ca5ce99a
       ("amd/addrlib: mark returnCode as MAYBE_UNUSED in")
Signed-off-by: Mariusz Ceier <mceier+mesa-dev@gmail.com>
Acked-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-22 14:39:02 -07:00
Grazvydas Ignotas
0076ea92a9 vulkan/wsi: fix pointer-integer conversion warnings
For 32bit build. Trivial.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-23 00:34:32 +03:00
Grazvydas Ignotas
9177074524 radv: use different builtin shader cache for 32bit
Currently if 64bit and 32bit programs are used interchangeably, radv
will keep overwriting the cache. Use separate cache files to avoid
that.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-23 00:34:32 +03:00
Grazvydas Ignotas
356f6673d6 radv: place pointer length into cache uuid
Thanks to reproducible builds, binary file timestamps may be identical
for both 32bit and 64bit packages when built from the same source.
This means radv will use the same cache for both 32 and 64 bit
processes, which leads to crashes.

Conveniently there is a spare byte in cache_uuid, let's place the
pointer size there.
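
The idea can be sketched like this. The 16-byte layout, the helper name and the build identifier are hypothetical and do not reflect radv's actual cache_uuid contents; only the principle of reserving one byte for the pointer size comes from the description above.

```python
import struct

# Size of a native pointer in the running process: 4 on 32-bit, 8 on 64-bit.
POINTER_SIZE = struct.calcsize("P")


def make_cache_uuid(build_id, pointer_size=POINTER_SIZE):
    """Hypothetical layout: 15 bytes of build identity plus 1 byte holding
    the pointer size, so 32-bit and 64-bit builds never share a cache."""
    return build_id[:15].ljust(15, b"\0") + bytes([pointer_size])


# Same build identity (e.g. identical timestamps), different pointer sizes.
uuid_32 = make_cache_uuid(b"same-timestamp", 4)
uuid_64 = make_cache_uuid(b"same-timestamp", 8)
```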

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107601
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105904
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-23 00:34:32 +03:00
Grazvydas Ignotas
2edf47edf0 llvmpipe: add cc clobber to inline asm
The bsr instruction modifies flags, so that needs to be indicated to the
compiler. No effect on generated code, but still needed for correctness.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-08-23 00:34:32 +03:00
Nanley Chery
6d80b0b4ba intel/isl: Avoid tiling some 16K-wide render targets
Fix rendering issues on BDW and SKL.

Fixes: 0288fe8d04
("i965/miptree: Use the correct BLT pitch")

Fixes the following regressions seen exclusively on SKL:
* KHR-GL46.texture_barrier_ARB.disjoint-texels
* KHR-GL46.texture_barrier_ARB.overlapping-texels
* KHR-GL46.texture_barrier.disjoint-texels
* KHR-GL46.texture_barrier.overlapping-texels

and both on BDW and SKL:
* GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners
* GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners

v2: Note the fixed tests (Andres).
    Don't cause failures with multisampled buffers (Andres).
    Don't hamper SKL GT4 (Ken).
v3: Fix the Fixes tag (Dylan).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 13:53:19 -07:00
Nanley Chery
b041fc0649 i965/miptree: Fix can_blit_slice()
Check the destination's row pitch against the BLT engine's row pitch
limitation as well.

Fixes: 0288fe8d04
("i965/miptree: Use the correct BLT pitch")

v2: Fix the Fixes tag (Dylan).
    Check the destination row pitch (Chris).

Reported-by: Dylan Baker <dylan@pnwbakers.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 13:53:02 -07:00
Nanley Chery
030b6efcfd i965/miptree: Use miptree_map in map_blit functions
This struct contains all the data of interest. can_blit_slice() will use
it in the next patch to calculate the correct pitch.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 13:23:17 -07:00
Rafael Antognolli
f8cfc77660 intel/tools/aubwrite: Always use physical addresses for traces.
It looks like we can't rely on the simulator to always translate virtual
addresses to physical ones correctly. So let's use physical everywhere.

Since our current GGTT maps virtual to physical addresses in a 1:1 way,
no further changes are required.

Additionally, we have other address spaces not in use right now. So
let's make it easier to switch which one we are using by putting the
default one into the aub_file struct.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-22 12:52:41 -07:00
Rafael Antognolli
e82d8fa964 intel/tools/aubwrite: Rename "legacy" to "Trace Block".
Hopefully it's a little more descriptive, and more accurate.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-22 12:52:41 -07:00
Jason Ekstrand
68ae66542a nir/vars_to_ssa: Don't build deref nodes for non-local variables
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 14:17:38 -05:00
Marek Olšák
e80e8d7adc ac: fix WAITCNT flags for GFX9
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-22 14:34:43 -04:00
Kai Wasserbäch
c836a751bc amd/addrlib: mark physicalSliceSize as MAYBE_UNUSED in Addr::V1::EgBasedLib::HwlGetSizeAdjustmentMicroTiled
Only used, when asserts are enabled.

Fixes an unused-but-set-variable warning with GCC 8:
 ../../../src/amd/addrlib/r800/egbaddrlib.cpp: In member function 'virtual long long unsigned int Addr::V1::EgBasedLib::HwlGetSizeAdjustmentMicroTiled(unsigned int, unsigned int, ADDR_SURFACE_FLAGS, unsigned int, unsigned int, unsigned int, unsigned int*, unsigned int*) const':
 ../../../src/amd/addrlib/r800/egbaddrlib.cpp:4111:13: warning: variable 'physicalSliceSize' set but not used [-Wunused-but-set-variable]
      UINT_64 physicalSliceSize;
              ^~~~~~~~~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-22 14:33:21 -04:00
Kai Wasserbäch
2e0586e379 amd/addrlib: mark numPipes as MAYBE_UNUSED in Addr::V1::EgBasedLib::SanityCheckMacroTiled (v2)
Only used, when asserts are enabled.

Fixes an unused-variable warning with GCC 8:
 ../../../src/amd/addrlib/r800/egbaddrlib.cpp: In member function 'int Addr::V1::EgBasedLib::SanityCheckMacroTiled(ADDR_TILEINFO*) const':
 ../../../src/amd/addrlib/r800/egbaddrlib.cpp:982:13: warning: unused variable 'numPipes' [-Wunused-variable]
      UINT_32 numPipes    = HwlGetPipes(pTileInfo);
              ^~~~~~~~

v2: Don't realign other variable definitions, to keep in line with file
    style (Marek)

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-22 14:33:21 -04:00
Kai Wasserbäch
6a7ef7c7dc amd/addrlib: mark *pEqToCheck as MAYBE_UNUSED in Addr::V2::Gfx9Lib::ComputeStereoInfo (v2)
Only used, when asserts are enabled.

Fixes an unused-variable warning with GCC 8:
 ../../../src/amd/addrlib/gfx9/gfx9addrlib.cpp: In member function 'ADDR_E_RETURNCODE Addr::V2::Gfx9Lib::ComputeStereoInfo(const ADDR2_COMPUTE_SURFACE_INFO_INPUT*, ADDR2_COMPUTE_SURFACE_INFO_OUTPUT*, unsigned int*) const':
 ../../../src/amd/addrlib/gfx9/gfx9addrlib.cpp:3879:34: warning: unused variable 'pEqToCheck' [-Wunused-variable]
              const ADDR_EQUATION *pEqToCheck        = &m_equationTable[eqIndex];
                                   ^~~~~~~~~~

v2: Don't realign other variable definitions, to keep in line with file
    style (Marek)

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-22 14:33:21 -04:00
Kai Wasserbäch
556f89a715 amd/addrlib: mark microBlockDim as MAYBE_UNUSED in Addr::V2::Gfx9Lib::HwlComputeBlock256Equation
Only used, when asserts are enabled.

Fixes an unused-but-set-variable warning with GCC 8:
 ../../../src/amd/addrlib/gfx9/gfx9addrlib.cpp: In member function 'virtual ADDR_E_RETURNCODE Addr::V2::Gfx9Lib::HwlComputeBlock256Equation(AddrResourceType, AddrSwizzleMode, unsigned int, ADDR_EQUATION*) const':
 ../../../src/amd/addrlib/gfx9/gfx9addrlib.cpp:2473:15: warning: variable 'microBlockDim' set but not used [-Wunused-but-set-variable]
          Dim2d microBlockDim = Block256_2d[elementBytesLog2];
                ^~~~~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-22 14:33:21 -04:00
Kai Wasserbäch
15ca5ce99a amd/addrlib: mark returnCode as MAYBE_UNUSED in ElemGetExportNorm
Only used, when asserts are enabled.

Fixes an unused-but-set-variable warning with GCC 8:
 ../../../src/amd/addrlib/addrinterface.cpp: In function 'int ElemGetExportNorm(ADDR_HANDLE, const ELEM_GETEXPORTNORM_INPUT*)':
 ../../../src/amd/addrlib/addrinterface.cpp:835:23: warning: variable 'returnCode' set but not used [-Wunused-but-set-variable]
      ADDR_E_RETURNCODE returnCode = ADDR_OK;
                        ^~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-22 14:33:21 -04:00
Lionel Landwerlin
8b0e48887f intel: aubinator_viewer: add urb view
This is available through a "Show URB" button on the 3DPRIMITIVE
instructions.

v2: Fix urb allocation end value in tooltip (Rafael)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
d1c4a62bf8 intel: aubinator_viewer: store urb state during decoding
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
38f10d5a03 intel: tools: add aubinator viewer
A graphical user interface version of aubinator.
Allows you to:

   - simultaneously look at multiple points in the aub file (using all
     the goodness of the existing decoding in aubinator)

   - edit an aub file

v2: Switch from GLFW to GTK+3

v3: Fix warning when exiting

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
ea83a1d304 intel: tools: import ImGui
We want to add a new UI tool to decode aub files. This will use the
Dear ImGui library to render its interface. The build of this UI
toolkit is conditional on -Dwith_tools=intel-ui, which supersedes
-Dwith_tools=intel.

The main way to use ImGui is to embed its source code at a particular
revision. Most embedding projects have to do a bit of integration
which is really specific to one's project. In our case the only
modification is to include libepoxy. We also choose to use Gtk+3 for
the window system integration. As opposed to the previous version of
this patch, which used GLFW, Gtk+ is able to handle X11/Wayland
sessions as well as proper DPI scaling on retina monitors.

The import was done at this commit (https://github.com/ocornut/imgui) :

commit 6211f40f3d903dd9df961256e044029c49793aa3
Author: omar <omarcornut@gmail.com>
Date:   Fri Jul 27 12:29:33 2018 +0200

    Internals: Drag and Drop: default drop preview use a narrower clipping rectangle (no effect here, but other branches uses a narrow clipping rectangle that was too small so this is a fix for it) + Comments

v2: Switch from GLFW to GTK+ (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
4ba12e8c54 intel: tools: aub_mem: reuse already mapped ppgtt buffers
When we map a PPGTT buffer into a continuous address space of aubinator
to be able to inspect it, we currently add it to the list of BOs to
unmap once we're finished. An optimization we can apply is to look up
that list before trying to remap PPGTT buffers again (we already do
this for GGTT buffers).

We need to take some care before doing this because the list also
contains GGTT BOs. As GGTT & PPGTT are 2 different address spaces, we
can have matching addresses in both that point to different physical
locations.

This change adds a flag on the elements of the list of mapped BOs to
differentiate between GGTT & PPGTT, which allows us to reuse that
list when looking up both address spaces.
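
A small sketch of a lookup keyed on both the address and the address space. The `MappedBO` type, its fields and the helper are hypothetical and only illustrate the flag described above, not the aub_mem code.

```python
from collections import namedtuple

# Hypothetical element of the mapped-BO list; "ppgtt" is the new flag.
MappedBO = namedtuple("MappedBO", "addr size ppgtt data")


def find_mapped(mapped, addr, ppgtt):
    """Return the already-mapped BO covering addr in the requested address
    space, or None if it still has to be mapped."""
    for bo in mapped:
        if bo.ppgtt == ppgtt and bo.addr <= addr < bo.addr + bo.size:
            return bo
    return None


# The same address exists in both spaces but points at different memory.
mapped = [
    MappedBO(addr=0x1000, size=0x1000, ppgtt=False, data="ggtt page"),
    MappedBO(addr=0x1000, size=0x1000, ppgtt=True, data="ppgtt page"),
]
ggtt_hit = find_mapped(mapped, 0x1800, ppgtt=False)
ppgtt_hit = find_mapped(mapped, 0x1800, ppgtt=True)
miss = find_mapped(mapped, 0x4000, ppgtt=True)
```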

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
8fd78b4eea intel: tools: aubmem: map gtt data to aub file
This will allow the aubinator viewer tool to modify the aub data that
was loaded at a particular gtt address.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
ebb145ee12 intel: tools: create libaub
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
475d670ef7 intel: tools: aubwrite: wrap function declarations for c++
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
ed21007a6a intel: tools: split memory management out of aubinator
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-22 18:02:11 +01:00
Lionel Landwerlin
14a1cb37eb util: rb_tree: add safe iterators
v2: Add helper to make iterators more readable (Rafael)
    Fix rev iterator bug (Rafael)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-22 17:49:36 +01:00
Lionel Landwerlin
4616639b49 intel: tools: split aub parsing from aubinator
v2: add parsing error callback (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
2018-08-22 17:49:36 +01:00
Mathieu Bridon
e15686567c meson: Run the test with Python 3
This is a patch from me and a patch from Mathieu Bridon squashed
together.

Signed-off-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Mathieu Bridon <bochecha@daitauha.fr>
2018-08-22 08:41:01 -07:00
Mathieu Bridon
ff0ce31e2a python: Disable universal newlines
We are testing the behaviour of a tool, for different input files, each
one using a different newline sequence. ('\n' on UNIX, '\r\n' on
Windows, …)

Unfortunately, when opening a file in text mode, Python 3 will by
default enable the "universal newlines" mode, which means it replaces
all the known newline sequences by '\n'.

This (usually useful) behaviour breaks the tests, which are specifically
trying to handle files with newline sequences different from '\n'.

Disabling the universal newlines mode fixes the tests.

However, to keep the script compatible with both Python 2 and 3, we must
use the io.open() function instead of the open() builtin, as the latter
only knows about the `newline` argument on Python 3.
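
The difference can be sketched as follows. The file name and contents are made up for the example, and the temporary-directory helper assumes Python 3; the `io.open()` calls themselves are the part that behaves the same on both versions.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "crlf.txt")
    with io.open(path, "wb") as f:
        f.write(b"one\r\ntwo\r\n")          # Windows-style line endings

    # Default text mode: universal newlines rewrites "\r\n" as "\n".
    with io.open(path, "r", encoding="utf-8") as f:
        translated = f.read()

    # newline="" leaves the newline sequences exactly as stored in the file.
    with io.open(path, "r", encoding="utf-8", newline="") as f:
        untranslated = f.read()
```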

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-22 08:41:01 -07:00
Mathieu Bridon
fc708069f7 python: difflib prefers unicode strings
Python 3 does not automatically convert from bytes to unicode strings
like Python 2 used to do.

This commit makes sure we pass unicode strings to difflib.unified_diff,
so that the script works on both Python 2 and 3.
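
A minimal sketch of the approach, assuming the compared data arrives as UTF-8 bytes (for example from a file or a subprocess); the sample contents and labels are made up.

```python
import difflib

# Bytes as they might come from a file or a subprocess.
expected_bytes = b"a\nb\n"
actual_bytes = b"a\nc\n"

# Decode explicitly so difflib only ever sees unicode strings.
expected = expected_bytes.decode("utf-8")
actual = actual_bytes.decode("utf-8")

diff = list(difflib.unified_diff(
    expected.splitlines(True),      # keep the line endings
    actual.splitlines(True),
    fromfile="expected",
    tofile="actual",
))
```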

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-22 08:41:01 -07:00
Dylan Baker
477d4b9960 compiler/glsl/tests: Make tests python3 safe
v2: - explicitly decode the output of subprocesses
    - handle bytes and string types consistently rather than relying on
      python 2's coercion for bytes and ignoring them in python 3
v3: - explicitly set encode as well as decode
    - python 2.7 and 3.x `bytes` instead of defining an alias
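
Decoding subprocess output explicitly, as described in v2 and v3, could look roughly like this. The child command is a placeholder and UTF-8 is an assumed encoding; the actual tests may use different ones.

```python
import subprocess
import sys

# Placeholder child process that prints one line.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('ok')"],
    stdout=subprocess.PIPE,
)
raw_output, _ = proc.communicate()   # bytes on both Python 2.7 and 3.x

# Decode once, explicitly, instead of relying on implicit coercion.
text = raw_output.decode("utf-8")
```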

Reviewed-by: Mathieu Bridon <bochecha@daitauha.fr>
2018-08-22 08:41:01 -07:00
Juan A. Suarez Romero
6ea5718318 travis: SWR requires LLVM 6.0
v2: update clarification why ubuntu-toolchain-r-test is required (Emil)

Fixes: 0cef0cccf5 ("swr: bump minimum supported LLVM version to 6.0")
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-22 17:29:20 +02:00
Samuel Pitoiset
4c43ec461d ac/nir: fix getting GLSL type of array of samplers for TG4
This fixes a crash in build_tex_intrinsic() when trying to
launch the Basemark GPU benchmark on GFX8. It looks like
there is still something wrong because some frames are black.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106980
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-22 15:23:11 +02:00
Samuel Pitoiset
24ee53231d radv: remove dead variables after splitting per member structs
Otherwise, nir_lower_clip_cull_distance_arrays might report
wrong number of output clips/culls because it relies on
shader output variables and some of them might be dead.

This fixes a rendering issue with Dolphin and Super Mario
Sunshine.

Fixes: b0c643d8f5 ("spirv: Use NIR per-member splitting")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107610
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-22 13:57:18 +02:00
Yunchao He
bea4d4c78c anv: add VK_EXT_sampler_filter_minmax support
This extension can be supported on SKL+. With this patch,
all corresponding tests (6K+) in CTS can pass. No test fails.

I verified CTS with the command below:
deqp-vk --deqp-case=dEQP-VK.pipeline.sampler.view_type.*reduce*

v2: 1) support all depth formats, not depth-only formats, 2) fix
a wrong indentation (Jason).

v3: fix a few nits (Lionel).

v4: fix failures in CI: disable sampler reduction when sampler
reduction mode is not specified via this extension (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-22 11:56:19 +01:00
Samuel Pitoiset
0608349232 radv: use ac_build_imad()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-22 09:17:40 +02:00
Marek Olšák
d87fe1f0fd ac,radeonsi: use ac_build_gather_values more
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
60beac9efc ac,radeonsi: use ac_build_fmad
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
c401ead68a radeonsi: use ac_build_imad
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
659f2e0fcb ac: add imad & fmad helpers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
2276f8f064 ac: add ac_build_s_barrier
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
6224144b6d radeonsi: print the shader stage name when printing LLVM IR
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
5d20b9be90 radeonsi: use is_merged shader in si_prolog_get_rw_buffers
needed to change the input type to si_shader_context

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
a4a104fc81 ac: completely remove +auto-waitcnt-before-barrier
it causes corruption on several different GPU generations.

Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Anuj Phogat
2383ddace1 anv/icl: Allow headerless sampler messages for pre-emptable contexts
It fixes simulator warnings in vulkancts tests complaining about missing
support for headerless sampler messages for pre-emptable contexts.
Bit 5 in SAMPLER MODE register is newly introduced for ICLLP.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-21 12:50:05 -07:00
Anuj Phogat
81b74b5d96 anv/icl: Disable binding table prefetching
Gen 11 workarounds table #2056 WABTPPrefetchDisable suggests to
disable prefetching of binding tables for ICLLP A0 and B0
steppings. We have a similar patch for the i965 driver in Mesa
commit a5889d70.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-21 12:50:05 -07:00
Anuj Phogat
482f328f3b i965/icl: Allow headerless sampler messages for pre-emptable contexts
It fixes simulator warnings in piglit tests complaining about missing
support for headerless sampler messages for pre-emptable contexts.
Bit 5 in SAMPLER MODE register is newly introduced for ICLLP.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-21 12:50:05 -07:00
Dave Airlie
32529e6084 r600/eg: rework atomic counter emission with flushes
With the current code, we didn't do the space checks prior
to atomic counter setup emission, but we also didn't add
atomic counters to the space check so we could get a flush
later as well.

These flushes would be bad, and lead to problems with
parallel tests. We have to ensure the atomic counter copy in,
draw emits and counter copy out are kept in the same command
submission unit.

This reworks the code to drop some useless masks, make the
counting separate to the emits, and make the space checker
handle atomic counter space.

[airlied: want this in 18.2]

Fixes: 06993e4ee (r600: add support for hw atomic counters. (v3))
2018-08-21 20:45:38 +01:00
Dave Airlie
41d58e2098 virgl: ARB_enhanced_layouts support
We need to handle the gaps in the streamout bindings on the guest
side and enable it if the host has the rest enabled.

Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
2018-08-22 05:05:21 +10:00
Chad Versace
aa79cc2bc8 i965: Implement EGL_KHR_mutable_render_buffer
Testing:
  - Manually tested a low-latency handwriting demo that toggles
    EGL_RENDER_BUFFER. Toggling changed the display latency as expected.
    Used Android on Chrome OS, Kabylake GT2.
  - No change in dEQP-EGL.functional.* on Fedora 27, Wayland, Skylake
    GT2.  Used deqp at tag android-p-preview-5.
  - No regressions in dEQP-EGL.functional.*, ran on Android on Chrome
    OS, Kabylake GT2. Some dEQP-EGL.functional.mutable_render_buffer.*
    tests change from NotSupported to Pass.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-21 09:56:20 -07:00
Chad Versace
ed7c694688 egl/android: Implement EGL_KHR_mutable_render_buffer
Specifically, implement the extension DRI_MutableRenderBufferLoader.
However, the loader enables EGL_KHR_mutable_render_buffer only if the
DRI driver implements its half of the extension,
DRI_MutableRenderBufferDriver.

Testing:
  - No change in dEQP-EGL.functional.* on Fedora 27, Wayland, Skylake
    GT2.  Used deqp at tag android-p-preview-5.
  - No change in dEQP-EGL.functional.*, ran on Android on Chrome OS,
    Kabylake GT2.
  - Manually inspected Android apps on same Chrome OS device.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-21 09:56:20 -07:00
Eric Engestrom
317c460a4d util/xmlpool: make indentation coherent
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-21 17:36:13 +01:00
Eric Engestrom
2de9e841e7 egl: add helper to combine two u32 into one u64
Use a helper to avoid the common issues of upcasting after the right shift
(losing the upper bits) and shifting signed values (sign gets shifted too).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-21 15:50:02 +01:00
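A minimal sketch of the kind of helper described in commit 2de9e841e7 above. The function name and signature are assumptions made for illustration; the commit text does not show the actual code. The point is the order of operations: widen each 32-bit half to 64 bits first, then shift. Writing `(uint64_t)(hi << 32)` instead would shift a 32-bit value by its full width, which is undefined behaviour in C, and shifting a signed value risks dragging the sign along.

```c
#include <stdint.h>

/* Hypothetical name and signature, for illustration only. */
static inline uint64_t
combine_u32_into_u64(uint32_t hi, uint32_t lo)
{
   /* Cast each half to 64 bits *before* shifting or OR-ing. Both inputs
    * are unsigned, so no sign bit can leak into the result. */
   return ((uint64_t) hi << 32) | (uint64_t) lo;
}
```

Taking unsigned parameters means callers cannot pass a signed value without an explicit conversion at the call site, which is where such mistakes are easiest to spot.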
Eric Engestrom
1ca23420c1 docs: trivial s/>/&gt;/ html fix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-21 15:41:41 +01:00
Eric Engestrom
6ff1c47996 autotools: don't ship the git_sha1.h generated in git in the tarballs
This file is regenerated at build time, so the shipped copy would just get
overwritten. No reason to ship it in the tarball.

Fixes: 44df06211c "autotools: include git_sha1.h in dist tarball"
Fixes: 471f708ed6 "git_sha1: simplify logic"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-21 15:30:56 +01:00
Eric Engestrom
81fe9bdf6d intel/genxml: minor python style fix
Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-21 15:30:55 +01:00
Jose Fonseca
9e5e3a8ead appveyor: Set git core.autocrlf setting to true.
The git core.autocrlf setting defaults to true (ie, all text files get
checked out as CRLF on Windows), except on Appveyor where it's set to
"input" (ie, all text files get checked out with the upstream
repository's line endings, which for us typically means LF.)

And this was masking on Appveyor a regression in gen_xmlpool.py
processing t_options.h with CRLF line endings.

This change sets core.autocrlf to true, which would have caught the
issue immediately, as seen in
https://ci.appveyor.com/project/jrfonseca/mesa/build/51

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-08-21 09:46:19 +01:00
Timothy Arceri
797cd198ae mesa: move legacy hyperz option from dri config
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-21 09:19:02 +10:00
Timothy Arceri
02062ab1e1 mesa: remove unused dri config option disable_shader_bit_encoding
This was added as a workaround for Heaven 3.0 but was later removed
by 5ead448719 to allow Heaven 4.0 to work correctly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-21 09:19:02 +10:00
Timothy Arceri
c5f863f2fd mesa: drop legacy no_rast dri option
Add environment var overrides to legacy drivers instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-21 09:19:01 +10:00
Timothy Arceri
02e32c92a2 i965: remove unused no_rast bool
Forcing software fallbacks for i965 hasn't been an option since
5e3c093ff8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-21 09:19:01 +10:00
Timothy Arceri
7867c1078a i915: remove early_z dri option
This driver is in maintenance mode so let's remove this hidden
unsafe option.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-21 09:19:01 +10:00
Kevin Rogovin
7ec308d978 Add NV_fragment_shader_interlock support.
The main purpose for having NV_fragment_shader_interlock
extension is because that extension is also for GLES31 while
the ARB extension is for GL only.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-08-20 13:32:43 -07:00
Juan A. Suarez Romero
44df06211c autotools: include git_sha1.h in dist tarball
This fixes `make distcheck`.

Fixes: 471f708ed6 ("git_sha1: simplify logic")
CC: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-20 18:43:50 +02:00
Juan A. Suarez Romero
0cef0cccf5 swr: bump minimum supported LLVM version to 6.0
RADV now requires LLVM 6.0 or greater, and thus we can't build dist
tarball because swr requires LLVM 5.0.

Let's bump required LLVM to 6.0 in swr too.

v2: bump also in meson.build (Eric)

Fixes: fd1121e839 ("amd: remove support for LLVM 5.0")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-08-20 16:13:37 +02:00
Danylo Piliaiev
25ec806eb2 i965: Advertise 8 bits subpixel precision for viewport bounds on gen6+
We use floating-points for viewport bounds so VIEWPORT_SUBPIXEL_BITS
should reflect this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105975

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-20 15:11:57 +01:00
Rob Clark
e11e9d6394 freedreno: fix context teardown race
We could still have batches queued up to flush, so fd_context_destroy()
(which will kill and sync on the flush_queue) before deleting buffers
that might be referenced from fdN_gmem() from context of flush_queue.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-20 10:03:05 -04:00
Kai Wasserbäch
5fab32ddad intel/decoder: mark total_length as MAYBE_UNUSED in gen_spec_load
Only used, when asserts are enabled.

Fixes an unused-variable warning with GCC 8:
 ../../../src/intel/common/gen_decoder.c: In function 'gen_spec_load':
 ../../../src/intel/common/gen_decoder.c:535:47: warning: variable 'total_length' set but not used [-Wunused-but-set-variable]
     uint32_t text_offset = 0, text_length = 0, total_length;
                                                ^~~~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-20 11:08:52 +01:00
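The MAYBE_UNUSED commits in this series all follow the same pattern, which can be sketched as below. This is an illustration only: the macro definition is an assumed stand-in for Mesa's MAYBE_UNUSED, and the function and its header layout are hypothetical. When NDEBUG is defined, `assert()` expands to nothing, so a variable that is only read inside an assertion looks unused to the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a stand-in for Mesa's MAYBE_UNUSED macro. */
#if defined(__GNUC__)
#define MAYBE_UNUSED __attribute__((unused))
#else
#define MAYBE_UNUSED
#endif

/* Hypothetical parser step: header[0] is the text length,
 * header[1] is the total length. */
static uint32_t
read_text_length(const uint32_t header[2])
{
   uint32_t text_length = header[0];

   /* Only read inside assert(), so it is unused in release builds. */
   MAYBE_UNUSED uint32_t total_length = header[1];

   assert(text_length <= total_length);
   return text_length;
}
```

Marking the variable keeps the sanity check in debug builds without producing a warning in builds where assertions are compiled out.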
Kai Wasserbäch
4228e052b3 intel/tools: initialise bo_addr to 0 in main
Suppresses a maybe-uninitialized warning with GCC 8.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-20 11:08:52 +01:00
Kai Wasserbäch
ccdefbb559 intel: aubinator: mark ftruncate_res as MAYBE_UNUSED in ensure_phys_mem
Only used, when asserts are enabled.

Fixes an unused-variable warning with GCC 8:
 ../../../src/intel/tools/aubinator.c: In function 'ensure_phys_mem':
 ../../../src/intel/tools/aubinator.c:209:11: warning: unused variable 'ftruncate_res' [-Wunused-variable]
        int ftruncate_res = ftruncate(mem_fd, mem_fd_len += 4096);
            ^~~~~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-20 11:08:52 +01:00
Kai Wasserbäch
64c2bca59f intel/aubinator_error_decode: mark ret as MAYBE_UNUSED in main
Only used, when asserts are enabled.

Fixes an unused-but-set-variable warning with GCC 8:
 ../../../src/intel/tools/aubinator_error_decode.c: In function 'main':
 ../../../src/intel/tools/aubinator_error_decode.c:759:11: warning: variable 'ret' set but not used [-Wunused-but-set-variable]
        int ret;
            ^~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-20 11:08:52 +01:00
Samuel Pitoiset
0aacb5eab6 radv: do not use CP predication for DCC decompressions
This fixes a regression with some Unity demos. Not sure
what the root cause of the problem is, especially because
the driver doesn't perform any fast color clears. So, it
shouldn't be needed to decompress DCC. RadeonSI says that
the decompression is relatively cheap if the surface has
been decompressed already.

One possible improvement is to use two predicates, one for
DCC and one for FCE that could be cleared when DCC, FMASK
or CMASK are performed by the driver. That might skip some
unnecessary decompression passes (not DCC though).

Fixes: ff7daadca1 ("radv: enable/disable predication for the DCC decompression pass")
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107563
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-20 11:54:37 +02:00
Tapani Pälli
799b3d16d4 egl: implement EXT_surface_SMPTE2086_metadata and EXT_surface_CTA861_3_metadata
Patch implements common bits for EXT_surface_SMPTE2086_metadata
and EXT_surface_CTA861_3_metadata extensions by adding new required
attributes and eglQuerySurface + eglSurfaceAttrib changes.

Currently none of the drivers are utilizing this data but this patch
is an enabler for getting there.

v2: don't enable extension globally, should be only enabled by
    EGL drivers that can transfer metadata to the window system (Jason)
    use EGLint instead of uint16_t (Eric)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-20 09:44:53 +03:00
Timothy Arceri
5a0684d665 mesa: move legacy dri config option texture_units
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-20 13:53:59 +10:00
Timothy Arceri
8b4157d578 mesa: remove unused dri config option texture_heaps
This seems to have only been used by DRI1 drivers which were
removed with e4344161bd.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-20 13:53:59 +10:00
Timothy Arceri
fb277f504e mesa: move legacy dri config option texture_blend_quality
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-20 13:53:59 +10:00
Timothy Arceri
c470db706a util: remove unused S3TC translation for dri config
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-20 13:53:59 +10:00
Timothy Arceri
7d2474afb5 mesa: remove dri configs unused software-fallback options
These seem to have only been used by DRI1 drivers which were
removed with e4344161bd.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-20 13:53:58 +10:00
Timothy Arceri
24da2d162d mesa: remove unused dri config option excess_mipmap
This seems to have only been used by DRI1 drivers which were
removed with e4344161bd.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-20 13:53:58 +10:00
Timothy Arceri
498831c7e6 mesa: remove unused dri config option performance_boxes
This seems to have only been used by DRI1 drivers which were
removed with e4344161bd.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-20 13:53:58 +10:00
Timothy Arceri
4a91d4ef0f docs: update the default mesa shader cache dir
We renamed the dir in commit 28b326238b, this just updates the
website to reflect the change.
2018-08-20 08:08:58 +10:00
Kai Wasserbäch
2c020dbf06 vulkan/wsi: initialise image_index to 0 in x11_manage_fifo_queues
Suppresses a maybe-uninitialized warning with GCC 8.

Note: image_index should always be initialised due to the result check,
      but the compiler doesn't see that.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-18 10:34:19 +10:00
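The situation described in commit 2c020dbf06 above can be sketched as follows. Both functions are hypothetical and only mirror the shape of the problem: an out-parameter is written only on success, the caller checks the result before using it, and yet the compiler may not be able to prove that, so it warns. Initialising the variable is harmless and silences the warning.

```c
#include <stdint.h>

/* Hypothetical acquire step: writes *image_index only on success
 * and returns 0; returns -1 and leaves it untouched on failure. */
static int
acquire_image(int available, uint32_t *image_index)
{
   if (!available)
      return -1;
   *image_index = 2;
   return 0;
}

/* Returns the acquired index, or -1 on failure. */
static int64_t
next_image(int available)
{
   /* The explicit 0 is never observed: the result check below guards
    * every read. It exists only to quiet -Wmaybe-uninitialized. */
   uint32_t image_index = 0;

   if (acquire_image(available, &image_index) != 0)
      return -1;
   return image_index;
}
```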
Kai Wasserbäch
6f0647c0b2 nir: mark *prev_block as MAYBE_UNUSED in opt_peel_loop_initial_if
Only used, when asserts are enabled.

Fixes an unused-variable warning with gcc-8:
 ../../../src/compiler/nir/nir_opt_if.c: In function 'opt_peel_loop_initial_if':
 ../../../src/compiler/nir/nir_opt_if.c:109:15: warning: unused variable 'prev_block' [-Wunused-variable]
     nir_block *prev_block =
                ^~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-18 10:34:15 +10:00
Kai Wasserbäch
9387ca29ae util: mark s as MAYBE_UNUSED in _mesa_half_to_unorm8
Only used, when asserts are enabled.

Fixes an unused-variable warning with gcc-8:
 ../../../src/util/half_float.c: In function '_mesa_half_to_unorm8':
 ../../../src/util/half_float.c:189:14: warning: unused variable 's' [-Wunused-variable]
     const int s = (val >> 15) & 0x1;
               ^

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-18 10:34:12 +10:00
Timothy Arceri
0da93de9c8 util: add drirc workarounds for RAGE
This allows the game to run on wine (tested on radeonsi where we
have compat profile support).
2018-08-18 09:26:51 +10:00
Timothy Arceri
3f9d8e9c88 util: better handle program names from wine
For some reason wine will sometimes give us a windows style path
for an application. For example when running the 64bit version
of Rage wine gives a Unix style path, but when running the 32bit
version it gives a windows style path.

If we detect no '/' in the path at all it should be safe to
assume we have a wine application and instead look for a '\'.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-18 09:20:39 +10:00
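The fallback described in commit 3f9d8e9c88 above can be sketched like this. The helper name is hypothetical and this is not the code from the patch; it only shows the rule stated in the message: prefer '/' as the separator, and look for '\' only when the path contains no '/' at all.

```c
#include <string.h>

/* Hypothetical helper: return the program name part of a path. */
static const char *
program_basename(const char *path)
{
   /* Unix style path: take everything after the last '/'. */
   const char *sep = strrchr(path, '/');

   /* No '/' anywhere: assume a Windows style path handed to us by
    * wine and split on the last backslash instead. */
   if (!sep)
      sep = strrchr(path, '\\');

   return sep ? sep + 1 : path;
}
```

For example, both `/usr/bin/glxgears` and `C:\Games\Rage\Rage.exe` yield just the executable name, which is what application matching in drirc needs.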
Timothy Arceri
d0803dea11 nir: allow more nested loops to be unrolled
The innermost check was added to stop us from unrolling multiple
loops in a single pass, and to stop outer loops from unrolling.

When we successfully unroll a loop we need to run the analysis
pass again before deciding if we want to go ahead and unroll a
second loop.

However the logic was flawed because it never tried to unroll any
nested loops other than the first innermost loop it found.
If this innermost loop is not unrolled we end up skipping all
other nested loops.

This unrolls a loop in a Deus Ex: MD shader on ultra settings and
also unrolls a loop in a shader from the game Prey when running
on DXVK.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-18 09:03:13 +10:00
Ray Strode
9baff597ce gallium/winsys/kms: don't unmap what wasn't mapped
At the moment, depending on pipe transfer flags, the dumb
buffer map address can end up at either kms_sw_dt->ro_mapped
or kms_sw_dt->mapped.

When it's time to unmap the dumb buffer, both locations get unmapped,
even though one is probably initialized to 0.

That leads to the code segment getting unmapped at runtime and
crashes when trying to call into unrelated code.

This commit addresses the problem by using MAP_FAILED instead of
NULL for ro_mapped and mapped when the dumb buffer is unmapped,
and only unmapping mapped addresses at unmap time.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107098
Signed-off-by: Ray Strode <rstrode@redhat.com>
Fixes: d891f28df9 ("gallium/winsys/kms: Fix possible leak in map/unmap.")
Cc: Lepton Wu <lepton@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-17 17:16:32 +01:00
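A minimal sketch of the approach described in commit 9baff597ce above, using a hypothetical struct in place of the real winsys type. The field names `mapped` and `ro_mapped` come from the commit message; everything else is an assumption. MAP_FAILED is the value `mmap()` itself returns on error, so it can never collide with a valid mapping and works as an unambiguous "not mapped" sentinel.

```c
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for the dumb buffer bookkeeping. */
struct dumb_buffer {
   void *mapped;      /* read/write mapping, MAP_FAILED when unmapped */
   void *ro_mapped;   /* read-only mapping, MAP_FAILED when unmapped */
   size_t size;
};

static void
dumb_buffer_init(struct dumb_buffer *buf, size_t size)
{
   buf->mapped = MAP_FAILED;
   buf->ro_mapped = MAP_FAILED;
   buf->size = size;
}

/* Unmap only the addresses that were really mapped.
 * Returns the number of mappings released. */
static int
dumb_buffer_unmap(struct dumb_buffer *buf)
{
   int released = 0;

   if (buf->mapped != MAP_FAILED) {
      munmap(buf->mapped, buf->size);
      buf->mapped = MAP_FAILED;
      released++;
   }
   if (buf->ro_mapped != MAP_FAILED) {
      munmap(buf->ro_mapped, buf->size);
      buf->ro_mapped = MAP_FAILED;
      released++;
   }
   return released;
}
```

Resetting each field after `munmap()` also makes a second unmap call a no-op.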
Qiang Yu
0aa80abf25 loader: add dri_driver option to override dri driver to load
drirc implementation of MESA_LOADER_DRIVER_OVERRIDE which can be
used to override dri driver to load.

Usage:

override the dri driver for a device with a specific kernel driver name:

<device kernel_driver="kernel_driver_name">
  <option name="dri_driver" value="new_dri_driver" />
</device>

or

<device driver="loader" kernel_driver="kernel_driver_name">
  <option name="dri_driver" value="new_dri_driver" />
</device>

v2:
  add kernel_driver device attribute to specify kernel
  driver name instead of reuse driver attribute

v3:
  separate loader_get_kernel_driver_name into another patch
  separate add kernel_driver attribute into another patch

Suggested-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[v4 Emil: add HAVE_LIBDRM guard around __driConfigOptionsLoader and
loader_get_dri_config_driver]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-17 17:16:32 +01:00
Qiang Yu
3bbe180b98 xmlconfig: add kernel_driver device attribute
This attribute can be used by the loader to apply different
options to devices that use a specific kernel driver.

Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-17 17:16:32 +01:00
Qiang Yu
e8b91e99e9 loader: abstract loader_get_kernel_driver_name for reuse
This function can be shared by the following kernel_driver
drirc patch.

Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-17 17:16:32 +01:00
Qiang Yu
30b10dbb7c driconf: move ${sysconfdir}/drirc to ${datadir}/drirc.d/00-mesa-defaults.conf
${sysconfdir} is for storing admin config files, so move
this mesa default config file to ${datadir}/drirc.d.

Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-17 17:16:32 +01:00
Qiang Yu
04bdbbcab3 xmlconfig: read more config files from drirc.d/
Driver and application can put their drirc files in
${datadir}/drirc.d/ with name xxx.conf. Config files
will be read and applied in file name alphabetic order.

So there are three places for drirc listed in order:
1. /usr/share/drirc.d/
2. /etc/drirc
3. ~/.drirc

v4:
  fix meson build

v3:
  1. separate driParseConfigFiles refine into another patch
  2. fix entries[i] mem leak

v2:
  drop /etc/drirc.d

Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-17 17:16:32 +01:00
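The ordering rule from commit 04bdbbcab3 above (only `*.conf` files, applied in file name order) can be sketched as follows. This is a simplified illustration under stated assumptions, not the xmlconfig code: it works on an in-memory list of names rather than reading a directory, the function names are hypothetical, and it sorts by plain byte order with `strcmp()`, which may differ from the collation the real implementation uses.

```c
#include <stdlib.h>
#include <string.h>

/* Returns non-zero if `name` ends in ".conf". */
static int
has_conf_suffix(const char *name)
{
   size_t len = strlen(name);
   return len >= 5 && strcmp(name + len - 5, ".conf") == 0;
}

/* qsort() comparator for an array of C strings. */
static int
compare_names(const void *a, const void *b)
{
   return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Keep only the *.conf entries, sorted by name, at the front of
 * `names`. Returns how many entries were kept. */
static size_t
order_conf_files(const char **names, size_t count)
{
   size_t kept = 0;

   for (size_t i = 0; i < count; i++) {
      if (has_conf_suffix(names[i]))
         names[kept++] = names[i];
   }
   qsort(names, kept, sizeof(names[0]), compare_names);
   return kept;
}
```

With this ordering a file such as `00-mesa-defaults.conf` is read before `10-app.conf`, so later files can override the defaults.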
Emil Velikov
0da417129e xmlconfig: refine driParseConfigFiles to use parseOneConfigFile
Also prepare for the usage of following parseConfigDir patch.

Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil: add #include <limits.h>]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-17 17:16:32 +01:00
Jason Ekstrand
d9ea015ced anv/pipeline: Lower pipeline layouts etc. after linking
This allows us to use the link-optimized shader for determining binding
table layouts and, more importantly, URB layouts.  For apps running on
DXVK, this is extremely important as DXVK likes to declare max-size
inputs and outputs and this lets us massively shrink our URB space
requirements.

VkPipeline-db results (Batman pipelines only) on KBL:

    total instructions in shared programs: 820403 -> 790008 (-3.70%)
    instructions in affected programs: 273759 -> 243364 (-11.10%)
    helped: 622
    HURT: 42

    total spills in shared programs: 8449 -> 5212 (-38.31%)
    spills in affected programs: 3427 -> 190 (-94.46%)
    helped: 607
    HURT: 2

    total fills in shared programs: 11638 -> 6067 (-47.87%)
    fills in affected programs: 5879 -> 308 (-94.76%)
    helped: 606
    HURT: 3

Looking at shaders by hand, it makes the URB between TCS and TES go from
containing 32 per-vertex varyings per tessellation shader pair to a more
reasonable 8-12.  For a 3-vertex patch, that's at least half the URB
space no matter how big the patch section is.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-17 10:50:28 -05:00
Jason Ekstrand
f210a5f4bb anv/pipeline: Set tess IO read/written key fields in compile_*
We want these to be set as close to the final compile as possible so
that they are guaranteed to happen after nir_shader_gather_info is
called.  The next commit is going to move nir_shader_gather_info to
after the linking step which makes this necessary.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-17 10:50:28 -05:00
Jason Ekstrand
2e4094cd8f anv/pipeline: Use more fields from stage in compile_cs
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-17 10:50:28 -05:00
Jason Ekstrand
4af1a8c9e4 anv/apply_pipeline_layout: Add to the bind map instead of replacing it
This commit makes three changes.  One is to only walk the descriptors once
and set bind map sizes at the same time as filling out the entries.  The
second is to make the pass additive so that we can put stuff in the bind
map before applying the pipeline layout.  Third, we switch to using
designated initializers.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-17 10:50:28 -05:00
Jason Ekstrand
320dacb0a0 anv/lower_ycbcr: Use the binding array size for bounds checks
Because lower_ycbcr gets called before apply_pipeline_layout, the
indices are all logical and the binding layout HW size is actually too
big for the bounds check.  We should just use the regular logical array
size instead.

Fixes: f3e91e78a3 "anv: add nir lowering pass for ycbcr textures"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-17 10:50:28 -05:00
Mathieu Bridon
459ec5265c python: Open the template as text, with an explicit encoding
In commit bd27203f4d we changed this to
open in binary mode, to then explicitly decode the lines with the right
encoding.

Unfortunately, that broke the build on Windows, where the template file
can have '\r\n' as line terminators: opening in binary mode would keep
those terminators and break the regexp.

We need to go back to text mode, where the "universal newlines" mode
takes care of this.

However, to fix the initial issue, let's specify the encoding explicitly
when opening the file, and make sure it is open in text mode, so we only
get unicode strings.

Reviewed-by: Jose Fonseca <jfonseca@vmware>
2018-08-17 09:34:49 -06:00
Mathieu Bridon
f9415d760a python: Help Python 2 print the line
Reviewed-by: Jose Fonseca <jfonseca@vmware>
2018-08-17 09:33:16 -06:00
Rob Clark
a8ef7f5e02 freedreno/a6xx: streamout
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-17 11:04:21 -04:00
Rob Clark
7fa2a8c3c4 freedreno/a6xx: fragz fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-17 11:04:21 -04:00
Rob Clark
7c73d41160 freedreno/a6xx: scissor fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-17 11:04:21 -04:00
Rob Clark
b7f18e49b7 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-17 11:04:21 -04:00
Rob Clark
a4754c245b freedreno/a6xx: fix srgb
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-17 11:04:21 -04:00
Rob Clark
2658f63701 freedreno: fix dEQP-GLES3.functional.fence_sync.*
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-17 11:04:21 -04:00
Samuel Pitoiset
d27e1584ce radv/winsys: fix creating the BO list for virtual buffers
When the number of unique BO is 0, we optimize the list creation
by copying all buffers of the current CS directly into it. But
this is only valid if the CS doesn't have virtual buffers,
otherwise they are not added and hw might report VM faults.

This fixes VM faults with:
dEQP-VK.sparse_resources.image_sparse_binding.2d.rgba8ui.1024_128_1

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-17 15:00:21 +02:00
Kristian H. Kristensen
de3b34df97 freedreno: Add a6xx backend
This adds a freedreno backend for the a6xx generation GPUs, which at
the time of this commit is about 98% GLES2 conformant. Much remains to
be done - both performance work and feature work towards more recent
GLES versions, but this is a good start.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-16 19:13:36 -04:00
Rob Clark
6ee58e8257 freedreno: update generated headers
pull in a6xx registers

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-16 19:11:08 -04:00
Kristian H. Kristensen
e89683d5a2 freedreno: Fix warnings
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-16 19:11:08 -04:00
Dylan Baker
c782168751 scons: Check for mako 0.8.0
v2: - Use distutils to do the version checking

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107565
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-08-16 13:53:10 -07:00
Dylan Baker
64e4638130 scons: Require python 2.7
less than 2.7 is not supported.

v2: - Remove check for python >= 2.0, since we've already enforced 2.7

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-08-16 13:52:56 -07:00
Dylan Baker
5a8f824d8c meson: use python3 module to find python3
This handy helper is nice for OSes that are not linux or BSD like (mac
and windows) as it knows how to find python3 in odd places.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-08-16 13:51:44 -07:00
Dylan Baker
52194ae4df meson: Ensure that mako is >= 0.8.0
It's what autotools has required for a long time.

v3: - Use distutils.version.StrictVersion instead of comparing strings

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-08-16 13:50:51 -07:00
Eric Engestrom
03ec672213 svga: simplify Mesa version string
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-16 17:38:31 +01:00
Eric Engestrom
bc8abc1adf bin: always define MESA_GIT_SHA1 to make it directly usable in code
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-16 17:38:31 +01:00
Eric Engestrom
471f708ed6 git_sha1: simplify logic
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-16 17:38:31 +01:00
Eric Engestrom
9a6a631762 i965: drop unused assignment
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-16 17:38:31 +01:00
Eric Engestrom
7a1f4340b6 anv: drop cast-to-void of used variable
`device` is used 2 lines below, even visible in the diff context printed.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-16 17:38:31 +01:00
Eric Engestrom
6cf0d4f91f anv: use safer snprintf() to ensure NULL string-terminator
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-16 17:38:31 +01:00
Eric Engestrom
d6aea40326 intel/batch-decoder: replace local ARRAY_LENGTH() macro with global ARRAY_SIZE()
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-16 17:38:31 +01:00
Eric Engestrom
81c1989e4f intel: various python cleanups
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-16 17:38:25 +01:00
Eric Engestrom
aa78b29eba egl: check for buffer overflow *before* corrupting our memory
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-16 17:38:22 +01:00
Eric Engestrom
eb6b41749b egl/wayland: remove sign from bitfield formats
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-16 17:38:18 +01:00
Eric Engestrom
c5d9b48a71 mailmap: add various typos of Emil's address from the log
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-16 17:38:04 +01:00
Eric Engestrom
882ed53946 egl: some spelling fixes
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-16 14:15:18 +01:00
Samuel Pitoiset
f9e8456c39 radv: initialize the DCC predicate correctly when it's compressed
We have to do a fast-clear eliminate when clearing DCC
metadata with 0x20202020. I don't know if that fixes anything
but that seems correct to me.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-16 14:11:51 +02:00
Samuel Pitoiset
f3a78a9da0 radv: fix missing initialization of the conditional rendering state
This was missing when VK_EXT_conditional_rendering was
implemented. The predication type should be -1 to avoid
restoring previous state when performing a decompression pass
with DCC enabled.

Note that we don't have to handle secondary command buffers
because we don't support this feature currently.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-16 14:11:48 +02:00
Eric Engestrom
c5dd02287f bin: split write_if_different() out
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-16 12:33:35 +01:00
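Commit c5dd02287f only names the helper, `write_if_different()`. The sketch below shows what such a helper typically does, under the assumption that the goal is to leave a generated file untouched when its contents have not changed, presumably so that dependent targets are not rebuilt. The boolean return value is an addition for illustration and is not claimed to match the script in `bin/`.

```python
import os
import tempfile


def write_if_different(path, contents):
    """Write contents to path only when the file is missing or differs.

    Returns True when the file was (re)written. The return value is an
    illustrative addition, not part of the original helper.
    """
    try:
        with open(path, "r") as handle:
            if handle.read() == contents:
                return False  # identical: leave the file untouched
    except FileNotFoundError:
        pass  # no previous file: fall through and create it
    with open(path, "w") as handle:
        handle.write(contents)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "git_sha1.h")
        line = '#define MESA_GIT_SHA1 "git-abc123"\n'
        print(write_if_different(target, line))  # first call creates the file
        print(write_if_different(target, line))  # second call is a no-op
```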
Eric Engestrom
c2e00f9eee bin: whitespace cleanup
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-16 12:30:30 +01:00
Bas Nieuwenhuizen
011a811652 radv: Revert divisor = 0 case for vertex attribute extension.
Seems like DXVK depends on that and it might get reverted
upstream. Since apps are not supposed to use 0 in v2 anyway,
we should be safe implementing the old behavior there.

Fixes: 66e12451ac "radv: Update to new VK_EXT_vertex_attribute_divisor to version 2."
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-16 11:13:19 +02:00
Bas Nieuwenhuizen
3308db2dd7 radv: Possible on-demand compilation fix.
Seems that in a single case we use the renderpass before checking
the pipeline, so check the renderpass before we use it.

Fixes: fbcd167314 "radv: Add on-demand compilation of built-in shaders."
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-16 11:13:19 +02:00
Gert Wollny
1560c58b12 mesa/st: fix array indices off-by-one error in remapping
When moving the array sizes from the old list to the new one it was
not taken into account that the array indices start with one, but the
array_size array started at index zero, which resulted in incorrect array
sizes when arrays were merged. Correct this by copying the array_size
values of the retained arrays with an offset of -1.

Also fix whitespaces for the replaced lines.

Fixes: d8c2119f9b
  mesa/st/glsl_to_tgsi: Expose array live range tracking and merging
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-16 08:52:26 +02:00
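The off-by-one described in 1560c58b12 can be illustrated with a small sketch. The function and variable names are hypothetical; the only facts taken from the commit message are that array ids start at one while the size table starts at index zero.

```python
def remap_array_sizes(old_sizes, retained_ids):
    """Collect the sizes of the arrays that survive merging.

    old_sizes is a 0-based list, while array ids start at 1,
    so the size of array id N lives at old_sizes[N - 1].
    """
    return [old_sizes[array_id - 1] for array_id in retained_ids]


sizes = [4, 8, 16]  # sizes of the arrays with ids 1, 2 and 3
print(remap_array_sizes(sizes, [1, 3]))
```

Indexing with `old_sizes[array_id]` instead would return the size of the following array, which is the kind of mismatch the commit describes.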
Alexander Tsoy
9a96bf0ecd meson: fix build for egl platform_x11 without dri3 and gbm
Compiling EGL's platform_x11 without dri3 and gbm yields this compile
failure:

platform_x11 needs inc_loader:

../mesa-18.2.0-rc2/src/egl/drivers/dri2/platform_x11.c:48:10: fatal
error: loader.h: No such file or directory
 #include "loader.h"
          ^~~~~~~~~~

Fixes: 108d257a16 ("meson: build libEGL")
Bugzilla: https://bugs.gentoo.org/663534
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-08-15 16:37:16 -07:00
Jason Ekstrand
10f44da775 Revert "intel/nir: Call nir_lower_io_to_scalar_early"
Commit 4434591bf5 caused substantially more URB messages in
geometry and tessellation shaders.  Before we can really enable this
sort of optimization, we either need some way of combining them back
together into vectors or we need to do cross-stage vector element
elimination without splitting everything into scalars.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107510
Fixes: 4434591bf5 "intel/nir: Call nir_lower_io_to_scalar_early"
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-08-15 17:56:50 -05:00
Erik Faye-Lund
da1f7c56da i965: do not emit empty surface state
If called with an empty size, brw_emit_buffer_surface_state asserts.
We already have a dedicated helper for uploading nothing, so let's use
that instead.

Avoids an assert in
dEQP-GLES31.functional.shaders.opaque_type_indexing.ssbo.const_literal_vertex
when running a debug build of i965.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-15 23:23:16 +01:00
Sergii Romantsov
743dff1cca intel/ppgtt: 4096 replaced by PAGE_SIZE
Usage of number 4096 replaced by PAGE_SIZE.

Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-15 23:23:16 +01:00
Sergii Romantsov
24839663a4 intel/ppgtt: memory address alignment
Kernel (for ppgtt) requires memory address to be
aligned to page size (4096).

-v2: added marking that also fixes initial commit 01058a5522.
-v3: numbers replaced by PAGE_SIZE; buffer-object size is aligned
instead of alignment of offsets (Chris Wilson).
-v4: changes related to PAGE_SIZE moved to separate commit
-v5: restored alignment to page-size for 0-size.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106997
Fixes: a363bb2cd0 (i965: Allocate VMA in userspace for full-PPGTT systems.)
Fixes: 01058a5522 (i965: Add virtual memory allocator infrastructure to brw_bufmgr.)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-15 23:23:16 +01:00
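The page-size alignment discussed in 24839663a4 and 743dff1cca boils down to rounding a size up to a multiple of PAGE_SIZE. This is a generic sketch of that rounding, not the driver's code; `align_up` is a hypothetical helper name and the power-of-two requirement is an assumption of this particular bit trick.

```python
PAGE_SIZE = 4096


def align_up(value, alignment=PAGE_SIZE):
    """Round value up to the next multiple of a power-of-two alignment."""
    # The mask trick below is only valid for power-of-two alignments.
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (value + alignment - 1) & ~(alignment - 1)


for size in (1, 4096, 4097):
    print(size, "->", align_up(size))
```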
Timothy Arceri
f0a8accb0d radv: add Doom workaround
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-16 07:53:38 +10:00
Sergii Romantsov
efb28aa970 i965: Emitting 3DSTATE_SO_BUFFER of 0-size.
Avoided filling of whole structure and bo-allocation if
size of surface is 0.

Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
2018-08-15 13:15:28 -07:00
Erik Faye-Lund
98b3b6367a virgl: report actual max-texture sizes
Instead of doing conservative guesses, we should report the max levels
based on the max sizes we get from GL on the host.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
2018-08-15 18:48:16 +02:00
Erik Faye-Lund
825aaeae39 virgl: do not use SP_MAX_TEXTURE_*_LEVELS defines
These macro-names are also used for softpipe, so let's avoid confusion
by avoiding them. Besides, they are just used in one place in virgl, so
let's just inline them into the place they are used instead.

While we're at it, fix up an error in the comment for the 3D version.
Mesa computes max-size as 2^(n-1), which means this
should be 256 cubed, not 512 cubed. The other comments are correct.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
2018-08-15 18:48:08 +02:00
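The relation between a level count and a maximum texture size mentioned in 825aaeae39 (max size = 2^(n-1)), and the reverse direction needed when deriving levels from a host-reported size as in 98b3b6367a, can be sketched as follows. Both function names are hypothetical, and the reverse mapping assumes the usual floor(log2(size)) + 1 rule.

```python
def max_size_from_levels(levels):
    """Largest dimension addressable with the given number of mip levels."""
    return 1 << (levels - 1)


def levels_from_max_size(max_size):
    """Number of mip levels for a maximum dimension: floor(log2(size)) + 1."""
    return max_size.bit_length()


print(max_size_from_levels(9))    # nine levels
print(levels_from_max_size(256))  # a 256-texel maximum dimension
```

With this formula a level count of nine corresponds to 256 texels per side, which matches the "256 cubed" correction in the commit message.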
Dylan Baker
ef7ae84daf docs: Add news item for 18.1.6
2018-08-15 09:09:59 -07:00
Samuel Pitoiset
71d5b2fbf8 radv: disable the auto-waitcnt-before-barrier LLVM option
This option allows us to remove additional s_waitcnt instructions
because s_barrier internally does s_waitcnt 0.

Though, apparently there is a problem with LDS accesses that
causes rendering issues with FFXV and DXVK. Disable this
optimization for now (RadeonSI still uses it).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107460
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-15 16:21:50 +02:00
Samuel Pitoiset
85113c4d05 radv: fix memory leaks in radv_load_meta_pipeline()
Reported by Coverity.

Fixes: fbcd167314 ("radv: Add on-demand compilation of built-in shaders.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-15 16:20:58 +02:00
Samuel Pitoiset
17e79865cf radv: drop wrong initialization of COMPUTE_RESOURCE_LIMITS
The last parameter of radeon_set_sh_reg_seq() is the number of
dwords to emit. We were lucky because WAVES_PER_SH(0x3) is 3 but
it was initialized to 0.

COMPUTE_RESOURCE_LIMITS is correctly set when generating
compute pipelines, so we don't need to initialize it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-15 16:20:38 +02:00
Andres Gomez
53b4701cb0 docs: update calendar 18.2.0-rc3 is out
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-08-15 15:48:18 +03:00
Mauro Rossi
43318d5857 radv/meta_decompress: fix pointer to integer conversion
VK_NULL_HANDLE replaces NULL to avoid the following build error:

external/mesa/src/amd/vulkan/radv_meta_decompress.c:365:54: error:
incompatible pointer to integer conversion passing 'void *' to parameter
of type 'VkShaderModule' (aka 'unsigned long long') [-Werror,-Wint-conversion]
                VkResult ret = create_pipeline(cmd_buffer->device, NULL, samples,
                                                                   ^~~~
prebuilts/clang/host/linux-x86/clang-4053586/lib64/clang/5.0.300080/include/stddef.h:105:16:
note: expanded from macro 'NULL'
#  define NULL ((void*)0)
               ^~~~~~~~~~
external/mesa/src/amd/vulkan/radv_meta_decompress.c:97:32:
note: passing argument to parameter 'vs_module_h' here
                VkShaderModule vs_module_h,
                               ^
1 error generated.

Fixes: fbcd167314 ("radv: Add on-demand compilation of built-in shaders.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-15 14:34:50 +02:00
Mauro Rossi
73b342c7a5 egl/android: fix regression in drm_gralloc path (v2)
This patch fixes a regression in the mesa 18.2 and mesa-dev branches
for the HAVE_DRM_GRALLOC code path, which causes a black screen on Android
and prevents boot due to a SIGSEGV MAPERR crash related to improper handling
of the drm_gralloc DRM FD in the new droid_open_device() path.

Problem is due to c7bb82136b ("egl/android: Add DRM node probing and filtering")

To avoid the crash the former existing working droid_open_device() is restored,
renamed droid_open_device_drm_gralloc() and kept within HAVE_DRM_GRALLOC braces.

Tested with the mesa-dev and mesa 18.2 branches on oreo-x86: bootanimation
and Android GUI booting are fixed with i965, nouveau and radeon.
The changes are compatible with gbm_gralloc, I've tested build with hwc too.

(v2) remove indentation from HAVE_DRM_GRALLOC pre-processor directive

NOTE: Definition of enum{} for GRALLOC_MODULE_PERFORM_GET_DRM_FD
is not necessary and it's actually causing a redefinition building error,
because in HAVE_DRM_GRALLOC path gralloc_drm.h is already exported
by libgralloc_drm which is currently still a dependency.

Fixes: c7bb82136b ("egl/android: Add DRM node probing and filtering")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2018-08-15 14:07:49 +02:00
Tapani Pälli
656ccf4ef8 mesa: shader dump/read support for ARB programs
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106283
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-08-15 11:03:35 +03:00
Danylo Piliaiev
479a849ad6 glsl: Avoid calling get_array_element for scalar constants
Accessing a scalar constant as an array in a function call or
initializer list triggered an assert in get_array_element.
Examples:
   func(0[0]);
   vec2 t = { 0[0], 0 };

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107550

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-15 10:01:43 +03:00
Marek Olšák
bffa025ada radeonsi: enable 1 missing PS_SU perf counter on Polaris
2018-08-14 21:20:31 -04:00
Marek Olšák
df50099834 radeonsi: use radeon_info::name
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-14 21:20:31 -04:00
Marek Olšák
84652721b9 ac: add radeon_info::name
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-14 21:20:31 -04:00
Marek Olšák
de8d5edbc4 radeonsi: split si_clear_buffer to remove enum si_method
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:12 -04:00
Marek Olšák
4de92f2abb radeonsi: replace CP_DMA_USE_L2 with enum si_cache_policy
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:10 -04:00
Marek Olšák
bc132d62f9 radeonsi: declare coher in si_copy_buffer
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:09 -04:00
Marek Olšák
cddd7ce325 radeonsi: make PFP_SYNC_ME an explicit CP DMA flag
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:07 -04:00
Marek Olšák
277295962c radeonsi: don't use emit_data->args in load_emit
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:06 -04:00
Marek Olšák
8fb34050b5 radeonsi: don't use emit_data->args in store_emit
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:04 -04:00
Marek Olšák
a2c18bfbe3 radeonsi: don't use emit_data->args in atomic_emit
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:03 -04:00
Marek Olšák
297fb213b3 radeonsi: don't use emit_data->args in build_interp_intrinsic
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:01 -04:00
Marek Olšák
99ae440d4e radeonsi: inline atomic_fetch_args
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:59 -04:00
Marek Olšák
267e92893c radeonsi: inline store_fetch_args
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:58 -04:00
Marek Olšák
f15e55aa8a radeonsi: inline load_fetch_args
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:56 -04:00
Marek Olšák
2c94f321eb radeonsi: merge txq_emit and resq_emit
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:55 -04:00
Marek Olšák
a14c803166 radeonsi: inline resq_fetch_args
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:54 -04:00
Marek Olšák
347e52adcd radeonsi: inline txq_fetch_args
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:52 -04:00
Marek Olšák
c9b2ce2672 radeonsi: use get_resinfo directly in lower_gather4_integer
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:36 -04:00
Marek Olšák
7804ddaf87 radeonsi: inline tex_fetch_args into build_tex_intrinsic
The diff looks like it moves code that I didn't touch.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:34 -04:00
Marek Olšák
da1d8adc29 radeonsi: remove fetch_args callbacks for ALU instructions
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:33 -04:00
Marek Olšák
ac72a6bd0b radeonsi: move internal TGSI shaders into si_shaderlib_tgsi.c
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:31 -04:00
Marek Olšák
0ca8294ece radeonsi: implement EXT_window_rectangles
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:19:02 -04:00
Marek Olšák
465e929d6a gallium/u_blitter: save/restore window rectangles
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:19:01 -04:00
Marek Olšák
15fc0f8d4a noop: implement set_window_rectangles
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:18:59 -04:00
Marek Olšák
7c8716e4fb ddebug: implement set_window_rectangles
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:18:51 -04:00
Rodrigo Vivi
44f1dcf9b3 i965: Add a new CFL PCI ID.
One more CFL ID added to spec.

Align with kernel commit d0e062ebb3a4 ("drm/i915/cfl:
Add a new CFL PCI ID.")

Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-14 15:46:56 -07:00
Rob Clark
70bf639328 freedreno/ir3: add support for a6xx 'merged' register set
Starting with a6xx, half and full precision registers conflict.  Which
makes things a bit more efficient, ie. if some parts of the shader are
heavy on half-precision and others on full precision, you don't have to
allocate the worst case for both.  But it means we need to setup some
additional conflicts.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
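The "additional conflicts" mentioned in 70bf639328 can be pictured with a small sketch. It assumes the common layout in which each full-precision register overlays two consecutive half-precision registers; the commit message does not spell out the mapping, so both the layout and the helper names below are assumptions.

```python
def half_regs_for_full(full_reg):
    """Half-precision slots assumed to alias one full-precision register."""
    return (2 * full_reg, 2 * full_reg + 1)


def conflicts(full_reg, half_reg):
    """True when the two registers share storage in a merged register file."""
    return half_reg in half_regs_for_full(full_reg)


print(half_regs_for_full(1))
print(conflicts(1, 3), conflicts(1, 4))
```

A register allocator working on a merged file would have to treat every such pair as interfering, which is the extra bookkeeping the commit refers to.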
Rob Clark
4813060ed4 freedreno/ir3: small RA cleanup
Collapse is_temp() into its only callsite, and pass compiler object as
struct rather than void.  Just cleanups to reduce noise in next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark
fdd35f497b freedreno/ir3: stop hard-coding FS input regs
We originally did this because at the time we didn't know all the
bitfields to configure where various frag shader sysvals went.  But
now we do.

So switch to using sysvals for all the frag shader inputs.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark
e97b56172c freedreno/ir3: use r63.x for unused inputs
This way, unused sysval inputs, like frag_vcoord, get the correct regid
value to disable the input.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark
066930e54d freedreno/ir3: create all inputs in first block
create_input()/create_input_compmask() should take the ctx as arg,
rather than block, to enforce that all inputs are created in the first
block, so that RA sees them as live at the start of the shader.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark
62da068fd3 freedreno/ir3: rename s/frag_pos/frag_vcoord/g
Make it more clear that this is varying fetch related.  Also fixup some
comments.  Just cleanup for next patches.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark
4a7f9feada compiler: add SYSTEM_VALUE_VARYING_COORD
Used internally in freedreno/ir3 for the vec2 value that hw passes to
shader to use as coordinate for bary.f (varying fetch) instruction.
This is not the same as SYSTEM_VALUE_FRAG_COORD.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark
b5a098b202 freedreno/ir3: move per-generation compiler config
Move it from the compile ctx to the compiler object, before adding
new things for a6xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Bas Nieuwenhuizen
66e12451ac radv: Update to new VK_EXT_vertex_attribute_divisor to version 2.
Behavior wrt firstInstance got changed, and a divisor of 0 has been
disallowed.

The new version of the ext got published in specification 1.1.81.

Sending to stable since the only known user is DXVK, which needs
this for correctness.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.2 <mesa-stable@lists.freedesktop.org>
2018-08-14 22:13:09 +02:00
Bas Nieuwenhuizen
4bb6c49375 radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9.
Follow radeonsi.

Fixes: 3665f66ef2 "radv: Add support for ETC2 textures."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 22:11:04 +02:00
Bas Nieuwenhuizen
bf33ca7512 radv: Fix missing Android platform define.
CC: <mesa-stable@lists.freedesktop.org>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 22:11:04 +02:00
Rob Clark
13b9d32fb1 freedreno: move free() into fdN_context_destroy()
Following patches will be doing further cleanup after calling
fd_context_destroy() so it is easier if we move the free() into
the per-gen backend code.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 15:46:34 -04:00
Jonathan Marek
dc9705f30d freedreno: a2xx: ir2 update
this patch brings a number of changes to ir2:
-ir2 now generates CF clauses as necessary during assembly. this simplifies
 fd2_program/fd2_compiler and is necessary to implement optimization passes
-ir2 now has separate vector/scalar instructions. this will make it easier
 to implement scheduling of scalar+vector instructions together. dst_reg
 is also now separate from src registers instead of a single list
-ir2 now implements register allocation. this makes it possible to compile
 shaders which have more than 64 TGSI registers
-ir2 now implements the following optimizations: removal of IN/OUT MOV
 instructions generated by TGSI and removal of unused instructions when
 some exports are disabled
-ir2 now allows full 8-bit index for constants
-ir2_alloc no longer allocates 4 times too many bytes

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 12:46:25 -04:00
Andres Gomez
5406eb5513 docs: update calendar 18.2.0-rc1 and 18.2.0-rc2 are out
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-08-14 17:07:09 +03:00
Bas Nieuwenhuizen
fbcd167314 radv: Add on-demand compilation of built-in shaders.
In environments where we cannot cache, e.g. Android (no homedir),
ChromeOS (readonly rootfs) or sandboxes (cannot open cache), the
startup cost of creating a device in radv is rather high, due
to compiling all possible built-in pipelines up front. Depending on
the CPU, this meant a 1-4 sec cost for creating a Device.

For CTS this cost is unacceptable, and likely for starting random
apps too.

So if there is no cache, with this patch radv will compile shaders
on demand. Once there is a cache from the first run, even if
incomplete, the driver knows that it can likely write the cache
and precompiles everything.

Note that I did not switch the buffer and itob/btoi compute pipelines
to on-demand, since you cannot really do anything in Vulkan without
them and there are only a few.

This reduces the CTS runtime for the no caches scenario on my
threadripper from 32 minutes to 8 minutes.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-14 10:26:24 +02:00
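The policy described in fbcd167314 (precompile everything when a cache is present, otherwise compile on first use) can be sketched as below. This is an illustration of the idea only: the class, its methods and the pipeline keys are hypothetical and do not mirror radv's data structures.

```python
class MetaPipelines:
    """Holds built-in pipelines, compiled eagerly or on demand."""

    def __init__(self, keys, compile_fn, cache_available):
        self._compile = compile_fn
        self._built = {}
        if cache_available:
            # A cache is present, so precompile everything up front.
            for key in keys:
                self._built[key] = compile_fn(key)

    def get(self, key):
        # Without a cache, pay the compile cost on first use only.
        if key not in self._built:
            self._built[key] = self._compile(key)
        return self._built[key]


compiled = []


def fake_compile(key):
    """Stand-in for an expensive shader compile; records what was built."""
    compiled.append(key)
    return "pipeline:" + key


lazy = MetaPipelines(["blit", "clear", "resolve"], fake_compile,
                     cache_available=False)
lazy.get("clear")
lazy.get("clear")
print(compiled)
```

In the no-cache case only the pipelines that are actually requested get compiled, which is where the startup saving reported in the commit would come from.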
Bas Nieuwenhuizen
24a9033d6f radv: Refactor blit pipeline creation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-14 10:26:11 +02:00
Bas Nieuwenhuizen
806a792b43 radv: Make fs key exemplars ordered to be a reverse fs_key lookup.
While at it, share the exemplars and account for a non-occurring
fs key.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-14 10:26:06 +02:00
Dave Airlie
0be5e9f5a1 virgl: ARB_texture_barrier support
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2018-08-14 16:55:56 +10:00
Dylan Baker
6d61aed231 docs: update calendar, add news item and link release notes for 18.1.6
Signed-off-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-13 10:06:45 -07:00
Dylan Baker
973ae7a06b docs: Add sha256 sums for 18.1.6
2018-08-13 10:05:44 -07:00
Dylan Baker
66c8a64e67 docs: Add release notes for 18.1.6
2018-08-13 10:05:42 -07:00
Alejandro Piñeiro
668ab8aeb1 mesa/glspirv: fix compilation with MSVC
From AppVeyor #8582, it seems that MSVC doesn't like uint, so this
patch replaces it with unsigned.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-08-13 18:57:18 +02:00
Eric Engestrom
f976d22759 travis: install correct version of mako for each build system
Meson now uses python3, so let's add a block for Autotools, move that
line into the buildsys-specific blocks, and set the correct version for
Meson.

Fixes: 2ee1c86d71 "meson: Build with Python 3"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-13 17:29:42 +01:00
Erik Faye-Lund
ae5770171c mesa/st/glsl_to_tgsi: fixup copy-paste mistake
This is clearly a copy-paste error; if we validate the reladdr2-pointer,
we don't want to traverse to the reladdr-pointer. Especially since the
check above shows that reladdr could be NULL here.

Noticed by Coverity.

CID: 1438389, 1438390
Fixes: 568bda2f2d ("mesa/st/glsl_to_tgsi: Split arrays whose elements are only accessed directly")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-08-13 18:15:36 +02:00
Neil Roberts
c91a5f70fb i965/nir: Use the nir copy of shader_info to handle gl_PatchVerticesIn
Instead of using the copy of shader_info stored in gl_program, it now
uses the one in nir_shader. This is needed for SPIR-V because the
info.tess.tcs_vertices_out is filled in via _mesa_spirv_to_nir which
happens much later than with a GLSL shader. The copy of shader_info in
gl_program is only updated later via brw_shader_gather_info but that
is too late.

For GLSL this shouldn't create any problems because the nir copy of
the shader_info is immediately copied from the gl_program in
glsl_to_nir.

v2: updated after commit "i965: Combine both gl_PatchVerticesIn
    lowering passes." (488972) (Alejandro Piñeiro)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Neil Roberts
a105c1e6e5 mesa/glspirv: Set separate_shader on shader_info
The value is copied from the gl_program. If we don’t do this then it
will get reset back to zero in brw_shader_gather_info. This isn’t a
problem for GLSL because in that case the nir_shader is initialised
with a copy of the shader_info from the gl_program.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Iago Toral Quiroga
40947d4744 mesa/glspirv: pick off the only entry point we need
This is the same thing we do for Vulkan drivers.

This is needed to pass the following CTS test:
KHR-GL45.gl_spirv.spirv_modules_shader_binary_multiple_shader_objects_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro
32e1d4c34b mesa/glspirv: compute double inputs and remap attributes
Input locations used by input attributes are not handled in the same
way in OpenGL vs Vulkan. There is a detailed explanation of such
differences in the following commit:

c2acf97fcc

So with this commit, the same adjustment that is done after
glsl_to_nir, is being done after spirv_to_nir, when it is used on
OpenGL (ARB_gl_spirv).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro
d6c8066663 nir/glsl: make nir_remap_attributes public
As we plan to reuse it for ARB_gl_spirv implementation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro
af194bd38e nir/lower_samplers: don't assume a deref for both texture and sampler srcs
Commit "nir: Use derefs in nir_lower_samplers"
(75286c2d08) assumes one deref for both
the texture and the sampler. However there are cases (on OpenGL, using
ARB_gl_spirv) where SPIR-V does not provide a sampler, like for
texture query levels ops. Although we could make spirv_to_nir
provide a sampler deref for those cases, it is not really needed, and
wrong from the Vulkan point of view.

This patch fixes the following (borrowed) tests run on SPIR-V mode:
  arb_compute_shader/execution/basic-texelFetch.shader_test
  arb_gpu_shader5/execution/sampler_array_indexing/fs-simple-texture-size.shader_test
  arb_texture_query_levels/execution/fs-baselevel.shader_test
  arb_texture_query_levels/execution/fs-maxlevel.shader_test
  arb_texture_query_levels/execution/fs-miptree.shader_test
  arb_texture_query_levels/execution/fs-nomips.shader_test
  arb_texture_query_levels/execution/vs-baselevel.shader_test
  arb_texture_query_levels/execution/vs-maxlevel.shader_test
  arb_texture_query_levels/execution/vs-miptree.shader_test
  arb_texture_query_levels/execution/vs-nomips.shader_test
  glsl-1.30/execution/fs-textureSize-compare.shader_test

v2: merge lower_tex_src_to_offset and calc_sampler_offsets together,
    update texture/sampler index and texture_array_size directly on
    lower_tex_src_to_offset (Jason)
v3: clarify one comment (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro
fe2de39fb2 nir/linker: take into account hidden uniforms
So they are not exposed through the introspection API.

It is worth noting that the number of hidden uniforms of GLSL linking
vs SPIR-V linking would be somewhat different due to the different order
of the nir lowerings/optimizations.

For example: gl_FbWposYTransform. This is introduced as part of
nir_lower_wpos_ytransform. On GLSL that is executed after the IR-based
linking. So that means that on GLSL the UniformStorage will not
include this uniform. With the SPIR-V linking, that uniform is already
present, but marked as hidden. So it will be included on the
UniformStorage, but as hidden.

One alternative would be to create a special how_declared for that case, but
that seemed overkill. Using hidden should be ok as long as it is used
properly.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro
5332d7582d nir: add how_declared to nir_variable.data
Equivalent to the already existing how_declared in GLSL IR. The only
difference is that we are not adding all the declaration_type values
available in GLSL, only the ones that we will use in the short term. We
can add more modes if needed in the future.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:26 +02:00
Neil Roberts
be6f472b23 spirv: Make VertexIndex and VertexId both non-zero-based
GLSL has gl_VertexID which is supposed to be non-zero-based.

SPIR-V has both VertexIndex and VertexId builtins whose meanings are
defined by the APIs.

Vulkan defines VertexIndex as being non-zero-based. In Vulkan VertexId
and InstanceId have no meaning and are pretty much just reserved for
OpenGL at this point.

GL_ARB_gl_spirv removes VertexIndex and defines VertexId to be the same
as gl_VertexID (which is also non-zero-based).

Previously in Mesa it was treating VertexIndex as non-zero-based and
VertexId as zero-based, so it was breaking for GL. This behaviour was
apparently based on Khronos bug 14255. However that bug doesn’t seem
to have made a final decision for VertexId.

Assuming there really is no other definition for VertexId for Vulkan
it seems better to just make them both have the same value.

v2: update comment and commit descriptions, based on Jason Ekstrand
    explanation of the meaning/rationale behind all those builtins
    (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-13 16:23:36 +02:00
Alejandro Piñeiro
624c00f1a6 spirv: fill info.gs.input_primitive too
info.gs.output_primitive was already being filled. Not sure why this
is not needed on Vulkan, but we found it to be needed for
ARB_gl_spirv. Specifically, this is needed to get the following test
passing:

KHR-GL45.gl_spirv.spirv_validation_builtin_variable_decorations_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 12:56:51 +02:00
Tapani Pälli
ed94a5799d docs/features: mark GL_EXT_render_snorm as done for i965
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-13 13:08:22 +03:00
Tapani Pälli
fa9e6c235d i965: enable EXT_render_snorm
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-08-13 12:03:17 +03:00
Tapani Pälli
0d356cf478 mesa: enable EXT_render_snorm extension
Patch sets additional formats as renderable and enables the extension
when OpenGL ES 3.1 is supported.

v2: instead of dummy_true, have a separate toggle for extension
    (Eric Anholt)

v3: add missing checks, simplify some existing checks and fix
    glCopyTexImage2D check (Nanley Chery)

    add SHORT and BYTE support in read_pixels_es3_error_check

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-08-13 12:03:17 +03:00
Kenneth Graunke
de57926dc9 blorp: Properly handle Z24X8 blits.
One of the reasons we didn't notice that R24_UNORM_X8_TYPELESS
destinations were broken was that an earlier layer was swapping it
out for B8G8R8A8_UNORM.  That made Z24X8 -> Z24X8 blits work.

However, R32_FLOAT -> R24_UNORM_X8_TYPELESS was still totally broken.
The old code only considered one format at a time, without thinking
that format conversion may need to occur.

This patch moves the translation out to a place where it can consider
both formats.  If both are Z24X8, we continue using B8G8R8A8_UNORM to
avoid having to do shader math workarounds.  If we have a Z24X8
destination, but a non-matching source, we use our shader hacks to
actually render to it properly.
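
The decision described above can be sketched roughly as follows. This is
only an illustration in Python, not blorp's actual code: the function name,
the returned tuple, and the assumption that both sides of a Z24X8 -> Z24X8
blit are viewed as B8G8R8A8_UNORM are hypothetical. R32_UINT is taken from
the companion commit 8a29086285.

```python
# Hypothetical sketch; names are illustrative, not blorp's actual API.
Z24X8 = "R24_UNORM_X8_TYPELESS"


def choose_blit_formats(src_format, dst_format):
    """Return (src_view, dst_view, needs_shader_math) for a blit."""
    if src_format == Z24X8 and dst_format == Z24X8:
        # Matching formats: a plain bit copy through a renderable format.
        return ("B8G8R8A8_UNORM", "B8G8R8A8_UNORM", False)
    if dst_format == Z24X8:
        # Format conversion needed: render as R32_UINT and pack in the shader.
        return (src_format, "R32_UINT", True)
    # No Z24X8 destination involved: nothing to translate.
    return (src_format, dst_format, False)
```

The point of looking at both formats together is that the cheap
B8G8R8A8_UNORM path is only valid when no conversion is required.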

Fixes: 804856fa57 (intel/blorp: Handle more exotic destination formats)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-11 12:34:01 -07:00
Kenneth Graunke
8a29086285 blorp: Don't try to use R32_UNORM for R24_UNORM_X8_TYPELESS rendering.
The hardware doesn't support rendering to R24_UNORM_X8_TYPELESS, so
Jason decided to fake it with a bit of shader math and R32_UNORM RTs.

The only problem is that R32_UNORM isn't renderable either...so we've
just traded one bad format for another.

This patch makes us use R32_UINT instead.

Fixes: 804856fa57 (intel/blorp: Handle more exotic destination formats)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-11 12:33:27 -07:00
Jason Ekstrand
a9f7bcfdf9 intel: Switch the order of the 2x MSAA sample positions
The Vulkan 1.1.82 spec flipped the order to better match D3D.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-08-11 10:58:12 -05:00
Gert Wollny
8a87138885 mesa/st/tests: Add array life range estimation and renumbering tests
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
0981fc84df mesa/st/tests: Add array life range tests infrastructure to common test class
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
d8c2119f9b mesa/st/glsl_to_tgsi: Expose array live range tracking and merging
This patch ties in the array split, merge, and interleave code.

shader-db changes in the TGSI code are:

              original code  |  array-merge  |       change
              mean      max  |  mean    max  | best  mean %  worst
      -----------------------------------------------------------
      arrays   0.05       2  |   0.00     0  |  -2   -100      0
total temps    5.05      21  |   4.92    20  | -15   -2.59     1
      instr   55.33     988  |  55.20   988  | -15   -0.24     0

Evaluation:

Run shader-db in single thread mode (otherwise the output is
not ordered and the best and worst column don't make sense) to
get results pre-stats.txt and post-stats.txt. Then using
python pandas:

 import pandas as pd
 old_stats = pd.read_csv('pre-stats.txt')
 new_stats = pd.read_csv('post-stats.txt')
 omean = old_stats.mean()
 omax = old_stats.max()
 nmean = new_stats.mean()
 nmax = new_stats.max()
 delta =  new_stats - old_stats
 pd.concat([omean, omax, nmean, nmax, delta.min(),
            delta.mean()/old_stats.mean()*100, delta.max()],
            axis=1, keys=['mean', 'max', 'mean', 'max', 'best',
            'avg change %', 'worst'])

v4: - Correct typo and add bugs that are fixed by this series.
    - Update stats and describe stats evaluation

Bugzilla:
  https://bugs.freedesktop.org/show_bug.cgi?id=105371
  https://bugs.freedesktop.org/show_bug.cgi?id=100200
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
c317d0ab54 mesa/st/glsl_to_tgsi: add array life range evaluation into tracking code
v4: Also track the register given in inst->resource. (thanks: Benedikt Schemmer
    for testing the patches on radeonsi, which revealed that I was missing
    tracking this)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
5e58eb37f1 mesa/st/glsl_to_tgsi: add class for array access tracking
Because of the indirect access it is impossible to obtain an accurate per
component and array element tracking. Therefore, the tracking is simplified
to only track whether any element was accessed, and whether this happened
conditionally in a loop. In addition, while tracking of temporaries requires
a per-component tracking that is later fused, for arrays only the component
access mask is needed. The resulting tracking code and evaluation of the array
live range is sufficiently different from the evaluation of the live range of
temporaries to justify implementing this in a different class instead of
adding more complexity to the already existing code for temporary life
range evaluation.

v4: Update commit message to make it clearer why this class is separate from
    the tracking of temporaries.
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
7d55d01b53 mesa/st/glsl_to_tgsi: move evaluation of read mask up in the call hierarchy
In preparation for the array live range tracking, the evaluation of the read
mask is moved out of the register live range tracking to the enclosing call
of the generalized read access tracking.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
f2a4636339 mesa/st/glsl_to_tgsi: rename access_record to register_merge_record and some more renames
In preparation for adding the tracking of the live range, the classes that refer
to temporary registers are renamed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
8c89728889 mesa/st/tests: Add tests for array merge helper classes.
v2: - Define tests also in the meson.build file.
v4: - Check no-op mapping of all bits.
    - Convert tests to the new class layout used in the merge evaluation.
    - remove dependency on llvm in meson build (Thanks Dylan Baker for pointing
       out that this might not be needed)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
12316aa217 mesa/st/glsl_to_tgsi: Add array merge logic
v4: - Update the code to use the new merge logic.
    - Use a cleaner, class-based approach for the evaluation of merges.
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
d097ef4204 mesa/st/glsl_to_tgsi: Add helper classes to apply array merging and interleaving
v4: - Remove logic for evaluation of swizzles and merges since this
        was moved to array_live_range. This class now only handles the
        actual remapping.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
d54c2f92f9 mesa/st/glsl_to_tgsi: Add helper class for array live range merging and interleaving
This class holds the array length, live range, and accessed components, and
it implements the logic for evaluating how arrays are merged and interleaved.

v4: - Add logic to evaluate merge and interleave of a pair of arrays to
      the class array_live_range.
    - document class
    - update commit message

Thanks Nicolai Hähnle for the pointers given.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
331ae3cde5 mesa/st/glsl_to_tgsi: rename lifetime to register_live_range
On one hand "live range" is the term used in the literature, and on the
other hand a distinction is needed from the array live ranges.

v4: Fix indentation and whitespace

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
f40c9d0225 mesa/st/glsl_to_tgsi: Properly resolve life times in simple if/else + use constructs
In constructs like the one below, the live range estimation currently extends
the live range of t unnecessarily to the whole loop, because it is not detected
that t is unconditionally written and later read only in the "if (a)" scope.

  while (foo)  {
    ...
    if (a) {
       ...
       if (b)
         t = ...
       else
         t = ...
       x = t;
       ...
    }
     ...
  }

This patch adds a unit test for this case and corrects the minimal live range estimation
accordingly.

v4: update comments
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
568bda2f2d mesa/st/glsl_to_tgsi: Split arrays whose elements are only accessed directly
Arrays whose elements are only accessed directly are replaced by the
corresponding number of temporary registers. By doing so the otherwise
reserved register range becomes subject to further optimizations like
copy propagation and register merging.

Thanks to the resulting reduced register pressure this patch makes
the piglits

  spec/glsl-1.50/execution -
      variable-indexing/vs-output-array-vec3-index-wr-before-gs
      geometry/max-input-components

pass on r600 (barts) where they would fail before with a "GPR limit exceeded"
error (even with the spilling that was recently added).

v2: * rename method dissolve_arrays to split_arrays
    * unify the tracking and remapping methods for src and dst registers
    * also track access to arrays via reladdr*

v3: * enable this optimization only if the driver requests register merge

v4: * Correct comments
    * Also update inst->resource if it is an array element
      (thanks: Benedikt Schemmer for testing the patches on radeonsi, which
       revealed that I was missing tracking this)

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
b1cead3add mesa/st/glsl_to_tgsi: Add method to collect some TGSI statistics
When mesa is compiled in debug mode then this adds the possibility
to print out some statistics about the translated and optimized TGSI
shaders to a file.

The functionality is enabled by setting the environment variable

   GLSL_TO_TGSI_PRINT_STATS

to the file name where the statistics should be collected. The file is
opened in append mode so that statistics from various runs will be
accumulated.

v4: Make access to the log file thread-safe (thanks for pointing this out,
    Nicolai Hähnle)
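
As an illustration of the behaviour described above (file name taken from an
environment variable, append mode, serialized access), here is a minimal
Python sketch. It is not the Mesa implementation, which is C++; the helper
name and the format of the statistics lines are made up for this example.

```python
import os
import tempfile
import threading

# One lock shared by all writers so that lines from different threads
# are never interleaved.
_stats_lock = threading.Lock()


def append_stats(line, env_var="GLSL_TO_TGSI_PRINT_STATS"):
    """Append one line to the file named by env_var; do nothing if unset."""
    path = os.environ.get(env_var)
    if not path:
        return False
    with _stats_lock:
        # Append mode: earlier runs are kept and new results accumulate.
        with open(path, "a") as stats_file:
            stats_file.write(line + "\n")
    return True


# Small demonstration using a temporary file as the statistics log.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
os.environ["GLSL_TO_TGSI_PRINT_STATS"] = demo_path
append_stats("arrays=0 temps=5 instr=55")  # hypothetical line format
append_stats("arrays=0 temps=4 instr=55")
with open(demo_path) as stats_file:
    collected = stats_file.read().splitlines()
os.remove(demo_path)
del os.environ["GLSL_TO_TGSI_PRINT_STATS"]
```
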
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny
be95ca9be7 Gallium/tgsi: Correct signedness of return value of bit operations
The GLSL operations findLSB, findMSB, and countBits always return
a signed integer type. Let TGSI reflect this.

v2: Properly set values in infer_(src|dst)_type   (Thanks Roland
    Scheidegger for pointing out problems with my 1st approach)
v2: Set values in the common infer_type code path, and only add
    the correct source type for UMSB (Roland Scheidegger)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-08-11 11:14:29 +02:00
Mathieu Bridon
2ee1c86d71 meson: Build with Python 3
Now that all the build scripts are compatible with both Python 2 and 3,
we can flip the switch and tell Meson to use the latter.

Since Meson already depends on Python 3 anyway, this means we don't need
two different Python stacks to build Mesa.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-10 15:15:09 -07:00
Mathieu Bridon
bd27203f4d python: Rework bytes/unicode string handling
In both Python 2 and 3, opening a file without specifying the mode will
open it for reading in text mode ('r').

On Python 2, the read() method of a file object opened in mode 'r' will
return byte strings, while on Python 3 it will return unicode strings.

Explicitly specifying the binary mode ('rb') then decoding the byte
string means we always handle unicode strings on both Python 2 and 3.

Which in turn means all re.match(line) will return unicode strings as
well.

If we also make expandCString return unicode strings, we don't need the
call to the unicode() constructor any more.

We were using the ugettext() method because it always returns unicode
strings in Python 2, unlike the gettext() one, which returns
byte strings. The ugettext() method doesn't exist on Python 3, so we
must use the right method on each version of Python.

The last hurdles are that Python 3 doesn't let us concatenate unicode
and byte strings directly, and that Python 2's stdout wants encoded byte
strings while Python 3's wants unicode strings.

With these changes, the script gives the same output on both Python 2
and 3.
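
The "read bytes, then decode" approach can be sketched as below. This is an
illustration only, not the code from the script; the helper name and the
UTF-8 encoding are assumptions.

```python
import io
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Open in binary mode and decode, so the result is always a text string."""
    with io.open(path, "rb") as f:
        return f.read().decode(encoding)


# Write a small UTF-8 encoded file, then read it back as text.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, "wb") as f:
        f.write(u"caf\u00e9\n".encode("utf-8"))
    text = read_text(path)
finally:
    os.remove(path)
```

Because the decoding step is explicit, the same code yields a unicode string
on both Python 2 and Python 3.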

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-10 15:14:48 -07:00
Mathieu Bridon
15ac05fd45 python: Fix inequality comparisons
On Python 3, executing `foo != bar` will first try to call
foo.__ne__(bar), and fall back on the opposite result of foo.__eq__(bar).

Python 2 does not do that.

As a result, those __eq__ methods were never called when we were
testing for inequality.

Explicitly adding the __ne__ methods fixes this issue, in a way that is
compatible with both Python 2 and 3.

However, this means the __eq__ methods are now called when testing for
`foo != None`, so they need to be guarded correctly.
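
A minimal sketch of the pattern, using a made-up class (Token is not a name
from the Mesa scripts):

```python
class Token(object):
    """Hypothetical value class, used only for illustration."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Guard: comparing against None or an unrelated type must not
        # fail with an AttributeError.
        if not isinstance(other, Token):
            return NotImplemented
        return self.name == other.name

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result
```

With this in place `Token("a") != Token("a")` is False and
`Token("a") != None` is True on both Python versions.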

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-10 08:45:59 -07:00
Gert Wollny
e94095ec30 mesa/st: ETC2 now uses R8G8B8A8_SRGB as fallback
The check for ETC2 compatibility was not updated when the fallback
format was changed.

Fixes: 71867a0a61
   st/mesa: Fall back to R8G8B8A8_SRGB for ETC2

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-10 10:09:22 +02:00
Mathieu Bridon
08fe9b3e3a python: Simplify list sorting
Instead of copying the list, then sorting the copy in-place, we can just
get a new sorted copy directly.
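
For illustration, with a made-up list of names:

```python
names = ["radeonsi", "anv", "i965"]

# Before: copy the list, then sort the copy in place.
copy = list(names)
copy.sort()

# After: sorted() returns a new sorted list and leaves the input untouched.
ordered = sorted(names)
```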

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-09 16:49:19 -07:00
Mathieu Bridon
8d3ff6244c python: Use key-functions when sorting containers
In Python 2, the traditional way to sort containers was to use a
comparison function (which returned either -1, 0 or 1 when passed two
objects) and pass that as the "cmp" argument to the container's sort()
method.

Python 2.4 introduced key-functions, which instead only operate on a
given item, and return a sorting key for this item.

In general, this runs faster, because the cmp-function has to get run
multiple times for each item of the container.

Python 3 removed the cmp-function, enforcing usage of key-functions
instead.

This change makes the script compatible with Python 2 and Python 3.
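
A short sketch of the difference, using made-up data. functools.cmp_to_key
is shown only as a bridge for comparison functions that are hard to rewrite;
whether the script needed it is not stated here.

```python
from functools import cmp_to_key

# Hypothetical (name, bits) pairs, for illustration only.
formats = [("R32_UINT", 32), ("R8_UNORM", 8), ("R16_FLOAT", 16)]

# Key function: called once per item, returns the value to sort by.
by_bits = sorted(formats, key=lambda item: item[1])


# Old-style comparison function: returns -1, 0 or 1 for a pair of items.
def compare_bits(a, b):
    return (a[1] > b[1]) - (a[1] < b[1])


# cmp_to_key() wraps it so that it still works where only key= is accepted.
by_bits_cmp = sorted(formats, key=cmp_to_key(compare_bits))
```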

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-09 16:49:19 -07:00
Mathieu Bridon
1e668ca111 python: Better check for integer types
Python 3 lost the long type: now everything is an int, with the right
size.

This commit makes the script compatible with Python 2 (where we check
for both int and long) and Python 3 (where we only check for int).
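
One common way to write such a check, given here as a sketch rather than the
exact code of the commit (the names integer_types and is_integer are
assumptions):

```python
import sys

if sys.version_info[0] >= 3:
    integer_types = (int,)
else:
    # The name "long" only exists on Python 2; this branch is never
    # evaluated on Python 3.
    integer_types = (int, long)  # noqa: F821


def is_integer(value):
    """True for any integer value, whatever its size."""
    return isinstance(value, integer_types)
```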

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-09 16:49:19 -07:00
Mathieu Bridon
14f1ab998f python: Do not mix bytes and unicode strings
Mixing the two is a long-standing recipe for errors in Python 2, so much
so that Python 3 now completely separates them.

This commit stops treating both as if they were the same, and in the
process makes the script compatible with both Python 2 and 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-09 16:49:19 -07:00
Mathieu Bridon
c644b2d7a7 python: Explicitly use a list
On Python 2, the builtin function filter() returns a list.

On Python 3, it returns an iterator.

Since we want to use those objects in contexts where we need lists, we
need to explicitly turn them into lists.

This makes the code compatible with both Python 2 and Python 3.
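
For example, with made-up values:

```python
values = [3, 0, 5, 0, 8]

# filter() alone gives a list on Python 2 but a lazy iterator on Python 3.
# Wrapping it in list() gives the same result on both.
nonzero = list(filter(None, values))

# Operations that need a real list now work everywhere.
count = len(nonzero)
first = nonzero[0]
```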

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-09 16:49:18 -07:00
Mathieu Bridon
d9ca4a172e python: Use the right function for the job
The code was just reimplementing itertools.combinations_with_replacement
in a less efficient way.

This does change the order of the results slightly, but it should be ok.
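
As an illustration, the hand-written loop below is a hypothetical stand-in
for this kind of reimplementation (it is not the code that was removed), next
to the library call:

```python
import itertools


def pairs_by_hand(items):
    """Every unordered pair of items, including an item paired with itself."""
    result = []
    for i, a in enumerate(items):
        for b in items[i:]:
            result.append((a, b))
    return result


types = ["float", "int", "uint"]  # made-up input
library = list(itertools.combinations_with_replacement(types, 2))
by_hand = pairs_by_hand(types)
```

Comparing the two as sorted lists avoids depending on the iteration order,
which is the only thing the commit message says may differ.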

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-09 16:49:18 -07:00
Eric Anholt
b618d7ea59 egl: Fix leak of X11 pixmaps backing pbuffers in DRI3.
This is basically copied from the DRI2 destroy path.  Without this,
Raspberry Pi would quickly run out of CMA during the EGL tests in the CTS
due to all the pixmaps lying around.

Fixes: f35198bade ("egl/x11: Implement dri3 support with loader's dri3 helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-09 13:12:13 -07:00
Kenneth Graunke
08a5c395ab intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.
When the SIMD16 Gen4-5 fragment shader payload contains source depth
(g2-3), destination stencil (g4), and destination depth (g5-6), the
single register of stencil makes the destination depth unaligned.

We were generating this instruction in the RT write payload setup:

   mov(16)   m14<1>F   g5<8,8,1>F   { align1 compr };

which is illegal: instructions with a source region spanning more than
one register need to be aligned to even registers.  This is because the
hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the
compressed instruction into two mov(8)'s.

I believe this would cause the hardware to load g5 twice, replicating
subspan 0-1's destination depth to subspan 2-3.  This showed up as 2x2
artifact blocks in both TIS-100 and Reicast.
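
The arithmetic behind this can be checked with a few lines of Python. The
function names are made up; only the (nr | 1) versus (nr + 1) expressions
come from the description above.

```python
def second_register_expected(nr):
    """Register holding the second half of a compressed operand."""
    return nr + 1


def second_register_hardware(nr):
    """What the hardware computes when it splits the instruction."""
    return nr | 1


for nr in (4, 5):
    print("g%d: expected g%d, hardware reads g%d"
          % (nr, second_register_expected(nr), second_register_hardware(nr)))
# prints:
# g4: expected g5, hardware reads g5
# g5: expected g6, hardware reads g5
```

For an even register the two expressions agree; for the odd g5 the second
half resolves to g5 again, which matches the suspected double load.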

Normally, we rely on the register allocator to even-align our virtual
GRFs.  But we don't control the payload, so we need to lower SIMD widths
to make it work.  To fix this, we teach lower_simd_width about the
restriction, and then call it again after lower_load_payload (which is
what generates the offending MOV).

Fixes: 8aee87fe4c (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Diego Viola <diego.viola@gmail.com>
2018-08-09 12:33:41 -07:00
Kenneth Graunke
11b9f63a74 i965: Only enable depth IZ signals if there's an actual depthbuffer.
According to the G45 PRM Volume 2 Page 265 we're supposed to only set
these signals when there is an actual depth buffer.  Note that we
already do this for the stencil buffer by virtue of brw->stencil_enabled
invoking _mesa_is_stencil_enabled(ctx) which checks whether the current
drawbuffer's visual has stencil bits (which is updated based on what
buffers are bound).  We just need to do it for depth as well.

Not observed to fix anything.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-09 12:33:38 -07:00
Adam Jackson
63a6b719d9 glx: GLX_MESA_multithread_makecurrent is direct-only
This extension is not defined for indirect contexts. Marking it as
"client only", as the old code did here, would make the extension
available in indirect contexts, even though the server would certainly
not have it in its extension list.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-09 12:33:14 -04:00
Eric Engestrom
fcf259ef97 anv: set error in all failure paths
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 5b196f39bd "anv/pipeline: Compile to NIR in compile_graphics"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-09 11:20:27 +01:00
Eric Engestrom
aac80f7597 intel/tools: add missing variable initialisation
Fixes: 6a60beba40 "intel/tools: Add an error state to aub translator"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-09 11:20:18 +01:00
vadym.shovkoplias
e0de26eacc drirc: Allow extension midshader for Metro Redux
This fixes both Metro 2033 Redux and Metro Last Light Redux

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99730
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-09 13:13:20 +03:00
Tapani Pälli
03a5acec68 glsl: handle error case with ast_post_inc, ast_post_dec
Return ir_rvalue::error_value with ast_post_inc, ast_post_dec if
parser error was emitted previously. This way process_array_size
won't see bogus IR generated like with commit 9c676a6427.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98699
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-08-09 13:07:16 +03:00
Eric Anholt
fdfb689a48 vc4: Implement texture_subdata() to directly upload tiled data.
This avoids a memcpy into a temporary in the upload path.

Improves x11perf -putimage100 performance by 12.1586% +/- 1.38155% (n=145)
2018-08-08 18:14:31 -07:00
Eric Anholt
25bee5ef9e vc4: Handle partial loads/stores of tiled textures.
Previously, we would load out the tile-aligned area, update the raster
copy, and store it back.  This was a huge cost for XPutImage calls to the
screen under glamor.

Instead, implement a general load/store path that walks over the source
x/y writing into the corresponding pixel of the destination (using clever
math from
https://fgiesen.wordpress.com/2011/01/17/texture-tiling-and-swizzling/).
If things are aligned, we go through the previous utile-at-a-time loop.

Improves x11perf -putimage10 performance by 139.777% +/- 2.83464% (n=5)
Improves x11perf -putimage100 performance by 383.908% +/- 22.6297% (n=11)
Improves x11perf -getimage10 performance by 2.75731% +/- 0.585054% (n=145)
2018-08-08 16:45:44 -07:00
Eric Anholt
3e06b918aa vc4: Compile the LT image helper per cpp we might load/store.
For the partial load/store support I'm about to add, we want the memcpy to
be compiled out to a single load/store.  This should also eliminate the
calls to vc4_utile_width/height().

Improves x11perf -putimage100 performance by  3.76344% +/- 1.16978% (n=15)
2018-08-08 15:53:25 -07:00
Eric Anholt
d6a174669f vc4: Refactor to reuse the LT tile walking code.
2018-08-08 12:34:48 -07:00
Juan A. Suarez Romero
a9fb331ea7 wayland/egl: update surface size on window resize
According to EGL 1.5 spec, section 3.10.1.1 ("Native Window Resizing"):

  "If the native window corresponding to _surface_ has been resized
   prior to the swap, _surface_ must be resized to match. _surface_ will
   normally be resized by the EGL implementation at the time the native
   window is resized. If the implementation cannot do this transparently
   to the client, then *eglSwapBuffers* must detect the change and
   resize surface prior to copying its pixels to the native window."

So far, resizing a native window in Wayland/EGL was interpreted in Mesa
as a request to resize, which is not executed until the first draw call.
And hence, surface size is not updated until executing it. Thus,
querying the surface size with eglQuerySurface() after a window resize
still returns the old values.

This commit updates the surface size values as soon as the resize is
done, even when the real resize is done in the draw call. This makes
any native window resize request take effect immediately, so that
if the user calls eglQuerySurface() it will return the new resized
values.

v2: update surface size if there isn't a back surface (Daniel)

CC: Daniel Stone <daniel@fooishbar.org>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-08-08 18:29:58 +02:00
Juan A. Suarez Romero
1fe7cbdf05 wayland/egl: initialize window surface size to window size
When creating a window surface with eglCreateWindowSurface(), the
width and height returned by eglQuerySurface(EGL_{WIDTH,HEIGHT}) are
invalid until buffers are updated (like calling glClear()).

But according to EGL 1.5 spec, section 3.5.6 ("Surface Attributes"):

  "Querying EGL_WIDTH and EGL_HEIGHT returns respectively the width and
   height, in pixels, of the surface. For a window or pixmap surface,
   these values are initially equal to the width and height of the
   native window or pixmap with respect to which the surface was
   created"

This fixes dEQP-EGL.functional.color_clears.* CTS tests

v2:
- Do not modify attached_{width,height} (Daniel)
- Do not update size on resizing window (Brendan)

CC: Daniel Stone <daniel@fooishbar.org>
CC: Brendan King <brendan.king@imgtec.com>
CC: mesa-stable@lists.freedesktop.org
Tested-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-08-08 18:28:52 +02:00
Juan A. Suarez Romero
f9d0e7d3bc travis: make drivers explicit in Meson targets
Like in the autotools target, make the list of drivers to be built in
each of the Meson targets explicit.

This will help to identify missing dependencies and other issues more
easily.

CC: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-08 17:56:32 +02:00
Brian Paul
51e878cdb3 svga: use pipe_sampler_view::target in svga_set_sampler_views()
instead of the underlying texture's target.  This fixes an issue
where the TGSI sampler type was not agreeing with the sampler view
target/type.  In particular, this fixes a Mint 19 XFCE desktop
scaling issue because the TGSI code was using a RECT sampler but
the sampler view's underlying texture was PIPE_TEXTURE_2D.

We want to use the sampler view's type rather than the underlying
resource, as we do for the view's surface format.

No piglit regressions.

VMware issue 2156696.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-08-08 08:20:10 -06:00
Brian Paul
92e5dc94ac svga: use SVGA3D_RS_FILLMODE for vgpu9
I'm not sure why we didn't support this in the past, but fillmode
is supported by all renderers nowadays.

Also fix the logic in svga_create_rasterizer_state() to avoid a few
swtnl cases.

No piglit regressions

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-08-08 08:20:10 -06:00
Brian Paul
a45b495700 svga: add TGSI_SEMANTIC_FACE switch case in svga_swtnl_update_vdecl()
Fixes failed assertion running Piglit polygon-mode-face test.
Though, the test still does not pass.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-08-08 08:20:10 -06:00
Brian Paul
92e7342a6f xlib: remove unused Fake_glXGetAGPOffsetMESA() function
To silence compiler warning.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-08 08:20:09 -06:00
Brian Paul
6ff4795c62 gl.h: define GLeglImageOES depending on GL_EXT_EGL_image_storage
To avoid duplicate typedef with the definition in glext.h

V2: test for both GL_OES_EGL_image and GL_EXT_EGL_image_storage in
case both the GL and GLES headers are included.  Per Emil.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107488
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-08-08 08:20:01 -06:00
Emil Velikov
32aa7ff647 Android: copy -fno*math* options from the autotools build
Add -fno-math-errno and -fno-trapping-math to the build.

Mesa does not depend on the functionality provided, thus this should
result in slightly faster code and smaller binaries.

Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-08 13:45:55 +01:00
Emil Velikov
315c46cfdc autotools: use correct gl.pc LIBS when using glvnd
This is more of a hack, since glvnd itself should be providing the file.
Until that happens, ensure the libs is correctly set to -lGL

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-08-08 13:37:09 +01:00
Emil Velikov
8dc96416c9 glx: automake: add egl.pc/headers TODO when using glvnd
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-08-08 13:37:09 +01:00
Emil Velikov
94ed4c4a16 egl: automake: add egl.pc/headers TODO when using glvnd
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-08-08 13:37:09 +01:00
Emil Velikov
25a9450a44 autotools: error out when building with mangling and glvnd
It's not a thing that can work, nor is it a wise idea to attempt.

v2: Tweak error message (Dylan)

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
2018-08-08 13:37:09 +01:00
Emil Velikov
d5ac236471 autotools: error out when using the broken --with-{gl, osmesa}-lib-name
The toggles were broken with the introduction of --enable-mangling.
Fixing that up might be possible, but it's not worth the complexity
since one can rename the libraries at any point.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-08-08 13:37:09 +01:00
Emil Velikov
4f2b73d9fd meson: recommend building the surfaceless platform
It has no special requirements, and its size and build-time cost are
effectively zero.

v2: Rebase

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-08-08 13:37:09 +01:00
Emil Velikov
a7ea7511ba automake: require shared glapi when using DRI based libGL
This has been a requirement for ages, yet it seems like we never
explicitly errored out during configure.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-08-08 13:37:09 +01:00
Emil Velikov
834036500c ttn: remove {varying_slot, frag_result}_to_tgsi_semantic helpers
The respective drivers have been updated and the helpers are no longer
needed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-08-08 13:33:07 +01:00
Juan A. Suarez Romero
db432194a1 travis: remove libedit-dev dependency in LLVM 6.0 targets
In LLVM <6.0 we explicitly added libedit-dev, as it was required to
satisfy apt dependencies.

In LLVM 6.0, this is not required anymore, so let's remove it.

CC: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-08 13:00:33 +02:00
Erik Faye-Lund
0f450e0cbe glsl_to_tgsi: plumb image writable through to driver
The virgl driver cares about the writable-flag on image definitions,
because it re-emits GLSL from the TGSI. However, so far it was hardcoded
to true in glsl_to_tgsi, which causes problems when virglrenderer is
running on top of GLES 3.1, where not all formats are supported for
writable images.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-08 09:35:09 +02:00
Eric Anholt
cfe69d0aaa vc4: Fix vc4_fence_server_sync() on pre-syncobj kernels.
We won't have an FD if we're just having the server wait on a fence
created by eglCreateSyncKHR().  Our seqno fences will happen in order, so
server-side waits are no-ops in that case.  Fixes
dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.buffers.gen_delete

Fixes: b0acc3a562 ("broadcom/vc4: Native fence fd support")
2018-08-07 17:00:49 -07:00
Eric Anholt
69158c452b vc4: Ignore samplers for finding uniform offsets.
Fixes:
dEQP-GLES2.shaders.struct.uniform.sampler_array_fragment
dEQP-GLES2.shaders.struct.uniform.sampler_array_vertex
dEQP-GLES2.shaders.struct.uniform.sampler_nested_fragment
dEQP-GLES2.shaders.struct.uniform.sampler_nested_vertex

Cc: mesa-stable@lists.freedesktop.org
2018-08-07 17:00:22 -07:00
Eric Anholt
e24a8e5232 vc4: Extend dumping of uniforms in QIR and in the command stream.
Similar to what I did for V3D, provide some description of the uniforms.
2018-08-07 17:00:22 -07:00
Eric Anholt
3954331aff vc4: Pull uinfo->data[i] dereference out to the top of the loop.
Reduces the size of vc4_uniforms.o by about 10%.  We would basically
always end up loading the cacheline of uinfo->data[i] anyway, so it should
be good for performance as well as making the code a bit cleaner.
2018-08-07 17:00:22 -07:00
Eric Anholt
550e9c917c vc4: Make sure to emit a tile coordinates packet between two MSAA loads.
The HW only executes a load once the tile coordinates packet happens, and
only tracks one at a time, so by emitting our two MSAA loads back to back
we would end up with an undefined color or Z buffer.  The simulator
doesn't seem to care, but sync up the RCL generation with the kernel
anyway.

Fixes dEQP-EGL.functional.render.multi_context.gles2.rgb888_window
2018-08-07 17:00:22 -07:00
Eric Anholt
9ab6912a00 vc4: Respect a sampler view's first_layer field.
Fixes texturing from EGL images created from cubemap faces, as in
dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture

Cc: mesa-stable@lists.freedesktop.org
2018-08-07 17:00:22 -07:00
Dave Airlie
fe0a3a45bb virgl: add ARB_shader_clock support
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-08-08 08:36:40 +10:00
Mathieu Bridon
ba1ebf2ee1 python: Specify the template output encoding
We're trying to write a unicode string (i.e decoded) to a file opened
in binary (i.e encoded) mode.

In Python 2 this works, because of the automatic conversion between
byte and unicode strings.

In Python 3 this fails though, as no automatic conversion is attempted.

This change makes the scripts compatible with both versions of Python.
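
A minimal sketch of the underlying problem, not of the commit's actual
template code: a binary-mode stream accepts bytes only, so the decoded
string is encoded explicitly before writing. The sample string and the
in-memory stream are illustrative assumptions.

```python
import io

# A decoded (unicode) string containing a non-ASCII character.
text = u"Tapani P\u00e4lli"

# A binary-mode stream expects bytes, so encode explicitly before writing.
binary_out = io.BytesIO()
binary_out.write(text.encode("utf-8"))

# Writing the decoded string directly fails on Python 3, because no
# automatic conversion to bytes is attempted.
try:
    io.BytesIO().write(text)
    direct_write_ok = True
except TypeError:
    direct_write_ok = False
```

Naming the encoding in one place keeps the output bytes identical no
matter which interpreter runs the script.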

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-07 13:28:35 -07:00
Mathieu Bridon
e1b88aee68 python: Fix rich comparisons
Python 3 doesn't call objects' __cmp__() methods any more to compare
them. Instead, it requires implementing the rich comparison methods
explicitly: __eq__(), __ne__(), __lt__(), __le__(), __gt__() and __ge__().

Fortunately Python 2 also supports those.

This commit only implements the comparison methods which are actually
used by the build scripts.
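
As an illustration only, a hypothetical class (not taken from the build
scripts) that implements just the comparisons it needs. Sorting relies on
__lt__() alone, so that method plus the equality pair is enough here.

```python
class Version(object):
    """Hypothetical value type used only for this illustration."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        return self._key() == other._key()

    def __ne__(self, other):
        # Python 2 does not derive __ne__ from __eq__, so define it explicitly.
        return not self == other

    def __lt__(self, other):
        return self._key() < other._key()

    def __hash__(self):
        # Defining __eq__ removes the default hash on Python 3; restore one.
        return hash(self._key())


# sorted() only needs __lt__(), on both Python 2 and Python 3.
versions = sorted([Version(1, 2), Version(1, 0), Version(0, 9)])
```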

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-07 13:10:34 -07:00
Mathieu Bridon
9b6746b7c0 python: Use explicit integer divisions
In Python 2, divisions of integers return an integer:

    >>> 32 / 4
    8

In Python 3 though, they return floats:

    >>> 32 / 4
    8.0

However, Python 3 has an explicit integer division operator:

    >>> 32 // 4
    8

That operator exists on Python >= 2.2, so let's use it everywhere to
make the scripts compatible with both Python 2 and 3.

In addition, using __future__.division tells Python 2 to behave the same
way as Python 3, which helps ensure the scripts produce the same output
in both versions of Python.
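
A short sketch of the operator in use. The helper name and its arguments
are hypothetical and do not come from the Mesa scripts.

```python
def tiles_needed(width, tile_width):
    """Number of tiles needed to cover `width`, using integer arithmetic only."""
    # With int operands, // yields an int on both Python 2.2+ and Python 3.
    return (width + tile_width - 1) // tile_width


quotient = 32 // 4   # integer result on either version
floored = -7 // 2    # // rounds toward negative infinity, not toward zero
```

Note that // is floor division, so negative operands round down rather
than truncating.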

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-07 13:07:44 -07:00
Chad Versace
3dc22381fa egl/main: Add bits for EGL_KHR_mutable_render_buffer
A follow-up patch enables EGL_KHR_mutable_render_buffer for Android.
This patch is separate from the Android patch because I think it's
easier to review the platform-independent bits separately.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-07 11:11:05 -07:00
Chad Versace
5c6d6eedb3 dri: Add param driCreateConfigs(mutable_render_buffer)
If set, then the config will have __DRI_ATTRIB_MUTABLE_RENDER_BUFFER,
which translates to EGL_MUTABLE_RENDER_BUFFER_BIT_KHR.

Not used yet.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-07 11:11:05 -07:00
Chad Versace
bbe2d50b58 dri: Define DRI_MutableRenderBuffer extensions
Define extensions DRI_MutableRenderBufferDriver and
DRI_MutableRenderBufferLoader. These are the two halves for
EGL_KHR_mutable_render_buffer.

Outside the DRI code there is one additional change.  Add
gl_config::mutableRenderBuffer to match
__DRI_ATTRIB_MUTABLE_RENDER_BUFFER. Neither are used yet.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-07 11:11:05 -07:00
Chad Versace
eabf59791e egl/dri2: In dri2_make_current, return early on failure
This pulls an 'else' block into the function's main body, making the
code easier to follow.

Without this change, the upcoming EGL_KHR_mutable_render_buffer patch
transforms dri2_make_current() into spaghetti.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-07 11:11:05 -07:00
Chad Versace
f48f9a78da egl: Simplify queries for EGL_RENDER_BUFFER
There exist *two* queryable EGL_RENDER_BUFFER states in EGL:
eglQuerySurface(EGL_RENDER_BUFFER) and
eglQueryContext(EGL_RENDER_BUFFER).

These changes eliminate potentially very fragile code in the upcoming
EGL_KHR_mutable_render_buffer implementation.

* eglQuerySurface(EGL_RENDER_BUFFER)

  The implementation of eglQuerySurface(EGL_RENDER_BUFFER) contained
  abstruse logic which required comprehending the specification
  complexities of how the two EGL_RENDER_BUFFER states interact.  The
  function sometimes returned _EGLContext::WindowRenderBuffer, sometimes
  _EGLSurface::RenderBuffer. Why? The function tried to encode the
  actual logic from the EGL spec. When did the function return which
  variable? Go study the EGL spec, hope you understand it, then hope
  Mesa mutated the EGL_RENDER_BUFFER state in all the correct places.
  Have fun.

  To simplify eglQuerySurface(EGL_RENDER_BUFFER), and to improve
  confidence in its correctness, flatten its indirect logic. For pixmap
  and pbuffer surfaces, simply return a hard-coded literal value, as the
  spec suggests. For window surfaces, simply return
  _EGLSurface::RequestedRenderBuffer.  Nothing difficult here.

* eglQueryContext(EGL_RENDER_BUFFER)

  The implementation of this suffered from the same issues as
  eglQuerySurface, and the solution is the same.  To improve confidence
  in its correctness, flatten its indirect logic. For pixmap and pbuffer
  surfaces, simply return a hard-coded literal value, as the spec
  suggests. For window surfaces, simply return
  _EGLSurface::ActiveRenderBuffer.
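
The flattened logic of the two queries can be sketched as follows. This is
a Python model, not the C implementation. The class and function names are
hypothetical, and the hard-coded values for pixmap and pbuffer surfaces are
an assumption based on the EGL specification rather than on this commit
text. The context query is modelled only for a context with a bound draw
surface.

```python
PIXMAP, PBUFFER, WINDOW = "pixmap", "pbuffer", "window"
SINGLE_BUFFER, BACK_BUFFER = "EGL_SINGLE_BUFFER", "EGL_BACK_BUFFER"

# Assumed literal values for non-window surfaces.
FIXED_RENDER_BUFFER = {PIXMAP: SINGLE_BUFFER, PBUFFER: BACK_BUFFER}


class Surface(object):
    """Hypothetical stand-in for the relevant _EGLSurface fields."""

    def __init__(self, kind, requested=BACK_BUFFER, active=BACK_BUFFER):
        self.kind = kind
        self.requested_render_buffer = requested  # models RequestedRenderBuffer
        self.active_render_buffer = active        # models ActiveRenderBuffer


def query_surface_render_buffer(surface):
    """Model of eglQuerySurface(EGL_RENDER_BUFFER)."""
    if surface.kind in FIXED_RENDER_BUFFER:
        return FIXED_RENDER_BUFFER[surface.kind]
    return surface.requested_render_buffer


def query_context_render_buffer(draw_surface):
    """Model of eglQueryContext(EGL_RENDER_BUFFER) for a bound draw surface."""
    if draw_surface.kind in FIXED_RENDER_BUFFER:
        return FIXED_RENDER_BUFFER[draw_surface.kind]
    return draw_surface.active_render_buffer
```

Each query reads exactly one field for window surfaces, so the two states
can diverge without either function needing to know about the other.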

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-07 11:11:05 -07:00
Marek Olšák
d145e33e7c radeonsi: set GLC=1 for all write-only shader resources
2018-08-07 13:52:34 -04:00
Marek Olšák
2ab8cf6de5 radeonsi: don't load block dimensions into SGPRs if they are not variable
2018-08-07 13:52:34 -04:00
Juan A. Suarez Romero
03cff7ecd8 travis: meson/Vulkan requires LLVM 6.0
RADV now requires LLVM 6.0.

Fixes: fd1121e839 ("amd: remove support for LLVM 5.0")
CC: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-08-07 19:29:29 +02:00
Juan A. Suarez Romero
80f937ea4d travis: add ubuntu-toolchain-r-test
LLVM 6.0 requires libstdc++4.9, which is not available in the main Travis
repository.

v2: LLVM 6.0 requires libstdc++4.9, rather than GCC 4.9 (Jan Vesely)

Fixes: fd1121e839 ("amd: remove support for LLVM 5.0")
CC: Marek Olšák <marek.olsak@amd.com>
CC: Emil Velikov <emil.velikov@collabora.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-07 19:27:07 +02:00
Emil Velikov
85cad15298 egl: set EGL_BAD_NATIVE_PIXMAP in the copy_buffers fallback
As the spec says:

  EGL_BAD_NATIVE_PIXMAP is generated if the implementation
  does not support native pixmaps.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-07 17:59:24 +01:00
Emil Velikov
5463064f7a egl/x11: use the no-op dri2_fallback_copy_buffers for swrast
Currently dri2_copy_buffers is used for swrast, which depends on the
DRI2_FLUSH extension. Since that's not a thing on software based
drivers we crash out.

Do the slightly more graceful thing of returning EGL_FALSE.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-07 17:59:09 +01:00
Emil Velikov
670cd4080b egl: remove unneeded _eglGetNativePlatform check
There's little point in calling _eglGetNativePlatform() in
eglCopyBuffers. The platform returned should be identical to the one
already stored in our _EGLDisplay.

In the following corner case, the check is incorrect.

The function _eglGetNativePlatform effectively invokes the old-style
eglGetDisplay platform selection. Thus if the EGL_PLATFORM platform does
not match with the EGL_EXT_platform_* used to create the display we'll
error out.

Addresses the egl-copy-buffers piglit test.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-07 17:58:52 +01:00
Emil Velikov
b4b277f770 travis: use https for all the links
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-07 17:27:06 +01:00
Emil Velikov
6b8657aff0 autoconf: stop exporting internal wayland details
With version v1.15 the "code" option was deprecated in favour of
"private-code" or "public-code".

Previously the generated interface symbol was exported, which is a bad
idea: it's an internal implementation detail and others may misuse it.

That was the case with libva approx. 1 year ago. Since then libva was
fixed, so we can finally hide it by using "private-code"

Inspired by similar xserver patch by Adam Jackson.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-07 17:23:17 +01:00
Emil Velikov
2f1d9e6cb8 meson: stop exporting internal wayland details
With version v1.15 the "code" option was deprecated in favour of
"private-code" or "public-code".

Previously the generated interface symbol was exported, which is a bad
idea: it's an internal implementation detail and others may misuse it.

That was the case with libva approx. 1 year ago. Since then libva was
fixed, so we can finally hide it by using "private-code"

Inspired by similar xserver patch by Adam Jackson.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-07 17:23:17 +01:00
Emil Velikov
c077b74ee8 meson: use dependency()+find_program() for wayland-scanner
Helps when the native wayland-scanner is located outside of PATH.
Inspired by the xserver code ;-)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-07 17:23:17 +01:00
Emil Velikov
54d844897f swr: don't export swr_create_screen_internal
With earlier rework the user and provider of the symbol are within the
same binary. Thus there's no point in exporting the function.

Spotted while reviewing patch from Chuck, that nearly added another
unneeded PUBLIC function.

Cc: Chuck Atkins <chuck.atkins@kitware.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Fixes: f50aa21456 ("swr: build driver proper separate from rasterizer")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-08-07 17:23:17 +01:00
Eric Engestrom
e02f061b69 meson: install KHR/khrplatform.h when needed
Fixes: f7d42ee7d3 "include: update GL & GLES headers (v2)"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-07 15:57:32 +01:00
Eric Engestrom
ed07e831a8 i965: gen_shader_sha1() doesn't use the brw_context
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-07 14:20:50 +01:00
Eric Engestrom
87c156183c configure: install KHR/khrplatform.h when needed
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107511
Fixes: f7d42ee7d3 "include: update GL & GLES headers (v2)"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Brad King <brad.king@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-07 14:20:50 +01:00
Lionel Landwerlin
303e7b39b5 intel: don't build tools without -Dtools=intel
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107487
Fixes: 4334196ab3 ("intel: tools: simplify meson build")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-07 11:58:47 +01:00
Erik Faye-Lund
c4f183492d virgl: update virgl_hw.h from virglrenderer
This just makes sure we're currently up-to-date with what
virglrenderer has.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-07 09:38:41 +02:00
Erik Faye-Lund
0914e1464e virgl: rename msaa_sample_positions -> sample_locations
This matches what this field is called in virglrenderer's copy of
this.

This reduces the diff between the two different versions of
virgl_hw.h, and should make it easier to upgrade the file in
the future.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-07 09:38:27 +02:00
Eric Anholt
9507e03699 vc4: Fix a leak of the no-vertex-elements workaround BO.
Fixes: bd1925562a ("vc4: Convert the driver to emitting the shader record using pack macros.")
2018-08-06 19:10:06 -07:00
Eric Anholt
86095e9bb1 vc4: Fix context creation when syncobjs aren't supported.
Noticed when trying to run current Mesa on rpi's downstream kernel.

Fixes: b0acc3a562 ("broadcom/vc4: Native fence fd support")
2018-08-06 19:10:06 -07:00
Eric Anholt
1561e4984e v3d: Emit the VCM_CACHE_SIZE packet.
This is needed to ensure that we don't get blocked waiting for VPM space
with bin/render overlapping.

Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-08-06 13:03:23 -07:00
Eric Anholt
5d49076990 v3d: Drop "VC5" from the renderer string.
VC5 isn't a useful name any more, just stick to v3d.
2018-08-06 13:03:23 -07:00
Eric Anholt
50a8713d4f v3d: Avoid spilling that breaks the r5 usage after a ldvary.
Fixes bad rendering when forcing 2 spills in glxgears.

Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-08-06 13:03:23 -07:00
Eric Anholt
f2c0d310d6 v3d: Make sure that QPU instruction-has-a-dest matches VIR.
Found when debugging register spilling -- we would try to spill the dest
of a STVPMV, inserting spill code after entering the last segment.  In
fact, we were likely to choose to do this, given that the STVPMV "dest"
temp was never read from, making it cheap to spill.

Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-08-06 13:03:23 -07:00
Eric Anholt
3f9cb2eb05 v3d: Wait for TMU writes to complete before continuing after a spill.
The simulator complained that we had write responses outstanding at shader
end.  It seems that a TMU read does not guarantee that previous TMU writes
by the thread have completed, which surprised me.

Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-08-06 13:03:23 -07:00
Eric Anholt
ccbe33af5b v3d: Make sure we don't emit a thrsw before the last one finished.
Found while forcing some spilling, which creates a lot of short
tmua->thrsw->ldtmu sequences.

Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-08-06 13:03:23 -07:00
Eric Anholt
f9d54dc3cf v3d: Add some debug code for forcing register spilling.
This is useful for periodically testing out register spilling to see how
it goes on simple shaders, rather than only failing on insanely
complicated ones.
2018-08-06 13:03:23 -07:00
Chad Versace
aaa41cd297 drisw: Fix build on Android Nougat, which lacks shm (v2)
In commit cf54bd5e8, dri_sw_winsys.c began using <sys/shm.h> to support
the new functions putImageShm, getImageShm in DRI_SWRastLoader. But
Android began supporting System V shared memory only in Oreo. Nougat has
no shm headers.

Fix the build by ifdef'ing out the shm code on Nougat.

Fixes: cf54bd5e8 "drisw: use shared memory when possible"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@gmail.com>
2018-08-06 11:09:38 -07:00
Ian Romanick
6229ee87c7 mesa: fix make check for AMD_framebuffer_multisample_advanced
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107483
Fixes: 3d6900d76e ("glapi: define AMD_framebuffer_multisample_advanced and add its functions")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: Vinson Lee <vlee@freedesktop.org>
2018-08-06 10:31:56 -07:00
Ian Romanick
b7946f6778 glapi: Fix GLES versioning for AMD_framebuffer_multisample_advanced functions
The GL_AMD_framebuffer_multisample_advanced spec says:

    OpenGL ES dependencies:

        Requires OpenGL ES 3.0.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107483
Fixes: 3d6900d76e ("glapi: define AMD_framebuffer_multisample_advanced and add its functions")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: Vinson Lee <vlee@freedesktop.org>
2018-08-06 10:30:06 -07:00
Gert Wollny
7a46b2d641 meson, install_megadrivers: Also remove stale symlinks
os.path.exists doesn't return True for stale symlinks, but they are in
the way later, when a link/file with the same name is to be created.
For instance it is conceivable that the pointed to file is replaced by
a file with a new name, and then the symlink is dead.

To handle this check specifically for all existing symlinks to be
removed. (This bugged me for some time with a link libXvMCr600.so
always being in the way of installing this file)
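
The difference between the two checks can be sketched as below, assuming a
POSIX system where symlinks can be created. The helper and file names are
hypothetical and are not taken from install_megadrivers.

```python
import os
import tempfile


def remove_if_present(path):
    """Remove `path` if a file or symlink (even a dangling one) is there."""
    # os.path.lexists() does not follow symlinks, so a stale link still counts.
    if os.path.lexists(path):
        os.unlink(path)
        return True
    return False


workdir = tempfile.mkdtemp()
link = os.path.join(workdir, "libexample.so")              # hypothetical name
os.symlink(os.path.join(workdir, "missing-target"), link)  # dangling symlink

follows_link = os.path.exists(link)   # follows the link; the target is missing
sees_link = os.path.lexists(link)     # inspects the link itself
removed = remove_if_present(link)
os.rmdir(workdir)
```

With os.path.exists() alone the stale link would be skipped, and creating
a new link with the same name would then fail.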

v2: use only os.path.lexists and replace all instances of os.path.exists (Dylan Baker)

v3: handle directory check correctly (Eric Engestrom)

Fixes: f7f1b30f81
       ("meson: extend install_megadrivers script to handle symmlinking")

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>(v2 minus dir check)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-08-06 18:42:01 +02:00
Tapani Pälli
5eb4b384d9 anv: add more swapchain formats
This change helps with some of the dEQP-VK.wsi.android.* tests that
try to create a swapchain using such formats.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-08-06 09:25:11 +03:00
Karol Herbst
c3325097be nvc0/ir: return 0 in imageLoad on incomplete textures
We already guarded all OP_SULDP against out of bound accesses, but we
ended up just reusing whatever value was stored in the dest registers.

Fixes CTS test shader_image_load_store.incomplete_textures

v2: fix for loads not ending up with predicates (bindless_texture)
v3: fix replacing the def

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-08-04 18:25:20 +02:00
Karol Herbst
0ca046d7e9 gm200/ir: optimize rcp(sqrt) to rsq
mitigates hurt shaders after adding sqrt:
total instructions in shared programs : 5456166 -> 5454825 (-0.02%)
total gprs used in shared programs    : 647522 -> 647551 (0.00%)
total shared used in shared programs  : 389120 -> 389120 (0.00%)
total local used in shared programs   : 21064 -> 21064 (0.00%)
total bytes used in shared programs   : 58288696 -> 58274448 (-0.02%)

                local     shared        gpr       inst      bytes
    helped           0           0           0         516         516
      hurt           0           0          27           2           2
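
For reference, the percentages in the totals above are relative changes
with respect to the "before" value. A small sketch of that arithmetic
follows. The helper name and the two-decimal formatting are assumptions,
not the reporting tool's code.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, as a two-decimal percentage."""
    return "{:.2f}%".format(100.0 * (after - before) / before)


instructions = pct_change(5456166, 5454825)   # total instructions
gprs = pct_change(647522, 647551)             # total gprs
code_bytes = pct_change(58288696, 58274448)   # total bytes
```

The gpr total grows by 29 registers, which rounds to 0.00% at this
precision even though 27 shaders are listed as hurt.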

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-08-04 15:24:08 +02:00
Karol Herbst
6f98a3065b gm200/ir: add native OP_SQRT support
./GpuTest /test=pixmark_piano 1024x640 30sec:
301 -> 327 points

shader-db:
total instructions in shared programs : 5472103 -> 5456166 (-0.29%)
total gprs used in shared programs    : 647530 -> 647522 (-0.00%)
total shared used in shared programs  : 389120 -> 389120 (0.00%)
total local used in shared programs   : 21064 -> 21064 (0.00%)
total bytes used in shared programs   : 58459304 -> 58288696 (-0.29%)

                local     shared        gpr       inst      bytes
    helped           0           0          27        8281        8281
      hurt           0           0          21         431         431

v2: use NVISA_GM200_CHIPSET

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-08-04 15:24:08 +02:00
Lionel Landwerlin
4334196ab3 intel: tools: simplify meson build
Remove the if tools condition and just put it through the install:
parameter.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-04 09:45:34 +01:00
Lionel Landwerlin
87a3c97781 intel: aubinator: simplify decoding
Since we don't support streaming an aub file, we can drop the decoding
status enum.

v2: include stdbool (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-04 09:40:14 +01:00
Lionel Landwerlin
02ebc064ea intel: common: add missing stdint include
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-04 09:39:01 +01:00
Lionel Landwerlin
db4770ee57 intel: decoder: remove unused variable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-04 09:38:58 +01:00
Lionel Landwerlin
7471286bb0 intel: tools: aubwrite: reuse canonical address helper
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-04 09:38:44 +01:00
Lionel Landwerlin
35955afa7a intel: aubinator: fix read the context/ring
Up to now we've been lucky that the buffer returned was always exactly
at the address we requested.

Fixes: 144b40db54 ("intel: aubinator: drop the 1Tb GTT mapping")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-04 09:38:34 +01:00
Ian Romanick
3b07d28f81 nir: Transform expressions of b2f(a) and b2f(b) to a == b
All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14276886 -> 14276838 (<.01%)
instructions in affected programs: 312 -> 264 (-15.38%)
helped: 2
HURT: 0

total cycles in shared programs: 532578395 -> 532570985 (<.01%)
cycles in affected programs: 682562 -> 675152 (-1.09%)
helped: 374
HURT: 4
helped stats (abs) min: 2 max: 200 x̄: 20.39 x̃: 18
helped stats (rel) min: 0.07% max: 11.64% x̄: 1.25% x̃: 1.28%
HURT stats (abs)   min: 2 max: 114 x̄: 53.50 x̃: 49
HURT stats (rel)   min: 0.06% max: 11.70% x̄: 5.02% x̃: 4.15%
95% mean confidence interval for cycles value: -21.30 -17.91
95% mean confidence interval for cycles %-change: -1.30% -1.06%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10488123 -> 10488075 (<.01%)
instructions in affected programs: 336 -> 288 (-14.29%)
helped: 2
HURT: 0

total cycles in shared programs: 150260379 -> 150260439 (<.01%)
cycles in affected programs: 4726 -> 4786 (1.27%)
helped: 0
HURT: 2

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Ian Romanick
c658b6c4c8 nir: Transform expressions of b2f(a) and b2f(b) to a ^^ b
All Gen platforms had pretty similar results. (Skylake shown)
total instructions in shared programs: 14276892 -> 14276886 (<.01%)
instructions in affected programs: 484 -> 478 (-1.24%)
helped: 2
HURT: 0

total cycles in shared programs: 532578397 -> 532578395 (<.01%)
cycles in affected programs: 3522 -> 3520 (-0.06%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Ian Romanick
3aca80aabc nir: Transform expressions of b2f(a) and b2f(b) to !(a && b)
All Gen platforms had pretty similar results. (Skylake shown)
total cycles in shared programs: 532578400 -> 532578397 (<.01%)
cycles in affected programs: 2784 -> 2781 (-0.11%)
helped: 1
HURT: 1
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.26% max: 0.26% x̄: 0.26% x̃: 0.26%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08%

v2: s/fmax/fmin/.  Noticed by Thomas Helland.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Ian Romanick
1713c97181 nir: Transform expressions of b2f(a) and b2f(b) to a && b
No changes on any Gen platform.

v2: s/fmax/fmin/.  Noticed by Thomas Helland.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Ian Romanick
4425f4786a nir: Transform expressions of b2f(a) and b2f(b) to !(a || b)
All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14276961 -> 14276892 (<.01%)
instructions in affected programs: 3215 -> 3146 (-2.15%)
helped: 28
HURT: 0
helped stats (abs) min: 1 max: 6 x̄: 2.46 x̃: 2
helped stats (rel) min: 0.47% max: 9.52% x̄: 4.34% x̃: 1.92%
95% mean confidence interval for instructions value: -2.87 -2.06
95% mean confidence interval for instructions %-change: -5.73% -2.95%
Instructions are helped.

total cycles in shared programs: 532577068 -> 532578400 (<.01%)
cycles in affected programs: 121864 -> 123196 (1.09%)
helped: 35
HURT: 30
helped stats (abs) min: 2 max: 268 x̄: 42.34 x̃: 22
helped stats (rel) min: 0.12% max: 12.14% x̄: 3.22% x̃: 1.86%
HURT stats (abs)   min: 2 max: 246 x̄: 93.80 x̃: 36
HURT stats (rel)   min: 0.09% max: 13.63% x̄: 4.47% x̃: 2.58%
95% mean confidence interval for cycles value: -5.02 46.01
95% mean confidence interval for cycles %-change: -0.99% 1.65%
Inconclusive result (value mean confidence interval includes 0).

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 7781299 -> 7781342 (<.01%)
instructions in affected programs: 22300 -> 22343 (0.19%)
helped: 13
HURT: 40
helped stats (abs) min: 2 max: 3 x̄: 2.85 x̃: 3
helped stats (rel) min: 1.15% max: 7.69% x̄: 3.72% x̃: 3.33%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.26% max: 1.30% x̄: 0.47% x̃: 0.43%
95% mean confidence interval for instructions value: 0.23 1.39
95% mean confidence interval for instructions %-change: -1.18% 0.07%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 177878928 -> 177879332 (<.01%)
cycles in affected programs: 383298 -> 383702 (0.11%)
helped: 7
HURT: 43
helped stats (abs) min: 2 max: 18 x̄: 10.00 x̃: 10
helped stats (rel) min: 0.17% max: 4.81% x̄: 2.62% x̃: 3.40%
HURT stats (abs)   min: 2 max: 38 x̄: 11.02 x̃: 12
HURT stats (rel)   min: 0.08% max: 1.54% x̄: 0.25% x̃: 0.09%
95% mean confidence interval for cycles value: 5.21 10.95
95% mean confidence interval for cycles %-change: -0.51% 0.21%
Inconclusive result (%-change mean confidence interval includes 0).

v2: s/fmin/fmax/.  Noticed by Thomas Helland.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Ian Romanick
6b3670ae80 nir: Transform -fabs(a) >= 0 to a == 0
All Gen platforms had pretty similar results. (Skylake shown)
total instructions in shared programs: 14276964 -> 14276961 (<.01%)
instructions in affected programs: 411 -> 408 (-0.73%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.47% max: 1.96% x̄: 1.04% x̃: 0.68%

total cycles in shared programs: 532577062 -> 532577068 (<.01%)
cycles in affected programs: 1093 -> 1099 (0.55%)
helped: 1
HURT: 1
helped stats (abs) min: 16 max: 16 x̄: 16.00 x̃: 16
helped stats (rel) min: 7.77% max: 7.77% x̄: 7.77% x̃: 7.77%
HURT stats (abs)   min: 22 max: 22 x̄: 22.00 x̃: 22
HURT stats (rel)   min: 2.48% max: 2.48% x̄: 2.48% x̃: 2.48%
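
A quick sanity check of the identity in Python, using ordinary IEEE
doubles as a stand-in for shader floats (an assumption of this sketch).
The sample values include signed zeros, infinities and NaN.

```python
import math

samples = [0.0, -0.0, 1.0, -1.0, 1e-300,
           float("inf"), float("-inf"), float("nan")]


def original(a):
    # -fabs(a) is never positive, so ">= 0" can only hold when it is zero.
    return -math.fabs(a) >= 0.0


def simplified(a):
    return a == 0.0


# Collect any sample where the two forms disagree.
mismatches = [a for a in samples if original(a) != simplified(a)]
```

NaN compares false on both sides, so the two forms agree for unordered
inputs as well.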

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Ian Romanick
46e7c340d4 nir: Transform expressions of b2f(a) and b2f(b) to a || b
All Gen6+ platforms had pretty similar results. (Skylake shown)
total instructions in shared programs: 14277184 -> 14276964 (<.01%)
instructions in affected programs: 10082 -> 9862 (-2.18%)
helped: 37
HURT: 1
helped stats (abs) min: 1 max: 30 x̄: 5.97 x̃: 4
helped stats (rel) min: 0.14% max: 16.00% x̄: 5.23% x̃: 2.04%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.70% max: 0.70% x̄: 0.70% x̃: 0.70%
95% mean confidence interval for instructions value: -7.87 -3.71
95% mean confidence interval for instructions %-change: -6.98% -3.16%
Instructions are helped.

total cycles in shared programs: 532577990 -> 532577062 (<.01%)
cycles in affected programs: 170959 -> 170031 (-0.54%)
helped: 33
HURT: 9
helped stats (abs) min: 2 max: 120 x̄: 30.91 x̃: 30
helped stats (rel) min: 0.02% max: 7.65% x̄: 2.66% x̃: 1.13%
HURT stats (abs)   min: 2 max: 24 x̄: 10.22 x̃: 8
HURT stats (rel)   min: 0.09% max: 1.79% x̄: 0.61% x̃: 0.22%
95% mean confidence interval for cycles value: -31.23 -12.96
95% mean confidence interval for cycles %-change: -2.90% -1.02%
Cycles are helped.

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 7781539 -> 7781301 (<.01%)
instructions in affected programs: 10169 -> 9931 (-2.34%)
helped: 32
HURT: 0
helped stats (abs) min: 2 max: 20 x̄: 7.44 x̃: 6
helped stats (rel) min: 0.47% max: 17.02% x̄: 4.03% x̃: 1.88%
95% mean confidence interval for instructions value: -9.53 -5.34
95% mean confidence interval for instructions %-change: -5.94% -2.12%
Instructions are helped.

total cycles in shared programs: 177878590 -> 177878932 (<.01%)
cycles in affected programs: 78706 -> 79048 (0.43%)
helped: 7
HURT: 21
helped stats (abs) min: 6 max: 34 x̄: 24.57 x̃: 28
helped stats (rel) min: 0.15% max: 8.33% x̄: 4.66% x̃: 6.37%
HURT stats (abs)   min: 2 max: 86 x̄: 24.48 x̃: 22
HURT stats (rel)   min: 0.01% max: 4.28% x̄: 1.21% x̃: 0.70%
95% mean confidence interval for cycles value: 0.30 24.13
95% mean confidence interval for cycles %-change: -1.52% 1.01%
Inconclusive result (%-change mean confidence interval includes 0).

v2: s/fmin/fmax/.  Noticed by Thomas Helland.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
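The idea behind 46e7c340d4 can be sketched with a truth table. The message does not list the exact patterns that are rewritten, so the fmax and fadd forms below are assumptions chosen for illustration (the "s/fmin/fmax/" note suggests fmax is involved). b2f is modeled as mapping true to 1.0 and false to 0.0.

```python
from itertools import product

def b2f(x: bool) -> float:
    # Boolean-to-float conversion: true -> 1.0, false -> 0.0
    return 1.0 if x else 0.0

def fmax_form(a: bool, b: bool) -> bool:
    # Assumed pattern: fmax(b2f(a), b2f(b)) != 0
    return max(b2f(a), b2f(b)) != 0.0

def fadd_form(a: bool, b: bool) -> bool:
    # Assumed pattern: b2f(a) + b2f(b) != 0
    return b2f(a) + b2f(b) != 0.0

# Both float expressions reduce to a plain logical OR for every input pair.
all_match = all(
    fmax_form(a, b) == (a or b) and fadd_form(a, b) == (a or b)
    for a, b in product([False, True], repeat=2)
)
```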
Ian Romanick
be7d3ba34a nir: Transform -fabs(a) < 0 to a != 0
Unlike the much older -abs(a) >= 0.0 transformation, this is not
precise.  The behavior changes if a is NaN.

All Gen platforms had pretty similar results. (Skylake shown)
total instructions in shared programs: 14277216 -> 14277184 (<.01%)
instructions in affected programs: 2300 -> 2268 (-1.39%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.00 x̃: 3
helped stats (rel) min: 0.48% max: 15.15% x̄: 4.41% x̃: 1.01%
95% mean confidence interval for instructions value: -6.45 -1.55
95% mean confidence interval for instructions %-change: -9.96% 1.13%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 532577848 -> 532577990 (<.01%)
cycles in affected programs: 17486 -> 17628 (0.81%)
helped: 2
HURT: 5
helped stats (abs) min: 2 max: 6 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.06% max: 1.81% x̄: 0.93% x̃: 0.93%
HURT stats (abs)   min: 6 max: 50 x̄: 30.00 x̃: 26
HURT stats (rel)   min: 0.55% max: 2.17% x̄: 1.19% x̃: 1.02%
95% mean confidence interval for cycles value: -1.06 41.63
95% mean confidence interval for cycles %-change: -0.58% 1.74%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
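The NaN caveat in be7d3ba34a is easy to reproduce. This is a minimal sketch, not Mesa code, with fabs modeled by Python's abs on doubles and with made-up helper names. Every ordered comparison against NaN is false, but != is true, so the two forms differ exactly when a is NaN.

```python
import math

def lt_before(a: float) -> bool:
    # Original expression: -fabs(a) < 0
    return -abs(a) < 0.0

def ne_after(a: float) -> bool:
    # Replacement expression: a != 0
    return a != 0.0

samples = [0.0, -0.0, 2.0, -2.0, math.inf, -math.inf, math.nan]

# Inputs where the rewrite changes the result.
changed = [a for a in samples if lt_before(a) != ne_after(a)]
```

Here changed contains a single element, the NaN sample: the original form yields false and the rewritten form yields true.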
Ian Romanick
d49eab2757 nir: Rearrange bcsel with two bcsel sources
All Gen platforms had pretty similar results. (Skylake shown)
total instructions in shared programs: 14277220 -> 14277216 (<.01%)
instructions in affected programs: 422 -> 418 (-0.95%)
helped: 2
HURT: 0

total cycles in shared programs: 532577908 -> 532577848 (<.01%)
cycles in affected programs: 2800 -> 2740 (-2.14%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Ian Romanick
b92fded6eb nir: Collapse more repeated bcsels on the same argument
All Gen platforms had pretty similar results. (Skylake shown)
total instructions in shared programs: 14277230 -> 14277220 (<.01%)
instructions in affected programs: 751 -> 741 (-1.33%)
helped: 4
HURT: 0
helped stats (abs) min: 2 max: 3 x̄: 2.50 x̃: 2
helped stats (rel) min: 1.23% max: 1.40% x̄: 1.32% x̃: 1.32%
95% mean confidence interval for instructions value: -3.42 -1.58
95% mean confidence interval for instructions %-change: -1.47% -1.17%
Instructions are helped.

total cycles in shared programs: 532577947 -> 532577908 (<.01%)
cycles in affected programs: 10641 -> 10602 (-0.37%)
helped: 4
HURT: 3
helped stats (abs) min: 1 max: 40 x̄: 13.75 x̃: 7
helped stats (rel) min: 0.11% max: 3.08% x̄: 1.10% x̃: 0.60%
HURT stats (abs)   min: 2 max: 8 x̄: 5.33 x̃: 6
HURT stats (rel)   min: 0.13% max: 0.55% x̄: 0.30% x̃: 0.23%
95% mean confidence interval for cycles value: -20.69 9.55
95% mean confidence interval for cycles %-change: -1.63% 0.63%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-04 01:12:03 -07:00
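A small model of what collapsing repeated bcsels on the same argument means in b92fded6eb. The two nested shapes below are assumptions for illustration, since the message does not list the patterns; bcsel(cond, x, y) is modeled as "x if cond else y".

```python
from itertools import product

def bcsel(cond: bool, x, y):
    # Select x when cond is true, otherwise y.
    return x if cond else y

def nested_in_else(a, b, c, d):
    # bcsel(a, b, bcsel(a, c, d)): the inner select only runs when a is false,
    # so it always yields d.
    return bcsel(a, b, bcsel(a, c, d))

def nested_in_then(a, b, c, d):
    # bcsel(a, bcsel(a, b, c), d): the inner select only runs when a is true,
    # so it always yields b.
    return bcsel(a, bcsel(a, b, c), d)

# Both nested shapes collapse to a single bcsel(a, b, d).
collapses = all(
    nested_in_else(a, 10, 20, 30) == bcsel(a, 10, 30)
    and nested_in_then(a, 10, 20, 30) == bcsel(a, 10, 30)
    for (a,) in product([False, True])
)
```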
Ian Romanick
408330ed48 nir: Don't compare i2f or u2i with zero
Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14277620 -> 14277230 (<.01%)
instructions in affected programs: 36905 -> 36515 (-1.06%)
helped: 101
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 4.46 x̃: 6
helped stats (rel) min: 0.32% max: 7.69% x̄: 1.80% x̃: 1.51%
HURT stats (abs)   min: 1 max: 28 x̄: 10.00 x̃: 1
HURT stats (rel)   min: 0.33% max: 1.74% x̄: 0.68% x̃: 0.47%
95% mean confidence interval for instructions value: -4.59 -2.70
95% mean confidence interval for instructions %-change: -1.90% -1.41%
Instructions are helped.

total cycles in shared programs: 532580716 -> 532577947 (<.01%)
cycles in affected programs: 940575 -> 937806 (-0.29%)
helped: 92
HURT: 12
helped stats (abs) min: 2 max: 158 x̄: 51.04 x̃: 62
helped stats (rel) min: 0.24% max: 3.99% x̄: 2.14% x̃: 2.41%
HURT stats (abs)   min: 10 max: 1112 x̄: 160.58 x̃: 63
HURT stats (rel)   min: 0.06% max: 21.90% x̄: 4.22% x̃: 0.20%
95% mean confidence interval for cycles value: -50.66 -2.59
95% mean confidence interval for cycles %-change: -2.09% -0.73%
Cycles are helped.

total spills in shared programs: 8116 -> 8124 (0.10%)
spills in affected programs: 200 -> 208 (4.00%)
helped: 0
HURT: 2

total fills in shared programs: 11086 -> 11094 (0.07%)
fills in affected programs: 436 -> 444 (1.83%)
helped: 0
HURT: 2

Ivy Bridge and Haswell had similar results. (Haswell shown)
total instructions in shared programs: 12979054 -> 12978067 (<.01%)
instructions in affected programs: 33633 -> 32646 (-2.93%)
helped: 120
HURT: 2
helped stats (abs) min: 1 max: 13 x̄: 8.53 x̃: 13
helped stats (rel) min: 0.30% max: 16.67% x̄: 4.55% x̃: 3.17%
HURT stats (abs)   min: 18 max: 18 x̄: 18.00 x̃: 18
HURT stats (rel)   min: 1.15% max: 2.84% x̄: 2.00% x̃: 2.00%
95% mean confidence interval for instructions value: -9.19 -6.99
95% mean confidence interval for instructions %-change: -5.27% -3.62%
Instructions are helped.

total cycles in shared programs: 411212880 -> 411199636 (<.01%)
cycles in affected programs: 696441 -> 683197 (-1.90%)
helped: 107
HURT: 5
helped stats (abs) min: 2 max: 864 x̄: 124.90 x̃: 146
helped stats (rel) min: 0.03% max: 29.20% x̄: 8.58% x̃: 5.88%
HURT stats (abs)   min: 2 max: 50 x̄: 24.00 x̃: 22
HURT stats (rel)   min: 0.01% max: 5.35% x̄: 1.29% x̃: 0.25%
95% mean confidence interval for cycles value: -136.96 -99.54
95% mean confidence interval for cycles %-change: -9.75% -6.53%
Cycles are helped.

total spills in shared programs: 78623 -> 78631 (0.01%)
spills in affected programs: 66 -> 74 (12.12%)
helped: 0
HURT: 2

total fills in shared programs: 80104 -> 80108 (<.01%)
fills in affected programs: 133 -> 137 (3.01%)
helped: 0
HURT: 2

No changes on Sandy Bridge, Iron Lake, or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
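Why the conversion can be skipped in 408330ed48, as a rough sketch. Only the i2f case is modeled here, and the helper i2f32 is a stand-in written for this example, not a Mesa function. A nonzero 32-bit integer never rounds to 0.0 in float32, so comparing the converted value with zero gives the same answer as comparing the integer itself.

```python
import struct

def i2f32(x: int) -> float:
    # Round an integer to the nearest float32 by packing it as a 4-byte float.
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

# Include values that are not exactly representable in float32.
samples = [0, 1, -1, 2**24 + 1, 2**31 - 1, -2**31]

# The float comparison and the integer comparison agree on every sample.
agree = all((i2f32(x) != 0.0) == (x != 0) for x in samples)
```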
Ian Romanick
a3845616a2 nir: Remove f2i(i2f(x)) conversions
Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14277978 -> 14277620 (<.01%)
instructions in affected programs: 36957 -> 36599 (-0.97%)
helped: 76
HURT: 1
helped stats (abs) min: 2 max: 90 x̄: 4.89 x̃: 4
helped stats (rel) min: 0.44% max: 5.88% x̄: 1.04% x̃: 0.87%
HURT stats (abs)   min: 14 max: 14 x̄: 14.00 x̃: 14
HURT stats (rel)   min: 0.36% max: 0.36% x̄: 0.36% x̃: 0.36%
95% mean confidence interval for instructions value: -7.06 -2.24
95% mean confidence interval for instructions %-change: -1.28% -0.77%
Instructions are helped.

total cycles in shared programs: 532584581 -> 532580716 (<.01%)
cycles in affected programs: 973591 -> 969726 (-0.40%)
helped: 76
HURT: 1
helped stats (abs) min: 2 max: 9940 x̄: 159.80 x̃: 32
helped stats (rel) min: <.01% max: 8.70% x̄: 1.15% x̃: 1.19%
HURT stats (abs)   min: 8280 max: 8280 x̄: 8280.00 x̃: 8280
HURT stats (rel)   min: 2.10% max: 2.10% x̄: 2.10% x̃: 2.10%
95% mean confidence interval for cycles value: -386.98 286.59
95% mean confidence interval for cycles %-change: -1.41% -0.81%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8127 -> 8116 (-0.14%)
spills in affected programs: 108 -> 97 (-10.19%)
helped: 1
HURT: 0

total fills in shared programs: 11090 -> 11086 (-0.04%)
fills in affected programs: 440 -> 436 (-0.91%)
helped: 1
HURT: 1

Haswell
total instructions in shared programs: 12979174 -> 12979054 (<.01%)
instructions in affected programs: 9040 -> 8920 (-1.33%)
helped: 14
HURT: 1
helped stats (abs) min: 2 max: 34 x̄: 8.79 x̃: 6
helped stats (rel) min: 0.41% max: 7.04% x̄: 2.66% x̃: 1.14%
HURT stats (abs)   min: 3 max: 3 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 0.19% max: 0.19% x̄: 0.19% x̃: 0.19%
95% mean confidence interval for instructions value: -13.58 -2.42
95% mean confidence interval for instructions %-change: -3.94% -1.01%
Instructions are helped.

total cycles in shared programs: 411227148 -> 411212880 (<.01%)
cycles in affected programs: 630506 -> 616238 (-2.26%)
helped: 15
HURT: 0
helped stats (abs) min: 2 max: 11192 x̄: 951.20 x̃: 38
helped stats (rel) min: <.01% max: 16.01% x̄: 3.92% x̃: 0.17%
95% mean confidence interval for cycles value: -2544.28 641.88
95% mean confidence interval for cycles %-change: -6.89% -0.94%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 78626 -> 78623 (<.01%)
spills in affected programs: 42 -> 39 (-7.14%)
helped: 1
HURT: 0

total fills in shared programs: 80111 -> 80104 (<.01%)
fills in affected programs: 140 -> 133 (-5.00%)
helped: 1
HURT: 1

Ivy Bridge
total instructions in shared programs: 11684101 -> 11684030 (<.01%)
instructions in affected programs: 3080 -> 3009 (-2.31%)
helped: 4
HURT: 1
helped stats (abs) min: 5 max: 59 x̄: 18.50 x̃: 5
helped stats (rel) min: 6.47% max: 7.04% x̄: 6.87% x̃: 6.99%
HURT stats (abs)   min: 3 max: 3 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 0.15% max: 0.15% x̄: 0.15% x̃: 0.15%
95% mean confidence interval for instructions value: -45.59 17.19
95% mean confidence interval for instructions %-change: -9.38% -1.56%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 258407697 -> 258389653 (<.01%)
cycles in affected programs: 328323 -> 310279 (-5.50%)
helped: 5
HURT: 0
helped stats (abs) min: 32 max: 14908 x̄: 3608.80 x̃: 32
helped stats (rel) min: 1.26% max: 17.22% x̄: 9.30% x̃: 10.60%
95% mean confidence interval for cycles value: -11616.71 4399.11
95% mean confidence interval for cycles %-change: -16.56% -2.03%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 4537 -> 4528 (-0.20%)
spills in affected programs: 64 -> 55 (-14.06%)
helped: 1
HURT: 0

total fills in shared programs: 4823 -> 4815 (-0.17%)
fills in affected programs: 189 -> 181 (-4.23%)
helped: 1
HURT: 1

Sandy Bridge
total instructions in shared programs: 10488464 -> 10488449 (<.01%)
instructions in affected programs: 272 -> 257 (-5.51%)
helped: 3
HURT: 0
helped stats (abs) min: 5 max: 5 x̄: 5.00 x̃: 5
helped stats (rel) min: 5.49% max: 5.56% x̄: 5.51% x̃: 5.49%

total cycles in shared programs: 150263359 -> 150263263 (<.01%)
cycles in affected programs: 7978 -> 7882 (-1.20%)
helped: 3
HURT: 0
helped stats (abs) min: 32 max: 32 x̄: 32.00 x̃: 32
helped stats (rel) min: 1.15% max: 1.23% x̄: 1.20% x̃: 1.23%

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Ian Romanick
ea6c276436 nir: Mark the 0.0 < abs(a) transformation as imprecise
Unlike the much older -abs(a) >= 0.0 transformation, this is not
precise.  The behavior changes if the source is NaN.

No shader-db changes on any platform.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-08-04 01:12:03 -07:00
Marek Olšák
4bad50ded9 radeonsi: cosmetic changes
2018-08-04 03:10:30 -04:00
Marek Olšák
6508b93d78 st/mesa: expose & set limits for AMD_framebuffer_multisample_advanced
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-04 02:47:58 -04:00
Marek Olšák
7f587b57f7 st/mesa: add renderbuffer support for AMD_framebuffer_multisample_advanced
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-04 02:46:55 -04:00
Marek Olšák
8e3d0019e1 st/mesa: pass storage_sample_count parameter into st_choose_format
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-04 02:46:55 -04:00
Marek Olšák
459f05c7ec mesa: add functional FBO changes for AMD_framebuffer_multisample_advanced
- relax FBO completeness rules
- validate sample counts

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-04 02:46:55 -04:00
Marek Olšák
328c1c8d99 mesa: add gl_renderbuffer::NumStorageSamples
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-04 02:46:55 -04:00
Marek Olšák
a96e946d25 mesa: implement glGet for AMD_framebuffer_multisample_advanced
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-04 02:46:55 -04:00
Marek Olšák
3d6900d76e glapi: define AMD_framebuffer_multisample_advanced and add its functions
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-04 02:46:55 -04:00
Marek Olšák
2d115056d3 mesa: add storageSamples parameter to renderbuffer functions
It's just passed to other functions but otherwise unused.
It will be used in following commits.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-04 02:46:55 -04:00
Marek Olšák
f7d42ee7d3 include: update GL & GLES headers (v2)
v2: use correct files

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-04 02:43:05 -04:00
Marek Olšák
fd1121e839 amd: remove support for LLVM 5.0
Users are encouraged to switch to LLVM 6.0 released in March 2018.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-03 18:36:11 -04:00
Marek Olšák
461a864316 winsys/amdgpu: pass the BO list via the CS ioctl on DRM >= 3.27.0
2018-08-03 18:35:19 -04:00
Marek Olšák
0f79b2015b gallium/u_vbuf: handle indirect multidraws correctly and efficiently (v3)
v2: need to do MAX{start+count} instead of MAX{count}
    added piglit tests
v3: use malloc

Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-08-03 18:30:46 -04:00
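The v2 note in 0f79b2015b (MAX{start+count} rather than MAX{count}) can be illustrated as follows. This is a sketch under assumptions: each indirect draw is reduced to a hypothetical (start, count) pair, and vertices_needed is a made-up helper, not the u_vbuf code.

```python
def vertices_needed(draws):
    """Number of vertices that must be available for all draws.

    draws: iterable of (start, count) pairs; empty draws are ignored.
    """
    return max((start + count for start, count in draws if count > 0), default=0)

draws = [(0, 3), (100, 6), (10, 50)]

# Taking only the largest count misses draws that start at a high offset.
largest_count = max(count for _, count in draws)
needed = vertices_needed(draws)
```

With these draws the largest count is 50, but the second draw reads vertices up to index 105, so 106 vertices are needed.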
Mauro Rossi
1c7a2433b2 android: radv: build vulkan.radv conditionally to radeonsi
A build failure was reported for arm and arm64 targets, caused by a
missing libLLVM shared library dependency in AOSP. To avoid it,
vulkan.radv is built only when radeonsi is in BOARD_GPU_DRIVERS.

Fixes: 0ca153f869 ("android: radv: enable build of vulkan.radv HAL module")

Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-08-03 20:09:16 +02:00
Roland Scheidegger
c72f91deba util: return 0 for NaNs in float_to_ubyte
d3d10 requires NaNs to be converted to 0 for float->unorm conversions
(and float->int etc.). The GL spec probably doesn't care in general, but
it would make sense to have reasonable behavior in any case imho: the
old code was converting negative NaNs to 0, and positive NaNs to 255.
(Note that using a float comparison isn't actually much more effort.
With sse2 it's just a float comparison (ucomiss) instead of an int one.
I converted the second comparison to float too, simply because it saves
the probably somewhat expensive transfer of the float from the simd to
the int domain (with sse2, via the stack), so the generated code
actually has 2 fewer instructions, although float comparisons are more
expensive than int ones.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-03 17:07:38 +02:00
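The comparison trick described in c72f91deba, sketched in Python. This is not the Mesa implementation: the rounding in the last branch is a simple assumption for illustration. The point is the first test, which is written as "not (f > 0)" so that NaN, which fails every ordered comparison, takes the same path as zero and negative values.

```python
import math

def float_to_ubyte(f: float) -> int:
    # NaN compares false against everything, so it lands here with f <= 0.
    if not (f > 0.0):
        return 0
    if f >= 1.0:
        return 255
    # Assumed rounding for the sketch: scale to [0, 255] and round half up.
    return int(f * 255.0 + 0.5)

positive_nan = math.nan
negative_nan = math.copysign(math.nan, -1.0)
results = [float_to_ubyte(v) for v in (positive_nan, negative_nan, -1.0, 0.5, 2.0)]
```

Both NaN signs now map to 0, where the old code described above returned 255 for a positive NaN.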
Jason Ekstrand
1d900e55fd anv/pipeline: Disable FS dispatch for pointless fragment shaders
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-03 05:52:23 -07:00
Timothy Arceri
d5175d21c7 nir: add fall through comment to nir_gather_info
This stops Coverity reporting a defect and helps make the code less
error-prone.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-03 09:30:57 +10:00
Dan Willemsen
12e3334f1e CleanSpec.mk: Remove HOST_OUT_release
This is a forward port of a patch from the AOSP/master tree:
bd633f11de%5E%21/

Which replaces HOST_OUT_release with HOST_OUT

As per Dan's explanation, the current code was wrong to use
$(HOST_OUT_release): $(HOST_OUT) is already set correctly whether
the build being cleaned during incrementals is a host debug or a
host release build.

Additionally Dan noted it was incredibly uncommon to use a debug
host build, as there was never a shortcut and one had to set an
environment variable manually. Thus it was rarely if ever tested.

Change-Id: I7972c0a50fa3520dcfa962d6dd7e602bfe22368d
Cc: Rob Herring <rob.herring@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Marissa Wall <marissaw@google.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-08-02 15:42:40 -06:00
Sumit Semwal
d0b63b6583 Android.common.mk: define HAVE_TIMESPEC_GET
This is a forward port of a patch from the AOSP/master tree:
bd30b663f5%5E%21/

Since https://android-review.googlesource.com/c/718518 added
timespec_get() to bionic, mesa3d doesn't build due to redefinition
of timespec_get().

Avoid redefinition by defining HAVE_TIMESPEC_GET flag.

Test: build and boot tested db820c to UI.

Change-Id: I3dcc8034b48785e45cd3fa50e4d9cf2c684694a0
Cc: Rob Herring <rob.herring@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Marissa Wall <marissaw@google.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-08-02 15:42:27 -06:00
Dan Willemsen
dc030d1ec9 util: Android.mk: Convert implicit rules to static pattern rules
This is a partial cherry-pick from AOSP's mesa3d tree:
 a88dcf769e%5E%21/

"We're deprecating make implicit rules, preferring static pattern
rules, or just regular rules."

Without this patch, the freedesktop/master branch won't build in
the AOSP environment, and this patch corrects that, as tested
on the Dragonboard 820c.

The i965 portion of the patch this is based on collided badly,
and I'm not sure how to best forward port it. However, so far
we don't see build issues without that portion.

Comments or feedback would be appreciated!

Change-Id: Id6dfd0d018cbd665fa19d80c14abd5f75fa10b8a
Cc: Rob Herring <rob.herring@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Marissa Wall <marissaw@google.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-08-02 15:42:23 -06:00
Darren Powell
726a48c94f radeonsi: add new R600_DEBUG test "testclearbufperf"
Signed-off-by: Darren Powell <darren.powell@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-08-02 16:09:22 -04:00
Brian Paul
977638006b mesa: add switch case for GL 2.0 in _mesa_compute_version()
Previously, I added a switch case for GL 2.1 (ed7a0770b881791dd697f3).
I don't know of any driver which only supports GL 2.0, but adding
this switch case avoids a failure if the app queries
GL_SHADING_LANGUAGE_VERSION.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-02 13:20:00 -06:00
Andres Gomez
2d4d139877 intel/tools: add error2aub creation into autotools
Tarball distribution is done through "make distcheck". We include the
meson targets also into autotools so they won't fail when building
from the tarball.

Fixes: 6a60beba40 ("intel/tools: Add an error state to aub translator")
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dylan Baker <dylan.c.baker@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-02 21:15:57 +03:00
Jason Ekstrand
7ef6cd0ee8 anv/pipeline: Do cross-stage linking optimizations
This appears to help the Aztec Ruins benchmark by about 2% on my Kaby
Lake gt2 laptop.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
a5bffa061d anv/pipeline: Pull most of the anv_pipeline_compile_* into common code
This leaves us with a series of little anv_pipeline_compile_* functions
which each take a compiler object, a mem_ctx, the stage to compile, and
the previous stage for VUE linking purposes.  Some of them do
interesting things but most are little more than wrappers around
brw_compile_*.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
5351339554 anv/pipeline: Add a separate "link" stage
This breaks compilation up a bit into "link" and "compile".  In the
"link" stage, new anv_pipeline_link_* helpers are called which are
responsible for setting up the binding table and doing anything needed
to properly link with the next stage in the pipeline if one exists.
They are called in reverse order starting with the fragment shader so
you can assume linking in later stages is already done.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
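The reverse-order walk described in 5351339554 can be sketched like this. The stage names, STAGE_ORDER, and link_pipeline are hypothetical stand-ins for this example, not the anv_pipeline_link_* helpers themselves. Walking from the fragment shader backwards means that when a stage is linked, the stage after it has already been handled.

```python
# Hypothetical pipeline order, first stage to last.
STAGE_ORDER = ["vertex", "tess_ctrl", "tess_eval", "geometry", "fragment"]

def link_pipeline(active):
    """Return (stage, next_stage) pairs in the order they would be linked."""
    linked = []
    next_stage = None  # The last active stage has nothing after it.
    for stage in reversed(STAGE_ORDER):
        if stage not in active:
            continue
        # next_stage was linked on an earlier iteration, so it is final here.
        linked.append((stage, next_stage))
        next_stage = stage
    return linked

order = link_pipeline({"vertex", "geometry", "fragment"})
```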
Jason Ekstrand
5b196f39bd anv/pipeline: Compile to NIR in compile_graphics
This pulls the SPIR-V to NIR step out into common code.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
946fcd02a9 anv/pipeline: Recompile all shaders if any are missing from the cache
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
f76d6d8a63 anv/pipeline: Drop anv_pipeline_add_compiled_stage
We can set active_stages much more directly and then it's just candy
around setting pipeline->stages[stage].

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
703a24932a anv/pipeline: Pull shader compilation out into a helper.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
f3c59ca947 anv/pipeline: Call anv_pipeline_compile_* in a loop
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
bdc3565c8c anv/pipeline: Hash the entire pipeline in one go
Instead of hashing each stage separately (and TES and TCS together), we
hash the entire pipeline.  This means we'll get fewer cache hits if
they, for instance, re-use the same VS over and over again but it also
means we can now safely do cross-stage optimizations.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
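The trade-off described in bdc3565c8c, as a sketch. It uses hashlib.sha1 over made-up byte strings standing in for shader stages; the real key contents and hash are not shown in the message, so everything here is an assumption for illustration.

```python
import hashlib

def per_stage_keys(stages):
    # One cache key per stage: an unchanged stage keeps its key.
    return {name: hashlib.sha1(src).hexdigest() for name, src in stages.items()}

def pipeline_key(stages):
    # One cache key for the whole pipeline: any stage change alters it.
    h = hashlib.sha1()
    for name in sorted(stages):
        h.update(name.encode())
        h.update(stages[name])
    return h.hexdigest()

pipeline_a = {"vs": b"shared vertex shader", "fs": b"fragment shader A"}
pipeline_b = {"vs": b"shared vertex shader", "fs": b"fragment shader B"}

same_vs_key = per_stage_keys(pipeline_a)["vs"] == per_stage_keys(pipeline_b)["vs"]
same_pipeline_key = pipeline_key(pipeline_a) == pipeline_key(pipeline_b)
```

With per-stage keys the shared VS would hit the cache in both pipelines. With a whole-pipeline key it does not, but a compiled VS may then safely depend on the FS it was linked against.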
Jason Ekstrand
4a8236ae17 anv/pipeline: Populate keys up-front
Instead of having each anv_pipeline_compile_* function populate the
shader key, make it part of the anv_pipeline_stage struct and fill it
out up-front.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
76503b319a anv/pipeline: Add a helper struct for per-stage info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jon Turney
a48c0659e1 meson: use correct keyword to fix a meson warning
With a sufficiently recent meson, the following warning is produced:

WARNING: Passed invalid keyword argument "extra_args".
WARNING: This will become a hard error in the future.

It seems that compiler.links(args:) is meant here.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-02 18:12:49 +01:00
Andres Gomez
3013e22717 docs: add 18.3.0-devel release notes template
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-08-02 18:15:33 +03:00
Andres Gomez
873767cf42 mesa: bump version to 18.3.0-devel
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-08-02 18:00:15 +03:00
Eric Engestrom
44265cc65e egl/main: fix indentation
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
2018-08-02 12:54:05 +01:00
Eric Engestrom
dd007d1c2a loader: fix indentation
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
2018-08-02 12:53:58 +01:00
Vlad Golovkin
9d3a2394e4 swr: Remove unnecessary memset call
Zeroing memory after calloc is not necessary. Removing the call also
avoids a possible crash when allocation fails, because memset was called
before checking screen for NULL.

Fixes: a29d63ecf7 "swr: refactor swr_create_screen to allow
                              for proper cleanup on error"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-02 11:13:40 +01:00
Andres Gomez
8d3ccdbb9b mesa: replace binary constants with hexadecimal constants
The binary constant notation "0b" is a GCC extension. Instead, we use
hexadecimal notation to fix the MSVC 2013 build:

Compiling src\mesa\main\texcompress_astc.cpp ...
texcompress_astc.cpp
src\mesa\main\texcompress_astc.cpp(111) : error C2059: syntax error : 'bad suffix on number'

...

src\mesa\main\texcompress_astc.cpp(1007) : fatal error C1003: error count exceeds 100; stopping compilation
scons: *** [build\windows-x86-debug\mesa\main\texcompress_astc.obj] Error 2
scons: building terminated because of errors.

v2: Fix wrong conversion (Ilia).

Fixes: 38ab39f650 ("mesa: add ASTC 2D LDR decoder")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Mike Lothian <mike@fireburn.co.uk>
Cc: Gert Wollny <gert.wollny@collabora.com>
Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-02 10:06:44 +03:00
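The "v2: Fix wrong conversion" note in 8d3ccdbb9b suggests that converting by hand is error-prone. The conversion can be done mechanically; the helper below is a hypothetical sketch for this purpose and not the tool used for the commit. It rewrites plain "0b" literals (no type suffix) into hexadecimal.

```python
import re

# Matches a binary literal such as 0b1101 that stands alone as a token.
_BIN_LITERAL = re.compile(r"\b0[bB]([01]+)\b")

def binary_to_hex_literals(source: str) -> str:
    # Parse the digits as base 2 and print the same value in base 16.
    return _BIN_LITERAL.sub(lambda m: hex(int(m.group(1), 2)), source)

converted = binary_to_hex_literals("bits = v & 0b1101; mask = 0b11110000;")
```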
Andres Gomez
1090e97e77 ddebug: use util_snprintf() in dd_get_debug_filename_and_mkdir
Instead of plain snprintf(). To fix the MSVC 2013 build:

  Compiling src\gallium\auxiliary\driver_ddebug\dd_draw.c ...
dd_draw.c
c:\projects\mesa\src\gallium\auxiliary\driver_ddebug\dd_util.h(60) : warning C4013: 'snprintf' undefined; assuming extern returning int

...

gallium.lib(dd_draw.obj) : error LNK2001: unresolved external symbol _snprintf
build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll : fatal error LNK1120: 1 unresolved externals
scons: *** [build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll] Error 1120
scons: building terminated because of errors.

Fixes: 6ff0c6f4eb ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-08-02 10:06:44 +03:00
Andres Gomez
d7694136d3 util/queue: use util_snprintf() in util_queue_init
Instead of plain snprintf(). To fix the MSVC 2013 build:

  Compiling src\util\u_queue.c ...
u_queue.c
src\util\u_queue.c(325) : warning C4013: 'snprintf' undefined; assuming extern returning int

...

mesautil.lib(u_queue.obj) : error LNK2001: unresolved external symbol _snprintf
scons: building terminated because of errors.

Fixes: b238e33bc9 ("util/queue: add a process name into a thread name")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-02 10:06:44 +03:00
Andres Gomez
18d9dc179f gallium/aux/util: use util_snprintf() in test_texture_barrier
Instead of plain snprintf(). To fix the MSVC 2013 build:

  Compiling src\gallium\auxiliary\util\u_tests.c ...
u_tests.c
src\gallium\auxiliary\util\u_tests.c(624) : warning C4013: 'snprintf' undefined; assuming extern returning int

...

gallium.lib(u_tests.obj) : error LNK2019: unresolved external symbol _snprintf referenced in function _test_texture_barrier
build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll : fatal error LNK1120: 1 unresolved externals
scons: *** [build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll] Error 1120
scons: building terminated because of errors.

Fixes: 56342c97ee ("gallium/u_tests: test FBFETCH and shader-based blending with MSAA")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-02 10:06:44 +03:00
Andres Gomez
9d220fa950 glsl: use util_snprintf()
Instead of plain snprintf(). To fix the MSVC 2013 build.

Fixes: 6ff0c6f4eb ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-08-02 10:06:44 +03:00
Jordan Justen
8fcdb71d8c intel/compiler: Add brw_get_compiler_config_value for disk cache
During code review, Jason pointed out that:

2b3064c073 "i965, anv: Use INTEL_DEBUG for disk_cache driver flags"

Didn't account for INTEL_SCALER_* environment variables.

To fix this, let the compiler return the disk_cache driver flags.

Another possible fix would be to pull the INTEL_SCALER_* into
INTEL_DEBUG bits, but as we are currently using 41 of 64 bits, I
didn't think it was a good use of 4 more of these bits. (5 since
INTEL_PRECISE_TRIG needs to be accounted for as well.)

Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-01 23:49:16 -07:00
Jordan Justen
3887700dfd i965: Disable shader cache with INTEL_DEBUG=shader_time
Shader time hard codes an index of the shader time buffer within the
gen program.

In order to support shader time in the disk shader cache, we'd need to
add the shader time index into the program key. This should work, but
probably is not worth it for this particular debug feature.

Therefore, let's just disable the disk shader cache if the shader time
debug feature is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106382
Fixes: 96fe36f7ac "i965: Enable disk shader cache by default"
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-01 23:30:49 -07:00
Timothy Arceri
bea4722c2e glsl: make a copy of array indices that are used to deref a function out param
Fixes new piglit test:
tests/spec/glsl-1.20/execution/qualifiers/vs-out-conversion-int-to-float-vec4-index.shader_test

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-02 11:06:28 +10:00
Jason Ekstrand
de9e5cf35a anv/pipeline: Add populate_tcs/tes_key helpers
They don't really do anything interesting, but it's more consistent this
way.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
e621f57556 anv/pipeline: Rework the parameters to populate_wm_prog_key
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
b2e0b0dad6 anv/pipeline: More aggressively optimize away color attachments
Instead of just looking at the number of color attachments, look at
which ones are actually used by the subpass.  This lets us potentially
throw away chunks of the fragment shader.  In DXVK, for example, all
subpasses have 8 attachments and most are VK_ATTACHMENT_UNUSED so this
is very helpful in that case.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
80bc0b728c anv: Restrict the number of color regions to those actually written
The back-end compiler emits the number of color writes specified by
wm_prog_key::nr_color_regions regardless of what nir_store_outputs we
have.  Once we've gone through and figured out which render targets
actually exist and are written by the shader, we should restrict the key
to avoid extra RT write messages.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
4d57e543b8 anv/pipeline: Fix up deref modes if we delete a FS output
With the new deref instructions, we have to keep the modes consistent
between the derefs and the variables they reference.  Since we remove
outputs by changing them to local variables, we need to run the fixup
pass to fix the modes.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
7f75cf2a94 nir/lower_indirect: Bail early if modes == 0
There's no point in walking the program if we're never going to actually
lower anything.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
4434591bf5 intel/nir: Call nir_lower_io_to_scalar_early
Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166953 -> 15073611 (-0.62%)
    instructions in affected programs: 2390284 -> 2296942 (-3.91%)
    helped: 16469
    HURT: 505

    total loops in shared programs: 4954 -> 4951 (-0.06%)
    loops in affected programs: 3 -> 0
    helped: 3
    HURT: 0
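
The percentages in a shader-db report like the one above are relative changes measured against the "before" count. A minimal sketch of that arithmetic, assuming a hypothetical helper named percent_change (it is not part of shader-db itself):

```python
def percent_change(before, after):
    """Return the relative change from before to after, in percent."""
    return (after - before) * 100.0 / before


# Instruction counts quoted in the report above.
total = percent_change(15166953, 15073611)
affected = percent_change(2390284, 2296942)
print("total: {:.2f}%  affected: {:.2f}%".format(total, affected))
```

The "affected programs" figure is larger in magnitude because the same absolute saving is divided by a much smaller baseline.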

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
b0bb547f78 intel/nir: Split IO arrays into elements
The NIR nir_lower_io_arrays_to_elements pass attempts to split I/O
variables which are arrays or matrices into a sequence of separate
variables.  This can help link-time optimization by allowing us to
remove varyings at a more granular level.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15177645 -> 15168494 (-0.06%)
    instructions in affected programs: 79857 -> 70706 (-11.46%)
    helped: 392
    HURT: 0

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
57804efa88 i965/fs: Flag all slots of a flat input as flat
Otherwise, only the first vec4 of a matrix or other complex type will
get marked as flat and we'll interpolate the others.  This was caught by
a dEQP test which started failing because it did a SSO vs. non-SSO
comparison.  Previously, we did the interpolation wrong consistently in
both versions.  However, with one of Tim Arceri's NIR linking patches, we
started splitting the matrix input into vectors at link time in the
non-SSO version and it started getting correctly interpolated which
didn't match the broken SSO version.  As of this commit, they both get
correctly interpolated.

Fixes: e61cc87c75 "i965/fs: Add a flat_inputs field to prog_data"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
4e060385e9 intel/nir: Use the correct scalar stage for consumers when linking
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Dave Airlie
70c34a1bd2 docs: update 18.2.0 release notes for virgl
2018-08-02 08:43:56 +10:00
Dylan Baker
34998aae18 nir/meson: fix c vs cpp args for nir test
Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-01 12:51:22 -07:00
Dylan Baker
2877b6555c gallium: fix ddebug on windows
By including the proper headers for getpid and for mkdir.

Fixes: 6ff0c6f4eb
       ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-01 12:50:25 -07:00
Dylan Baker
17f49950da util: move process.[ch] to u_process.[ch]
On windows process.h is a system provided header, and it's required in
include/c11/threads_win32.h. This header interferes with searching for
that header, and results in windows build warnings with scons, but
errors in meson which doesn't allow implicit function declarations. Just
rename process to u_process, which follows the style of utils anyway.

Fixes: 2e1e6511f7
       ("util: extract get_process_name from xmlconfig.c")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-01 12:47:16 -07:00
Marek Olšák
cb6b241c30 ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2)
To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
finish sooner on the older CPUs. (otherwise it gets killed and we fail
the test)

Acked-by: Dave Airlie <airlied@gmail.com>
2018-08-01 15:25:18 -04:00
Eric Anholt
c2eab33b08 v3d: Actually put the "%s" in the snprintf.
I missed an important part when porting the change over, fixing my
compiler warning but breaking -Werror=format-security.

Fixes: e6ff5ac446 ("v3d: use snprintf(..., "%s", ...) instead of strncpy")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107443
2018-08-01 11:39:19 -07:00
Juan A. Suarez Romero
d742270564 vc4: Fix automake linking error.
CXXLD    gallium_dri.la
../../../../src/gallium/drivers/vc4/.libs/libvc4.a(vc4_cl_dump.o): In function `vc4_dump_cl':
src/gallium/drivers/vc4/vc4_cl_dump.c:45: undefined reference to `clif_dump_init'
src/gallium/drivers/vc4/vc4_cl_dump.c:82: undefined reference to `clif_dump_destroy'
../../../../src/broadcom/cle/.libs/libbroadcom_cle.a(cle_libbroadcom_cle_la-v3d_decoder.o): In function `v3d_field_iterator_next':
src/broadcom/cle/v3d_decoder.c:902: undefined reference to `clif_lookup_bo'

Fixes: e92959c4e0 ("v3d: Pass the whole clif_dump structure to v3d_print_group().")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107423
CC: Eric Anholt <eric@anholt.net>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-08-01 20:33:07 +02:00
Juan A. Suarez Romero
810c9a4eba scons: require scons 2.4 or greater
There is a bug with scons 2.3, used in Travis, where it fails to detect
some C functions.

Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-08-01 20:33:00 +02:00
Juan A. Suarez Romero
fea0b92042 travis: install scons from pip
The Ubuntu version provided by Travis is a bit old, and does not
correctly detect some C functions.

Use a more modern version through pip.

Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-08-01 20:32:42 +02:00
Marek Olšák
26d3e2b4b0 docs: mark ARB_ES3_2_compatibility as done for radeonsi
2018-08-01 11:38:54 -04:00
Lionel Landwerlin
2477e516d9 intel: tools: aubwrite: split gen[89] from gen10+
Gen10+ has an additional bit in MI_BATCH_BUFFER_END to signal the end
of the context image.

We select the largest size for the context image regardless of the
generation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-08-01 15:31:56 +01:00
Mathieu Bridon
91939255a7 python: Use the unicode_escape codec
Python 2 had string_escape and unicode_escape codecs. Python 3 only has
the latter. These work the same as far as we're concerned, so let's use
the future-proof one.

However, the rest of the code expects unicode strings, so we need to
decode them again.
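
The encode-then-decode round trip described above could look roughly like this. The function name escape_text is hypothetical and only illustrates the idea; it is not the code touched by this commit:

```python
def escape_text(text):
    """Backslash-escape control and non-ASCII characters in text."""
    # The unicode_escape codec exists on both Python 2 and 3,
    # and encoding with it yields a byte string.
    escaped_bytes = text.encode('unicode_escape')
    # The escaped result is pure ASCII, so decoding it gives back
    # the text string the rest of the code expects.
    return escaped_bytes.decode('ascii')


print(escape_text(u"line one\nline two"))
```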

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-01 14:26:19 +01:00
Mathieu Bridon
ad363913e6 python: Explicitly add the 'L' suffix on Python 3
Python 2 had two integer types: int and long. Python 3 dropped the
latter, as it made the int type automatically support bigger numbers.

As a result, Python 3 lost the 'L' suffix on integer literals.

This probably doesn't make much difference when compiling the generated
C code, but adding it explicitly means that both Python 2 and 3 generate
the exact same C code anyway, which makes it easier to compare and check
for discrepancies when moving to Python 3.
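
One way to get identical output from both interpreters is to build the literal from str() and append the suffix by hand. This is only a sketch; long_literal is a hypothetical name, and the real generator may format its values differently:

```python
def long_literal(value):
    """Format an integer as a C long literal, e.g. 42 -> '42L'."""
    # str() never includes the 'L' suffix on either Python version
    # (only Python 2's repr() of a long did), so append it explicitly.
    return str(value) + 'L'


print(long_literal(2 ** 40))
```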

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-01 14:26:19 +01:00
Mathieu Bridon
a71df20855 python: Explicitly use byte strings
In both Python 2 and 3, zlib.Compress.compress() takes a byte string,
and returns a byte string as well.

In Python 2, the script was working because:

1. string literals were byte strings;
2. opening a file in unicode mode, reading from it, then passing the
   unicode string to compress() would automatically encode to a byte
   string;

On Python 3, the above two points are not valid any more, so:

1. zlib.Compress.compress() refuses the passed unicode string;
2. compressed_data, defined as an empty unicode string literal, can't be
   concatenated with the byte string returned by compress();

This commit fixes this by explicitly using byte strings where
appropriate, so that the script works on both Python 2 and 3.
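
A minimal sketch of the pattern, assuming UTF-8 input and a hypothetical helper named compress_chunks (the actual script is organised differently):

```python
import zlib


def compress_chunks(chunks):
    """Compress an iterable of text or byte chunks into one byte string."""
    compressor = zlib.compressobj()
    # Start from an empty *byte* string so concatenation works on Python 3.
    compressed_data = b''
    for chunk in chunks:
        # compress() only accepts bytes, so encode text explicitly.
        if not isinstance(chunk, bytes):
            chunk = chunk.encode('utf-8')
        compressed_data += compressor.compress(chunk)
    # flush() returns whatever the compressor is still buffering.
    compressed_data += compressor.flush()
    return compressed_data


print(zlib.decompress(compress_chunks([u'hello ', u'world'])))
```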

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-01 14:26:19 +01:00
Mathieu Bridon
8678fe537a python: Use open(), not file()
The latter is a constructor for file objects, but when actually opening
a file, using the former is more idiomatic.

In addition, file() is not a builtin any more in Python 3, so this makes
the script compatible with both Python 2 and Python 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-01 14:26:19 +01:00
Mathieu Bridon
c24d826968 python: Open file in binary mode
The XML parser wants byte strings, not unicode strings.

In both Python 2 and 3, opening a file without specifying the mode will
open it for reading in text mode ('r').

On Python 2, the read() method of the file object will return byte
strings, while on Python 3 it will return unicode strings.

Explicitly specifying the binary mode ('rb') makes the behaviour
identical in both Python 2 and 3, returning what the XML parser
expects.
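
For illustration, here is the same idea with the standard library's ElementTree parser. That choice is an assumption made for the sketch, as is the name parse_xml_file; the script in this commit may use a different XML parser:

```python
import xml.etree.ElementTree as ET


def parse_xml_file(path):
    """Read an XML file as bytes and return its root element."""
    # 'rb' makes read() return a byte string on both Python 2 and 3.
    with open(path, 'rb') as xml_file:
        data = xml_file.read()
    return ET.fromstring(data)
```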

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-01 14:26:19 +01:00
Mathieu Bridon
e40200e0aa python: Don't abuse hex()
The hex() builtin returns a string containing the hexadecimal
representation of an integer.

When the argument is not an integer, then the function calls that
object's __hex__() method, if one is defined. That method is supposed to
return a string.

While that's not explicitly documented, that string is supposed to be a
valid hexadecimal representation for a number. Python 2 doesn't enforce
this though, which is why we got away with returning things like
'NIR_TRUE' which are not numbers.

In Python 3, the hex() builtin instead calls an object's __index__()
method, which itself must return an integer. That integer is then
automatically converted to a string with its hexadecimal representation
by the rest of the hex() function.

As a result, we really can't make this compatible with Python 3 as it
is.

The solution is to stop using the hex() builtin, and instead use a hex()
object method, which can return whatever we want, in Python 2 and 3.
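
The difference can be sketched as follows on Python 3. Both classes are hypothetical stand-ins, not the classes changed by this commit:

```python
class Constant(object):
    """A value that is either an integer or a symbolic name."""

    def __init__(self, value):
        self.value = value

    def hex(self):
        # An ordinary method, so it may return any string we like,
        # including symbolic names that are not numbers.
        if isinstance(self.value, int):
            return hex(self.value)
        return self.value


class Indexable(object):
    """Shows the Python 3 protocol: the hex() builtin calls __index__()."""

    def __index__(self):
        # Must return an integer; hex() formats it itself.
        return 255


print(Constant(255).hex(), Constant('NIR_TRUE').hex(), hex(Indexable()))
```

Callers then write value.hex() instead of hex(value), which behaves the same on both interpreters.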

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-01 14:26:19 +01:00
Mathieu Bridon
12eb5b496b python: Better get character ordinals
In Python 2, iterating over a byte-string yields single-byte strings,
and we can pass them to ord() to get the corresponding integer.

In Python 3, iterating over a byte-string directly yields those
integers.

Transforming the byte string into a bytearray gives us a list of the
integers corresponding to each byte in the string, removing the need to
call ord().

This makes the script compatible with both Python 2 and 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-01 14:26:19 +01:00
Mario Kleiner
9bd8b0f700 loader_dri3: Handle mismatched depth 30 formats for Prime renderoffload.
Detect if the display (X-Server) gpu and Prime renderoffload gpu prefer
different channel ordering for color depth 30 formats ([X/A]BGR2101010
vs. [X/A]RGB2101010) and perform format conversion during the blitImage()
detiling op from tiled backbuffer -> linear buffer.

For this we need to find the visual (= red channel mask) for the
X-Drawable used to display on the server gpu. We use the same proven
logic for finding that visual as in commit "egl/x11: Handle both depth
30 formats for eglCreateImage()".

This is mostly to allow "NVidia Optimus" at depth 30, as Intel/AMD
gpu's prefer xRGB2101010 ordering, whereas NVidia gpu's prefer
xBGR2101010 ordering, so we can offload to nouveau without getting
funky colors.

Tested on Intel single gpu, NVidia single gpu, Intel + NVidia prime
offload with DRI3/Present.

Note: An unintended but pleasant surprise of this patch is that it also
seems to make the modesetting-ddx of server 1.20.0 work at depth 30
on nouveau, at least with unredirected "classic" X rendering, and
with redirected desktop compositing under XRender accel, and with OpenGL
compositing under GLX. Only X11 compositing via OpenGL + EGL still gives
funky colors. modesetting-ddx + glamor are not yet ready to deal with
nouveau's ABGR2101010 format, and treat it as ARGB2101010, also exposing
X-visuals with ARGB2101010 style channel masks. Seems somehow this triggers
the logic in this patch on modesetting-ddx + depth 30 + DRI3 buffer sharing
and does the "wrong" channel swizzling that then cancels out the "wrong"
swizzling of glamor and we end up with the proper pixel formatting in
the scanout buffer :). This so far tested on a NVA5 Tesla card under KDE5
Plasma as shipping with Ubuntu 16.04.4 LTS.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-01 12:55:37 +01:00
Mario Kleiner
61a02729f7 egl/x11: Handle both depth 30 formats for eglCreateImage(). (v4)
We need to distinguish if the backing storage of a pixmap
is XRGB2101010 or XBGR2101010, as different gpu hw supports
different formats. NVidia hw prefers XBGR, whereas AMD and
Intel are happy with XRGB.

Use the red channel mask of the first depth 30 visual of
the x-screen to distinguish which hw format to choose.

This fixes desktop composition of color depth 30 windows
when the X11 compositor uses EGL.
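
The selection rule can be sketched as below. The Visual tuple and the function name are hypothetical, and returning None when no depth 30 visual exists is a choice made for this sketch only. The mask value follows from the format layouts: in XBGR2101010 the red channel occupies the low 10 bits (0x3ff), while in XRGB2101010 it sits in bits 20 to 29 (0x3ff00000).

```python
from collections import namedtuple

# Hypothetical stand-in for an X visual: only the fields the rule needs.
Visual = namedtuple('Visual', ['depth', 'red_mask'])


def format_for_depth30(visuals):
    """Pick the 10-bit format from the first depth 30 visual's red mask."""
    for visual in visuals:
        if visual.depth == 30:
            # Red in the low 10 bits means BGR channel ordering.
            if visual.red_mask == 0x3ff:
                return 'XBGR2101010'
            return 'XRGB2101010'
    # No depth 30 visual found; not covered by this sketch.
    return None


print(format_for_depth30([Visual(24, 0xff0000), Visual(30, 0x3ff)]))
```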

v2: Switch from using the visual of the root window to simply
    using the first depth 30 visual for the x-screen, as testing
    shows that each driver only exports either xrgb ordering or
    xbgr ordering for the channel masks of its depth 30 visuals,
    so this should be unambiguous and avoid trouble if X ever
    supports depth 30 pixmaps on screens with a non-depth 30 root
    window visual. This is per Michel's suggestion.

v3: No change to v2, but spent some time testing this more on
    AMD hw, with my software hacked up to intentionally choose
    pixel formats/visual with the non-preferred xBGR2101010
    ordering on the ati-ddx, also with a standard non-OpenGL
    X-Window with depth 30 visual, to make sure that things show
    up properly with the right colors on the screen when going
    through EGL+OpenGL based compositing on KDE-5. Iow. to confirm
    that my explanation to the v2 patch on the mailing list of why
    it should work and the actual practice agree (or possibly that
    I am good at fooling myself during testing ;).

v4: Drop the local `red_mask` and just `return visual->red_mask`/
    `return 0`, as suggested by Eric Engestrom.

    Rebased onto current master, to take the cleanup via the new
    function dri2_format_for_depth() into account.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-01 12:55:37 +01:00
Daniel Stone
753f603b52 gbm: Add support for 10bpp BGR formats
Add support for XBGR2101010 and ABGR2101010 formats.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-01 12:55:37 +01:00
Daniel Stone
275b23ed0e egl/wayland: Add 10bpc BGR configs
Add support for XBGR2101010 and ABGR2101010.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-08-01 12:55:37 +01:00
Iago Toral Quiroga
471bce5689 intel/compiler: implement 8-bit constant load
Fixes VK-GL-CTS CL#2567

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-01 08:08:15 +02:00
Iago Toral Quiroga
7e6c8b0cb7 intel/compiler: add setup_imm_(u)b helpers
The hardware doesn't support byte immediates, so similar to setup_imm_df()
for doubles, these helpers work by loading the constant value into a
VGRF.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-01 08:08:15 +02:00
Rhys Perry
bd56e117ff glsl: fix function inlining with opaque parameters
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-08-01 00:10:01 -04:00
Rhys Perry
f903bce8a6 glsl, glsl_to_tgsi: fix sampler/image constants
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-08-01 00:10:01 -04:00
Rhys Perry
ea2a3f52b4 glsl: allow ?: operator with images and samplers when bindless is enabled
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-08-01 00:10:01 -04:00
Rhys Perry
42d4acb39d glsl_to_tgsi: allow bound samplers and images to be used as l-values
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-08-01 00:10:00 -04:00
Rhys Perry
00589be6c4 gallium: add new SAMP2HND and IMG2HND opcodes
This commit does not add support for the opcodes in gallivm or tgsi_to_nir.c

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-08-01 00:10:00 -04:00
Dave Airlie
1fb388cd20 docs/features: update virgl GLES 3.1/3.2 status
virgl now exposes GLES3.1 and 3.2
2018-08-01 14:09:11 +10:00
Dave Airlie
e2c62170d5 docs/features: update virgl GL 4.3 support
virgl with up to date host renderer now exposes GL 4.3.
2018-08-01 14:08:33 +10:00
Erik Faye-Lund
21e33f4a10 virgl: enable FBFETCH if virglrenderer supports it
This fixes the following dEQP-GLES31 cases from NotSupported to
Pass for me:

- dEQP-GLES31.functional.blend_equation_advanced.state_query.*
- dEQP-GLES31.functional.blend_equation_advanced.basic.*
- dEQP-GLES31.functional.blend_equation_advanced.srgb.*
- dEQP-GLES31.functional.blend_equation_advanced.msaa.*
- dEQP-GLES31.functional.blend_equation_advanced.barrier.*
- dEQP-GLES31.functional.draw_buffers_indexed.overwrite_*advanced_blend_eq*
- dEQP-GLES31.functional.state_query.indexed.blend_equation_advanced_*
- dEQP-GLES31.functional.debug.negative_coverage.*.advanced_blend.*

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-08-01 14:05:22 +10:00
Erik Faye-Lund
7ef86a03f0 virgl: add texture_barrier stub
In gallium, supporting FBFETCH means supporting non-coherent fetches, but
in virglrenderer, due to technical reasons this is backed by coherent
fetches instead. This means we don't need to do anything for the barriers.

However, if we don't have a texture_barrier implementation, we get crashes
because the non-coherent extension is exposed.

So, let's leave this as a NOP for now.

[airlied: I've got a more complete impl of this somewhere, once we
land the host side].
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-08-01 14:03:51 +10:00
Dave Airlie
6f5d463a78 virgl: enable robustness if the host exposes it
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-08-01 14:00:38 +10:00
Dave Airlie
2df8b80c4c virgl: Support ARB_framebuffer_no_attachments
This uses new protocol to send the default sizes to the host.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-08-01 14:00:35 +10:00
Dave Airlie
f8a8ea6a2d virgl: add initial ARB_compute_shader support
This hooks up compute shader creation and launch grid support.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-08-01 14:00:31 +10:00
Marek Olšák
157c6e8195 util: don't use __builtin_clz unconditionally
This fixes the build if __builtin_clz is unsupported.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-31 23:28:01 -04:00
Marek Olšák
c5c6e0187f ac/surface: fix MSAA corruption on Vega due to FMASK tile swizzle
a needle in the haystack?

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-31 22:56:40 -04:00
Eric Anholt
e6ff5ac446 v3d: use snprintf(..., "%s", ...) instead of strncpy
Fixes a compiler warning about terminator NUL, based on f836d799f9
("intel/decoder: use snprintf(..., "%s", ...) instead of strncpy")
2018-07-31 16:42:11 -07:00
Eric Anholt
3471ce9985 v3d: Add support for the TMUWT instruction.
This instruction is used to ensure that TMU stores have been processed
before moving on.  In particular, you need any TMU ops to be done by the
time the shader ends.
2018-07-31 16:05:04 -07:00
Marek Olšák
7d36c866d2 radeonsi: report supported EQAA combinations from is_format_supported
Framebuffer without attachments now supports 16 samples.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-07-31 18:28:41 -04:00
Marek Olšák
20dd75a926 radeonsi: use storage_samples instead of color_samples in most places
and use pipe_resource::nr_storage_samples instead of
r600_texture::num_color_samples.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-07-31 18:28:41 -04:00
Marek Olšák
966f155623 gallium: add storage_sample_count parameter into is_format_supported
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-07-31 18:28:41 -04:00
Marek Olšák
8632626c81 gallium: add pipe_resource::nr_storage_samples, and set it same as nr_samples
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-07-31 18:28:41 -04:00
Marek Olšák
0caf74bbcd gallium: add PIPE_CAP_FRAMEBUFFER_MSAA_CONSTRAINTS
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-07-31 18:28:41 -04:00
Marek Olšák
55d56dd859 docs: update radeonsi features and release notes
2018-07-31 18:12:37 -04:00
Marek Olšák
ed8b4ed6c4 st/mesa: implement ASTC 2D LDR fallback for all drivers
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-07-31 18:09:57 -04:00
Marek Olšák
5fe52044ef st/mesa: add ETC2 & ASTC fast path for GetTex(Sub)Image
Not sure if GL/GLES can hit this path, but it's just decompression.

Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-07-31 18:09:57 -04:00
Marek Olšák
ebe03d3699 st/mesa: generalize fallback_copy_image for compressed textures
in order to support ASTC

Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-07-31 18:09:57 -04:00
Marek Olšák
c3fafa127a st/mesa: generalize code for the compressed texture map/unmap fallback
in order to support ASTC

Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-07-31 18:09:57 -04:00
Marek Olšák
3d7e4311bf st/mesa: use st_compressed_format_fallback more
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-07-31 18:09:57 -04:00
Marek Olšák
912e0525be st/mesa: generalize st_etc_fallback -> st_compressed_format_fallback
for ASTC support later

Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-07-31 18:09:57 -04:00
Marek Olšák
38ab39f650 mesa: add ASTC 2D LDR decoder
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-07-31 18:09:57 -04:00
Dave Airlie
5be352b430 docs/features: mark virgl image features and GL4.2 as done
2018-08-01 08:06:41 +10:00
Gurchetan Singh
9c136e8a07 virgl: also mark sampler views as dirty
When texture buffers are used as images in compute shaders, the guest
never sees the modified data since the TBO is always marked as clean.

Fixes most dEQP-GLES31.functional.image_load_store.buffer.* tests.

Example test cases:
   dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui
   dEQP-GLES31.functional.image_load_store.buffer.qualifiers.coherent_r32f
   dEQP-GLES31.functional.image_load_store.buffer.format_reinterpret.rgba8_rgba8ui

Note: virglrenderer side patch also needed to bind TBOs correctly

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-01 08:05:39 +10:00
Dave Airlie
a090df0d5d virgl: add memory barrier support
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2018-08-01 08:02:35 +10:00
Dave Airlie
6f75058359 virgl: add TXQS support
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2018-08-01 08:02:32 +10:00
Dave Airlie
452eea140d virgl: add initial images support (v2)
v2: add max image samples support

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2018-08-01 08:02:27 +10:00
Jon Turney
faa29c0e24 Make glXChooseFBConfig handle unspecified sRGB correctly
Make glXChooseFBConfig properly handle the case where the only matching
configs have the sRGB flag set, but no sRGB attribute is specified.

Since 6e06e281, the sRGBcapable flag is now actually compared, using
MATCH_DONT_CARE.

7b0f912e added defaulting of sRGBcapable to GL_FALSE in
__glXInitializeVisualConfigFromTags(), to handle servers which don't report
it, but this function is also used by glXChooseFBConfig(), so sRGBcapable is
implicitly false when not explicitly specified.

(This can cause e.g. glxinfo to fail to find anything matching the simple
config it looks for if all the candidates have the sRGB flag set to true.
I'm assuming this doesn't happen 'normally' as candidate configs with and
without sRGB true are available)

Move this defaulting to createConfigsFromProperties(), and set the default
for glXChooseFBConfig() in init_fbconfig_for_chooser() to GLX_DONT_CARE.
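
The effect of the changed default can be sketched like this. It is a simplified model rather than the GLX code: configs are plain dictionaries, and the function names are hypothetical. Only the GLX_DONT_CARE value is taken from glx.h.

```python
GLX_DONT_CARE = 0xFFFFFFFF  # value defined in glx.h


def srgb_matches(requested, candidate):
    """Don't-care matching: accept anything unless a value was requested."""
    return requested == GLX_DONT_CARE or requested == candidate


def choose_configs(configs, requested_srgb=GLX_DONT_CARE):
    """Return the configs whose sRGB flag satisfies the request."""
    return [c for c in configs if srgb_matches(requested_srgb, c['srgb'])]


srgb_only = [{'id': 1, 'srgb': True}, {'id': 2, 'srgb': True}]
# Old behaviour: the attribute silently defaulted to false, so nothing matched.
print(choose_configs(srgb_only, requested_srgb=False))
# New behaviour: an unspecified attribute defaults to "don't care".
print(choose_configs(srgb_only))
```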

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-31 13:56:13 -04:00
Olivier Fourdan
03a61b977e dri3: For 1.2, use root window instead of pixmap drawable
get_supported_modifiers() and pixmap_from_buffers() requests both
expect a window as drawable, passing a pixmap will fail as the Xserver
will fail to match the given drawable to a window.

That leads dri3_alloc_render_buffer() to return NULL and breaks
rendering when using GLX_DOUBLEBUFFER on pixmaps.

Query the root window of the pixmap on first init, and use the root
window instead of the pixmap drawable for get_supported_modifiers()
and pixmap_from_buffers().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107117
Fixes: 069fdd5 ("egl/x11: Support DRI3 v1.1")
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-31 13:51:59 -04:00
Alejandro Piñeiro
16b5e15e91 i965: enable XFB and GeometryStreams for gen7+
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:33:37 +02:00
Neil Roberts
b7421cda86 i965: Link XFB varyings for SPIR-V shaders
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:33:37 +02:00
Neil Roberts
b9719b4b05 nir/linker: Add the start of a pure-NIR linker for XFB
v2: ignore names on purpose, for consistency with other places where
    we are doing the same (Alejandro)

v3: changes proposed by Timothy Arceri, implemented by Alejandro Piñeiro:
   * Remove redundant 'struct active_xfb_varying'
   * Update several comments, including spec quotes if needed
   * Rename struct 'active_xfb_varying_array' to 'active_xfb_varyings'
   * Rename variable 'array' to 'active_varyings'
   * Replace one if condition for an assert (<MAX_FEEDBACK_BUFFERS)
   * Remove BufferMode initialization (was already done)

v4: simplify output pointer handling (Timothy)

Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:33:37 +02:00
Neil Roberts
9fbe5bd811 nir/types: Add a wrapper to access gl_type
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:33:37 +02:00
Alejandro Piñeiro
739bb9e3d4 arb_gl_spirv: add calls to several nir lowerings
For now we are just adding nir lowerings that are needed/mandatory to
get things working. After everything is settled, we would start to add
good-to-have lowerings.

This patch adds the following calls:

  * nir_split_var_copies and nir_split_per_member_structs: as vulkan
    drivers are doing now. See commit
    b0c643d8f5 ("spirv: Use NIR
    per-member splitting") for more info.

    Without this commit, piglit tests like this one crash:
    spec/arb_gl_spirv/execution/varying/block

    And in general most of the shaders that include any kind of
    struct.

  * nir_copy_prop: after nir_deref_instr introduction, function calls
    need this. See commit "nir,spirv: Rework function calls"
    (c11833ab24) for more info.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:33:37 +02:00
Alejandro Piñeiro
d69027536c compiler/spirv: add XFB and GeometryStreams capability check support
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:33:28 +02:00
Neil Roberts
1e3f61d1d5 nir/gather_info: Set info.gs.uses_streams
Whenever a non-zero stream is written to it now sets uses_streams to
true. This reflects the code in validate_geometry_shader_emissions for
GLSL.

v2: set uses_streams at gather_info instead of at spirv to nir
    (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-31 13:18:28 +02:00
Neil Roberts
b0af66bb17 spirv/nir: Fix the stream ID when emitting a primitive or vertex
It looks like it was previously taking the SPIR-V instruction number
directly instead of looking up the constant value.

v2: use vtn_constant_value helper (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-31 13:18:28 +02:00
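The bug described in b0af66bb17 can be illustrated with a small sketch. In SPIR-V, the stream operand of an emit instruction refers to a constant by id, so the id has to be resolved to the constant's value before it is used as a stream number. The table, struct, and function names below are hypothetical stand-ins for illustration, not the vtn API:

```c
#include <stdint.h>

/* Hypothetical id -> constant table; a real translator tracks far more. */
struct toy_constant {
   uint32_t id;    /* result id of the constant */
   uint32_t value; /* literal the constant holds */
};

/* Resolve a constant id to its literal; returns 0 if the id is unknown. */
static uint32_t
toy_constant_value(const struct toy_constant *table, unsigned count,
                   uint32_t id)
{
   for (unsigned i = 0; i < count; i++) {
      if (table[i].id == id)
         return table[i].value;
   }
   return 0;
}

/* Buggy shape: uses the operand word (an id) as the stream number. */
static uint32_t
toy_stream_buggy(uint32_t operand)
{
   return operand;
}

/* Fixed shape: looks up the constant behind the id first. */
static uint32_t
toy_stream_fixed(const struct toy_constant *table, unsigned count,
                 uint32_t operand)
{
   return toy_constant_value(table, count, operand);
}
```

With a table that maps id 23 to the value 3, the buggy helper reports stream 23 while the fixed one reports stream 3.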
Neil Roberts
13b8857fcf spirv: Handle the SpvDecorationStream decoration
From SPIR-V 1.0 spec, section 3.20, "Decoration":

   "Stream
    Apply to an object or a member of a structure type. Indicates the
    stream number to put an output on."

Note the "or", so that means that it is allowed for either a full struct
or a member of a struct (although the wording is not really ideal, and
somewhat error-prone, imho).

We found this with some Geometry Streams tests for ARB_gl_spirv, where
the full gl_PerVertex is assigned Stream 0 (default value on OpenGL
for gl_PerVertex).

So this commit allows structs to have this Decoration, and sets the
stream at the nir variable if needed.

Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

v2: squash two Decoration Stream patches (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:18:28 +02:00
Neil Roberts
d480623bef mesa/glspirv: Set last_vert_prog
v2: simplify last_vert check (Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:18:28 +02:00
Neil Roberts
cd4a14be06 spirv: Handle XFB variable decorations
These set the new explicit XFB members on nir_variable.

This is needed to support ARB_gl_spirv, as Vulkan doesn't support
transform feedback.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:18:28 +02:00
Neil Roberts
a5ec8461f9 spirv: Handle SpvExecutionModeXfb
This just sets has_transform_feedback_varyings on the shader.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:18:28 +02:00
Neil Roberts
3fd5b4c7aa nir: Add members for the explicit XFB properties to nir_variable
These are copied from the corresponding values in
ir_variable. The intention is to eventually use them in a pure-NIR
linker.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-31 13:18:28 +02:00
Christian Gmeiner
e1d4882d05 etnaviv: fix typo in query names
Fixes: d0bed0b494 ("etnaviv: support HI performance counters")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Chris Healy <cphealy@gmail.com>
2018-07-31 08:33:32 +02:00
Tapani Pälli
553af7a190 mesa: fix a typo (trivial)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-31 08:19:38 +03:00
Tapani Pälli
ce80abbb17 mesa: add glRenderbufferStorage support for EXT_texture_norm16 formats
These bits were missing, found when extending the Piglit test.

Fixes: 7f467d4f73 "mesa: GL_EXT_texture_norm16 extension plumbing"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-31 08:19:10 +03:00
David Riley
f94681b6e2 egl/surfaceless: Allow DRMless fallback.
Allow platform_surfaceless to use swrast even if DRM is not available.
To be used to allow a fuzzer for virgl to be run on a jailed VM without
hardware GL or DRM support.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Signed-off-by: David Riley <davidriley@chromium.org>
2018-07-30 19:40:45 -07:00
David Riley
b169b84be6 egl/surfaceless: Define DRI_SWRastLoader extension when using swrast.
Signed-off-by: David Riley <davidriley@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
[chadv: Dropped spurious hunk]
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-07-30 19:40:08 -07:00
Eric Anholt
d934492ff9 v3d: Dump the contents of all the buffers in CLIF mode.
A V3D_DEBUG=clif file from a non-texturing .shader_test can now be
successfully run through the CLIF runner in the simulator.  Now I need to
build an open source CLIF runner against the v3d DRM module.
2018-07-30 14:29:01 -07:00
Eric Anholt
99a5ac250b v3d: Split walking the CLs to generate relocs from walking CLs to dump.
We need to dump each buffer's contents in order for a CLIF file, so we
need to collect all of the relocs into a buffer (such as the indirect CL
full of both uniforms and GL shader states) before we start dumping.
2018-07-30 14:29:01 -07:00
Eric Anholt
2df6f1a3df v3d: Include commands to run the BCL and RCL in CLIF dumps.
2018-07-30 14:29:01 -07:00
Eric Anholt
c6449e33e3 v3d: Use a short, underscored name for packets in CLIF/CL dumping.
These will match the names that the CLIF parser expects to see.  I may in
the future decide to change more of the other names so that I match the
names the HW/closed SW team uses for their packets, rather than the names
in the spec (which only they and I can read anyway).
2018-07-30 14:29:01 -07:00
Eric Anholt
b56f8c475e v3d: Rename "configuration" and "config" in the XML to "cfg"
This matches what CLIF parsing expects, and makes
TILE_BINNING_MODE_CONFIGURATION_COMMON_CONFIGURATION into a much more
legible TILE_BINNING_MODE_CFG_COMMON.
2018-07-30 14:29:01 -07:00
Eric Anholt
300e609feb v3d: s/colour/color in the XML.
The CLIF format expects American English spelling, and so does the rest
of Mesa.  I was previously adhering to the spec's spelling, which is
counterproductive.
2018-07-30 14:29:01 -07:00
Eric Anholt
3a8550ad06 v3d: Rename primitives to prims in the XML to match CLIF names.
This makes us match up with the V3D HW team's names a bit more.
2018-07-30 14:29:01 -07:00
Eric Anholt
6237c64049 v3d: Print CLIF fixed-point values as just their decimal value.
The parser doesn't handle float input, so we have to dump the raw value.
2018-07-30 14:29:01 -07:00
Eric Anholt
8da47b7648 v3d: When not doing terminal pretty-printing, comment struct field names.
The struct field names aren't part of the CLIF ABI, just the order of
fields within the struct.  The comments are there for human readability.
2018-07-30 14:29:01 -07:00
Eric Anholt
103f21b13d v3d: Add a separate flag for CLIF ABI output versus human-readable CLs.
A few of the upcoming changes would make the V3D_DEBUG=cl output less
readable, so let's make proper CLIF file production be under a separate
V3D_DEBUG=clif flag.
2018-07-30 14:29:01 -07:00
Eric Anholt
89ac6fa403 v3d: Add pack header support for f187 values.
V3D only has one of these (the top 16 bits of a float32) left in its CLs,
but VC4 had many more.  This gets us proper pretty-printing of the values
instead of a large uint.
2018-07-30 14:29:01 -07:00
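The f187 format mentioned in 89ac6fa403 is described there as the top 16 bits of a float32. A minimal sketch of converting to and from such a value follows. The function names are hypothetical, and truncating (rather than rounding) the low mantissa bits is an assumption of this sketch, not something the commit states:

```c
#include <stdint.h>
#include <string.h>

/* Keep the top 16 bits of a float32: sign, 8 exponent bits and the
 * 7 most significant mantissa bits.  Truncation is assumed here. */
static uint16_t
toy_pack_f187(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));
   return (uint16_t)(bits >> 16);
}

/* Rebuild a float32 by placing the 16 bits back in the top half and
 * zero-filling the dropped mantissa bits. */
static float
toy_unpack_f187(uint16_t v)
{
   uint32_t bits = (uint32_t)v << 16;
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}
```

Unpacking to a float is what lets a dumper print such a field as a readable number instead of a large unsigned integer.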
Eric Anholt
e146e3a795 v3d: Move depth offset packet setup to CSO creation time.
This should be some simpler memcpying at draw time, and makes the next
change easier.
2018-07-30 14:29:01 -07:00
Dave Airlie
9039cf70fa r600: reduce num compute threads to 1024.
I copied this value from radeonsi, but it was wrong; 1024
seems to be the correct answer from looking at gpuinfo.

This should fix a few compute shader related hangs. (at least in CTS)

Cc: <mesa-stable@lists.freedesktop.org>
(airlied: pushed because it avoids hangs)
2018-07-31 04:55:38 +10:00
Rob Clark
0ea243dcd5 freedreno/a5xx: fix txf_ms
Somehow this got lost from the initial MSAA patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 12:31:05 -04:00
Rhys Perry
f310e86a42 nvc0: serialize before updating some constant buffer bindings on Maxwell+
To avoid serializing, this has the user constant buffer always be 65536
bytes and enabled unless it's required that something else is used for
constant buffer 0.

Fixes artifacts with at least XCOM: Enemy Within, 0 A.D. and Unigine
Valley, Heaven and Superposition.

v2: changed uniform_buffer_bound to be bool instead of a uint32_t
v3: remove magic constants
v3: remove pointless code in nvc0_validate_driverconst

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100177
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-07-30 15:04:26 +01:00
Eric Anholt
0a3f653180 v3d: Block bin on render when doing vertex texturing.
The kernel by default serializes the BCL on previous BCLs submitted on
this FD, but not RCLs.  For now this fix is conservative and blocks on
last RCL if any vertex texturing is done, which fails to get bin/render
overlap if there was an intermediate job that doesn't draw to the BCL's
buffer.  I've dropped a perf_debug() in here to note that as a potential
future improvement.

Fixes intermittent failures in
KHR-GLES3.copy_tex_image_conversions.required.*
2018-07-29 19:25:39 -07:00
Eric Anholt
34cefa7fe0 v3d: Fix meson build without vc4.
2018-07-29 19:22:33 -07:00
Eric Anholt
27f1bfe471 vc4: Fix meson build when enabled without v3d.
Reported-by: Rob Clark <robdclark@gmail.com>
Fixes: e92959c4e0 ("v3d: Pass the whole clif_dump structure to v3d_print_group().")
2018-07-29 19:13:29 -07:00
Jason Ekstrand
05fb2f88ec nir/instr_set: Fix nir_instrs_equal for derefs
We weren't returning at the end of the nir_instr_type_deref case in
nir_instrs_equal and it was falling through to the default of false.
While we're at it, make the default unreachable because all statements
in the switch now have their own returns.  Had we done that before, we
would have caught this bug a long time ago.

Fixes: 19a4662a54 "nir: Add a deref instruction type"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-07-29 13:39:35 -07:00
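The class of bug fixed in 05fb2f88ec can be sketched in a few lines. The types and names below are hypothetical stand-ins rather than the NIR structures, and abort() stands in for an unreachable() macro. The first function shows a case that compares its operands but never returns success. The second gives every case its own return and makes the default a hard failure:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are not the NIR types. */
enum toy_instr_type { TOY_INSTR_ALU, TOY_INSTR_DEREF };

struct toy_instr {
   enum toy_instr_type type;
   int payload;
};

/* Buggy shape: the deref case never returns true, so two equal derefs
 * end up at the catch-all "return false". */
static bool
toy_instrs_equal_buggy(const struct toy_instr *a, const struct toy_instr *b)
{
   if (a->type != b->type)
      return false;

   switch (a->type) {
   case TOY_INSTR_ALU:
      return a->payload == b->payload;
   case TOY_INSTR_DEREF:
      if (a->payload != b->payload)
         return false;
      break; /* the missing "return true" */
   }
   return false;
}

/* Fixed shape: every case returns, and an unhandled type aborts instead
 * of silently reporting "not equal". */
static bool
toy_instrs_equal_fixed(const struct toy_instr *a, const struct toy_instr *b)
{
   if (a->type != b->type)
      return false;

   switch (a->type) {
   case TOY_INSTR_ALU:
      return a->payload == b->payload;
   case TOY_INSTR_DEREF:
      return a->payload == b->payload;
   default:
      abort(); /* stand-in for unreachable() */
   }
}
```

With a failing default, a case that forgets to return would be caught the first time it runs, which is the point the commit message makes.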
Jason Ekstrand
9a4ab4c120 nir: Take if uses into account in ssa_def_components_read
Fixes: d800b7daa5 "nir: Add a helper for figuring out what..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-29 13:39:35 -07:00
Jason Ekstrand
5c1c6939ce util/list: Make some helpers take const lists
They're all just querying things about the list and not mutating
anything.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-07-29 13:39:35 -07:00
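The idea behind 5c1c6939ce, that helpers which only query a list can take a const pointer, can be shown with a minimal circular list. The struct and function names here are hypothetical and only loosely modeled on util/list:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly linked list. */
struct toy_list_head {
   struct toy_list_head *prev;
   struct toy_list_head *next;
};

/* Mutating helpers need a non-const list. */
static void
toy_list_init(struct toy_list_head *head)
{
   head->prev = head;
   head->next = head;
}

static void
toy_list_add_tail(struct toy_list_head *item, struct toy_list_head *head)
{
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

/* Query helpers only read the links, so they accept a const list. */
static bool
toy_list_is_empty(const struct toy_list_head *head)
{
   return head->next == head;
}

static unsigned
toy_list_length(const struct toy_list_head *head)
{
   unsigned n = 0;
   for (const struct toy_list_head *node = head->next; node != head;
        node = node->next)
      n++;
   return n;
}
```

Callers holding only a const pointer to a list can then use the query helpers without casting the qualifier away.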
Rob Clark
0ddae4acae freedreno/a5xx: small cleanup
We no longer have semi-custom clear pipe that uses 3d state.  Normal
clears happen via hw blitter, and everything else uses u_blitter these
days.  So we don't need this hack.

TODO a3xx+a4xx could get same treatment.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-29 14:00:06 -04:00
Rob Clark
3932db0f7e freedreno/a5xx: remove unused prototype
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-29 13:50:19 -04:00
Rob Clark
104a49f166 freedreno: fix caps harder
Fixes: 868ca81c and f485e567
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-29 13:48:22 -04:00
Karol Herbst
bc0e0c2818 nir/lower_int64: mark all metadata as dirty
v2: use nir_metadata_preserve
    preserve metadata in case of !progress

Fixes: 074f5ba0b5
       "nir: Add a simple int64 lowering pass"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-28 19:59:28 +02:00
Mauro Rossi
0ca153f869 android: radv: enable build of vulkan.radv HAL module
src/amd/Android.mk requires to include src/amd/vulkan/Android.mk
to enable the build of vulkan.radv module

Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-28 12:40:14 +02:00
Mauro Rossi
212af3c9ea android: radv: add Android.mk for vulkan.radv HAL module
radv implements the Android Vulkan HAL interface. This patch adds
Android.mk building rules by porting the radv automake rules.
The vendor HAL module is installed as /vendor/lib/hw/vulkan.radv.so

Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-28 12:40:07 +02:00
Mauro Rossi
1eb65c51ad radv: generate entrypoints for VK_ANDROID_native_buffer
Patch changes radv entrypoints generator to not skip this extension even
though it is set as disabled in the vk.xml

Reference: 63525ba730 ("android: enable VK_ANDROID_native_buffer")
Fixes: 69f447553c ("vulkan: Drop vk_android_native_buffer.xml")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-28 12:39:57 +02:00
Mauro Rossi
c67b36c8a1 radv: move vk_format_table.c to generated sources
The Android build system will try to compile vk_format_table.c
as a shipped source, but at compile time it will be missing.
Move it to the generated sources, where it belongs.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-28 12:39:49 +02:00
Brian Paul
b4bda6e066 xlib: fix build break from _swrast_map_soft_renderbuffer() call
We need to pass the new flip_y argument.

Reviewed-by: Clayton Craft <clayton.a.craft@intel.com>
2018-07-27 21:21:24 -06:00
Brian Paul
90b189e5d2 swrast: fix crash in AA line code when there's no texture
Fixes a crash running the Piglit polygon-mode-facing test (and
probably others).

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-07-27 21:21:24 -06:00
Brian Paul
ce0f42dfe4 mesa: add switch case for GL 2.1 in _mesa_compute_version()
The xlib/swrast driver only supports GL 2.1.  This patch fixes a
crash if the app calls glGetString(GL_SHADING_LANGUAGE_VERSION).

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-07-27 21:21:24 -06:00
Brian Paul
4f51e8880d tgsi: whitespace fixes in tgsi_ureg.c
Trivial.
2018-07-27 21:21:24 -06:00
Brian Paul
f02243541d gallium/util: whitespace fixes in u_inlines.h
Trivial.
2018-07-27 21:21:24 -06:00
Brian Paul
4216a1d0a8 svga: whitespace fixes in svga_tgsi_decl_sm30.c
Trivial.
2018-07-27 21:21:24 -06:00
Brian Paul
2f1af8549d mesa: replace tabs with spaces in mipmap.c
Trivial.
2018-07-27 21:21:24 -06:00
Brian Paul
f39840f866 gallium/util: whitespace fixes in u_debug_memory.c
Trivial.
2018-07-27 21:21:24 -06:00
Brian Paul
2261d6a403 mesa: whitespace clean-up in texstore.c
Trivial.
2018-07-27 21:21:24 -06:00
Brian Paul
a67b629193 mesa: move var decls in texstore_rgba()
Move them closer to where they're first used.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-07-27 21:21:24 -06:00
Brian Paul
5e2582b381 mesa: remove unneeded free() call in texstore_rgba()
The pointer will always be NULL since that's what we just tested for.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-07-27 21:21:24 -06:00
Eric Anholt
942456f646 v3d: Skip printing sub-id or pad fields in CLIF dumping.
The parser doesn't expect them, so our fields would end up mismatched.
They're not really useful in console output, either.
2018-07-27 18:00:48 -07:00
Eric Anholt
3ee0ab599e v3d: Emit commands to switch CLIF parser to CL/shader/attr input mode.
By default after saying you are emitting a buffer, it'll expect a buffer
size.  Once you set a format, it'll keep parsing that format until you
announce something else.
2018-07-27 18:00:46 -07:00
Eric Anholt
a57770aa37 v3d: Dump fields in CLIF output in increasing offset order.
Previously, we emitted in XML order, which I happen to type in the
decreasing offset order of the specifications.  However, the CLIF parser
wants increasing offsets.
2018-07-27 17:56:55 -07:00
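One way to get the increasing-offset ordering described in a57770aa37 is to sort the field descriptions by start offset before dumping them. The commit does not say how the driver achieves the ordering, so the sketch below is an assumption, and the struct and function names are hypothetical:

```c
#include <stdlib.h>

/* Hypothetical field description: a name and its starting bit offset. */
struct toy_field {
   const char *name;
   unsigned start;
};

/* qsort comparator ordering fields by increasing start offset. */
static int
toy_field_cmp(const void *a, const void *b)
{
   const struct toy_field *fa = a;
   const struct toy_field *fb = b;
   return (fa->start > fb->start) - (fa->start < fb->start);
}

/* Reorder fields in place so a dumper can walk them front to back. */
static void
toy_sort_fields(struct toy_field *fields, size_t count)
{
   qsort(fields, count, sizeof(*fields), toy_field_cmp);
}
```

Fields listed in decreasing offset order, as in the XML, come out of toy_sort_fields() in the order the parser is said to expect.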
Eric Anholt
95bafeeabf v3d: Print addresses in CLIFs as references to buffers.
With CLIFs, the parser will choose an address for the buffer being
created, so we need to use effectively relocations to buffers instead of
the addresses that the driver uses.  This is also a whole lot more
intelligible for console output than raw addresses!
2018-07-27 17:56:36 -07:00
Eric Anholt
3c02838d29 v3d: Stop doing pretty-printed colorful booleans in CLIF output.
The parser wants to see a 1 or 0.  We can put "true" and "false" in a
comment to clarify that it's a boolean and the parser will skip it.
2018-07-27 17:55:57 -07:00
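The boolean formatting described in 3c02838d29 might look roughly like the following. The exact layout of the driver's output is not given in the commit message, so the spacing here is an assumption and the function name is hypothetical:

```c
#include <stdbool.h>
#include <stdio.h>

/* Write a boolean as the digit a parser can read, followed by a
 * comment carrying the human-readable meaning.  Returns the number of
 * characters snprintf would have written. */
static int
toy_format_bool(char *buf, size_t size, bool value)
{
   return snprintf(buf, size, "%d /* %s */", value ? 1 : 0,
                   value ? "true" : "false");
}
```

A parser that skips comments sees only the 1 or 0, while a person reading the dump still sees that the field is a boolean.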
Eric Anholt
422910d2e7 v3d: Move clif dumping to a separate step from noting where the CLs are.
Now all the printing happens from the same worklist processing.
2018-07-27 17:08:35 -07:00
Eric Anholt
01b4952773 v3d: Move clif dump BO lookup into the clif dumper.
The clif dumper is going to need information about all of our BOs if we're
going to dump them for replay purposes.
2018-07-27 17:08:35 -07:00
Eric Anholt
e92959c4e0 v3d: Pass the whole clif_dump structure to v3d_print_group().
To generate CLIF files that the v3dv3 simulator can parse, we're going to
need to decode addresses, and for that we'll need the vaddr lookup
function from the clif structure from within v3d_decoder.
2018-07-27 17:08:35 -07:00
Timothy Arceri
77207e5380 ac: pass write param to get_sampler_desc() from get_image_descriptor()
Looks like a mistake from when the deref stuff landed.

Fixes: 506a07e4e3 ("ac/nir: Add deref support to image intrinsics.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-28 08:57:03 +10:00
Marek Olšák
d89a123dfd gallium/u_vbuf: split u_vbuf_get_minmax_index function (v2)
This will be used by indirect multidraws.

v2: clean up the function further, change return types to unsigned

Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2018-07-27 17:50:40 -04:00
Alexander von Gluck IV
da8de6b757 gallium/auxiliary: Extern "c" fixes.
Used by C++ code such as Haiku's renderer.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-07-27 16:19:12 -05:00
Marek Olšák
5fe943aaee gallium/noop: implement invalidate_resource
2018-07-27 16:31:56 -04:00
Dave Airlie
5040319331 radv: fix cdw check vs tracing emit
If we have tracing enabled we could do all the tracing emits
and overflow the precalculated cdw_max.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-28 06:20:27 +10:00
Dave Airlie
b88468f15c radv: return binary code_size not variant code size to cache
The code sizes returned here get passed to the cache shader insert function,
which then does a memcpy from the code ptr, causing all sorts of valgrind
errors like:
==6755== Invalid read of size 8
==6755==    at 0x4C32FEE: memcpy@GLIBC_2.2.5 (vg_replace_strmem.c:1021)
==6755==    by 0x2305D4C7: radv_pipeline_cache_insert_shaders (radv_pipeline_cache.c:416)
==6755==    by 0x2305791D: radv_create_shaders (radv_pipeline.c:2158)
==6755==    by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404)
==6755==    by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515)
==6755==    by 0x230188AB: radv_device_init_meta_blit_color (radv_meta_blit.c:871)
==6755==    by 0x2301D50E: radv_device_init_meta_blit_state (radv_meta_blit.c:1278)
==6755==    by 0x23011893: radv_device_init_meta (radv_meta.c:352)
==6755==    by 0x2300744B: radv_CreateDevice (radv_device.c:1576)
==6755==    by 0x5187D0F: ??? (in /usr/lib64/libvulkan.so.1.1.77)
==6755==    by 0x518F6A3: ??? (in /usr/lib64/libvulkan.so.1.1.77)
==6755==    by 0x5192A42: vkCreateDevice (in /usr/lib64/libvulkan.so.1.1.77)
==6755==  Address 0x22a58548 is 4 bytes after a block of size 116 alloc'd
==6755==    at 0x4C2EBAB: malloc (vg_replace_malloc.c:299)
==6755==    by 0x23089DC4: ac_elf_read (ac_binary.c:144)
==6755==    by 0x23090A60: ac_compile_module_to_binary (ac_llvm_helper.cpp:162)
==6755==    by 0x23053F06: compile_to_memory_buffer (radv_llvm_helper.cpp:58)
==6755==    by 0x23053F06: radv_compile_to_binary (radv_llvm_helper.cpp:98)
==6755==    by 0x23052769: ac_llvm_compile (radv_nir_to_llvm.c:3394)
==6755==    by 0x23052823: ac_compile_llvm_module (radv_nir_to_llvm.c:3418)
==6755==    by 0x23053C05: radv_compile_nir_shader (radv_nir_to_llvm.c:3542)
==6755==    by 0x23061B4E: shader_variant_create (radv_shader.c:580)
==6755==    by 0x23061CFD: radv_shader_variant_create (radv_shader.c:634)
==6755==    by 0x23057765: radv_create_shaders (radv_pipeline.c:2123)
==6755==    by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404)
==6755==    by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515)

Since we are just inserting the code into the cache, we can avoid these
bad reads and data in the cache by just using the binary code size here.

Fixes: 939e5a382 (radv: add padding for the UMR disassembler)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-28 06:20:20 +10:00
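The over-read in b88468f15c comes down to copying with a length larger than the source allocation. A minimal sketch of the safe version follows. The struct layout and names are hypothetical, not the radv structures:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shader record: 'code' holds exactly 'code_size' bytes,
 * while 'padded_size' is a larger size that includes trailing padding
 * added elsewhere. */
struct toy_shader {
   const uint8_t *code;
   size_t code_size;
   size_t padded_size;
};

/* Copy the code into a freshly allocated cache entry.  The length is
 * the binary's own size; using padded_size here would read past the
 * end of 'code'. */
static uint8_t *
toy_cache_copy(const struct toy_shader *shader)
{
   uint8_t *entry = malloc(shader->code_size);
   if (entry)
      memcpy(entry, shader->code, shader->code_size);
   return entry;
}
```

The caller owns the returned buffer and frees it when the cache entry is dropped.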
Eric Anholt
22a1ba0403 v3d: Drop the use of the semaphores.
The kernel's scheduler doesn't rely on our emitting them, and in fact we'd
get in trouble if the kernel decided to schedule too many bins in a row
before getting around to scheduling the corresponding render.
2018-07-27 12:56:36 -07:00
Eric Anholt
9bf9a6d6a1 v3d: Drop the VG support from the XML.
This reflects a change on the HW/closed SW side to drop this unused HW.
With it dropped on their side, the CLIF parser no longer expects to find
VG fields.
2018-07-27 12:56:36 -07:00
Eric Anholt
5a1cc3861c v3d: Use /* */ instead of () for enum names in CLIF output.
This lets the comments be ignored by the CLIF parser.
2018-07-27 12:56:36 -07:00
Eric Anholt
95a0f99825 v3d: CLIF-dump the "Vec size" field as 0 == maximum value.
That's what a user should want to see, and what the CLIF parser wants.
This should maybe be generalized.
2018-07-27 12:56:36 -07:00
Eric Anholt
1c8e4632a7 v3d: Stop using spaces in the names of our buffers.
For CLIF dumping, we need names to not have spaces.  Rather than rewriting
them after the fact, just change the two cases where I had put a space in.
2018-07-27 12:56:36 -07:00
Fritz Koenig
ab05dd183c i965: implement GL_MESA_framebuffer_flip_y [v3]
Instead of using _mesa_is_winsys_fbo or
_mesa_is_user_fbo to infer if an fbo is
flipped use the FlipY flag.

v2:
  * additional window-system framebuffer checks [for jason]
v3:
  * s/inverted_y/flip_y/g [for chadv]
  * s/InvertedY/FlipY/g [for chadv]

Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-07-27 12:33:32 -07:00
Fritz Koenig
318c265160 mesa: GL_MESA_framebuffer_flip_y extension [v4]
Adds an extension to glFramebufferParameteri
that will specify if the framebuffer is vertically
flipped. Historically system framebuffers are
vertically flipped and user framebuffers are not.
Checking to see the state was done by looking at
the name field.  This adds an explicit field.

v2:
  * updated spec language [for chadv]
  * correctly specifying ES 3.1 [for chadv]
  * refactor access to rb->Name [for jason]
  * handle GetFramebufferParameteriv [for chadv]
v3:
  * correct _mesa_GetMultisamplefv [for kusmabite]
v4:
  * update spec language [for chadv]
  * s/GLboolean/bool/g [for chadv]
  * s/InvertedY/FlipY/g [for chadv]
  * s/inverted_y/flip_y/g [for chadv]
  * assert changes [for chadv]

Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-07-27 12:32:25 -07:00
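The difference between inferring orientation from the name and storing it explicitly, as 318c265160 describes, can be sketched as follows. The struct and function names are hypothetical, and treating name 0 as the window-system framebuffer is an assumption of this sketch:

```c
#include <stdbool.h>

/* Hypothetical framebuffer record for illustration. */
struct toy_framebuffer {
   unsigned name;   /* assumed: 0 means a window-system framebuffer */
   unsigned height;
   bool flip_y;     /* explicit orientation flag */
};

/* Old approach: guess the orientation from the name. */
static bool
toy_is_flipped_by_name(const struct toy_framebuffer *fb)
{
   return fb->name == 0;
}

/* New approach: read the explicit flag. */
static bool
toy_is_flipped(const struct toy_framebuffer *fb)
{
   return fb->flip_y;
}

/* Map a row to its storage row, honoring the explicit flag. */
static unsigned
toy_storage_row(const struct toy_framebuffer *fb, unsigned y)
{
   return toy_is_flipped(fb) ? fb->height - 1 - y : y;
}
```

A user framebuffer that has been marked as flipped is handled correctly by the flag-based helper, while the name-based guess gets it wrong.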
Chad Versace
7953399e59 gallium/auxiliary: Fix Autotools on Android (v2)
Problem 1: u_debug_stack_android.cpp transitively included
"pipe/p_compiler.h", but src/gallium/include was missing from the C++
include path.

Problem 2: Add -std=c++11 to AM_CXXFLAGS. Android's libbacktrace headers
require C++11, but the Android toolchain (at least in the Chrome OS SDK)
does not enable C++11 by default.

v2: Add -std=c++11.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Cc: Eric Engestrom <eric.engestrom@intel.com>
2018-07-27 11:35:56 -07:00
Topi Pohjolainen
a5889d70f2 i965/icl: Disable binding table prefetching
Gen 11 workarounds table #2056 WABTPPrefetchDisable suggests
disabling prefetching of binding tables for ICLLP A0 and B0
steppings. It fixes multiple gpu hangs in
ext_framebuffer_multisample* tests on ICLLP B0 h/w.

Anuj: Add comments and commit message.
      Add gen 11 checks in the code.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-27 11:05:04 -07:00
Caio Marcelo de Oliveira Filho
1d71981b27 glsl: use only copy_propagation_elements
Now that the elements version handles both cases, remove the
non-elements version.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-07-27 10:51:25 -07:00
Caio Marcelo de Oliveira Filho
134b5a7047 glsl: teach copy_propagation_elements to deal with whole variables
Keep information in acp_entry whether the entry is full or not, and
use the ACP in more nodes when visiting the instructions:

- add_copy: write whole variables to the ACP state (regardless of the
  type).

- visit(ir_dereference_variable *): perform the propagation here if we have a
  full candidate. Element-wise here doesn't apply because the mask
  isn't available at this point.

- visit_leave(ir_assignment *): process beyond scalar and vector, as
  the full variables might have other types.

Also import an improvement from opt_copy_propagation.cpp: if ir_call
is an intrinsic, we know the variables affected, so keep going.

v2: (all from Eric Anholt)
    Describe how acp_entry attributes are used.
    Don't do book-keeping to avoid adding repeated element to
    the dsts in write_elements().

v3: Use _mesa_set_remove_key. (Thomas Helland)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-07-27 10:51:25 -07:00
vadym.shovkoplias
399228ecad i965: Disable guardband clipping on SandyBridge for odd dimensions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104388
Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-27 10:07:44 -07:00
Dylan Baker
665fc9cf55 docs: Update release calendar, add news item, and add release notes for 18.1.5
2018-07-27 07:08:59 -07:00
Dylan Baker
2b7b5d3100 docs: Add sha-256 sums for 18.1.5
2018-07-27 07:06:55 -07:00
Dylan Baker
5cc4ee3e17 docs: add 18.1.5 release notes
2018-07-27 07:06:53 -07:00
Iago Toral Quiroga
615aaedb93 intel/compiler: fix lower conversions to account for predication
The pass can create a temporary result for the instruction and then
move from it to the original destination. However, if the original
instruction was predicated, the mov has to be predicated as well.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-07-27 14:48:29 +02:00
Samuel Pitoiset
df679b1643 radv: allocate enough space in radv_cmd_buffer_after_draw()
The driver might emit up to 4 dwords when RADV_TRACE_FILE is
used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-27 14:31:29 +02:00
Samuel Pitoiset
c08ae911d9 radv: check CS space in radv_emit_write_data_packet()
This wasn't wrong but it looks better to me like this. It's
only used for debugging purposes (ie. RADV_TRACE_FILE).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-27 14:31:27 +02:00
Samuel Pitoiset
434630f57c radv: do not emit pipeline stats flushes on compute queue
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-27 14:31:26 +02:00
Samuel Pitoiset
c118c8938c radv: reduce CB/DB meta flushes in radv_dst_access_flush()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-27 14:31:24 +02:00
Kenneth Graunke
0c4e0471f5 radv: Fix build
I renamed this pass and forgot to update radv.

Fixes: 488972222c ("i965: Combine both gl_PatchVerticesIn lowering passes.")
2018-07-26 23:57:13 -07:00
Kenneth Graunke
488972222c i965: Combine both gl_PatchVerticesIn lowering passes.
Until now, we had separate passes for lowering gl_PatchVerticesIn to
a statically known constant (for TES inputs when linked against a TCS),
and a uniform in the other cases.  Annoyingly, one had to be run before
nir_lower_system_values, and the other afterward.  This simplified the
passes, but made life painful for the callers.

This patch combines both into a single pass.  If you give it a non-zero
static count, it uses that.  If you give it Mesa state slots, it turns
it back into a built-in uniform.  Otherwise, it does nothing.

This also moves the i965 uniform lowering out to shared code.

v2: Make token arrays const.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-26 21:51:36 -07:00
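The three-way behavior of the combined pass in 488972222c can be summarized as a small decision function. The enum, function name, and parameter types below are hypothetical and only mirror the rule stated in the commit message:

```c
#include <stddef.h>

/* Hypothetical outcome of the combined lowering decision. */
enum toy_patch_vertices_lowering {
   TOY_LOWER_NONE,        /* leave gl_PatchVerticesIn untouched */
   TOY_LOWER_TO_CONSTANT, /* replace with the static count */
   TOY_LOWER_TO_UNIFORM   /* turn into a built-in uniform */
};

/* A non-zero static count wins; otherwise state slots select the
 * uniform path; with neither, nothing is done. */
static enum toy_patch_vertices_lowering
toy_choose_lowering(unsigned static_count, const int *state_tokens)
{
   if (static_count != 0)
      return TOY_LOWER_TO_CONSTANT;
   if (state_tokens != NULL)
      return TOY_LOWER_TO_UNIFORM;
   return TOY_LOWER_NONE;
}
```

Because the choice is made inside one pass, callers no longer need to schedule two separate passes on either side of the system-value lowering.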
Sagar Ghuge
29dd5dda9d i965: Expose EXT_base_instance extension in OpenGLES 3.0
The extension requires at least OpenGL 3.0 and
OpenGL ES 3.0.

Fixes two ext_base_instance tests:

arb_base_instance-baseinstance-doesnt-affect-gl-instance-id_gles3
arb_base_instance-drawarrays_gles3

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-26 17:25:35 -07:00
Bas Nieuwenhuizen
3665f66ef2 radv: Add support for ETC2 textures.
Was surprised that it is even supported by Vega.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-27 01:31:32 +02:00
Jan Vesely
1e8b8e0878 clover: Reduce wait_count in abort path.
Trigger waiter condition variable.
Passes 'events' CTS on carrizo and turks.
v2: reduce to 0

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-07-26 15:38:22 -04:00
Jan Vesely
c2942141ae clover: Don't extend illegal integer types.
It's OK to pass them in memory, which is what kernel invocation needs.
Fixes regressions since llvm r337535 ("Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"):
	scalar-arithmetic-char
	scalar-arithmetic-uchar
	scalar-arithmetic-short
	scalar-arithmetic-ushort
	scalar-comparison-char
	scalar-comparison-uchar
	scalar-comparison-short
	scalar-comparison-ushort

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-07-26 15:38:22 -04:00
Kenneth Graunke
8794fe3e30 intel/compiler: Delete dead VS intrinsic handling.
These are lowered by brw_nir_lower_vs_inputs().  If they weren't, we
would have already hit the unreachable() in emit_system_values_block().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-26 11:45:34 -07:00
Eric Anholt
deecc1ef86 v3d: Avoid the GFXH-1461 workaround if we have only Z or only S.
This seems like a sensible precaution to avoid extra draws.  It doesn't
deal with the case of a Z24S8 buffer created by the window system for an
application that happens to never use S.
2018-07-26 11:02:25 -07:00
Eric Anholt
301c32caf4 v3d: Rework the ordering of how we clear things.
First, figure out if we can just sneak the clear into the TLB clear, even
if drawing has already happened (since we have job->load and job->clear to
tell us), taking into account GFXH-1461.  For any pieces we can't TLB
clear, fall back to drawing a quad without flushing the scene.

Fixes extra scene flushes in glmark2 due to GFXH-1461.
2018-07-26 11:02:25 -07:00
Eric Anholt
ceecddfe77 v3d: Only store buffers that have been written to.
I've seen cases where a color buffer is bound, but only Z is written, and
we end up storing color.
2018-07-26 11:02:25 -07:00
Eric Anholt
d29435e7cb v3d: Track the buffers being loaded separately.
We were computing this at RCL generation time, but that means you can't
unflag the store for an invalidate_resource, or not flag the store if
writemasking is disabled.
2018-07-26 11:02:20 -07:00
Eric Anholt
47f5d158ae v3d: Rename cleared/resolve to clear/store.
These describe what the fields mean in RCL generation.  "resolve" is left
over from VC4, and sounds like MSAA resolves (which may or may not be
involved in the store we generate).
2018-07-26 11:00:34 -07:00
Eric Anholt
d934d3206e nir: Add flipping of gl_PointCoord.y in nir_lower_wpos_ytransform.
This is controlled by a new nir_shader_compiler_options flag, and fixes
dEQP-GLES3.functional.shaders.builtin_variable.pointcoord on V3D.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-26 11:00:34 -07:00
Rhys Perry
b5a56a11da docs: fix incorrect placement of the ARB_sample_locations release notes
Seems something went wrong somehow when it was pushed.

v2: combine into one list

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-26 11:49:23 +01:00
Eric Engestrom
2cc1849afb anv: drop unused local vars
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-26 10:21:03 +01:00
Eric Engestrom
2a4191bb38 anv: remove incorrect UNUSED flag
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-26 10:06:11 +01:00
Erik Faye-Lund
e68fe445f5 gallium: initialize ureg_dst::Invariant bit
When this bit was added, it seems that some initialization code
was omitted by mistake.

Since stack-variables have kinda random contents, and we don't
zero initialize the whole struct in these code-paths, we end up
getting random-ish values for this bit.

Spotted by Coverity in the following CIDs:
- 1438115
- 1438123
- 1438130

Fixes: 70425bcfe6 ("gallium: plumb invariant output attrib thru TGSI")

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-26 09:01:33 +02:00
Samuel Pitoiset
ff0d553818 radv: fix adjusting vertex fetches since 16bit support
Move the integer conversion after the fixup.

This fixes some regressions with
dEQP-VK.pipeline.vertex_input.single_attribute.mat4.as_a2r10g10b10*

Fixes: b722b29f10 ("radv: add support for 16bit input/output")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-26 08:57:43 +02:00
Samuel Pitoiset
6465bf0015 nir: remove wrong assertion in print_var_decl()
This breaks printing input/output variables with more than
4 components like mat4.

Fixes: 1beef89ad8 ("nir: prepare for bumping up max components to 16")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-26 08:57:38 +02:00
Marek Olšák
ce8e6b970b ac: fix typo DSL_SEL -> DST_SEL
2018-07-26 01:45:47 -04:00
Marek Olšák
7039d9299e radeonsi: update a comment about cache behavior
2018-07-26 01:45:47 -04:00
Kenneth Graunke
37c3efca29 intel: Make the decoder just store addresses for bases, not buffers.
The various base addresses are simply addresses.  There may or may not
be a buffer located at those addresses.  So, it doesn't make much sense
to request one.  Just save the raw address so we can add it later, when
asking about BOs at the final <base + offset> address.

Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-25 14:43:54 -07:00
Kenneth Graunke
933223db3c intel: Make the decoder handle STATE_BASE_ADDRESS not being a buffer.
Normally, i965 programs STATE_BASE_ADDRESS every batch, and puts all
state for a given base in a single buffer.

I'm working on a prototype which emits STATE_BASE_ADDRESS only once at
startup, where each base address is a fixed 4GB region of the PPGTT.
State may live in many buffers in that 4GB region, even if there isn't
a buffer located at the actual base address itself.

To handle this, we need to save the STATE_BASE_ADDRESS values across
multiple batches, rather than assuming we'll see the command each time.
Then, each time we see a pointer, we need to ask the driver for the BO
map for that data.  (We can't just use the map for the base address, as
state may be in multiple buffers, and there may not even be a buffer
at the base address to map.)

v2: Fix things caught in review by Lionel:
 - Drop bogus bind_bo.size check.
 - Drop "get the BOs again" code - we just get the BOs as needed
 - Add a message about interface descriptor data being unavailable

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-25 14:43:47 -07:00
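The approach described in 37c3efca29 and 933223db3c can be sketched roughly as follows. This is an illustrative Python model, not the decoder's actual C code: the `Decoder` class, the `get_bo` callback, and `make_lookup` are hypothetical names. It only shows the idea of remembering a raw base address across batches and asking for a buffer at the final <base + offset> address.

```python
class Decoder(object):
    """Keeps base addresses across batches and resolves pointers lazily."""

    def __init__(self, get_bo):
        # get_bo(address) -> (bo_start, bo_bytes) or None; hypothetical
        # stand-in for the driver callback that maps a BO.
        self.get_bo = get_bo
        self.dynamic_state_base = 0

    def handle_state_base_address(self, base):
        # Save the raw address only; there may be no buffer located here.
        self.dynamic_state_base = base

    def read_state(self, offset, length):
        # Ask for the BO at the final <base + offset> address.
        address = self.dynamic_state_base + offset
        bo = self.get_bo(address)
        if bo is None:
            return None  # state data unavailable
        bo_start, bo_bytes = bo
        begin = address - bo_start
        return bo_bytes[begin:begin + length]


def make_lookup(buffers):
    """Build a get_bo callback over a list of (start, bytes) pairs."""
    def lookup(address):
        for start, data in buffers:
            if start <= address < start + len(data):
                return (start, data)
        return None
    return lookup


decoder = Decoder(make_lookup([(0x1000, b"abcdefgh")]))
decoder.handle_state_base_address(0x0)   # no buffer lives at the base itself
state = decoder.read_state(0x1002, 3)
```

Because the base is stored as a plain number, a later batch that never re-emits STATE_BASE_ADDRESS can still resolve its pointers.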
Eric Engestrom
aa59f9c8bc anv: don't crash on vkDestroyDevice(NULL)
CovID: 1438132
Fixes: a99c9e63a0 "anv: finish the binding_table_pool on destroyDevice when use_softpin"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-07-25 21:04:30 +01:00
Eric Engestrom
270a44040c vulkan/wsi: fix incorrect assignment in assert()
CovID: 1438113, 1438118, 1438119, 1438121
Fixes: dc1d10b396 "anv,radv: Add support for VK_KHR_get_display_properties2"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-25 20:55:35 +01:00
Eric Engestrom
bbf8316fcb anv: fix python whitespace warning
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-25 20:55:35 +01:00
Eric Engestrom
e0347581f3 anv: cleanup python imports
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-25 20:55:35 +01:00
Eric Engestrom
ce7348507e anv: remove unnecessary semicolons in python
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-25 20:55:35 +01:00
Kenneth Graunke
a2c63cae14 st/nir: Fix st_nir_opts() prototype.
This wasn't updated for the new scalar ISA parameter.  It worked anyway
because all the function's callers live in the same file, so it found
the correct function.  Tim made this external for the new st prog_to_nir
translator, which got reverted, but which I'd like to land eventually.

So, fix the prototype.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-07-25 10:19:41 -07:00
Lionel Landwerlin
b21b38c46c intel: tools: dump: only store device id on success
We might fail on master node drm fd because we won't have the right
permissions.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-07-25 16:53:06 +01:00
Gert Wollny
82fc6bdebf r600: Scale integer valued texture border colors to float (v2)
It seems the hardware always expects floating point border color values
[0,1] for unsigned, and [-1,1] for signed texture component, regardless
of pixel type, but the border colors are passed according to texture
component type. Hence, before submitting the border color, convert and
scale it to these ranges accordingly.

This doesn't seem to work for textures with 32 bit integer components
though, here, it seems that the border color is always set to zero,
regardless of the BORDER_COLOR_TYPE state set in Q_TEX_SAMPLER_WORD0_0.

v2: Simplify logic as suggested by Roland Scheidegger

Fixes:
  dEQP-GLES31.functional.texture.border_clamp.formats.compressed*
  dEQP-GLES31.functional.texture.border_clamp.formats.r* (non 32 bit integer)
  dEQP-GLES31.functional.texture.border_clamp.per_axis_wrap_mode.texture_2d*
 and a number of piglits out of
  piglit run gpu -t texture -t gather -t formats

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-25 08:58:33 +02:00
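The scaling that 82fc6bdebf describes can be illustrated with a small sketch. The exact formula used by the r600 driver is not given in the commit message, so the normalization below (divide by the largest representable value, then clamp) is an assumption, and `scale_border_color` is a hypothetical name.

```python
def scale_border_color(value, bits, signed):
    """Map an integer border color component to the float range the
    hardware is said to expect: [0, 1] unsigned, [-1, 1] signed.

    Assumed normalization, not taken from the driver source.
    """
    if signed:
        max_value = (1 << (bits - 1)) - 1
        low = -1.0
    else:
        max_value = (1 << bits) - 1
        low = 0.0
    scaled = value / float(max_value)
    # Clamp so that e.g. -128 in an 8-bit signed format maps to -1.0.
    return max(low, min(scaled, 1.0))


full_unsigned = scale_border_color(255, 8, signed=False)
min_signed = scale_border_color(-128, 8, signed=True)
```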
Jason Ekstrand
b3b170ade9 nir: Add a couple of iand/ior optimizations
Spotted in a shader in Batman: Arkham City.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-24 20:39:43 -07:00
Jordan Justen
2b3064c073 i965, anv: Use INTEL_DEBUG for disk_cache driver flags
Since various options within INTEL_DEBUG could impact code generation,
we need to set the disk cache driver_flags parameter based on the
INTEL_DEBUG flags in use.

An example that will affect the program generated by i965 is the
INTEL_DEBUG=nocompact option.

The DEBUG_DISK_CACHE_MASK value is added to mask the settings of
INTEL_DEBUG that can affect program generation.

v2:
 * Use driver_flags (Tim)
 * Also update Anvil (Jason)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-24 16:17:28 -07:00
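A rough model of the masking in 2b3064c073, written in Python for brevity. The bit positions and the `DEBUG_PERF` / `DEBUG_NO_COMPACTION` constants here are hypothetical; only the name DEBUG_DISK_CACHE_MASK and the idea of masking INTEL_DEBUG come from the commit message.

```python
DEBUG_NO_COMPACTION = 1 << 0   # hypothetical bit: changes generated code
DEBUG_PERF = 1 << 1            # hypothetical bit: logging only

# Only bits that can affect program generation belong in the mask.
DEBUG_DISK_CACHE_MASK = DEBUG_NO_COMPACTION


def disk_cache_driver_flags(intel_debug):
    """Return the part of INTEL_DEBUG that must feed the cache key."""
    return intel_debug & DEBUG_DISK_CACHE_MASK


plain = disk_cache_driver_flags(DEBUG_PERF)
nocompact = disk_cache_driver_flags(DEBUG_PERF | DEBUG_NO_COMPACTION)
```

Two runs whose flags differ only in bits outside the mask share cache entries, while a run with a code-affecting option gets a different driver_flags value.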
Jordan Justen
69a686b0ae i965, anv: Add extra unused character in disk_cache renderer temp string
This extra character should not be used by snprintf, but we make it
available to verify that we printed the exact number we wanted, and
didn't overflow.

v2:
 * Also update Anvil

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-24 16:17:25 -07:00
Marek Olšák
7d2e6edd89 mesa: allow indirect draws with the default VAO and compatibility profile
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-24 16:00:09 -04:00
Danylo Piliaiev
49ed075615 mesa: Fix copy-paste error in ConservativeRasterDilateRange initialization
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4580617509 ("mesa: add support for nvidia conservative rasterization extensions")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-24 20:44:34 +01:00
Jason Ekstrand
f214baf72f nir/serialize: Alloc constants off the variable
nir_sweep assumes that constants are always allocated off the variable
to which they belong.  Violating this assumption causes them to get
freed early and leads to use-after-free bugs.

Fixes: 120da00975 "nir: add serialization and deserialization"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107366
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-07-24 12:34:07 -07:00
Karol Herbst
7f95564a22 nir: rename f2f16_undef to f2f16
we need rounding modes on other conversions involving floats and it is easier
to rename f2f16_undef than renaming all the other ones.

v2: rebased on master

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-24 20:40:05 +02:00
Karol Herbst
2083cfb6eb nir: add builtin builder
also move over some of the GLSL builtins that we will need for implementing
some OpenCL builtins

v2: replace NIR_IMM_FP by nir_imm_floatN_t in ported code
    fix up changes caused by swizzle rework

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-24 20:40:05 +02:00
Rob Clark
9e90708d5d nir/spirv: import OpenCL.std.h
Lightly edited to be valid 'C' code.

Is there a bug open to fix this upstream?

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-24 20:40:05 +02:00
Marek Olšák
98ab24fdab radeonsi: handle SI_FORCE_FAMILY early
before LLVM target machines are created
2018-07-24 14:21:29 -04:00
Mathieu Bridon
9ebd8372b9 python: Use range() instead of xrange()
Python 2 has a range() function which returns a list, and an xrange()
one which returns an iterator.

Python 3 lost the function returning a list, and renamed the function
returning an iterator as range().

As a result, using range() makes the scripts compatible with both Python
versions 2 and 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-24 11:07:04 -07:00
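A minimal illustration of what 9ebd8372b9 relies on (not code from the Mesa scripts; `squares` is a hypothetical name): a for loop consumes range() the same way under both interpreters, whatever type it returns.

```python
def squares(n):
    """Return the squares of 0..n-1, iterating with range()."""
    result = []
    # Python 2: range(n) builds a list. Python 3: it is evaluated lazily.
    # The loop below behaves identically in both cases.
    for i in range(n):
        result.append(i * i)
    return result


values = squares(4)
```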
Mathieu Bridon
022d2a381d python: Better use iterators
In Python 2, iterators had a .next() method.

In Python 3, instead they have a .__next__() method, which is
automatically called by the next() builtin.

In addition, it is better to use the iter() builtin to create an
iterator, rather than calling its __iter__() method.

These were also introduced in Python 2.6, so using them makes the script
compatible with Python 2 and 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-24 11:07:04 -07:00
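The builtins mentioned in 022d2a381d can be shown with a short, generic example (the variable names are illustrative only):

```python
values = [10, 20, 30]

it = iter(values)    # rather than values.__iter__()
first = next(it)     # rather than it.next() (Python 2 only)
rest = list(it)      # consume whatever is left
```

Both iter() and next() behave the same on Python 2.6+ and Python 3.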
Mathieu Bridon
01da2feb0e python: Better sort dictionary keys/values
In Python 2, dict.keys() and dict.values() both return a list, which can
be sorted in two ways:

* l.sort() modifies the list in-place;
* sorted(l) returns a new, sorted list;

In Python 3, dict.keys() and dict.values() do not return lists any more,
but view objects. View objects do not have a .sort() method.

This commit moves the build scripts to using sorted() on dict keys and
values, which makes them compatible with both Python 2 and Python 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-24 11:07:04 -07:00
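As a sketch of the pattern adopted in 01da2feb0e (the dictionary contents are made up for illustration):

```python
type_sizes = {"vec4": 16, "float": 4, "mat4": 64}

# sorted() accepts any iterable, so it works on the list returned by
# Python 2 and on the object returned by Python 3 alike.
names = sorted(type_sizes.keys())
sizes = sorted(type_sizes.values())
```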
Mathieu Bridon
5530cb1296 python: Better iterate over dictionaries
In Python 2, dictionaries have 2 sets of methods to iterate over their
keys and values: keys()/values()/items() and iterkeys()/itervalues()/iteritems().

The former return lists while the latter return iterators.

Python 3 dropped the methods which return lists, and renamed the methods
returning iterators to keys()/values()/items().

Using those names makes the scripts compatible with both Python 2 and 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-24 11:07:04 -07:00
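The iteration style that 5530cb1296 moves to can be illustrated like this (the `opcodes` table is a made-up example, not taken from the scripts):

```python
opcodes = {"fadd": 2, "fneg": 1}

pairs = []
# items() exists in both Python 2 and 3; iteritems() is Python 2 only.
for name, num_inputs in opcodes.items():
    pairs.append((name, num_inputs))
pairs.sort()
```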
Mathieu Bridon
fdf946ffbf python: Stop using the string module
Most functions in the builtin string module also exist as methods of
string objects.

Since the functions were removed from the string module in Python 3,
using the instance methods directly makes the code compatible with both
Python 2 and Python 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-24 11:07:04 -07:00
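A short example of the substitution made in fdf946ffbf, using an arbitrary sample string:

```python
name = "gl_point_coord"

upper = name.upper()        # instead of string.upper(name)
parts = name.split("_")     # instead of string.split(name, "_")
joined = "-".join(parts)    # instead of string.join(parts, "-")
```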
Mathieu Bridon
1d209275c2 python: Better check for keys in dicts
Python 3 lost the dict.has_key() method. Instead it requires using the
"in" operator.

This is also compatible with Python 2.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-24 11:07:04 -07:00
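The membership test from 1d209275c2, shown on a made-up dictionary:

```python
extensions = {"GL_ARB_sample_locations": True}

# "in" replaces extensions.has_key(...), which Python 3 no longer has.
present = "GL_ARB_sample_locations" in extensions
absent = "GL_EXAMPLE_missing" not in extensions
```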
Kenneth Graunke
9b34742495 intel: Make the disassembler take a const pointer to the assembly.
Disassembling doesn't modify the assembly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-24 11:04:56 -07:00
Andres Gomez
3647b16675 travis: manually generate sys/syscall.h
Until now, the needed bits were wrongly included in linux/memfd.h

Since Travis' sys/syscall.h doesn't provide the SYS_memfd_create, we
generate that header manually, including the needed bits to avoid
compilation problems, as the ones observed after:
3228335b55 ("intel: aubinator: handle GGTT mappings")

v2: replace fixes commit with the first direct user of
    syscall.h (Emil).

Fixes: 3228335b55 ("intel: aubinator: handle GGTT mappings")
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-24 19:52:11 +03:00
Andres Gomez
7665a05a3a docs: update calendar to match the 18.2 plan with the one announced
Additionally, I've extended the 18.1 cycle by one more release,
tentatively assigned to Dylan, due to the ~2 weeks delay for 18.2.

Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-24 19:49:08 +03:00
Andres Gomez
1391892e73 docs: move releases from Fridays to Wednesdays
As discussed at:
https://lists.freedesktop.org/archives/mesa-dev/2018-March/188525.html

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Carl Worth <cworth@cworth.org>
Cc: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-24 19:48:01 +03:00
Andres Gomez
b0e49a9e7a docs: correct typo in the submitting patches instructions
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-24 19:47:40 +03:00
Bas Nieuwenhuizen
28b8c18d84 radv: Still enable inmemory & API level caching if disk cache is not enabled.
That we don't have a background disk cache does not mean we should
prevent the app caching anything.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-24 18:06:41 +02:00
Jose Fonseca
04d77d53aa gallium/tests: Don't ignore S3TC errors.
Now that we do full S3TC decompression they should no longer fail.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-24 15:58:14 +01:00
Harish Krupo
fd734608c3 egl: Fix missing clamping in eglSetDamageRegionKHR
Clamp the x and y co-ordinates of the rectangles.

v2: Clamp width/height after converting to co-ordinates
    (Ilia Mirkin)

Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-24 14:46:21 +01:00
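The v2 note in fd734608c3 (clamp width/height after converting to coordinates) can be sketched as below. This is a Python illustration of the idea only; the real change is in Mesa's C code and may differ in detail, and `clamp_rect` is a hypothetical helper.

```python
def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def clamp_rect(x, y, width, height, surf_width, surf_height):
    """Clamp a damage rectangle to the surface.

    The rectangle is first converted to corner coordinates, both corners
    are clamped, and only then are width and height recomputed.
    """
    x1 = clamp(x, 0, surf_width)
    y1 = clamp(y, 0, surf_height)
    x2 = clamp(x + width, 0, surf_width)
    y2 = clamp(y + height, 0, surf_height)
    return (x1, y1, x2 - x1, y2 - y1)


rect = clamp_rect(-10, 5, 30, 200, 100, 100)
```

Clamping the corners rather than the raw width and height keeps the visible part of a rectangle that starts off-surface.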
Erik Faye-Lund
c3eaf8fe57 forward precise-flag if supported
New versions of virglrenderer support the precise-flag, so let's
forward it from TGSI if that's the case.

This fixes a few dEQP-GLES31 tests:
- dEQP-GLES31.functional.tessellation.common_edge.quads_equal_spacing_precise
- dEQP-GLES31.functional.tessellation.common_edge.quads_fractional_even_spacing_precise
- dEQP-GLES31.functional.tessellation.common_edge.quads_fractional_odd_spacing_precise
- dEQP-GLES31.functional.tessellation.common_edge.triangles_equal_spacing_precise
- dEQP-GLES31.functional.tessellation.common_edge.triangles_fractional_even_spacing_precise
- dEQP-GLES31.functional.tessellation.common_edge.triangles_fractional_odd_spacing_precise

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-24 10:27:27 +02:00
Marek Olšák
6853862a58 radeonsi: fix pk2h breakage
2018-07-23 22:29:59 -04:00
Marek Olšák
86b52d4236 radeonsi: reduce LDS stalls by 40% for tessellation
40% is the decrease in the LGKM counter (which includes SMEM too)
for the GFX9 LSHS stage.

This will make the LDS size slightly larger, but I wasn't able to increase
the patch stride without corruption, so I'm increasing the vertex stride.
2018-07-23 20:23:52 -04:00
Tom Stellard
0866edede0 radeonsi: Add debug option to enable LLVM GlobalISel (v2)
R600_DEBUG=gisel will tell LLVM to use GlobalISel rather than
SelectionDAG for instruction selection.

v2: mareko: move the helper to src/amd/common

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <tstellar@redhat.com>
2018-07-23 20:23:48 -04:00
Jason Ekstrand
820d5e51b7 intel/compiler: Account for built-in uniforms in analyze_ubo_ranges
The original pass only looked for load_uniform intrinsics but there are
a number of other places that could end up loading a push constant.  One
obvious omission was images which always implicitly use a push constant.
Legacy VS clip planes also get pushed into the shader.  This fixes some
new Vulkan CTS tests that test random combinations of bindings and, in
particular, test lots of UBOs and images together.

Cc: mesa-stable@lists.freedesktop.org
Cc: Kenneth Graunke <kenneth@whitecape.org>
2018-07-23 15:28:17 -07:00
Daniel Schürmann
62024fa775 radv: enable VK_KHR_16bit_storage extension / 16bit storage features
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:26 +02:00
Daniel Schürmann
4d0b02bb5a ac: add support for 16bit load_push_constant
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Daniel Schürmann
b722b29f10 radv: add support for 16bit input/output
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Daniel Schürmann
87989339a0 nir: add 16bit type information to glsl types
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Daniel Schürmann
7e7ee82698 ac: add support for 16bit buffer loads
v2: Fixed dvec3 loads (bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Daniel Schürmann
a6a21e651d ac: add support for 16bit UBO loads
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Daniel Schürmann
3109c5257b ac: add support for 16bit ssbo stores
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Daniel Schürmann
f582367d49 ac: add 16bit conversion operations
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Dave Airlie
d73f1026b4 r600: enable tess_input_info for TES
There might be a nicer way to do this, but this is at least correct.

This fixes:
KHR-GL44.tessellation_shader.single.max_patch_vertices
KHR-GL44.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2018-07-23 21:11:35 +01:00
Dave Airlie
760622c328 docs/features: fix virgl gles3.1 entries
2018-07-24 06:10:46 +10:00
Roland Scheidegger
09828feab0 draw: force draw pipeline if there's more than 65535 vertices
The pt emit path can only handle 65535 - the number of vertices is
truncated to a ushort, resulting in a too small buffer allocation, which
will crash.

Forcing the pipeline path looks suboptimal, then again this bug has
probably been there ever since GS was supported, so it seems it's not
happening often. (Note that the vertex_id in the vertex header is 16
bit too, however this is only used by the draw pipeline, and it denotes
the emit vertex nr, and that uses vbuf code, which will only emit smaller
chunks, so should be fine I think.)
Other solutions would be to simply allow 32bit counts for vertex
allocation, however 65535 is already larger than this was intended for
(the idea being it should be more cache friendly). Or could try to teach
the pt emit path to split the emit in smaller chunks (only the non-index
path can be affected, since gs output is always linear), but it's a bit
tricky (we don't know the primitive boundaries up-front).

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=107295

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-07-23 22:07:07 +02:00
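The failure mode described in 09828feab0 comes down to 16-bit truncation. A small Python model of it follows; the function names are hypothetical and the real check lives in the draw module's C code.

```python
USHORT_MAX = 0xFFFF  # largest value a 16-bit unsigned count can hold


def truncate_to_ushort(count):
    """Model what happens when a vertex count is stored in a ushort."""
    return count & USHORT_MAX


def must_use_pipeline(vertex_count):
    """Counts above 65535 cannot go through the pt emit path."""
    return vertex_count > USHORT_MAX


allocated = truncate_to_ushort(70000)   # far fewer slots than needed
forced = must_use_pipeline(70000)
```

With 70000 vertices the truncated count is much smaller than the real one, which is the undersized allocation the commit message refers to.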
Dave Airlie
51f67eeb21 docs/features: note ARB_copy_image is working on virgl
2018-07-24 06:06:15 +10:00
Dave Airlie
83332618c1 Revert "virgl: remove unused stride-arguments"
This reverts commit dc938b8398.

This adds warnings in vtest, and possibly breaks it.
2018-07-24 06:03:20 +10:00
Dave Airlie
69c2cd0b14 docs/features: note ssbo and atomic counters done for virgl
2018-07-24 05:56:35 +10:00
Dave Airlie
958b57ac82 virgl: add initial shader_storage_buffer_object support. (v2)
This adds the guest side support for ARB_shader_storage_buffer_object.

Co-authors: Gurchetan Singh <gurchetansingh@chromium.org>

v2: move to using separate maximums
(fixup macros)

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-07-24 05:54:21 +10:00
Jason Ekstrand
e4d346c86d nir: Add a couple trivial abs optimizations
Spotted in a shader in Batman: Arkham City.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-23 10:48:21 -07:00
Caio Marcelo de Oliveira Filho
52d831ff83 glsl: remove delegating constructors to allow build with C++98
Delegating constructors are a C++11 feature, so this was breaking when
compiling with C++98. Change the copy_propagation_state() calls that
used the convenience constructor to use a static member function
instead.

Since copy_propagation_state is expected to be heap allocated, this
change is a good fit.

Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107305
2018-07-23 10:34:43 -07:00
Eric Anholt
6b73a97f84 v3d: Implement a small immediates optimization, based on VC4's.
We can do one per instruction, and we have to be careful not to overwrite
raddr_b, but this greatly reduces the pressure on uniform loads
(particularly around ldvpm/stvpm instructions).

total instructions in shared programs: 90768 -> 88220 (-2.81%)
instructions in affected programs:     82711 -> 80163 (-3.08%)
2018-07-23 10:21:43 -07:00
Eric Anholt
79e0f042bc v3d: Return an invalid src number if asked for a missing implicit uniform.
Sometimes when iterating over sources, we might want to check if it's the
implicit one.  We wouldn't want to match on a non-implicit src using this
function.
2018-07-23 10:21:43 -07:00
Eric Anholt
f2ea936f48 v3d: Skip emitting texture config parameter 2 if it's just the defaults.
shader-db:
total instructions in shared programs: 91275 -> 90768 (-0.56%)
instructions in affected programs:     20702 -> 20195 (-2.45%)
2018-07-23 10:21:43 -07:00
Eric Anholt
421e99d777 v3d: Update an XXX comment for a path we handled in HW on V3D 4.x.
2018-07-23 10:21:43 -07:00
Eric Anholt
e7ae900341 v3d: Switch to using the new SFU instructions on V3D 4.x.
These instructions let us write directly to the phys regfile, instead of
just R4.  That lets us avoid moving out of R4 to avoid conflicting with
other SFU results, and to avoid conflicting with thread switches.

There is still an extra instruction of latency, which is not represented
in the scheduler at the moment.  If you use the result before it's ready,
the QPU will just stall, unlike the magic R4 mode where you'd read the
previous value.  That means that the following shader-db results aren't
quite representative (since we now cause some stalls instead of emitting
nops), but they're impressive enough that I'm happy with the change.

total instructions in shared programs: 95669 -> 91275 (-4.59%)
instructions in affected programs:     82590 -> 78196 (-5.32%)
2018-07-23 10:21:43 -07:00
Eric Anholt
58c1d3860f v3d: Add QPU pack/unpack for the new SFU instructions.
These instructions allow writing the result to any register, instead of a
special writeback to r4.
2018-07-23 10:21:43 -07:00
Eric Anholt
cdfa99657d v3d: Fix the name of the "flpop" operation.
Noticed while trying to sort a new op into the appropriate place to match
the documentation.
2018-07-23 10:21:43 -07:00
Eric Anholt
91e24e5718 v3d: Print the instruction we're testing in the QPU disasm/pack round-trip.
If we fail initial disassembly, it's good to know what instruction it was
that failed.
2018-07-23 10:21:42 -07:00
Eric Anholt
a1beb333d8 v3d: Drop unused vir_SAT() operation.
We lower saturates in NIR.
2018-07-23 10:21:42 -07:00
Eric Anholt
8dfc6ee317 v3d: Rotate through registers to improve post-RA scheduling options.
Similarly to VC4's implementation, by not picking r0 immediately upon
freeing it, we give the scheduler more of a chance to fit later writes in
earlier.  I'm not clear on whether there's any real cost to picking phys
over accumulators, so keep that behavior for now.

shader-db:
total instructions in shared programs: 96831 -> 95669 (-1.20%)
instructions in affected programs:     77254 -> 76092 (-1.50%)
2018-07-23 10:21:42 -07:00
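The rotation idea in 8dfc6ee317 can be modeled as a round-robin pick. This Python sketch is an assumption about the general technique, not a transcription of the V3D or VC4 register allocator, and `pick_register` is a hypothetical name.

```python
def pick_register(free_regs, next_start, num_regs):
    """Return (register, new_next_start), scanning from next_start.

    Starting the scan after the last register handed out means a
    just-freed low register is not reused immediately.
    """
    for i in range(num_regs):
        reg = (next_start + i) % num_regs
        if reg in free_regs:
            return reg, (reg + 1) % num_regs
    return None, next_start


free_regs = {0, 1, 2, 3}
first, cursor = pick_register(free_regs, 0, 4)        # takes r0
free_regs.discard(first)
free_regs.add(first)                                  # r0 is freed right away
second, cursor = pick_register(free_regs, cursor, 4)  # moves on to r1
```

Always choosing the lowest free register would hand out r0 again here, tying the next write to the value that was just consumed.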
Eric Anholt
1fb31819ae v3d: Allow reading from physical regs written in the previous instruction.
This restriction existed in V3D 2.x, but lifting it was a major change in
3.x.

shader-db results:
total instructions in shared programs: 98117 -> 96831 (-1.31%)
instructions in affected programs:     48520 -> 47234 (-2.65%)
2018-07-23 10:21:23 -07:00
Eric Engestrom
e6e22e4207 anv: remove unnecessary runtime copy of static string
It's actually also a bit safer, since now the compiler will warn if
the string is larger than the `.name` array.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-23 17:56:08 +01:00
Alex Smith
54f8f1545f anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BIT
According to the spec, these should apply to all read/write access
types (so would be equivalent to specifying all other access types
individually). Currently, they were doing nothing.

v2: Handle VK_ACCESS_MEMORY_WRITE_BIT in dstAccessMask.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-23 15:29:43 +01:00
Erik Faye-Lund
dc938b8398 virgl: remove unused stride-arguments
The IOCTLs don't pass these along, so computing them in the first
place is kinda pointless.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-07-23 11:21:09 +01:00
Samuel Pitoiset
6c58bc8d9c radv: print a big warning when RADV_TRACE_FILE is set
Users shouldn't use this debugging option except when we
ask them to!

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 11:34:42 +02:00
Samuel Pitoiset
6e32d9e7b0 radv: fix a memleak for merged shaders on GFX9
modules[i] can be NULL for merged shaders but we have to
free the NIR code. radv_can_dump_shader_stats() already handles
modules[i] being NULL, so there is no need to check it twice.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 11:34:39 +02:00
Jason Ekstrand
d0ee0a0a5d intel/blorp: Fix blits to R8G8B8_UNORM_SRGB sRGB harder
The first fix attempt contained a nasty typo which somehow didn't get
caught in review.  It also didn't work as intended because the sRGB
conversion was happening but then throwing away all but the red channel
because it didn't know it was RGB.  Really, it's my fault for trying to
fix a bug without first writing tests.  I've now written tests and they
pass with this change. :)

Fixes: 11712b9ca1 "intel/blorp: Fix blits to R8G8B8_UNORM_SRGB"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-23 00:36:39 -07:00
Jason Ekstrand
abd629eb3d anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV
We've had several broadwell hangs that have come down to this bit just
not working correctly.  Most recently, we've had a pile of hangs
reported with apps running under DXVK:

https://github.com/doitsujin/dxvk/issues/469

Instead, use the bit that doesn't try to imply weird D3D coherency
things and just force-enables the PS like we want.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-22 23:43:19 -07:00
Jason Ekstrand
b99493c628 anv: Properly handle GetImageSubresourceLayout on complex images
We support mipmapped and arrayed linear images so we need to support
vkGetImageSubresourceLayout on them.  Fortunately, it's just a trivial
call into ISL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-22 23:24:10 -07:00
Timothy Arceri
78f391d343 radeonsi/nir: make use of nir_lower_load_const_to_scalar()
This allows NIR to CSE more operations. LLVM does this also so the
impact is limited, however doing this in NIR allows other opts to
make progress. For example some loops in Civilization Beyond Earth
shaders are unrolled.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-23 09:48:51 +10:00
Ilia Mirkin
257128079c anv/gen9: expose VK_EXT_post_depth_coverage
Note that the use of ICMS_INNER_CONSERVATIVE disagrees with the GL driver.
Perhaps it's more performant than ICMS_NORMAL and is otherwise permitted?
Not sure, so I left it as-is.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-22 14:56:44 -07:00
Ilia Mirkin
768f143667 spirv: add support for SPV_KHR_post_depth_coverage
Allow the capability to be exposed, and convert the new execution mode
into fs state.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-22 14:56:36 -07:00
Mauro Rossi
6cbbd5b4f8 android: util/disk_cache: fix building errors in gallium drivers
This patch applies the necessary changes in Android.common.mk
as per automake rules, to avoid the following build error:

external/mesa/src/gallium/drivers/nouveau/nouveau_screen.c:159:8:
error: implicit declaration of function 'disk_cache_get_function_timestamp'
is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   if (disk_cache_get_function_timestamp(nouveau_disk_cache_create,
       ^
1 error generated.

(v2) -DENABLE_SHADER_CACHE Android cflag is kept, to leave the AS-IS capability enabled

Fixes: cc10b34 ("util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-21 12:06:38 +02:00
Chih-Wei Huang
e7ffd3fb08 Android: fix a missing nir_intrinsics.h error
The commit 76dfed8ae2 changed nir_intrinsics.h to be a generated
header, but the corresponding dependency was not updated for Android.
It causes the error:

[  0% 19/4336] target  C: libmesa_pipe_radeonsi <= external/mesa/src/gallium/drivers/radeonsi/si_debug.c
...
In file included from external/mesa/src/gallium/drivers/radeonsi/si_debug.c:25:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:28:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_shader.h:140:
In file included from external/mesa/src/amd/common/ac_llvm_build.h:30:
external/mesa/src/compiler/nir/nir.h:966:10: fatal error: 'nir_intrinsics.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 76dfed8ae2 ("nir: mako all the intrinsics")
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>
2018-07-21 08:50:23 +02:00
Bas Nieuwenhuizen
e1febbefe8 nir: Fix end of function without return warning/error.
There always is a continue block, so let us just do unreachable.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 8cacf38f52 "nir: Do not use continue block after removing it."
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107312
2018-07-20 22:27:39 +02:00
Danylo Piliaiev
d24c35c3fb st: Sweep NIR after linking phase to free held memory
After optimization passes and many transformations, most of the memory
NIR holds is garbage that was only being freed after shader deletion.
Freeing it at the end of linking saves memory, which is useful
when a lot of complex shaders are being compiled.
The common case for this issue is a 32-bit game running under Wine.

The cost of the optimization is around 3-5% of compilation speed
with complex shaders.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-20 11:26:12 -07:00
Eric Anholt
945524ba0e st/dri: Don't require a dri_format for image creation.
Nothing in EGL_KHR_gl_image.txt seems to let us deny creation based on
formats, and doing so causes many failures in
dEQP-EGL.functional.image.api.*

The NONE value we were protecting from only gets looked at in the
__DRI_IMAGE_ATTRIB_FORMAT and __DRI_IMAGE_ATTRIB_FOURCC queries, which are
used from wayland and gbm (which throw an error cleanly on unknown format)
and DMABUF export.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-20 11:26:12 -07:00
Eric Anholt
f6750456c5 egl: Refuse EGL_MESA_image_dma_buf_export if we don't have a DRM fourcc.
The EGL CTS expects that you can make images from all sorts of things,
including things like z16 and s8, which we don't have DRM fourccs for.
Just return an error when trying to export one of those.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-20 11:26:12 -07:00
Eric Anholt
a221f9709e v3d: Fix incorrect handling of two fences created back-to-back.
Recreating our context's syncobj with ALREADY_SIGNALED meant that if you
created two fences in a row, then waiting on the second would succeed
immediately.  Instead, export a sync file in the gallium fence (since we
don't have a syncobj clone ioctl), and just create a new syncobj to wait
on whenever we need to.

Noticed while debugging
dEQP-GLES3.functional.fence_sync.client_wait_sync_finish
2018-07-20 11:11:29 -07:00
Eric Anholt
fc28692a5a v3d: Fix the timeout value passed to drmSyncobjWait().
The API wants an absolute time, so we need to go add gallium's argument to
CLOCK_MONOTONIC.
2018-07-20 11:11:29 -07:00
Eric Anholt
4f04bd68cf v3d: Fix drmSyncobjWait() return value checking even more.
It tends to return >0 in the success case (I think the value is something
like "how much of the timeout remained").  Fixes
dEQP-GLES3.functional.fence_sync.client_wait_sync_finish
2018-07-20 11:11:29 -07:00
Eric Anholt
2f90879a34 v3d: Use the list_first_entry/list_last_entry macros. 2018-07-20 11:11:29 -07:00
Eric Anholt
d0e53373e5 v3d: Move BO cache counting to dump time instead of cache management.
This is one less way to get the dump stats wrong.
2018-07-20 11:11:29 -07:00
Eric Anholt
7d6aef6fa5 v3d: Reduce the stale BO reclamation spam with dump_stats set.
This was obviously meant to be when we were actually freeing a BO, not
just when there was at least one BO in the list.
2018-07-20 11:11:29 -07:00
Eric Anholt
5d11094db1 v3d: Respect a sampler view's first_layer field.
Fixes texturing from EGL images created from cubemap faces, as in
dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture
2018-07-20 11:11:29 -07:00
Sonny Jiang
c6737756ad radeonsi: emit_spi_map packets optimization
v2: marek: remove an empty line before break;
    rename reg_val_seq -> spi_ps_input_cntl
    "type * x" -> "type *x"

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-20 13:50:26 -04:00
Gert Wollny
4d094993c3 virgl: Expose GL_ARB_copy_image if host supports it
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-07-20 19:15:12 +02:00
Gert Wollny
0bde9739c0 virgl: Allow RGB32* textures only as buffer objects
When requesting a texture of the internal format GL_RGB32F Gallium will
try to allocate a renderable texture and returns RGBA32F or RGBX32F, but
when one requests GL_RGB32I or GL_RGB32UI the corresponding 3-component
texture will be returned. This leads to problems later, when one wants
to use glCopyImageSubData to copy data between these textures that should
be compatible, but given the way virgl and Gallium handle this the latter
fails with an assertion, because the per-texel bit size is different.

By allowing the GL_RGB32* only for texture buffers these problems are avoided
without losing the ARB_tbo_rgb32 extension (thanks Ilia Mirkin).

v2: Correct spelling (Gurchetan Singh)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-07-20 19:12:49 +02:00
Lionel Landwerlin
feb43ef674 intel: tools: dump: protect against multiple calls on destructor
When running gdb, make sure to pass the LD_PRELOAD variable only to
the executed program, not the debugger. Otherwise the debugger will
run the preloaded constructor/destructor too and bad things will
happen.

Suggested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-20 17:36:56 +01:00
Lionel Landwerlin
2a9069eb97 intel: tools: dump: make dump tool reliable under gdb
The problem with passing the configuration of the dump lib through a
file descriptor is that it can be read only once. But under gdb you
might want to rerun your program multiple times.

This change hands the configuration through a temporary file that is
deleted once the command line passed to intel_dump_gpu has exited.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-20 17:36:37 +01:00
Samuel Pitoiset
1efc9094e0 radv: don't flush DB before subpass FS resolves
That shouldn't be needed because the DB state is invalid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-20 17:30:13 +02:00
Gert Wollny
016807161b r600: Correct evaluation of cube array index and face
The array index needs to be corrected and it must be ensured that it is
rounded and its value is non-negative before it is combined with the
face id.

v5: Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin)

v6: Fix type (Roland Scheidegger)

Fixes 182 from android/cts/master/gles31-master.txt:
  dEQP-GLES31.functional.texture.filtering.cube_array.formats.*
  dEQP-GLES31.functional.texture.filtering.cube_array.sizes.*
  dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_*
  dEQP-GLES31.functional.texture.filtering.cube_array.combinations.linear_mipmap_*
  dEQP-GLES31.functional.texture.filtering.cube_array.no_edges_visible.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-20 14:55:12 +02:00
Gert Wollny
01766c1db6 r600: correct texture offset for array index lookup
Correct the array index for TEXTURE_*1D_ARRAY, and TEXTURE_*2D_ARRAY
The standard says the array index is evaluated according to

   floor(z + 0.5)

but RNDNE is sufficient also for the test cases where z is close to 1.5
and it is likely to hit 1.5, the corner case where RNDNE gives a result
different from the above formula.
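
To make the two rounding rules concrete, here is a small sketch for
non-negative z only. The function names are hypothetical and this is plain C
arithmetic, not the r600 shader code; layer_rndne() assumes RNDNE means
IEEE round-to-nearest-even:

```c
/* Array layer as floor(z + 0.5) for z >= 0: exact ties round up. */
static long
layer_floor_half(double z)
{
   return (long)(z + 0.5); /* truncation equals floor for non-negative input */
}

/* Array layer as round-to-nearest-even for z >= 0. */
static long
layer_rndne(double z)
{
   long base = (long)z;
   double frac = z - (double)base;

   if (frac > 0.5)
      return base + 1;
   if (frac < 0.5)
      return base;
   return base + (base & 1); /* exact tie: pick the even neighbour */
}
```

Under these assumptions the two functions agree for every input except an
exact tie whose lower neighbour is even (0.5, 2.5, ...), where
floor(z + 0.5) rounds up and round-to-nearest-even rounds down.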

v5: - Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin)
    - update commit message

Fixes 325 tests from android/cts/master/gles3-master.txt:
  dEQP-GLES3.functional.shaders.texture_functions.texture.*sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.textureoffset.*sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.texturelod.sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.texturelodoffset.*sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.*sampler2darray*
  dEQP-GLES3.functional.texture.filtering.2d_array.formats.*
  dEQP-GLES3.functional.texture.filtering.2d_array.sizes.*
  dEQP-GLES3.functional.texture.filtering.2d_array.combinations.*
  dEQP-GLES3.functional.texture.shadow.2d_array.*
  dEQP-GLES3.functional.texture.vertex.2d_array.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-20 14:55:12 +02:00
Gert Wollny
626bd455d4 r600: Delay emission of texture gradients and lookup offsets
Gradients used in texture lookups and the offsets must reside in the
same fetch clause (the first is imposed by the hardware and the second
is expected by sb). In order to ensure that no ALU clause is inserted
between emission and use of these, delay the emission of these
instructions until the texture instruction using them is also emitted.

This is needed in preparation for the correction of the texture array
indices.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-20 14:55:12 +02:00
Bas Nieuwenhuizen
cc10b34e9e util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.
radv always needs it, so just check the header instead. Also
do not declare the function if the variable is not set, so we
get a nice compile error instead of failing to open a device
at runtime.

Fixes: b87ef9e606 "util: fix MSVC build issue in disk_cache.h"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-20 12:09:19 +02:00
Bas Nieuwenhuizen
8cacf38f52 nir: Do not use continue block after removing it.
Reinserting code directly before a jump means the block gets split
and merged, removing the original block and replacing it in the
process.

Hence keeping a pointer to the continue block over a reinsert
causes issues.

This code changes nir_opt_if to simply look for the new continue
block.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107275
CC: 18.1 <mesa-stable@lists.freedesktop.org>
2018-07-20 12:09:19 +02:00
Samuel Pitoiset
ce454d02cc radv: simplify a condition in radv_src_access_flush()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-20 10:17:17 +02:00
Samuel Pitoiset
1ff25c4e6b radv: save current state just before resolving with FS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-20 10:17:15 +02:00
Samuel Pitoiset
c3d5f124c6 radv: don't check if a subpass has resolve attachments twice
We already check that in radv_cmd_buffer_resolve_subpass().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-20 10:17:13 +02:00
Samuel Pitoiset
0a8127bbfb radv: make use of radv_subpass_barrier() when resolving subpasses
The goal is to use radv_barrier()/radv_subpass_barrier() as
much as possible for further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-20 10:17:11 +02:00
Rhys Perry
409a60df3b nv50/ir: move LateAlgebraicOpt back to right after ConstantFolding
total instructions in shared programs : 5480808 -> 5472107 (-0.16%)
total gprs used in shared programs    : 647530 -> 647532 (0.00%)
total shared used in shared programs  : 389120 -> 389120 (0.00%)
total local used in shared programs   : 21064 -> 21064 (0.00%)
total bytes used in shared programs   : 58551648 -> 58459352 (-0.16%)

                local     shared        gpr       inst      bytes
    helped           0           0          73        2609        2609
      hurt           0           0          71          34          34
2018-07-19 23:34:58 +02:00
Rhys Perry
2afef231db nv50/ir: handle SHLADD in IndirectPropagation
An alternative solution to the problem fixed in
0bd83d0 ("nv50/ir: move LateAlgebraicOpt to the very end").

total instructions in shared programs : 5481195 -> 5480808 (-0.01%)
total gprs used in shared programs    : 647535 -> 647530 (-0.00%)
total shared used in shared programs  : 389120 -> 389120 (0.00%)
total local used in shared programs   : 21064 -> 21064 (0.00%)
total bytes used in shared programs   : 58555784 -> 58551648 (-0.01%)

                local     shared        gpr       inst      bytes
    helped           0           0           2          34          34
      hurt           0           0           0           0           0
2018-07-19 23:34:58 +02:00
Rhys Perry
3b6edd0b59 gm107/ir: use CS2R for SV_CLOCK
This instruction seems to be faster than S2R and requires no barrier,
though the range of special registers it can read from is limited.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-07-19 23:34:58 +02:00
Lionel Landwerlin
94cf964586 intel: tools: dump: remove mentions of intel_aubdump
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-19 20:12:53 +01:00
Lionel Landwerlin
0f9d8b754f intel: tools: aubwrite: fix invalid frees on finish
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-07-19 20:11:56 +01:00
Samuel Pitoiset
3d41757788 ac/nir: add a workaround for bitfield_extract when count is 0
LLVM 7 returns incorrect results when count is 0, something
has been broken since LLVM 6. Of course, the best solution is
to fix LLVM but this workaround works as expected for now.

Original workaround by Philippe Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-19 20:41:10 +02:00
Nanley Chery
e2e32b6afd intel/isl/gen4: Make depth/stencil buffers Y-Tiled
Rendering to a linear depth buffer on gen4 is causing a GPU hang in the
CI system. Until a better explanation is found, assume that errata is
applicable to all gen4 platforms.

Fixes fbe01625f6
("i965/miptree: Share tiling_flags in miptree_create").

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-19 11:05:07 -07:00
Nanley Chery
44ab26d0c9 i965/misc: Use depth/stencil surf's tiling on gen4-5
Make the 3D engine aware of the depth/stencil surface's tiling before
doing any render operations.

Fixes fbe01625f6
("i965/miptree: Share tiling_flags in miptree_create").

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-19 11:05:07 -07:00
Caio Marcelo de Oliveira Filho
507a8037a7 glsl: don't let an 'if' then-branch kill copy propagation (elements) for else-branch
When handling 'if' in copy propagation elements, if a certain variable
was killed when processing the first branch of the 'if', then the
second would not get any propagation from previous nodes.

    x = y;
    if (...) {
        z = x;  // This would turn into z = y.
        x = 22; // x gets killed.
    } else {
        w = x;  // This would NOT turn into w = y.
    }

With the change, we let copy propagation happen independently in the
two branches and only then apply the killed values for the subsequent
code.
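
The idea can be sketched in a few lines of C. This is a toy model with
whole-variable copies only (no per-channel tracking), and every name in it is
hypothetical rather than taken from opt_copy_propagation_elements: each
branch starts from a clone of the state before the 'if', records its own
kills, and the kills are applied to the outer state only afterwards.

```c
enum { VAR_X, VAR_Y, VAR_Z, VAR_W, NUM_VARS };
#define NO_COPY (-1)

struct acp_state {
   int copy_of[NUM_VARS]; /* copy_of[v]: variable that v is a copy of, or NO_COPY */
   int killed[NUM_VARS];  /* killed[v]: v was written while this state was live */
};

static void
acp_init(struct acp_state *s)
{
   for (int v = 0; v < NUM_VARS; v++) {
      s->copy_of[v] = NO_COPY;
      s->killed[v] = 0;
   }
}

/* Start a branch from the state before the 'if', with an empty kill set. */
static void
acp_branch(struct acp_state *branch, const struct acp_state *before)
{
   *branch = *before;
   for (int v = 0; v < NUM_VARS; v++)
      branch->killed[v] = 0;
}

static int
acp_resolve(const struct acp_state *s, int var)
{
   return s->copy_of[var] != NO_COPY ? s->copy_of[var] : var;
}

/* A write to var invalidates its own entry and every entry copying from it. */
static void
acp_kill(struct acp_state *s, int var)
{
   s->killed[var] = 1;
   s->copy_of[var] = NO_COPY;
   for (int v = 0; v < NUM_VARS; v++)
      if (s->copy_of[v] == var)
         s->copy_of[v] = NO_COPY;
}

/* Handle "dst = src"; returns the variable actually read after propagation. */
static int
acp_copy(struct acp_state *s, int dst, int src)
{
   int source = acp_resolve(s, src);
   acp_kill(s, dst);
   if (source != dst)
      s->copy_of[dst] = source;
   return source;
}

/* After both branches are done, apply a branch's kills to the outer state.
 * Copies created inside a branch are conservatively not carried over. */
static void
acp_merge_kills(struct acp_state *after, const struct acp_state *branch)
{
   for (int v = 0; v < NUM_VARS; v++)
      if (branch->killed[v])
         acp_kill(after, v);
}
```

Running the snippet from the commit message through this model, "z = x" in
the then-branch and "w = x" in the else-branch both read y, and x is only
killed in the outer state once acp_merge_kills() has run for each branch.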

One example in shader-db part of shaders/unity/8.shader_test:

    (assign  (xyz) (var_ref col_1)  (var_ref tmpvar_8) )
    (if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) (
      (assign  (xyz) (var_ref col_1)  (expression vec3 + (var_ref tmpvar_8) ... ) ... )
    )
    (
      (assign  (xyz) (var_ref col_1)  (expression vec3 lrp (var_ref col_1) ... ) ... )
    ))

The variable col_1 was replaced by tmpvar_8 in the then-part but not
in the else-part.

NIR deals well with copy propagation, so it already covered for the
missing ones that this patch fixes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-19 10:00:59 -07:00
Caio Marcelo de Oliveira Filho
e4f32dec23 glsl: change opt_copy_propagation_elements data structures
Instead of keeping multiple acp_entries in lists, have a single
acp_entry per variable. With this, the implementation of clone is more
convenient and now fully implemented. In the previous code, clone was
only partial.

Before this patch, each acp_entry struct represented a write to a
variable including LHS, RHS and a mask of what channels were written
to. There were two main hash tables, the first (lhs_ht) stored a list
of acp_entries per LHS variable, with the values available to copy for
that variable; the second (rhs_ht) was a "reverse index" for the first
hash table, so stored acp_entries per RHS variable.

After the patch, there's a single acp_entry struct per LHS variable,
it contains an array with references to the RHS variables per
channel. There now is a single hash table, from LHS variable to the
corresponding entry. The "reverse index" is stored in the ACP entry,
in the form of a set of variables that copy from the LHS. To make the
clone operation cheaper, the ACP entries are created on demand.

This should not change the result of copy propagation, a later patch
will take advantage of the clone operation.

v2: Add note clarifying how the hashtable is destroyed.

v3: (all from Eric Anholt)
    Add remove_unused_var_from_dsts() function for reuse.
    Remove from dsts as we go instead of clearing at the end.
    Add clarifying comment to erase().

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-19 10:00:30 -07:00
Caio Marcelo de Oliveira Filho
7b0d395250 glsl: separate copy propagation state
Separate higher level logic of visiting instructions and choosing when
to store and use new copy data from the data structure holding the copy
propagation information. This will also make later patches that
change the structure easier.

v2: Remove empty destructor and clarify how hash tables are destroyed.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-19 10:00:30 -07:00
Lionel Landwerlin
49e86f09fe intel: tools: dump: trace memory writes
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-19 16:48:42 +01:00
Lionel Landwerlin
5ba3e5c358 intel: tools: dump: remove command execution feature
In commit 86cb05a6d3 ("intel: aubinator: remove standard input
processing option") we removed the ability to process aub as an input
stream because we now rely on mmapping the aub file to back the
buffers aubinator is parsing.

intel_aubdump was the provider of the standard input data and since
we've copied/reworked intel_aubdump into intel_dump_gpu within Mesa,
we don't need that code anymore.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-19 10:11:54 +01:00
Danylo Piliaiev
494a206229 radv: Fix incorrect assumption about ternary operator precedence
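
The log does not show the affected radv expression, so the following is a
purely hypothetical illustration of the class of bug named in the subject:
in C, "?:" binds more loosely than arithmetic operators, so an
unparenthesized "base + flag ? 4 : 8" is parsed as "(base + flag) ? 4 : 8".

```c
/* How the compiler parses the unparenthesized form "base + flag ? 4 : 8".
 * The explicit parentheses here only spell out the default grouping. */
static int
size_as_parsed(int base, int flag)
{
   return (base + flag) ? 4 : 8;
}

/* What was presumably meant: choose the increment first, then add it. */
static int
size_as_intended(int base, int flag)
{
   return base + (flag ? 4 : 8);
}
```

With base = 16 and flag = 0 the first form yields 4 while the second yields
24, which is why the conditional has to be parenthesized explicitly.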
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-19 10:04:27 +02:00
Marek Olšák
dcbcc83003 mesa: fix make check for AMD_performance_monitor 2018-07-19 01:17:01 -04:00
Marek Olšák
f097f0c55c mesa: remove dead code from api_loopback
This should only contain functions not set in vtxfmt.c.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-19 01:10:32 -04:00
Marek Olšák
987c2ece03 mesa: expose ARB_indirect_parameters in the compatibility profile
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)

v2: fix dispatch_sanity
2018-07-19 01:10:18 -04:00
Marek Olšák
d40188800e vbo: fix ARB_multi_draw_indirect for the compatibility profile
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-19 00:58:51 -04:00
Marek Olšák
6c4652ea8a mesa: expose ARB_shader_viewport_layer_array in the compatibility profile
no changes needed for GL compat

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-19 00:58:51 -04:00
Marek Olšák
da528898bc mesa: expose ARB_ES3_1_compatibility in the compatibility profile
no changes needed for GL compat

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-19 00:58:51 -04:00
Marek Olšák
565dacc3d6 winsys/amdgpu: remove RADEON_SURF_FMASK leftover
RADEON_SURF_FMASK is never set.
2018-07-19 00:58:51 -04:00
Marek Olšák
9b82d128c9 ac: run LLVM optimization passes only on the final function after inlining 2018-07-19 00:58:49 -04:00
Bas Nieuwenhuizen
17b5a59b4e radv: Enable binning and dfsm by default on Raven.
Seems like it increases performance by 2-3% for some demos and games.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-19 02:38:21 +02:00
Bas Nieuwenhuizen
978570769d radv: Always set disable zpass increment bit when possible.
When no occlusion queries are active even if out of order is enabled.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-19 02:38:10 +02:00
Bas Nieuwenhuizen
82664af6cf radv: Select correct entries for binning.
Overshot it by one every time.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-19 02:38:01 +02:00
Bas Nieuwenhuizen
760211b77c radv: Fix number of samples used for binning.
Used the wrong register ...

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-19 02:37:54 +02:00
Bas Nieuwenhuizen
c0144e915a radv: Disable disabled color buffers in rbplus opts.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-19 02:37:47 +02:00
Marek Olšák
fb049742d6 r600: silence the signed overflow warning like radeonsi
r600_gpu_load.c: In function ‘r600_gpu_load_thread’:
../../../../src/util/os_time.h:82:7: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow]
    if (start <= end)
2018-07-18 17:48:48 -04:00
Andres Rodriguez
d3d9513556 radv: fix wmaybe-uninitialized in radv_meta_fast_clear.c
Assignment and usage of this variable both happen inside
if(rad_image_has_dcc()) {} blocks. It seems gcc plays it safe and
assumes that both function calls could have different return values.

But in this case we should be safe.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-18 15:32:51 -04:00
Sonny Jiang
4bf7234061 radeonsi: emit_guardband packets optimization
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-18 15:04:27 -04:00
Sonny Jiang
80ade05b8d radeonsi: Save CLEAR_STATE initial values for optimization
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-18 15:04:27 -04:00
Jan Vesely
9baacf3fa7 radeonsi: Refuse to accept code with unhandled relocations
They might lead to an unrecoverable GPU hang.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-07-18 13:56:56 -04:00
Eric Anholt
70534dbe29 Allow AMD_perfmon on GLES contexts
v2: whitespace alignment fix

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:39:21 -07:00
Eric Anholt
4ba478d7cd egl: Use the canonical drm-uapi fourcc header to avoid local defines.
We should only use a #define locally once it's been upstreamed, and at
that point you should just update our drm_fourcc.h.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-18 10:37:54 -07:00
Eric Anholt
2c6279d58b v3d: Fix tiling modifier support to use the new UIF define.
You can't use T tiled buffers on V3D 3.x and newer, it's been replaced
with a newer layout shared with other hardware blocks.
2018-07-18 10:37:49 -07:00
Eric Anholt
6c0482e176 drm-uapi: Update drm_fourcc.h for new format modifiers.
This brings in the Broadcom VC4 SAND and V3D 3.x+ UIF modifiers, from
drm-next commit 4da1d4c751c9b1b713c13043bad7c4d27cd1418c.
2018-07-18 10:37:49 -07:00
Marek Olšák
201ebf51d1 st/mesa: notify u_vbuf/driver that draw index bounds are unknown for indirect
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-18 13:33:30 -04:00
Timothy Pearson
e1621fda84 radeonsi: Use signed char for color_interp_vgpr_index
color_interp_vgpr_index was declared as a generic char value.
Because signed values are used in this variable, the result
was not safe across architectures and crashed on ppc64[el]
and arm.

Declare color_interp_vgpr_index as a signed type.
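
A small sketch of why the declaration matters. Whether plain char is signed
is implementation-defined in C; the two hypothetical helpers below (not
radeonsi code) make each behaviour explicit so the difference shows up on
any architecture:

```c
/* Value read back after storing into a signed 8-bit field. */
static int
stored_as_signed(int value)
{
   signed char c = (signed char)value;
   return c;
}

/* Value read back after storing into an unsigned 8-bit field, which is
 * how plain "char" behaves on ABIs where char is unsigned. */
static int
stored_as_unsigned(int value)
{
   unsigned char c = (unsigned char)value;
   return c;
}
```

Storing -1 reads back as -1 through the signed field but as 255 through the
unsigned one, so a later comparison against -1 or a "< 0" check silently
fails where char is unsigned.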

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-18 13:31:29 -04:00
Jason Ekstrand
aaa6fac8f6 intel/blorp: Take an explicit filter parameter in blorp_blit
This lets us move the glBlitFramebuffer nonsense into the GL driver and
make the usage of BLORP much more explicit and obvious as to what it's
doing.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-07-18 09:47:28 -07:00
Jason Ekstrand
9fbe2a2007 intel/blorp: Add a blorp_filter enum for use in blorp_blit
At the moment, this is entirely internal but we'll expose it to clients
of the BLORP API in the next commit.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-07-18 09:47:28 -07:00
Caio Marcelo de Oliveira Filho
ea556471a1 intel/tools: add missing include for stdarg.h
Fixes build in GCC 8.1.1:

FAILED: src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o
gcc -Isrc/intel/tools/src@intel@tools@@intel_dump_gpu@sha -Isrc/intel/tools -I../../src/intel/tools -Isrc/../include -I../../src/../include -Isrc -I../../src -Isrc/mapi -I../../src/mapi -Isrc/mesa -I../../src/mesa -I../../src/gallium/include -I../../src/gallium/auxiliary -Isrc/intel -I../../src/intel -I../../include/drm-uapi -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c99 -O2 -g -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS '-DVERSION="18.2.0-devel"' -DPACKAGE_VERSION=VERSION '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"' -DGLX_USE_TLS -DENABLE_ST_OMX_BELLAGIO=0 -DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DHAVE_SURFACELESS_PLATFORM -DENABLE_SHADER_CACHE -DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ -DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT -DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT -DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE -DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN -DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE -DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT -DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT -DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL -DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS -DHAVE_FUNC_ATTRIBUTE_NORETURN -D_GNU_SOURCE -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS -DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H -DHAVE_ENDIAN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L -DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_PTHREAD -DHAVE_LIBDRM -DHAVE_LLVM=0x0600 -DMESA_LLVM_VERSION_PATCH=1 -DHAVE_VALGRIND -DHAVE_LIBUNWIND -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS 
-Wall -Werror=implicit-function-declaration -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -Wno-missing-field-initializers -fPIC -fvisibility=hidden -Wno-override-init  -MD -MQ 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -MF 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o.d' -o 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -c ../../src/intel/tools/aub_write.c
../../src/intel/tools/aub_write.c: In function ‘fail_if’:
../../src/intel/tools/aub_write.c:243:4: error: implicit declaration of function ‘va_start’; did you mean ‘assert’? [-Werror=implicit-function-declaration]
    va_start(args, format);
    ^~~~~~~~
    assert
../../src/intel/tools/aub_write.c:245:4: error: implicit declaration of function ‘va_end’; did you mean ‘rand’? [-Werror=implicit-function-declaration]
    va_end(args);
    ^~~~~~
    rand
cc1: some warnings being treated as errors
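
The two errors above are what GCC reports when va_start() and va_end()
are used without <stdarg.h> in scope. A minimal sketch of a variadic
helper with the include in place; format_message() is a hypothetical
name for illustration, not the real fail_if() from aub_write.c:

```c
#include <assert.h>
#include <stdarg.h>   /* declares va_list, va_start() and va_end() */
#include <stdio.h>

/* Hypothetical helper, not the real fail_if(): formats into buf
 * through a va_list, the same way a printf-style wrapper would. */
static int
format_message(char *buf, size_t size, const char *format, ...)
{
   assert(buf != NULL && size > 0);

   va_list args;
   va_start(args, format);
   int n = vsnprintf(buf, size, format, args);
   va_end(args);
   return n;
}
```

Without the <stdarg.h> line, the compiler treats va_start and va_end
as undeclared functions, which is exactly the implicit-declaration
error shown in the log.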

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-18 09:19:22 -07:00
Jason Ekstrand
2be30a1a39 intel/tools: Rename error2aub to intel_error2aub
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-18 09:03:05 -07:00
Danylo Piliaiev
d219521379 i965: Sweep NIR after linking phase to free held memory
After optimization passes and many transformations, most of the memory
NIR holds is garbage, which was being freed only after shader deletion.
Freeing it at the end of linking will save memory, which is useful
when a lot of complex shaders are being compiled.
The common case for this issue is a 32-bit game running under Wine.

The cost of the optimization is around 3-5% of compilation speed
with complex shaders.

V2: by Jason Ekstrand
    - Move nir_sweep up, right after the last change of NIR

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103274
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2018-07-18 09:00:18 -07:00
Marek Olšák
51d6b163da winsys/amdgpu: fix VDPAU interop by having one amdgpu_winsys_bo per BO (v2)
Dependencies between rings are inserted correctly if a buffer is
represented by only one unique amdgpu_winsys_bo instance.
Use a hash table keyed by amdgpu_bo_handle to have exactly one
amdgpu_winsys_bo per amdgpu_bo_handle.
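
The "one wrapper per handle" idea can be sketched as a lookup-or-create
function over a small chained hash table. Everything here (struct
bo_wrapper, wrapper_from_handle(), the table size and the hash) is a
hypothetical stand-in, not the winsys code itself:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TABLE_SIZE 64   /* assumed fixed size; a real table would grow */

/* Hypothetical wrapper object; the handle is treated as an opaque key. */
struct bo_wrapper {
   const void *handle;
   unsigned refcount;
   struct bo_wrapper *next;   /* chain for colliding handles */
};

static struct bo_wrapper *table[TABLE_SIZE];

/* Mix the pointer bits a little, since low bits are often zero. */
static unsigned
hash_handle(const void *handle)
{
   uintptr_t v = (uintptr_t)handle;
   return (unsigned)((v >> 4) ^ (v >> 12)) % TABLE_SIZE;
}

/* Returns the unique wrapper for a handle, creating it on first use. */
static struct bo_wrapper *
wrapper_from_handle(const void *handle)
{
   assert(handle != NULL);

   unsigned slot = hash_handle(handle);
   for (struct bo_wrapper *w = table[slot]; w != NULL; w = w->next) {
      if (w->handle == handle) {
         w->refcount++;        /* same buffer imported again */
         return w;
      }
   }

   struct bo_wrapper *w = calloc(1, sizeof(*w));
   if (w == NULL)
      return NULL;
   w->handle = handle;
   w->refcount = 1;
   w->next = table[slot];
   table[slot] = w;
   return w;
}
```

Importing the same handle twice then yields the same pointer with a
higher reference count, so anything tracked per wrapper sees a single
object per buffer.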

v2: return offset and stride properly

Tested-by: Leo Liu <leo.liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2018-07-18 11:56:28 -04:00
Marek Olšák
e06b8ec106 winsys/amdgpu: use a better hash_pointer function
Tested-by: Leo Liu <leo.liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2018-07-18 11:56:28 -04:00
Marek Olšák
53684e9163 winsys/amdgpu: clean up error handling in amdgpu_bo_from_handle
Tested-by: Leo Liu <leo.liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2018-07-18 11:56:28 -04:00
Marek Olšák
a73e3d5e00 winsys/amdgpu: shorten bo->ws in amdgpu_bo_destroy
Tested-by: Leo Liu <leo.liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2018-07-18 11:56:28 -04:00
Jason Ekstrand
6a60beba40 intel/tools: Add an error state to aub translator
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-18 08:42:53 -07:00
Jason Ekstrand
d6ad32600e intel/tools: Break aub file writing into a helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-18 08:42:50 -07:00
Jason Ekstrand
0a457d987e intel/tools: Refactor aub dumping to remove singletons
Instead of having quite so many singletons, we use a struct aub_file to
organize the bits we need for writing an aub file.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-18 08:42:46 -07:00
Jason Ekstrand
6953d7f5d2 intel/dump_gpu: Fix corner cases in PPGTT range calculations
For large buffers which span an entire l1 page table, we got the range
calculations wrong.  In this case, we end up with an l1_start which is
the first byte represented by the given l1 table and an l1_end which is
the first byte after the range represented by the l1 table.  Then
l2_start_index == L2_index(l2_end) due to roll-over.  Instead, compute
lN_end using (1Ull << shift) - 1 so that lN_end is the last byte in the
range represented by the Nth level page table.  When we do this, we
don't need the conditional expression anymore.
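
The roll-over can be reproduced with a small sketch. The layout below
(4 KiB pages, 9 index bits per level) and all helper names are
assumptions for illustration; only the "(1ull << shift) - 1" idea comes
from the message above:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: 4 KiB pages, 9 index bits per level. */
#define L1_SHIFT 12
#define L2_SHIFT 21

static uint64_t
l1_index(uint64_t addr)
{
   return (addr >> L1_SHIFT) & 0x1ff;
}

/* First byte *after* the L1 table containing addr (exclusive end). */
static uint64_t
l1_table_end_exclusive(uint64_t addr)
{
   return (addr & ~((1ull << L2_SHIFT) - 1)) + (1ull << L2_SHIFT);
}

/* Last byte *inside* the L1 table containing addr (inclusive end). */
static uint64_t
l1_table_end_inclusive(uint64_t addr)
{
   return addr | ((1ull << L2_SHIFT) - 1);
}

/* Number of L1 entries covered by [start, end_inclusive] in one table. */
static uint64_t
l1_entries_in_range(uint64_t start, uint64_t end_inclusive)
{
   assert(l1_index(start) <= l1_index(end_inclusive));
   return l1_index(end_inclusive) - l1_index(start) + 1;
}
```

With start = 0x200000, the exclusive end 0x400000 has the same index as
the start (both wrap to 0), while the inclusive end 0x3fffff has index
0x1ff, so the range covers all 512 entries without a special case.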

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-18 08:42:38 -07:00
Caio Marcelo de Oliveira Filho
322fa3e5be intel/blorp: fix uninitialized variable warning
Compiler doesn't pick up that level and start_layer will be defined,
so do as was done for num_layers in 4d8b476fa9 "intel/blorp: Fix
compiler warning about num_layers." and always set it.

Fixes warning

../../src/mesa/drivers/dri/i965/brw_blorp.c: In function ‘brw_blorp_clear_depth_stencil’:
../../src/mesa/drivers/dri/i965/brw_blorp.c:1439:4: warning: ‘start_layer’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    blorp_clear_depth_stencil(&batch, &depth_surf, &stencil_surf,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              level, start_layer, num_layers,
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              x0, y0, x1, y1,
                              ~~~~~~~~~~~~~~~
                              (mask & BUFFER_BIT_DEPTH), ctx->Depth.Clear,
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              stencil_mask, ctx->Stencil.Clear);
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/mesa/drivers/dri/i965/brw_blorp.c:1439:4: warning: ‘level’ may be used uninitialized in this function [-Wmaybe-uninitialized]

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
3bf19bfdc6 util/string_buffer: fix warning in tests
And also specify the maximum size when writing to static buffers. The
warning below refers to the case where "str5" could be larger than
"str5 - str4", then the strcat would have overlapping dst and src.

Compiler doesn't pick up the bound from the snprintf above, so we make
clear the bounds of str5 by using strncat() instead of strcat().
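
A minimal sketch of the bounded append; append_bounded() is a
hypothetical helper for illustration, not the code used in the test:

```c
#include <assert.h>
#include <string.h>

/* Appends src to dst without ever writing past dst_size bytes
 * (the terminating NUL included). Excess input is truncated. */
static void
append_bounded(char *dst, size_t dst_size, const char *src)
{
   size_t used = strlen(dst);
   assert(used < dst_size);

   /* strncat() copies at most n bytes and then always adds a NUL,
    * so n must leave one byte of room for it. */
   strncat(dst, src, dst_size - used - 1);
}
```

Because the bound is passed explicitly, the amount written is capped by
the destination size rather than by the length of the source string.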

../../src/util/tests/string_buffer/string_buffer_test.cpp: In member function ‘virtual void string_buffer_string_buffer_tests_Test::TestBody()’:
../../src/util/tests/string_buffer/string_buffer_test.cpp:106:10: warning: ‘char* strcat(char*, const char*)’ accessing 81 or more bytes at offsets 48 and 128 may overlap 1 byte at offset 128 [-Wrestrict]
    strcat(str4, str5);
    ~~~~~~^~~~~~~~~~~~

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
577c8d7288 i965/miptree: avoid uninitialized variable warnings
GCC 8.1.1 is having a hard time identifying that the values are
properly initialized when used. In the 'memset_value' case, we pass
the uninitialized value to another function (which will use it only if
the conditions match the initialization).

Just give enough hint to the compiler to figure things out. Fixes the
warnings

../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function ‘intel_miptree_alloc_aux’:
../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1839:18: warning: ‘memset_value’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    mt->aux_buf = intel_alloc_aux_buffer(brw, &aux_surf, needs_memset,
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                         memset_value);
                                         ~~~~~~~~~~~~~
../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1698:10: warning: ‘initial_state’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       if (wants_memset)
          ^
../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1772:23: note: ‘initial_state’ was declared here
    enum isl_aux_state initial_state;
                       ^~~~~~~~~~~~~

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
8ec40824ae intel/batch-decoder: fix uninitialized values warnings
Code assumes that all the necessary fields will exist, but compiler
doesn't know about this. Provide zero as default values, like in other
decoding functions.

Fixes warnings

../../src/intel/common/gen_batch_decoder.c: In function ‘handle_media_interface_descriptor_load’:
../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_entry_count’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       dump_binding_table(ctx, binding_table_offset, binding_entry_count);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_table_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]

../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_count’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       dump_samplers(ctx, sampler_offset, sampler_count);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]

../../src/intel/common/gen_batch_decoder.c:343:7: warning: ‘ksp’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       ctx_disassemble_program(ctx, ksp, "compute shader");
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

../../src/intel/common/gen_batch_decoder.c: In function ‘decode_dynamic_state_pointers’:
../../src/intel/common/gen_batch_decoder.c:663:54: warning: ‘state_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    const uint32_t *state_map = ctx->dynamic_base.map + state_offset;
                                ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~

../../src/intel/common/gen_batch_decoder.c: In function ‘gen_print_batch’:
../../src/intel/common/gen_batch_decoder.c:856:13: warning: ‘next_batch.map’ may be used uninitialized in this function [-Wmaybe-uninitialized]
          if (next_batch.map == NULL) {
             ^
../../src/intel/common/gen_batch_decoder.c:860:13: warning: ‘next_batch.addr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             gen_print_batch(ctx, next_batch.map, next_batch.size,
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             next_batch.addr);
                             ~~~~~~~~~~~~~~~~

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
f836d799f9 intel/decoder: use snprintf(..., "%s", ...) instead of strncpy
strncpy() doesn't guarantee a terminating NUL, so we would need to
set it ourselves. Just use snprintf() instead.
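
A minimal sketch of the replacement pattern; copy_name() is a
hypothetical wrapper, and the 4-byte buffer in the note below is only
there to force truncation:

```c
#include <assert.h>
#include <stdio.h>

/* Copies src into dst, truncating if needed. Unlike strncpy() with a
 * bound equal to the destination size, the result is always
 * NUL-terminated. */
static void
copy_name(char *dst, size_t dst_size, const char *src)
{
   assert(dst_size > 0);
   snprintf(dst, dst_size, "%s", src);
}
```

Copying "abcdef" into a char[4] leaves "abc" plus the terminator,
whereas strncpy() with the same bound would fill all four bytes and
leave the buffer unterminated.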

Fixes the warnings

../../src/intel/common/gen_decoder.c: In function ‘iter_decode_field’:
../../src/intel/common/gen_decoder.c:897:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation]
       strncpy(iter->name, iter->field->name, sizeof(iter->name));
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘iter_advance_field’,
    inlined from ‘gen_field_iterator_next’ at ../../src/intel/common/gen_decoder.c:1015:9:
../../src/intel/common/gen_decoder.c:844:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation]
       strncpy(iter->name, iter->field->name, sizeof(iter->name));
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
20fcd152a2 anv: give more room to debug report
The error buffer is limited to 256, but the report contains the
filename and possibly other data. So give it more space.

Avoids the warnings

../../src/intel/vulkan/anv_util.c: In function ‘__anv_perf_warn’:
../../src/intel/vulkan/anv_util.c:66:42: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 254 [-Wformat-truncation=]
    snprintf(report, sizeof(report), "%s: %s", file, buffer);
                                          ^~         ~~~~~~
../../src/intel/vulkan/anv_util.c:66:4: note: ‘snprintf’ output 3 or more bytes (assuming 258) into a destination of size 256
    snprintf(report, sizeof(report), "%s: %s", file, buffer);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

../../src/intel/vulkan/anv_util.c: In function ‘__vk_errorf’:
../../src/intel/vulkan/anv_util.c:96:48: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 252 [-Wformat-truncation=]
       snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer,
                                                ^~                    ~~~~~~
../../src/intel/vulkan/anv_util.c:96:7: note: ‘snprintf’ output 8 or more bytes (assuming 263) into a destination of size 256
       snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer,
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                error_str);
                ~~~~~~~~~~

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
01d02e8906 anv: avoid warning when switching in VkStructureType
When one of the cases is not part of the enum, the compiler complains:

../../src/intel/vulkan/anv_formats.c: In function ‘anv_GetPhysicalDeviceFormatProperties2’:
../../src/intel/vulkan/anv_formats.c:728:7: warning: case value ‘1000001004’ not in enumerated type ‘VkStructureType’ {aka ‘enum VkStructureType’} [-Wswitch]
       case VK_STRUCTURE_TYPE_WSI_FORMAT_MODIFIER_PROPERTIES_LIST_MESA:
       ^~~~

Given the switch has a "default:" case, we don't lose anything by
switching on the unsigned value to avoid the warning.
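
The pattern can be sketched with a toy enum. The enum, the extra value
and shape_name() are all hypothetical; they only mirror the situation
of a case label that is not a member of the enumerated type:

```c
#include <assert.h>
#include <stddef.h>

enum shape { SHAPE_CIRCLE = 1, SHAPE_SQUARE = 2 };

/* Hypothetical out-of-enum value, like a private struct type that is
 * not declared in the public enum. */
#define SHAPE_PRIVATE_EXT 1000

static const char *
shape_name(enum shape s)
{
   const char *name = NULL;

   /* Switching on the enum itself would make -Wswitch flag the
    * SHAPE_PRIVATE_EXT label; switching on the unsigned value does not. */
   switch ((unsigned)s) {
   case SHAPE_CIRCLE:      name = "circle";  break;
   case SHAPE_SQUARE:      name = "square";  break;
   case SHAPE_PRIVATE_EXT: name = "private"; break;
   default:                name = "unknown"; break;
   }

   assert(name != NULL);   /* every path above assigns a name */
   return name;
}
```

The trade-off is that the compiler no longer checks that every enum
member has a case, which is acceptable here only because the default
label already handles anything unlisted.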

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
df8f1637fa glsl: remove unnecessary parenthesis from macro
The "__inst" will contain the name used for the variable of type
"__type *". Parentheses are not necessary as the name itself shouldn't
be an expression.

Fixes warning:

In file included from ../../src/mesa/main/mtypes.h:49,
                 from ../../src/intel/compiler/brw_compiler.h:30,
                 from ../../src/intel/compiler/brw_shader.h:29,
                 from ../../src/intel/compiler/brw_fs.h:31,
                 from ../../src/intel/compiler/brw_fs_cse.cpp:24:
../../src/intel/compiler/brw_fs_cse.cpp: In member function ‘bool fs_visitor::opt_cse_local(bblock_t*)’:
../../src/compiler/glsl/list.h:675:12: warning: unnecessary parentheses in declaration of ‘entry’ [-Wparentheses]
    __type *(__inst);                                      \
            ^
../../src/intel/compiler/brw_fs_cse.cpp:257:10: note: in expansion of macro ‘foreach_in_list_use_after’
          foreach_in_list_use_after(aeb_entry, entry, &aeb) {
          ^~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
4a29ee1861 intel/compiler: fix -Wsign-compare warning
Explicitly convert to signed integer. The conversion is valid since it is
the same one (implicitly) used to initialize the loop. Avoids the warning:

../../src/intel/compiler/brw_fs.cpp: In member function ‘bool fs_visitor::lower_simd_width()’:
../../src/intel/compiler/brw_fs.cpp:5761:45: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare]
             split_inst.eot = inst->eot && i == n - 1;
                                           ~~^~~~~~~~

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
7df5f62768 intel/compiler: silence -Wclass-memaccess warnings
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho
ff8abce361 spirv: initialize is_vertex_input
Fixes warning:

../../src/compiler/spirv/vtn_variables.c: In function ‘var_decoration_cb’:
../../src/compiler/spirv/vtn_variables.c:1400:12: warning: ‘is_vertex_input’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       bool is_vertex_input;
            ^~~~~~~~~~~~~~~

The code used to set is_vertex_input in all possible codepaths, but
after 23edc5b1ef "spirv: translate default-block uniforms" the
compiler isn't sure all codepaths will initialize the variable.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-07-18 08:29:51 -07:00
Rob Clark
cbad8f3cc0 freedreno/a5xx: performance counters
AMD_performance_monitor support

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:19:03 -04:00
Rob Clark
33af91dc07 freedreno: batch query support (perfcounters)
Core infrastructure for performance counters, using gallium's batch
query interface (to support AMD_performance_monitor).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:19:03 -04:00
Rob Clark
9e30e7490d freedreno: batch query prep-work
For batch queries we have N different query_type's for one query, so
mapping a single query_type to a sample_provider doesn't really work
out.  Instead add a new constructor to construct a query directly
from a sample_provider.

Also, the sample buffer size needs to be determined at runtime, as
it depends on the number of query_types.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:19:03 -04:00
Rob Clark
37b724ff72 freedreno: rework accumulated query result vfunc
Take the query object, rather than the ctx.  The ctx ptr isn't hugely
useful but for batch queries we will need the query object to properly
get the results.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:19:03 -04:00
Rob Clark
1f464d5301 freedreno/ir3: output ir3 and nir asm for frameretrace
See: 298dc8195b

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:10:45 -04:00
Rob Clark
e4c225ab6f freedreno/ir3: redirectable ir3 disasm output
For now it still goes to stdout, but this will make it easier to support
output on stderr like what frameretrace expects.

(If we eventually have a proper GL extension for this, implementation
probably looks like dumping shader disasm to a tmp file and then dumping
that out over whatever mechanism is used.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:10:45 -04:00
Rob Clark
4c58db8064 freedreno/ir3: resync ir3 disassembler
Pull in latest updates from cffdump in envytools tree, so we can output
to other than just stdout.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:10:45 -04:00
Rob Clark
97a9283f5d freedreno: register usage queries
Avg number of (half) regs per draw, so we can correlate fps dips to
shader register usage.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:10:44 -04:00
Rob Clark
8dfc9e22c1 nir: add lowering for gl_HelperInvocation
v2: reword comment about lower_helper_invocations to be more clear
    that it might not work on all hardware
v3: add special variant of load_sample_id which does not imply per-
    sample shading

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:10:44 -04:00
Rob Clark
09f240eb5f mesa: don't double incr/decr ActiveCounters
Frameretrace ends up w/ excess calls to SelectPerfMonitorCountersAMD()
which ends up re-enabling already enabled counters.  Which causes
ActiveCounters[group] to be double incremented for the same counter.
This causes BeginPerfMonitorAMD() to fail.

The AMD_performance_monitor spec doesn't say that an error should be
generated in this case.  So I think the safe thing to do is just
safeguard against excess increments/decrements.
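
A minimal sketch of such a guard, assuming a per-group table that
records which counters are already enabled. struct perf_group,
select_counter() and MAX_COUNTERS are hypothetical names, not the Mesa
data structures:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_COUNTERS 32   /* assumed limit for this sketch */

/* Hypothetical per-group state: which counters are on, and how many. */
struct perf_group {
   bool enabled[MAX_COUNTERS];
   unsigned active_counters;
};

/* Enables or disables one counter; repeated requests for the state the
 * counter is already in leave active_counters untouched. */
static void
select_counter(struct perf_group *group, unsigned id, bool enable)
{
   assert(id < MAX_COUNTERS);

   if (group->enabled[id] == enable)
      return;   /* already in the requested state: do not count twice */

   group->enabled[id] = enable;
   if (enable)
      group->active_counters++;
   else
      group->active_counters--;
}
```

Selecting the same counter twice leaves the active count at 1, and
deselecting it twice brings it back to 0 rather than underflowing.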

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-18 10:10:44 -04:00
Rob Clark
426f1c60bc mesa: fix error msg typo
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-07-18 10:10:44 -04:00
Rob Clark
640b8eb5b1 nir: fixup intrinsic comment
Now the deref is the first src.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-07-18 10:10:44 -04:00
Tomeu Vizoso
3f7c2148b0 mesa: handle a bunch of formats in IMPLEMENTATION_COLOR_READ_*
Virgl could save a lot of work converting buffers on the host side
between formats if Mesa supported a bunch of other formats when reading
pixels.

This commit adds cases to handle specific formats so that the values
reported by the two calls match more closely the underlying native
formats.

In GLES it is important that IMPLEMENTATION_COLOR_READ_* return the native
format and data type because the spec only allows reading with those,
besides GL_RGBA or GL_RGBA_INTEGER.

Additionally, because virgl currently doesn't implement such
conversions, this commit fixes several tests in
dEQP-GLES3.functional.fbo.color.clear.*, when using virgl in the guest
side.

The logic is based on knowledge that is shared with
_mesa_format_matches_format_and_type() but we cannot assert that the
results match as we don't have all the starting information at both
points. So leave the assert out and hope CI comes soon to save us all.

v2: * Let R10G10B10A2_UINT fall back to GL_RGBA_INTEGER (Eric Anholt)
    * Assert with _mesa_format_matches_format_and_type (Eric Anholt)

v3: * Remove the assert, as it won't be reliable (Eric Anholt)

v4: * Use _mesa_is_format_integer in the fallback (Eric Anholt)

v5: * Remove superfluous call to
      _mesa_uncompressed_format_to_type_and_comps (Eric Anholt)

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-07-18 14:52:35 +01:00
Samuel Pitoiset
e45ba51ea4 radv: add support for VK_EXT_conditional_rendering
Inherited command buffers are not supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 13:44:09 +02:00
Samuel Pitoiset
946cf3f39f radv: add support for non-inverted conditional rendering
By default, our internal rendering commands are discarded
only if the predicate is non-zero (i.e. DRAW_VISIBLE). But
VK_EXT_conditional_rendering also allows discarding commands
when the predicate is zero, which means we have to use a
different flag.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 13:44:06 +02:00
Samuel Pitoiset
4d99caf590 radv: set the predicate for indirect/indexed draw commands
VK_EXT_conditional_rendering allows discarding draw commands
(not only normal draws).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 13:44:04 +02:00
Samuel Pitoiset
1e83f65673 radv: set the predicate for dispatch commands
VK_EXT_conditional_rendering allows discarding dispatch commands.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 13:44:01 +02:00
Lionel Landwerlin
83427acc87 i965: batchbuffer: write correct canonical offset with softpin
Addresses in the command streams should be in canonical form (i.e.
bit[63:48] == bit[47]). If the [bo->gtt_offset, bo->gtt_offset +
target_offset] range contains the address 0x800000000000, the current
code will fail that criterion.
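
A minimal sketch of the canonical-form rule and of why the order of
operations matters. The helper names are hypothetical, and the
assumption that the fix canonicalizes the final sum (rather than the
base offset alone) is inferred from the description above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copies bit 47 into bits 63:48, producing the canonical form. */
static uint64_t
canonical_address(uint64_t addr)
{
   const uint64_t low_mask = (1ull << 48) - 1;
   uint64_t low = addr & low_mask;
   return (low & (1ull << 47)) ? (low | ~low_mask) : low;
}

static bool
is_canonical(uint64_t addr)
{
   return canonical_address(addr) == addr;
}

/* Hypothetical relocation helper: canonicalize the *sum*, so that a
 * target offset crossing 0x800000000000 still yields a valid address. */
static uint64_t
reloc_address(uint64_t bo_offset, uint32_t target_offset)
{
   uint64_t addr = canonical_address(bo_offset + target_offset);
   assert(is_canonical(addr));
   return addr;
}
```

For a buffer at 0x7ffffffff000 and a target offset of 0x2000,
canonicalizing only the base and then adding gives 0x800000001000,
which is not canonical, while canonicalizing the sum gives
0xffff800000001000.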

v2: Fix missing include (Lionel)

Fixes: 1c9053d076 ("i965: Prepare batchbuffer module for softpin support.")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-18 11:29:16 +01:00
Samuel Pitoiset
1376f2824f radv: remove unused variable in radv_CreateRenderPass2KHR()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-18 10:54:42 +02:00
Samuel Pitoiset
d9526384bd radv: optimize radv_stage_flush() for pre fragment shader stages
We don't need to emit PS_PARTIAL_FLUSH for the pre fragment shader
stages (ie. geometry/tessellation). Emitting VS_PARTIAL_FLUSH
is enough for these stages. Note that PS_PARTIAL_FLUSH also
synchronizes all vertex stages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 10:09:05 +02:00
Samuel Iglesias Gonsálvez
0f29006256 anv: fix assert in anv_CmdBindDescriptorSets()
The assert is checking that we are not binding more descriptor sets
than are supported by the driver. When binding the descriptor set
number MAX_SETS-1, it was breaking the assert because
descriptorSetCount = 1.
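
The off-by-one can be sketched as follows. MAX_SETS = 8 and the helper
name are assumptions for illustration, and the strict "<" in the
comment is a guess at what the old check looked like, not a quote from
the driver:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SETS 8   /* assumed value for this sketch */

/* True when sets [first_set, first_set + count) all fit in MAX_SETS.
 * A strict '<' here would wrongly reject first_set = MAX_SETS - 1 with
 * count = 1, although that binds only the last valid set. */
static bool
sets_in_range(unsigned first_set, unsigned count)
{
   return first_set + count <= MAX_SETS;
}

/* Hypothetical bind entry point showing where the check would sit. */
static void
bind_descriptor_sets(unsigned first_set, unsigned count)
{
   assert(sets_in_range(first_set, count));
   /* ... bind sets first_set .. first_set + count - 1 ... */
}
```

The sum first_set + count is one past the last index bound, so it is
allowed to equal MAX_SETS but not exceed it.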

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-18 08:54:23 +02:00
Jan Vesely
154fbd03cc clover: Report error when pipe driver fails to create compute state
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-07-17 21:04:15 -04:00
Jan Vesely
866b25fd01 clover: Catch errors from executing event action
Abort all dependent events.
v2: Abort the current event as well.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-07-17 21:04:15 -04:00
Timothy Arceri
e105b0ca30 nir: add a couple of ior opts to nir_opt_algebraic
One of these was seen in a Deus Ex shader.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-18 09:53:27 +10:00
Timothy Arceri
c4188a9b9f nir: allow opt_peephole_select to handle nir_instr_type_deref
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-18 09:53:22 +10:00
Marek Olšák
bb5449cfee r600: fix warnings when unref'ing pool->bo
2018-07-17 14:51:45 -04:00
Konstantin Kharlamov
3f8fa7716d r600g: some -Wsign-compare fixes
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-17 14:47:37 -04:00
Konstantin Kharlamov
b674a1d3b9 st/glx: constify some variables
Just a nice hint for both people and compilers.

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-17 14:47:37 -04:00
Konstantin Kharlamov
1379d9759f st/nine: constify some variables
Just a nice hint for both people and compilers.

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-17 14:47:37 -04:00
Konstantin Kharlamov
77ca550224 r600g: constify some variables
Just a nice hint for both people and compilers.

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-17 14:47:37 -04:00
Konstantin Kharlamov
9b379591c9 r600g: do not use "fast-clear" for small textures (v3)
Ported from radeonsi. Improves windowed glxgears run as

	vblank_mode=0 glxgears -info -geometry 0+0+512+512

from ≈2270 FPS to ≈2360 FPS. Tested with AMD TURKS.

v2: it turned out glxgears ignores the option above; the correct form
is "512x512+0+0". With that, 512x512 actually loses 30 FPS. 300×300,
however, gains around a hundred FPS. To leave some room in case results
differ on other cards, I would rather not search for an exact optimum
and simply leave 300×300 in the code.
v3: remove redundant braces, and try harder for the mail to stick to
the rest of the series.

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-17 14:47:37 -04:00
Rob Clark
4cf8f329ed freedreno: re-work fd_batch_reference() locking
Annoyingly we still have to briefly drop the lock to unref resources..
but push the lock down into __fd_batch_destroy() so we can invalidate
the batch and reset resources before dropping the lock.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
4b847b38ae freedreno: make fd_batch a one-shot thing
Re-allocate rather than re-use.  Originally we had an unnecessarily
complex design to avoid re-allocating cmdstream buffers.  But now that
support for "growable" cmdstream buffers has been in place for a couple
years, I guess we can care a bit less about the extra overhead on older
kernels.

But making the batches one-shot removes a class of potential race
conditions vs the flush_queue.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
f129971e71 freedreno: flush immediately when reading a pending batch
Instead of the reading batch setting a dependency on the writing batch,
simply flush the writing batch immediately.  This avoids situations
where we have to flush the context's current batch later.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
20f677f6bc freedreno: get rid of noop render
This was basically to avoid a zero-dword IB (indirect-branch), but
instead just don't emit the IB packet in that case.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
15f6c0509a freedreno: fix samples=0 vs samples=1 confusion
pipe_framebuffer_state can have samples=0 in various cases, which is
actually the same thing as samples=1.  So use the _get_num_samples()
helper to populate the key, to avoid this looking like two distinct
fb states to the cache.
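
A minimal sketch of the normalization, with hypothetical names
(normalized_samples(), struct fb_key, make_fb_key()) standing in for
the real helper and cache key:

```c
#include <assert.h>

/* 0 and 1 both mean "not multisampled", so fold 0 into 1. */
static unsigned
normalized_samples(unsigned samples)
{
   return samples ? samples : 1;
}

/* Hypothetical cache key; only the fields needed for the sketch. */
struct fb_key {
   unsigned width;
   unsigned height;
   unsigned samples;
};

static struct fb_key
make_fb_key(unsigned width, unsigned height, unsigned samples)
{
   assert(width > 0 && height > 0);

   struct fb_key key = { width, height, normalized_samples(samples) };
   return key;
}
```

Keys built from samples=0 and samples=1 then compare equal, so the
cache sees one framebuffer state instead of two.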

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
d77fcdeb59 freedreno: comment for _invalidate_batch()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
f2570409f9 freedreno: hold batch references when flushing
It is possible for a batch to be freed under our feet when flushing, so
it is best to hold a reference to all of them up-front.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Karol Herbst
71add09e79 nir/spirv: print id for unsupported alu opcode
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-17 13:24:09 +02:00
Karol Herbst
1beef89ad8 nir: prepare for bumping up max components to 16
OpenCL knows vectors of size 8 and 16.

v2: rebased on master (nir_swizzle rework)
    rework more declarations with nir_component_mask_t
    adjust print_var_decl

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-17 13:24:09 +02:00
Samuel Pitoiset
f65bee7e85 radv/winsys: use alloca() for semaphore dependencies
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-17 10:53:45 +02:00
Samuel Pitoiset
88e56804a7 radv: reduce number of CB/DB meta flushes for VK_ACCESS_TRANSFER_WRITE_BIT
If we know that the given image doesn't have any metadata,
we don't need to flush.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-17 09:34:20 +02:00
Samuel Pitoiset
b213947510 radv: fix implementation of VK_KHR_create_renderpass2 for multiviews
The Vulkan 1.1.80 spec says:

"viewMask has the same effect for the described subpass as
 VkRenderPassMultiviewCreateInfo::pViewMasks has on each
 corresponding subpass."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-17 09:04:35 +02:00
Erik Faye-Lund
591b700944 virgl: respect max_vertex_attrib_stride cap
This is required for OpenGL 4.4 and OpenGL ES 3.1 support.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-17 15:45:37 +10:00
Lepton Wu
04e278f793 virgl: Fix flush in virgl_encoder_inline_write.
The current code is buggy: if there are only 12 dwords left in cbuf,
we emit a zero data length command which will be rejected by virglrenderer.
Fix it by calling flush in this case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-17 14:56:25 +10:00
Erik Faye-Lund
b5db3aa6e8 virgl: implement set_min_samples
This allows us to implement glMinSampleShading correctly, which up
until now just got ignored.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-17 13:59:47 +10:00
Caio Marcelo de Oliveira Filho
ba1b41b504 glsl: do second pass of const propagation in loops
When handling loops in constant propagation, implement the "FINISHME"
comment like copy propagation: perform a first pass to find values
that can't be propagated, then perform a second pass with the ACP
containing still valid values.

Certain values are killed because the loop may run more than one
iteration, so we can't copy propagate them as they would be invalid in
the later iterations.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-16 16:33:39 -07:00
Caio Marcelo de Oliveira Filho
d7849fd1da glsl: don't let an 'if' then-branch kill const propagation for else-branch
When handling 'if' in constant propagation, if a certain variable was
killed when processing the first branch of the 'if', then the second
would not get any propagation from previous nodes. This is similar to the
change done for copy propagation code.

    x = 1;
    if (...) {
        z = x;  // This would turn into z = 1.
        x = 22; // x gets killed.
    } else {
        w = x;  // This would NOT turn into w = 1.
    }

With the change, we let constant propagation happen independently in
the two branches and only then apply the killed values for the
subsequent code.

The new code uses a single hash table for keeping the kills of both
branches (the branches only write to it), and it gets deleted after we
use it -- instead of waiting for mem_ctx to collect it.

NIR deals well with constant propagation, so it already covered for
the missing ones that this patch fixes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-16 16:33:39 -07:00
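A minimal sketch of the idea in d7849fd1da, not the actual GLSL IR code:
each branch starts from the state that was valid before the 'if', both
branches record their kills in one shared set, and the kills are applied
only to the code that follows. All names here (struct acp, struct branch,
visit_branch, apply_kills) are hypothetical.

```c
#include <stdbool.h>

#define NUM_VARS 4

/* Hypothetical propagation state: which variables hold a known constant. */
struct acp {
   bool known[NUM_VARS];
   int value[NUM_VARS];
};

/* A toy branch: the variables it overwrites with non-constant values. */
struct branch {
   bool kills[NUM_VARS];
};

/* Visit one branch on a private copy of the pre-'if' state and record its
 * kills in the shared set.  Returns whether `var` can still be replaced
 * by its constant at the start of the branch. */
static inline bool
visit_branch(const struct acp *in, const struct branch *b,
             bool shared_kills[NUM_VARS], int var)
{
   struct acp local = *in;   /* kills from the other branch are not seen */
   bool can_propagate = local.known[var];

   for (int i = 0; i < NUM_VARS; i++) {
      if (b->kills[i]) {
         local.known[i] = false;   /* affects only the rest of this branch */
         shared_kills[i] = true;   /* branches only write to the shared set */
      }
   }
   return can_propagate;
}

/* Apply the kills of both branches to the code after the 'if'. */
static inline void
apply_kills(struct acp *state, const bool kills[NUM_VARS])
{
   for (int i = 0; i < NUM_VARS; i++)
      if (kills[i])
         state->known[i] = false;
}
```

With x known before the 'if' and killed in the then-branch, both calls to
visit_branch report x as propagatable, and x only becomes unknown once
apply_kills runs for the code after the 'if'.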
Eric Anholt
229836fb37 v3d: Disable shader-db cycle estimates until we sort out TMU estimates.
I keep having to ignore these shader-db changes since I don't trust them,
so just disable the reports entirely.
2018-07-16 14:39:59 -07:00
Eric Anholt
2baab6bf2a v3d: Emit the lowered uniform just before its first use in a block.
total instructions in shared programs: 98578 -> 98119 (-0.47%)
instructions in affected programs:     27571 -> 27112 (-1.66%)

and it also eliminates most spills/fills on the CTS's randomized uniform
usage testcases.
2018-07-16 14:39:59 -07:00
Eric Anholt
26f830d9fc v3d: Add an assert that we don't provide an invalid texture return words.
The docs had an update noting this restriction, so reflect it in the code.
2018-07-16 14:39:59 -07:00
Eric Anholt
d661d78464 v3d: Apply GFXH-1625 restriction on TMUWT in the end of the shader.
This doesn't affect us yet since we're not doing TMUWTs, but I think we
will for GLES 3.1.
2018-07-16 14:39:59 -07:00
Sergii Romantsov
cec540fbc6 intel/batch_decoder: decoding of 3DSTATE_CONSTANT_BODY.
SNB doesn't have a definition of 3DSTATE_CONSTANT_BODY, which is
why we got a segmentation fault when using INTEL_DEBUG=bat.
Fixed by adding 3DSTATE_CONSTANT_BODY to the 3DSTATE_CONSTANT
VS, GS and PS structures.

v2: added definition of 3DSTATE_CONSTANT_BODY to the gen6.xml

Fixes: 169d8e011a (intel: Fix 3DSTATE_CONSTANT buffer decoding.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107190
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-16 12:18:36 -07:00
Marek Olšák
4054133dcc r600: fix build after the removal of RADEON_PRIO_* flags 2018-07-16 14:33:31 -04:00
Roland Scheidegger
b3474645d4 nir: fix msvc build
Empty initializer braces aren't valid C (it's a GNU extension, and
it's valid in C++).
Hopefully fixes appveyor / msvc build...

Fixes a3150c1d06
2018-07-16 20:07:53 +02:00
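For illustration only, the portable spelling that b3474645d4 alludes to
might look like the following. The struct is hypothetical and is not the
code touched by the commit. Empty braces were not accepted by the C
standards before C23, while {0} works with every C compiler.

```c
/* Hypothetical aggregate, only for illustration. */
struct swizzle {
   unsigned count;
   unsigned chan[4];
};

static inline struct swizzle
zero_swizzle(void)
{
   /* `struct swizzle s = {};` relies on a GNU extension in older C.
    * `{0}` sets the first member to 0 and zero-initializes the rest. */
   struct swizzle s = {0};
   return s;
}
```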
Jason Ekstrand
f378fa94b2 nir/worklist: Rework the foreach macro
This makes the arguments match the (thing, container) pattern used in
other nir_foreach macros and also renames it to make that a bit more
clear.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-16 11:02:10 -07:00
Eric Anholt
360714bfa5 intel: tools: Fix uninitialized variable warnings in intel_dump_gpu.
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-16 10:58:40 -07:00
Jason Ekstrand
5e030deaf2 spirv: Fix a couple of image atomic load/store bugs
For one thing, the NIR opcodes for image load/store always take and
return a vec4 value regardless of the image type.  We need to fix up
both the source and destination to handle it.  For another thing, we
weren't actually setting up a destination in the OpAtomicLoad case.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
2018-07-16 10:54:50 -07:00
Marek Olšák
f8aa116c3c winsys/amdgpu: clean up error handling in amdgpu_cs_submit_ib
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
6b1e0e51e6 radeonsi: rework RADEON_PRIO flags to be <= 31
This decreases sizeof(struct amdgpu_cs_buffer) from 24 to 16 bytes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
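A sketch of why capping the priority values at 31 can shrink a per-buffer
record: the usage mask then fits in 32 bits instead of 64. The struct
layouts and enum values below are assumptions for illustration, not the
real struct amdgpu_cs_buffer or RADEON_PRIO_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical priority values; the real list differs. */
enum prio { PRIO_FENCE = 0, PRIO_VERTEX_BUFFER = 12, PRIO_SHADER_RINGS = 31 };

/* With priorities above 31, the per-buffer mask needs 64 bits. */
struct cs_buffer_wide {
   void *bo;
   uint64_t priority_usage;
   uint32_t usage;
};

/* With priorities capped at 31, a 32-bit mask is enough. */
struct cs_buffer_narrow {
   void *bo;
   uint32_t priority_usage;
   uint32_t usage;
};

/* Bit in the 32-bit usage mask that corresponds to priority `p`. */
static inline uint32_t
prio_bit(enum prio p)
{
   assert(p <= 31);
   return UINT32_C(1) << p;
}
```

On a typical LP64 target the narrow layout avoids the tail padding of the
wide one, which is the kind of saving the commit message describes.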
Marek Olšák
54ad9b444c radeonsi: merge DCC/CMASK/HTILE priority flags
For a later simplification.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
3e6888e5d7 radeonsi: remove non-GFX BO priority flags
For a later simplification.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
342fff6cbc winsys/amdgpu: use alloca when using global_bo_list
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
6ec44b7055 winsys/amdgpu: remove label bo_list_error
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
7346e5296e winsys/amdgpu: always update gfx_bo_list_counter
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
caf41fb96d winsys/amdgpu: make amdgpu_cs_context::flags & handles local
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Gert Wollny
78887e99e3 mesa/virgl: Fix off-by-one and copy-paste error in multisample position evaluation
Converting from a switch statement that would not allow intermediate sample counts
to use an if-else chain went a bit wrong, so that in some cases the range that
should be inclusive was exclusive and the line for 16 samples was copied wrongly.

v2: elaborate commit message.

Fixes: 91f48cdfe5
       virgl: Add support for glGetMultisample
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1)
2018-07-16 12:51:39 +02:00
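A sketch of the kind of if-else chain 78887e99e3 describes, with inclusive
upper bounds so that intermediate sample counts round up to the next
table. The helper name and return convention are hypothetical and this is
not the virgl code.

```c
/* Size of the sample-position table to use for `sample_count`, or 0 if
 * the count is unsupported.  Hypothetical helper. */
static inline unsigned
sample_table_size(unsigned sample_count)
{
   if (sample_count <= 1)
      return 1;
   else if (sample_count <= 2)    /* inclusive: `< 2` would skip 2 itself */
      return 2;
   else if (sample_count <= 4)    /* 3 rounds up to the 4-sample table */
      return 4;
   else if (sample_count <= 8)
      return 8;
   else if (sample_count <= 16)   /* each line needs its own bound */
      return 16;
   return 0;
}
```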
Karol Herbst
4d0d911875 nouveau: fix 3D blitter for unsigned to signed integer conversions
fixes a couple of packed_pixel CTS tests. No regressions inside a CTS run.

v2: simplify the changes a bit

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-15 19:28:37 +02:00
Karol Herbst
87c8af2836 nir: fix printing of vec16 type
Fixes: 2f181c8c18
       "glsl_types: vec8/vec16 support"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-15 19:28:37 +02:00
Rob Clark
427a3dbdb1 nir/spirv: implement BuiltInWorkDim
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-15 07:51:13 +02:00
Karol Herbst
39180d3931 nir/spirv: print id for unsupported builtins
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-15 07:51:13 +02:00
Jason Ekstrand
daa78f30b6 intel/blorp: Handle 3-component formats in clears
This fixes a nasty hang in Batman: Arkham City which apparently calls
vkCmdClearColorImage on a linear RGB image.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-07-13 20:57:46 -07:00
Jason Ekstrand
11712b9ca1 intel/blorp: Fix blits to R8G8B8_UNORM_SRGB
In this case, the surface faking will give us a R8_UNORM surface and we
need to do an sRGB conversion in the shader.  Found by inspection.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-07-13 20:57:46 -07:00
Caio Marcelo de Oliveira Filho
4ec8b39fcd util/hash_table: add helper to remove entry by key
And the corresponding test case.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-13 14:20:49 -07:00
Jason Ekstrand
a3150c1d06 nir/lower_tex: Use nir_format_srgb_to_linear
A while ago, we added a bunch of format conversion helpers; we should
use them instead of hand-rolling sRGB conversions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-13 14:02:18 -07:00
Jason Ekstrand
b52d79514c vc4: Tell NIR to lower fdiv instructions
This should allow us to use them in nir_lower_tex

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-13 14:02:18 -07:00
Dylan Baker
53aca66874 docs: Update news, calendar, and relnotes for 18.1.4 2018-07-13 13:54:46 -07:00
Dylan Baker
97870f2cd0 docs: Add sha256 sums for 18.1.4 tarballs 2018-07-13 13:53:03 -07:00
Dylan Baker
e8df2f12d6 docs: Add release notes for 18.1.4 2018-07-13 13:53:01 -07:00
Eric Anholt
d009463a65 vc4: Switch to using u_transfer_helper for MSAA maps.
No requirement, just reduces code duplication.
2018-07-13 13:29:29 -07:00
Eric Anholt
afcc714c98 v3d: Work around GFXH-1461 bug losing our Z/S clears.
If you load S and clear Z or vice versa, the clear may get lost.  Just
fall back to drawing a quad.

Fixes KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8
2018-07-13 13:29:29 -07:00
Eric Anholt
162fcdad6a meson: Move xvmc test tools from unit tests to installed tools.
These are not unit tests, as they rely on the host's XVMC and some user
configuration.  Switch them over to being general installed tools, to fix
unit testing.

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-13 13:29:29 -07:00
Gert Wollny
695a4cb0f6 r600: Add spill output to group only if register or target index changes
The current spill code checks, for each instruction of an instruction group, whether
spilling is needed. If so, it adds a spill for each component as a separate
instruction and allocates a new temporary for each component. Because it takes
the write mask from the TGSI representation, all components might be written
each time, so already written components might be overwritten with
garbage like:

   ...
   y: MOV                R9.y,  [0x42140000 37].x
   t: MOV                R8.x,  [0x42040000 33].y
   ...
   MEM_SCRATCH  WRITE_IND_ACK 0     R9.xy__, @R4.x  ES:3
   MEM_SCRATCH  WRITE_IND_ACK 0     R8.xy__, @R4.x  ES:3
   ...

To resolve this issue, accumulate spills to the same memory location so that only one
memory write instruction is emitted for an instruction group that writes up to all
four components.

This fixes updated piglits (see https://patchwork.freedesktop.org/series/46064/):
   spec/glsl-1.30/execution
       fs-large-local-array-vec2.shader_test
       fs-large-local-array-vec3.shader_test
       fs-large-local-array-vec4.shader_test

v2: fix some typos and add comment about piglits (Roland Scheidegger)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
2018-07-13 21:11:34 +02:00
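The accumulation described in 695a4cb0f6 can be sketched roughly as
follows: consecutive writes to the same register are merged into one
scratch write whose mask is the union of the component masks. The types
and function are hypothetical and much simpler than the r600 code, which
also tracks the target index.

```c
#include <assert.h>
#include <stddef.h>

/* One pending scratch write; mask bit 0 is x, bit 3 is w. */
struct spill_write {
   unsigned reg;
   unsigned mask;
};

/* Merge the writes of one instruction group.  A new output entry is
 * started only when the register changes.  `out` must have room for
 * `count` entries.  Returns the number of entries written to `out`. */
static inline size_t
merge_spills(const struct spill_write *in, size_t count,
             struct spill_write *out)
{
   size_t n = 0;

   for (size_t i = 0; i < count; i++) {
      assert((in[i].mask & ~0xfu) == 0);
      if (n > 0 && out[n - 1].reg == in[i].reg)
         out[n - 1].mask |= in[i].mask;   /* same target: accumulate */
      else
         out[n++] = in[i];                /* register changed: new write */
   }
   return n;
}
```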
Nanley Chery
3b4279f772 i965/miptree: Allocate MS texture BOs as BUSY
These buffer objects are never accessed with the CPU.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:36:26 -07:00
Nanley Chery
7784a9ceac i965/miptree: Inline make_separate_stencil
Note that the separate stencil miptree now has the same alloc_flag as
the depth component. Only stencil renderbuffers (as opposed to textures)
have BO_ALLOC_BUSY.

v2: Add note about BO_ALLOC_BUSY in message (Topi).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:36:26 -07:00
Nanley Chery
74cf188985 i965/miptree: Init r8stencil_needs_update to false
The current behavior masked two bugs where the flag was not set to true
after modifying the stencil texture. One case was a regression
introduced with commit bdbb527a65 and
another was a bug in the depthstencil mapping code. These have since
been fixed.

To prevent such bugs from being masked in the future, initialize
r8stencil_needs_update to false.

v2: Keep the delayed allocation.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:36:19 -07:00
Nanley Chery
ffac81fa5c i965/miptree: Refactor miptree_create
Enable a future patch to create the r8stencil_mt in this function.

v2: Explicitly set etc_format to MESA_FORMAT_NONE (Topi).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
03cbaae03e i965/miptree: Add and use mt_surf_usage
v2: Make mt_fmt const (Topi).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
32b22592a8 i965/miptree: Share alloc_flags in miptree_create
Note that this maintains BO_ALLOC_BUSY for depth renderbuffers, but not
depth textures.

v2: Add note about BO_ALLOC_BUSY in message (Topi).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
2321e85759 i965/miptree: Share the miptree format in miptree_create
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
fbe01625f6 i965/miptree: Share tiling_flags in miptree_create
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
6c9947c3ef i965/miptree: Delete MIPTREE_CREATE_LINEAR
This enum constant was introduced to enable blit maps with
intel_miptree_create da2880bea0. Now that
such maps use the more direct make_surface function which allows you to
specify the tiling directly, the constant is no longer being used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
684fa59eb6 i965/miptree: Use make_surface in map_blit
Do this so that we don't have to special case linearly-tiled depth
buffers in miptree_create.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
63d428dc17 i965/draw: Fix adding the stencil bo to the depth cache
Fix the case where stencil writes are enabled on a depth stencil
texture. Found by inspection.

v2: Fix message to allow for depth stencil writes (Topi).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
be07cc43a2 i965/draw: Set the r8stencil flag after drawing
Fixes the regression introduced with commit
bdbb527a65
"i965: Use ISL for emitting depth/stencil/hiz state on gen6+"

Found by inspection.

Prevents regressing the piglit test, fbo-depth-array stencil-draw, later
on in this series.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
0eafe44ba7 i965/miptree: Set the r8stencil flag in map_depthstencil
Found by initializing the r8stencil_needs_update to false in
make_separate_stencil_surface.

Prevents regressing the piglit test arb_stencil_texturing-draw, later on
in the series.

Cc: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Nanley Chery
cef7ce07fa i965: Set the r8stencil flag in miptree_finish_write
This seems to be the most appropriate place.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-07-13 08:31:21 -07:00
Karol Herbst
cb65246ed2 nir: cleanup oversized arrays in nir_swizzle calls
There are no fixed-size array arguments in C; those are simply pointers
to unsized arrays, and as the size is passed in anyway, just rely on that.

Where possible, calls are replaced by nir_channel and nir_channels.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-13 15:46:57 +02:00
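To illustrate the point made in cb65246ed2: a parameter declared as
`const unsigned swiz[4]` is adjusted by the compiler to a plain pointer,
so the 4 is not enforced and callers may pass shorter arrays. The helper
below is a hypothetical example, not nir_swizzle itself.

```c
#include <assert.h>

/* Pack up to four 2-bit swizzle selectors into one integer.  The length
 * comes from `num_components`, not from the parameter's declared type. */
static inline unsigned
pack_swizzle(const unsigned *swiz, unsigned num_components)
{
   unsigned packed = 0;

   assert(num_components <= 4);
   for (unsigned i = 0; i < num_components; i++)
      packed |= (swiz[i] & 0x3u) << (2 * i);
   return packed;
}
```

A caller that needs only two channels can pass a two-element array without
padding it out to four entries.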
Nanley Chery
0288fe8d04 i965/miptree: Use the correct BLT pitch
Retile miptrees to a linear tiling less often. Retiling can cause issues
with imported BOs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106738
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-07-12 19:16:30 -07:00
Nanley Chery
3df201e3e8 i965/miptree: Drop an if case from retile_as_linear
Drop an if statement whose predicate never evaluates to true. row_pitch
belongs to a surface with non-linear tiling. According to
isl_calc_tiled_min_row_pitch, the pitch is a multiple of the tile width.
By looking at isl_tiling_get_info, we see that non-linear tilings have
widths greater than or equal to 128B.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-07-12 19:16:30 -07:00
Nanley Chery
0ab2541943 i965: Make blt_pitch public
We'd like to reuse this helper.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-07-12 19:16:30 -07:00
Caio Marcelo de Oliveira Filho
1f6ce1973a nir: delete not needed for reinserted nir_cf_list
It wasn't causing problems since there's nothing to delete, but better
be consistent with the rest of existing codebase.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho
13cfd6cc96 glsl: remove struct kill_entry in constant propagation
The only value in kill_entry is the writemask, which can be stored in
the data pointer of the hash table entry.

Suggested by Eric Anholt.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-07-12 14:03:51 -07:00
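A sketch of the trick mentioned in 13cfd6cc96: when the only payload is a
small write mask, it can live directly in the entry's data pointer instead
of in a separately allocated struct. The entry type below is a toy
stand-in, not Mesa's struct hash_entry.

```c
#include <stdint.h>

/* Toy stand-in for a hash table entry: a key plus an opaque pointer. */
struct entry {
   const void *key;
   void *data;
};

static inline void
entry_set_mask(struct entry *e, unsigned write_mask)
{
   /* Cast through uintptr_t so integer and pointer sizes match. */
   e->data = (void *)(uintptr_t)write_mask;
}

static inline unsigned
entry_get_mask(const struct entry *e)
{
   return (unsigned)(uintptr_t)e->data;
}
```

The value stored this way must never be dereferenced; it is only an
integer carried in pointer-sized storage.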
Caio Marcelo de Oliveira Filho
d6e869afe9 glsl: slim the kill_entry struct used in const propagation
Since 4654439fdd "glsl: Use hash tables for
opt_constant_propagation() kill sets.", a hash_table is used for storing
kill_entries, so the structs can be simplified.

Remove the exec_node from kill_entry since it is not used in an
exec_list anymore.

Remove the 'var' from kill_entry since it is now redundant with the
key of the hash table.

Suggested by Eric Anholt.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho
094225d69d i965: fix typo (wrong gen number) in comment
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho
fa0c19d17b util/set: helper to remove entry by key
v2: Add unit test. (Eric Anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho
b034facfbc util/set: add a clone function
v2: Add unit test. (Eric Anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho
8af0a45b47 util/set: add a basic unit test
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-12 14:03:51 -07:00
Marek Olšák
2e0b00ab7d radeonsi: add support for Vega20
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-12 16:48:12 -04:00
Eric Anholt
e8dc3c0c36 u_blitter: Add an option to draw the triangles using an index buffer.
For V3D, the HW will interpolate slightly differently along the shared
edge of the trifan.  The conformance tests manage to catch this in the
nearest_consistency_* group.  To get interpolation to match, we need the
last vertex of the triangle to be shared.

I first tried implementing draw_rectangle to do triangles instead, but
that was quite a bit (147 lines) of code duplication from u_blitter, and
this seems much simpler and less likely to break as u_blitter changes.

Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_* on V3D.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-12 11:49:22 -07:00
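A sketch of what drawing the blit rectangle through an index buffer can
look like: two triangles over four vertices, ordered so that both end on
the same vertex. The exact index values are an assumption made for
illustration; the commit message only says that the last vertex must be
shared.

```c
#include <assert.h>
#include <stdint.h>

/* Vertices 0..3 are the corners of the rectangle.  Both triangles end on
 * vertex 2.  Assumed values, for illustration only. */
static const uint8_t quad_indices[6] = { 0, 1, 2, 0, 3, 2 };

/* Last vertex of triangle `tri` (0 or 1) in the index list. */
static inline uint8_t
last_vertex(unsigned tri)
{
   assert(tri < 2);
   return quad_indices[tri * 3 + 2];
}
```

With 8-bit indices the whole list is 6 bytes, small enough to be passed as
user data rather than through a separately allocated buffer.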
Eric Anholt
c17dac0534 u_draw: Add some indices to the util_draw_elements() helpers.
These helpers have been unused, and were definitely not useful since
330d0607ed ("gallium: remove pipe_index_buffer and set_index_buffer")
made it so that they never had an index buffer passed in.

For an upcoming u_blitter change to use these helpers, I have just 6 bytes
of index data, so pass it as user data until a more interesting caller
comes along.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-12 11:49:20 -07:00
Eric Anholt
50a3a283d0 vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.
I had mistakenly used the COHERENT flag, which can only be set when
PERSISTENT is mapped, but isn't always.

Fixes: a2014c2eb9 ("vc4: Simplify the DISCARD_RANGE handling")
2018-07-12 11:31:08 -07:00
Eric Anholt
7714896256 v3d: Don't automatically reallocate a PERSISTENT-mapped buffer.
I had mistakenly used the COHERENT flag, which can only be set when
PERSISTENT is mapped, but isn't always.

Fixes piglit bufferstorage-persistent read
2018-07-12 11:31:08 -07:00
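The flag mix-up fixed in 50a3a283d0 and 7714896256 can be sketched as
below. COHERENT only appears together with PERSISTENT, but a persistent
mapping need not be coherent, so PERSISTENT is the flag to test. The flag
names and values are hypothetical stand-ins for the gallium transfer
flags.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the gallium transfer flags. */
#define MAP_DISCARD_RANGE (1u << 0)
#define MAP_PERSISTENT    (1u << 1)
#define MAP_COHERENT      (1u << 2)   /* only set along with PERSISTENT */

/* Whether a discard-range map may be served by reallocating the buffer. */
static inline bool
may_reallocate(unsigned usage)
{
   /* Testing MAP_COHERENT here would miss persistent, non-coherent maps. */
   return (usage & MAP_DISCARD_RANGE) && !(usage & MAP_PERSISTENT);
}
```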
Eric Anholt
e48c615292 v3d: Fix stride of 1D_ARRAY mappings.
All of our other texture arrays will be tiled, but 1D is an array of
raster mappings and we had the wrong value plugged in here.  Fixes piglit
getteximage-targets 1D_ARRAY
2018-07-12 11:31:08 -07:00
Eric Anholt
97ddeed949 v3d: Fix MRT blending with independent blending disabled.
We were only emitting the RT blend state for RT 0 and only enabling it for
RT 0, when the gallium API for !independent_blend is for rt0's state to
apply to all of them.

Fixes piglit fbo-drawbuffers-blend-add.
2018-07-12 11:31:08 -07:00
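A sketch of the rule 97ddeed949 relies on: when independent blending is
disabled, the state of render target 0 applies to every render target.
The structs are simplified stand-ins for the gallium blend state, and
MAX_RTS is an assumed limit.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RTS 4   /* assumed limit, for illustration */

struct rt_blend {
   bool blend_enable;
};

struct blend_state {
   bool independent_blend_enable;
   struct rt_blend rt[MAX_RTS];
};

/* State to program for render target `i`: rt[0] unless blending is
 * configured per target. */
static inline const struct rt_blend *
blend_for_rt(const struct blend_state *blend, unsigned i)
{
   assert(i < MAX_RTS);
   return &blend->rt[blend->independent_blend_enable ? i : 0];
}
```

A driver would loop over all bound render targets and emit blend_for_rt()
for each one instead of emitting state for target 0 only.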
Eric Anholt
e0dbbf9987 gallium/u_transfer_helper: Initialize the stride of MSAA maps.
We just never set the value that was returned for MSAA mappings (directly
reading back an MSAA framebuffer).  Since we're handing back ss_map, it
should be ss_map's stride from our nested transfer.

Fixes piglit /home/anholt/src/piglit/bin/fbo-depthstencil -samples=4
cases.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-07-12 11:31:06 -07:00
Eric Anholt
589bb5bd65 gallium/u_transfer_helper: Fix MSAA mappings with nonzero x/y.
We created a temporary with box->{width,height} and then tried to map
width,height from a nonzero offset when we meant to just map the whole
temporary.

Fixes segfaults in V3D in dEQP-GLES3.functional.prerequisite.read_pixels
with --deqp-egl-config-name=rgba8888d24s8ms4 and also piglit's read-front
clear-front-first -samples=4

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-07-12 11:31:00 -07:00
Jason Ekstrand
ccb8309516 util/rb_tree: Fix a compiler warning
Gcc 8 warns "cast to pointer from integer of different size" in 32-bit
builds.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-12 10:25:46 -07:00
Jose Maria Casanova Crespo
62f37ee53d i965/fs: unspills shoudn't use grf127 as dest since Gen8+
In 232ed89802 "i965/fs: Register allocator
shoudn't use grf127 for sends dest" we didn't take into account the case
of SEND instructions that are not send_from_grf. But since Gen7+, although
the backend still uses MRFs internally for sends, they are finally
assigned to GRFs.

In the case of unspills, the backend directly assigns the destination as
the source because it is supposed to be available. So we always have a
source-destination overlap. If the reg_allocator assigns registers that
include the grf127 we fail the validation rule that affects Gen8+
"r127 must not be used for return address when there is a src and dest
overlap in send instruction."

So this patch activates the grf127_send_hack_node for Gen8+ and if we
have any register spilled we add interferences to the destination of
the unspill operations.

We also need to make sure that the opt_bank_conflicts() optimization, which
runs after register allocation, doesn't move things around, causing the
grf127 to be used in the condition we were avoiding.

Fixes piglit test tests/spec/arb_compute_shader/linker/bug-93840.shader_test
and some shader-db crashes caused by the grf127 validation rule.

v2: make sure that opt_bank_conflicts() optimization doesn't change
the use of grf127. (Caio)

Found by Caio Marcelo de Oliveira Filho

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107193
Fixes: 232ed89802 "i965/fs: Register allocator shoudn't use grf127 for sends dest"
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-12 18:02:26 +02:00
Michel Dänzer
34e89e4d38 gallium: Check pipe_screen::resource_changed before dereferencing it
It's optional, only implemented by the etnaviv driver so far.

Fixes: 501d0edeca "st/mesa: call resource_changed when binding a
                     EGLImage to a texture"
Fixes: a37cf630b4 "gallium: add pipe_screen::resource_changed callback
                     wrappers"
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-07-12 17:39:12 +02:00
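The guard added in 34e89e4d38 follows the usual pattern for optional
driver hooks, sketched here with a toy screen struct. The types and the
wrapper name are hypothetical; only the idea of checking the function
pointer before calling it comes from the commit.

```c
#include <stddef.h>

struct resource;   /* opaque, hypothetical */

/* Toy screen with one optional hook. */
struct screen {
   void (*resource_changed)(struct screen *screen, struct resource *res);
};

/* Returns 1 if the hook was invoked, 0 if the driver does not provide it. */
static inline int
notify_resource_changed(struct screen *screen, struct resource *res)
{
   if (!screen->resource_changed)
      return 0;   /* optional hook: nothing to do */
   screen->resource_changed(screen, res);
   return 1;
}
```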
Jason Ekstrand
c2587ac4e5 docs/features: Add the missing KHR extensions
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 08:28:04 -07:00
Jason Ekstrand
55b68c4833 docs/features: Move the Vulkan 1.1 extensions to the 1.1 section
While we're at it, add some extensions we missed along the way like the
VK_KHR_maintenanceN extensions.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 08:28:04 -07:00
Jason Ekstrand
bc15d74529 docs/features: Mark some Vulkan extensions as done
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 08:28:04 -07:00
Karol Herbst
686e140ce0 nir/spirv: handle OpConstantComposites with OpUndef members
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-12 13:09:00 +02:00
Karol Herbst
154ef32e46 nir/spirv: implement BuiltInGlobalSize
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-12 13:09:00 +02:00
Karol Herbst
31cbcbdb87 nir: move lowering of SYSTEM_VALUE_LOCAL_GROUP_SIZE into a function
we already have this code duplicated and we will need it for the global
group size as well

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-12 13:09:00 +02:00
Karol Herbst
529aa9e646 compiler: add missing entries to gl_system_value_name
also reorder to match the gl_system_value enum.

It is weird that the STATIC_ASSERT doesn't trigger though.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-12 13:09:00 +02:00
Rob Clark
d4280561f5 nir/spirv: print extension name in fail msg
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-12 13:09:00 +02:00
Rob Clark
9ce0360f76 nir/spirv: Use imov where we might have 8 bit types
Otherwise nir_validate may complain about 8 bit floats, which do not exist.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-12 13:09:00 +02:00
Samuel Pitoiset
f1b3f7bfac radv: simplify the logic in radv_set_descriptor_set()
Now that 'set' can't be NULL because the meta operations no
longer bind a NULL descriptor, the logic can be simplified
a little bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 11:08:49 +02:00
Samuel Pitoiset
826b3a8773 radv: remove one useless check in radv_bind_descriptor_set()
'set' shouldn't be NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 11:08:47 +02:00
Samuel Pitoiset
6bfbc7b38b radv/meta: do not restore a NULL descriptor
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 11:08:45 +02:00
Samuel Pitoiset
5b32926f7e radv: remove unnecessary verification code around ring_offsets_idx
I don't want to waste CPU cycles for nothing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 11:08:42 +02:00
Samuel Pitoiset
6248fbe5e4 radv: get rid of buffer object priorities
We mostly use the same priority for all buffer objects, so
I don't think that matters much. This should reduce CPU
overhead a little bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 11:08:40 +02:00
Lucas Stach
501d0edeca st/mesa: call resource_changed when binding a EGLImage to a texture
When a EGLImage is newly bound to a texture, we need to make sure the
driver is informed that the resource might have changed. Fixes stale
texture content on Etnaviv when binding an existing EGLImage to an
existing texture object.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-12 11:02:04 +02:00
Samuel Pitoiset
1f616a840e radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9
A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion
counters) must immediately precede every timestamp event to
prevent a GPU hang on GFX9.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 10:22:36 +02:00
Samuel Pitoiset
3a16c722cf radv: add support for VK_KHR_create_renderpass2
VkCreateRenderPass2KHR() is quite similar to VkCreateRenderPass()
but refactoring the code is a bit painful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 10:20:10 +02:00
Samuel Pitoiset
fe28978f2a radv: introduce radv_subpass_attachment data structure
Needed for VK_KHR_create_renderpass2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 10:20:06 +02:00
Kenneth Graunke
c0874947f1 st/mesa: Only enable depth writes if the function isn't EQUAL.
If the depth function is EQUAL, then we'll only write the depth value
when it already matches what's in the buffer, which is pointless.
Skipping these writes can save bandwidth.

The state tracker can easily take care of this, so all drivers benefit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-11 11:23:20 -07:00
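The condition described in c0874947f1 can be sketched as a single
predicate. The enum is a hypothetical subset of the depth comparison
functions and the helper is not the state tracker's code.

```c
#include <stdbool.h>

/* Hypothetical subset of the depth comparison functions. */
enum depth_func { FUNC_LESS, FUNC_EQUAL, FUNC_LEQUAL, FUNC_ALWAYS };

static inline bool
depth_writes_enabled(bool depth_test, bool depth_mask, enum depth_func func)
{
   /* With EQUAL, a passing fragment would store the value that is
    * already in the buffer, so the write can be skipped. */
   return depth_test && depth_mask && func != FUNC_EQUAL;
}
```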
Chad Versace
be5fc0d7f1 anv/android: Fix type error in call to vk_errorf()
In a single call to vk_errorf() in the Android code, the arguments were
swapped. The bug has existed since day one. Chrome OS used to forgive
the warning, but it is now a compilation error.

CC: <mesa-stable@lists.freedesktop.org>
Fixes: 053d4c32 "anv: Implement VK_ANDROID_native_buffer (v9)"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-07-11 11:09:19 -07:00
Chad Versace
8e403bc959 anv/android: Fix Autotools build for VK_ANDROID_native_buffer
Changes to vk.xml and anv_entrypoints_gen.py broke the Autotools build
on Android. The changes undef'd the VK_ANDROID_native_buffer entrypoints
in anv_entrypoints.h.

Fix it with CPPFLAGS += -DVK_USE_PLATFORM_ANDROID_KHR.

CC: <mesa-stable@lists.freedesktop.org>
See-Also: 63525ba7 "android: enable VK_ANDROID_native_buffer"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-07-11 11:09:16 -07:00
Samuel Pitoiset
4a67ce886a radv: make sure to wait for CP DMA when needed
This might fix some synchronization issues. I don't know if
that will affect performance but it's required for correctness.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-11 12:11:56 +02:00
Rafael Antognolli
688d757e15 intel/tools/dump_gpu: Add option to print ppgtt mappings.
Using -vv will increase the verbosity, by printing the ppgtt mappings as
they get written into the aub file.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-10 09:05:44 -07:00
Neil Roberts
45106a1c93 spirv: Fix InterpolateAt* instructions for vecs with dynamic index
If the glsl is something like this:

  in vec4 some_input;
  interpolateAtCentroid(some_input[idx])

then it now gets generated as if it were:

  interpolateAtCentroid(some_input)[idx]

This is necessary because the index will get generated as a series of
nir_bcsel instructions so it would no longer be an input variable. It
is similar to what is done for GLSL in ca63a5ed3e.

Although I can’t find anything explicit in the Vulkan specs to say
this should be allowed, the SPIR-V spec just says “the operand
interpolant must be a pointer to the Input Storage Class”, which I
guess doesn’t rule out any type of pointer to an input.

This was found using the spec/glsl-4.40/execution/fs-interpolateAt*
Piglit tests with the ARB_gl_spirv branch.

Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

v2: update after nir_deref_instr landed on master. Implemented by
    Alejandro Piñeiro. Special thanks to Jason Ekstrand for guidance
    in the new nir_deref_instr world.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 11:43:40 +02:00
Francisco Jerez
18c086a9e6 intel/ir: Uncomment definition of several unused hardware opcodes.
There are a number of opcode_desc table entries for many of these
unused opcodes.  A symbolic opcode enum will be required in a future
commit in order to keep them in the opcode description tables.  The
alternative would be to remove the unused opcodes from the opcode
description tables.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
48d6fc5eb6 intel/fs: Initialize mlen for gen7 varying pull constant load messages.
This makes the message length available at the IR level, which should
save some guesswork in a future commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
6643143f6e intel/eu: Assert that the instruction is send-like in brw_set_desc_ex().
Constructing a descriptor in-place as part of the immediate of an ALU
instruction is no longer supported.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
6f81e2b994 intel/eu: Get rid of the return value of brw_send_indirect_message().
The return value is not used anymore.  This allows simplifying the
code slightly, and in addition it should frustrate anybody's attempts
to continue using the obsolete piecemeal approach to construct a
message descriptor in combination with brw_send_indirect_message().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
b3cce4c130 intel/eu: Get rid of the return value of brw_send_indirect_surface_message().
All users of brw_send_indirect_surface_message() should be providing a
full descriptor immediate up front by now, this isn't necessary
anymore.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
95b5367149 intel/eu: Use descriptor constructors for dataport typed surface messages.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
94166cef40 intel/eu: Use descriptor constructors for dataport scattered byte surface messages.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
2a9605d610 intel/eu: Use descriptor constructors for dataport untyped surface messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
8e707fc2af intel/eu: Provide single descriptor argument to brw_send_indirect_surface_message().
Instead of the current message_len, response_len and header_present
arguments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
b10b4e7c45 intel/eu: Use descriptor constructors for pixel interpolator messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
8fa4bc4676 intel/eu: Use descriptor constructors for dataport write messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
2bac890bf5 intel/eu: Use descriptor constructors for dataport read messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
27c211e30f intel/eu: Use descriptor constructors for sampler messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
1c90ae5acc intel/eu: Provide desc immediate argument up front to brw_send_indirect_message().
The current approach of returning a setup instruction where additional
descriptor fields can be specified is still supported in order to keep
things working, but it will be removed later in this series.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
b382bdde1d TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
c3793d49e4 intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls.
This replaces brw_set_message_descriptor() with the composition of
brw_set_desc() and a new inline helper function that packs the common
message descriptor controls into an integer.  The goal is to represent
all message descriptors as a 32-bit integer which is written at once
into the instruction, which is more flexible (SENDS anyone?), robust
(see d2eecf0b0b fixing an issue
ultimately caused by some bits of the extended message descriptor
being left undefined) and future-proof than the current approach of
specifying the individual descriptor fields directly into the
instruction.

This approach also seems more self-documenting, since it will allow
removing calls to functions with way too many arguments like
brw_set_*_message() and brw_send_indirect_message(), and instead
provide a single descriptor argument constructed from an appropriate
combination of brw_*_desc() helpers.
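
As a rough illustration of the "single descriptor argument" idea, the
sketch below packs the common controls into one 32-bit integer. The names
pack_field() and example_message_desc() are hypothetical, and the bit
positions are an assumption made for the example, not a statement of the
hardware layout or of the actual brw_*_desc() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: place `value` into bits low..high of a 32-bit word. */
static uint32_t pack_field(uint32_t value, unsigned high, unsigned low)
{
   /* Fields narrower than 32 bits only, so the shift below is well defined. */
   assert(high >= low && high < 32 && high - low < 31);
   const uint32_t mask = ((UINT32_C(1) << (high - low + 1)) - 1) << low;
   /* Reject values that do not fit instead of silently truncating them. */
   assert(value <= (mask >> low));
   return value << low;
}

/* Illustrative layout only (assumed for this sketch):
 * message length in 28:25, response length in 24:20, header flag in 19.
 */
static uint32_t example_message_desc(unsigned msg_length,
                                     unsigned response_length,
                                     bool header_present)
{
   return pack_field(msg_length, 28, 25) |
          pack_field(response_length, 24, 20) |
          pack_field(header_present, 19, 19);
}
```

The caller builds the whole descriptor up front and hands it over as one
value, so no field can be left undefined by a forgotten setter call.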

Note that because brw_set_message_descriptor() was (conditionally?)
overriding fields of the instruction which strictly speaking weren't
part of the message descriptor, this involves calling
brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition
to brw_set_desc().

v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
20b962232b intel/eu: Define SET_BITS helper more easily reusable than SET_FIELD.
Allows specifying a bitfield based on its upper and lower bounds
instead of a symbolic field definition, kind of what the current
GET_BITS macro is to GET_FIELD.
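
A minimal sketch of what bounds-based field helpers can look like. These
are hypothetical functions written for illustration (field_mask(),
set_bits(), get_bits()), not the macros added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask covering bits low..high (inclusive). */
static inline uint32_t field_mask(unsigned high, unsigned low)
{
   assert(high >= low && high < 32);
   /* Build the mask in 64 bits so a full 32-bit field does not overflow. */
   return (uint32_t)(((UINT64_C(1) << (high - low + 1)) - 1) << low);
}

/* Shift `value` into the field given by its upper and lower bit bounds. */
static inline uint32_t set_bits(uint32_t value, unsigned high, unsigned low)
{
   const uint32_t mask = field_mask(high, low);
   /* The value must fit in the field. */
   assert(value <= (mask >> low));
   return value << low;
}

/* Extract the field given by its upper and lower bit bounds. */
static inline uint32_t get_bits(uint32_t data, unsigned high, unsigned low)
{
   return (data & field_mask(high, low)) >> low;
}
```

Specifying the bounds directly avoids having to define a symbolic
shift/mask pair for every field that is only used once.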

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
d0f589a55b intel/eu: Define helper to specify the descriptor immediates of a SEND instruction.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
f55884cad3 intel/eu: Add brw_inst.h helpers for the SEND(C) descriptor and extended descriptor.
This introduces helpers that can be used to specify or extract the
whole descriptor of a SEND message instruction at once.  Because the
instruction encoding of these is rather awkward on some
generations, using the generic brw_inst.h macros doesn't seem like an
option.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Jordan Justen
1c8a045bfb i965: Support saving the gen program with glGetProgramBinary
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
eb5b4b0fd1 i965: Add flag_state param to brw_search_cache
This allows brw_search_cache to be used to find programs without
causing extra state to be emitted in the case where the program isn't
being made active. (For example, to find the program to save out with
the ARB_get_program_binary interface.)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
48ce7745dc mesa: Add gl_shader_program param to ProgramBinarySerializeDriverBlob
This might be required because some stages might generate different
programs depending on the other stages in the program. For example,
the i965 driver's tessellation control stage depends on the
tessellation evaluation shader.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
36dd15f8b3 i965: Add brw_populate_default_key
We will need to populate the default key for ARB_get_program_binary to
allow us to retrieve the default gen program to store in the program
binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
65f2014740 i965: Replace brw_setup_tex_for_precompile brw with devinfo
Trying to make sure the setup of the default program key is not
dependent on the GL state.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
e426286e21 i965: Regenerate blob without gen program for shader cache
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
3a133223b3 compiler/blob: Add blob_skip_bytes
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
8e7ee7433e i965: Add support for driver cache blob containing the gen program
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
05bb4b4849 i965: Use brw_prog_key_set_id in disk cache load/store code
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
170d76de9f i965: Add brw_prog_key_set_id helper to set the program id on any stage
For saving programs (shader cache; get program binary) it is useful to
set the id to 0, with the stage being a parameter.

For restoring programs it is useful to set the id to the id allocated
to the program at creation time.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:33 -07:00
Jordan Justen
1c1a7d11c8 i965: Add brw_stage_cache_id to map gl stages to brw cache_ids
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:32 -07:00
Jordan Justen
b9f9b35431 i965: Add brw_(read|write)_blob_program_data functions
We will want to use these for both the disk shader cache, and for the
ARB_get_program_binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:32 -07:00
Jordan Justen
1777c23abf i965: Add brw_program_deserialize_driver_blob
brw_program_deserialize_driver_blob will be a more generic form of
brw_program_deserialize_nir. In addition to nir, it will also be able
to extract gen binaries and upload them to the program cache.

In this commit, it continues to only support nir.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:32 -07:00
Jordan Justen
f4c154afc1 i965: Move brw_program_*serialize_nir to brw_program_binary.c
This will allow get_program_binary to add the gen program into its
serialization in addition to just the nir program.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:32 -07:00
Jordan Justen
cce3994dee mesa: Always call ProgramBinarySerializeDriverBlob
The driver may prefer to have a different blob for
ARB_get_program_binary compared to the version saved out for the disk
shader cache.

Since they both use the driver_cache_blob field, we need to always
give the driver the opportunity to fill in the driver_cache_blob when
saving the program binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:32 -07:00
Jordan Justen
6497be42b7 i965: Use ShaderCacheSerializeDriverBlob driver function
This function is called just before the gl_program::driver_cache_blob
is saved out as part of the gl_program serialization.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:32 -07:00
Jordan Justen
450f00e39d st/mesa: Use ShaderCacheSerializeDriverBlob driver function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:32 -07:00
Jordan Justen
c510dd22a9 st/mesa: Skip serializing driver_cache_blob if it exists
Previously the mesa core code would not call to serialize the
driver_cache_blob if it existed. We will update it to always call to
serialize the driver_cache_blob meaning we should avoid re-serializing
it under mesa/state_tracker.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:32 -07:00
Jordan Justen
2a55553be3 mesa: Add disk shader cache driver blob callback
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-09 23:02:28 -07:00
Iago Toral Quiroga
213491600a intel/compiler: emit actual barriers for working-group level barriers
Until now we have assumed that we could skip emitting these barriers
in the general case based on empirical testing and a few assumptions
detailed in a comment in the driver code, however, recent CTS tests
have shown that we actually need them to produce correct behavior.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 07:46:34 +02:00
Dave Airlie
0cab6e51e3 radv: add some cxxflags for new c++ file
Looks like I broke intel CI compiles.

Fixes: 6f3aee40f9 (radv: using tls to store llvm related info and speed up compiles (v10))
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-07-10 10:48:03 +10:00
Jason Ekstrand
dc1d10b396 anv,radv: Add support for VK_KHR_get_display_properties2
Reviewed-by: Keith Packard <keithp@keithp.com>
2018-07-09 17:09:41 -07:00
Jason Ekstrand
c0a27c5946 intel/aubinator_error_decode: Allow for more sections
Error states coming from actual Vulkan applications tend to have fairly
long command buffers and lots of chained batches.  30 total BOs isn't
nearly enough.  This commit bumps it to 256, makes some things use the
actual number of sections instead of the #define, and adds asserts if we
ever go over 256 sections.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 16:40:54 -07:00
Jason Ekstrand
5009e73bb1 intel/batch_decoder: Recurse for all 2nd level batches
Our attempt to restart the loop with the second level batch worked at
one point but later got broken.  It was too fragile anyway and
we're not likely to have enough secondaries to actually overflow the
stack so we may as well recurse in both cases.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 16:40:54 -07:00
Dave Airlie
45e25adfe8 virgl/vtest: add support to vtest for new cap getting.
The vtest protocol is pretty simple but also pretty dumb, and
the v1 caps query was fixed size, with no nice way to expand it,
however the server also ignores any command it doesn't understand.

So we can query v2 caps by sending a v2 followed by a v1, if the
v2 is ignored we know it's an old vtest server, and if we get
a v2 answer then we can just read the v1 answer and discard it.
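
The probing logic can be sketched as below. This is a simulation with
made-up names (server_answer(), client_query_caps()), not the vtest wire
format; the only behaviour assumed is the one described above, that an old
server silently ignores the v2 query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum caps_kind { CAPS_V1 = 1, CAPS_V2 = 2 };

/* Simulated server: the client always sends v2 followed by v1. An old
 * server ignores the unknown v2 command and only answers the v1 query.
 */
static size_t server_answer(bool supports_v2, enum caps_kind out[2])
{
   size_t n = 0;
   if (supports_v2)
      out[n++] = CAPS_V2;
   out[n++] = CAPS_V1;
   return n;
}

/* Client side: decide which caps to use from the first reply seen. */
static enum caps_kind client_query_caps(bool server_supports_v2)
{
   enum caps_kind replies[2];
   const size_t n = server_answer(server_supports_v2, replies);
   assert(n >= 1);

   if (replies[0] == CAPS_V2) {
      /* New server: the v1 answer still follows, read it and drop it. */
      assert(n == 2 && replies[1] == CAPS_V1);
      return CAPS_V2;
   }

   /* Old server: the v2 query was ignored, only v1 caps are available. */
   return CAPS_V1;
}
```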

Acked-by: Jakob Bornecrantz <jakob@collabora.com> (sounds good)
2018-07-10 09:07:37 +10:00
Anuj Phogat
2badf0e85b i965/icl: Don't set float blend optimization bit in CACHE_MODE_SS
CACHE_MODE_SS is not listed in gfxspecs table for user mode
non-privileged registers. So, making any changes from Mesa
will do nothing. Kernel is already setting this bit in
CACHE_MODE_SS register which is saved/restored to/from
the HW context image.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 15:38:42 -07:00
Anuj Phogat
c1d8300117 anv/icl: Don't set float blend optimization bit in CACHE_MODE_SS
CACHE_MODE_SS is not listed in gfxspecs table for user mode
non-privileged registers. So, making any changes from Mesa
will do nothing. Kernel is already setting this bit in
CACHE_MODE_SS register which is saved/restored to/from
the HW context image.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 15:38:42 -07:00
Jason Ekstrand
227dabc266 anv: Implement VK_EXT_vertex_attribute_divisor
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-09 15:37:51 -07:00
Jason Ekstrand
2caf6c0392 anv/pipeline: Add a per-VB instance divisor
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-09 15:37:51 -07:00
Jason Ekstrand
32f4feb5a0 anv/pipeline: Use a per-VB struct instead of separate arrays
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-09 15:37:51 -07:00
Jose Maria Casanova Crespo
6db20229ab anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage
Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+
using the VK_KHR_get_physical_device_properties2 functionality
to expose if the extension is supported or not.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo
0c01bf70e0 spirv/nir: Add support for SPV_KHR_8bit_storage
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo
f29c19cd5c spirv: Include headers and grammar for SPV_KHR_8bit_storage
Updates headers and grammar to ff684ffc6a35d2a58f0f63108877d0064ea33feb

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo
cd0afab99b i965/fs: Enable store_ssbo for 8-bit types.
v2: Update comment according to this patch. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo
11c904d0d3 intel/compiler: relax brw_eu_validate for byte raw movs
When the destination is a BYTE type allow raw movs
even if the stride is not an exact multiple of destination
type and exec type, execution type is Word and its size is 2.

This restriction was only allowing stride==2 destinations
for 8-bit types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo
87fc9af3fc i965/fs: Enable conversions to 8-bit integers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo
030472c1f0 i965: Support for 8-bit base types in helper functions
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo
232ed89802 i965/fs: Register allocator shouldn't use grf127 for sends dest
Since Gen8+ Intel PRM states that "r127 must not be used for return
address when there is a src and dest overlap in send instruction."

This patch implements this restriction creating new grf127_send_hack_node
at the register allocator. This node has a fixed assignation to grf127.

For vgrf that are used as destination of send messages we create node
interferences with the grf127_send_hack_node. So the register allocator
will never assign to these vgrf a register that involves grf127.

If dispatch_width > 8 we don't create these interferences because
all instructions have node interferences between sources and destination.
That is enough to avoid the r127 restriction.

This fixes CTS tests that raised this issue as they were executed as SIMD8:

dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom

Shader-db results on Skylake:
   total instructions in shared programs: 7686798 -> 7686797 (<.01%)
   instructions in affected programs: 301 -> 300 (-0.33%)
   helped: 1
   HURT: 0

   total cycles in shared programs: 337092322 -> 337091919 (<.01%)
   cycles in affected programs: 22420415 -> 22420012 (<.01%)
   helped: 712
   HURT: 588

Shader-db results on Broadwell:

   total instructions in shared programs: 7658574 -> 7658625 (<.01%)
   instructions in affected programs: 19610 -> 19661 (0.26%)
   helped: 3
   HURT: 4

   total cycles in shared programs: 340694553 -> 340676378 (<.01%)
   cycles in affected programs: 24724915 -> 24706740 (-0.07%)
   helped: 998
   HURT: 916

   total spills in shared programs: 4300 -> 4311 (0.26%)
   spills in affected programs: 333 -> 344 (3.30%)
   helped: 1
   HURT: 3

   total fills in shared programs: 5370 -> 5378 (0.15%)
   fills in affected programs: 274 -> 282 (2.92%)
   helped: 1
   HURT: 3

v2: Avoid duplicating register classes without grf127. Let's use a node
    with a fixed assignation to grf127 and create interferences to send
    message vgrf destinations. (Eric Anholt)
v3: Update reference to CTS VK_KHR_8bit_storage failing tests.
    (Jose Maria Casanova)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo
0e47ecb29a intel/compiler: grf127 can not be dest when src and dest overlap in send
Implement at brw_eu_validate the restriction from Intel Broadwell PRM,
vol 07, section "Instruction Set Reference", subsection "EUISA
Instructions", Send Message (page 990):

"r127 must not be used for return address when there is a src and
dest overlap in send instruction."
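
One way to read the restriction as a check, sketched with hypothetical
types (struct grf_range, send_breaks_r127_rule()). Treating the source
and destination as contiguous register ranges is an assumption of this
sketch, not a description of the brw_eu_validate code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a contiguous run of GRF registers. */
struct grf_range {
   unsigned first;   /* first register number */
   unsigned count;   /* number of registers, at least 1 */
};

static bool ranges_overlap(struct grf_range a, struct grf_range b)
{
   return a.first < b.first + b.count && b.first < a.first + a.count;
}

/* True when a send would write r127 while its source and destination
 * overlap, which is the combination the PRM text forbids.
 */
static bool send_breaks_r127_rule(struct grf_range src, struct grf_range dst)
{
   assert(src.count > 0 && dst.count > 0);
   const bool dst_touches_r127 =
      dst.first <= 127 && dst.first + dst.count > 127;
   return dst_touches_r127 && ranges_overlap(src, dst);
}
```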

v2: Style fixes (Matt Turner)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
2018-07-10 00:14:49 +02:00
Dave Airlie
6f3aee40f9 radv: using tls to store llvm related info and speed up compiles (v10)
This uses the common compiler passes abstraction to help radv
avoid fixed cost compiler overheads. This uses a linked list per
thread stored in thread local storage, with an entry in the list
for each target machine.
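
A minimal sketch of the per-thread list idea in plain C11. The struct and
function names are hypothetical, and the real entry would hold the LLVM
target machine and pass manager rather than a counter; entries are never
freed here, which a real implementation would need to handle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the expensive per-target compiler state. */
struct target_entry {
   struct target_entry *next;
   char name[32];
   unsigned setup_count;   /* how many times the costly setup ran */
};

/* One list head per thread, so lookups need no locking. */
static _Thread_local struct target_entry *thread_targets;

/* Return this thread's entry for `name`, creating it on first use. */
static struct target_entry *get_target_entry(const char *name)
{
   assert(strlen(name) < sizeof(((struct target_entry *)0)->name));

   for (struct target_entry *e = thread_targets; e; e = e->next) {
      if (strcmp(e->name, name) == 0)
         return e;   /* reuse: the fixed setup cost is not paid again */
   }

   struct target_entry *e = calloc(1, sizeof(*e));
   if (!e)
      return NULL;
   strcpy(e->name, name);
   e->setup_count = 1;   /* the expensive setup would happen here, once */
   e->next = thread_targets;
   thread_targets = e;
   return e;
}
```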

This should remove all the fixed overhead setup costs of creating
the pass manager each time.

This takes a demo app time to compile the radv meta shaders on nocache
and exit from 1.7s to 1s. It also has been reported to take the startup
time of uncached shaders on RoTR from 12m24s to 11m35s (Alex)

v2: fix llvm6 build, inline emit function, handle multiple targets
in one thread
v3: rebase and port onto new structure
v4: rename some vars (Bas)
v5: drag all code into radv for now, we can refactor it out later
for radeonsi if we make it shareable
v6: use a bit more C++ in the wrapper
v7: logic bugs fixed so it actually runs again.
v8: rebase on top of radeonsi changes.
v9: drop some C++ headers, cleanup list entry
v10: use pop_back (didn't have enough caffeine)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-10 07:58:03 +10:00
Adam Jackson
c1ec582059 swrast: Fix eglMakeCurrent(dpy, NULL, NULL, ctx) (v2)
Fixes 14 piglits, mostly in egl_khr_create_context.

v2: Also short-circuit the same-context-no-drawables case (Eric Anholt)

Fixes: https://github.com/anholt/libepoxy/issues/177
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-07-09 16:09:58 -04:00
Lionel Landwerlin
7205bdf41f intel: tools: dump_gpu: fix ppgtt mapping
We were not properly writing page tables when the virtual address
range spans multiple subtrees of the tables.
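
To illustrate the failure mode, the sketch below walks a range page by
page and re-derives the leaf table whenever the index crosses into the
next subtree. The constants (4 KiB pages, 512 entries per table) and the
function name are assumptions for the example, not taken from the tool.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12   /* assumed 4 KiB pages */
#define EXAMPLE_LEVEL_BITS 9    /* assumed 512 entries per table */

/* Count how many distinct leaf tables a virtual range touches. Looking
 * the table up only once, for the start address, would miss every table
 * after the first when the range crosses a subtree boundary.
 */
static unsigned count_leaf_tables(uint64_t start, uint64_t size)
{
   assert(size > 0);
   const uint64_t first_page = start >> EXAMPLE_PAGE_SHIFT;
   const uint64_t last_page = (start + size - 1) >> EXAMPLE_PAGE_SHIFT;

   unsigned tables = 0;
   uint64_t current = UINT64_MAX;   /* no table selected yet */

   for (uint64_t page = first_page; page <= last_page; page++) {
      const uint64_t table = page >> EXAMPLE_LEVEL_BITS;
      if (table != current) {
         /* Crossed into another subtree: switch to its table. */
         current = table;
         tables++;
      }
   }
   return tables;
}
```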

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-09 21:08:08 +01:00
Eric Anholt
beeb94402f v3d: Implement noperspective varyings on V3D 4.x.
Fixes a bunch of piglit interpolation tests, and reduces my concern about
some MSAA blit shaders with noperspective varyings.
2018-07-09 11:48:32 -07:00
Eric Anholt
4b4795be9d v3d: Refactor flat shade/centroid flag emission.
The logic was duplicated in a pretty gross way, when what we really need
is just a helper function for stuffing the values in the packet.  This
will make implementing noperspective easier.
2018-07-09 11:48:32 -07:00
Eric Anholt
93f437d128 v3d: Fix typo in dither mode offset.
We weren't using the field yet, so it didn't affect anything.

Fixes: c0476d964a ("v3d: Express dithering mode in the same way that the CLIF parser does.")
2018-07-09 11:48:32 -07:00
zhaowei yuan
73ec437627 glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2
"sampler2DRect" and "sampler2DRectShadow" are specified as
reserved from GLSL 1.1 and GLSL ES 1.0

Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106906
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 34f7e761bc ("glsl/parser: Track built-in types using the glsl_type directly")
2018-07-09 11:37:08 -07:00
Charmaine Lee
097952abaa st/wgl: check for NULL piAttribList in wglCreatePbufferARB()
Java2d opengl pipeline passes NULL piAttribList to
wglCreatePbufferARB(). So skip parsing the attribute list
if it is NULL.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-07-06 17:32:49 -07:00
Jason Ekstrand
a695de5845 anv: Add support for VK_KHR_create_renderpass2
The implementation of CreateRenderPass2 uses the helpers we broke out in
previous commits.  The implementations of the new vkCmd functions just
call the old versions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
208be8eafa anv: Make subpass::depth_stencil_attachment a pointer
This makes certain checks a bit easier and means that we don't have
the attachment information duplicated in the attachment list and in
depth_stencil_attachment.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
75e308fc44 anv/pass: Move implicit dependency setup to anv_render_pass_compile
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
144626946e anv/pass: Move some dependency setup into a helper
This new helper takes a VkSubpassDependency2KHR for future-proofing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
6f9485d21f anv/pass: Move a bunch of analysis into a separate "compile" stage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
55285b8404 anv/pass: Use a designated initializer for attachments
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
6c746e8fea anv: Bump the advertised patch version to 80
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Adam Jackson
d257ec0136 glx: Don't allow glXMakeContextCurrent() with only one valid drawable
Drawable and readable need to either both be None or both be non-None.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-09 12:03:18 -04:00
Erik Faye-Lund
af6b7bf236 mesa: verify MaxVertexAttribStride for GLES 3.1
The OpenGL ES 3.1 specification, Table 20.41 ("Implementation
Dependent Values"), defines the minimum-maximum value for
MAX_VERTEX_ATTRIB_STRIDE to be 2048.

So we shouldn't enable OpenGL ES 3.1 on implementations where this
isn't the case. Let's add a check for this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-09 17:32:31 +02:00
Erik Faye-Lund
2e64a2f2d1 mesa: verify MaxVertexAttribStride for GL 4.4
The OpenGL 4.4 specification, Table 23.55 ("Implementation
Dependent Values"), defines the minimum-maximum value for
MAX_VERTEX_ATTRIB_STRIDE to be 2048.

So we shouldn't enable OpenGL 4.4 on implementations where this isn't
the case. Let's add a check for this.
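
The shape of such a check, as a hedged sketch. The struct, the function
name and the version encoding (major * 10 + minor) are hypothetical; only
the 2048 minimum comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the implementation limits. */
struct example_limits {
   unsigned max_vertex_attrib_stride;
};

/* Versions are encoded as major * 10 + minor, e.g. 44 for GL 4.4.
 * If the stride limit is below the required minimum-maximum, fall back
 * to the last version that does not require it.
 */
static unsigned clamp_gl_version(unsigned version,
                                 const struct example_limits *limits)
{
   assert(limits != NULL);
   if (version >= 44 && limits->max_vertex_attrib_stride < 2048)
      return 43;
   return version;
}
```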

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-09 17:32:31 +02:00
Erik Faye-Lund
747cf468ff r600: report incorrect max-vertex-attrib for GL 4.4
OpenGL 4.4 requires a max vertex attrib stride of 2048 or higher, but
r600 only supports 2047. Technically, this makes it a GL 4.3 GPU,
but it's currently exposing GL4.4.

To avoid regressing the GL version supported in the following
patches, let's just lie and pretend like we support 2048. Any
applications using 2048 are already broken anyway.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-09 17:32:31 +02:00
Jose Maria Casanova Crespo
6706b421f0 intel/fs: use uint type for per_slot_offset at GS
This helps us to compact original instruction:

mul(8)  g3<1>D  g6<8,8,1>UD  0x00000006UD { align1 1Q };

So now we emit:

mul(8)  g3<1>UD g6<8,8,1>UD  0x00000006UD { align1 1Q compacted };

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-07-09 15:28:48 +02:00
Samuel Pitoiset
e8f82b33fb radv: add the trace BO to the list when starting a new cmdbuf
That might reduce CPU overhead a little bit when using
RADV_TRACE_FILE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-09 13:57:01 +02:00
Samuel Pitoiset
5e5a28d52a radv: reduce CPU overhead in radv_flush_descriptors()
The number of enabled descriptors for a given pipeline stage
can be computed at compile time.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-09 13:56:58 +02:00
Iago Toral Quiroga
81ca08e030 intel/compiler: remove unused function
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 13:21:48 +02:00
Iago Toral Quiroga
449c22004c anv/pipeline: honor the pipeline_cache_enabled run-time flag
v2: merge both conditions to reduce the diff (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 08:40:26 +02:00
Roland Scheidegger
817efd8968 r600/sb: fix crash in fold_alu_op3
fold_assoc() called from fold_alu_op3() can lower the number of src to 2,
which then leads to an invalid access to n.src[2]->gvalue().
This didn't seem to have caused much harm in the past, but on Fedora 28
it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although
with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was
needed to show the issue).
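
The pattern behind the crash, reduced to a sketch in C with hypothetical
names. The folding rule in fold_assoc_example() is invented for the
example, and returning false after the source count shrinks is only one
possible choice (the right return value is the open question discussed
below).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical node with up to three sources. */
struct alu_node {
   unsigned num_src;
   int src[3];
};

/* Stand-in for a folding step that may turn a three-source op into a
 * two-source one. The rule used here is made up for the example.
 */
static void fold_assoc_example(struct alu_node *n)
{
   if (n->num_src == 3 && n->src[2] == 0)
      n->num_src = 2;
}

/* Re-check the source count before touching the third source. */
static bool fold_op3_example(struct alu_node *n, int *third_src)
{
   assert(n->num_src == 3);
   fold_assoc_example(n);

   if (n->num_src < 3)
      return false;   /* no longer an op3, src[2] must not be read */

   *third_src = n->src[2];
   return true;
}
```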

An alternative fix would be to instead call fold_alu_op2() from within
fold_assoc() when the number of src is reduced and return always TRUE
from fold_assoc() in this case, with the only actual difference being
the return value from fold_alu_op3() then. I'm not sure what the return
value actually should be in this case (or whether it even can make a
difference).

https://bugs.freedesktop.org/show_bug.cgi?id=106928
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-09 07:17:29 +01:00
Jason Ekstrand
7c92c7d151 vulkan: Update the XML and headers to 1.1.80
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-08 21:39:18 -07:00
Lionel Landwerlin
420bf14e12 i965: fix clear color bo address relocation
Fixes: 7987d041fd ("i965/surface_state: Emit the clear color address instead of value.")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-07 20:54:55 +01:00
Mauro Rossi
1a1f2b134c radv: winsys/amdgpu: include missing pthread.h header
pthread types are used in some files without explicitly including pthread.h.
This leads to compile errors on Android 7.x nougat-x86
e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h

In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31:
In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32:
external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t'
        pthread_mutex_t global_bo_list_lock;
        ^
1 error generated.

Including pthread.h explicitly solves the build error

Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-07 20:53:59 +02:00
Karol Herbst
de13978733 nv50/ir: fix Instruction::isActionEqual for PHI instructions
phi instructions don't have the same results by simply having the same sources.
They need to be inside the same BasicBlock or share an equal condition
resulting in a path through the shader selecting equal sources as well.

short example:

cond = ...;
const0 = 0;
const1 = 1;

if (cond) {
  ssa_1 = const0;
} else {
  ssa_2 = const1;
}
ssa_3 = phi ssa_1 ssa_2;

if (!cond) {
  ssa_4 = const0;
} else {
  ssa_5 = const1;
}
ssa_6 = phi ssa_4 ssa_5;

Although both phis actually have sources with equal results, merging them
would be wrong because a different condition selects which source to
take.
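
The example above can be replayed with a small Python sketch. The function
names are made up for illustration and are not nv50/ir code; each function
stands in for one if/else diamond followed by its phi.

```python
CONST0 = 0
CONST1 = 1


def phi_after_first_branch(cond):
    # if (cond) ssa_1 = const0; else ssa_2 = const1;
    # ssa_3 = phi ssa_1 ssa_2
    return CONST0 if cond else CONST1


def phi_after_second_branch(cond):
    # if (!cond) ssa_4 = const0; else ssa_5 = const1;
    # ssa_6 = phi ssa_4 ssa_5
    return CONST0 if not cond else CONST1


def phis_agree(cond):
    # Both phis list the same two source values, yet the selecting
    # condition is inverted, so they never produce the same result.
    return phi_after_first_branch(cond) == phi_after_second_branch(cond)
```

For either value of cond the two phis disagree, which is why comparing only
the sources is not enough to treat them as equal.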

For now we also stick an assert into GlobalCSE, because it should never end up
having to merge phi instructions.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-07-07 20:32:33 +02:00
Rhys Perry
f2cc694d8e nvc0/ir: use the combined tid special register
total instructions in shared programs : 5804448 -> 5804690 (0.00%)
total gprs used in shared programs    : 670065 -> 670065 (0.00%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21068 -> 21068 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0           0           0           5           5
      hurt           0           0           0         191         191

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-07-07 20:31:56 +02:00
Jason Ekstrand
6e88561156 nir/print: Print texture and sampler indices
Commit 5fb69daa6076e56b deleted support from nir_print for printing the
texture and sampler indices on texture instructions.  This commit just
brings it back as best as we can.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-07 09:32:33 -07:00
Ian Romanick
f8e54d02f7 intel/compiler: Relax mixed type restriction for saturating immediates
At the time of commit 7bc6e455e2 (i965: Add support for saturating
immediates.) we thought mixed type saturates would be impossible.  We
were only thinking about type converting moves from D to F, for
example.  However, type converting moves w/saturate from F to DF are
definitely possible.  This change minimally relaxes the restriction to
allow cases that I have been able to trigger via piglit tests.

Fixes new piglit tests:
 - arb_gpu_shader_fp64/execution/built-in-functions/fs-sign-sat-neg-abs.shader_test
 - arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-sat-neg-abs.shader_test

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-06 16:20:10 -07:00
Ian Romanick
9626ea497d i965/vec4: Properly handle sign(-abs(x))
This is achieved by copying the sign(abs(x)) optimization from the FS
backend.

On Gen7 and earlier platforms, this fixes new piglit tests:

 - glsl-1.10/execution/vs-sign-neg-abs.shader_test
 - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-06 16:20:07 -07:00
Ian Romanick
88bd37c010 i965/fs: Properly handle sign(-abs(x))
Fixes new piglit tests:

 - glsl-1.10/execution/fs-sign-neg-abs.shader_test
 - glsl-1.10/execution/fs-sign-sat-neg-abs.shader_test
 - glsl-1.10/execution/vs-sign-neg-abs.shader_test
 - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-06 16:20:04 -07:00
Lionel Landwerlin
c05c8d65ba vulkan: utils: handle hexadecimal values in registry
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-06 22:12:00 +01:00
Marek Olšák
0eaf069679 st/dri: fix a crash in server_wait_sync
Ported from i965 including the comment.

This fixes:
    dEQP-EGL.functional.reusable_sync.valid.wait_server

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-07-06 16:23:37 -04:00
Mathieu Bridon
b39bdb0716 python: Stop using the Python 2 exception syntax
We could have made this compatible with Python 3 by using:

    except Exception as e:

But since none of this code actually uses the exception objects, let's
just drop them entirely.
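
As a minimal sketch of the form that works on both Python 2 and Python 3
when the exception object is not needed (parse_int is a hypothetical helper,
not one of the scripts touched here):

```python
def parse_int(text, default=0):
    """Return int(text), or default when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # No "as e" binding: the exception object is never used, so the
        # clause reads the same on Python 2 and Python 3.
        return default
```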

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-06 10:18:43 -07:00
Mathieu Bridon
e5a8d51e54 python: Use spaces, not tabs
Python 3 doesn't allow mixing spaces and tabs in a script, unlike
Python 2.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-06 10:04:55 -07:00
Mathieu Bridon
0f7b18fa0d python: Use the print function
In Python 2, `print` was a statement, but it became a function in
Python 3.

Using print functions everywhere makes the script compatible with Python
versions >= 2.6, including Python 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-06 10:04:22 -07:00
Jon Turney
b3a42fa066 vma/tests: Fix compilation if limits.h defines PAGE_SIZE (v2)
per POSIX, limits.h may define PAGE_SIZE when the value is not indeterminate

v2: just change the variable name, since there's no intended correlation
here between this value and the machine's actual page size.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-07-06 14:01:08 +01:00
Samuel Pitoiset
85865dbe0d radv: fix emitting the view index on GFX9
For merged shaders, VS as HS for example.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-06 10:22:53 +02:00
Ian Romanick
965a06dbd7 i965/vec4: Make the vec4_visitor::nir_emit_instr default case unreachable
The bug fixed by the previous commit went undetected because extra
stderr messages are not flagged by the CI.  Copy the solution from
fs_visitor::nir_emit_instr and mark the default case unreachable.

An alternate solution is to delete the default case so that the compiler
will issue a warning.  That may require more work since there are other
(impossible) cases that exist.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-05 21:13:32 -07:00
Ian Romanick
a4d4787327 intel/compiler: More DCE after lowering
Some of the lowering passes, nir_lower_locals_to_regs for example, can
cause some previously live code to be dead.  This pass in particular
leaves a bunch of nir_instr_type_deref instructions floating around.
This causes shader-db runs on Gen5 through Haswell to spew tons of
messages like:

    VS instruction not yet implemented by NIR->vec4

UnrealEngine4/EffectsCaveDemo/239.shader_test is one shader that
generates these messages.  Cleaning up the dead code fixes that.

To verify, I did a shader-db before and after.  Even though all the
messages are gone, the results make my brain hurt. :(

Haswell
total cycles in shared programs: 411890163 -> 411891145 (<.01%)
cycles in affected programs: 57016 -> 57998 (1.72%)
helped: 3
HURT: 11
helped stats (abs) min: 2 max: 154 x̄: 96.67 x̃: 134
helped stats (rel) min: 0.08% max: 2.23% x̄: 1.42% x̃: 1.96%
HURT stats (abs)   min: 18 max: 686 x̄: 115.64 x̃: 20
HURT stats (rel)   min: 0.81% max: 7.12% x̄: 1.87% x̃: 0.93%
95% mean confidence interval for cycles value: -51.39 191.67
95% mean confidence interval for cycles %-change: -0.14% 2.46%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total cycles in shared programs: 259114802 -> 259115032 (<.01%)
cycles in affected programs: 24034 -> 24264 (0.96%)
helped: 1
HURT: 9
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08%
HURT stats (abs)   min: 18 max: 48 x̄: 25.78 x̃: 20
HURT stats (rel)   min: 0.80% max: 1.94% x̄: 1.08% x̃: 0.80%
95% mean confidence interval for cycles value: 12.42 33.58
95% mean confidence interval for cycles %-change: 0.54% 1.38%
Cycles are HURT.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 5a02ffb733 nir: Rework lower_locals_to_regs to use deref instructions
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-05 21:13:21 -07:00
Eric Anholt
9d0406c52f v3d: Fix leak of the default attributes BOs.
The GLES3 CTS makes a lot more progress on a run now.
2018-07-05 15:50:54 -07:00
Eric Anholt
6b11131373 v3d: Fix leak of the spill BO on context destruction.
2018-07-05 15:50:52 -07:00
Eric Anholt
4b2ba18ff3 nir: Apply fragment color clamping to gl_FragData[] as well.
From the ARB_color_buffer_float spec:

   35. Should the clamping of fragment shader output gl_FragData[n]
       be controlled by the fragment color clamp.

       RESOLVED: Since the destination of the FragData is a color
       buffer, the fragment color clamp control should apply.

Fixes arb_color_buffer_float-mrt mixed on v3d.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-07-05 12:39:36 -07:00
Eric Anholt
03f6d26b62 v3d: Skip emitting per-RT blend state for RTs with blend disabled.
Cleans up the CL of fbo-drawbuffers2-blend a bit.  We could do better on
more complicated cases by noticing if multiple RTs have the same blend
state and emitting them in a single packet.
2018-07-05 12:39:36 -07:00
Eric Anholt
572f6ab489 v3d: Add proper support for GL_EXT_draw_buffers2's blending enables.
I had flagged it as enabled on V3D 4.x, but not actually implemented the
per-RT enables.  Fixes piglit fbo-drawbuffers2-blend.
2018-07-05 12:39:36 -07:00
Eric Anholt
5601ab3981 v3d: Add support for GL_SAMPLE_ALPHA_TO_ONE.
Fixes piglit ext_framebuffer_multisample-draw-buffers-alpha-to-one
2018-07-05 12:39:36 -07:00
Eric Anholt
7b63371420 v3d: Respect swap_color_rb for the f32_color_rb case.
We don't actually set the two flags together, but I want to use the
r/g/b/a reordered fields in the next commit.
2018-07-05 12:39:36 -07:00
Eric Anholt
dbd52585fa st/nir: Disable varying packing when doing transform feedback.
The varying packing would result in st_nir_assign_var_locations() picking
new driver_locations, despite the pipe_stream_output already being set up
for the old driver location.  This left the gallium driver with no way to
work back to what varying was referenced by pipe_stream_output.

Fixes these tests on V3D:
dEQP-GLES3.functional.transform_feedback.random.separate.points.3
dEQP-GLES3.functional.transform_feedback.random.separate.points.7
dEQP-GLES3.functional.transform_feedback.random.separate.points.9
dEQP-GLES3.functional.transform_feedback.random.separate.triangles.3
dEQP-GLES3.functional.transform_feedback.random.separate.triangles.8

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-05 12:38:27 -07:00
Jon Turney
ab7aa0f10c meson: Set with_dri from with_gallium when DRI glx is explicitly configured
Set with_dri from with_gallium when DRI GLX is explicitly configured, as
well as when DRI GLX is chosen automatically.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-05 17:48:35 +01:00
Samuel Pitoiset
72fd93370f radv/winsys: make use of radeon_emit()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-05 17:23:25 +02:00
Samuel Pitoiset
f2a310849e radv: only flush CB meta in pipeline image barriers when needed
If the given image doesn't enable CMASK, FMASK or DCC, it's
useless to flush CB metadata.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-05 17:20:16 +02:00
Samuel Pitoiset
17bb4c2cf5 radv: only flush DB meta in pipeline image barriers when needed
If the given image doesn't have HTILE, it's useless to flush
DB metadata.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-05 17:20:12 +02:00
Samuel Pitoiset
2a3e9c89ff radv: fix "error: initializer element is not constant" build error
GCC 4.8 fails to compile with "static const", while GCC 8.1
fails to compile with only "static".

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-05 17:12:02 +02:00
Lionel Landwerlin
78d5c1c82a util: u_queue: fix android build error
mesa/src/util/u_queue.c:242:15: error: address of array 'queue->name'
  will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]

Fixes: b238e33bc9 "kutil/queue: add a process name into a thread name"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-05 15:42:26 +01:00
Benedikt Schemmer
93a5c9bc99 Util: fix msvc build
The MSVC preprocessor doesn't understand #warning

Fixes: 2e1e6511f7 ("util: extract get_process_name from xmlconfig.c")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-07-05 14:24:08 +01:00
Mathieu Bridon
f9b6dfd919 python: Specify the JSON separators
On Python 2, the default JSON separators are ', ' for items and ': ' for
dicts.

On Python 3, the default is the same when no indent is specified, but if
one is (and we do specify one) then the default items separator becomes
',' (the dict separator remains unchanged).

This change explicitly specifies the Python 3 default, which helps
ensure that the output is identical, whether it was generated by
Python 2 or 3.
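
A minimal sketch of the idea, assuming a helper named dump_stable that is
not part of the Mesa scripts: passing separators explicitly means an
indented dump has no trailing space after each item, whichever interpreter
produced it.

```python
import json


def dump_stable(obj):
    # (',', ': ') is the Python 3 default when indent is given; spelling
    # it out keeps the item separator free of a trailing space.
    return json.dumps(obj, indent=4, sort_keys=True,
                      separators=(',', ': '))


text = dump_stable({"b": 1, "a": [1, 2]})
```

The resulting text still round-trips through json.loads(), so only the
whitespace is affected.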

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-05 12:52:38 +01:00
Mathieu Bridon
fe8a153648 python: Stabilize some script outputs
In Python, dictionaries and sets are unordered, and as a result there
is no guarantee that running this script twice will produce the same
output.

Using ordered dicts and explicitly sorting items makes the build more
reproducible, and will make it possible to verify that we're not
breaking anything when we move the build scripts to Python 3.
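
A small sketch of the sorting approach, using a hypothetical emit_entries
helper rather than any of the real generator scripts:

```python
def emit_entries(entries):
    """Render a name -> value mapping as text in a fixed order."""
    lines = []
    # Iterating over sorted keys makes the output independent of the
    # order in which the dict or set happened to be filled.
    for name in sorted(entries):
        lines.append("%s = %d" % (name, entries[name]))
    return "\n".join(lines)
```

Two mappings with the same contents then produce byte-identical output,
which is what makes a before/after comparison of generated files possible.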

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-05 12:52:12 +01:00
Lionel Landwerlin
d337713ec4 intel: tools: remove drm-uapi defines
We already embed the headers, no need to redefine defines/structs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
87915baa23 intel: intel_dump_gpu: use simulator id in captures
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
aab21cedc6 intel: devinfo: add simulator id
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Scott D Phillips
0f53948c59 intel: tools: dump-gpu: dump 48-bit addresses
For gen8+, write out PPGTT tables in aub files so that full 48-bit
addresses can be serialized.

v2: Fix handling of `end` index in map_ppgtt

v3: Correctly mark GGTT entry as present (Rafael)

Signed-off-by: Scott D Phillips <scott.d.phillips@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
6e37b949d5 intel: tools: import intel_aubdump
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
fa00b9c1c9 intel: tools: update intel_aub.h
Scott added new stuff in IGT.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
5ffa35b64d intel: batch-decoder: add missing return line
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
28476c9d81 intel: batch-decoder: don't ask for constant BO until decoding
With PPGTT mappings, our aubinator implementation can be quite slow if
we request a buffer that doesn't exist. Instead of doing a PPGTT walk
for invalid addresses (0 lengths), wait until we're sure we want to
decode the data.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Scott D Phillips
c262ec19d0 intel/batch-decoder: handle non-contiguous binding table / surface state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-05 11:57:45 +01:00
Scott D Phillips
3ebee627cb intel/tools/aubinator: aubinate ppgtt aubs
v2: by Lionel
    Fix memfd_create compilation issue
    Fix pml4 address stored on 32 instead of 64bits
    Return no buffer if first ppgtt page is not mapped

v3: Drop additional memfd_create() (Rafael)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
3228335b55 intel: aubinator: handle GGTT mappings
We use memfd to store physical pages as they get read/written to and
the GGTT entries translating virtual address to physical pages.

Based on a commit by Scott Phillips.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Jason Ekstrand
2602ea89d5 util: rb-tree: A simple, invasive, red-black tree
This is a simple, invasive, liberally licensed red-black tree
implementation. It's an invasive data structure similar to the
Linux kernel linked-list where the intention is that you embed an
rb_node struct in the data structure you intend to put into the
tree.

The implementation is mostly based on the one in "Introduction to
Algorithms", third edition, by Cormen, Leiserson, Rivest, and
Stein. There were a few other key design points:

 * It's an invasive data structure similar to the [Linux kernel
   linked list].

 * It uses NULL for leaves instead of a sentinel. This means a few
   algorithms differ a small bit from the ones in "Introduction to
   Algorithms".

 * All search operations are inlined so that the compiler can
   optimize away the function pointer call.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
144b40db54 intel: aubinator: drop the 1Tb GTT mapping
Now that we're softpinning the address of our BOs in anv & i965, the
addresses selected start at the top of the addressing space. This is a
problem for the current implementation of aubinator which uses only a
40bit mmapped address space.

This change keeps track of all the memory writes from the aub file and
fetches them on request by the batch decoder. As a result we can get rid
of the 1<<40 mmapped address space and only rely on the mmap aub file
\o/

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
9d08ef6335 intel: aubinator: rework register writes handling
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
86cb05a6d3 intel: aubinator: remove standard input processing option
In a follow-up commit in this series, we stop copying the data from
the mmap'ed file into our big gtt mmap, and start referencing data in
it directly. So reallocating the read buffer and adding more data from
stdin wouldn't work. For that reason, let's stop supporting stdin
processing.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Lionel Landwerlin
08d85a8301 intel: aubinator: remove unused variables
These memory offsets are stored in the gen_batch_decode_ctx.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-07-05 11:57:45 +01:00
Mathieu Bridon
3153bcc73e gallium/auxiliary: Fix string matching
Commit f69bc797e1 did the following:

-        if format.layout in ('bptc', 'astc'):
+        if format.layout in ('astc'):

The intention was to go from matching either 'bptc' or 'astc' to
matching only 'astc'.

But the new code doesn't respect this intention any more, because in
Python `('astc')` is not a tuple containing a string, it is just the
string. (the parentheses are simply ignored)

That means we now match any substring of 'astc', for example 'a'.

This commit fixes the test to respect the original intention.
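
The difference can be checked with a short sketch. The two function names
are hypothetical, and the trailing-comma tuple is only one way to express
the intended test; it is not necessarily how this commit spells the fix.

```python
def is_astc_buggy(layout):
    # ('astc') is just the string 'astc', so "in" is a substring test.
    return layout in ('astc')


def is_astc_fixed(layout):
    # The trailing comma makes a one-element tuple, so "in" is a
    # membership test against exactly 'astc'.
    return layout in ('astc',)
```

With the buggy form, 'a' or 'st' are accepted as well; with the tuple form
only 'astc' matches.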

Fixes: f69bc797e1 "gallium/auxiliary: Add helper support for
                             bptc format compress/decompress"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-05 11:48:47 +01:00
Samuel Pitoiset
8339ba827b radv: optimize vkCmd{Set,Reset}Event() a little bit
Always emitting a bottom-of-pipe event is quite dumb. Instead,
start to optimize these functions by syncing PFP for the
top-of-pipe and syncing ME for the post-index-fetch event.

This can still be improved by emitting EOS events for
syncing PS and CS stages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-05 11:31:06 +02:00
Samuel Pitoiset
f635109140 radv: optimize radv_CmdWaitEvents()
This introduces radv_barrier() (same as the draw/dispatch codepath).
This helper is used for merging the code from CmdWaitEvents() and
CmdPipelineBarrier because it's quite similar.

We do ignore the source stage mask for CmdWaitEvents because
it's irrelevant when event objects are used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-05 11:31:03 +02:00
Roland Scheidegger
620626a371 nir/linker: fix msvc build
Empty initializer braces aren't valid C (it's a GNU extension, and
it's valid in C++).
Hopefully fixes appveyor / msvc build...

Fixes 6677e131b8
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-05 09:27:05 +02:00
Gert Wollny
806a42fc47 r600: compare structure elements instead of doing a memcmp
Structures might be padded by the compiler and these padding bytes remain
un-initialized which in turn makes memcmp return a difference where from
the logical point of view there is none.
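
The effect can be reproduced outside the driver with ctypes. This is only
an illustration: the VertexBuffer layout below is made up and is not the
gallium pipe_vertex_buffer, and it assumes a platform where a 32-bit
integer is 4-byte aligned so that padding follows the first field.

```python
import ctypes


class VertexBuffer(ctypes.Structure):
    # Hypothetical layout: one byte, then padding, then a 32-bit field.
    _fields_ = [("is_user", ctypes.c_uint8),
                ("buffer_offset", ctypes.c_uint32)]


def make_buffer(fill):
    # Start from memory filled with an arbitrary byte to mimic an
    # uninitialized stack slot, then set only the named fields.
    raw = bytes([fill]) * ctypes.sizeof(VertexBuffer)
    vb = VertexBuffer.from_buffer_copy(raw)
    vb.is_user = 1
    vb.buffer_offset = 64
    return vb


def fields_equal(x, y):
    # Member-wise comparison ignores whatever sits in the padding.
    return all(getattr(x, name) == getattr(y, name)
               for name, _ in VertexBuffer._fields_)


a = make_buffer(0x00)
b = make_buffer(0xAA)
```

Here fields_equal(a, b) holds while bytes(a) and bytes(b) differ, which is
the same disagreement a raw memcmp would report.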

 Fixes valgrind:
     Conditional jump or move depends on uninitialised value(s)
       at 0x4C32CBA: __memcmp_sse4_1 (vg_replace_strmem.c:1099)
       by 0xB8D2537: r600_set_vertex_buffers (r600_state_common.c:573)
       by 0xB71D44A: u_vbuf_set_driver_vertex_buffers (u_vbuf.c:1129)
       by 0xB71F7BB: u_vbuf_draw_vbo (u_vbuf.c:1153)
       by 0xB3B92CB: st_draw_vbo (st_draw.c:235)
       by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391)
       by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550)
       by 0x10A989: piglit_display (textureSize.c:157)
       by 0x4F8F174: run_test (piglit_fbo_framework.c:52)
       by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229)
       by 0x10A60A: main (textureSize.c:71)
     Uninitialised value was created by a stack allocation
       at 0xB3948FD: st_update_array (st_atom_array.c:388)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-05 07:59:07 +02:00
Gert Wollny
9c1ae6a1a1 r600: Add R4G4B4A4 and A1B5G5R5 to supported vertex formats
Below tests would fail with an error message
  "Vertex format (R4G4B4A4|R5G5B5A1) not supported."
Add the formats to the translation routine to enable them.

Fixes:
  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_2d
  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_cube
  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_2d
  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_cube
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_2d
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_cube
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_2d
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_cube
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_2d_array
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_3d
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_2d_array
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_3d
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_2d_array
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_3d
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_2d_array
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_3d
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-05 07:57:28 +02:00
Gert Wollny
5278436d67 r600: force LOD range to be only one value when mip.min filter is NONE
For a texture that has only one LOD defined, but for which
GL_TEXTURE_MAX_LEVEL is the default (1000) and
GL_TEXTURE_MIN_LOD != GL_TEXTURE_MAX_LOD, reading from the texture does
not properly resolve the LOD level and the texture lookup might fail. Hence,
when no mipmap filter is given (indicating that no mip-mapping takes place),
force the LOD range to contain only one value.

Fixes:
  dEQP-GLES3.functional.shaders.texture_functions.texture*.(i|u)sampler2d*
  dEQP-GLES3.functional.texture.format.sized.cube.rgb*
  out of VK_GL_CTS/android/cts/master/gles3-master.txt
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-05 07:57:28 +02:00
Gert Wollny
e7dd1a84a0 mesa/st: draw_vbo: initialize restart_index too
restart_index is later always used in a comparison, so it should be
initialized properly.

Fixes valgrind warning:
 Conditional jump or move depends on uninitialised value(s)
    at 0xB8D682F: r600_draw_vbo (r600_state_common.c:2153)
    by 0xB71F743: u_vbuf_draw_vbo (u_vbuf.c:1156)
    by 0xB3B92DB: st_draw_vbo (st_draw.c:235)
    by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391)
    by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550)
    by 0x10A989: piglit_display (textureSize.c:157)
    by 0x4F8F174: run_test (piglit_fbo_framework.c:52)
    by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229)
    by 0x10A60A: main (textureSize.c:71)
 Uninitialised value was created by a stack allocation
    at 0xB3B90B0: st_draw_vbo (st_draw.c:143)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-07-05 07:57:16 +02:00
Timothy Arceri
0cb6537dee mesa: enable ARB_direct_state_access in OpenGL 4.5 compat profile
It's unlikely anyone will add proper ARB_direct_state_access compat
support before we branch 18.2. Enabling the extension in 4.5 at
least allows users to make use of MESA_GL_VERSION_OVERRIDE=4.5COMPAT
for games like No Man's Sky.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-05 13:15:34 +10:00
Timothy Arceri
39063334d3 util/drirc: turn on force_glsl_extensions_warn for No Mans Sky
The game forgets to enable multiple extensions in its shaders, one
of those extensions is EXT_texture_array. But enabling this config
entry fixes at least one other rendering issue that enabling
EXT_texture_array on its own doesn't fix.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-05 13:05:47 +10:00
Marek Olšák
9b4c4fe334 util/queue: remove leftover debug code
2018-07-04 22:19:47 -04:00
Marek Olšák
7fab8a4b37 Shorten u_queue names
There is a 15-character limit for thread names shared by the queue name
and process name. Shorten the thread name to make space for the process
name.
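
A rough sketch of how a 15-character budget could be split between the two
names. The thread_name helper and the ':' separator are assumptions made
for illustration and may not match what u_queue.c actually does.

```python
MAX_THREAD_NAME = 15  # characters available for the whole thread name


def thread_name(process_name, queue_name):
    # Reserve one character for the ':' separator (an assumed format).
    budget = MAX_THREAD_NAME - 1
    queue_part = queue_name[:budget]
    # The process name only gets whatever the queue name leaves over.
    process_part = process_name[:budget - len(queue_part)]
    if process_part:
        return process_part + ":" + queue_part
    return queue_part
```

The shorter the queue name, the more of the process name survives, which
is the motivation for shortening the queue names.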

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-04 22:03:35 -04:00
Marek Olšák
b238e33bc9 kutil/queue: add a process name into a thread name
v2: simplifications

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
2018-07-04 21:54:39 -04:00
Marek Olšák
7149bffe66 gallium/os: use util_get_process_name when possible
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-04 21:16:57 -04:00
Marek Olšák
2e1e6511f7 util: extract get_process_name from xmlconfig.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-04 21:16:03 -04:00
Marek Olšák
4695984dbc ac: fold LLVMContext creation into ac_llvm_context_init
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-04 15:48:18 -04:00
Marek Olšák
f5cb4194c9 radeonsi: reorder code in si_llvm_context_init
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-04 15:48:18 -04:00
Marek Olšák
ff330055e9 radeonsi: use ac_compile_module_to_binary to reduce compile times
Compile times of simple shaders are reduced by ~20%.
Compile times of prologs and epilogs are reduced by up to 40%.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-04 15:48:18 -04:00
Marek Olšák
0075e5fed8 ac: add reusable helpers for direct LLVM compilation
This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked.

struct ac_compiler_passes (opaque type) contains the main pass manager.

ac_create_llvm_passes -- the result can go to thread local storage
ac_destroy_llvm_passes -- can be called by a destructor in TLS
ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary

The motivation is to do the expensive call addPassesToEmitFile once
per context or thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-04 15:48:18 -04:00
Rhys Perry
c2ae9b4052 nvc0: implement multisampled images on Maxwell+
Changes in v2:
- make loadSuInfo32() protected without making the rest protected
- move NVC0_SU_INFO_* into nv50_ir_lowering_nvc0.h instead of duplicating
  NVC0_SU_INFO_MS

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-07-04 16:04:23 +02:00
Neil Roberts
2d5ddbe960 i965: Fix output register sizes when variable ranges are interleaved
In 6f5abf3146 this code was fixed to calculate the maximum size of
an attribute in a separate pass and then allocate the registers to
that size. However this wasn’t taking into account ranges that overlap
but don’t have the same starting location. For example:

layout(location = 0, component = 0) out float a[4];
layout(location = 2, component = 1) out float b[4];

Previously, if ‘a’ was processed first then it would allocate a
register of size 4 for location 0 and it wouldn’t allocate another
register for location 2 because it would already be covered by the
range of 0. Then if something tries to write to b[2] it would try to
write past the end of the register allocated for ‘a’ and it would hit
an assert.

This patch changes it to scan for any overlapping ranges that start
within each range to calculate the maximum extent and allocate that
instead.
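
As a rough illustration of the scan described above (a sketch only, not
the i965 code): max_extent_from() and the slots[] table are hypothetical
names, where slots[i] is assumed to hold the number of locations covered
by the variable that starts at location i.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the i965 code: slots[i] is the number of
 * locations covered by the variable starting at location i, or 0 if no
 * variable starts there. */
unsigned
max_extent_from(const unsigned *slots, size_t num_locations, size_t start)
{
   assert(start < num_locations);
   size_t end = start + slots[start];

   /* A range that begins inside [start, end) may push the end further out. */
   for (size_t loc = start + 1; loc < end && loc < num_locations; loc++) {
      if (loc + slots[loc] > end)
         end = loc + slots[loc];
   }

   return (unsigned)(end - start);
}
```

With the two arrays from the example (4 locations starting at 0 and 4
locations starting at 2), the extent computed from location 0 is 6, so a
write to b[2] stays inside the allocation.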

Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/
vs-fs-array-interleave-range.shader_test

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 6f5abf3146 "i965: Fix output register sizes when multiple variables
       share a slot."
2018-07-04 10:57:51 +02:00
Dave Airlie
8c51caab24 r600/sb: cleanup if_conversion iterator to be legal C++
The current code causes:
/usr/include/c++/8/debug/safe_iterator.h:207:
Error: attempt to copy from a singular iterator.

This is due to the iterators getting invalidated. Fix the
reverse iterator to use the return value from erase, and
cast it properly.

(used Mathias' suggestion)
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-07-04 07:42:22 +01:00
Marek Olšák
45f9d58668 radeonsi: fix compiler breakage
Broken by d853d3a59b.
2018-07-04 00:13:38 -04:00
Dave Airlie
5b32b246cf ac: make some fns static
Some of the compiler functions are no longer called outside
the util file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 10:29:26 +10:00
Dave Airlie
7398913a62 ac/radv: move llvm compiler info to struct and init in one place
This ports radv to the shared code. However, due to a bug in LLVM
versions prior to 7, radv cannot add target info at this stage,
as it would leak one for every shader compile. I'd prefer
to keep this llvm damage in the shared code, since it isn't the
driver at fault here. We just add a flag to denote if the driver
can support leaking the target info or not, and the common code
does the right thing depending on the llvm version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 10:29:16 +10:00
Dave Airlie
d853d3a59b ac/radeonsi: port compiler init/destroy out of radeonsi.
We want to share this code with radv in the future, so port
it out of radeonsi.

Add a return value, as radv will want to know whether this
succeeds.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 10:29:03 +10:00
Dave Airlie
35c82af539 radv/radeonsi: add a check ir tm options
This doesn't do much yet, but it makes it easier to move the code
to a common shared code base.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:32:35 +10:00
Dave Airlie
0eb65b4944 radeonsi: rename si_compiler -> ac_llvm_compiler
As precursor to moving init to common code, just rename the struct
and move it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:31:32 +10:00
Dave Airlie
887ba45c93 ac: add target library info helpers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:31:29 +10:00
Dave Airlie
e1387eaf12 radv: create/destroy passmgr at the higher level.
This is prep work for moving this to a per-thread struct

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:31:05 +10:00
Dave Airlie
97d9b88447 radv: port to use common passmgr code.
This adds an always-inline pass, but otherwise should work the
same.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-04 05:30:34 +10:00
Dave Airlie
584ad1eda9 ac/radeonsi: refactor out pass manager init to common code.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:18:01 +10:00
Dave Airlie
f2b3e96e75 radv: drop copy of ac_create_target_machine.
Once we split the init once stuff out, this can be shared again.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:15:35 +10:00
Dave Airlie
473be16c74 ac/radv: split the non-common init_once code from the common target code. (v2)
This just splits out the non-shared code and reuses ac_get_llvm_target in radv.

v2: rebase on Marek's patch - fixup brace position/whitespace

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:15:23 +10:00
Neil Roberts
590cc7c8f6 i965: Use the new nir atomic counter linker for SPIR-V shaders
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Alejandro Piñeiro
c13f8ea8ac i965: enable AtomicStorage capability for gen7+
That is the same gen requirement for ARB_shader_atomic_counters.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Antia Puentes
7600678216 mesa/glspirv: lower workgroup access to offsets
This will perform the CS shared lowering. See 8761a04d0d

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Antia Puentes
fbcebfc5bf nir: Fix OpAtomicCounterIDecrement for uniform atomic counters
From the SPIR-V 1.0 specification, section 3.32.18, "Atomic
Instructions":

   "OpAtomicIDecrement:
    <skip>
    The instruction's result is the Original Value."

However, we were implementing it, for uniform atomic counters, as a
pre-decrement operation, which is the behaviour available from GLSL.

Renamed the former nir intrinsic 'atomic_counter_dec*' to
'atomic_counter_pre_dec*' for clarification purposes, as it implements
a pre-decrement operation as specified for GLSL. From GLSL 4.50 spec,
section 8.10, "Atomic Counter Functions":

   "uint atomicCounterDecrement (atomic_uint c)

    Atomically
    1. decrements the counter for c, and
    2. returns the value resulting from the decrement operation.

    These two steps are done atomically with respect to the atomic
    counter functions in this table."

Added a new nir intrinsic 'atomic_counter_post_dec*' which implements
a post-decrement operation as required by SPIR-V.
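
The difference between the two return values can be sketched with C11
atomics. This is only an analogy on the CPU side, not the NIR
intrinsics themselves; counter_pre_dec() and counter_post_dec() are
made-up names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* GLSL atomicCounterDecrement(): returns the value after the decrement. */
uint32_t
counter_pre_dec(_Atomic uint32_t *c)
{
   assert(c != NULL);
   /* atomic_fetch_sub() yields the old value, so subtract once more. */
   return atomic_fetch_sub(c, 1) - 1;
}

/* SPIR-V OpAtomicIDecrement: returns the original value. */
uint32_t
counter_post_dec(_Atomic uint32_t *c)
{
   assert(c != NULL);
   return atomic_fetch_sub(c, 1);
}
```

Starting from a counter value of 5, counter_pre_dec() returns 4 while
counter_post_dec() would have returned 5. Both leave the counter at 4.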

v2: (Timothy Arceri)
   * Add extra spec quotes on commit message
   * Use "post" instead of "pos" to avoid confusion with "position"

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Neil Roberts
6677e131b8 nir/linker: Add a pure NIR implementation of the atomic counter linker
This is mostly just a straight-forward conversion of
link_assign_atomic_counter_resources to C directly using nir variables
instead of GLSL IR variables.

It is based on the version of link_assign_atomic_counter_resources in
6b8909f2d1. I’m noting this here to make it easier to track changes
and keep the NIR version up-to-date.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Neil Roberts
1fb9984d7e nir/types: Add wrappers for a couple of atomic counter methods
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Alejandro Piñeiro
54d7fca077 spirv/nir: add capability check for SpvCapabilityAtomicStorage
Capability that informs if atomic counters are supported. From SPIR-V
1.0 spec, section 3.7, "Storage Class", item 10 from table:

(Column "Storage Class"):

   "AtomicCounter For holding atomic counters. Visible across all
    functions of the current invocation. Atomic counter-specific
    memory."

(Column "Required Capability"):

   "AtomicStorage"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Alejandro Piñeiro
12301766de spirv/nir: add atomic counter support on vtn_handle_ssbo_or_shared_atomic
So rename it to the more general vtn_handle_atomics.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Alejandro Piñeiro
c3eb0ba0ff spirv/nir: initialize offset on the nir var at vtn_create_variable
This is convenient when dealing with atomic counter uniforms. The
alternative would be doing that at vtn_handle_atomics.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Antia Puentes
4110bc4c17 nir/spirv: Fix atomic counter (multidimensional-)arrays
When constructing NIR if we have a SPIR-V uint variable and the
storage class is SpvStorageClassAtomicCounter, we store as NIR's
glsl_type an atomic_uint to reflect the fact that the variable is an
atomic counter.

However, we were tweaking the type only for atomic_uint scalars, we
have to do it as well for atomic_uint arrays and atomic_uint arrays of
arrays of any depth.
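
A minimal sketch of the idea, assuming a simplified mutable type tree.
The struct example_type and example_repair_atomic_type() below are
hypothetical stand-ins and do not reflect how glsl_type objects are
actually built or shared.

```c
#include <assert.h>
#include <stddef.h>

enum example_kind { EXAMPLE_UINT, EXAMPLE_ATOMIC_UINT, EXAMPLE_ARRAY };

/* Hypothetical, simplified type node. */
struct example_type {
   enum example_kind kind;
   struct example_type *element;   /* only used when kind == EXAMPLE_ARRAY */
};

void
example_repair_atomic_type(struct example_type *type)
{
   assert(type != NULL);

   /* Walk through any number of array levels down to the element type. */
   while (type->kind == EXAMPLE_ARRAY) {
      assert(type->element != NULL);
      type = type->element;
   }

   /* Only a plain uint at the bottom is turned into an atomic_uint. */
   if (type->kind == EXAMPLE_UINT)
      type->kind = EXAMPLE_ATOMIC_UINT;
}
```

The same walk covers scalars (zero array levels), arrays, and arrays of
arrays, which is the case the scalar-only tweak missed.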

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

v2: update after deref patches got pushed (Alejandro Piñeiro)
v3: simplify repair_atomic_type (suggested by Timothy Arceri, included
    on the patch by Alejandro)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Alejandro Piñeiro
480d2c56b3 spirv/nir: tweak nir type when storage class is SpvStorageClassAtomicCounter
GLSL types differentiate uint from atomic uint. In SPIR-V the type is
uint, and the variable has a specific storage class. So we need to
tweak the type based on the storage class.

Ideally we would like to get the proper type at vtn_handle_type, but
we don't have the storage class at that moment.

We tweak only the nir type, as it is the one that really requires it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Alejandro Piñeiro
88d3325a44 nir_types: add glsl_atomic_uint_type() helper
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:41:46 +02:00
Alejandro Piñeiro
c6230b9358 spirv/nir: add offset at vtn_variable
Also initialize it on var_decoration_cb

This is equivalent to nir_variable.offset, used to store the location
an atomic counter is stored at.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:37:32 +02:00
Alejandro Piñeiro
768c275deb spirv/nir: SpvStorageClassAtomicCounter support on vtn_storage_class_to_mode
Atomic Counters are uniforms per spec.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:37:32 +02:00
Alejandro Piñeiro
a9e6298727 nir/linker: handle uniforms without explicit location
ARB_gl_spirv points out that uniforms in general need an explicit
location. But there are still some cases of uniforms without a location,
for example uniform atomic counters. Those don't have a
location from the OpenGL point of view (they are identified with a
binding and offset), but Mesa internally assigns them a location.

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>

v2: squash with another patch, minor variable name tweak (Timothy
Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:37:32 +02:00
Alejandro Piñeiro
b0712df6cf compiler/glsl: refactor empty_uniform_block utilities to linker_util
This includes:
  * Move the definition of empty_uniform_block to linker_util.h
  * Move find_empty_block (with a rename) to linker_util.h
  * Refactor some code at linker.cpp to a new method at linker_util.h
    (link_util_update_empty_uniform_locations)

So all that code could be used by the GLSL linker and the NIR linker
used for ARB_gl_spirv.

v2: include just "ir_uniform.h" (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-03 12:37:32 +02:00
Ian Romanick
995d993710 i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible
Otherwise we can incorrectly cmod propagate in situations like

    add(8)          g10<1>.xD       g2<0>.xD        -16D
    ...
    cmp.ge.f0(8)    null<1>D        g2<0>.xD        16D
    ...
    (+f0) sel(8)    g21<1>.xyUD     g14<4>.xyyyUD   g18<4>.xyyyUD

Sadly, this change hurts quite a few shaders.

v2: Refactor writemask compatibility check into a separate function.
Suggested by Caio.

Ivy Bridge and Haswell had similar results. (Haswell shown)
total instructions in shared programs: 12968489 -> 12968738 (<.01%)
instructions in affected programs: 60679 -> 60928 (0.41%)
helped: 0
HURT: 249
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.22% max: 0.81% x̄: 0.46% x̃: 0.44%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.44% 0.48%
Instructions are HURT.

total cycles in shared programs: 409171965 -> 409172317 (<.01%)
cycles in affected programs: 260056 -> 260408 (0.14%)
helped: 0
HURT: 176
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.04% max: 0.34% x̄: 0.17% x̃: 0.17%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: 0.16% 0.18%
Cycles are HURT.

Sandy Bridge
total instructions in shared programs: 10423577 -> 10423753 (<.01%)
instructions in affected programs: 40667 -> 40843 (0.43%)
helped: 0
HURT: 176
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.29% max: 0.79% x̄: 0.48% x̃: 0.42%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.46% 0.51%
Instructions are HURT.

total cycles in shared programs: 146097503 -> 146097855 (<.01%)
cycles in affected programs: 503990 -> 504342 (0.07%)
helped: 0
HURT: 176
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.02% max: 0.36% x̄: 0.12% x̃: 0.11%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: 0.11% 0.13%
Cycles are HURT.

No changes on any other platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: cd635d149b i965/vec4: Propagate conditional modifiers from compares to adds
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-02 19:19:16 -07:00
Ian Romanick
fb6dc8e894 intel/compiler: Silence unused parameter warnings brw_nir.c
src/intel/compiler/brw_nir.c: In function ‘brw_nir_lower_vue_outputs’:
src/intel/compiler/brw_nir.c:464:32: warning: unused parameter ‘is_scalar’ [-Wunused-parameter]
                           bool is_scalar)
                                ^~~~~~~~~
src/intel/compiler/brw_nir.c: In function ‘lower_bit_size_callback’:
src/intel/compiler/brw_nir.c:610:57: warning: unused parameter ‘data’ [-Wunused-parameter]
 lower_bit_size_callback(const nir_alu_instr *alu, void *data)
                                                         ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-02 16:17:19 -07:00
Kenneth Graunke
8e38947f6c i965: Fix BRW_NEW_NUM_SAMPLES to be in .brw, not .mesa
This is the wrong kind of dirty bit.  Caught by GCC warnings, due to
64-bit values being truncated to 32 bits.
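
To illustrate why the bit group matters, here is a small sketch. The
flag value is hypothetical (any bit at position 32 or above behaves the
same), and the helpers are made up for this note.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: the bit position is invented for illustration. */
#define EXAMPLE_NEW_NUM_SAMPLES (UINT64_C(1) << 40)

/* True when every set bit of flag fits in a field that is field_bits wide. */
bool
flag_fits(uint64_t flag, unsigned field_bits)
{
   assert(field_bits >= 1 && field_bits <= 64);
   if (field_bits == 64)
      return true;
   return (flag >> field_bits) == 0;
}

/* What a 32-bit field actually keeps of a 64-bit flag. */
uint32_t
store_in_32bit_field(uint64_t flag)
{
   return (uint32_t)flag;
}
```

A flag above bit 31 stored in a 32-bit group silently becomes 0, so the
state would never be seen as dirty.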

Fixes: b95b0e2918 (intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-02 15:30:21 -07:00
Jason Ekstrand
afa8f58921 anv: Add support for the on-disk shader cache
The Vulkan API provides a mechanism for applications to cache their own
shaders and manage on-disk pipeline caching themselves.  Generally, this
is what I would recommend to application developers and I've resisted
implementing driver-side transparent caching in the Vulkan driver for a
long time.  However, not all applications do this and, for some
use-cases, it's just not practical.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-02 14:52:05 -07:00
Jason Ekstrand
e0f7a3aa5b anv/pipeline_cache: Add a _locked suffix to a function
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-02 13:07:06 -07:00
Jason Ekstrand
f5c38f4a30 anv: Add device-level helpers for searching for and uploading kernels
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-02 13:07:06 -07:00
Jason Ekstrand
eae192bf5f anv/pipeline: Stop optimizing for not having a cache
Before, we were only hashing the shader if we had a shader cache to
cache things in.  This means that if we ever get it wrong, we could end
up trying to cache a shader with an undefined hash.  Since not having a
shader cache is an extremely uncommon case, let's optimize for code
clarity and obvious correctness over avoiding a hash operation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-02 13:07:06 -07:00
Jason Ekstrand
76fdc8a85c anv: Use a default pipeline cache if none is specified
If a client is dumb enough to not specify a pipeline cache, give it a
default.  We have to create one anyway for blorp so we may as well let
the client cache shaders in it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-02 13:07:06 -07:00
Jason Ekstrand
d1c778b362 anv: Be more careful about hashing pipeline layouts
Previously, we just hashed the entire descriptor set layout verbatim.
This meant that a bunch of extra stuff such as pointers and reference
counts made its way into the cache.  It also meant that we weren't
properly hashing in the Y'CbCr conversion information from
bound immutable samplers.
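
The general approach can be sketched as hashing only the fields that
describe the layout. Everything below is an assumption made for
illustration: the struct is hypothetical and FNV-1a is used purely as a
simple stand-in for the hash the driver really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout description, not the anv structs. */
struct example_binding {
   uint32_t type;
   uint32_t array_size;
};

struct example_layout {
   uint32_t ref_cnt;               /* runtime bookkeeping, not hashed */
   void *owner;                    /* pointer value, not hashed */
   uint32_t binding_count;
   struct example_binding bindings[4];
};

/* 32-bit FNV-1a over a byte range, continuing from an earlier hash. */
static uint32_t
fnv1a(uint32_t hash, const void *data, size_t size)
{
   const unsigned char *bytes = data;
   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= 16777619u;
   }
   return hash;
}

uint32_t
hash_layout(const struct example_layout *layout)
{
   assert(layout->binding_count <= 4);
   uint32_t hash = 2166136261u;   /* FNV offset basis */

   /* Feed in each meaningful field explicitly instead of the whole struct. */
   hash = fnv1a(hash, &layout->binding_count, sizeof(layout->binding_count));
   for (uint32_t i = 0; i < layout->binding_count; i++) {
      const struct example_binding *b = &layout->bindings[i];
      hash = fnv1a(hash, &b->type, sizeof(b->type));
      hash = fnv1a(hash, &b->array_size, sizeof(b->array_size));
   }
   return hash;
}
```

Two layouts that differ only in reference count or pointer values then
hash identically, while a change to a binding changes the hash.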

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-02 13:07:06 -07:00
Jason Ekstrand
06412bfc98 anv,intel: Enable nir_opt_large_constants for Vulkan
According to RenderDoc, this shaves 99.6% of the run time off of the
ambient occlusion pass in Skyrim Special Edition when running under DXVK
and shaves 92% off the runtime for a reasonably representative frame.
When running the actual game, Skyrim goes from being a slide-show to a
very stable and playable framerate on my SKL GT4e machine.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:50 -07:00
Jason Ekstrand
70ce880434 anv: Add state setup support for shader constants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:49 -07:00
Jason Ekstrand
3a5ed18c51 anv: Add support for shader constant data to the pipeline cache
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:47 -07:00
Jason Ekstrand
1235850522 nir: Add a large constants optimization pass
This pass searches for reasonably large local variables which can be
statically proven to be constant and moves them into shader constant
data.  This is especially useful when large tables are baked into the
shader source code because they can be moved into a UBO by the driver to
reduce register pressure and make indirect access cheaper.

v2 (Jason Ekstrand):
 - Use a size/align function to ensure we get the right alignments
 - Use the newly added deref offset helpers

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:45 -07:00
Jason Ekstrand
c90f221e0a nir: Add a concept of constant data associated with a shader
This commit adds a concept to NIR of having a blob of constant data
associated with a shader.  Instead of being a UBO or uniform that can be
manipulated by the client, this constant data is considered part of the
shader and remains constant across all invocations of the given shader
until the end of time.  To access this constant data from the shader, we
add a new load_constant intrinsic.  The intention is that drivers will
eventually lower load_constant intrinsics to load_ubo, load_uniform, or
something similar.  Constant data will be used by the optimization pass
in the next commit but this concept may also be useful for OpenCL.

v2 (Jason Ekstrand):
 - Rename num_constants to constant_data_size (anholt)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:42 -07:00
Jason Ekstrand
e8e159e9df nir/deref: Add helpers for getting offsets
These are very similar to the related function in nir_lower_io except
that they don't handle per-vertex or packed things (that could be added,
in theory) and they take a more detailed size/align function pointer.
One day, we should consider switching nir_lower_io over to using the
more detailed size/align functions and then we could make it use these
helpers instead of having its own.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:41 -07:00
Jason Ekstrand
2bf8be99b0 nir/types: Add a natural size and alignment helper
The size and alignment are "natural" in the sense that everything is
aligned to a scalar.  This is a bit tighter than std430 where vec3s are
required to be aligned to a vec4.
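
For vectors, the difference can be sketched as follows. These helpers
are hypothetical and only cover vectors of 1 to 4 components; they are
not the helper added by this commit.

```c
#include <assert.h>

/* "Natural" layout: size is the packed size, alignment is one scalar. */
void
natural_vec_size_align(unsigned components, unsigned comp_bytes,
                       unsigned *size, unsigned *align)
{
   assert(components >= 1 && components <= 4);
   *size = components * comp_bytes;
   *align = comp_bytes;
}

/* std430 vector alignment: a vec3 is aligned like a vec4. */
unsigned
std430_vec_align(unsigned components, unsigned comp_bytes)
{
   assert(components >= 1 && components <= 4);
   return (components == 3 ? 4 : components) * comp_bytes;
}
```

For a vec3 of 32-bit floats this gives a natural alignment of 4 bytes
against 16 bytes under std430.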

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:39 -07:00
Jason Ekstrand
893fc2d07d nir: Add a deref_instr_has_indirect helper
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:37 -07:00
Jason Ekstrand
70b16963fc util/macros: Import ALIGN_POT from ralloc.c
v2 (Jason Ekstrand):
 - Rename y to pot_align (Brian)
 - Also use ALIGN_POT in build_id.c and slab.c (Brian)
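
For reference, a sketch of what such a macro typically looks like. The
exact definition is an assumption here, and align_pot_checked() is a
made-up wrapper that checks the power-of-two requirement.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of the macro; pot_align must be a power of two. */
#define ALIGN_POT(x, pot_align) (((x) + (pot_align) - 1) & ~((pot_align) - 1))

uintptr_t
align_pot_checked(uintptr_t x, uintptr_t pot_align)
{
   /* A power of two has exactly one bit set. */
   assert(pot_align != 0 && (pot_align & (pot_align - 1)) == 0);
   return ALIGN_POT(x, pot_align);
}
```

For example, aligning 13 to 8 gives 16, and a value that is already
aligned is returned unchanged.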

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:14 -07:00
Eric Anholt
4819da2301 v3d: Claim PIPE_CAP_TGSI_CAN_READ_OUTPUTS.
Fixes warning at screen creation.  We store our outputs in normal temps
and just emit them to shader I/O at the end, due to our I/O ordering
requirements, so reading "outputs" in NIR is fine.
2018-07-02 11:35:41 -07:00
Marek Olšák
32e413ca59 ac: move all LLVM module initialization into ac_create_module
This removes some ugly code around module initialization.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-02 14:34:39 -04:00
Eric Anholt
49f7631c9f v3d: Emit a TF flush after each draw using TF.
This fixes GPU hangs on 7278 in transform feedback tests such as
GTF-GLES3.gtf.GL3Tests.transform_feedback2.transform_feedback2_basic
2018-07-02 10:05:14 -07:00
Karol Herbst
c7726fbfa5 nv50/ir: handle clipvertex for geom and tess shaders as well
this will be needed for compatibility profiles

v2: handle tess shaders

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-02 16:21:31 +02:00
Erik Faye-Lund
4c87705705 gallium/u_vbuf: drop min/max-scanning for empty indirect draws
When building with asserts enabled, we'll end up triggering an assert
in pipe_buffer_map_range down this code-path, due to trying to map
an empty range. Even if we avoid that, we'll trigger another assert
a bit later, because u_vbuf_get_minmax_index returns a min-index of
-1 here, which gets promoted to an unsigned value, and gives us an
out-of-bounds buffer-mapping offset.

Since we can't really have a well-defined min/max range here when
the range is empty anyway, we should just drop this dance in the
first place. After all, no rendering is going to be produced.
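
A simplified sketch of the two problems, with hypothetical helper
names that are not part of u_vbuf:

```c
#include <assert.h>
#include <stdbool.h>

/* An empty draw has no indices, so there is nothing to scan. */
bool
needs_minmax_scan(unsigned draw_count)
{
   return draw_count > 0;
}

unsigned
index_buffer_offset(int min_index, unsigned index_size)
{
   /* Converting -1 to unsigned would yield UINT_MAX, far out of bounds. */
   assert(min_index >= 0);
   return (unsigned)min_index * index_size;
}
```

Skipping the scan when needs_minmax_scan() is false means the offset
computation is never reached with a min index of -1.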

This fixes a crash in dEQP-GLES31.functional.draw_indirect.random.0
on VirGL for me.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-02 10:51:29 +02:00
Samuel Pitoiset
02db2363f0 radv: reset the image's predicate after a color decompression pass
After performing a fast-clear eliminate, a FMASK decompress,
or a DCC decompress, we can reset the predicate to FALSE.

With that, the GPU should be able to skip unnecessary color
decompression passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-02 10:43:33 +02:00
Samuel Pitoiset
ff7daadca1 radv: enable/disable predication for the DCC decompression pass
Performing a DCC decompression pass is currently pretty rare,
but using predication allows the GPU to skip unnecessary passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-02 10:43:17 +02:00
Samuel Pitoiset
939e5a3823 radv: add padding for the UMR disassembler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-02 10:42:17 +02:00
Gert Wollny
91f48cdfe5 virgl: Add support for glGetMultisample
Use caps to obtain the multisample sample positions for up to 16
positions and implement the according Gallium interface.

This implementation (plus its counterpart in virglrenderer) assumes that
the fixed sample positions are always the same for a given number of samples
over the whole lifetime of a qemu session. It also assumes that sample
series are only given for 2, 4, 8, and 16 samples, and for intermediate
numbers N of samples the next higher supported set from the above list is picked
and the sample positions for the first N samples are returned accordingly.
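
The set selection described above can be sketched like this.
pick_sample_set() is a hypothetical name, and the sketch only covers 2
to 16 samples; other counts return 0 because their handling is not
described here.

```c
/* Returns the size of the sample set to read positions from, or 0 when
 * the sample count is outside the range covered by this sketch. */
unsigned
pick_sample_set(unsigned num_samples)
{
   static const unsigned supported[] = { 2, 4, 8, 16 };

   if (num_samples < 2)
      return 0;

   /* Pick the smallest supported set that has at least num_samples entries. */
   for (unsigned i = 0; i < sizeof(supported) / sizeof(supported[0]); i++) {
      if (num_samples <= supported[i])
         return supported[i];
   }
   return 0;
}
```

For 10, 12 or 13 samples this picks the 16-sample set, and the caller
would then return its first N positions.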

Fixes (when run on GL host):
    dEQP-GLES31.functional.texture.multisample.samples_1.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_2.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_3.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_4.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_8.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_10.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_12.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_13.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_16.sample_position

v2: remove unrelated chunk (thanks Ilia Mirkin)
v3: - also return positions for intermediate sample counts
    - fix unused variable warning
    - update description
v4: explain better what this patch assumes and how it handles sample numbers
    that are not directly advertised (thanks go to Erik Faye-Lund for making
    me aware that this should be documented)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-07-02 09:33:55 +02:00
Tomeu Vizoso
ba78e78cd5 st/mesa: Also check for PIPE_FORMAT_A8R8G8B8_SRGB for texture_sRGB
and PIPE_FORMAT_R8G8B8A8_SRGB, as well.

The reason for this is that when Virgl runs with GLES on the host, it
cannot directly upload textures in BGRA.

So to avoid a conversion step, consider the RGB sRGB formats as well for
this extension.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-02 09:33:48 +02:00
Tomeu Vizoso
71867a0a61 st/mesa: Fall back to R8G8B8A8_SRGB for ETC2
If the driver doesn't support PIPE_FORMAT_B8G8R8A8_SRGB, fall back to
PIPE_FORMAT_R8G8B8A8_SRGB.

Drivers such as Virgl will have a hard time supporting
PIPE_FORMAT_B8G8R8A8_SRGB when the host runs GLES, as GL_BGRA isn't as
well supported there.

So go with PIPE_FORMAT_R8G8B8A8_SRGB so these drivers can avoid a
conversion copy.

v2: Fix typo in commit message

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-02 09:33:41 +02:00
Tomeu Vizoso
e5604ef78b st/mesa/i965: Allow decompressing ETC2 to GL_RGBA
When Mesa itself implements ETC2 decompression, it currently
decompresses to formats in the GL_BGRA component order.

That can be problematic for drivers which cannot upload the texture data
as GL_BGRA, such as Virgl when it's backed by GLES on the host.

So this commit adds a flag to _mesa_unpack_etc2_format so callers can
specify the optimal component order.

In Gallium's case, it will be requested if the format isn't in
PIPE_FORMAT_B8G8R8A8_SRGB format.

For i965, it will remain GL_BGRA, as before.

v2: * Remove unnecessary include (Emil Velikov)

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-02 09:33:33 +02:00
Iago Toral Quiroga
1b54824687 anv/cmd_buffer: make descriptors dirty when emitting base state address
Every time we emit a new state base address we will need to re-emit our
binding tables, since they might have been emitted with a different base
state address.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
2018-07-02 08:31:20 +02:00
Iago Toral Quiroga
6a1d8350c9 anv/cmd_buffer: clean dirty push constants flag after emitting push constants
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
2018-07-02 08:31:02 +02:00
Iago Toral Quiroga
198a72220b anv/cmd_buffer: never shrink the push constant buffer size
If we have to re-emit push constant data, we need to re-emit all
of it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
2018-07-02 08:30:40 +02:00
Denis Pauk
2854c0f795 gallium/llvmpipe: Enable support bptc format.
v2: none
v3: none

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
CC: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-01 15:42:37 -04:00
Denis Pauk
530130e74f gallium/softpipe: Enable support bptc format.
v2: none
v3: none

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-01 15:42:37 -04:00
Denis Pauk
f69bc797e1 gallium/auxiliary: Add helper support for bptc format compress/decompress
Reuse code shared with mesa/main/texcompress_bptc.

v2: Use block decompress function
v3: Include static bptc code from texcompress_bptc_tmp.h
    Suggested-by: Marek Olšák <maraeo@gmail.com>

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Nicolai Hähnle <nicolai.haehnle@amd.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-01 15:42:37 -04:00
Denis Pauk
bf4871f9e8 mesa: add header for share bptc decompress functions
Move shared bptc functions to texcompress_bptc_tmp.h:
* fetch_rgba_unorm_from_block
* fetch_rgb_float_from_block
* compress_rgba_unorm
* compress_rgb_float

Create decompress functions:
* decompress_rgba_unorm
* decompress_rgb_float

Functions will be reused in gallium/auxiliary code.

v2: Add block decompress function
v3: Move all shared code to header
    Suggested-by: Marek Olšák <maraeo@gmail.com>

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-01 15:42:36 -04:00
Marek Olšák
99c6cae227 glsl/cache: save and restore ExternalSamplersUsed
Shaders that need special code for external samplers were broken if
they were loaded from the cache.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-30 01:04:16 -04:00
Timothy Arceri
463f849097 nir: fix selection of loop terminator when two or more have the same limit
We need to add loop terminators to the list in the order we come
across them, otherwise if two or more have the same exit condition
we will select the last one rather than the first one even though
it's unreachable.

This fix is for simple unrolls where we only have a single exit
point. When unrolling this type of loop the unreachable
terminators and their unreachable branch are removed prior to
unrolling. Because of the logic change we also switch some
list access in the complex unrolling logic to avoid breakage.
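
The tie-breaking behaviour can be sketched as below. This is a
hypothetical helper over a plain array and not the loop analysis code:
limits[] is assumed to hold each terminator's iteration limit in the
order the terminators were encountered.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the terminator that limits the loop. */
size_t
pick_limiting_terminator(const unsigned *limits, size_t count)
{
   assert(count > 0);
   size_t best = 0;

   for (size_t i = 1; i < count; i++) {
      /* Strict comparison: on a tie the earlier terminator is kept. */
      if (limits[i] < limits[best])
         best = i;
   }
   return best;
}
```

With limits of 4, 4 and 7 the first terminator is chosen, which only
works if the list preserves the order in which they were found.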

Fixes: 6772a17acc ("nir: Add a loop analysis pass")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-30 10:13:03 +10:00
Timothy Arceri
18293be622 radeonsi: enable OpenGL 4.4 compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
ddb351f7fe mesa: enable ARB_vertex_attrib_64bit in compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
c283b413c1 mesa: add outstanding ARB_vertex_attrib_64bit dlist support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Dave Airlie
98d02104a7 vbo_save: add support for doubles to display list code
Required for ARB_vertex_attrib_64bit compat profile support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
d2caa37741 mesa: add compat profile support for ARB_multi_draw_indirect
v2: add missing ARB_base_instance support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
103b8f11d6 mesa: make valid_draw_indirect_multi() accessible externally
We will use this to add compat support to ARB_multi_draw_indirect
in the following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
5f90fb4007 mesa: add ARB_draw_indirect support to compat profile
v2: add missing ARB_base_instance support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
9b32c80357 mesa: generate GL_INVALID_OPERATION using draw indirect in dlist
The spec doesn't explicitly say to generate an error but since
DrawArraysInstanced* and DrawElementsInstanced* do, it makes
sense to do it for these functions also.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
03f1a2e8df mesa: add missing display list support for ARB_compute_shader
The extension is enabled for compat profile but there is currently
no display list support.

I filed a spec bug and it has been agreed that
glDispatchComputeIndirect should generate an INVALID_OPERATION
error when called during display list compilation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
87d6093583 mesa: expose some ARB_viewport_array dependent extensions in compat
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
d87913e72a mesa: enable ARB_viewport_array in compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
d332986589 mesa: add ARB_viewport_array display list support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
df5e22cb7d mesa: enable ARB_shader_subroutine in compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
05f3589e67 mesa: add glUniformSubroutinesuiv() display list support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
52e3ef2400 mesa: stop hiding remaining query parameters from OpenGL compat
I managed to miss these two in my last pass at this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
9f77a9729e mesa: enable ARB_gpu_shader_fp64 in compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:33 +10:00
Timothy Arceri
a138fbc955 mesa: add ProgramUniform*d display list support
This is required for fp64 to be enabled in compat profile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:32 +10:00
Timothy Arceri
145f517cbd mesa: add Uniform*d support to display lists
This is required so we can enable fp64 support in compat profile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-30 08:38:32 +10:00
Karol Herbst
04b443104d st/glsl_to_nir: run lower_output_reads on !PIPE_CAP_TGSI_CAN_READ_OUTPUTS
This is required for drivers which don't allow reading from outputs.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-06-29 23:43:26 +02:00
Eric Anholt
a77cb724da v3d: Move GL shader state dumping out of per-version compilation.
It doesn't depend on V3D_VER, since it's just calling v3d_print_group.
2018-06-29 13:36:28 -07:00
Eric Anholt
c2901ff80f v3d: Add missing Stream field to transform feedback specs on V3D 4.1.
Noticed when trying to CLIF parse a transform feedback job that hangs on
HW.
2018-06-29 13:36:28 -07:00
Eric Anholt
69efc1e025 v3d: Add missing "tri strip or fan" flag in Primitive List Format.
2018-06-29 13:36:28 -07:00
Eric Anholt
b341b39db3 v3d: Fix the shader code address field widths on V3D 4.1+
We were overlapping it with the threadable/nan flags, resulting in
incorrect relocations (threadable/nan included in the offset) and wrong
ordering in the CLIF files.
2018-06-29 13:36:28 -07:00
Eric Anholt
6c3c11ba19 v3d: Add missing "no prim pack" field to the V3D4.1+ GL shader state.
It looks like we don't need this flag for anything (not that I'm clear on
what it does), but it makes our struct dumping line up with CLIF parsing.
2018-06-29 13:36:28 -07:00
Eric Anholt
c0476d964a v3d: Express dithering mode in the same way that the CLIF parser does.
2018-06-29 13:36:28 -07:00
Eric Anholt
24d2f1347d v3d: Add missing "number of bin tile lists" field.
Noticed when trying to feed our dumps through the CLIF parser.  Since this
is a "minus one" field, we were already filling in the value we wanted (0).
2018-06-29 13:36:28 -07:00
Eric Anholt
b65b61cefe v3d: Rewrite the color write masks to match CLIF format.
The render_target_* fields gave us pretty(ish) printing, but meant we were
incompatible with CLIF, and had much more verbose code generating them.
2018-06-29 13:36:28 -07:00
Eric Anholt
38172dcba9 v3d: Merge the V3D 4.1 and 4.2 XML into V3D 3.3's XML.
The XML ends up noisier if you're only looking at one version, but from
the diffstat there are obvious wins in terms of deduplication.  This will
get even more significant if we ever support 3.2 or 4.0.
2018-06-29 13:36:28 -07:00
Eric Anholt
725561c0b6 v3d: Switch v3d_decoder.c to the XML's top min_ver/max_ver fields.
The XML zipper wants one XML per version for filling out its tables, but
we want to do more than one GPU version per XML now.  Assume that the
"gen" field will be the same as min_ver and look up our XML text assuming
that they're listed in increasing min_ver.
2018-06-29 13:36:28 -07:00
Eric Anholt
f8af5c58c3 v3d: Create XML fields for min_ver and max_ver of a packet/struct/enum.
This will be used to merge together the V3D 3.3-4.1 XML with the variants
disabled based on the version.
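
As a rough illustration of version-gated XML entries, the sketch below filters packets by `min_ver`/`max_ver`. The packet names, the version numbers, and the assumption that both bounds are inclusive are made up for this example; `enabled_for` and `packets_for_version` are hypothetical helpers, not the generator script's code.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML in the spirit of the commit: packet names and version
# numbers below are invented for illustration.
EXAMPLE_XML = """
<vcxml>
  <packet name="Old Packet" max_ver="33"/>
  <packet name="New Packet" min_ver="41"/>
  <packet name="Common Packet"/>
</vcxml>
"""


def enabled_for(element, version):
    """True if `version` lies within the element's optional
    min_ver/max_ver bounds (assumed inclusive here)."""
    min_ver = int(element.get("min_ver", 0))
    max_ver = int(element.get("max_ver", version))
    return min_ver <= version <= max_ver


def packets_for_version(xml_text, version):
    """List the packet names that apply to one GPU version."""
    root = ET.fromstring(xml_text)
    return [p.get("name") for p in root.iter("packet")
            if enabled_for(p, version)]
```

A single XML file can then serve several versions: each generated header only sees the entries whose bounds include its version.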
2018-06-29 13:36:28 -07:00
Eric Anholt
6f7ad7ed11 v3d: Pass the version being generated to the pack generator script.
It turns out that most V3D versions change very few packets, so keeping
separate copies of the XML per version makes changing the XML a pain as
you have to replicate your changes to each one.  This is the start of
changing it so that one XML can generate headers for multiple versions.
2018-06-29 13:36:28 -07:00
Jose Maria Casanova Crespo
a99c9e63a0 anv: finish the binding_table_pool on destroyDevice when use_softpin
Running VK-CTS in batch execution mode was raising the
VK_ERROR_INITIALIZATION_FAILED error in multiple tests. But when the
same failing tests were run isolated they always passed.

createDevice and destroyDevice were called before and after every
test. Because the binding_table_pool was never closed, we reached the
maximum number of open file descriptors (ulimit -n) and when that
happened every call to createDevice failed with a
VK_ERROR_INITIALIZATION_FAILED error.
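
The failure mode can be reproduced with a small model. This is a sketch under stated assumptions: `FakeFdTable`, `FakeDevice` and `run_batch` are hypothetical, each pool is assumed to hold exactly one descriptor, and the limit of 16 is arbitrary.

```python
class FakeFdTable:
    """Stand-in for the per-process descriptor table with a hard limit."""

    def __init__(self, limit):
        self.limit = limit
        self.open_count = 0

    def open(self):
        if self.open_count >= self.limit:
            raise OSError("too many open files")
        self.open_count += 1

    def close(self):
        self.open_count -= 1


class FakeDevice:
    """Toy device that opens one descriptor per pool on creation."""

    def __init__(self, fds, pool_names):
        self.fds = fds
        self.pools = []
        for name in pool_names:
            fds.open()
            self.pools.append(name)

    def destroy(self, skip=()):
        # Pools listed in `skip` are never finished, so they leak.
        for name in self.pools:
            if name not in skip:
                self.fds.close()
        self.pools = []


def run_batch(iterations, limit, skip=()):
    """Create and destroy a device `iterations` times; return how many
    creations succeeded before the descriptor limit was hit."""
    fds = FakeFdTable(limit)
    succeeded = 0
    for _ in range(iterations):
        try:
            device = FakeDevice(fds, ["state_pool", "binding_table_pool"])
        except OSError:
            break
        device.destroy(skip=skip)
        succeeded += 1
    return succeeded
```

Run in isolation, a single create/destroy pair always succeeds in this model; only a long batch that skips one pool on destroy exhausts the table.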

Fixes: c7db0ed4e9
      ("anv: Use a separate pool for binding tables when soft pinning")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-29 21:49:31 +02:00
Marek Olšák
ea8b55b49f gallium/util: remove dummy function util_format_is_supported
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2018-06-29 15:31:49 -04:00
Dylan Baker
82bf8a6a82 docs: update calendar, add news and link release notes to 18.1.3
2018-06-29 11:04:22 -07:00
Dylan Baker
9dfcf044f7 docs: Add SHA256 sums to notes for 18.1.3
2018-06-29 11:02:41 -07:00
Dylan Baker
2fa6c3821f docs: Add release notes for 18.1.3
2018-06-29 11:02:39 -07:00
Rhys Perry
ffba56cc3c nv50/ir: improve maintainability of Target*::initOpInfo()
This is mainly useful for when one needs to add new opcodes in a painless
and reliable way.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-06-29 16:47:27 +02:00
Rhys Perry
d885303a38 nv50/ir: fix image stores with indirect handles
Having this if statement here prevented the next if statement from being
reached in the case of image stores, which is needed for instructions with
indirect bindless handles like "STORE TEMP[ADDR[2].x+1](1) ...".

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-06-29 16:07:59 +02:00
Ross Burton
d7c4ce1d1d egl: fix build race in automake
There is a parallel make build issue in src/egl/drivers/dri2/
for wayland builds. Can be reproduced with:

$ rm src/egl/drivers/dri2/*.h src/egl/drivers/dri2/platform_wayland.lo
$ make -C src/egl/ drivers/dri2/platform_wayland.lo
../../../mesa-18.1.2/src/egl/drivers/dri2/platform_wayland.c:50:10: fatal error: linux-dmabuf-unstable-v1-client-protocol.h: No such file or directory

This patch adds the missing dependency.

Fixes: 02cc359372 "egl/wayland: Use linux-dmabuf interface for buffers"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

[Eric: fixed up the commit title]
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-29 12:49:51 +01:00
Marek Olšák
5a6414f135 radeonsi: implement vertex color clamping for tess and GS
2018-06-28 22:41:12 -04:00
Marek Olšák
034b385fc2 radeonsi: move VS_STATE_SGPR before draw SGPRs
for vertex color clamping.
2018-06-28 22:27:25 -04:00
Marek Olšák
0c554bc5d5 radeonsi: don't use malloc in si_generate_gs_copy_shader
2018-06-28 22:27:25 -04:00
Marek Olšák
7bac3b589c radeonsi: disable DCC statistics gathering on everything but Stoney
I think we don't need it on other chips.
2018-06-28 22:27:25 -04:00
Marek Olšák
0da94fa19c radeonsi: don't enable DCC statistics gathering for small surfaces
2018-06-28 22:27:25 -04:00
Marek Olšák
f8b0c54e3f radeonsi: simplify logic around vi_separate_dcc_try_enable
2018-06-28 22:27:25 -04:00
Marek Olšák
41f80373b4 radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
2018-06-28 22:27:25 -04:00
Marek Olšák
fb28bf23db radeonsi: remove references to Evergreen
2018-06-28 22:27:25 -04:00
Marek Olšák
1542169a4a radeonsi: enable shader caching for compute shaders
Compute shaders were not using the shader cache.
2018-06-28 22:27:25 -04:00
Marek Olšák
d77557c9db radeonsi: store compute local_size into tgsi_shader_info
This is kinda a hack, but it's enough for the shader cache.
2018-06-28 22:27:25 -04:00
Marek Olšák
d13f240269 radeonsi: unify duplicated code for initial shader compilation
2018-06-28 22:27:25 -04:00
Marek Olšák
8e9c57a7fe ac: set +auto-waitcnt-before-barrier when needed
This removes useless s_waitcnt before barriers.
Only radeonsi uses this function.
2018-06-28 22:27:25 -04:00
Marek Olšák
7d6ec9d43b radeonsi/gfx9: insert the barrier between merged shaders inside the if block
2018-06-28 22:27:25 -04:00
Joe M. Kniss
70425bcfe6 gallium: plumb invariant output attrib thru TGSI
Add support for glsl 'invariant' modifier for output data declarations.
Gallium drivers that use TGSI serialization currently lose invariant
modifiers in glsl shaders.

v2: use boolean for invariant instead of unsigned.

Tested: chromiumos on qemu with virglrenderer.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-06-29 11:11:54 +10:00
Francisco Jerez
c2c803be7b intel/fs: Build 32-wide FS shaders.
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-28 13:25:21 -07:00
Jason Ekstrand
b95b0e2918 intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-28 13:25:18 -07:00
Jason Ekstrand
d5e028a57b intel/fs: Add fields to wm_prog_data for SIMD32 dispatch
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
bcbc7d3a17 intel/fs: Fix nir_intrinsic_load_helper_invocation for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
7144247c2c intel/fs: Fix fs_builder::sample_mask_reg() for 32-wide FS dispatch.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
37c1df28c9 intel/fs: Fix Gen6+ interpolation setup for SIMD32
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
e208bc3bb7 intel/fs: Get rid of MOV_DISPATCH_TO_FLAGS
We can just emit the MOV in the two places where we use this.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
5e3028d826 intel/fs: Emit MOV_DISPATCH_TO_FLAGS once for the centroid workaround
There's no reason for us to emit it a pile of times and then have a
whole pass to clean it up.  Just emit it once like we really want.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
40fe108e2b intel/fs: Generalize the unlit centroid workaround
This generalizes the unlit centroid workaround so it's less code and now
supports SIMD32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
1d381731e0 intel/fs: Fix sample id setup for SIMD32.
v2 (Jason Ekstrand):
 - Disallow gl_SampleId in SIMD32 on gen7

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
2fd0aed89a intel/fs: Fix Gen7 compressed source region alignment restriction for SIMD32
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
6909aed90e intel/fs: Implement 32-wide FS payload setup on Gen6+
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
f6c4aace22 intel/fs: Extend thread payload layout to SIMD32
And handle 32-wide payload register reads in fetch_payload_reg().

v2 (Jason Ekstrand):
 - Fix some whitespace and brace placement

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
8f143f70d6 intel/fs: Wrap FS payload register look-up in a helper function.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
d996e5b812 intel/fs: Use fs_regs instead of brw_regs in the unlit centroid workaround
While we're here, we change to using horiz_offset() instead of abusing
half().

v2 (Jason Ekstrand):
 - Use horiz_offset() instead of half()

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
38aee1a06d intel/fs: Simplify fs_visitor::emit_samplepos_setup
The original code manually handled splitting the MOVs to 8-wide to
handle various regioning restrictions.  Now that we have a SIMD width
splitting pass that handles these things, we can just emit everything at
the full width and let the SIMD splitting pass handle it.  We also now
have a useful "subscript" helper which is designed exactly for the case
where you want to take a W type and read it as a vector of Bs so we may
as well use that too.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
244a0ff3a8 i965: Add plumbing for shader time in 32-wide FS dispatch mode.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
2d7d652d5c intel/fs: Disable opt_sampler_eot() in 32-wide dispatch.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
db6ca13efc intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates
On g4x through Sandy Bridge, src1 (the coordinates) of the PLN
instruction is required to be an even register number.  When it's odd
(which can happen with SIMD32), we have to emit a LINE+MAC combination
instead.  Unfortunately, we can't just fall through to the gen4 case
because the input registers are still set up for PLN which lays out the
four src1 registers differently in SIMD16 than LINE.

v2 (Jason Ekstrand):
 - Take advantage of both accumulators and emit LINE LINE MAC MAC
   (Based on a patch from Francisco Jerez)
 - Unify the gen4 and gen4x-6 cases using a loop

v3 (Jason Ekstrand):
 - Don't unify gen4 with gen4x-6 as this turns out to be more fragile
   than first thought without reworking the gen4 barycentric coordinate
   layout.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
566e6abd6d intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN
When we don't have PLN (gen4 and gen11+), we implement LINTERP as either
LINE+MAC or a pair of MADs.  In both cases, the accumulator is written
by the first of the two instructions and read by the second.  Even
though the accumulator value isn't actually ever used from a logical
instruction perspective, it is trashed so we need to make the scheduler
aware.  Otherwise, the scheduler could end up re-ordering instructions
and putting a LINTERP between an instruction which writes the
accumulator and another which tries to use that result.
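
A toy dependency check shows why the implicit write matters. This is a simplified sketch, not the i965 scheduler: `Inst` and `can_move_between` are hypothetical, registers are plain strings, and the MUL/MACH pair is only an example of an accumulator producer and consumer.

```python
from collections import namedtuple

# An instruction is modeled as the sets of registers it reads and writes.
Inst = namedtuple("Inst", ["name", "reads", "writes"])


def can_move_between(inst, producer, consumer):
    """Could `inst` be placed between `producer` and `consumer` without
    clobbering a value that flows from one to the other?"""
    live = producer.writes & consumer.reads
    return not (inst.writes & live)


# Producer/consumer pair that communicates through the accumulator.
MUL = Inst("MUL", frozenset({"a", "b"}), frozenset({"acc"}))
MACH = Inst("MACH", frozenset({"acc", "c"}), frozenset({"d"}))

# LINTERP without and with the accumulator listed as written.
LINTERP_UNMARKED = Inst("LINTERP", frozenset({"coords"}), frozenset({"out"}))
LINTERP_MARKED = Inst("LINTERP", frozenset({"coords"}),
                      frozenset({"out", "acc"}))
```

In this model the unmarked LINTERP may be slotted between MUL and MACH, which would trash the accumulator; once the write is declared, the check rejects that placement.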

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
73d60455e9 intel/fs: Rework INTERPOLATE_AT_PER_SLOT_OFFSET
This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU
operation and less like a send.  This is less code over-all and, as a
side-effect, it now properly handles execution groups and lowering so
SIMD32 support just falls out.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
74b477039d intel/fs: Add the group to the flag subreg number on SNB and older
We want consistent behavior in the meaning of the flag_subreg field
between SNB and IVB+.

v2 (Jason Ekstrand):
 - Add some extra commentary

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
2aefa5e19f intel/fs: Fix FB read header setup for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
e06f5b30cc intel/fs: Fix logical FB write lowering for SIMD32
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
ce370902d4 intel/fs: Fix FB write message control codegen for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
8b788069fb intel/fs: Don't enable dual source blend if no outputs are written
This prevents a crash in some arb_enhanced_layouts tests that would be
caused by the next commit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
48241c780a intel/fs: Fix codegen of FS_OPCODE_SET_SAMPLE_ID for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
789d20df36 intel/eu: Fix pixel interpolator queries for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
1650442026 intel/fs: Disable SIMD32 dispatch for fragment shaders with discard.
Current discard handling requires dedicating the second flag register to
discard.  However, control-flow in SIMD32 requires both flag registers
so it's incompatible with the current discard handling.  Just don't
support SIMD32+discard for now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
1811cbdc25 intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow
The hardware's control flow logic is 16-wide so we're out of luck
here.  We could, in theory, support SIMD32 if we know the control-flow
is uniform but we don't have that information at this point.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
d5b617a28e intel/fs: Split instructions low to high in lower_simd_width
Commit 0d905597f fixed an issue with the placement of the zip and unzip
instructions.  However, as a side-effect, it reversed the order in which
we were emitting the split instructions so that they went from high
group to low instead of low to high.  This is fine for most things like
texture instructions and the like but certain render target writes
really want to be emitted low to high.  This commit just switches the
order back around to be low to high.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 0d905597f "intel/fs: Be more explicit about our placement of [un]zip"
2018-06-28 13:19:38 -07:00
Jason Ekstrand
0b830081f0 intel/fs: Rework KSP data to be SIMD width-based
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
9d78abbef8 intel/compiler: Add and use helpers for working with KSP indices
The pixel shader dispatch table is kind of a confusing mess.  This adds
some helpers for dealing with it and for easily extracting the correct
data from wm_prog_data.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
85750348bc i965: Re-arrange shader kernel setup in WM state
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
5b6e91dd35 intel/fs: Remove program key argument from generator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
a14fb0184a intel/fs: Set up FB write message headers in the visitor
Doing instruction header setup in the generator is awful for a number
of reasons.  For one, we can't schedule the header setup at all.  For
another, it means lots of implied writes which the instruction scheduler
and other passes can't properly reason about.  The second isn't a huge
problem for FB writes since they always happen at the end.  We made a
similar change to sampler handling in ff4726077d.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
dda31a7bbc intel/fs: Fix implied_mrf_writes() for headerless FB writes.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
90643689aa intel/fs: Fix fs_inst::flags_written() for Gen4-5 FB writes.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
ed09e78023 intel/eu: Return new instruction to caller from brw_fb_WRITE().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
c0a1c248b8 intel/fs: Pull FB write implied headers from src[0]
Now that we have the implied header in src[0] for tracking purposes, we
may as well use it in the generator.  This makes things a tiny bit more
general.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
b1cc9a9ae1 intel/fs: Properly track implied header regs read by FB writes
The FB write opcode on gen4-5 does implied copies from g0 and g1 to the
message payload.  With this commit, we start tracking that as part of
the IR by having the FB write read from g0-1.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
d91fa20655 intel/fs: FS_OPCODE_REP_FB_WRITE has side effects
It doesn't matter since we don't ever run replicated write shaders
through the optimizer but it's good to be complete.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Dylan Baker
e83cd38eac docs: Add news item for mesa 18.1.2
Which I forgot to do when 18.1.2 came out.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-06-28 10:06:44 -07:00
Rhys Perry
c92eb71a65 nvc0: remove magic values in nve4_set_tex_handles()
With this commit, things no longer break if NVC0_CB_AUX_TEX_INFO is
changed to anything other than 0x20.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-06-28 18:22:06 +02:00
Rhys Perry
6bb0f87c60 nvc0/ir: fix TargetNVC0::insnCanLoadOffset()
Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset
could be set to a specific value. The IndirectPropagation pass expected
it to return whether the offset could be increased by a specific value,
which is what TargetNV50::insnCanLoadOffset() does.
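
The difference between the two meanings can be sketched as follows. The offset range is a hypothetical value chosen for the example (the commit message does not state the encodable range), and both function names are illustrative.

```python
# Hypothetical offset range; the real encodable range depends on the
# instruction and is not stated in the commit message.
MAX_OFFSET = 0xFFFF


def can_set_offset(new_offset):
    """Old meaning: may the offset be set to exactly `new_offset`?"""
    return 0 <= new_offset <= MAX_OFFSET


def can_add_offset(current_offset, delta):
    """Meaning the pass expects: may the existing offset grow by `delta`?"""
    return 0 <= current_offset + delta <= MAX_OFFSET
```

A small delta passes the first check on its own, yet can still overflow once it is added to an offset that is already close to the limit.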

Fixes: 37b67db6ae
	("nvc0/ir: be careful about propagating very large offsets into const load")

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-06-28 18:22:06 +02:00
Alok Hota
5b7d4f9428 swr/rast: Updating code style based on current clang-format rules
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-06-28 08:18:14 -05:00
Vinson Lee
f90a60fe79 swr/rast: Fix addPassesToEmitFile usage with llvm-7.0.
Fix build error after llvm-7.0svn r332881 ("CodeGen: Add a dwo output
file argument to addPassesToEmitFile and hook it up to dwo output.").

  CXX      rasterizer/jitter/libmesaswr_la-JitManager.lo
rasterizer/jitter/JitManager.cpp:368:93: error: too few arguments to function call, expected at least 4, have 3
        pTarget->addPassesToEmitFile(*pMPasses, filestream, TargetMachine::CGFT_AssemblyFile);
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                                        ^

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-06-28 08:18:06 -05:00
Alok Hota
c7e9102d89 swr/rast: Handling removed LLVM intrinsics in trunk
- Functionality replaced with emulated intrinsics
- Fixes Bug 106558

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-06-28 08:18:00 -05:00
Alok Hota
83d3ddd0ec swr/rast: Adding SCATTERPS functionality to BuilderGfxMem
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-06-28 08:17:55 -05:00
Alok Hota
4509cdbb37 swr/rast: Adding Read/Write specifier to TranslateGfxAddress stack
- Removing unused generic translate function
- Requiring read/write specifier in builder_gfx_mem

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-06-28 08:17:33 -05:00
Chad Versace
dc6665422a gallium: Fix automake for Android (v2)
Chromium OS uses Autotools and pkg-config when building Mesa for
Android. The gallium drivers were failing to find the headers and
libraries for zlib and Android's libbacktrace.

v2:
  - Don't add a check for zlib.pc. configure.ac already checks for
    zlib.pc elsewhere. [for tfiga]
  - Check for backtrace.pc separately from the other Android libs.
    [for tfiga]

Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-27 19:58:16 -07:00
Timothy Arceri
2a5121bf35 glsl: skip comparison opt when adding vars of different size
The spec allows adding scalars with a vector or matrix. In this case
the opt was losing swizzle and size information.
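
A minimal sketch of the guard, assuming operands can be summarized by their component count. `Value` and `optimize_add_cmp_zero` are hypothetical, and the result is rendered as a string purely for readability; the real pass works on GLSL IR.

```python
from collections import namedtuple

# Operand summary: a name and the number of components it carries.
Value = namedtuple("Value", ["name", "components"])


def optimize_add_cmp_zero(x, y, op):
    """Rewrite (x + y) op 0 as x op (-y), but only when x and y have the
    same number of components; otherwise leave the expression alone."""
    if x.components != y.components:
        return "({} + {}) {} 0".format(x.name, y.name, op)
    return "{} {} -{}".format(x.name, op, y.name)
```

For a scalar added to a vec4 the rewrite is skipped, since comparing the scalar against the negated vector would drop the implicit widening that the addition performed.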

This fixes a bug with Doom (2016) shaders.

Fixes: 34ec1a24d6 ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-28 12:15:17 +10:00
Jason Ekstrand
e8eb182ec5 Revert "anv: Print the actual enum for ignored structure types"
This reverts commit fda7014c35.  It was
hitting an unreachable when the sType was unknown.
2018-06-27 14:10:37 -07:00
Jason Ekstrand
fda7014c35 anv: Print the actual enum for ignored structure types
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-06-27 12:43:18 -07:00
Jason Ekstrand
6a35ba5ce9 i965/bufmgr: Use the correct argument order for bo_alloc_internal
The memzone and flags parameters were accidentally flipped in the call
from brw_bo_alloc_tiled_2d.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-27 12:43:18 -07:00
Keith Packard
60e6b6fa96 vulkan/wsi_common_display: Return SURFACE_LOST for fatal DRM errors
Instead of encouraging the client to re-create the swapchain and keep
going with an OUT_OF_DATE error, tell the client that further use of
the current surface will not succeed as the associated kernel objects
are no longer valid.

In particular, when a DRM lease is revoked, then the client needs to
get another lease and create a new surface for that.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-27 10:02:18 -07:00
Eric Anholt
6bb046cd29 glsl: Make sure that packed varyings reflect always_active_io properly.
The always_active_io flag was only set according to the first variable
that got packed in, so NIR io compaction would end up compacting XFB
varyings that shouldn't move at that point.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-27 09:35:55 -07:00
Eric Anholt
ad1a4cb563 v3d: Fix Z clipping when viewport.scale[2] is negative.
Fixes:
dEQP-GLES3.functional.shaders.builtin_variable.depth_range_fragment
dEQP-GLES3.functional.shaders.builtin_variable.depth_range_vertex
2018-06-27 09:35:51 -07:00
Eric Anholt
9f80bcc2bc v3d: Convert a bunch of our "minus one" fields over to the new XML attr.
This fixes up their formatting for CLIF files and makes the code more
legible.
2018-06-27 09:13:48 -07:00
Eric Anholt
18b1bb0b63 v3d: Add pack/unpack/decode support for fields with a "- 1" modifier.
Right now, we name these fields as "field name minus one" so that your C
code obviously states what the value should be.  However, it's easy enough
to handle at the codegen level with another little XML attribute, meaning
less C code and easier-to-read values in CLIF dumping and gdb as well.

(The actual CLIF format for simulator and FPGA replay takes in
pre-minus-one values, so we need it there too).
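
The idea of handling the adjustment at the codegen level can be sketched like this. `pack_field` and `unpack_field` are hypothetical helpers with an assumed inclusive `start..end` bit range; they are not the generated V3D pack functions.

```python
def pack_field(value, start, end, minus_one=False):
    """Place `value` into bits start..end (inclusive) of a word. With
    minus_one set, the caller passes the natural value and the encoder
    stores value - 1."""
    if minus_one:
        assert value >= 1
        value -= 1
    width = end - start + 1
    assert 0 <= value < (1 << width)
    return value << start


def unpack_field(word, start, end, minus_one=False):
    """Inverse of pack_field: extract the bits and undo the adjustment."""
    width = end - start + 1
    value = (word >> start) & ((1 << width) - 1)
    return value + 1 if minus_one else value
```

Callers then write the count they mean, and a 4-bit field can express 1 through 16 instead of 0 through 15.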
2018-06-27 09:13:48 -07:00
Tapani Pälli
e9a77c3e96 i965: small cleanup in blorp debug printing output (trivial)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-27 11:05:48 +03:00
Tapani Pälli
9a92acec67 mesa: add a space between headers and source (trivial)
There used to be one and it looks like it was removed by eb63640c1d.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-27 11:05:48 +03:00
Tapani Pälli
58ba7ab535 features.txt: mark some extensions as done
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-27 11:05:48 +03:00
Danylo Piliaiev
e7cdaa895a mesa: Return number of result bits for GL_ANY_SAMPLES_PASSED_CONSERVATIVE
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106986
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-06-27 11:02:34 +03:00
Samuel Pitoiset
7a57c82767 radv: use separate bind points for the dynamic buffers
The Vulkan spec says:

   "pipelineBindPoint is a VkPipelineBindPoint indicating whether
    the descriptors will be used by graphics pipelines or compute
    pipelines. There is a separate set of bind points for each of
    graphics and compute, so binding one does not disturb the other."
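
The quoted rule can be sketched as state indexed by bind point. This is an illustration only and not radv's actual data layout: `enum bind_point`, `struct cmd_state`, `bind_dynamic_offset` and the array sizes are hypothetical names chosen for the example.

```c
#include <assert.h>

/* Hypothetical stand-ins for the VkPipelineBindPoint values. */
enum bind_point {
   BIND_POINT_GRAPHICS = 0,
   BIND_POINT_COMPUTE = 1,
   BIND_POINT_COUNT
};

#define MAX_DYNAMIC_BUFFERS 4

/* Dynamic buffer offsets are tracked per bind point, not globally. */
struct cmd_state {
   unsigned dynamic_offsets[BIND_POINT_COUNT][MAX_DYNAMIC_BUFFERS];
};

static void
bind_dynamic_offset(struct cmd_state *state, enum bind_point point,
                    unsigned slot, unsigned offset)
{
   assert(point < BIND_POINT_COUNT && slot < MAX_DYNAMIC_BUFFERS);
   state->dynamic_offsets[point][slot] = offset;
}
```

Binding slot 0 for compute after binding slot 0 for graphics leaves the graphics value untouched, which is the behaviour the spec text asks for.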

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-27 09:48:31 +02:00
Samuel Pitoiset
9c09e7d66e radv: remove unused 'predicated' parameter from some functions
It's always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-27 09:48:15 +02:00
Dave Airlie
a6b64d6dde virgl: add ARB_texture_view support
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-06-27 14:08:00 +10:00
Jason Ekstrand
ff6db94c18 nir/opt_if: Remove unneeded phis if we make progress
Now that SSA values can be derefs and they have special rules, we have
to be a bit more careful about our LCSSA phis.  In particular, we need
to clean up in case LCSSA ended up creating a phi node for a deref.
This fixes validation issues with some Vulkan CTS tests with the new
deref instructions.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-06-26 10:47:26 -07:00
Samuel Pitoiset
fa42fa1a60 radv: emit PIPELINESTAT_{START,STOP} events for pipeline stats queries
Ported from RadeonSI.
This appears to fix some random fails with:
dEQP-VK.query_pool.statistics_query.*

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-26 18:23:16 +02:00
Tapani Pälli
ab2643e4b0 glsl: serialize data from glTransformFeedbackVaryings
While XFB has been enabled for cache, we did not serialize enough
data for the whole API to work (such as glGetProgramiv).

Fixes: 6d830940f7 "Allow shader cache usage with transform feedback"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-26 12:44:22 +03:00
Samuel Pitoiset
bcbd8dd6c9 radv: enable VK_EXT_shader_stencil_export
The driver already supports exporting the stencil value.

The following CTS test now passes:
dEQP-VK.pipeline.shader_stencil_export.op_replace

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-26 10:40:10 +02:00
Samuel Pitoiset
ba5e25ed29 radv: ignore pInheritanceInfo for primary command buffers
From the Vulkan spec:
"If this is a primary command buffer, then this value is ignored."

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-26 10:39:43 +02:00
Andrii Simiklit
232c5d75ea i965/gen6/gs: Handle case where a GS doesn't allocate VUE
We can not use the VUE Dereference flags combination for EOT
message under ILK and SNB because the threads are not initialized
there with initial VUE handle unlike Pre-IL.
So to avoid GPU hangs on SNB and ILK we need
to avoid usage of the VUE Dereference flags combination.
(Tested only on SNB, but according to the specification,
SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6,
ILK should behave the same way.)

v2: Approach to fix this issue was changed.
Instead of different EOT flags in the program end
we will create VUE every time even if GS produces no output.

v3: Clean up the patch.
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-06-26 08:18:55 +02:00
Dave Airlie
318ff60ccd radeon: duplicate cmask surface for now.
The radeon winsys isn't linked against the ac code; I have vague
memories of this causing some problems before. For now, fix the build
by just duplicating the code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-26 11:26:35 +10:00
Marek Olšák
bd963f8430 radeonsi: rename r600_transfer -> si_transfer
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
eabeeb86b2 radeonsi: properly set cmask_buffer in si_reallocate_texture_inplace
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
d4755ef389 radeonsi: remove redundant si_texture::cmask_size
cmask_buffer and surface.cmask_size can replace its role.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
2a8d1039b6 radeonsi: inline struct r600_cmask_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
166250f4e5 radeonsi: move CMASK size computation into ac_surface
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
3da693b7d9 ac/surface: move cmask_size/alignment into radeon_surf
cmask_size is changed to uint32_t because it can't be greater than 4GB.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
2d64a68c6f radeonsi: rename r600_surface -> si_surface
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
218e133695 radeonsi: rename r600_memory_object -> si_memory_object
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
e5df04f13d radeonsi: remove unused r600_memory_object::offset
The real offset is passed through resource_from_memobj.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
45004abfd5 radeonsi: unify duplicated texture_from_handle & texture_from_memobj
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
cac7ab1192 radeonsi: reorder and initialize more fields in si_reallocate_texture_inplace
Some fields shouldn't be initialized, like framebuffers_bound and other stats.
It's hopefully complete now.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
2018-06-25 18:33:58 -04:00
Marek Olšák
7888245ef3 radeonsi: stop using lp_build_emit_llvm_unary/binary
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
0810f15046 radeonsi: stop using lp_build_alloc
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
21ba8a204e radeonsi: use gallivm less
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
965904eebd radeonsi: stop using lp_bld_intr.h
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
6ab54d25a6 radeonsi: remove last uses of lp_build_context::undef
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
30f3e2200a radeonsi: stop using lp_bld_arit.h
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
5f54fc3ad1 radeonsi: stop using lp_build_gather_values
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
7bd40dc2f2 radeonsi: clean up some #includes
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
f154555733 radeonsi: clean up passing the is_monolithic flag for compilation
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Robert Foss
c7bb82136b egl/android: Add DRM node probing and filtering
This patch both adds support for probing & filtering DRM nodes
and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD
gralloc call.

Currently the filtering is based just on the driver name,
and the desired name is supplied using the "drm.gpu.vendor_name"
Android property.
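
A rough sketch of name-based filtering, assuming the candidate nodes have already been probed. It does not use libdrm or the Android property API: `struct drm_node`, `pick_node`, and the fallback to the first node when no name is requested are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one probed DRM node. */
struct drm_node {
   const char *path;
   const char *driver_name;
};

/* Return the first node whose driver name matches vendor_name.
 * With no requested name (NULL or empty), accept the first node. */
static const struct drm_node *
pick_node(const struct drm_node *nodes, size_t count,
          const char *vendor_name)
{
   for (size_t i = 0; i < count; i++) {
      if (vendor_name == NULL || vendor_name[0] == '\0' ||
          strcmp(nodes[i].driver_name, vendor_name) == 0)
         return &nodes[i];
   }
   return NULL; /* no node matched the requested driver */
}
```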

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-06-25 18:54:10 +02:00
Rob Herring
3f7bca44d9 egl/android: #ifdef out flink name support
Maintaining both flink names and prime fd support which are provided by
2 different gralloc implementations is problematic because we have a
dependency on a specific gralloc implementation header.

This mostly disables the dependency on the gralloc implementation and
headers. The dependency on GRALLOC_MODULE_PERFORM_GET_DRM_FD remains for
now, but the definition is added locally to remove the header
dependency.

drm_gralloc support can be enabled by setting
BOARD_USES_DRM_GRALLOC=true in BoardConfig.mk.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-06-25 18:54:09 +02:00
Robert Foss
5a34aba07d gallium/util: Fix build error due to cast to different size
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-25 18:54:09 +02:00
Samuel Pitoiset
07cb1373a2 radv: fix HTILE metadata initialization in presence of subpass clears
If the driver ends up performing a slow depthstencil clear,
the HTILE metadata won't be initialized correctly.

This fixes random VM faults on Polaris while running CTS
with Bas's runner. This doesn't seem to regress performance.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-25 17:38:59 +02:00
Gert Wollny
eebb65258d r600/sb: give the scheduler more margin to find valid instructions groups
For instruction sequences that change the address register with every load
the current limit to bail out of the scheduler and reject the optimisation
was too tight, i.e. it was expected that at least one pending instruction
would be scheduled each time.

Give the scheduler more margin to sort out these load sequences by allowing
a number of rounds where no instruction is scheduled.
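
The bail-out rule can be modelled with a small stall budget. This is a toy model in C, not the r600/sb scheduler: `schedule_with_margin`, the `ready_round` array, and resetting the stall counter after each productive round are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: ready_round[i] is the round at which instruction i becomes
 * schedulable. Returns true if every instruction gets scheduled before
 * more than max_stalled_rounds consecutive rounds pass without progress. */
static bool
schedule_with_margin(const unsigned *ready_round, unsigned count,
                     unsigned max_stalled_rounds)
{
   bool done[16] = { false };
   unsigned remaining = count, stalled = 0;

   assert(count <= 16);
   for (unsigned round = 0; remaining > 0; round++) {
      bool progress = false;
      for (unsigned i = 0; i < count; i++) {
         if (!done[i] && ready_round[i] <= round) {
            done[i] = true;
            remaining--;
            progress = true;
         }
      }
      if (progress)
         stalled = 0;
      else if (++stalled > max_stalled_rounds)
         return false; /* bail out and reject the optimisation */
   }
   return true;
}
```

With a margin of zero, the old behaviour is reproduced: a single round without progress rejects the sequence.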

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106163

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-25 05:40:19 +01:00
Gert Wollny
cd7db0ab0a r600/sb: fix rotated register in while loop
This patch is based on
https://lists.freedesktop.org/archives/mesa-dev/2018-February/185805.html

Dave Airlie:

 "A bunch of CTS tests led me to write
  tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
  which r600/sb always fell over on.

  GCM seems to move some of the copies into other basic blocks,
  if we don't allow this to happen then it doesn't seem to schedule
  them badly.

  Everything I've read on SSA/phi copies say they have to happen
  in parallel, so keeping them in the same basic block seems like
  a good way to keep some of that property."

This patch differs from the one proposed by Dave in that it only adds
the NF_DONT_MOVE flag to copy_move instructions that are created by split_phi*
and that are located in loops.

Fixes piglit: tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
(no regressions in the shader set). It also fixes all failing tests from

  dEQP-GLES3.functional.shaders.loops.*

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-25 05:39:41 +01:00
Rob Clark
1977e92ee3 freedreno/ir3: fix deref conversion fallout
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-23 18:23:11 -04:00
Rob Clark
445871de94 freedreno/ir3: fix unused variable warning
Fixes: cf0c7258ee freedreno/a5xx: MSAA
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-23 18:23:11 -04:00
Rob Clark
868ca81cbe freedreno: fix HW_ATOMIC_COUNTERS cap
This was mistakenly exposed, even though we want atomic counters to be
lowered to atomic ops on an SSBO like nearly every other GPU.  This
somehow recently started causing segfaults due to calling a null
pipe->set_hw_atomic_buffers().

Fixes a crash in stk, and probably other things.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-23 18:23:11 -04:00
Keith Packard
1df586be12 radv: add VK_EXT_display_control to radv driver [v5]
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.

v2:
	Rework fence integration into the driver so that waiting for
	any of a mixture of fence types (wsi, driver or syncobjs)
	causes the driver to poll, while a list of just syncobjs or
	just driver fences will block. When we get syncobjs for wsi
	fences, we'll adapt to use them.

v3:	Adopt Jason Ekstrand's coding conventions

	Declare variables at first use, eliminate extra whitespace between
	types and names. Wrap lines to 80 columns.

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v4:	Adapt to WSI fence API change. It now returns VkResult and
	no longer has an option for relative timeouts.

v5:	wsi_register_display_event and wsi_register_device_event now
	use the default allocator when NULL is provided, so remove the
	computation of 'alloc' here.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-23 07:59:00 -07:00
Keith Packard
16eb390834 anv: add VK_EXT_display_control to anv driver [v5]
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.

v2:	Adopt Jason Ekstrand's coding conventions

	Declare variables at first use, eliminate extra whitespace between
	types and names. Wrap lines to 80 columns.

	Add extension to list in alphabetical order

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v3:	Adapt to WSI fence API change. It now returns VkResult and
	no longer has an option for relative timeouts.

v4:	wsi_register_display_event and wsi_register_device_event now
	use the default allocator when NULL is provided, so remove the
	computation of 'alloc' here.

v5:	use zalloc2 instead of alloc2 for the WSI fence.

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2018-06-23 07:59:00 -07:00
Keith Packard
86c8d93e5a vulkan: add VK_EXT_display_control [v10]
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.

v2: Remove DRM_CRTC_SEQUENCE_FIRST_PIXEL_OUT flag. This has
    been removed from the proposed kernel API.

    Add NULL parameter to drmCrtcQueueSequence ioctl as we
    don't care what sequence the event was actually queued to.

v3: Adapt to pthread clock switch to MONOTONIC

v4: Fix scope for wsi_display_mode and wsi_display_connector allocs

    Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

v5: Adopt Jason Ekstrand's coding conventions

    Declare variables at first use, eliminate extra whitespace between
    types and names. Wrap lines to 80 columns.

    Use wsi_rel_to_abs_time helper function to convert relative
    timeouts to absolute timeouts without causing overflow.

    Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v6:
    Change WSI fence wait function to return VkResult instead of
    bool. This makes the meaning of the return value easier to
    understand, and allows for the indication of failure.

    Also change the WSI fence wait function to take only absolute
    timeouts and not provide an option for a relative timeout. No
    users wanted relative timeouts, and it's simpler if that option
    isn't available.

    Terminate the DPMS property loop once we've found the property.

    Assert that the fence hasn't already been destroyed in
    wsi_display_fence_destroy.

    Rearrange the event handler function order in the file to place
    routines in an easier to find order.

    Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v7:
    Adapt to API changes for surface_get_capabilities

v8:
    Use wsi->alloc in register_display_event so that callers
    don't have to dig out an allocator for us.

v9:
    Fix a few minor formatting issues

    Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v10:
    Use wsi->alloc if none provided in wsi_display_fence_alloc.

    Now that drivers are expected to pass the allocator argument
    straight through from the application, we need to check those
    for NULL everywhere.
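
The overflow-safe conversion mentioned in v5 can be sketched as a saturating add. This is an assumption about the shape of such a helper, not the body of wsi_rel_to_abs_time: the name `rel_to_abs_time` is hypothetical, and the current time is passed in as a parameter rather than read from a clock.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch only: convert a relative timeout to an absolute one. */
static uint64_t
rel_to_abs_time(uint64_t current_time, uint64_t rel_time)
{
   /* Saturate instead of wrapping when the sum would exceed UINT64_MAX,
    * so "wait forever" stays "wait forever". */
   if (rel_time > UINT64_MAX - current_time)
      return UINT64_MAX;
   return current_time + rel_time;
}
```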

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2018-06-23 07:59:00 -07:00
Keith Packard
5581dd5c32 anv: Support wait for heterogeneous list of fences [v3]
Handle the case where the set of fences to wait for is not all of the
same type by either waiting for them sequentially (waitAll), or
polling them until the timer has expired (!waitAll). We hope the
latter case is not common.
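
The two strategies can be sketched against a simulated clock. This is not anv code: `struct fake_fence`, `wait_for_fences`, and the tick-based clock are hypothetical, and a real driver would block or sleep instead of advancing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fence: signaled once the clock reaches signal_time. */
struct fake_fence {
   uint64_t signal_time;
};

static bool
fence_signaled(const struct fake_fence *f, uint64_t now)
{
   return now >= f->signal_time;
}

/* Returns true on success, false on timeout. *now is a simulated clock
 * that the wait itself advances. */
static bool
wait_for_fences(const struct fake_fence *fences, unsigned count,
                bool wait_all, uint64_t abs_timeout, uint64_t *now)
{
   if (wait_all) {
      /* Wait for each fence in turn, whatever its type. */
      for (unsigned i = 0; i < count; i++) {
         if (fence_signaled(&fences[i], *now))
            continue;
         if (fences[i].signal_time > abs_timeout) {
            if (*now < abs_timeout)
               *now = abs_timeout;
            return false;
         }
         *now = fences[i].signal_time;
      }
      return true;
   }

   /* Wait-any: poll every fence, checking each at least once even if
    * the timeout has already passed. */
   do {
      for (unsigned i = 0; i < count; i++) {
         if (fence_signaled(&fences[i], *now))
            return true;
      }
      (*now)++;
   } while (*now <= abs_timeout);
   return false;
}
```

The wait-any branch is a busy poll, which is why the commit message hopes that case stays uncommon.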

While the current code makes sure that it always has fences of only
one type, that will not be true when we add WSI fences. Split out this
refactoring to make merging that clearer.

v2: Adopt Jason Ekstrand's coding conventions

    Declare variables at first use, eliminate extra whitespace between
    types and names. Wrap lines to 80 columns.

    Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v3:
    Cast INT64_MAX to uint64_t to make its use as the maximum
    possible timeout clearly unsigned to the reader.

    Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

    Make anv_wait_for_fences with !waitAll check all fences at least
    once, even if the requested timeout has already passed.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2018-06-23 07:59:00 -07:00
Bas Nieuwenhuizen
8c4f430d43 radv: Enable lower_io_to_temporaries after deref changes.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 21:23:06 -07:00
Jason Ekstrand
aef4213fca nir/lower_system_values: Assert/assume direct var derefs
System values are never arrays or structs so we can assume a direct var
deref.  This simplifies things a bit and prevents us from accidentally
throwing away an array index.

Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 21:23:06 -07:00
Jason Ekstrand
a331d7d1cd nir: Remove old-school deref chain support
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 21:23:06 -07:00
Jason Ekstrand
9800b81ffb nir: Remove deref chain support from analyze_loops
Note that this patch needs to come late in the series since this pass
can be run after any pass that damages nir_metadata_loop_analysis.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 21:23:06 -07:00
Rob Clark
2db8784167 freedreno/ir3: convert to deref instructions
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 21:23:05 -07:00
Rob Clark
95683bdce3 nir: promote intrinsic_get_var() to helper
Useful in a few other places.. let's not copy-pasta

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
5a02ffb733 nir: Rework lower_locals_to_regs to use deref instructions
This completely reworks the pass to support deref instructions and
delete support for old deref chains

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
2fa7a4a541 intel,ir3: Re-enable nir_opt_copy_prop_vars
Now that it's rewritten for deref instructions, we can turn it back on.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
67df3739c5 radeonsi: Remove deref chain support in nir scan pass.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
9cb345588b radv: Remove deref chain support in radv shader info pass.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
a1e9d799ad ac/nir: Remove deref chain support.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
9bfd81b217 radeonsi: Add deref support to the nir scan pass.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
ba2bd20f87 nir: Rework opt_copy_prop_vars to use deref instructions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
fa6ffcc083 nir/copy_prop_vars: Re-order some logic in compare_derefs
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
c5d9a65944 nir: Remove deref chain support from split_per_member_structs
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
18175ab66f nir: Remove deref chain support from opt_undef
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
aeb4bbfd1e nir: Remove deref chain support from split_var_copies
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
636256cdc7 nir: Remove deref chain support from dead_variables
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
378d7cf3ba nir: Remove deref chain support from propagate_invariant
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
c6a9c2b60b nir: Remove deref chain support from lower_var_copies
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
fc59230a46 nir: Remove deref chain support from lower_drawpixels
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
d4dd2ca4a7 nir: Remove deref chain support from opt_peephole_select
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
54bfc0cbcf nir: Remove deref chain support from lower_tex
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
a3589bb01f nir: Remove deref chain support from lower_wpos_ytransform
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
3992665c52 nir: Remove deref chain support from lower_wpos_center
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
8a62db7712 nir: Remove deref chain support from lower_system_values
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
e5db1b951c nir: Remove deref chain support from remove_unused_varyings
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
6bdd867968 nir: Delete lower_io_types
It's only used by the ir3 stand-alone compiler and Rob said we could
delete it.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
c6fc653232 nir: Remove deref chain support from lower_phis_to_scalar
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
47ffb893e6 nir: Convert lower_io to deref instructions
This deletes support for _var intrinsics and legacy deref chains in
favor of deref instructions.  The internals are also reworked a bit to
use deref instructions directly.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
0d03c63e91 nir/lower_io: Convert atomic lowering to deref instructions
No one is currently using it, so we can make this change irrespective of
driver.  We may use it again in i965 so it's best to pretend to keep it
working.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
c290e8c4b0 nir: Remove deref chain support from lower_global_vars_to_local
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
41c52c963a nir: Remove deref chain support from lower_clamp_color_outputs
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
d2adc08abe nir: Remove deref chain support from lower_alpha_test
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
81f29d6d33 nir: Remove deref chain support from lower_atomics
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
4b0ea65333 nir: Remove deref chain support from lower_clip_cull_distance_arrays
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
a42af8d0d6 nir: Remove deref chain support from lower_indirect_derefs
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
69866af357 nir: Rework gather_info to entirely use deref instructions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
b1a18b8797 nir/vars_to_ssa: Rework to entirely use deref instructions
This commit reworks nir_lower_vars_to_ssa to use deref instructions and
deref paths internally instead of deref chains.  We also drop support
for the old load/store/copy_var intrinsics.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
f747ff1969 nir/vars_to_ssa: Add an is_direct field to deref_node
This makes us build the is_direct parameter as the nodes are constructed
rather than as we walk the chain.  This will be useful later.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Eric Anholt
e1f0a1b029 broadcom/vc4: Remove deref chain support from nir_lower_txf_ms.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
3d19f116ad st,ir3,radeonsi: push lower_deref_instrs back into driver
vc4+vc5 is not really affected by the deref chain to deref instr
conversion, so it no longer needs this pass.  For others, now that
all the passes mesa/st uses are using deref instructions, push the
lowering to deref chains back into driver.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
3e8879be5c nir/lower_samplers: remove legacy version
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
a20929fed2 nir: convert lower_samplers_as_deref to deref instructions
This also removes the legacy version of lower_samplers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
0bc15340be mesa/st: re-enable lower_io_to_elements()
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
245ce114c9 nir: convert lower_io_arrays_to_elements to deref instructions
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
c409cfddcf mesa/st/nir: convert lower_builtins to deref instructions
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
3859e0b4fe mesa/st: temporarily disable lower_io_to_elements()
Not required for correctness, and makes the order of converting passes
to deref instructions hard to get right for both prog_to_nir and
glsl_to_nir cases.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
c6009a1e8e nir: convert lower_io_to_scalar to deref instructions
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
d143f6c856 move lower_deref_instrs
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
d7b0be48ef nir: Use deref instructions in lower_constant_initializers
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
85f4149f8a nir/builder: Use deref instructions for load/store/copy_var
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
3573570afe radv: Disable lower_io_to_temporaries during deref changes.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
75286c2d08 nir: Use derefs in nir_lower_samplers
We change glsl_to_nir to provide derefs for both textures and samplers
while we're at it.  This makes the lowering much easier since we only
either replace sources or remove them.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
36efae1d66 nir/lower_samplers: Clean up function arguments
This little refactor makes us stop passing stage around and puts the
builder as the first parameter to some functions.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Rob Clark
a6ebbbc594 nir/lower_samplers: split out _legacy version for deref chains
To simplify the transition, and make things bisectable, split out a
legacy copy of lower_samplers.  This way the i965 and gallium drivers
can independently switch over to deref instructions.

Since the lower_samplers_as_deref pass is only used by gallium drivers,
it can be converted in lock-step with moving the lower_deref_instrs
pass, and so does not need a corresponding _legacy clone.

This legacy pass will be removed in a future commit.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
3891c1906f intel/blorp: Stop setting tex->texture/sampler
nir_tex_instr_create uses rzalloc so it's already NULL

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
606eb56ab9 intel/nir: Only lower load/store derefs
Everything else should already be handled.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
71cd9ebed9 intel/fs: Use image_deref intrinsics instead of image_var
Since we had to rewrite the deref walking loop anyway, I took the
opportunity to make it a bit clearer and more efficient.  In particular,
in the AoA case, we will now emit one minmax instead of one per array
level.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
032b845edf anv/pipeline: Convert apply_pipeline_layout to deref instructions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
43bb707fa4 anv/apply_pipeline_layout: Simplify extract_tex_src_plane
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
9fb36011d1 anv/pipeline: Convert lower_multiview to deref instructions
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
d57e724a45 anv/pipeline: Convert YCbCr lowering to deref instructions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
38f1b89805 anv/pipeline: Convert lower_input_attachments to deref instructions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
5cd7324a57 anv/pipeline: Do less deref instruction lowering
This commit removes most of the deref instruction lowering.  Instead of
lowering early, we only lower textures and images and we only do so
right before any of the anv image lowering passes.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
1d59034de2 radv: Remove image_var stores.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
43af92edc5 radv: Use deref instructions for tex derefs in meta shaders.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
657cedb12f ac/nir: Add deref interp support.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
d00e7d42f5 ac/nir: Add shared atomic deref instr support.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
302884d121 radv: Gather info for deref instr based load/store.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
547d970122 ac/nir: Add deref based var loads/stores.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:03 -07:00
Bas Nieuwenhuizen
5780af9880 radv: Add shader info support for image deref instructions.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:02 -07:00
Bas Nieuwenhuizen
506a07e4e3 ac/nir: Add deref support to image intrinsics.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen
bb5781c9a7 ac/nir: Implement derefs for integer gather4 lowering.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen
ca271e266e ac/nir: Support deref instructions in tex instructions.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen
9b14eacf0e ac/nir: Support deref instructions in get_sampler_desc.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen
4a888beea9 ac/nir: Implement the deref instr for shared memory.
v2: Store the result in ctx->ssa_defs.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:58 -07:00
Jason Ekstrand
c11833ab24 nir,spirv: Rework function calls
This commit completely reworks function calls in NIR.  Instead of having
a set of variables for the parameters and return value, nir_call_instr
now simply has a number of sources which get mapped to load_param
intrinsics inside the functions.  It's up to the client API to build an
ABI on top of that.  In SPIR-V, out parameters are handled by passing
the result of a deref through as an SSA value and storing to it.

The virtue of this approach can be seen by how much it allows us to
delete from core NIR.  In particular, nir_inline_functions gets halved
and goes from a fairly difficult pass to understand in detail to almost
trivial.  It also simplifies spirv_to_nir somewhat because NIR functions
never were a good fit for SPIR-V.

Unfortunately, there is no good way to do this without a mega-commit.
Core NIR and SPIR-V have to be changed at the same time.  This also
requires changes to anv and radv because nir_inline_functions couldn't
handle deref instructions before this change and can't work without them
after this change.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:58 -07:00
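The function-call rework described in c11833ab24 can be illustrated with a toy model. This is only a sketch, not NIR code: the tuple-based instruction format, the `inline_call` helper, and the `prefix` renaming scheme are all assumptions made for illustration. It shows why inlining becomes nearly trivial once parameters are plain call sources: each load_param is replaced by the matching source, and an out parameter is just a pointer value that the callee stores through.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction encoding, for illustration only:
#   ("load_param", dest, index)
#   ("store_deref", ptr, value)
#   (op, dest, src, src, ...)   for everything else
Instr = Tuple


def inline_call(body: List[Instr], call_sources: List[str], prefix: str) -> List[Instr]:
    """Copy `body` into the caller, replacing each load_param with a call source."""
    rename: Dict[str, str] = {}
    out: List[Instr] = []
    for instr in body:
        op = instr[0]
        if op == "load_param":
            # The parameter value is simply the caller's source; emit nothing.
            _, dest, index = instr
            rename[dest] = call_sources[index]
            continue
        if op == "store_deref":
            _, ptr, value = instr
            out.append((op, rename.get(ptr, ptr), rename.get(value, value)))
        else:
            _, dest, *srcs = instr
            new_dest = prefix + dest  # keep inlined names distinct from the caller's
            out.append((op, new_dest, *[rename.get(s, s) for s in srcs]))
            rename[dest] = new_dest
    return out


# Callee: out = a + a, where `out` is a pointer passed in as parameter 1.
callee = [
    ("load_param", "a", 0),
    ("load_param", "out", 1),
    ("add", "sum", "a", "a"),
    ("store_deref", "out", "sum"),
]
inlined = inline_call(callee, ["x", "deref_result"], "inl_")
```

In this model the caller passes the SSA value of a deref (`deref_result`) as the second source, which mirrors how the commit message says SPIR-V out parameters are handled.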
Jason Ekstrand
58799b6a5b spirv/cfg: Make the builder fully capable for both walks
We were only initializing vtn_builder::func for the pre-walk where we
build the CFG.  We were only initializing the nir_builder for the later
walk through the instructions even though we were setting b->cursor
for the pre-walk.  Let's set both in both places so that everything is
consistent.  This is useful because we handle OpFunctionParameter in the
pre-walk and we're going to need to be able to emit instructions.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:58 -07:00
Jason Ekstrand
3fc3798677 spirv: Record the type of functions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
2f9bfd7dd9 spirv: Update vtn_pointer_to/from_ssa to handle deref pointers
Now that pointers can be derefs and derefs just produce SSA values, we
can convert any pointer to/from SSA.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
d5930c222c spirv: Allow pointers to have a deref at the base
Previously, pointers fell into two categories: index/offset for UBOs,
SSBOs, etc. and var + access chain for logical pointers.  This commit
adds another logical pointer mode that's deref + access chain.

It's tempting to think that we can just replace variable-based pointers
with deref-based or at least replace the access chain with a deref
chain.  Unfortunately, there are a few sticky bits that prevent this:

 1) We can't return deref-based pointers from OpVariable because those
    opcodes may come outside of a function so there's no place to emit
    the deref instructions.

 2) We can't always use variable-based pointers because we may not
    always know the variable.  (We do now, but the upcoming function
    rework will take that option away.)

 3) We also can't replace the access chain struct with a deref.  Due to
    the re-ordering we do in order to handle loop continues, the derefs
    we would emit as part of OpAccessChain may not dominate their uses.
    We normally fix this up with nir_repair_ssa but that generates phi
    nodes which we don't want in the middle of our deref chains.

All in all, we have no real better option than to support partial access
chains while also re-emitting the deref instructions on the spot.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
fdd5ffee32 spirv: Clean up vtn_pointer_to_offset
Now that push constants are using on-the-fly offsets, we no longer need
to handle access chains in vtn_pointer_to_offset.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
7dfa440922 spirv: Make push constants an offset-based pointer
Push constants have been a weird edge-case for a while in that they have
explicit offsets but we've been internally building access chains for
them.  This mostly works but it means that passing pointers to push
constants through as function arguments is broken.  The easy thing to do
for now is to just treat them like UBOs or SSBOs only without a block
index.  This does lose a bit of information since we no longer have an
accurate access range and any indirect access will look like it could
read the whole block.  Unfortunately, there's not much we can do about
that.  Once NIR derefs get a bit more powerful, we can plumb these
through as derefs and be able to reason about them again.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
b0c643d8f5 spirv: Use NIR per-member splitting
Before, we were doing structure splitting in spirv_to_nir.
Unfortunately, this doesn't really work when you think about passing
struct pointers into functions.  Doing it later in NIR is a much better
plan.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
2100c2f3a2 nir/spirv: Pass nir_variable_data into apply_var_decoration
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
39bf61aa37 nir: Add a concept of per-member structs and a lowering pass
This adds a concept of "members" to a variable with an interface type.
It allows you to specify the full variable data for each member of the
interface instead of once for the variable.  We also add a lowering pass
to lower those variables to a sequence of variables and rewrite all the
derefs accordingly.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
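The per-member lowering described in 39bf61aa37 can be sketched with a toy model. Everything here is hypothetical (the `Variable` class, the tuple-based deref encoding, the `split_per_member` helper, and the `.mN` naming); it is not the NIR data structure or pass. The sketch assumes every deref of the split variable selects a member as its first step.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Variable:
    """Toy variable: `members` holds one dict of variable data per member."""
    name: str
    data: dict = field(default_factory=dict)
    members: List[dict] = field(default_factory=list)


# Hypothetical deref encoding: (variable name, step, step, ...).
Deref = Tuple


def split_per_member(var: Variable, derefs: List[Deref]):
    """Split `var` into one variable per member and rewrite derefs to match."""
    split = [Variable(f"{var.name}.m{i}", dict(m)) for i, m in enumerate(var.members)]
    rewritten = []
    for deref in derefs:
        if deref[0] != var.name:
            rewritten.append(deref)  # deref of some other variable: untouched
            continue
        member_index = deref[1]      # first step picks the member
        rewritten.append((split[member_index].name, *deref[2:]))
    return split, rewritten


block = Variable("block", members=[{"location": 0}, {"location": 4}])
new_vars, new_derefs = split_per_member(block, [("block", 1, 2), ("other", 0)])
```

Each new variable carries the data that was previously attached to its member, so later passes see a plain sequence of variables.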
Jason Ekstrand
eb40540b8a spirv: Use deref instructions for most variables
The only thing still using old-school derefs are function calls.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
e5130012e4 st/nir: Move lower_deref_instrs later
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
152057b138 i965: Move nir_lower_deref_instrs to right before locals_to_regs
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
a649610ace nir/lower_tex: Always copy deref and offset sources
This should make nir_lower_tex properly handle deref instructions as
well as make it more correct when texture arrays are used and it's
called after lowering samplers to binding table indices.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
261fe676e5 intel/nir: Fixup deref modes after lowering patch vertices
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
d7d5aab45b intel,ir3: Disable nir_opt_copy_prop_vars
This pass doesn't handle deref instructions yet.  Making it handle both
legacy derefs and deref instructions would be painful.  Since it's not
important for correctness, just disable it for now.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
5dc58908b7 nir: Support deref instructions in opt_undef
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
f46ecdc441 nir: Consider deref instructions in opt_peephole_select
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
1e1733aaf0 nir: Consider deref instructions in lower_phis_to_scalar
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
775ef13384 nir: Support deref instructions in lower_drawpixels
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
932c6577a0 nir: Support deref instructions in lower_clamp_color_outputs
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
076b6627c2 nir: Support deref instructions in lower_alpha_test
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
414148cdc1 nir: Support deref instructions in loop_analyze
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
e786fcf777 nir: Support deref instructions in remove_unused_varyings
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:56 -07:00
Jason Ekstrand
933c2851ab nir: Support deref instructions in lower_pos_center
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
64057fd333 nir: Support deref instructions in lower_wpos_ytransform
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
2c9ca29372 nir: Support deref instructions in lower_atomics
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
d029167ea0 nir: Support deref instructions in lower_io
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
59b43be105 nir: Support deref instructions in gather_info
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
1442969ae1 nir: Support deref instructions in propagate_invariant
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
f23356a4dd nir: Support deref instructions in lower_clip_cull
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
61b7bef3a3 nir: Support deref instructions in lower_system_values
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
1285cc9616 nir: Support deref instructions in lower_indirect_derefs
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
dccb3acb63 nir: Support deref instructions in lower_vars_to_ssa
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
9fe99129df nir: Support deref instructions in split_var_copies
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
4a4e175738 nir: Support deref instructions in lower_var_copies
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:55 -07:00
Jason Ekstrand
a406f7e0c9 nir: Add a deref path helper struct
This commit introduces a new nir_deref.h header for helpers that are
less common and really only needed by a few heavy-duty passes.  In this
header is a new struct for representing a full deref path which can be
walked in either direction.

v2 (Jason Ekstrand):
 - Assert that deref != NULL (Caio)
 - Fill _short_path with 0xdeadbeef in debug builds when not used (Caio)
 - Make nir_deref_path a typedef (Rob)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
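The idea behind the deref path helper in a406f7e0c9 can be sketched as follows. This is a toy model, not the contents of nir_deref.h: the `Deref` class and the `build_path` and `describe` helpers are hypothetical names. The assumption is that each deref instruction only links to its parent, so a pass that needs to walk from the variable down to the leaf first materializes the whole chain as an array.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deref:
    """Toy stand-in for a deref instruction: each one points at its parent."""
    kind: str                          # "var", "array", or "struct"
    value: object                      # variable name, array index, or member index
    parent: Optional["Deref"] = None


def build_path(leaf: Deref) -> List[Deref]:
    """Collect the chain from the variable (root) down to `leaf`."""
    assert leaf is not None
    path = []
    node = leaf
    while node is not None:            # walk leaf -> root via parent links
        path.append(node)
        node = node.parent
    path.reverse()                     # store root -> leaf
    assert path[0].kind == "var", "a full path must start at a variable"
    return path


def describe(path: List[Deref]) -> str:
    """Render a root-to-leaf path as text, e.g. foo[3].m1."""
    out = ""
    for d in path:
        if d.kind == "var":
            out += str(d.value)
        elif d.kind == "array":
            out += f"[{d.value}]"
        else:
            out += f".m{d.value}"
    return out


root = Deref("var", "foo")
leaf = Deref("struct", 1, Deref("array", 3, root))
path = build_path(leaf)
```

Once the path is an array, iterating it forwards goes from the variable to the leaf and iterating it in reverse goes back up, which is the "either direction" property the commit message mentions.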
Jason Ekstrand
535289a3a9 nir: Support deref instructions in lower_io_to_temporaries
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Jason Ekstrand
21befc46ef nir: Support deref instructions in lower_global_vars_to_local
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Jason Ekstrand
54e440945e nir: Add a pass for fixing deref modes
This will be needed by anything which changes variable modes without
rewriting derefs.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Jason Ekstrand
f917814c14 nir: Support deref instructions in remove_dead_variables
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Rob Clark
f03a33a19a ttn: convert to deref instructions
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Jason Ekstrand
82c498510e prog/nir: Use deref instructions for params
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Jason Ekstrand
2c7b892909 glsl/nir: Use deref instructions instead of deref chains
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Jason Ekstrand
7f41a99cac glsl/nir: Only claim to handle intrinsic functions
Non-intrinsic function handling has never actually been tested and
probably doesn't work.  Just get rid of it for now.  We can always add
it back in later if it's useful.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Rob Clark
d80c342d89 nir: add deref lowering sanity checking
This will be removed at the end of the transition, but add some tracking
plus asserts to help ensure that lowering passes are called at the
correct point (pre or post deref instruction lowering) as passes are
converted and the point where lower_deref_instrs() is called is moved.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Jason Ekstrand
74212c2414 anv,i965,radv,st,ir3: Call nir_lower_deref_instrs
This inserts a call to nir_lower_deref_instrs at every call site of
glsl_to_nir, spirv_to_nir, and prog_to_nir.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Jason Ekstrand
8b7aa66169 nir/deref: Add some deref cleanup functions
Sometimes it's useful for a pass to be able to clean up its own derefs
instead of waiting for DCE.  This little helper makes it very easy.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:53 -07:00
Jason Ekstrand
a80fa2766e nir: Add helpers for working with deref instructions
This commit adds a pass for lowering deref instructions to deref chains
as well as some smaller helpers to ease the transition.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:53 -07:00
Jason Ekstrand
5286b5d832 nir: Add deref sources to texture instructions
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:53 -07:00
Jason Ekstrand
f1dc2088e2 nir: Add _deref versions of all of the _var intrinsics
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:53 -07:00
Jason Ekstrand
de7f60b653 nir/builder: Add deref building helpers
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:53 -07:00
Jason Ekstrand
19a4662a54 nir: Add a deref instruction type
This commit adds a new instruction type to NIR for handling derefs.
Nothing uses it yet but this adds the data structure as well as all of
the code to validate, print, clone, and [de]serialize them.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:53 -07:00
Jason Ekstrand
5fbbbda37a nir/validate: Rework intrinsic type validation
This moves the switch statement for specific intrinsics above source and
destination validation.  We also rework the source and destination
validation to use different bit_size values for each source and/or
destination.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:53 -07:00
Karol Herbst
133e8bf4de nv50/ir: only avoid spilling constrained def if a mov is added
fix spilling regression introduced by 5428066f5e

This is just a minor mistake made while moving the code out into a new
function. The function contained a loop which might have terminated
early and skipped setting noSpill to 1. After the refactoring it was always
set.

Fixes: 5428066f5e
	("nv50/ir: make a copy of tex src if it's referenced multiple times")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-06-23 03:00:24 +02:00
Dylan Baker
ced3df5623 meson: Fix typo that breaks -Dgallium-xvmc=false
_xmvc -> _xvmc. Sigh

Fixes: a6943bb4ce
       ("meson: Fix auto option for xvmc")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-06-22 10:16:27 -07:00
Dylan Baker
94cf397092 meson: Fix auto option for va
The same as the previous two patches, but for the libva state tracker.

Fixes: 724916c8a8
       ("meson: dedup gallium-xvmc logic")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-22 09:51:25 -07:00
Dylan Baker
a6943bb4ce meson: Fix auto option for xvmc
This fixes the same problem as the previous patch did for vdpau, but for
xvmc.

Fixes: 724916c8a8
       ("meson: dedup gallium-xvmc logic")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-22 09:51:18 -07:00
Dylan Baker
d9a8008a93 meson: Correct behavior of vdpau=auto
Currently if vdpau is set to auto, it will be disabled only in cases
where gallium is disabled or the host OS is not supported (mac, haiku,
windows). However on (for example) Linux if libvdpau is not installed
then the build will error because of the unmet dependency. This corrects
auto to do the right thing, and not error if libvdpau is not installed.

Fixes: 992af0a4b8
       ("meson: dedup gallium-vdpau logic")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-22 09:51:11 -07:00
Samuel Pitoiset
ca59c3906d radv: always check the return error when submitting a CS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-22 17:47:10 +02:00
Samuel Pitoiset
68d9517690 radv: check the return values of radv_signal_fence()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-22 17:47:09 +02:00
Samuel Pitoiset
07832083d3 radv: change the returned error in radv_signal_fence()
From my point of view, when we aren't able to submit a CS,
something has gone terribly wrong and we are most likely
going to lose the device.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-22 17:47:06 +02:00
Jonathan Marek
94bc06b196 freedreno: a2xx: fix clear color
the format of the CLEAR_COLOR register doesn't depend on the target format
this fixes clear color when rendering to 32-bit RGBA and 16-bit targets

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-22 08:23:10 -04:00
Jonathan Marek
dd8553dd95 freedreno: a2xx: fix crash when freeing context
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-22 08:23:10 -04:00
Jonathan Marek
6eeac34cee freedreno: a2xx: fix crash on first clear
blend can be NULL, so check for that

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-22 08:23:10 -04:00
Jonathan Marek
17e16ba9db freedreno: add a20x
this patch adds support for a20x, which differs from a220 in a few ways:
-no VGT_MAX_VTX_INDX register
-no CLEAR_COLOR register
-set RB_BC_CONTROL in restore (hangs without)
-different CP_DRAW_INDX format

tested with kmscube and glmark2 scenes, on par with a220

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-22 08:23:10 -04:00
Jonathan Marek
d5ff36b97b freedreno: a2xx: increase size of the offset field in instr_fetch_vtx_t
The offset field is 22 bits wide.
11 bits are necessary because MaxVertexAttribRelativeOffset = 2047

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-22 08:23:10 -04:00
Eric Anholt
69ae42ca4c v3d: Don't forget to initialize the buffer offset of a new winsys handle.
2018-06-21 15:56:18 -07:00
Eric Anholt
ee9a6a13fb v3d, vc4: Disable valgrind checking of CLE inputs when NDEBUG is set.
For a meson -Db_ndebug=true release build on x86_64, reduces text size of
libv3d.a from 53.0k to 51.6k.  Inspired by 0d5329d626 ("anv: Disable
__gen_validate_value if NDEBUG is set.")
2018-06-21 15:46:40 -07:00
Marek Olšák
a2790b134a mesa: fix glGetInteger64v for arrays of integers
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:55:15 -04:00
Marek Olšák
ce4b8b952a ac/surface: disallow rotated micro tile mode
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-21 14:42:14 -04:00
Marek Olšák
9410cd53c3 radeonsi: fix occlusion queries with 16x AA without FBO attachments on Stoney
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-21 14:42:14 -04:00
Marek Olšák
9c21002f6e radeonsi: handle non-clearable DCC buffers as MSAA resolve dst
This is reproducible on Stoney, but other chips may be affected too.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-21 14:42:14 -04:00
Marek Olšák
587e712eda radeonsi: disable DCC MSAA for 128bpp formats on Stoney
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-21 14:42:14 -04:00
Rob Clark
6764aae169 docs: update freedreno features
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-21 08:54:48 -04:00
Rob Clark
fbd154294f mesa: fix GLES 3.1 version calculation
All of ARB_gpu_shader5 is most certainly not required for GLES 3.1
(most of it is in OES_gpu_shader5 on top of GLES 3.1).

Some of what is required from ARB_gpu_shader5 is provided by
ARB_texture_gather, so check for that.  The remaining subset of
ARB_gpu_shader5 doesn't have individual extensions to check for,
but I guess it is unlikely that some driver has all of these
extensions but not, say, integer bitfield manipulation.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-21 08:54:47 -04:00
Rob Clark
cf0c7258ee freedreno/a5xx: MSAA
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-21 08:54:47 -04:00
Rob Clark
b6e690ef80 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-21 08:54:47 -04:00
Rob Clark
418b3fd184 freedreno/ir3: txf_ms support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-21 08:54:47 -04:00
Rob Clark
d03bd103f8 freedreno/a5xx: fix gpu hangs with large compute shaders
Similar to the combined limit for VS+FS, there is an upper limit for
shader size to run from internal memory.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-21 08:54:47 -04:00
Rob Clark
e1e40935b4 freedreno/ir3: fix base_vertex
Fixes: c366f422f0 nir: Offset vertex_id by first_vertex instead of base_vertex
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-21 08:54:47 -04:00
Eduardo Lima Mitev
77e790f99a i965: Link uniforms of SPIR-V programs using the NIR linker
v2: nir_link_uniforms renamed to gl_nir_link_uniforms

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Neil Roberts
ae0208e5b4 i965: Setup glsl uniforms by index rather than name matching
Previously when setting up a uniform it would try to walk the uniform
storage slots and find one that matches the name of the given
variable. However, each variable already has a location which is an
index into the UniformStorage array so we can just directly jump to
the right slot. Some of the variables take up more than one slot so we
still need to calculate how many it uses.

The main reason to do this is to support ARB_gl_spirv because in that
case the uniforms don’t have names so the previous approach won’t
work.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
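The difference between the two lookup strategies described in ae0208e5b4 can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct uniform_slot`, `find_by_name` and `find_by_location` are hypothetical names invented here, not the i965 or Mesa data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a uniform storage array. */
struct uniform_slot {
    const char *name;   /* may be NULL, as with SPIR-V uniforms */
    unsigned value;
};

/* Old approach: walk every slot and compare names.
 * Cannot find anything when the uniform has no name. */
static const struct uniform_slot *
find_by_name(const struct uniform_slot *storage, size_t count, const char *name)
{
    if (name == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        if (storage[i].name != NULL && strcmp(storage[i].name, name) == 0)
            return &storage[i];
    }
    return NULL;
}

/* New approach: the variable's location is already an index into the
 * storage array, so jump straight to the slot. */
static const struct uniform_slot *
find_by_location(const struct uniform_slot *storage, size_t count, size_t location)
{
    assert(location < count);
    return &storage[location];
}
```

The index-based lookup works whether or not a name is present, which is the property the commit message relies on for ARB_gl_spirv.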
Eduardo Lima Mitev
57b6184931 i965: account for NIR uniforms without name
Right now, the BRW linker code assumes nir_variable::name is always
non-NULL, but thanks to ARB_gl_spirv we will soon be linking SPIR-V
programs, and those explicitly require matching uniforms by location.
The name is just a debug hint.

Instead of checking for the name this patch makes it check for
var->num_state_slots on the assumption that everything that had an
internal name also had some state slots. This seems likely because the
two code paths that are taken when the name begins with "gl_" already
have an assert that var->state_slots is not NULL.

v2: simplified, most of it moved to glsl/nir/spirv (Neil Roberts)
v3: check for num_state_slots instead of the name. This is needed
    because we do actually have nameless builtins with SPIR-V such as
    PatchVerticesIn and we want them to hit the
    _mesa_add_state_reference code path (Neil Roberts)

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Neil Roberts
7dd96a0653 i965: Update TexturesUsed after linking the shaders
Otherwise if the shader is SPIR-V then SamplerUsed won’t have been
initialised yet so it will end up thinking no textures are used. This
was causing a crash later on if nothing causes it to regenerate
TexturesUsed before the next render.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev
4bf8b80f54 i965: Build SPIR-V programs' resource list using NIR
v2: tweak after nir_linker.h being renamed to gl_nir_linker.h

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev
3cf12c6317 nir/linker: Add nir_build_program_resource_list()
This function is equivalent to the linker.cpp
build_program_resource_list() but will extract the resources from NIR
shaders instead.

For now, only uniforms and program inputs are implemented.

v2: move from compiler/nir to compiler/glsl (Timothy Arceri)

v3: remove support for inputs, that is still WIP (spotted by Timothy
    Arceri)

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Alejandro Piñeiro
215c9359ed compiler/link: move add_program_resource to linker_util
So it could be used by the GLSL and NIR linker.

v2: (Timothy Arceri)
   * Moved from compiler to compiler/glsl
   * Method renamed to link_util_add_program_resource

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Neil Roberts
2bf91733fc nir/linker: Set the uniform initial values
This is based on link_uniform_initializers.cpp.

v2: move from compiler/nir to compiler/glsl (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev
7a9e5cdfbb nir/linker: Add gl_nir_link_uniforms()
This function will be the entry point for linking the uniforms from
the nir_shader objects associated with the gl_linked_shaders of a
program.

This patch includes initial support for linking uniforms from NIR
shaders. It is tailored for the ARB_gl_spirv needs, and it is far from
complete, but it should handle most cases of uniforms, array
uniforms, structs, samplers and images.

There are some FIXMEs related to specific features that will be
implemented in following patches, like atomic counters, UBOs and
SSBOs.

Also, note that ARB_gl_spirv makes explicit locations mandatory for
normal uniforms, so this code only handles uniforms with an explicit
location. But there are cases, like uniform atomic counters, that
don't have a location from the OpenGL point of view (they have a
binding), but to which Mesa internally assigns a location. Those will
be handled in following patches.

A nir_linker.h file is also added. More NIR-linking related API will
be added in subsequent patches and those will include stuff from Mesa,
so reusing nir.h didn't seem a good idea.

v2: move from compiler/nir to compiler/glsl (Timothy Arceri)
v3: sets var->driver.location if the uniform was found from a previous
    stage (Neil Roberts).

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Alejandro Piñeiro
aa95f0bc5b compiler/link: add linker_util.h, move linker_error/warning to it
Linker utilities common to the GLSL IR and NIR linker (the latter to
be used for ARB_gl_spirv).

We need to move it to a new header as the NIR linker doesn't need to
know about ir_variable, and others, included at linker.h.

v2: move from src/compiler to src/compiler/glsl (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Neil Roberts
b995bda9bc spirv: Set nir_variable->explicit_binding
When SpvDecorationBinding is encountered in the SPIR-V source it now
sets explicit_binding on the nir_variable. This will be used to
determine whether to initialise sampler and image uniforms with the
binding value.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Neil Roberts
386f09be9b spirv: Get rid of vtn_variable_mode_image/sampler
vtn_variable_mode_image and _sampler are instead replaced with
vtn_variable_mode_uniform which encompasses both of them. In the few
places where it was necessary to distinguish between the two, the
GLSL type of the pointer is used instead.

The main reason to do this is that on OpenGL it is permitted to put
images and samplers into structs and declare a uniform with them. That
means that variables can now have a mix of uniform, sampler and image
modes so picking a single one of those modes for a variable no longer
makes sense.

This fixes OpLoad on a sampler within a struct which was previously
using the variable mode to determine whether it was a sampler or not.
The type of the variable is a struct so it was not being considered to
be uniform mode even though the member being loaded should be sampler
mode.

The previous code appeared to be using var->interface_type as a place
to store the type of the variable without the enclosing array for
images and samplers. I guess this worked because opaque types can not
appear in interfaces so the interface_type is sort of unused. This
patch removes the overloading of var->interface_type and any places
that needed the type without the array can now just deduce it from
var->type.

v2: squash in this patch the changes to anv/nir (Timothy)

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Nicolai Hähnle
23edc5b1ef spirv: translate default-block uniforms
They are supported by SPIR-V for ARB_gl_spirv.

v2 (changes on top of Nicolai's original patch):
   * Handle UniformConstant storage class for uniforms other than
     samplers and images. (Eduardo Lima)
   * Handle location decoration also for samplers and images. (Eduardo
     Lima)
   * Rebase update (spirv_to_nir options added, logging changes, and
     others) (Alejandro Piñeiro)

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev
3d6664763d nir/types: Add a utility wrapper to glsl_type::sampler_index()
I think it is more accurate to call it a sampler target (?).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev
f1ab16cf17 nir/types: Add a glsl_get_component_slots() utility
It is basically a wrapper around glsl_type::component_slots().

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev
2b8765b824 nir/lower_samplers: Limit assert to GLSL shader programs
Vulkan has the concept of separate image and sampler objects in the
SPIR-V code whereas GL conflates them into one. nir_lower_samplers
contains an assert to verify that the sampler operand is not being set on
the nir instruction. However when the code comes from spirv_to_nir the
sampler operand is always set. GL_ARB_gl_spirv explicitly states that
OpTypeSampler is not supported so it retains the GL behaviour of not
being able to separate them. Therefore the sampler will always be the
same as the texture. This GL version of the lowering code ignores
instr->sampler and sets instr->sampler_index to the same value as
instr->texture_index. Some other places in the code (such as in
nir_print) assume that once the instruction is lowered then both
instr->texture and instr->sampler will be NULL, so to keep this
behaviour we now set instr->sampler to NULL after ignoring it to fill
in instr->sampler_index.

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Neil Roberts
652be1563f nir: Add explicit_binding to nir_variable
This is copied from the corresponding value in ir_variable. The
intention is to eventually use it in a pure-NIR linker.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Alejandro Piñeiro
8d1ec2ed5a mesa/main: add NULL name check when searching for a resource name
Since ARB_gl_spirv, name reflection can be missing. The piglit
shader_runner does several resource checks, so this commit is useful
to get even the simplest piglit tests running without crashing in
SPIR-V mode.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Alejandro Piñeiro
a6dc3d22eb i965: use gl_shader_program_data::spirv
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev
a940683733 mesa/main: Add a 'spirv' flag to gl_shader_program_data
This will be used by the linker code to differentiate between programs
made out of SPIR-V or GLSL shaders.

This was rejected in the past, assuming that it was equivalent to
check for "shProg->_LinkedShaders[stage]->spirv_data != NULL". But:

  * At some points of the linking process it would be necessary to check
    if _LinkedShaders[stage] is present, so the full check would be:

    "shProg->_LinkedShaders[stage] != NULL &&
     shProg->_LinkedShaders[stage]->spirv_data != NULL"

  * Sometimes you would like to do something specific to SPIR-V
    independently of the stage, or for any stage. For example, "link
    all the uniforms, for all stages". In that case checking for the
    flag would be equivalent to iterating over all the _LinkedShaders
    and checking if there is any spirv_data available.

The former really hurts readability. Both could be solved by adding
two helpers. But adding a flag seems much simpler and more readable.

v2: added justification for the flag on the commit message (Alejandro)

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-21 14:25:05 +02:00
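The trade-off argued in a940683733 can be illustrated with a small sketch. Everything here is an assumption made for illustration: `struct sketch_program`, `struct sketch_linked_shader`, `SKETCH_STAGES` and both helpers are hypothetical stand-ins, not the actual gl_shader_program_data layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_STAGES 6 /* hypothetical number of shader stages */

/* Hypothetical stand-ins for the per-stage and per-program structures. */
struct sketch_linked_shader {
    const void *spirv_data;
};

struct sketch_program {
    struct sketch_linked_shader *linked[SKETCH_STAGES];
    bool spirv; /* the flag proposed by the commit */
};

/* Without the flag: every stage must be checked for presence first,
 * then for SPIR-V data. */
static bool is_spirv_by_scan(const struct sketch_program *prog)
{
    assert(prog != NULL);
    for (size_t stage = 0; stage < SKETCH_STAGES; stage++) {
        if (prog->linked[stage] != NULL &&
            prog->linked[stage]->spirv_data != NULL)
            return true;
    }
    return false;
}

/* With the flag: a single read answers the same question. */
static bool is_spirv_by_flag(const struct sketch_program *prog)
{
    assert(prog != NULL);
    return prog->spirv;
}
```

Both helpers answer the same question as long as the flag is kept in sync with the linked shaders; the flag simply moves that bookkeeping to one place.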
Emil Velikov
697254111b docs/release-calendar: restore the missing 18.1 column
Earlier commit removed the column, instead of adjusting the height.

Cc: Dylan Baker <dylan@pnwbakers.com>
Fixes: 0d4f338a11 ("docs: Update release-notes and calendar")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-21 12:09:39 +01:00
Emil Velikov
dfb1f2759c configure: use compliant grep regex checks
The current `grep "foo\|bar"' trips on some grep implementations, like
the FreeBSD one. Instead use `egrep "foo|bar"' as suggested by Stefan.

Cc: Stefan Esser <se@FreeBSD.org>
Reported-by: Stefan Esser <se@FreeBSD.org>
Bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228673
Fixes: 1914c814a6 ("configure: error out if building OMX w/o supported platform")
Fixes: 63e11ac2b5 ("configure: error out if building VA w/o supported platform")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-21 12:09:39 +01:00
Emil Velikov
d589eddc8b glsl/tests/glcpp: reinstate "error out if no tests found"
With the recent rework of converting the shell script to a python one
the check for actual tests was dropped.

Bring that back, since it was explicitly added considering we had a ~2
year period, during which the tests were not run.

v2: use raise Exception() over  print() & return false (Dylan)

Fixes: db8cd8e367 ("glcpp/tests: Convert shell scripts to a python
script")
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-21 12:09:39 +01:00
Emil Velikov
a2f5292c82 glsl/glcpp/tests: reinstate srcdir/abs_builddir blurb
Bring back the "detection" of the said variables, to allow
standalone execution.

Fixes: db8cd8e367 ("glcpp/tests: Convert shell scripts to a python
script")
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-21 12:09:39 +01:00
Emil Velikov
87cebace54 glsl: fold glcpp-test-cr-lf.sh into glcpp-test.sh
As of recently both of these have been reworked so they invoke a python
script. At the same time the latter can be executed with the combined
arguments of both scripts.

AKA we no longer need to have them separate.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-21 12:09:39 +01:00
Emil Velikov
1c1f70d12f st/dri: constify dri_fill_st_visual's screen
As the function says - only the visual is changed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-21 12:09:39 +01:00
Emil Velikov
ccaa9f09cc mesa: remove struct gl_extensions::ATI_separate_stencil
Virtually every driver that supports ATI_separate_stencil
also supports EXT_stencil_two_side.

Use the latter boolean for both extensions. With that in mind we can drop
the explicit true from the drivers and the nasty comment in
compute_version().

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-21 12:09:39 +01:00
Eric Engestrom
1714dfca8a travis: add libXrandr and its randrproto dependency
Fixes: 3f960c1338 "vulkan: EXT_acquire_xlib_display requires libXrandr headers to build"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-21 11:46:47 +01:00
Juan A. Suarez Romero
d24839be70 swr: bump minimum supported LLVM version to 5.0
RADV now requires LLVM 5.0 or greater, and thus we can't build dist
tarball because swr requires LLVM 4.0.

Let's bump required LLVM to 5.0 in swr too.

Fixes: f9eb1ef870 ("amd: remove support for LLVM 4.0")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-06-21 12:16:46 +02:00
Grazvydas Ignotas
f966929805 radeonsi: add a debug flag to zero vram allocations
This avoids garbage on the Dying Light loading screen at least; the
game probably expects the Windows/NV behavior of all allocations
being zeroed by default.

Analogous to radv flag with the same name.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-21 12:18:50 +03:00
Grazvydas Ignotas
4e0d93dc0e radeonsi: use shifts for sign extension
Avoids a branch and reduces code size a tiny bit:
    text   data     bss      dec    hex filename
10804563 398653 2070368 13273584 ca89f0 /tmp/radeonsi_dri.so.old
10804499 398653 2070368 13273520 ca89b0 /tmp/radeonsi_dri.so

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-21 12:17:34 +03:00
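The general technique named in 4e0d93dc0e, sign extension via a pair of shifts instead of a branch, can be sketched as below. This is not the radeonsi code; `sign_extend_branch` and `sign_extend_shift` are hypothetical helpers written for illustration. The shift version assumes that right-shifting a negative signed value is an arithmetic shift, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Branching version: test the field's sign bit and fill the upper
 * bits by hand when it is set. */
static int32_t sign_extend_branch(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    if (bits == 32)
        return (int32_t)value;

    uint32_t mask = (1u << bits) - 1u;
    value &= mask;
    if (value & (1u << (bits - 1)))
        value |= ~mask;
    return (int32_t)value;
}

/* Shift version: move the field's sign bit up to bit 31, then shift
 * back down so the sign bit is replicated into the upper bits.
 * Assumes arithmetic right shift of signed values. */
static int32_t sign_extend_shift(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    unsigned shift = 32 - bits;
    return (int32_t)(value << shift) >> shift;
}
```

The left shift also discards any bits above the field, so no separate masking step is needed in the shift version.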
Samuel Pitoiset
af17a29ad8 radv: set EVENT_WRITE_EOP.INT_SEL = wait for write confirmation
Ported from RadeonSI.
Not sure why this is needed but AMDVLK does something similar.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-21 10:31:03 +02:00
Samuel Pitoiset
41f6096c26 radv: use EOP_DATA_SEL_* instead of magic numbers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-21 10:31:02 +02:00
Roland Scheidegger
53959fcbd8 r600: fix copy/paste bug for sampleMaskIn workaround
The sampleMaskIn workaround (b936f4d1ca)
tries to figure out if the shader is running at per-sample frequency, but
there's a copy/paste bug so it will only recognize per-sample linear inputs,
not per-sample perspective ones.

Spotted by Eric Engestrom <eric.engestrom@intel.com>

Fixes: b936f4d1ca0d2ab1e828a "r600: partly fix sampleMaskIn value"
2018-06-21 02:37:11 +02:00
Eric Anholt
edb7890750 v3d: Fix min vs mag determination when not doing mip filtering.
Fixes all 128 failing tests in
dEQP-GLES3.functional.texture.filtering.*.combinations
2018-06-20 12:31:54 -07:00
Keith Packard
3f960c1338 vulkan: EXT_acquire_xlib_display requires libXrandr headers to build
When VK_USE_PLATFORM_XLIB_XRANDR_EXT is defined, vulkan.h includes
X11/extensions/Xrandr.h for the RROutput typedef which is used in
the vkGetRandROutputDisplayEXT interface.

Make sure we have the required header by checking during the build,
and also set CFLAGS to point at the right directory.

We don't need to link against the library as we don't use any
functions from there, so don't add the _LIBS value in the autotools
build.

Signed-off-by: Keith Packard <keithp@keithp.com>
Fixes: dbac8e25f8 "radv: Add EXT_acquire_xlib_display to radv driver [v2]"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-20 10:42:05 -07:00
Eric Anholt
f49d112a01 v3d: Implement ALPHA_TO_COVERAGE.
There's a convenient "FTOC" instruction for generating the coverage now,
unlike vc4.  This fixes
dEQP-GLES3.functional.multisample.fbo_4_samples.proportionality_alpha_to_coverage
2018-06-20 09:30:46 -07:00
Eric Anholt
94f7c011d6 v3d: Track write reference to the separate stencil buffer.
Otherwise, a blit from separate stencil may fail to flush the job that
initialized it, or new drawing could fail to flush a blit reading from
stencil.

Fixes:
dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic
dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale
dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only
dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8
dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8
2018-06-20 09:30:46 -07:00
Eric Anholt
a52c357a65 v3d: Add missing reference to the separate stencil buffer.
Noticed while debugging a missing flush of rendering in the z32f_s8 case.
2018-06-20 09:30:46 -07:00
Eric Anholt
1334295f29 v3d: Fix return value from fence_finish.
We needed to convert from a -errno to a boolean success value.  Fixes:

GTF-GLES3.gtf.GL3Tests.sync.sync_functionality_clientwaitsync_flush
GTF-GLES3.gtf.GL3Tests.sync.sync_functionality_clientwaitsync_signaled
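
A minimal sketch of the conversion described above, assuming a wait helper
that returns 0 on success and a negative errno on failure. The names
fake_wait and fence_finish_sketch are hypothetical and are not taken from
the v3d driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel wait call: returns 0 on success
 * or a negative errno value on failure. */
static int fake_wait(bool signaled)
{
   return signaled ? 0 : -ETIMEDOUT;
}

/* A fence_finish-style hook reports success as a boolean. Returning the
 * raw int would turn -ETIMEDOUT (non-zero) into "true". */
static bool fence_finish_sketch(bool signaled)
{
   int ret = fake_wait(signaled);
   assert(ret <= 0);
   return ret == 0;
}
```

With this shape a timed-out wait yields false rather than a truthy error code.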
2018-06-20 09:30:46 -07:00
Christian Gmeiner
8b3099353e mesa/st: only do scalar lowerings if driver benefits
As not every (upcoming) backend compiler is happy with
nir_lower_xxx_to_scalar lowerings, do them only if the backend
is scalar (and not vec4) based.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-06-20 17:56:37 +02:00
Christian Gmeiner
f485e5671c gallium: add scalar isa shader cap
v1 -> v2:
 - nv30 is _NOT_ scalar as suggested by Ilia Mirkin.
 - Change from a screen cap to a shader cap as suggested
   by Eric Anholt.
 - radeonsi is scalar as suggested by Marek Olšák.
 - Change missing ones to be scalar.

v2 -> v3:
 - r600 prefers vec4 as suggested by Marek Olšák.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-20 17:55:39 +02:00
Keith Packard
050d8a4b42 radv: Add VK_EXT_display_surface_counter to radv driver
This extension is required to support EXT_display_control as it offers
a way to query whether the vblank counter is supported.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-20 08:16:45 -07:00
Keith Packard
1801d7c73c anv: Add VK_EXT_display_surface_counter to anv driver [v2]
This extension is required to support EXT_display_control as it offers
a way to query whether the vblank counter is supported.

v2:
	Add extension to list in alphabetical order

	Suggested-by:  Jason Ekstrand <jason@jlekstrand.net>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-20 08:16:34 -07:00
Jason Ekstrand
b1a013d035 Vulkan/wsi: Implement VK_EXT_display_surface_counter
This extension is required to support EXT_display_control as it offers a
way to query whether the vblank counter is supported.  Internally, it is
implemented using a fake MESA extension which provides a chain-in to
GetSurfaceCapabilities2KHR which contains the one added field.  This has
the advantage of reducing the number of callbacks needed in the back-ends.
It also means that anything chained into GetSurfaceCapabilities2EXT goes
through VkSurfaceCapabilities2KHR::pNext, so we only need to handle
crawling the pNext chain once per back-end.
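
As an illustration of crawling a pNext chain once in a single hook, here is
a self-contained sketch. The structs and enum values are hypothetical
miniatures of Vulkan-style extensible structs, not the real Vulkan or WSI
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a Vulkan-style extensible struct header. */
enum stype_sketch {
   STYPE_CAPS2,
   STYPE_COUNTERS   /* stands in for the chained-in counter struct */
};

struct base_out {
   enum stype_sketch sType;
   void *pNext;
};

struct counters_ext {
   enum stype_sketch sType;
   void *pNext;
   unsigned supported_counters;
};

struct caps2 {
   enum stype_sketch sType;
   void *pNext;
   unsigned min_image_count;
};

/* Single back-end hook: fill the core struct, then walk pNext once and
 * fill every extension struct that is recognised. */
static void get_capabilities2_sketch(struct caps2 *caps)
{
   assert(caps != NULL);
   caps->min_image_count = 2;

   for (struct base_out *ext = caps->pNext; ext != NULL; ext = ext->pNext) {
      if (ext->sType == STYPE_COUNTERS) {
         struct counters_ext *c = (struct counters_ext *)ext;
         c->supported_counters = 1u;   /* e.g. a vblank counter bit */
      }
      /* Unknown sType values are skipped. */
   }
}
```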

Reviewed-by: Keith Packard <keithp@keithp.com>
2018-06-20 08:16:03 -07:00
Jason Ekstrand
8f3b58ebee vulkan/wsi: Get rid of the get_capabilities hook
Instead, we can just use get_capabilities2.  This way back-ends only
have to implement one hook.

Reviewed-by: Keith Packard <keithp@keithp.com>
2018-06-20 08:16:03 -07:00
Eric Engestrom
7f3cb7db08 intel/aubinator: drop unused functions
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-20 15:17:26 +01:00
Samuel Pitoiset
65b3fed037 radv: always initialize the clear depth/stencil values to 0
Similar to the clear color values.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-20 13:21:42 +02:00
Samuel Pitoiset
204cf5714a radv: always initialize the clear color values to 0
Having random data in there is probably not the best.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-20 13:21:42 +02:00
Samuel Pitoiset
4b564bd612 radv: always initialize the DCC predicate to FALSE
This might eventually skip some useless DCC decompression
passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-20 13:21:42 +02:00
Samuel Pitoiset
70c1bee187 radv: do not use a user SGPR for the sample position offset
We know the number of samples at compile time.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-20 13:21:42 +02:00
Samuel Pitoiset
20170865db radv: don't store the number of samples as log2
Needed for the following patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-20 13:21:42 +02:00
Gert Wollny
8a6e3f0c5d gallium/aux/util/u_cpu_detect.h: Fix -Wsign-compare warning in u_cpu_detect.c
Change the type of util_cpu_caps::nr_cpus to int because sysconf
returns a signed value, fixes:

u_cpu_detect.c: In function 'util_cpu_detect':
u_cpu_detect.c:317:30: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
    if (util_cpu_caps.nr_cpus == -1)
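
A small sketch of why a signed field fits here, assuming a POSIX-like
system that provides _SC_NPROCESSORS_ONLN (a common extension, not
guaranteed everywhere). detect_nr_cpus is a hypothetical helper, not the
u_cpu_detect.c code.

```c
#include <assert.h>
#include <unistd.h>

/* sysconf() returns a long and reports failure as -1, so the count is
 * kept in a signed int and compared against a signed constant. */
static int detect_nr_cpus(void)
{
   long n = sysconf(_SC_NPROCESSORS_ONLN);
   int nr_cpus = (int)n;

   if (nr_cpus == -1)   /* signed == signed: no -Wsign-compare warning */
      nr_cpus = 1;      /* fall back to a single CPU */

   assert(nr_cpus > 0);
   return nr_cpus;
}
```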

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
33f4e8a043 gallium/aux/util/u_debug.h: Fix "noreturn" warnings in debug mode
Only decorate function as noreturn when DEBUG is not defined, because
when compiled in DEBUG mode the function actually executes an int3 and
may return, fixes:
u_debug.c: In function '_debug_assert_fail':
u_debug.c:309:1: warning: 'noreturn' function does return
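
One way to express "noreturn only outside DEBUG builds" is sketched below.
The macro and function names are hypothetical and the trap is replaced by a
counter; this is not the u_debug.h implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro name. In a DEBUG build the handler may trap into a
 * debugger and then continue, so it must not be declared noreturn. */
#if defined(__GNUC__) && !defined(DEBUG)
#define ASSERT_FAIL_NORETURN __attribute__((noreturn))
#else
#define ASSERT_FAIL_NORETURN
#endif

static int fail_count;

static void assert_fail_sketch(const char *expr) ASSERT_FAIL_NORETURN;

static void assert_fail_sketch(const char *expr)
{
   fprintf(stderr, "Assertion '%s' failed\n", expr);
#ifdef DEBUG
   fail_count++;   /* stands in for a breakpoint the debugger can resume from */
#else
   abort();        /* never returns, matching the attribute */
#endif
}
```

Compiling with -DDEBUG drops the attribute, so the compiler no longer warns
that a noreturn function returns.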

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
70f632962a gallium/aux/util: Fix some warnings
util/u_cpu_detect.c: In function 'util_cpu_detect':
util/u_cpu_detect.c:377:30: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    if (util_cpu_caps.nr_cpus == ~0u)
                              ^~

util/u_hash_table.c:274:21: warning: unused parameter 'k' [-Wunused-
parameter]
 util_hash_inc(void *k, void *v, void *d)
                     ^
util/u_hash_table.c:274:30: warning: unused parameter 'v' [-Wunused-
parameter]
 util_hash_inc(void *k, void *v, void *d)
                              ^

util/u_tests.c: In function 'test_texture_barrier':
util/u_tests.c:652:25: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
       for (int i = 0; i < num_samples / 2; i++) {
                         ^

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
3e091d5a7a gallium/aux/tgsi_ureg.c: remove unused parameter from match_or_expand_immediate64
remove "type" from "match_or_expand_immediate64", fixes:

tgsi/tgsi_ureg.c: In function 'match_or_expand_immediate64':
tgsi/tgsi_ureg.c:837:34: warning: unused parameter 'type' [-Wunused-
parameter]
                              int type,
                                  ^~~~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
f79b980486 gallium/aux/tgsi_two_side.c: Fix -Wsign-compare warnings
Integer promotion rules can sometimes be irritating. With an unsigned
"x" that is narrower than int (such as a bit-field), "x + 1" gets
promoted to a signed integer, so explicitly assign the sum to an
unsigned and use that for comparison.
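
The effect can be reproduced in isolation. The sketch below assumes that
the range field is an unsigned bit-field narrower than int, which is what
makes the sum signed; for a plain unsigned int the sum stays unsigned. The
struct and helper names are hypothetical, and _Generic requires C11.

```c
#include <assert.h>

/* Hypothetical miniature of a declaration range stored in a bit-field. */
struct range_sketch {
   unsigned Last : 16;
};

/* 1 if the expression has type int, 0 if unsigned int (C11 _Generic). */
#define IS_SIGNED_INT(expr) _Generic((expr), int: 1, unsigned int: 0)

static unsigned max_inputs(unsigned num_inputs, struct range_sketch r)
{
   /* r.Last + 1 is an int here, because a 16-bit unsigned bit-field is
    * promoted to int. Storing the sum in an unsigned first keeps both
    * sides of the comparison unsigned. */
   unsigned candidate = r.Last + 1;
   assert(candidate >= 1u);
   return num_inputs > candidate ? num_inputs : candidate;
}

static int sum_is_signed(struct range_sketch r)
{
   return IS_SIGNED_INT(r.Last + 1);
}

static int plain_unsigned_sum_is_signed(unsigned x)
{
   return IS_SIGNED_INT(x + 1);
}
```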

In file included from tgsi/tgsi_two_side.c:41:0:
tgsi/tgsi_two_side.c: In function 'xform_decl':
./util/u_math.h:660:29: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
 #define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
                             ^
tgsi/tgsi_two_side.c:86:24: note: in expansion of macro 'MAX2'
       ts->num_inputs = MAX2(ts->num_inputs, decl->Range.Last + 1);
                        ^~~~
./util/u_math.h:660:40: warning: signed and unsigned type in conditional
expression [-Wsign-compare]
 #define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
                                        ^
tgsi/tgsi_two_side.c:86:24: note: in expansion of macro 'MAX2'
       ts->num_inputs = MAX2(ts->num_inputs, decl->Range.Last + 1);
                        ^~~~
./util/u_math.h:660:29: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
 #define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
                             ^
tgsi/tgsi_two_side.c:89:23: note: in expansion of macro 'MAX2'
       ts->num_temps = MAX2(ts->num_temps, decl->Range.Last + 1);
                       ^~~~
./util/u_math.h:660:40: warning: signed and unsigned type in conditional
expression [-Wsign-compare]
 #define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
                                        ^
tgsi/tgsi_two_side.c:89:23: note: in expansion of macro 'MAX2'
       ts->num_temps = MAX2(ts->num_temps, decl->Range.Last + 1);
                       ^~~~
tgsi/tgsi_two_side.c: In function 'xform_inst':
tgsi/tgsi_two_side.c:184:45: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
             if (inst->Src[i].Register.Index == ts-
>front_color_input[j]) {
                                             ^~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
dc5ba7e17c gallium/aux/tgsi_ureg.c: Fix various warnings
tgsi/tgsi_ureg.c: In function 'ureg_DECL_sampler':
tgsi/tgsi_ureg.c:721:34: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
       if (ureg->sampler[i].Index == nr)
                                  ^~
tgsi/tgsi_ureg.c: In function 'match_or_expand_immediate64':
tgsi/tgsi_ureg.c:837:34: warning: unused parameter 'type' [-Wunused-
parameter]
                              int type,
                                  ^~~~
tgsi/tgsi_ureg.c: In function 'emit_decls':
tgsi/tgsi_ureg.c:1821:31: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
       if (ureg->properties[i] != ~0)
                               ^~
tgsi/tgsi_ureg.c: In function 'ureg_create_with_screen':
tgsi/tgsi_ureg.c:2193:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ARRAY_SIZE(ureg->properties); i++)
                  ^

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
c5e8280504 gallium/aux/tgsi_text.c: Fix -Wsign-compare warnings
tgsi/tgsi_text.c: In function 'parse_identifier':
tgsi/tgsi_text.c:218:16: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
          if (i == len - 1)
                ^~
tgsi/tgsi_text.c: In function 'parse_optional_swizzle':
tgsi/tgsi_text.c:873:21: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
       for (i = 0; i < components; i++) {
                     ^
tgsi/tgsi_text.c: In function 'parse_instruction':
tgsi/tgsi_text.c:1103:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < info->num_dst + info->num_src + info->is_tex; i++) {
                  ^
tgsi/tgsi_text.c:1118:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
       else if (i < info->num_dst + info->num_src) {
                  ^
tgsi/tgsi_text.c: In function 'parse_immediate':
tgsi/tgsi_text.c:1660:24: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (type = 0; type < ARRAY_SIZE(tgsi_immediate_type_names); ++type)
{
                        ^
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
b16b6d0889 gallium/aux/tgsi_point_sprite.c: Fix -Wsign-compare warnings
tgsi/tgsi_lowering.c: In function 'emit_twoside':
tgsi/tgsi_lowering.c:1179:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ctx->two_side_colors; i++) {
                  ^
tgsi/tgsi_lowering.c:1208:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ctx->two_side_colors; i++) {
                  ^
tgsi/tgsi_lowering.c:1216:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ctx->two_side_colors; i++) {
                  ^
tgsi/tgsi_lowering.c: In function 'emit_decls':
tgsi/tgsi_lowering.c:1280:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ctx->numtmp; i++) {
                  ^
tgsi/tgsi_lowering.c: In function 'rename_color_inputs':
tgsi/tgsi_lowering.c:1311:28: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
             if (src->Index == ctx->two_side_idx[j]) {
                            ^~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
3792d85755 gallium/aux/tgsi_lowering.c: Fix -Wsign-compare warnings
tgsi/tgsi_lowering.c: In function 'emit_twoside':
tgsi/tgsi_lowering.c:1179:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ctx->two_side_colors; i++) {
                  ^
tgsi/tgsi_lowering.c:1208:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ctx->two_side_colors; i++) {
                  ^
tgsi/tgsi_lowering.c:1216:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ctx->two_side_colors; i++) {
                  ^
tgsi/tgsi_lowering.c: In function 'emit_decls':
tgsi/tgsi_lowering.c:1280:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ctx->numtmp; i++) {
                  ^
tgsi/tgsi_lowering.c: In function 'rename_color_inputs':
tgsi/tgsi_lowering.c:1311:28: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
             if (src->Index == ctx->two_side_idx[j]) {
                            ^~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
7a3daaab41 gallium/aux/tgsi_build.c: Fix -Wsign-compare warnings
tgsi/tgsi_build.c: In function 'tgsi_build_full_immediate':
tgsi/tgsi_build.c:622:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for( i = 0; i < full_imm->Immediate.NrTokens - 1; i++ ) {
                  ^
tgsi/tgsi_build.c: In function 'tgsi_build_full_property':
tgsi/tgsi_build.c:1393:18: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for( i = 0; i < full_prop->Property.NrTokens - 1; i++ ) {
                  ^

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
94f40d3ac0 gallium/aux/tgsi_build.c: Remove now unused variable
Removing the unused prev_token from the function calls made this local
variable also unused.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
dc46b2aa99 gallium/aux/tgsi_build.c: Remove unused parameters prev_token from various functions
remove parameter prev_token unused in
   tgsi_build_instruction_label
   tgsi_build_instruction_texture
   tgsi_build_instruction_memory
   tgsi_build_texture_offset

This fixes the following warnings:

tgsi/tgsi_build.c: In function 'tgsi_build_instruction_label':
tgsi/tgsi_build.c:716:24: warning: unused parameter 'prev_token' [-
Wunused-parameter]
    struct tgsi_token  *prev_token,
                        ^~~~~~~~~~
tgsi/tgsi_build.c: In function 'tgsi_build_instruction_texture':
tgsi/tgsi_build.c:749:23: warning: unused parameter 'prev_token' [-
Wunused-parameter]
    struct tgsi_token *prev_token,
                       ^~~~~~~~~~
tgsi/tgsi_build.c: In function 'tgsi_build_instruction_memory':
tgsi/tgsi_build.c:784:23: warning: unused parameter 'prev_token' [-
Wunused-parameter]
    struct tgsi_token *prev_token,
                       ^~~~~~~~~~
tgsi/tgsi_build.c: In function 'tgsi_build_texture_offset':
tgsi/tgsi_build.c:819:23: warning: unused parameter 'prev_token' [-
Wunused-parameter]
    struct tgsi_token *prev_token,
                       ^~~~~~~~~~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
f06194b012 gallium/aux/tgsi_exec.c: Fix various -Wsign-compare
tgsi/tgsi_exec.c: In function 'exec_tex':
tgsi/tgsi_exec.c:2254:46: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
       assert(shadow_ref >= dim && shadow_ref < ARRAY_SIZE(args));
                                              ^
./util/u_debug.h:189:30: note: in definition of macro 'debug_assert'
 #define debug_assert(expr) ((expr) ? (void)0 : _debug_assert_fail(#expr,
__FILE__, __LINE__, __FUNCTION__))
                              ^~~~
tgsi/tgsi_exec.c:2254:7: note: in expansion of macro 'assert'
       assert(shadow_ref >= dim && shadow_ref < ARRAY_SIZE(args));
       ^~~~~~
tgsi/tgsi_exec.c:2290:23: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
       for (i = dim; i < ARRAY_SIZE(args); i++)
                       ^
In file included from ./util/u_memory.h:39:0,
                 from tgsi/tgsi_exec.c:62:
tgsi/tgsi_exec.c: In function 'exec_lodq':
tgsi/tgsi_exec.c:2357:15: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    assert(dim <= ARRAY_SIZE(coords));
               ^
./util/u_debug.h:189:30: note: in definition of macro 'debug_assert'
 #define debug_assert(expr) ((expr) ? (void)0 : _debug_assert_fail(#expr,
__FILE__, __LINE__, __FUNCTION__))
                              ^~~~
tgsi/tgsi_exec.c:2357:4: note: in expansion of macro 'assert'
    assert(dim <= ARRAY_SIZE(coords));
    ^~~~~~
tgsi/tgsi_exec.c:2363:20: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
    for (i = dim; i < ARRAY_SIZE(coords); i++) {
                    ^

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
a7cbb9ba46 gallium/aux/tgsi_exec.c: remove superfluous parameter from fetch_source_d
Remove unused parameter src_datatype from fetch_source_d, fixes warning:

tgsi/tgsi_exec.c: In function 'fetch_source_d':
tgsi/tgsi_exec.c:1594:40: warning: unused parameter 'src_datatype' [-Wunused-parameter]
                enum tgsi_exec_datatype src_datatype)
                                        ^~~~~~~~~~~~
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
5fe1b3b848 gallium/aux/tgsi_exec.c: remove superfluous parameter from store_dest_dstret
remove unused parameter inst from store_dest_dstret (and consequently also from
store_dest_double), fixes warning:

tgsi/tgsi_exec.c: In function 'store_dest_dstret':
tgsi/tgsi_exec.c:1765:47: warning: unused parameter 'inst' [-Wunused-parameter]
           const struct tgsi_full_instruction *inst)
                                               ^~~~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
c9b53c6410 gallium/aux/tgsi_exec.c: Remove unused parameter from fetch_src_file_channel
remove unused parameter chan_index from fetch_src_file_channel, fixes warning:

tgsi/tgsi_exec.c: In function 'fetch_src_file_channel':
tgsi/tgsi_exec.c:1480:35: warning: unused parameter 'chan_index' [-Wunused-parameter]
                        const uint chan_index,
                                   ^~~~~~~~~~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
38a9b42d8e gallium/aux/tgsi_exec.c: Remove parameter inst from exec_kill
Fixes warning:
tgsi/tgsi_exec.c: In function 'exec_kill':
tgsi/tgsi_exec.c:2049:47: warning: unused parameter 'inst' [-Wunused-parameter]
           const struct tgsi_full_instruction *inst)
                                               ^~~~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
b8fca73e47 gallium/aux/tgsi_aa_point.c: Fix -Wsign-compare warnings
In file included from tgsi/tgsi_aa_point.c:32:0:
tgsi/tgsi_aa_point.c: In function 'aa_decl':
./util/u_math.h:660:29: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
 #define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
                             ^
tgsi/tgsi_aa_point.c:76:21: note: in expansion of macro 'MAX2'
       ts->num_tmp = MAX2(ts->num_tmp, decl->Range.Last + 1);
                     ^~~~
./util/u_math.h:660:40: warning: signed and unsigned type in conditional
expression [-Wsign-compare]
 #define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
                                        ^
tgsi/tgsi_aa_point.c:76:21: note: in expansion of macro 'MAX2'
       ts->num_tmp = MAX2(ts->num_tmp, decl->Range.Last + 1);
                     ^~~~
tgsi/tgsi_aa_point.c: In function 'aa_inst':
tgsi/tgsi_aa_point.c:220:31: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
           dst->Register.Index == ts->color_out) {

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
09b3b37b95 gallium/aux/tgsi_sanity.c: Fix -Wsign-compare warnings
tgsi_sanity.c: In function 'iter_instruction':
tgsi_sanity.c:316:29: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
       if (ctx->index_of_END != ~0) {
                             ^~
tgsi_sanity.c: In function 'epilog':
tgsi_sanity.c:488:26: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
    if (ctx->index_of_END == ~0) {

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
bf6b695a90 gallium/aux/tgsi/tgsi_parse.c: Fix two warnings
tgsi_parse.c: In function 'tgsi_parse_free':
tgsi_parse.c:54:31: warning: unused parameter 'ctx' [-Wunused-parameter]
    struct tgsi_parse_context *ctx )
                               ^~~
tgsi_parse.c: In function 'tgsi_parse_end_of_tokens':
tgsi_parse.c:62:25: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
    return ctx->Position >=

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
fc9e259e58 gallium/aux/tgsi/tgsi_dump.c: Fix -Wsign-compare warnings
tgsi_dump.c: In function 'iter_property':
tgsi_dump.c:443:18: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
    for (i = 0; i < prop->Property.NrTokens - 1; ++i) {
                  ^
tgsi_dump.c:459:13: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
       if (i < prop->Property.NrTokens - 2)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
03ac9708cf gallium/aux/cso_cache: Fix various warnings
cso_cache.c: In function 'delete_blend_state':
cso_cache/cso_cache.c:90:51: warning: unused parameter 'data' [-Wunused-
parameter]
 static void delete_blend_state(void *state, void *data)
                                                   ^~~~
cso_cache/cso_cache.c: In function 'delete_depth_stencil_state':
cso_cache/cso_cache.c:98:59: warning: unused parameter 'data' [-Wunused-
parameter]
 static void delete_depth_stencil_state(void *state, void *data)
                                                           ^~~~
cso_cache/cso_cache.c: In function 'delete_sampler_state':
cso_cache/cso_cache.c:106:53: warning: unused parameter 'data' [-
Wunused-parameter]
 static void delete_sampler_state(void *state, void *data)
                                                     ^~~~
cso_cache/cso_cache.c: In function 'delete_rasterizer_state':
cso_cache/cso_cache.c:114:56: warning: unused parameter 'data' [-
Wunused-parameter]
 static void delete_rasterizer_state(void *state, void *data)
                                                        ^~~~
cso_cache/cso_cache.c: In function 'delete_velements':
cso_cache/cso_cache.c:122:49: warning: unused parameter 'data' [-
Wunused-parameter]
 static void delete_velements(void *state, void *data)
                                                 ^~~~
cso_cache/cso_cache.c: In function 'sanitize_cb':
cso_cache/cso_cache.c:166:52: warning: unused parameter 'user_data' [-
Wunused-parameter]
                                int max_size, void *user_data)
                                                    ^~~~~~~~~
gallium/aux/cso_context.c: a -Wunused-parameter warning

cso_cache/cso_context.c: In function 'delete_sampler_state':
cso_cache/cso_context.c:163:57: warning: unused parameter 'ctx' [-
Wunused-parameter]
 static boolean delete_sampler_state(struct cso_context *ctx, void
*state)
                                                         ^~~

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-20 11:08:28 +02:00
Gert Wollny
81e5bf3cfe configure.ac: Add CFLAG -Wno-missing-field-initializers (v5)
This warning is misleading: When a struct is partially initialized without
assigning to the structure members by name, then the remaining fields
will be zeroed out, and this warning will be issued (if enabled). If, on the
other hand, the partial initialization is done by assigning to named members,
the remaining structure elements may hold random data, but the warning is not
issued. Since in Mesa the first approach to initialize structure elements is
used very often, and it is usually assumed that the remaining elements are
zeroed out, heeding this warning would be counter-productive.
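
The two initialization styles contrasted above can be sketched as follows.
The struct and function names are hypothetical; the second function reads
"assigning to named members" as member-by-member assignment after an
uninitialized declaration, which is an interpretation of the text rather
than something the message states outright.

```c
#include <assert.h>

struct state_sketch {
   int a;
   int b;
   int c;
};

/* Positional partial initializer: only 'a' is given, so 'b' and 'c' are
 * zero-initialized by the language. -Wmissing-field-initializers warns
 * here even though the result is well defined. */
static struct state_sketch make_partial(void)
{
   struct state_sketch s = { 1 };
   assert(s.b == 0 && s.c == 0);
   return s;
}

/* Member-by-member assignment after an uninitialized declaration: no
 * such warning, yet any member left unassigned would be indeterminate. */
static struct state_sketch make_assigned(void)
{
   struct state_sketch s;
   s.a = 1;
   s.b = 0;
   s.c = 0;   /* omitting this line would leave 'c' indeterminate */
   return s;
}
```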

v2: - add -Wno-missing-field-initializers to meson-build
    - fix empty line error
    (both Eric Engestrom)

v3: * check for -Wmissing-field-initializers warning and then disable it
      because gcc and clang always accept -Wno-* (Dylan Baker)
    * Also disable this warning for C++

v4: * meson.build add -Wno-missing-field-initializers to
      c_args instead of no_override_init_args (Eric Engestrom)

v5: * configure.ac: Correct copy/paste error with CFLAGS/CXXFLAGS

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v2)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-06-20 11:08:28 +02:00
Samuel Pitoiset
916dda5cf7 radv: remove unnecessary code around CACHE_FLUSH_AND_INV_TS_EVENT
AMDVLK also always uses CACHE_FLUSH_AND_INV_TS_EVENT. The other
workaround is to flush DB metadata after emitting the framebuffer,
but that seems slower.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-20 10:08:37 +02:00
Bas Nieuwenhuizen
4705a5dfda radv: Fix flush_bits being used uninitialized.
A case of making things worse while trying to fix something minor ...

Fixes: ef79457004 "radv: Merge the flush bits of CMASK & DCC clear."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-20 10:02:39 +02:00
Keith Packard
dbac8e25f8 radv: Add EXT_acquire_xlib_display to radv driver [v2]
This extension adds the ability to borrow an X RandR output for
temporary use directly by a Vulkan application to the radv driver.

v2:
	Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to
	vulkan_wsi_args

	Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-19 14:17:46 -07:00
Keith Packard
46090a642d anv: Add EXT_acquire_xlib_display to anv driver [v3]
This extension adds the ability to borrow an X RandR output for
temporary use directly by a Vulkan application to the anv driver.

v2:
	Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to
	vulkan_wsi_args

	Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>

v3:
	Add extension to list in alphabetical order

	Suggested-by:  Jason Ekstrand <jason@jlekstrand.net>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-19 14:17:46 -07:00
Keith Packard
7ab1fffcd2 vulkan: Add EXT_acquire_xlib_display [v5]
This extension adds the ability to borrow an X RandR output for
temporary use directly by a Vulkan application. For DRM, we use the
Linux resource leasing mechanism.

v2:
	Clean up xlib_lease detection

	* Use separate temporary '_xlib_lease' variable to hold the
	  option value to avoid changing the type of a variable.

	* Use boolean expressions instead of additional if statements
	  to compute resulting with_xlib_lease value.

	* Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to
          vulkan_wsi_args

	  Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>

	Move mode list from wsi_display to wsi_display_connector

	Fix scope for wsi_display_mode and wsi_display_connector allocs

	  Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

v3:
	Adopt Jason Ekstrand's coding conventions

	Declare variables at first use, eliminate extra whitespace
	between types and names. Wrap lines to 80 columns.

	Explicitly forbid multiple DRM leases. Making the code support
	this looks tricky and will require additional thought.

	Use xcb_randr_output_t throughout the internals of the
	implementation. Convert at the public API
	(wsi_get_randr_output_display).

	Clean up check for usable active_crtc (possible when only the
	desired output is connected to the crtc).

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v4:
	Move output resource fetching closer to use in
	wsi_display_get_output. This simplifies the error returns in
	earlier parts of the code a bit.

	Return VK_ERROR_INITIALIZATION_FAILED from
	wsi_acquire_xlib_display. Jason says this is the right error
	message.

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v5:
	randr doesn't pass vscan over the wire, so we set vscan to 0
	for randr-acquired modes, and test wsi modes for vscan <= 1
	when comparing against randr modes.

    	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-19 14:17:46 -07:00
Keith Packard
5a2efefb0a radv: Add EXT_direct_mode_display to radv driver
Add support for the EXT_direct_mode_display extension. This just
provides the vkReleaseDisplayEXT function.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-19 14:17:46 -07:00
Keith Packard
f89d3874fb anv: Add EXT_direct_mode_display to anv driver [v2]
Add support for the EXT_direct_mode_display extension. This just
provides the vkReleaseDisplayEXT function.

v2: Add extension to list in alphabetical order

    Suggested-by:  Jason Ekstrand <jason@jlekstrand.net>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-19 14:17:46 -07:00
Keith Packard
352d320a07 vulkan: Add EXT_direct_mode_display [v2]
Add support for the EXT_direct_mode_display extension. This just
provides the vkReleaseDisplayEXT function.

v2:
	Adopt Jason Ekstrand's coding conventions

	Declare variables at first use, eliminate extra whitespace
	between types and names. Wrap lines to 80 columns.

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-19 14:17:46 -07:00
Keith Packard
451b58a51e radv: Add KHR_display extension to radv [v5]
This adds support for the KHR_display extension to the radv Vulkan
driver. The driver now attempts to open the master DRM node when the
KHR_display extension is requested so that the common winsys code can
perform the necessary operations.

v2:
	* Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to
          vulkan_wsi_args

	Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>

v3:
	Adapt to new wsi_device_init API (added display_fd)

v4:
	Adopt Jason Ekstrand's coding conventions

	Declare variables at first use, eliminate extra whitespace
	between types and names. Wrap lines to 80 columns.

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v5:
	Add vkCreateDisplayModeKHR. This doesn't actually create
	new modes; it only checks whether the requested parameters
	match an existing mode and returns that mode.

    	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-19 14:17:46 -07:00
Keith Packard
54d0daa481 anv: Add KHR_display extension to anv [v7]
This adds support for the KHR_display extension to the anv Vulkan
driver. The driver now attempts to open the master DRM node when the
KHR_display extension is requested so that the common winsys code can
perform the necessary operations.

v2: Make sure primary fd is usable

	When KHR_display is selected, we try to open the primary node
	instead of the render node in case the user wants to use
	KHR_display for presentation. However, if we're actually going
	to end up using RandR leases, then we don't care if the
	resulting fd can't be used for display, but the kernel also
	prevents us from using it for drawing when someone else has
	master.

v3:
	Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args

	Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>

v4:
	Adapt primary node usage to new wsi_device_init API

v5:
	Adopt Jason Ekstrand's coding conventions

        Declare variables at first use, eliminate extra whitespace between
        types and names. Wrap lines to 80 columns.

	Remove spurious MM_PER_PIXEL define

        Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v6:
	Open DRM master before initializing WSI layer.

	The DRM master FD is passed to the WSI layer during
	initialization, so we need to open the device slightly earlier
	in the function.

	Close DRM master in device_finish.

	Use anv_gem_get_param to detect working master_fd instead of
	directly using the ioctl.

        Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v7:
	Add vkCreateDisplayModeKHR. This doesn't actually create
	new modes; it only checks whether the requested parameters
	match an existing mode and returns that mode.

    	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-19 14:17:46 -07:00
Keith Packard
da997ebec9 vulkan: Add KHR_display extension using DRM [v10]
This adds support for the KHR_display extension to the Vulkan
WSI layer. Driver support will be added separately.

v2:
	* fix double ;; in wsi_common_display.c

	* Move mode list from wsi_display to wsi_display_connector

	* Fix scope for wsi_display_mode and wsi_display_connector
          allocs

	* Switch all allocations to vk_zalloc instead of vk_alloc.

	* Fix DRM failure in
          wsi_display_get_physical_device_display_properties

	  When DRM fails, or when we don't have a master fd
	  (presumably due to application errors), just return 0
	  properties from this function, which is at least a valid
	  response.

	* Use vk_outarray for all property queries

	  This is a bit less error-prone than open-coding the same
	  stuff.

	* Remove VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR from surface caps

	  Until we have multi-plane support, we shouldn't pretend to
	  have any multi-plane semantics, even if undefined.

	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

	* Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to
          vulkan_wsi_args

	Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>

v3:
	Add separate 'display_fd' and 'render_fd' arguments to
	wsi_device_init API. This allows drivers to use different FDs
	for the different aspects of the device.

	Use largest mode as display size when no preferred mode.

	If the display doesn't provide a preferred mode, we'll assume
	that the largest supported mode is the "physical size" of the
	device and report that.
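
The fallback described in v3 can be sketched as follows. This is a minimal illustration, not the wsi_common_display code: the `example_mode` struct and function name are hypothetical, and "largest" is assumed here to mean largest pixel area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mode record; not the real wsi_display_mode. */
struct example_mode {
    uint32_t width;
    uint32_t height;
    bool preferred;
};

/* Returns the first preferred mode if there is one, otherwise the mode
 * with the largest area. Returns NULL for an empty list. */
const struct example_mode *
example_pick_size_mode(const struct example_mode *modes, size_t count)
{
    assert(modes != NULL || count == 0);

    const struct example_mode *largest = NULL;
    for (size_t i = 0; i < count; i++) {
        if (modes[i].preferred)
            return &modes[i];
        /* Widen before multiplying so the area cannot overflow 32 bits. */
        if (largest == NULL ||
            (uint64_t)modes[i].width * modes[i].height >
            (uint64_t)largest->width * largest->height)
            largest = &modes[i];
    }
    return largest;
}
```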

v4:
	Make wsi_image_state enumeration values uppercase.
	Follow more common mesa conventions.

	Remove 'render_fd' from wsi_device_init API.  The
	wsi_common_display code doesn't use this fd at all, so stop
	passing it in. This avoids any potential confusion over which
	fd to use when creating display-relative object handles.

	Remove call to wsi_create_prime_image which would never have
	been reached as the necessary condition (use_prime_blit) is
	never set.

	whitespace cleanups in wsi_common_display.c

	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

	Add depth/bpp info to available surface formats.  Instead of
	hard-coding depth 24 bpp 32 in the drmModeAddFB call, use the
	requested format to find suitable values.

	Destroy kernel buffers and FBs when swapchain is destroyed. We
	were leaking both of these kernel objects across swapchain
	destruction.

	Note that wsi_display_wait_for_event waits for anything to
	happen.  wsi_display_wait_for_event is simply a yield so that
	the caller can then check to see if the desired state change
	has occurred.

	Record swapchain failures in chain for later return. If some
	asynchronous swapchain activity fails, we need to tell the
	application eventually. Record the failure in the swapchain
	and report it at the next acquire_next_image or queue_present
	call.

	Fix error returns from wsi_display_setup_connector.  If a
	malloc failed, then the result should be
	VK_ERROR_OUT_OF_HOST_MEMORY. Otherwise, the associated ioctl
	failed and we're either VT switched away, or our lease has
	been revoked, in which case we should return
	VK_ERROR_OUT_OF_DATE_KHR.

	Make sure both sides of if/else brace use matches

	Note that we assume drmModeSetCrtc is synchronous. Add a
	comment explaining why we can idle any previous displayed
	image as soon as the mode set returns.

	Note that EACCES from drmModePageFlip means VT inactive.  When
	VT-switched away, drmModePageFlip returns EACCES. Poll once a
	second until we get some other return value back.

	Clean up after alloc failure in
	wsi_display_surface_create_swapchain. Destroy any created
	images, free the swapchain.

	Remove physical_device from wsi_display_init_wsi. We never
	need this value, so remove it from the API and from the
	internal wsi_display structure.

	Use drmModeAddFB2 in wsi_display_image_init.  This takes a drm
	format instead of depth/bpp, which provides more control over
	the format of the data.

v5:
	Set the 'currentStackIndex' member of the
	VkDisplayPlanePropertiesKHR record to zero, instead of
	indexing across all displays. This value is the stack depth of
	the plane within an individual display, and as the current
	code supports only a single plane per display, should be set
	to zero for all elements

	Discovered-by: David Mao <David.Mao@amd.com>

v6:
	Remove 'platform_display' bits from the build and use the
	existing 'platform_drm' instead.

v7:
	Ensure VK_ICD_WSI_PLATFORM_MAX is large enough by
	setting to VK_ICD_WSI_PLATFORM_DISPLAY + 1

v8:
	Simplify wsi_device_init failure from wsi_display_init_wsi
	by using the same pattern as the other wsi layers.

	Adopt Jason Ekstrand's whitespace and variable declaration
	suggestions. Declare variables at first use, eliminate extra
	whitespace between types and names, add list iterator helpers,
	switch to lower-case list_ macros.

	Respond to Jason's April 8 review:

	* Create a function to convert relative to absolute timeouts
          to catch overflow issues in one place

	* use VK_NULL_HANDLE to clear prop->currentDisplay

	* Get rid of available_present_modes array.

	* return OUT_OF_DATE_KHR when display_queue_next called after
	  display has been released.

	* Make errors from mode setting fatal in display_queue_next

	* Remove duplicate pthread_mutex_init call

	* Add wsi_init_pthread_cond_monotonic helper function to
	  isolate pthread error handling from wsi_display_init_wsi

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
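
The first item of the April 8 review list, converting relative timeouts to absolute ones in a single place, might look roughly like the sketch below. The function name is hypothetical, and clamping to UINT64_MAX on overflow is an assumption about the intended behaviour, not a quote of the Mesa helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts a relative timeout to an absolute
 * deadline in the same time unit, clamping to UINT64_MAX instead of
 * wrapping around when the sum would overflow. */
uint64_t example_rel_to_abs_time(uint64_t now, uint64_t rel_timeout)
{
    uint64_t abs_time;

    if (rel_timeout > UINT64_MAX - now)
        abs_time = UINT64_MAX;      /* now + rel_timeout would wrap */
    else
        abs_time = now + rel_timeout;

    /* The deadline can never end up before the current time. */
    assert(abs_time >= now);
    return abs_time;
}
```

Keeping the check in one helper means every waiting path gets the same treatment of "wait forever" style timeouts.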

v9:
	Fix vscan handling by using MAX2(vscan, 1) everywhere. Vscan
	can be zero anywhere, which is treated the same as 1.

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
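
A small sketch of the v9 idea, treating a vscan of 0 the same as 1 before comparing modes. The macro below is a local stand-in so the example is self-contained, and the function names and the `unsigned` type are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for a MAX2-style macro. */
#define EXAMPLE_MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Normalizes vscan so that 0 and 1 are treated as the same value. */
unsigned example_effective_vscan(unsigned vscan)
{
    unsigned effective = EXAMPLE_MAX2(vscan, 1u);
    assert(effective >= 1);
    return effective;
}

/* Compares two vscan values after normalization. */
bool example_vscan_matches(unsigned a, unsigned b)
{
    return example_effective_vscan(a) == example_effective_vscan(b);
}
```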

v10:
	Respond to Vulkan CTS failures.

	1. Initialize planeReorderPossible in display_properties code

	2. Only report connected displays in
	   get_display_plane_supported_displays

	3. Return VK_ERROR_OUT_OF_HOST_MEMORY when pthread cond
	   initialization fails.

	Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>

	4. Add vkCreateDisplayModeKHR. This doesn't actually create
	   new modes; it only checks whether the requested parameters
	   match an existing mode and returns that mode.

	Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
2018-06-19 14:17:46 -07:00
Bas Nieuwenhuizen
ef79457004 radv: Merge the flush bits of CMASK & DCC clear.
Probably won't be much different in practice, but still wrong.

Fixes Coverity issue 1435002.

Not CC'ing to stable since this is only hit if you enable MSAA
DCC via RADV_DEBUG.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-19 22:35:13 +02:00
Bas Nieuwenhuizen
ed06b1cdca radv: Don't check for pipeline being set in draw.
Draws without pipeline are definitely not allowed.

Fixes Coverity issue 1434216.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-19 22:35:13 +02:00
Marek Olšák
1ba87f4438 radeonsi: rename r600_texture -> si_texture, rxxx -> xxx or sxxx
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-19 13:08:50 -04:00
Marek Olšák
6703fec58c amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-19 13:08:50 -04:00
Rob Clark
39b4fdc45f freedreno/a5xx: move emit_marker5() into a5xx backend
The scratch registers move again in a6xx.. so for post-a4xx let's just
move this into the backend, and move the one place it used to be needed
in core into fd5_emit_ib().  For a6xx we will do similar, calling
emit_marker6() from fd6_emit_ib().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
0c8d9e923a freedreno/a5xx: fix crash in dEQP-GLES31.stress.vertex_attribute_binding.buffer_bounds.bind_vertex_buffer_offset_near_wrap_10
This is kind of a hack, but really the only problem is the
debug_assert() in OUT_RELOC().  But the debug_assert() is
useful to catch real issues.  So just add some #ifdef DEBUG
code to filter things out before we hit the assert.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
4a41b02d46 freedreno/a5xx: don't crash if compute shader compile fails
It is impolite, and a bit annoying with dEQP (all tests running in
single process).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
658f1f6003 freedreno/ir3: fix missing recursion into block condition
Fixes a problem seen with dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat4

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
1a6150207c freedreno/a5xx: better FOUR_QUAD/TWO_QUAD decision for compute
If we aren't going to get full occupancy, then use TWO_QUAD.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
f07154421a freedreno/a5xx: bordercolor fixes
Need a bit of hand-holding for stencil bordercolor, and add border color
values for sRGB.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
ced14f1c7a freedreno: remove per-stateobj dirty_mask's
These never got updated in fd_context_all_dirty() so actually trying to
rely on them (in the case of fd5_emit_images()) ends up in some cases
where state is not emitted but should be.  Best to just rip this out.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
5708440597 freedreno/a5xx: remove one image stateblock
I think this ends up just setting uniform/const memory.  But we upload
x/y/z stride differently.  At best this is unneeded, at worst it could
possibly clobber other uniform/const memory.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
e0c6135625 freedreno/a5xx: cubemap image fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
0bb0cac8dc freedreno/ir3: handle image buffer
Similar to txf case, we need to insert a 2nd coordinate (zero).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
d1d2b13518 freedreno/ir3: handle arrays of images
Unlike textures, this doesn't get lowered for us.  (Would be nice
if they were.. at least until we are ready to deal w/ indirect
indexing..)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
5b2ef78532 freedreno/ir3: images can be arrays too
Seems I previously totally forgot about 2d-arrays, etc..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
f489fa1f3f freedreno/ir3: use move_load_const pass
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-19 13:02:28 -04:00
Rob Clark
7235c144a6 nir: add pass to move load_const
Run this pass late (after opt loop) to move load_const instructions back
into the basic blocks which use the result, in cases where a load_const
is only consumed in a single block.

This helps reduce register usage in cases where the backend driver
cannot lower the load_const to a uniform.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-19 13:02:28 -04:00
Rob Clark
c9d6e579ec mesa/st/nir: fix driver_location for arrays of image/sampler
We can have arrays of images or samplers.  But I forgot to handle that
case long ago.  Surprised no one has complained yet.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-19 13:02:28 -04:00
Rob Clark
228457234c nir: add comment for loop_unroll pass
Save the next person from digging through the code to figure out what
the indirect_mask parameter actually does.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-19 13:02:28 -04:00
Rob Clark
e3bbc1eaf4 glsl: fix random typo
Just something I stumbled across.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-19 13:02:28 -04:00
Marek Olšák
dfeb61c5cf radeonsi: ignore PIPE_RESOURCE_FLAG_MAP_COHERENT
We treat coherent and non-coherent buffers the same.

And move external_usage for better packing.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
9322974ec7 radeonsi: always put persistent buffers into GTT on radeon
This improves performance for certain games.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
ffbbc008be radeonsi: fix si_get_num_queries for radeon
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
94b29763a4 radeonsi: don't expose performance counters for non-existent blocks
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
a2451a4c23 ac/gpu_info: add radeon_info::num_tcc_blocks
The values for the radeon winsys were copied from the kernel driver.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
166c00e28e radeonsi: set a better NUM_PATCHES hard limit
AMDVLK uses 64 (distributed) and 16 (non-distributed).
radeonsi will use 63 and 16.
* This might improve tessellation performance on Hawaii, Bonaire, Tahiti,
  Pitcairn. (they will use 16)
* I'm not sure if this matters for 1 SE configs.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
0d685ba290 radeonsi: make sure LS-HS vector lanes are reasonably occupied
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
e93fe403bc radeonsi: properly compute an LS-HS thread group size limit
"64 / max * 4" is less than "64 * 4 / max".

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
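
The ordering point in the LS-HS thread group size commit above comes down to C integer division: dividing first discards the remainder before the multiplication can use it. A minimal sketch, with hypothetical function names and `max` standing in for whatever value the driver divides by:

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */

/* Divide first: the remainder of 64 / max is dropped before scaling by 4. */
unsigned example_limit_divide_first(unsigned max)
{
    assert(max != 0);
    return 64 / max * 4;
}

/* Multiply first: only the final result is truncated. */
unsigned example_limit_multiply_first(unsigned max)
{
    assert(max != 0);
    return 64 * 4 / max;
}
```

For max = 5 the first form gives 48 and the second gives 51; the first form is never the larger of the two.
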
Eric Anholt
da0115b1c3 v3d: Fix blitting from a linear winsys BO.
This is the case for the simulator environment, and broke many blitter
tests by trying to texture from linear while the HW can only actually do
UIF/UBLINEAR/LT.  Just make a temporary and copy into it with the CPU,
then blit from that.

This is the kind of path that should use the TFU, but I haven't exposed
that hardware yet.

Fixes dEQP-GLES3.functional.fbo.blit.default_framebuffer.*
2018-06-19 09:42:20 -07:00
Eric Anholt
07b243674f v3d: Add missing always_flush debug flag.
The #define existed and was checked in the driver.
2018-06-19 09:42:20 -07:00
Tomeu Vizoso
9b1cb50ba4 virgl: Remove debugging left-overs
Some fprintfs were probably left unintentionally a few years ago and are
a bit of a nuisance.

Fixes: 2d3301e4d5 ("virgl: fix reference counting of prime handles")
       Cc: Rob Herring <robh@kernel.org>

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 13:35:13 +02:00
Timothy Arceri
6c243ac2dd glsl: fix desktop glsl linking regression
The prog->Shaders[i]->IsES check was accidentally removed causing
ES linking rules to be applied to desktop GLSL.

Fixes: 725b1a406d ("mesa/util: add allow_glsl_relaxed_es driconfig override")
2018-06-19 17:58:05 +10:00
Timothy Arceri
a9114b5e3e util: add allow_glsl_relaxed_es to drirc for Google Earth VR
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 12:09:56 +10:00
Timothy Arceri
725b1a406d mesa/util: add allow_glsl_relaxed_es driconfig override
This relaxes a number of ES shader restrictions allowing shaders
to follow more desktop GLSL like rules.

This initial implementation relaxes the following:

 - allows linking ES shaders with desktop shaders
 - allows mismatching precision qualifiers
 - always enables standard derivative builtins

These relaxations allow Google Earth VR shaders to compile.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 12:09:56 +10:00
Timothy Arceri
781c23ece6 util: add allow_glsl_builtin_const_expression to drirc for Google Earth VR
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 12:09:56 +10:00
Timothy Arceri
90dbab0f9a mesa/util: add allow_glsl_builtin_const_expression driconf override
Google Earth VR shaders use builtins in constant expressions with
GLSL 1.10. That feature wasn't allowed until GLSL 1.20.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 12:09:56 +10:00
Timothy Arceri
de93f546a7 util: manually extract the program name from program_invocation_name
Glibc has the same code to get program_invocation_short_name. However
for some reason the short name gets mangled for some wine apps.

For example with Google Earth VR I get:

program_invocation_name:
"/home/tarceri/.local/share/Steam/steamapps/common/EarthVR/Earth.exe"

program_invocation_short_name:
"e"

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-19 12:09:56 +10:00
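
A minimal sketch of deriving the short name by hand from the full invocation path, as the util commit above describes. The helper name is hypothetical, and taking everything after the last '/' is an assumption about the approach rather than a copy of the Mesa code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the part of path after the last '/',
 * or path itself if it contains no '/'. */
const char *example_short_name(const char *path)
{
    assert(path != NULL);

    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}
```

Applied to the program_invocation_name quoted in the commit message, this yields "Earth.exe" rather than the mangled "e".
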
Bas Nieuwenhuizen
1a8501a9dd ac/surface: Set compressZ for stencil-only surfaces.
We HTILE compress stencil-only surfaces too.

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-19 02:52:01 +02:00
Jason Ekstrand
0146d79636 anv: Use a single global API patch version
The Vulkan API has only one patch version shared among all of the
major.minor versions.  We should also advertise the same patch version
regardless of major.minor.
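
As an illustration of advertising one patch number for every major.minor, the sketch below packs versions using the same bit layout as Vulkan's VK_MAKE_VERSION (major from bit 22, minor in bits 12 to 21, patch in bits 0 to 11). The macro and function names and the patch value are hypothetical, not taken from anv.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION, redefined locally so the
 * sketch does not need the Vulkan headers. */
#define EXAMPLE_MAKE_VERSION(major, minor, patch) \
    (((uint32_t)(major) << 22) | ((uint32_t)(minor) << 12) | (uint32_t)(patch))

/* Hypothetical patch number shared by every advertised major.minor. */
#define EXAMPLE_API_PATCH 80u

/* Builds an API version that always carries the single global patch. */
uint32_t example_api_version(uint32_t major, uint32_t minor)
{
    assert(EXAMPLE_API_PATCH < (1u << 12));   /* patch field is 12 bits */
    return EXAMPLE_MAKE_VERSION(major, minor, EXAMPLE_API_PATCH);
}

/* Extracts the patch field from a packed version. */
uint32_t example_version_patch(uint32_t version)
{
    return version & 0xfffu;
}
```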

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106941
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-18 17:11:52 -07:00
Timothy Arceri
68bf94a8b0 radeonsi: enable OpenGL 3.3 compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-19 09:21:33 +10:00
Timothy Arceri
89a5d6f715 mesa: add ff fragment shader support for geom and tess shaders
This is required for compatibility profile support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-06-19 09:21:33 +10:00
Eric Anholt
e636199c1c v3d: Set the SO offsets correctly if we have to re-emit.
This should fix TF across a glFlush() or TF pause/restart.  Fixes
dEQP-GLES3.functional.transform_feedback.array.interleaved.lines.highp_float
and many, many others.
2018-06-18 14:54:16 -07:00
Marek Olšák
94178044d5 gallium/hud: = should rename the last added data source
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-18 17:53:15 -04:00
Rafael Antognolli
ba2c18763b anv: Disable constant buffer 0 being relative.
If we are on gen8+ and have context isolation support, just make that
constant buffer address be absolute, so we can use it for push UBOs too.

v2: Do not duplicate constant_buffer_0_is_relative flag (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-18 14:41:38 -07:00
Rafael Antognolli
be18d5a0ce anv/device: Check for kernel support of context isolation.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-18 14:41:38 -07:00
Rafael Antognolli
056214ebfc intel/genxml: Add bitmasks for CS_DEBUG_MODE2/INSTPM.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-18 14:41:38 -07:00
Alok Hota
a678f40e46 swr/rast: Clang-Format most rasterizer source code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-06-18 13:57:38 -05:00
Eric Engestrom
d85fef1e34 radv: fix reported number of available VGPRs
It's a bit late to round up after an integer division.
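
The remark above can be illustrated with a small sketch (hypothetical function names, not the radv code): once two integers have been divided, the fractional part is already gone, so the rounding has to be folded into the division itself.

```c
#include <assert.h>

/* Truncating division: 10 / 4 yields 2, and nothing is left to round up. */
unsigned example_div_truncate(unsigned total, unsigned per_unit)
{
    assert(per_unit != 0);
    return total / per_unit;
}

/* Round-up division: bias the numerator before dividing.
 * Assumes total + per_unit - 1 does not overflow. */
unsigned example_div_round_up(unsigned total, unsigned per_unit)
{
    assert(per_unit != 0);
    return (total + per_unit - 1) / per_unit;
}
```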

Fixes: de88979413 "radv: Implement VK_AMD_shader_info"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
2018-06-18 17:08:22 +01:00
Eric Engestrom
9a4bd6b45f mesa: add missing return in error path
Fixes: 67f40dadaa "mesa: add support for ARB_sample_locations"
Cc: Rhys Perry <pendingchaos02@gmail.com>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-06-18 16:19:48 +01:00
Bas Nieuwenhuizen
a3d93eec7c radv: Use less conservative approximation for context rolls.
Drops the number of times we set the scissor by 4x for F1 2017,
which results in a consistent performance improvement of about 4%.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-18 16:21:10 +02:00
Eric Engestrom
4d08c1e7d1 radv: fix bitwise check
Fixes: 922cd38172 "radv: implement out-of-order rasterization when it's safe on VI+"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-18 12:15:18 +01:00
Eric Engestrom
e8eb84826e meson: fix i965/anv/isl genX static lib names
Shouldn't make any functional difference, just that `liblibanv_gen90.a`
will now be called `libanv_gen90.a`.

Fixes: 3218056e0e "meson: Build i965 and dri stack"
Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-18 12:03:24 +01:00
Timothy Arceri
66673bef94 mesa: Unconditionally enable floating-point textures
ARB_texture_float references US Patent #6,650,327 [1] which has a filing date
of June 16 1998.

According to [2], patents filed after 1995 expire 20 years from the filing
date, giving an expiration of June 17 2018.

[1] https://www.google.com/patents/US6650327
[2] https://en.wikipedia.org/wiki/Term_of_patent_in_the_United_States

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-18 09:29:38 +10:00
Jose Maria Casanova Crespo
b8e099e7d5 intel/fs: shuffle_64bit_data_for_32bit_write is not used anymore
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
a4965842d6 intel/fs: Use new shuffle_32bit_write for all 64-bit storage writes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
a4d445b93c intel/fs: shuffle_32bit_load_result_to_64bit_data is not used anymore
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
71b319a285 intel/fs: Use shuffle_from_32bit_read for 64-bit FS load_input
The previous use of shuffle_32bit_load_result_to_64bit_data
had a source/destination overlap for 64-bit. A temporary destination
is now used for 64-bit cases, because shuffle_from_32bit_read doesn't
handle src/dst overlaps.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
8003ae87f4 intel/fs: shuffle_from_32bit_read at load_per_vertex_input at TCS/TES
Previously, the shuffle function had a source/destination overlap that
must be avoided in order to use shuffle_from_32bit_read. We can use the
destination of the removed MOVs as the shuffle destination.

This change also avoids the internal MOVs done by the previous shuffle
to deal with possible overlaps.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
5565630f85 intel/fs: Use shuffle_from_32bit_read at VS load_input
shuffle_from_32bit_read handles 32-bit reads to a 32-bit destination
in the same way as the previous loop, so now we just call the new
function for all bit sizes, which also simplifies the 64-bit load_input.

v2: Add comment about future 16-bit support (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
152bffb69b intel/fs: Use shuffle_from_32bit_read for 64-bit gs_input_load
This implementation avoids two unneeded MOVs for each 64-bit
component. One was done in the old shuffle to avoid src/dst
overlap, which does not occur here. The other removed MOV
was already being done in the shuffle.

Copy propagation wasn't able to remove them because shuffle
destination values are defined with partial writes because they
have stride == 2.

v2: Reword commit log summary (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
8b26a2d96d intel/fs: shuffle_from_32bit_read for 64-bit do_untyped_vector_read
do_untyped_vector_read is used at load_ssbo and load_shared.

The previous MOVs are removed because shuffle_from_32bit_read
can handle storing the shuffle results in the expected destination
just using the proper offset.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
c2297bdf19 intel/fs: Remove old 16-bit shuffle/unshuffle functions
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
fd3d8a8f79 intel/fs: Use shuffle_for_32bit_write for 16-bits store_ssbo
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
20e4732f7d intel/fs: Use shuffle_from_32bit_read to read 16-bit SSBO
Using shuffle_from_32bit_read instead of the 16-bit shuffle functions
avoids the need to retype. At the same time, the new functions are
ready for 8-bit SSBO reads.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
a0891eabca intel/fs: Use shuffle_from_32bit_read at VARYING_PULL_CONSTANT_LOAD
shuffle_from_32bit_read can manage the shuffle/unshuffle needed
for different 8/16/32/64 bit-sizes at VARYING PULL CONSTANT LOAD.
To get the specific component the first_component parameter is used.

In the case of the previous 16-bit shuffle, the shuffle operation was
generating unneeded MOVs whose results were never used. This
behaviour went unnoticed on SIMD16 because the dead_code_eliminate
pass removed the generated instructions, but for SIMD8 they couldn't be
removed because they were partial writes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
22c654941b intel/fs: New shuffle_for_32bit_write and shuffle_from_32bit_read
These new shuffle functions deal with the shuffle/unshuffle operations
needed for read/write operations using 32-bit components when the
read/written components have a different bit-size (8, 16, 64-bits).
Shuffle from 32-bit to 32-bit becomes a simple MOV.

shuffle_src_to_dst takes care of doing a shuffle when the source type is
smaller than the destination type and an unshuffle when the source type is
bigger than the destination. So these new read/write functions just need
to call shuffle_src_to_dst, assuming that writes use a 32-bit
destination and reads use a 32-bit source.

As shuffle_for_32bit_write/from_32bit_read take components
in units of the source/destination types and shuffle_src_to_dst takes units
of the smallest type component, we adjust the components and first_component
parameters.

To enable these new functions, there must be no
source/destination overlap in the case of shuffle_from_32bit_read.
That never happens with shuffle_for_32bit_write, as it allocates a new
destination register, as shuffle_64bit_data_for_32bit_write did.

v2: Reword commit log and add comments to explain why first_component
    and components parameters are adjusted. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
a5665056e5 intel/fs: general 8/16/32/64-bit shuffle_src_to_dst function
This new function takes care of shuffling/unshuffling components of a
particular bit-size into components with a different bit-size.

If source type size is smaller than destination type size the operation
needed is a component shuffle. The opposite case would be an unshuffle.

Component units are measured in terms of the smaller type between
source and destination, as we are un/shuffling the smaller components
from/into a bigger one.

The operation allows skipping the first first_component components of
the source.

Shuffle MOVs are retyped using integer types, avoiding problems with
denorms and float types when the source and destination bit sizes differ.
This simplifies the uses of shuffle functions that deal with
these retypes individually.

There is now a new restriction: source and destination can no longer
overlap when calling this shuffle function. Following patches that migrate
to this new function will individually take care of avoiding source
and destination overlaps.

v2: (Jason Ekstrand)
    - Rewrite overlap asserts.
    - Manage type_sz(src.type) == type_sz(dst.type) case using MOVs
      from source to dest. This works for 64-bit to 64-bit
      operations on Gen7, as it doesn't support Q registers.
    - Explain that component units are based on the smallest type.
v3: - Fix unshuffle overlap assert (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Fonseca
d882331f7a appveyor: Consume LLVM 5.0.1.
https://ci.appveyor.com/project/jrfonseca/mesa/build/47

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-06-16 18:09:20 +01:00
Bas Nieuwenhuizen
c4714f698b ac: Clear meminfo to avoid valgrind warning.
Somehow valgrind misses that the value is initialized by the ioctl.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-16 19:03:47 +02:00
Samuel Pitoiset
5917761e3d radv: fix emitting the TCS regs on GFX9
The primitive ID is NULL and this generates an invalid
select instruction which crashes because one operand is NULL.

This fixes crashes in The Long Journey Home, Quantum Break
and Just Cause 3 with DXVK.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106756
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-16 10:18:51 +02:00
Ian Romanick
355868dbfc nir: Document a couple instances of parent_instr
nir_ssa_def::parent_instr and nir_src::parent_instr have the same name,
but they mean really different things.  I choose to save the next person
the hour+ that I just spent figuring that out.  Even now that I know, I
doubt I'd notice in code review that someone typed foo->parent_instr
when they actually meant foo->ssa->parent_instr.
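
A minimal sketch of the distinction, using simplified stand-in structs
rather than the real NIR definitions (struct instr, struct ssa_def,
struct src and producer_of are hypothetical names for illustration):

```c
#include <assert.h>
#include <stddef.h>

struct instr {
   const char *name;
};

struct ssa_def {
   struct instr *parent_instr;   /* instruction that produces this value */
};

struct src {
   struct instr *parent_instr;   /* instruction that reads this source */
   struct ssa_def *ssa;          /* the value being read */
};

/* Returns the instruction that produced the value read through `s`,
 * which is reached through the SSA def, not through the source itself. */
struct instr *
producer_of(const struct src *s)
{
   assert(s != NULL && s->ssa != NULL);
   return s->ssa->parent_instr;
}
```

With a source s owned by a consuming instruction, s->parent_instr names
the consumer while s->ssa->parent_instr names the producer, so mixing
them up still compiles.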

v2: Minor wording tweak in nir_ssa_def::parent_instr.  Suggested by
Jason.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-15 17:36:51 -07:00
Ian Romanick
4467040cb6 i965/fs: Propagate conditional modifiers from not instructions
Skylake
total instructions in shared programs: 14399081 -> 14399010 (<.01%)
instructions in affected programs: 26961 -> 26890 (-0.26%)
helped: 57
HURT: 0
helped stats (abs) min: 1 max: 6 x̄: 1.25 x̃: 1
helped stats (rel) min: 0.16% max: 0.80% x̄: 0.30% x̃: 0.18%
95% mean confidence interval for instructions value: -1.50 -0.99
95% mean confidence interval for instructions %-change: -0.35% -0.25%
Instructions are helped.

total cycles in shared programs: 532978307 -> 532976050 (<.01%)
cycles in affected programs: 468629 -> 466372 (-0.48%)
helped: 33
HURT: 20
helped stats (abs) min: 3 max: 360 x̄: 116.52 x̃: 98
helped stats (rel) min: 0.06% max: 3.63% x̄: 1.66% x̃: 1.27%
HURT stats (abs)   min: 2 max: 172 x̄: 79.40 x̃: 43
HURT stats (rel)   min: 0.04% max: 3.02% x̄: 1.48% x̃: 0.44%
95% mean confidence interval for cycles value: -81.29 -3.88
95% mean confidence interval for cycles %-change: -1.07% 0.12%
Inconclusive result (%-change mean confidence interval includes 0).

All Gen6+ platforms, except Ivy Bridge, had similar results. (Haswell shown)
total instructions in shared programs: 12973897 -> 12973838 (<.01%)
instructions in affected programs: 25970 -> 25911 (-0.23%)
helped: 55
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.07 x̃: 1
helped stats (rel) min: 0.16% max: 0.62% x̄: 0.28% x̃: 0.18%
95% mean confidence interval for instructions value: -1.14 -1.00
95% mean confidence interval for instructions %-change: -0.32% -0.24%
Instructions are helped.

total cycles in shared programs: 410355841 -> 410352067 (<.01%)
cycles in affected programs: 578454 -> 574680 (-0.65%)
helped: 47
HURT: 5
helped stats (abs) min: 3 max: 360 x̄: 85.74 x̃: 18
helped stats (rel) min: 0.05% max: 3.68% x̄: 1.18% x̃: 0.38%
HURT stats (abs)   min: 2 max: 242 x̄: 51.20 x̃: 4
HURT stats (rel)   min: <.01% max: 0.45% x̄: 0.15% x̃: 0.11%
95% mean confidence interval for cycles value: -104.89 -40.27
95% mean confidence interval for cycles %-change: -1.45% -0.66%
Cycles are helped.

Ivy Bridge
total instructions in shared programs: 11679351 -> 11679301 (<.01%)
instructions in affected programs: 28208 -> 28158 (-0.18%)
helped: 50
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.12% max: 0.54% x̄: 0.23% x̃: 0.16%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.27% -0.19%
Instructions are helped.

total cycles in shared programs: 257445362 -> 257444662 (<.01%)
cycles in affected programs: 419338 -> 418638 (-0.17%)
helped: 40
HURT: 3
helped stats (abs) min: 1 max: 170 x̄: 65.05 x̃: 24
helped stats (rel) min: 0.02% max: 3.51% x̄: 1.26% x̃: 0.41%
HURT stats (abs)   min: 2 max: 1588 x̄: 634.00 x̃: 312
HURT stats (rel)   min: 0.05% max: 2.97% x̄: 1.21% x̃: 0.62%
95% mean confidence interval for cycles value: -97.96 65.41
95% mean confidence interval for cycles %-change: -1.56% -0.62%
Inconclusive result (value mean confidence interval includes 0).

No changes on Iron Lake or GM45.

v2: Move 'if (cond != BRW_CONDITIONAL_Z && cond != BRW_CONDITIONAL_NZ)'
check outside the loop.  Suggested by Iago.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
f2d8bb7a7b i965/fs: Rearrange code to remove most of the gotos
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
77f269bb56 i965/fs: Refactor propagation of conditional modifiers from compares to adds
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
22f9fbc0d9 i965/vec4: Optimize OR with 0 into a MOV
All of the affected shaders are geometry shaders... the same ones from
the similar fs changes.

The "No changes on any other platforms" comment below is not quite
right.  Without the previous change to register coalescing, this
optimization caused quite a few regressions in tests that either used
gl_ClipVertex or used different interpolation modes.  I observed that
with both patches applied,
glsl-1.10/execution/interpolation/interpolation-none-gl_BackSecondaryColor-smooth-vertex.shader_test
was one instruction shorter.  I suspect other shaders would be similarly
affected.  Since this is all based on NOS, shader-db does not reflect
it.

Haswell
total instructions in shared programs: 12954955 -> 12954918 (<.01%)
instructions in affected programs: 3603 -> 3566 (-1.03%)
helped: 37
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.21% max: 2.50% x̄: 1.99% x̃: 2.50%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -2.30% -1.69%
Instructions are helped.

total cycles in shared programs: 410012108 -> 410012098 (<.01%)
cycles in affected programs: 3540 -> 3530 (-0.28%)
helped: 5
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28%
95% mean confidence interval for cycles value: -2.00 -2.00
95% mean confidence interval for cycles %-change: -0.28% -0.28%
Cycles are helped.

Ivy Bridge
total instructions in shared programs: 11679387 -> 11679351 (<.01%)
instructions in affected programs: 3292 -> 3256 (-1.09%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.21% max: 2.50% x̄: 2.04% x̃: 2.50%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -2.34% -1.74%
Instructions are helped.

No changes on any other platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
e6a9bd97b9 i965/vec4: Don't register coalesce into source of VS_OPCODE_UNPACK_FLAGS_SIMD4X2
This prevents regressions in a bunch of clipping and interpolation tests
caused by the next patch (i965/vec4: Optimize OR with 0 into a MOV).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
284b563fb0 i965/fs: Optimize OR with 0 into a MOV
fs_visitor::set_gs_stream_control_data_bits generates some code like
"control_data_bits | stream_id << ((2 * (vertex_count - 1)) % 32)" as
part of EmitVertex.  The first time this (dynamically) occurs in the
shader, control_data_bits is zero.  Many times we can determine this
statically and various optimizations will collaborate to make one of the
OR operands literal zero.

Converting the OR to a MOV usually allows it to be copy-propagated away.
However, this does not happen in at least some shaders (in the assembly
output of shaders/closed/UnrealEngine4/EffectsCaveDemo/301.shader_test,
search for shl).
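
The identity behind the optimization can be sketched on the CPU. The
function below only mirrors the expression quoted above;
accumulate_stream_bits is a hypothetical name, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates the quoted expression:
 * control_data_bits | stream_id << ((2 * (vertex_count - 1)) % 32) */
uint32_t
accumulate_stream_bits(uint32_t control_data_bits, uint32_t stream_id,
                       uint32_t vertex_count)
{
   assert(vertex_count >= 1);
   uint32_t shift = (2 * (vertex_count - 1)) % 32;
   return control_data_bits | (stream_id << shift);
}
```

When control_data_bits is known to be zero, the result is just the
shifted stream_id, so the OR carries no information and a plain MOV of
the other operand is equivalent.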

All of the affected shaders are geometry shaders.

Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14375452 -> 14375413 (<.01%)
instructions in affected programs: 6422 -> 6383 (-0.61%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.14% max: 2.56% x̄: 1.91% x̃: 2.56%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -2.26% -1.57%
Instructions are helped.

total cycles in shared programs: 531981179 -> 531980555 (<.01%)
cycles in affected programs: 27493 -> 26869 (-2.27%)
helped: 39
HURT: 0
helped stats (abs) min: 16 max: 16 x̄: 16.00 x̃: 16
helped stats (rel) min: 0.60% max: 7.92% x̄: 5.94% x̃: 7.92%
95% mean confidence interval for cycles value: -16.00 -16.00
95% mean confidence interval for cycles %-change: -6.98% -4.90%
Cycles are helped.

No changes on earlier platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Eric Anholt
4106f6ce54 v3d: Handle a no-intersection scissor even if it's outside of the VP.
The min/maxes ended up producing a negative clip width/height for
dEQP-GLES3.functional.fragment_ops.scissor.outside_render_line.  Just make
sure they stay at 0 (or v3d 3.x's workaround) if that happens.
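
A minimal sketch of the clamping idea on one axis, assuming half-open
integer ranges. clip_extent is a hypothetical helper, and the v3d 3.x
workaround mentioned above is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Length of the overlap of the viewport range [vp_min, vp_max) and the
 * scissor range [sc_min, sc_max), clamped so it never goes negative. */
uint32_t
clip_extent(int32_t vp_min, int32_t vp_max, int32_t sc_min, int32_t sc_max)
{
   assert(vp_min <= vp_max && sc_min <= sc_max);
   int32_t lo = vp_min > sc_min ? vp_min : sc_min;
   int32_t hi = vp_max < sc_max ? vp_max : sc_max;

   /* A scissor entirely outside the viewport gives hi < lo. */
   return hi > lo ? (uint32_t)(hi - lo) : 0;
}
```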
2018-06-15 16:09:39 -07:00
Eric Anholt
9aa670e52a v3d: Use the proper depth texture type for sampling.
Fixes failing tests in dEQP-GLES3.functional.texture.shadow
2018-06-15 16:09:39 -07:00
Eric Anholt
778594ae12 v3d: Limit shader threading according to our maximum TMU fifo usage.
Fixes simulator assertion failures in
dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment
and similar complicated cases.
2018-06-15 16:09:39 -07:00
Eric Anholt
e130ada243 v3d: Fix shaders using pixel center W but no varyings.
The docs called this field "uses both center W and centroid W", but
actually it's "do you need center W even if varyings don't obviously call
for it?"

Fixes dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_w
2018-06-15 16:09:39 -07:00
Dylan Baker
0d4f338a11 docs: Update release-notes and calendar 2018-06-15 13:53:25 -07:00
Dylan Baker
3c454fc84a docs: Add release notes for 18.1.2 2018-06-15 13:52:44 -07:00
Rafael Antognolli
9e1f208795 intel/aubinator: Use int to store getopt_long flags.
getopt_long flag parameter is an int pointer, so if we use bool to store
those values, when getopt_long writes to one of them, it might end up
overwriting the next one.
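
A small sketch of the safe pattern, assuming a POSIX/GNU getopt_long.
The option names and the parse_flags helper are illustrative, not the
aubinator code.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* getopt_long() stores `val` through an int pointer, so each flag
 * variable must be a full int rather than a bool. */
int opt_help = 0;
int opt_pager = 1;

void
parse_flags(int argc, char *argv[])
{
   const struct option opts[] = {
      { "help",     no_argument, &opt_help,  1 },
      { "no-pager", no_argument, &opt_pager, 0 },
      { NULL,       0,           NULL,       0 }
   };

   assert(argc >= 1);
   optind = 1; /* restart scanning in case of repeated calls */
   while (getopt_long(argc, argv, "", opts, NULL) != -1)
      ;        /* flags are written by getopt_long itself */
}
```

Had the flags been declared as adjacent bool variables, each store of
sizeof(int) bytes could spill past the one-byte object being written.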

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 09:03:10 -07:00
Samuel Pitoiset
f8e2c4c57c Revert "radv: always set/load both depth and stencil clear values"
This fixes a rendering regression with RoTR.

This reverts commit 4bdad9fadd.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 16:52:06 +02:00
Samuel Pitoiset
a2f6e72138 radv: don't check for linear images in emit_fast_color_clear()
We don't enable CMASK for linear surfaces and addrlib only
enables DCC for tiling surfaces.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:12 +02:00
Samuel Pitoiset
3befac52db radv: allow RADV_PERFTEST=dccmsaa on GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:10 +02:00
Samuel Pitoiset
bfca15e16a radv: add RADV_DEBUG=checkir
This allows running the LLVM verifier pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:08 +02:00
Samuel Pitoiset
706d51de7f radv: update ZRANGE_PRECISION in radv_update_bound_fast_clear_ds()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:06 +02:00
Samuel Pitoiset
fa8bc821a8 radv: clean up radv_{set,load}_depth_clear_regs() helpers
And replace _regs by _metadata because it makes more sense.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:04 +02:00
Samuel Pitoiset
4bdad9fadd radv: always set/load both depth and stencil clear values
I don't think it matters much to emit both values, and that
makes the code a bit simpler.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:02 +02:00
Samuel Pitoiset
2193a6a828 radv: update the fast ds clear values only if the image is bound
It's unnecessary to update the fast depth/stencil clear values
if the fast cleared depth/stencil image isn't currently bound.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:00 +02:00
Samuel Pitoiset
be794fa26b radv: clean up radv_{set,load}_color_clear_regs() helpers
And replace _regs by _metadata because it makes more sense.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:53:58 +02:00
Samuel Pitoiset
d7b772abb4 radv: update the fast color clear values only if the image is bound
It's unnecessary to update the fast color clear values if the
fast cleared color image isn't currently bound.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:53:55 +02:00
Christian Gmeiner
efae127993 util/bitset: include util/macro.h
The BITSET_FFS(x) macro makes use of the ARRAY_SIZE(x) macro, which is
defined in util/macro.h. Include it directly to make usage more
straightforward.

Fixes: 692bd4a1ab ("util: replace Elements() with ARRAY_SIZE()")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-15 11:26:30 +01:00
Lukas Rusak
4cfc4cef80 meson: fix private libs when building without glx
I noticed that the generated pkg-config files will include
glx and x11 dependencies even when x11 isn't a selected platform.

This fixes the private libs and was tested by building kmscube

V2:
  - check if gallium-xlib is being used for glx

Fixes: 108d257a16 "meson: build libEGL"
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-15 10:43:22 +01:00
Rhys Perry
30f1ab7a59 docs: document addition of GL_ARB_sample_locations for nvc0
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
2018-06-14 20:09:45 -06:00
Rhys Perry
66ca7e400b nvc0: add support for programmable sample locations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2018-06-14 20:09:45 -06:00
Rhys Perry
9f217facbd st/mesa: add support for ARB_sample_locations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2018-06-14 20:09:45 -06:00
Rhys Perry
51a221e378 gallium: add support for programmable sample locations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2018-06-14 20:09:45 -06:00
Rhys Perry
67f40dadaa mesa: add support for ARB_sample_locations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2018-06-14 20:09:45 -06:00
Eric Anholt
cd2e673abc v3d: Fix polygon offset for Z16 buffers.
Fixes:
dEQP-GLES3.functional.polygon_offset.fixed16_displacement_with_units
dEQP-GLES3.functional.polygon_offset.fixed16_render_with_units
2018-06-14 17:03:16 -07:00
Eric Anholt
d91e06a065 v3d: Fix configuration setup of mixed f32 and f16 render targets.
Fixes dEQP-GLES3.functional.fragment_out.random.26 and 6 others.
2018-06-14 16:52:25 -07:00
Eric Anholt
6784aa9870 v3d: Don't set the first_ez_state to DISABLED if after only UNDECIDED draws.
We need to have the RCL start with EZ enabled, since those undecided draws
had EZ enabled.  But we do need to update from UNDECIDED to LT or GT as
necessary still.

Fixes many simulator assertion fails in deqp
fragment_ops/interaction/basic_shader/*
2018-06-14 16:52:25 -07:00
Eric Anholt
9080642449 v3d: Use the right size for v3d 4.x TEXTURE_SHADER_STATE BO.
This doesn't really matter, since they both get rounded up to 4096.
2018-06-14 16:52:25 -07:00
Eric Anholt
31548187cf v3d: Add static asserts for other packed packet sizes. 2018-06-14 16:52:25 -07:00
Eric Anholt
0eef4d7f8f v3d: Fix the size of the packed attribute state.
Fixes segfaults in dEQP-GLES3.functional.vertex_array_objects.all_attributes.
2018-06-14 16:52:25 -07:00
Eric Anholt
7d8fe50af3 v3d: Remove some unused context fields from vc4. 2018-06-14 16:52:25 -07:00
Eric Anholt
48011c42aa v3d: Remove unused QUNIFORM_STENCIL left over from vc4. 2018-06-14 16:52:25 -07:00
Eric Anholt
4564537222 v3d: Use our #define for max attributes in shader caps. 2018-06-14 16:52:25 -07:00
Eric Anholt
a40bc33b11 v3d: Fix undefined results for a swap_color_rb RT from a float shader output.
Fixes segfaults and undefined behavior in
dEQP-GLES3.functional.fragment_out.basic.fixed.srgb8_alpha8_lowp_float
2018-06-14 16:52:25 -07:00
Dave Airlie
600d34c822 radv: remove multisample bit from shader key.
This wasn't being used anywhere inside the shader from what I can see.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 09:33:20 +10:00
Kenneth Graunke
f6898f2b55 intel/compiler: Properly consider UBO loads that cross 32B boundaries.
The UBO push analysis pass incorrectly assumed that all values would fit
within a 32B chunk, and only recorded a bit for the 32B chunk containing
the starting offset.

For example, if a UBO contained the following, tightly packed:

   vec4 a;  // [0, 16)
   float b; // [16, 20)
   vec4 c;  // [20, 36)

then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
which means that we ought to record two 32B chunks in the bitfield.

Similarly, dvec4s would suffer from the same problem.
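
The accounting can be sketched as follows. This is an illustration of
the idea, not the code from the pass: chunks_touched is a hypothetical
helper, and it uses the last byte of the range (offset + size - 1) to
find the final chunk.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmask of the 32-byte chunks touched by the byte range
 * [offset, offset + size). Bit n stands for chunk n. */
uint64_t
chunks_touched(uint32_t offset, uint32_t size)
{
   assert(size > 0);
   uint32_t first = offset / 32;
   uint32_t last = (offset + size - 1) / 32;
   assert(last < 64);

   uint32_t count = last - first + 1;
   /* Avoid shifting a 64-bit value by 64, which is undefined. */
   uint64_t run = count == 64 ? UINT64_MAX : (((uint64_t)1 << count) - 1);
   return run << first;
}
```

For the layout above, chunks_touched(20, 16) covers vec4 c and sets
bits 0 and 1, whereas recording only the starting chunk would set bit 0
alone.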

v2: Rewrite the accounting, my calculations were wrong.
v3: Write a comment about partial values (requested by Jason).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v3]
2018-06-14 14:58:59 -07:00
Ian Romanick
37bd9ccd21 glsl: Don't copy propagate elements from SSBO or shared variables either
Since SSBOs can be written by a different GPU thread, copy propagating a
read can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

The same shader was helped by this patch and the previous.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399119 -> 14399113 (<.01%)
instructions in affected programs: 683 -> 677 (-0.88%)
helped: 1
HURT: 0

total cycles in shared programs: 532973113 -> 532971865 (<.01%)
cycles in affected programs: 524666 -> 523418 (-0.24%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
2018-06-14 11:28:12 -07:00
Ian Romanick
461a5c899c glsl: Don't copy propagate from SSBO or shared variables either
Since SSBOs can be written by other GPU threads, copy propagating a read
can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399120 -> 14399119 (<.01%)
instructions in affected programs: 684 -> 683 (-0.15%)
helped: 1
HURT: 0

total cycles in shared programs: 532978931 -> 532973113 (<.01%)
cycles in affected programs: 530484 -> 524666 (-1.10%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
2018-06-14 11:26:33 -07:00
Lukas Rusak
1d92d6486a meson: only build vl_winsys_dri.c when x11 platform is used
This seems to have been missed in the move from autotools

This fixes the following build issue:

../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory
 #include <X11/Xlib-xcb.h>
          ^~~~~~~~~~~~~~~~

Fixes: b1b65397d0
       ("meson: Build gallium auxiliary")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-14 10:34:51 -07:00
Brian Paul
b9e6438adf st/mesa: add missing switch cases in glsl_to_tgsi_visitor::visit()
To silence compiler warning about unhandled switch cases.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-06-14 11:29:51 -06:00
Bas Nieuwenhuizen
41dabdc475 radv: Fix output for sparse MRTs.
We need to init the cb_shader_format correctly with the changed
col_format, so this moves the col_format adjustment to before the
cb_shader_mask gets generated.

Fixes: 06d3c65098 "radv: fix a GPU hang when MRTs are sparse"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-14 11:48:24 +02:00
Samuel Pitoiset
68dead112e radv: update the ZRANGE_PRECISION value for the TC-compat bug
On GFX8+, there is a bug that affects TC-compatible depth surfaces
when the ZRange is not reset after LateZ kills pixels.

The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match
the last fast clear value. Because the value is set to 1 by default,
we only need to update it when clearing Z to 0.0.

We also need to set the depth clear regs and to update
ZRANGE_PRECISION when initializing a TC-compat depth image to 0.

Original patch from James Legg.

This fixes random CTS fails with
dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-14 11:38:29 +02:00
Samuel Iglesias Gonsálvez
183adc51f8 anv: reduce maxFragmentInputComponents
If the application asks for the maximum number of fragment input
components (128), use all of them plus some builtins that are
passed in the VUE, then we exceed the maximum number of used VUE
slots (32) and we break one assert that checks this limit.

Also, with separate shader objects, we add the CLIP_DIST0 and CLIP_DIST1
builtins in brw_compute_vue_map() because we don't know if
gl_ClipDistance is going to be read or written by an adjacent stage.
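
The arithmetic can be sketched as below, assuming each VUE slot holds
one four-component vector. vue_slots_needed is a hypothetical helper and
the builtin count passed to it is only an example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VUE_SLOTS 32 /* limit quoted in the commit message */

/* Assumption: one VUE slot carries four input components. */
uint32_t
vue_slots_needed(uint32_t input_components, uint32_t builtin_slots)
{
   assert(input_components % 4 == 0);
   return input_components / 4 + builtin_slots;
}
```

Under that assumption, 128 user components already fill all 32 slots,
so any builtin slot added on top pushes the total past the limit.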

Fixes VK-GL-CTS CL#2569.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-14 09:54:28 +02:00
Marek Olšák
6d671078a8 radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers
This fixes:
GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-13 22:00:12 -04:00
Marek Olšák
a4312742a5 radeonsi/gfx9: update & clean up a DPBB heuristic
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:43 -04:00
Marek Olšák
47b780be21 radeonsi/gfx9: set POPS_DRAIN_PS_ON_OVERLAP due to a hw bug
This may not be needed yet, but let's set it now.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:42 -04:00
Marek Olšák
a152ca70f2 radeonsi/gfx9: remove UINT_MAX array terminators in bin size tables
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:40 -04:00
Marek Olšák
cd0be6cdc8 radeonsi/gfx9: update bin sizes
This is based on our docs (recently updated), not amdvlk.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:39 -04:00
Marek Olšák
2f51081a93 radeonsi/gfx9: update primitive binning code for EQAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:37 -04:00
Marek Olšák
22e994bb75 radeonsi: assume that rasterizer state is non-NULL in draw_vbo
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:36 -04:00
Marek Olšák
f3b3ee6974 radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:34 -04:00
Marek Olšák
d6974feb90 radeonsi: move the guardband registers into a separate state atom
They have a different frequency of updates and don't change when scissors
change.

I think this even fixes something in si_update_vs_viewport_state.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:31 -04:00
Marek Olšák
68b1c669e7 radeonsi/gfx9: implement the scissor bug workaround without performance drop
This might improve performance on Vega10 and Raven.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:27 -04:00
Marek Olšák
73b0d10152 radeonsi: don't set VGT_LS_HS_CONFIG if it doesn't change
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:25 -04:00
Marek Olšák
28ee825e19 radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs
same as amdvlk.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:23 -04:00
Marek Olšák
99e0ba6868 radeonsi: record CLIPVERTEX output usage properly for compatibility profiles
This was missed when adding CLIPVERTEX support into GS & tess.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:20 -04:00
Marek Olšák
47a57a709d radeonsi: fix FBFETCH with 2D MSAA arrays
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:17 -04:00
Marek Olšák
e5e57c3a5e ac: handle undefined EQAA samples in ac_apply_fmask_to_sample
RADV might wanna use this helper too.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:12 -04:00
Marek Olšák
a2d4c8ff6d radeonsi: return real memory usage instead of per-process usage
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-13 21:47:36 -04:00
Marek Olšák
95ecde42eb ac/gpu_info: report real total memory sizes
The change from MIN2 to MAX2 is intentional.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-13 21:47:36 -04:00
Dave Airlie
f11b664f48 docs: mark virgl GL 4.0 features as complete.
virgl should now expose GL4.1 where it can.
2018-06-14 10:38:11 +10:00
Dave Airlie
7b6f2704eb virgl: add ARB_tessellation_shader support. (v2)
This should add all the pieces to enable tess shaders on virgl.

v2: fixup transform to handle tess and strip out precise.
set default for max patch varyings to work around issue when
tess gets enabled from v1 caps but v2 caps aren't in place. (Elie)

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-06-14 10:36:31 +10:00
Dave Airlie
babd1d526b glsl: allow standalone semicolons outside main()
GLSL 4.60 officially added this, but games and older CTS suites actually
had shaders that did this, so we may as well enable it everywhere.

Adding stable because it appears apps in the wild do this.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
2018-06-14 10:21:51 +10:00
Samuel Pitoiset
51e23d3419 radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8
This causes rendering issues in Shadow Warrior 2 with DXVK.

Cc: mesa-stable@lists.freedesktop.org
Fixes: ccc64f3133 ("radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-13 20:30:04 +02:00
Andrew Galante
baf16b2ea3 configure.ac: Test for __atomic_add_fetch in atomic checks
Some platforms have 64-bit __atomic_load_n but not 64-bit
__atomic_add_fetch, so test for both of them.
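
A probe along these lines exercises both builtins on a 64-bit object.
It is a sketch assuming the GCC/Clang __atomic builtins, not the exact
test program added to the build system; atomic_probe is a hypothetical
name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Uses both 64-bit builtins so that a toolchain lacking either one
 * fails to link unless the needed support library is added. */
uint64_t
atomic_probe(uint64_t *counter)
{
   assert(counter != NULL);
   __atomic_add_fetch(counter, (uint64_t)1, __ATOMIC_ACQ_REL);
   return __atomic_load_n(counter, __ATOMIC_ACQUIRE);
}
```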

Bug: https://bugs.gentoo.org/655616
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-13 10:09:46 -07:00
Andrew Galante
9d547a7617 meson: Test for __atomic_add_fetch in atomic checks
Some platforms have 64-bit __atomic_load_n but not 64-bit
__atomic_add_fetch, so test for both of them.

Bug: https://bugs.gentoo.org/655616
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-13 10:09:46 -07:00
Matt Turner
b29b5a82a1 meson: Fix -latomic check
Commit 54ba73ef10 (configure.ac/meson.build: Fix -latomic test) fixed
some checks for -latomic, and then commit 54bbe600ec (configure.ac:
rework -latomic check) further extended the fixes in configure.ac but
not in Meson. This commit extends those fixes to the Meson tests.

Fixes: 54bbe600ec (configure.ac: rework -latomic check)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-13 10:09:46 -07:00
Dylan Baker
9cc577761f meson: Remove various completed todos
v3: - Remove "won't do" todos, so only completed todo's are now removed.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
2018-06-13 10:07:03 -07:00
Dylan Baker
0ce3f3538b meson: Make use of optional modules
meson 0.43 gained support for optional modules, which clover would like
to use. Since we require 0.44.1 now we can rely on them being available
for clover.

compile tested only.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:58 -07:00
Dylan Baker
34bbb24ce7 meson: Add support for ppc assembly/optimizations
v2: - Use -mpower8-vector in compiler test for altivec
    - rename altivec option to power8
    - reword power8 option description to be more clear, originally I
      had made it a boolean, but replaced it with an auto option.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:54 -07:00
Dylan Baker
e26af22143 meson: Add support for SPARC assembly
This was blindly copied from autotools and tested by a helpful gentoo
user.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:25 -07:00
Dylan Baker
6eaa013685 meson: Set include dirs for asm
v2: - split this from the next patch
    - Only include x86-64 and not x86 when building x86_64

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:23 -07:00
Dylan Baker
65e447c5df meson: move cc and cpp definitions to top of main meson.build
This just makes using cc and cpp easier.

v2: - Add this patch to fix altivec

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:16 -07:00
Jason Ekstrand
51376cd749 Revert "intel/compiler: Properly consider UBO loads that cross 32B boundaries."
This reverts commit b8fa847c2e.

This broke about 30k Vulkan CTS tests.
2018-06-13 09:23:55 -07:00
Kenneth Graunke
b8fa847c2e intel/compiler: Properly consider UBO loads that cross 32B boundaries.
The UBO push analysis pass incorrectly assumed that all values would fit
within a 32B chunk, and only recorded a bit for the 32B chunk containing
the starting offset.

For example, if a UBO contained the following, tightly packed:

   vec4 a;  // [0, 16)
   float b; // [16, 20)
   vec4 c;  // [20, 36)

then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
which means that we ought to record two 32B chunks in the bitfield.

Similarly, dvec4s would suffer from the same problem.
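
The chunk arithmetic above can be sketched in C. This is only an illustration, not the code from the pass: chunks_touched() is a hypothetical helper, and the 64-bit mask width is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a bitmask with one bit set per 32-byte
 * chunk touched by the byte range [offset, offset + size). */
static uint64_t
chunks_touched(unsigned offset, unsigned size)
{
   assert(size > 0);

   unsigned first = offset / 32;
   unsigned last = (offset + size - 1) / 32; /* chunk of the last byte */
   assert(last < 64);                        /* assumed mask width */

   uint64_t mask = 0;
   for (unsigned chunk = first; chunk <= last; chunk++)
      mask |= UINT64_C(1) << chunk;
   return mask;
}
```

With the layout above, chunks_touched(20, 16) sets bits 0 and 1 for c, while a and b each set only bit 0.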

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-06-13 02:07:58 -07:00
Ross Burton
3c288da5ee drivers/dri/i965: add missing #include
brw_bufmgr.h uses time_t without including time.h, so the build fails under musl.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-12 12:08:30 +01:00
Mauro Rossi
fb9ab2fbd3 anv/android: Use an address for each anv_image plane
Fix a build error after the change to the image->planes[] structure:
{bo,bo_offset} has to be replaced by address.{bo,offset},
and the assert() for debug builds needs the same update.

external/mesa/src/intel/vulkan/anv_android.c:188:21:
error: no member named 'bo' in 'struct anv_image::(anonymous at external/mesa/src/intel/vulkan/anv_private.h:2647:4)'
   image->planes[0].bo = bo;
   ~~~~~~~~~~~~~~~~ ^
1 error generated.

Fixes: bf34ef16ac ("anv: Use an address for each anv_image plane")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-12 11:17:43 +03:00
Mauro Rossi
a1220e7311 anv/android: Set the BO flags in bo_cache_import (v2)
Changes to avoid a build error:

external/mesa/src/intel/vulkan/anv_android.c:131:72:
error: too few arguments to function call, expected 5, have 4
   result = anv_bo_cache_import(device, &device->bo_cache, dma_buf, &bo);
            ~~~~~~~~~~~~~~~~~~~                                        ^
1 error generated.

(v2) Set the correct bo_flags based on support of 48bit addresses and soft-pin

Fixes: b0d50247a7 ("anv/allocator: Set the BO flags in bo_cache_alloc/import")
Fixes: e7d0378bd9 ("anv: Soft-pin client-allocated memory")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-12 11:16:39 +03:00
Kenneth Graunke
0d5329d626 anv: Disable __gen_validate_value if NDEBUG is set.
We were enabling undefined memory checking for genxml values based on
Valgrind being installed at build time, even for release builds.  This
generates piles and piles of assembly whenever you touch genxml.

With gcc 7.3.1 and -O3 and -march=native on a Kabylake with Valgrind
installed at build time:

      text    data    bss     dec    hex filename
   5978385  262884  13488 6254757 5f70a5 libvulkan_intel.so
   3799377  262884  13488 4075749 3e30e5 libvulkan_intel.so

That's a 36% reduction in text size.

Fixes: 047ed02723 (vk/emit: Use valgrind to validate every packed field)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-11 14:55:32 -07:00
Eric Engestrom
06e8771dec README: wording fix for previous commit
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-11 18:34:58 +01:00
Eric Engestrom
d9f54dceca README: add link to WhosWho for IRC nicks
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-11 18:33:12 +01:00
Eric Engestrom
eadc068406 add project README
Now that we're using GitLab, let's take advantage of the "landing page"
README feature with some minimal information, mostly to point people to
the right resources.

Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-11 18:02:35 +01:00
Eric Engestrom
e43c012433 i965: fix resource leak
v2: intel_miptree_release() already takes care of the planes, no need
    to hand-code the loop (Lionel)

Coverity ID: 1436909
Fixes: 3352f2d746 "i965: Create multiple miptrees for planar YUV images"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2018-06-11 14:54:23 +01:00
Rob Clark
55d1a77c29 freedreno/ir3: use pipe_image_view's cpp
At least for PIPE_BUFFER, we could get the resource used as (for
example) R32F imageBuffer.  So using cpp=1 from the rsc is wrong.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
9bb90a3255 freedreno/ir3: fix image dimensions offset
copy-pasta fail from how SSBO sizes are handled.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
e9fc9c16c9 freedreno/a5xx: correct image/ssbo offset
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
132e5b0b34 freedreno/ir3: use saml always if we have lod
In some cases we get plain tex opcodes (but w/ a lod argument).. in this
case always use the saml instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
cf5dda3349 freedreno/ir3: don't cp absneg into meta:fi
If using a fanin (collect) to collect consecutive registers together,
we can CP mov's into the fanin, but not (abs) or (neg).  No places that
allow those modifiers are consuming a fanin anyways.  But this caused an
absneg to be lost between a ldgb and stgb for shaders like:

  outputs[n] = abs(input[n])

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
39e7a39e91 freedreno/ir3: rework size/type conversion instructions
With 8b and 16b, there are a lot more to handle.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
a52e698219 freedreno/ir3: propagate HALF flag across fanout
If we have a fanout (split) meta instruction to split the result of a
vector instruction, propagate the HALF flag back to the original
instruction.  Otherwise result ends up in a full precision register
while instruction(s) that use the result look in a half-precision
register.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
fc1690c9d9 freedreno/a5xx: add sample-id/sample-mask-in
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
619d2317cd freedreno/ir3: add sample-id/sample-mask-in
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
a49c87956e freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
067d89c2cd freedreno/ir3: image atomics use image-store path
image reads are handled via tex state, whereas image writes and atomics
are handled via SSBO state block.  Previously we were only considering
image write, and not image atomics which also uses the SSBO state block.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Kyle Brenneman
41642bdbca egl/glvnd: Fix a segfault in eglGetProcAddress.
If FindProcIndex in egldispatchstubs.c is called with a name that's less than
the first entry in the array, it would end up trying to store an index of -1 in
an unsigned integer, wrap around to 2^32, and then crash when it tries to look
that up.

Change FindProcIndex so that it uses bsearch(3) instead of implementing its own
binary search, like the GLX equivalent FindGLXFunction does.
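
A minimal sketch of a bsearch(3)-based lookup, assuming a sorted table of names. The table contents and the find_proc_index()/compare_name() names are illustrative and are not the generated code in egldispatchstubs.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative table; it must be sorted in strcmp() order. */
static const char *const proc_names[] = {
   "eglBindAPI",
   "eglCreateContext",
   "eglSwapBuffers",
};

/* bsearch() passes the key first and a pointer to the element second. */
static int
compare_name(const void *key, const void *elem)
{
   return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return the index of name in proc_names, or -1 if it is not present.
 * A signed result avoids the wrap-around described above. */
static int
find_proc_index(const char *name)
{
   const size_t count = sizeof(proc_names) / sizeof(proc_names[0]);
   const char *const *hit =
      bsearch(name, proc_names, count, sizeof(proc_names[0]), compare_name);

   assert(name != NULL);
   return hit ? (int)(hit - proc_names) : -1;
}
```

A name that sorts before the first entry, such as "aaa", simply yields -1 here.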

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-11 12:17:07 +01:00
Jordan Justen
e266b32059 mesa/program_binary: add implicit UseProgram after successful ProgramBinary
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106810
Fixes: b4c37ce214 "i965: Add ARB_get_program_binary support using nir_serialization"
Ref: 3fe8d04a6d "mesa: don't always set _NEW_PROGRAM when linking"
Ref: c505d6d852 "mesa: use gl_program for CurrentProgram rather than gl_shader_program"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-10 21:12:46 -07:00
Dave Airlie
525cfe5dab features.txt: update virgl GL4.1 status.
All the features for GL4.1 are done (64-bit attribs were part of
the fp64 enable).

Once tessellation shaders land this will be advertised
2018-06-11 10:49:14 +10:00
Dave Airlie
77d7d7acab virgl: enable ARB_gpu_shader_fp64
This enables ARB_gpu_shader_fp64 if the host provides it.

Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-06-11 08:35:03 +10:00
Samuel Pitoiset
135e4d434f radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold
Workaround for bug in llvm that causes the GPU to hang in presence
of nested loops because there is an exec mask issue. The proper
solution is to fix LLVM but this might require a bunch of work.

This fixes a bunch of GPU hangs that happen with DXVK.

Vega10:
Totals from affected shaders:
SGPRS: 110456 -> 110456 (0.00 %)
VGPRS: 122800 -> 122800 (0.00 %)
Spilled SGPRs: 7478 -> 7478 (0.00 %)
Spilled VGPRs: 36 -> 36 (0.00 %)
Code Size: 9901104 -> 9922928 (0.22 %) bytes
Max Waves: 7143 -> 7143 (0.00 %)

Code size slightly increases because it inserts more branch
instructions but that's expected. I don't see any real performance
changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105613
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-09 14:16:49 +02:00
Samuel Pitoiset
94706f0de4 radv: fix missing ZRANGE_PRECISION(1) for GFX9+
ZRANGE_PRECISION(1) seems to be the default optimal value, but
it was only set for VI and older chips.

This fixes a rendering issue with Banished through DXVK, and
might fix more than that.

There is still the ZRANGE_PRECISION bug that we need to handle
but that can be fixed later.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-09 10:57:01 +02:00
Gustavo Lima Chaves
7dfaf025c5 anv: enable VK_EXT_shader_stencil_export
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-08 11:16:01 -07:00
Gustavo Lima Chaves
7cc5178bba spirv: add/hookup SpvCapabilityStencilExportEXT
v2:
An attempt to support SpvExecutionModeStencilRefReplacingEXT's behavior
also follows, with the interpretation to said mode being we prevent
writes to the built-in FragStencilRefEXT variable when the execution
mode isn't set.

v3:
A more cautious reading of 1db44252d0 led
me to a missing change that would stop (what I later discovered were)
GPU hangs on the CTS test written to exercise this.

v4:
Turn FragStencilRefEXT decoration usage without StencilRefReplacingEXT
mode into a warning, instead of trying to make the variable read-only.
If we are to follow the originating extension on GL, the built-in
variable in question should never be readable anyway.

v5/v6: rebases.

v7:
Fix check for gen9 lost in rebase. (Ilia)
Reduce the scope of the bool used to track whether
SpvExecutionModeStencilRefReplacingEXT was used. Was in shader_info,
moved to vtn_builder. (Jason)

v8:
Assert for fragment shader handling StencilRefReplacingEXT execution
mode. (Caio)
Remove warning logic, since an entry point might not have
StencilRefReplacingEXT execution mode, but the global output variable
might still exist for another entry point in the module. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-08 11:15:37 -07:00
Eric Anholt
22cc83cf87 travis: Add the v3d driver to the automake build.
Hopefully this reduces the number of fixup commits we need for the
automake build.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-08 09:50:38 -07:00
Eric Anholt
3db39d84d2 travis: Do our automake build tests with srcdir != builddir.
This will catch many automake bugs that end-users get to experience first,
otherwise.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-08 09:50:28 -07:00
Eric Engestrom
37eb56d239 autotools/meson: compile against wayland-egl-*backend*
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106861
Fixes: 1db4ec0546 "egl: rewire the build systems to use libwayland-egl"
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Andreas Hartmetz <ahartmetz@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-08 16:45:43 +01:00
Cameron Kumar
cb03803253 vulkan/wsi: Destroy swapchain images after terminating FIFO queues
The queue_manager thread can access the images from x11_present_to_x11,
hence this reorder prevents dereferencing of dangling pointers.

Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Fixes: e73d136a02 ("vulkan/wsi/x11: Implement FIFO mode.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-08 14:06:46 +01:00
Sonny Jiang
ce64c1b70a radeonsi: emit_dpbb_state packets optimization
Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets
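
The idea can be sketched as a small cache of last-written values. This is a sketch under assumptions, not the radeonsi code: struct tracked_regs, NUM_TRACKED_REGS and set_context_reg_opt() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_TRACKED_REGS 4 /* hypothetical number of tracked registers */

struct tracked_regs {
   uint32_t saved_mask;              /* bit i set: value[i] is valid */
   uint32_t value[NUM_TRACKED_REGS]; /* last value written per register */
};

/* Return true if a packet has to be emitted for this register write,
 * false if the register already holds the value. */
static bool
set_context_reg_opt(struct tracked_regs *t, unsigned reg, uint32_t value)
{
   assert(reg < NUM_TRACKED_REGS);
   uint32_t bit = UINT32_C(1) << reg;

   if ((t->saved_mask & bit) && t->value[reg] == value)
      return false; /* redundant write, skip the packet */

   t->saved_mask |= bit;
   t->value[reg] = value;
   return true; /* the caller would emit the packet here */
}
```

Clearing saved_mask forces every register to be emitted again, which is what a driver would presumably want whenever the hardware state becomes unknown.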

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:40 -04:00
Sonny Jiang
7dcfa1f46e radeonsi: emit_clip_state packets optimization
Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:36 -04:00
Sonny Jiang
06b47005d3 radeonsi: emit_msaa_sample_locs packets optimization
Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:36 -04:00
Sonny Jiang
a1b4b00ce2 radeonsi: emit_msaa_config packets optimization
Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:36 -04:00
Sonny Jiang
2bad413f55 radeonsi: emit_cb_render_state packets optimization
Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:25 -04:00
Sonny Jiang
43b0269ce3 radeonsi: emit_db_render_state packets optimization
Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:25 -04:00
Jan Vesely
d797f1f47e drisw: Fix invalid pointer arithmetic
Use of void * in pointer arithmetic is illegal, use char * instead.
Fixes: cf54bd5e83 ("drisw: use shared memory when possible")
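
A minimal sketch of the char * form, using a hypothetical byte_offset() helper that is not part of drisw:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance a generic pointer by a byte offset. */
static void *
byte_offset(void *base, size_t bytes)
{
   assert(base != NULL);
   /* ISO C does not define arithmetic on void *; GCC only accepts it as
    * an extension. char has size 1, so the cast gives byte offsets. */
   return (char *)base + bytes;
}
```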

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-06-07 21:01:29 -04:00
Timothy Arceri
03c370d2f1 radeonsi: fix possible truncation on renderer string
Fixes truncation warning in gcc 8.1

Fixes: 8539c9bf31 ("gallium/radeon: add the kernel version into the renderer string")
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-06-08 10:07:55 +10:00
Timothy Arceri
fae3b38770 ac: fix possible truncation of intrinsic name
Fixes the gcc warning:
‘snprintf’ output between 26 and 33 bytes into a destination of size 32
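
For context, snprintf() returns the length the full output would have had, so truncation can also be detected at run time. The sketch below shows that check with a hypothetical build_name() helper; it is not necessarily how this commit resolves the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: format "<prefix>.<suffix>" into buf and report
 * whether the whole string (including the terminating NUL) fit. */
static bool
build_name(char *buf, size_t size, const char *prefix, const char *suffix)
{
   assert(size > 0);
   int len = snprintf(buf, size, "%s.%s", prefix, suffix);

   /* A negative result is an encoding error; len >= size means the
    * output was truncated to size - 1 characters. */
   return len >= 0 && (size_t)len < size;
}
```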

Fixes: d5f7ebda3e ("ac: add LLVM build functions for subgroup instrinsics")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-08 09:24:15 +10:00
Bas Nieuwenhuizen
4fc2d5e141 amd/common: Fix number of coords for getlod.
The LLVM 6 code reduced it to a non-array call. We need to do that
with the new code too.

This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.*array* for radv.

Fixes: a9a7993441 "amd/common: use the dimension-aware image intrinsics on LLVM 7+"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-07 23:59:52 +02:00
Dave Airlie
9be56316cf features: add virgl to the GL features list
This hopefully adds virgl to the correct places and current statuses
of various extensions.

virgl of course relies on two external things
a) host driver that can support the features
b) up to date host virglrenderer library that can support the features.

This list will be maintained as latest (a) + (b) + mesa.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-06-08 07:34:53 +10:00
Matt Turner
a5abb2da74 meson: Add support for read-only text segment on x86
Port of 6dfc5e28f7 (configure.ac: Add support to enable read-only text
segment on x86.) to Meson.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-07 14:16:44 -07:00
Dylan Baker
8f2421d73b meson: work around gentoo applying -m32 to host compiler in cross builds
Gentoo's ebuild system always adds -m32 to the compiler for doing x86_64
-> x86 cross builds, while meson expects it not to do that. This
results in an x86 -> x86 cross build, and assembly gets disabled.

Fixes: 2d62fc0646
       ("meson: disable x86 asm in fewer cases.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-07 11:54:06 -07:00
Jason Ekstrand
e0fa239962 i965/screen: Sanity check that all formats we advertise are useable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
0e7f3febf7 i965/screen: Use RGBA non-sRGB formats for images
Not all of the MESA_FORMAT and ISL_FORMAT helpers we use can properly
handle RGBX formats.  Also, we don't want to make decisions based on
those in the first place because we can't render to RGBA and we use the
non-sRGB version to determine whether or not to allow CCS_E.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
a266934935 i965/screen: Return false for unsupported formats in query_modifiers
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
eeae485149 i965/screen: Refactor query_dma_buf_formats
This reworks it to work like query_dma_buf_modifiers and, in particular,
makes it more flexible so that we can disallow a non-static set of
formats.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
3b54dd87f7 intel/isl: Add bounds-checking assertions for the format_info table
We follow the same convention as isl_format_get_layout in having two
assertions to ensure that only valid formats are passed in.  We also
check against the array size of the table because some valid formats
such as CCS formats may be past the end of the table.  This fixes
some potential out-of-bounds array access even in valid cases.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
778e2881a0 intel/isl: Add bounds-checking assertions in isl_format_get_layout
We add two assertions instead of one because the first assertion that
format != ISL_FORMAT_UNSUPPORTED is more descriptive and checks for a
real but unsupported enumerant while the second ensures that they don't
pass in garbage values.  We also update some other helpers to use
isl_format_get_layout instead of using the table directly so that they
get bounds checking too.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Dylan Baker
c267f46ef2 meson: Clarify why asm cannot be used in cross compile
This makes the reasoning for why a cross compile is not using asm
clearer (hopefully).

v2: - fix typos

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-07 10:40:35 -07:00
Eric Engestrom
f436ae237b docs: talk about Wayland instead of libwayland
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-07 18:06:40 +01:00
Jason Ekstrand
237c5ac4f9 anv: Set fence/semaphore types to NONE in impl_cleanup
There were some places that were calling anv_semaphore_impl_cleanup and
neither deleting the semaphore nor setting the type back to NONE.  Just
set it to NONE in impl_cleanup to avoid these issues.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106643
Fixes: 031f57eba "anv: Add a basic implementation of VK_KHX_external..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 09:46:45 -07:00
Plamena Manolova
3ba16d640e nir: Add global invocation id intrinsic.
Add the missing nir intrinsic for the gl_GlobalInvocationID
compute shader variable.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-07 14:53:12 +01:00
Eric Engestrom
61edad216e travis: bump libwayland to the first version with libwayland-egl
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-07 11:10:11 +01:00
Kenneth Graunke
3ea2d791f3 i965: Require softpin support for Cannonlake and later.
This isn't strictly necessary, but anyone running Cannonlake will
already have Kernel 4.5 or later, so there's no reason to support
the relocation model on Gen10+.

This will let us avoid dealing with them for new features.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-06 19:45:09 -07:00
Kenneth Graunke
a363bb2cd0 i965: Allocate VMA in userspace for full-PPGTT systems.
This patch enables soft-pinning of all buffers, allowing us to skip
relocation processing entirely.  All systems with full PPGTT and > 4GB
of VMA should gain these benefits.  This should be most Gen8+.

Unfortunately, this excludes a few systems:
- Cherryview (only has 32-bit addressing, despite 48-bit pointers)
- Broadwell with a 32-bit kernel
- Anybody running pre-4.5 kernel.

We may enable it for Cherryview in the future, but it would require
some tweaks to the memory zone.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-06 19:45:09 -07:00
Kenneth Graunke
74259b98aa intel/blorp: Emit VF cache invalidates for 48-bit bugs with softpin.
commit 92f01fc5f9 made i965 start emitting
VF cache invalidates when the high bits of vertex buffers change.  But
we were not tracking vertex buffers emitted by BLORP.  This was papered
over by a mistake where I emitted VF cache invalidates all the time,
which Chris fixed in commit 3ac5fbadfd.

This patch adds a new hook which allows the driver to track addresses
and request a VF cache invalidate as appropriate.

v2: Make the driver do the PIPE_CONTROL so it can apply workarounds
    (caught by Jason Ekstrand).  Rebase on anv bug fix.
v3: Don't screw up the boolean (caught by Jason Ekstrand).

Fixes: 92f01fc5f9 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-06 19:45:09 -07:00
Timothy Arceri
2a74296f24 nir: add opt_if_loop_terminator()
This pass detects potential loop terminators and moves instructions
from the non breaking branch after the if-statement.

This enables both the new opt_if_simplification() pass and loop
unrolling to potentially progress further.

Unexpectedly this change sped up shader-db run times by ~3%

Ivy Bridge shader-db results (all changes in dolphin/ubershaders):

total instructions in shared programs: 9995662 -> 9995338 (-0.00%)
instructions in affected programs: 87845 -> 87521 (-0.37%)
helped: 27
HURT: 0

total cycles in shared programs: 230931495 -> 230925015 (-0.00%)
cycles in affected programs: 56391385 -> 56384905 (-0.01%)
helped: 27
HURT: 0

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-07 11:33:04 +10:00
Timothy Arceri
1098bc5e85 nir: move ends_in_break() helper to nir_loop_analyze.h
We will use the helper while simplifying potential loop terminators
in the following patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-07 11:33:04 +10:00
Timothy Arceri
186988e28f radv: fix Coverity no effect control flow issue
swizzle is unsigned so "desc->swizzle[c] < 0" is never true.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-07 10:10:57 +10:00
Jason Ekstrand
44c614843c intel/blorp: Don't vertex fetch directly from clear values
On gen8+, we have to VF cache flush whenever a vertex binding aliases a
previous binding at the same index modulo 4GiB.  We deal with this in
Vulkan by ensuring that vertex buffers and the dynamic state (from which
BLORP pulls its vertex buffers) are in the same 4GiB region of the
address space.  That doesn't work if we're reading clear colors with the
VF unit.  In order to work around this we switch to using MI commands to
copy the clear value into the vertex buffer we allocate for the normal
constant data.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-06 16:32:38 -07:00
Lionel Landwerlin
b28a2510cc dri: add missing 16bits formats mapping
i965 advertises the 16-bit R and RG formats through
eglQueryDmaBufFormatsEXT but falls over when a client tries to use or
asks for more information about such a format because
driImageFormatToGLFormat returns MESA_FORMAT_NONE.

Found by Eero Tamminen.

v2: Add G16R16 formats (Lionel)

v3: Fix G16R16 mapping to mesa format (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106642
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-07 00:09:21 +01:00
Eric Anholt
833c404600 nir: Look into uniform structs for samplers when counting num_textures.
mesa/st decides whether to update samplers after a program change based on
whether num_textures is nonzero.  By not counting samplers in a uniform
struct, we would segfault in
KHR-GLES3.shaders.struct.uniform.sampler_vertex if it was run in the same
context after a non-vertex-shader-uniform testcase (as is the case during
a full conformance run).

v2: Implement using two separate pure functions instead of updating
    pointers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-06 13:46:55 -07:00
Eric Anholt
f69473a712 v3d: Work around GFXH-1461/GFXH-1689 by using CLEAR_TILE_BUFFERS.
This doesn't seem to have done anything to my test results.  However,
given that we've still got a class of GPU hangs, following the workarounds
that the closed driver does so that we get the same command sequences
seems like a good idea.
2018-06-06 13:46:55 -07:00
Eric Anholt
9d5860310d v3d: Enable the new NIR bitfield operation lowering paths.
These together get the GLSL 3.00 unorm/snorm pack functions and
MESA_shader_integer operations working.

v2: Fix commit message typo.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
73953b0713 nir: Add lowering for nir_op_bit_count.
This is basically the same as the GLSL lowering path.

v2: Fix typo in the link

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
7afa26d4e3 nir: Add lowering for nir_op_bitfield_reverse.
This is basically the same as the GLSL lowering path.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
6e1597c2d9 nir: Add an ALU lowering pass for mul_high.
This is based on the glsl/lower_instructions.cpp implementation, but
should be much more readable.
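
One common way to get the high 32 bits of a 32x32 multiply using only 32-bit operations is to split both operands into 16-bit halves. The C sketch below shows that technique; it is an assumption that the pass follows the same outline, and umul_high() is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>

/* High 32 bits of the unsigned 64-bit product a * b, computed with
 * 32-bit arithmetic only. */
static uint32_t
umul_high(uint32_t a, uint32_t b)
{
   uint32_t a_lo = a & 0xffff, a_hi = a >> 16;
   uint32_t b_lo = b & 0xffff, b_hi = b >> 16;

   /* Four 16x16 partial products; each fits in 32 bits. */
   uint32_t lo_lo = a_lo * b_lo;
   uint32_t hi_lo = a_hi * b_lo;
   uint32_t lo_hi = a_lo * b_hi;
   uint32_t hi_hi = a_hi * b_hi;

   /* Sum the parts that land in bits 16..31 of the product; the upper
    * half of this sum is the carry into bit 32. */
   uint32_t cross = (lo_lo >> 16) + (hi_lo & 0xffff) + (lo_hi & 0xffff);

   return hi_hi + (hi_lo >> 16) + (lo_hi >> 16) + (cross >> 16);
}
```

For instance, assert(umul_high(0xffffffffu, 0xffffffffu) == 0xfffffffeu) holds, matching a 64-bit reference multiply.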

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
6a0db5f08f nir: Add lowering for find_lsb.
There is a fairly simple relation to turn this into ufind_msb.
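
One such relation, sketched in C: x & -x isolates the lowest set bit, and the most significant bit of that value is the bit we want. The message does not spell out which relation the pass uses, so treat this as an illustration; ufind_msb() here is a plain reference loop, not NIR code.

```c
#include <assert.h>
#include <stdint.h>

/* Reference ufind_msb: index of the highest set bit, or -1 when v is 0. */
static int
ufind_msb(uint32_t v)
{
   int index = -1;
   for (; v != 0; v >>= 1)
      index++;
   return index;
}

/* find_lsb expressed through ufind_msb. */
static int
find_lsb(uint32_t v)
{
   /* v & -v keeps only the lowest set bit (and stays 0 for v == 0). */
   return ufind_msb(v & (0u - v));
}
```

For instance, assert(find_lsb(12) == 2) holds, and find_lsb(0) yields -1 like the GLSL findLSB() built-in.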

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
d4c7c3c225 nir: Add lowering for ifind_msb to ufind_msb.
ufind_msb is easily expressed in terms of clz, and we can reduce ifind_msb
to that.
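
A sketch of one way to do the reduction, assuming GLSL findMSB() semantics for signed values: negative inputs are inverted first, so the highest 0 bit becomes the highest 1 bit. The function names are illustrative, and ufind_msb() is a reference loop rather than a clz-based version.

```c
#include <assert.h>
#include <stdint.h>

/* Reference ufind_msb: index of the highest set bit, or -1 when v is 0. */
static int
ufind_msb(uint32_t v)
{
   int index = -1;
   for (; v != 0; v >>= 1)
      index++;
   return index;
}

/* ifind_msb expressed through ufind_msb. */
static int
ifind_msb(int32_t v)
{
   uint32_t bits = (uint32_t)v;
   /* For negative values, look for the highest 0 bit instead. */
   return ufind_msb(v < 0 ? ~bits : bits);
}
```

For instance, assert(ifind_msb(-2) == 0) holds, and both 0 and -1 yield -1.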

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
af88acf4c4 nir: Add lowering from ibitfield_extract/ubitfield_extract to shifts.
V3D doesn't have opcodes for ibfe/ubfe, so we need to lower similarly to
glsl/lower_instructions.cpp.
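
The shift-based form can be sketched in C as below. This is an illustration of the general technique, not the NIR lowering itself, and the signed variant assumes two's complement conversion and an arithmetic right shift, which GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `bits` bits starting at `offset`, zero-extended. */
static uint32_t
ubitfield_extract(uint32_t value, unsigned offset, unsigned bits)
{
   assert(offset <= 32 && bits <= 32 - offset);
   if (bits == 0)
      return 0; /* also avoids shifting by 32 below */

   /* Move the field to the top, then back down to bit 0. */
   return (value << (32 - bits - offset)) >> (32 - bits);
}

/* Same, but sign-extended from the top bit of the field. */
static int32_t
ibitfield_extract(uint32_t value, unsigned offset, unsigned bits)
{
   assert(offset <= 32 && bits <= 32 - offset);
   if (bits == 0)
      return 0;

   /* The signed right shift replicates the field's top bit. */
   return (int32_t)(value << (32 - bits - offset)) >> (32 - bits);
}
```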

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
74618ccbca nir: Add lowering for bitfieldInsert without using bfi.
If you don't have HW to do bfi, then lowering bitfieldInsert to bfi makes
things harder than keeping the "bits" argument around.

This still uses bfm, but I've added the obvious lowering of bfm if you
need it.
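
A C sketch of the mask-based form, keeping the "bits" argument around to build the mask. The bfm() definition here is the obvious shift-and-subtract one and is an assumption for illustration, not a copy of the NIR rules.

```c
#include <assert.h>
#include <stdint.h>

/* bfm: a mask of `bits` ones starting at bit `offset`. */
static uint32_t
bfm(unsigned bits, unsigned offset)
{
   assert(offset < 32 && bits <= 32 - offset);
   /* 1 << 32 is undefined in C, so handle the full-width mask apart. */
   uint32_t ones = bits == 32 ? UINT32_MAX : (UINT32_C(1) << bits) - 1;
   return ones << offset;
}

/* Replace `bits` bits of base, starting at `offset`, with the low bits
 * of insert. */
static uint32_t
bitfield_insert(uint32_t base, uint32_t insert, unsigned offset, unsigned bits)
{
   uint32_t mask = bfm(bits, offset);
   return (base & ~mask) | ((insert << offset) & mask);
}
```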

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Engestrom
735b104707 docs: add note about moving to libwayland-egl in 18.2.0
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Andres Gomez <agomez@igalia.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-06 12:12:03 -07:00
Eric Engestrom
b9361c9df0 egl: remove wayland-egl now that we're using libwayland-egl
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Daniel Stone <daniels@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-06 12:12:01 -07:00
Eric Engestrom
1db4ec0546 egl: rewire the build systems to use libwayland-egl
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Daniel Stone <daniels@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-06 12:11:57 -07:00
zhaowei yuan
67f7a16b59 glsl: Take 'double' as reserved after GLSL ES 1.0
GLSL ES 1.0.17 specifies that "double" is a reserved keyword.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106823
Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-05 23:39:25 -07:00
Marek Olšák
17a42062cc r300g/swtcl: make pipe_context uploaders use malloc'd memory as before
Discovered by Roland Scheidegger.

The resource_create code uses GPU memory for PIPE_BIND_CUSTOM, but
malloc'd memory otherwise. Vertex and index buffers should use malloc'd
memory.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-06-05 22:52:08 -04:00
Jason Ekstrand
01ad2067bb intel/eu: Use a struct copy instead of a memcpy
The memcpy had the wrong size and this was causing crashes on 32-bit
builds of the driver.
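
A minimal sketch of why the struct copy is the safer form: the compiler derives the size from the type, so it cannot go stale the way a hand-written memcpy() size can. struct insn_state and push_state() are hypothetical stand-ins, not the intel/eu code.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-instruction state. */
struct insn_state {
   unsigned exec_size;
   unsigned group;
   int predicate;
};

/* Duplicate the top of the state stack into the next slot. */
static void
push_state(struct insn_state *stack, unsigned *depth, unsigned capacity)
{
   assert(*depth + 1 < capacity);

   /* Struct assignment always copies sizeof(struct insn_state) bytes,
    * whatever the field layout is on the target. */
   stack[*depth + 1] = stack[*depth];
   (*depth)++;
}
```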

Fixes: 6a9525bf67 "intel/eu: Switch to a logical state stack"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106830
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-05 15:51:01 -07:00
Philip Rebohle
cc21e96d5f radv: Use correct color format for fast clears
Using the image format is incorrect when the view has a different
format than the image. Instead, the view format needs to be used.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106687
2018-06-05 23:51:03 +02:00
Eric Anholt
2b1b2cbf61 v3d: Be more explicit about include directory from our generated code.
You'd need src/broadcom/cle/ in the -I previously, for srcdir != builddir.
nir was fine at that, but automake didn't have it.

Bugzilla: https://github.com/anholt/mesa/issues/104
2018-06-05 12:44:49 -07:00
Bas Nieuwenhuizen
2a10fd902d radv: Do not hardcode fast clear formats.
except for the odd one out.

This should support many more formats.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-05 20:53:21 +02:00
Scott D Phillips
6fb22114a0 intel/tools: add intel_sanitize_gpu to EXTRA_DIST
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106778
Fixes: cc41603d6d ("intel/tools: new intel_sanitize_gpu tool")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-05 10:32:35 -07:00
Scott D Phillips
08535dd886 util/tests/vma: Fix warning c++11-narrowing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106801
Fixes: 943fecc569 ("util: Add a randomized test for the virtual memory allocator")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-05 10:32:07 -07:00
Scott D Phillips
4b123fb74b util: tests: vma test depends on C++11 support
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106776
Fixes: 943fecc569 ("util: Add a randomized test for the virtual memory allocator")
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-05 10:13:14 -07:00
Michel Dänzer
6b8f3724c8 glx: Fix number of property values to read in glXImportContextEXT
We were trying to read twice as many as the X server sent us, which
upset XCB:

[xcb] Too much data requested from _XRead
[xcb] This is most likely caused by a broken X extension library
[xcb] Aborting, sorry about that.
glx-free-context: ../../src/xcb_io.c:732: _XRead: Assertion `!xcb_xlib_too_much_data_requested' failed.

Fixing this takes 3 GLX piglit tests from crash to pass.

Fixes: 0852162950 "glx: Be more tolerant in glXImportContext (v2)"
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-06-05 18:56:43 +02:00
Eric Engestrom
c765c39ea7 configure: radv depends on mako
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106784
Fixes: 17201a2eb0 "radv: port to using updated anv entrypoint/extension generator."
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-05 16:32:48 +01:00
Eric Engestrom
5bdc38f356 travis: use correct form for array options
I'd like to eventually drop support for the confusing "an array of
a single empty string is meant to be interpreted as an empty array", so
let's start by not using it anymore.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-05 16:31:23 +01:00
Lionel Landwerlin
9aedee64ac anv: intel: add softpin flag on imported BOs
Looks like we forgot to update this bit of the driver for softpin.

Fixes: 4affeba1e9 ("anv: Soft-pin everything else")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-05 14:18:35 +01:00
Eric Engestrom
66c61797ad autotools: add missing android file to package
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106779
Fixes: ff904978a1 "gallium/util: Android backtrace support"
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-05 10:39:04 +01:00
Eric Engestrom
7c4423cce9 meson: fix platforms check for -D egl=true
Fixes: 0ed6a87a10 "meson: fix platforms=[]"
Reported-by: Christoph Haag <haagch@frickel.club>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-05 10:38:57 +01:00
Mathias Fröhlich
1ac4439d62 mesa: Make sure that imm draws are flushed before other draws execute.
The recent patch

    mesa: Remove FLUSH_VERTICES from VAO state changes.

    Pending draw calls on immediate mode or display list calls do
    not depend on changes of the VAO state. So, remove calls to
    FLUSH_VERTICES and flag _NEW_ARRAY as appropriate.

uncovered a problem: non-immediate-mode draw calls only flush
outstanding immediate-mode draws if FLUSH_UPDATE_CURRENT
is set in ctx->Driver.NeedFlush.
In that case, due to the sequence of _mesa_set_draw_vao commands
we could end up with the VAO from the FLUSH_VERTICES call set
into gl_context::Array._DrawVAO when the array draw is executed.
So the change pulls FLUSH_CURRENT out of _mesa_validate_* calls
into the array draw calls being validated.
The change introduces a new macro FLUSH_FOR_DRAW beside FLUSH_VERTICES
and FLUSH_CURRENT that flushes on changed current attributes as well
as on outstanding immediate mode draw calls. Use FLUSH_FOR_DRAW
in the non immediate mode draw code paths.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106594
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-06-05 07:05:24 +02:00
gurchetansingh@chromium.org
a7b74a77fa virgl: use bits in caps set v2
Let's add another field to caps v2, that can help report boolean
values.

Suggested-by: Gert Wollny <gert.wollny@collabora.com>
Suggested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-05 14:29:00 +10:00
gurchetansingh@chromium.org
6ce94a50bb virgl: add shader offset alignment to v2 caps struct
This is the SSBO analogue to fe0647. User supplied data must
be a multiple of GL_SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT.
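
The alignment rule can be sketched as a pair of helpers. The function names are hypothetical, and the value 32 used in the usage note is simply the default mentioned in this message:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when offset is a multiple of alignment. */
bool offset_is_aligned(uint64_t offset, uint32_t alignment)
{
   assert(alignment != 0);
   return offset % alignment == 0;
}

/* Round offset up to the next multiple of alignment. */
uint64_t align_offset_up(uint64_t offset, uint32_t alignment)
{
   assert(alignment != 0);
   return (offset + alignment - 1) / alignment * alignment;
}
```

For an alignment of 32, an offset of 48 would be rejected by offset_is_aligned() and rounded to 64 by align_offset_up().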

This fixes 44 GLES31 tests on airlied@'s GLES31 sketch branches with
Nvidia hardware, but this patch can be applied standalone to master. The
alignment restriction on Nvidia is 32, hence the default value.

Example tests:
   dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.0
   dEQP-GLES31.functional.ssbo.layout.multi_basic_types.single_buffer.std430

v2: Move to a better place in case statement
v3: Rebase

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-05 14:28:49 +10:00
Kenneth Graunke
1c9053d076 i965: Prepare batchbuffer module for softpin support.
If EXEC_OBJECT_PINNED is set, we don't want to emit any relocations.
We simply want to add the BO to the validation list, and possibly mark
it as writeable.  The new brw_use_pinned_bo() interface does just that.

To avoid having to make every caller consider both the relocation and
softpin cases, we make emit_reloc() call brw_use_pinned_bo() when given
a softpinned buffer.

We also can't grow buffers that are softpinned - the mechanism places a
larger BO at the same offset as the original, which requires moving BOs
around in the VMA.  With softpin, we only allocate enough VMA for the
original size of the BO.

v2: Assert that BOs aren't pinned if the kernel says we should move them
    (feedback from Chris Wilson)

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-04 18:38:41 -07:00
Kenneth Graunke
01058a5522 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
This introduces a new fast virtual memory allocator integrated with our
BO cache bucketing.  For larger objects, it falls back to the simple
free-list allocator (util_vma).

This puts the allocators in place but doesn't enable softpin yet.

v2:
 (feedback from Chris Wilson)
 - Check (bo->kflags & EXEC_OBJECT_PINNED) instead of a global flag
 - Avoid vma_free(0ull) on the err_free path.
 - Only enable if the kernel says we have full PPGTT support
 - Make bucketing allocators more resistant to failing to grow arrays
 (feedback from Scott Phillips)
 - Don't use node after popping it from the list.
 - Avoid undefined behavior in canonicalization by reusing new helper
 - Comment updates
 (feedback from myself)
 - Avoid __vma_alloc vs. vma_alloc by making a zero_high_bits helper
   to return a non-canonical address with the high bits zeroed.
 - Don't shadow loop variable 'i' when destroying things (ugly; worked)
v3:
 - Replace zero_high_bits with new common gen_48b_address helper.
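
The difference between a canonical address and one with the high bits zeroed can be sketched as follows. This is an illustration under the assumption of 48-bit virtual addresses; the function names are hypothetical and are not the helpers named above:

```c
#include <assert.h>
#include <stdint.h>

#define LOW48_MASK ((UINT64_C(1) << 48) - 1)

/* Canonical form: copy bit 47 into bits 48..63.  This uses only
 * unsigned arithmetic, so it does not rely on shifting signed values. */
uint64_t canonical_address_48(uint64_t addr)
{
   const uint64_t sign = UINT64_C(1) << 47;
   const uint64_t low48 = addr & LOW48_MASK;
   return (low48 ^ sign) - sign;
}

/* Non-canonical form: keep the low 48 bits and clear the rest. */
uint64_t zero_high_bits_48(uint64_t addr)
{
   const uint64_t result = addr & LOW48_MASK;
   assert((result >> 48) == 0);
   return result;
}
```

The two functions are inverses of each other for any address whose upper 16 bits already match bit 47.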

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-04 18:38:41 -07:00
Jason Ekstrand
e99b32d4d6 i965: Disable internal CCS for shadows of multi-sampled windows
If the window system supports Y-tiling but not CCS_E, we currently create an
internal CCS for any window system buffers and then resolve right before
handing it off to X or Wayland.  In the case of the single-sampled
shadow of a multi-sampled window system buffer, this is pointless
because the only thing we do with it is use it as a MSAA resolve target
so we do MSAA resolve -> CCS resolve -> hand to the window system.
Instead, just disable CCS for the shadow and then the MSAA resolve will
write uncompressed directly into it.  If the window system supports
CCS_E, we will still use CCS_E, we just won't do internal CCS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 15:27:29 -07:00
Jason Ekstrand
6ab9fe7673 i965/miptree: Rename a parameter to create_for_dri_image
Instead of having it be a general "is this a winsys image" boolean, make
it more specific to the actual purpose.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 15:27:16 -07:00
Jason Ekstrand
6a9525bf67 intel/eu: Switch to a logical state stack
Instead of the state stack that's based on copying a dummy instruction
around, we start using a logical stack of brw_insn_states.  This uses a
bit less memory and is way less conceptually bogus.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
db9675f5a4 intel/eu: Set flag [sub]register number differently for 3src
Prior to gen8, the flag [sub]register number is in a different spot on
3src instructions than on other instructions.  Starting with Broadwell,
they made it consistent.  This commit fixes bugs that occur when a
conditional modifier gets propagated into a 3src instruction such as a
MAD.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
2d20303e18 intel/eu: Copy fields manually in brw_next_insn
Instead of doing a memcpy, this moves us to start with a blank
instruction (memset to zero) and copy the fields over one at a time.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
381fac2740 intel/eu: Add some brw_get_default_ helpers
This is much cleaner than everything that wants a default value poking
at the bits of p->current directly.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jose Fonseca
db38c3b4ba trace: Fix parsing of recent traces.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-06-04 21:06:31 +01:00
Jose Fonseca
8652ff7cdf trace: Fix trace_context_transfer_unmap methods.
The emitted buffer_subdata/texture_subdata call didn't match the
respective signatures.

v2: Actually emit buffer_subdata call.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-06-04 21:06:31 +01:00
Nicolai Hähnle
a9a7993441 amd/common: use the dimension-aware image intrinsics on LLVM 7+
Requires LLVM trunk r329166.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-06-04 21:34:59 +02:00
Kenneth Graunke
b3ba47c592 i965: Fix batch-last mode to properly swap BOs.
On pre-4.13 kernels, which don't support I915_EXEC_BATCH_FIRST, we move
the validation list entry to the end...but incorrectly left the exec_bo
array alone, causing a mismatch where exec_bos[0] no longer corresponded
with validation_list[0] (and similarly for the last entry).

One example of resulting breakage is that we'd update bo->gtt_offset
based on the wrong buffer.  This wreaked total havoc when trying to use
softpin, and likely caused unnecessary relocations in the normal case.
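
A minimal sketch of the invariant involved: two parallel arrays must be permuted together so that index i names the same buffer in both. The struct and function names are hypothetical and much simpler than the driver's real data structures:

```c
#include <assert.h>
#include <stdint.h>

struct bo { uint64_t gtt_offset; };                       /* hypothetical buffer object */
struct exec_entry { uint32_t handle; uint64_t offset; };  /* hypothetical validation entry */

/* Swap the first and last slots of BOTH arrays so they stay in step. */
void swap_first_and_last(struct bo **exec_bos,
                         struct exec_entry *validation_list, int count)
{
   assert(count > 0);
   const int last = count - 1;

   struct exec_entry tmp_entry = validation_list[0];
   validation_list[0] = validation_list[last];
   validation_list[last] = tmp_entry;

   struct bo *tmp_bo = exec_bos[0];
   exec_bos[0] = exec_bos[last];
   exec_bos[last] = tmp_bo;
}

/* Copy the offsets reported for each entry back to the matching BO.
 * This is only correct if the two arrays are still index-aligned. */
void update_offsets(struct bo **exec_bos,
                    const struct exec_entry *validation_list, int count)
{
   for (int i = 0; i < count; i++)
      exec_bos[i]->gtt_offset = validation_list[i].offset;
}
```

If only validation_list were swapped, update_offsets() would write the first entry's offset into the wrong buffer, which is the kind of mismatch described above.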

Fixes: 29ba502a4e (i965: Use I915_EXEC_BATCH_FIRST when available.)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-04 09:43:09 -07:00
Samuel Pitoiset
06d3c65098 radv: fix a GPU hang when MRTs are sparse
When the i-th target format is set, all previous target formats
must be non-zero to avoid hangs. In other words, without this
if a fragment shader exports mrt0, mrt2 and mrt3, the GPU hangs
because the target format of mrt1 is zero.
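
The constraint can be sketched as a pass that fills any hole below the highest enabled target. The constants and the placeholder format are assumptions made for illustration; the message does not say which format the actual fix uses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TARGETS 8
#define FORMAT_NONE 0u
#define FORMAT_PLACEHOLDER 1u   /* hypothetical non-zero filler format */

/* Give every unset target below the highest enabled one a non-zero
 * format.  Returns the number of holes that were filled. */
int fill_sparse_target_formats(uint32_t formats[MAX_TARGETS])
{
   assert(formats != NULL);

   int highest = -1;
   for (int i = 0; i < MAX_TARGETS; i++) {
      if (formats[i] != FORMAT_NONE)
         highest = i;
   }

   int filled = 0;
   for (int i = 0; i < highest; i++) {
      if (formats[i] == FORMAT_NONE) {
         formats[i] = FORMAT_PLACEHOLDER;
         filled++;
      }
   }
   return filled;
}
```

For a shader exporting mrt0, mrt2 and mrt3, only the mrt1 slot is touched; slots above mrt3 stay zero.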

This fixes DXVK GPU hangs with "Seven: The Days Long Gone",
"GTA V" and probably more games.

Cc: "18.0" "18.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-04 14:01:33 +02:00
Bas Nieuwenhuizen
2835b6baf4 radv: Don't pass a TESS_EVAL shader when tessellation is not enabled.
Otherwise on pre-GFX9, if the constant layout allows both TESS_EVAL and
GEOMETRY shaders, but the PIPELINE has only GEOMETRY, it would return the
GEOMETRY shader for the TESS_EVAL shader.

This would cause the flush_constants code to emit the GEOMETRY constants
to the TESS_EVAL registers and then conclude that it did not need to set
the GEOMETRY shader registers.

Fixes: dfff9fb6f8 "radv: Handle GFX9 merged shaders in radv_flush_constants()"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-04 13:46:24 +02:00
Samuel Pitoiset
e3e929f8c3 nir: implement the GLSL equivalent of if simplication in nir_opt_if
This pass turns:

   if (cond) {
   } else {
      do_work();
   }

into:

   if (!cond) {
      do_work();
   } else {
   }

Here's the vkpipeline-db stats (from affected shaders) on Polaris10:

Totals from affected shaders:
SGPRS: 17272 -> 17296 (0.14 %)
VGPRS: 18712 -> 18740 (0.15 %)
Spilled SGPRs: 1179 -> 1142 (-3.14 %)
Code Size: 1503364 -> 1515176 (0.79 %) bytes
Max Waves: 916 -> 911 (-0.55 %)

This pass only affects Serious Sam 2017 (Vulkan) on my side. The
stats are not really good for now. Some shaders look quite dumb
but this will be improved with further NIR passes, like ifs
combination.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-04 12:41:10 +02:00
Samuel Pitoiset
e44f90eccf nir: make is_comparison() a non-static helper function
Rename and change the prototype for consistency regarding
nir_tex_instr_is_query(). This function will be used in the
following patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-04 12:41:08 +02:00
Dave Airlie
67eccd6aa2 nir: use num_components wrappers in print/validate.
These wrappers were introduced, so start using them.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-04 05:58:42 +10:00
Juan A. Suarez Romero
bad7332f7c doc: update calendar, add news and link release notes for 18.0.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-06-03 10:19:32 +00:00
Juan A. Suarez Romero
41c01d79ee docs: add sha256 checksums for 18.0.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit aba161e63a)
2018-06-03 10:12:02 +00:00
Juan A. Suarez Romero
a89cb6711b docs: add release notes for 18.0.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ca0037aaef)
2018-06-03 10:12:00 +00:00
Jose Fonseca
8841c2cda5 scons: Fix MinGW cross compilation with LLVM 5.0.
LLVM 5.0 requires additional Win32 libraries, and MinGW with pthreads.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-06-02 09:58:50 +01:00
Jason Ekstrand
64e619674e anv: Don't even bother processing relocs if we have softpin
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 16:34:26 -07:00
Jason Ekstrand
c7be17c8d3 anv: Refactor reloc handling in execbuf_add_bo
This just separates the reloc list vs. BO set cases and lets us avoid an
allocation if relocs->deps->entries == 0.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 16:34:25 -07:00
Jason Ekstrand
7105b7890a anv: Assert that the kernel leaves pinned BO addresses alone
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 16:33:07 -07:00
Scott D Phillips
4affeba1e9 anv: Soft-pin everything else
v2 (Jason Ekstrand):
 - Break up Scott's mega-patch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:13 -07:00
Scott D Phillips
f3dbe0419d anv: Soft-pin batch buffers
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:12 -07:00
Jason Ekstrand
a0b133286a anv/batch_chain: Simplify secondary batch return chaining
Previously, we did this weird thing where we left space and an empty
relocation for use in a hypothetical MI_BATCH_BUFFER_START that would be
added to the secondary later.  Then, when it came time to chain it into
the primary, we would back that out and emit an MI_BATCH_BUFFER_START.
This worked well but it was always a bit hacky, fragile and ugly.  This
commit instead adds a helper for rewriting the MI_BATCH_BUFFER_START at
the end of an anv_batch_bo and we use that helper for both batch bo list
cloning and handling returns from secondaries.  The new helper doesn't
actually modify the batch in any way but instead just adjusts the
relocation as needed.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:12 -07:00
Jason Ekstrand
4f20c665b4 anv/batch_chain: Call batch_bo_finish at the end of end_batch_buffer
The only reason we were calling it in the middle was that one of the
cases for figuring out the secondary command buffer execution type
wanted batch_bo->length which gets set by batch_bo_finish.  It's easy
enough to recalculate and now batch_bo_finish is called in a sensible
location.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:11 -07:00
Jason Ekstrand
e7d0378bd9 anv: Soft-pin client-allocated memory
Now that we've done all that refactoring, addresses are now being
directly written into surface states by ISL and BLORP whenever a BO is
pinned so there's really nothing to do besides enable it.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:11 -07:00
Jason Ekstrand
caf41c78ca anv/allocator: Support softpin in the BO cache
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:11 -07:00
Jason Ekstrand
b0d50247a7 anv/allocator: Set the BO flags in bo_cache_alloc/import
It's safer to set them there because we have the opportunity to properly
handle combining flags if a BO is imported more than once.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:10 -07:00
Scott D Phillips
27cc68d9e9 anv: For pinned BOs, skip relocations, but track bo usage
References to pinned BOs won't need to be relocated at a later
point, so just write the final value of the reference into the bo
directly.

Add a `set` to the relocation lists for tracking dependencies that
were previously tracked by relocations. When a batch is executed, we
add the referenced pinned BOs to the exec list.

v2: - visit bos from the dependency set in a deterministic order (Jason)
v3: - compar => compare, drat (Jason)
    - Reworded commit message, provided by (Jordan)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-01 14:27:10 -07:00
Scott D Phillips
c7db0ed4e9 anv: Use a separate pool for binding tables when soft pinning
Soft pinning lets us satisfy the binding table address
requirements without using both sides of a growing state_pool.

If you do use both sides of a state pool, then you need to read
the state pool's center_bo_offset (with the device mutex held) to
know the final offset of relocations that target the state pool
bo.

By having a separate pool for binding tables that only grows in
the forward direction, the center_bo_offset is always 0 and
relocations don't need an update pass to adjust relocations with
the mutex held.

v2: - don't introduce a separate state flag for separate binding tables (Jason)
    - replace bo and map accessors with a single binding_table_pool accessor (Jason)
v3: - assert bt_block->offset >= 0 for the separate binding table (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-01 14:27:10 -07:00
Scott D Phillips
e662bdb820 anv: Soft-pin state pools
The state_pools reserve virtual address space of the full
BLOCK_POOL_MEMFD_SIZE, but maintain the current behavior of
growing from the middle.

v2: - rename block_pool::offset to block_pool::start_address (Jason)
    - assign state pool start_address statically (Jason)
v3: - remove unnecessary bo_flags tampering for the dynamic pool (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-01 13:49:22 -07:00
Ian Romanick
f00fcfb7a2 nir: Lower !f2b(x) to x == 0.0
Some trivial help now, but it also prevents ~40 regressions caused by
Samuel's "nir: implement the GLSL equivalent of if simplication in
nir_opt_if" patch.
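
A small check of why the lowering is safe, assuming f2b(x) means "x != 0.0" as a float-to-bool conversion. The helper names are hypothetical, and this is plain C rather than NIR:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed meaning of f2b: any value other than +/-0.0 converts to true. */
static bool f2b(float x) { return x != 0.0f; }

bool original_form(float x) { return !f2b(x); }
bool lowered_form(float x)  { return x == 0.0f; }

/* Compare both forms on a few representative inputs, including the
 * signed zeros, an infinity and a NaN.  Returns the sample count. */
int check_lowering(void)
{
   const float samples[] = { 0.0f, -0.0f, 1.5f, -2.0f, INFINITY, NAN };
   const size_t n = sizeof(samples) / sizeof(samples[0]);

   for (size_t i = 0; i < n; i++)
      assert(original_form(samples[i]) == lowered_form(samples[i]));
   return (int)n;
}
```

The NaN case is the one worth looking at: NaN != 0.0 is true, so !f2b(NaN) is false, and NaN == 0.0 is also false, so the two forms still agree.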

All Gen4+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14369557 -> 14369555 (<.01%)
instructions in affected programs: 442 -> 440 (-0.45%)
helped: 2
HURT: 0

total cycles in shared programs: 532425772 -> 532425743 (<.01%)
cycles in affected programs: 6086 -> 6057 (-0.48%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-06-01 10:14:53 -07:00
Ian Romanick
619c51722b nir: Add some missing "optimization undo" patterns
d8d18516b0 and 03fb13f646 added some patterns to undo conversions like

   (('ior', ('flt', a, b), ('flt', a, c)), ('flt', a, ('fmax', b, c)))

If further optimizations cause some of the operands to either be the same
or be constants, undoing the transformation can lead to further savings.
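
The two sides of the quoted pattern can be compared directly in C. This is a sketch with hypothetical helper names; fmax_nan_aware() is a hand-written stand-in that returns the non-NaN operand when exactly one input is NaN, which is an assumption about the fmax semantics in play:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for fmax: prefers the operand that is not NaN. */
static float fmax_nan_aware(float b, float c)
{
   if (b != b) return c;
   if (c != c) return b;
   return b > c ? b : c;
}

/* ('flt', a, ('fmax', b, c)) */
bool combined_form(float a, float b, float c)
{
   return a < fmax_nan_aware(b, c);
}

/* ('ior', ('flt', a, b), ('flt', a, c)) */
bool split_form(float a, float b, float c)
{
   return (a < b) || (a < c);
}

/* Exhaustively compare both forms over a small set of values.
 * Returns the number of (a, b, c) combinations checked. */
int check_forms_agree(void)
{
   const float v[] = { -1.0f, 0.0f, 2.0f, INFINITY, NAN };
   const size_t n = sizeof(v) / sizeof(v[0]);
   int checked = 0;

   for (size_t i = 0; i < n; i++)
      for (size_t j = 0; j < n; j++)
         for (size_t k = 0; k < n; k++) {
            assert(combined_form(v[i], v[j], v[k]) ==
                   split_form(v[i], v[j], v[k]));
            checked++;
         }
   return checked;
}
```

When b and c turn out to be the same value, split_form() collapses to a single comparison with no fmax at all, which illustrates the kind of saving that undoing the transformation can expose.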

I don't know why these patterns were not added in those patches.  I did
not check to see which specific patterns actually helped.  I just added
all of them for symmetry.  This prevents some loop unrolling regressions
in Plane Shift caused by Samuel's "nir: implement the GLSL equivalent of if
simplication in nir_opt_if" patch.

Skylake and Broadwell had similar results. (Skylake shown)
total instructions in shared programs: 14369768 -> 14369557 (<.01%)
instructions in affected programs: 44076 -> 43865 (-0.48%)
helped: 141
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.50 x̃: 1
helped stats (rel) min: 0.07% max: 1.52% x̄: 0.66% x̃: 0.60%
95% mean confidence interval for instructions value: -1.67 -1.32
95% mean confidence interval for instructions %-change: -0.72% -0.59%
Instructions are helped.

total cycles in shared programs: 532430629 -> 532425772 (<.01%)
cycles in affected programs: 1170832 -> 1165975 (-0.41%)
helped: 101
HURT: 5
helped stats (abs) min: 1 max: 160 x̄: 48.54 x̃: 32
helped stats (rel) min: <.01% max: 8.49% x̄: 2.76% x̃: 2.03%
HURT stats (abs)   min: 2 max: 22 x̄: 9.20 x̃: 4
HURT stats (rel)   min: <.01% max: 0.05% x̄: 0.02% x̃: <.01%
95% mean confidence interval for cycles value: -53.64 -38.00
95% mean confidence interval for cycles %-change: -3.06% -2.20%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-01 10:13:16 -07:00
Eric Engestrom
57fbc2ac50 docs/meson: mention how to use array options
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
03a2e7b662 meson: drop unused empty string array element
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
0ed6a87a10 meson: fix platforms=[]
Fixes: 5608d0a2ce ("meson: use array type options")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
a92cdcd598 meson: fix vulkan-drivers=[]
Fixes: 5608d0a2ce ("meson: use array type options")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
a425db4d7d meson: fix gallium-drivers=[]
Fixes: 5608d0a2ce ("meson: use array type options")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
393abd6a57 meson: fix dri-drivers=[]
Fixes: 5608d0a2ce ("meson: use array type options")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
8faa22c146 REVIEWERS: add root meson.build to the Meson reviewers group
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Juan A. Suarez Romero
cbe4baed1f glsl: Add ir_binop_vector_extract in NIR
Implement ir_binop_vector_extract using NIR operations. Based on the
SPIR-V to NIR approach.

This fixes:
dEQP-GLES3.functional.shaders.indexing.moredynamic.with_value_from_indexing_expression_fragment
Piglit's glsl-fs-vec4-indexing-8.shader_test

CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral <itoral@igalia.com>
2018-06-01 18:09:22 +02:00
Dylan Baker
4ad8e2ac82 doc: update calendar, add news and link release notes for 18.1.1
2018-06-01 08:39:17 -07:00
Dylan Baker
55ee53ea19 docs/relnotes: Add sha256 sums for mesa 18.1.1
2018-06-01 08:39:17 -07:00
Dylan Baker
423c4fe954 docs: Add release notes for 18.1.1
2018-06-01 08:39:17 -07:00
Plamena Manolova
939312702e i965: Add ARB_fragment_shader_interlock support.
Adds support for ARB_fragment_shader_interlock. We achieve
the interlock and fragment ordering by issuing a memory fence
via sendc.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-06-01 16:36:39 +01:00
Plamena Manolova
60e843c4d5 mesa: Add GL/GLSL plumbing for ARB_fragment_shader_interlock.
This extension provides new GLSL built-in functions
beginInvocationInterlockARB() and endInvocationInterlockARB()
that delimit a critical section of fragment shader code. For
pairs of shader invocations with "overlapping" coverage in a
given pixel, the OpenGL implementation will guarantee that the
critical section of the fragment shader will be executed for
only one fragment at a time.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-06-01 16:36:36 +01:00
Martin Pelikán
53719f818c compiler/spirv: reject invalid shader code properly
After bebe3d626e, b->fail_jump is prepared after vtn_create_builder
which can longjmp(3) to it through its vtn_assert()s.  This corrupts
the stack and creates confusing core dumps, so we need to avoid it.
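
The ordering requirement can be sketched in isolation: the jump target has to be armed with setjmp() before any code that may longjmp() to it. The names below are hypothetical and the validation is reduced to a header check; this is not the actual patch:

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

struct builder {
   jmp_buf fail_jump;
   bool fail_jump_ready;
};

static void builder_fail(struct builder *b)
{
   /* Jumping to a jmp_buf that was never set up is undefined behavior. */
   assert(b->fail_jump_ready);
   longjmp(b->fail_jump, 1);
}

static void validate_header(struct builder *b, const unsigned *words,
                            size_t count)
{
   /* A SPIR-V module starts with a 5-word header and a magic number. */
   if (count < 5 || words[0] != 0x07230203u)
      builder_fail(b);
}

/* Returns true if the module header looks valid, false otherwise. */
bool parse_module(const unsigned *words, size_t count)
{
   struct builder b = { .fail_jump_ready = false };

   /* Arm the failure path first ... */
   if (setjmp(b.fail_jump))
      return false;
   b.fail_jump_ready = true;

   /* ... and only then run code that is allowed to fail. */
   validate_header(&b, words, count);
   return true;
}
```

Any check that has to run before setjmp() would need to report failure through an ordinary return value instead of the jump.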

While there, I decided to print the offending values for debuggability.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-01 08:09:35 -07:00
Juan A. Suarez Romero
360bfb619f docs: change release manager for 18.1
Dylan will replace Emil as the release manager for 18.1.x series.

CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-01 15:24:02 +02:00
Gert Wollny
ef3a6e3d98 virgl: Always assume that ORIGIN_UPPER_LEFT and PIXEL_CENTER* are supported
The driver must support at least one of

  PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT
  PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT

and one of

  PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER
  PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER

otherwise glsl_to_tgsi will fire an assert.

ORIGIN_UPPER_LEFT is the default convention, and is supported by
all mesa drivers, hence it seems reasonable to always report the caps
to be enabled.  On gles ORIGIN_LOWER_LEFT is generally not supported,
so we rely on the caps reported by the host that depend on whether we
run on a GL or an EGL host.

For PIXEL_CENTER, what is supported depends entirely on the host driver,
and since we do not report the actual host driver capabilities it is
best to mark both as supported; this is how it works for a GL host
too.

Fixes:
   dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_xyz
   dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_1
   dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_2

Reviewed-by: Gurchetan Singh <gurcetansingh@chromium.org>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-06-01 12:04:21 +01:00
Alex Smith
01a2414045 radeonsi: Fix crash on shaders using MSAA image load/store
The value returned by tgsi_util_get_texture_coord_dim() does not
account for the sample index. This means image_fetch_coords() will not
fetch it, leading to a null deref in ac_build_image_opcode() which
expects it to be present (the return value of ac_num_coords() *does*
include the sample index).

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-01 08:53:38 +01:00
Alex Smith
dfff9fb6f8 radv: Handle GFX9 merged shaders in radv_flush_constants()
This was not previously handled correctly. For example,
push_constant_stages might only contain MESA_SHADER_VERTEX because
only that stage was changed by CmdPushConstants or
CmdBindDescriptorSets.

In that case, if vertex has been merged with tess control, then the
push constant address wouldn't be updated since
pipeline->shaders[MESA_SHADER_VERTEX] would be NULL.

Use radv_get_shader() instead of getting the shader directly so that
we get the right shader if merged. Also, skip emitting the address
redundantly - if two merged stages are set in push_constant_stages
this change would have made the address get emitted twice.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-01 08:53:34 +01:00
Alex Smith
7ca0167ae9 radv: Consolidate GFX9 merged shader lookup logic
This was being handled in a few different places, consolidate it into a
single radv_get_shader() function.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-01 08:53:31 +01:00
Alex Smith
0fa51bfdbe radv: Set active_stages the same whether or not shaders were cached
With GFX9 merged shaders, active_stages would be set to the original
stages specified if shaders were not cached, but to the stages still
present after merging if they were.

Be consistent and use the original stages.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-01 08:53:01 +01:00
Marek Olšák
9e61147ef6 st/mesa: relax requirements for ARB_ES3_compatibility
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106748

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-01 01:04:17 -04:00
Scott D Phillips
29a139b308 anv/blorp: Write relocated values into surface states
v2 (Jason Ekstrand):
 - Split the blorp bit into its own patch and re-order a bit
 - Use anv_address helpers

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-31 16:51:47 -07:00
Jason Ekstrand
bf34ef16ac anv: Use an address for each anv_image plane
This is better than having BO and offset fields.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
1f2328c3b7 anv/cmd_buffer: Rework surface relocation helpers
This commit renames add_surface_state_reloc to add_surface_reloc and
makes it take an address.  We also rename add_image_view_relocs to
add_surface_state_relocs because it takes an anv_surface_state and
doesn't really care about the image view anymore.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
f270a09737 anv: Use an anv_address in anv_buffer
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
8a8bd39d5e anv/cmd_buffer: Use anv_address for handling indirect parameters
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
1029458ee3 anv: Use an anv_address in anv_buffer_view
Instead of storing a BO and offset separately, use an anv_address.  This
changes anv_fill_buffer_surface_state to use anv_address and we now call
anv_address_physical and pass that into ISL.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
de1c5c1b50 anv: Use full anv_addresses in anv_surface_state
This refactors surface state filling to work entirely in terms of
anv_addresses instead of offsets.  This should make things simpler for
when we go to soft-pin image buffers.  Among other things,
add_image_view_relocs now only cares about the addresses in the surface
state and doesn't really need the image view anymore.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
94081ffc80 anv: Add some anv_address helpers
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Scott D Phillips
aaea46242d anv: Add vma_heap allocators in anv_device
These will be used to assign virtual addresses to soft pinned
buffers in a later patch.

Two allocators are added for separate 'low' and 'high' virtual
memory areas. Another alternative would have been to add a
double-sided allocator, which wasn't done here just because it
didn't appear to give any code complexity advantages.

v2 (Scott Phillips):
 - rename has_exec_softpin to use_softpin (Jason)
 - Only remove bottom one page and top 4 GiB from virt (Jason)
 - refer to comment in anv_allocator about state address + size
   overflowing 48 bits (Jason)
 - Mention hi/lo allocators vs double-sided allocator in
   commit message (Chris)
 - assign state pool memory ranges statically (Jason)

v3 (Jason Ekstrand):
 - Use (LOW|HIGH)_HEAP_(MIN|MAX)_ADDRESS rather than (1 << 31) for
   determining which heap to use in anv_vma_free
 - Only return de-canonicalized addresses to the heap

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
6e4672f881 intel/common: Add an address de-canonicalization helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:45 -07:00
Scott D Phillips
943fecc569 util: Add a randomized test for the virtual memory allocator
The test pseudo-randomly makes allocations and deallocations with
the virtual memory allocator and checks that the results are
consistent. Specifically, we test that:

 * no result from the allocator overlaps an already allocated range
 * allocated memory fulfills the stated alignment requirement
 * a failed result from the allocator could not have been fulfilled
 * memory freed to the allocator can later be allocated again

v2: - fix if() in test() to actually run fill()
v3: - add c++11 build flag (Jason)
    - test the full 64-bit range (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-31 16:51:35 -07:00
Jason Ekstrand
f19ad5d31f util: Add a virtual memory allocator
This is a simple linear-walk first-fit allocator roughly based on the
allocator in the radeon winsys code.  This allocator has two primary
functional differences:

 1) It cleanly returns 0 on allocation failure

 2) It allocates addresses top-down instead of bottom-up.

The second one is needed for Intel because high addresses (with bit 47
set) need to be canonicalized in order to work properly.  If we allocate
bottom-up, then high addresses will be very rare (if they ever happen).
We'd rather always have high addresses so that the canonicalization code
gets better testing.
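
To illustrate the two differences, here is a minimal sketch of a
top-down, first-fit allocation over a small hole list. It is not the
allocator added by this patch: the sketch_ names, the fixed-size hole
array and the omission of a free function are simplifications assumed
for the example. It also assumes the heap does not contain address 0
(so 0 can signal failure) and that alignments are powers of two.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_HOLES 16

struct sketch_hole {
    uint64_t start;
    uint64_t size;
};

struct sketch_heap {
    /* Free ranges, sorted with the highest address first. */
    struct sketch_hole holes[SKETCH_MAX_HOLES];
    size_t count;
};

static void
sketch_heap_init(struct sketch_heap *h, uint64_t start, uint64_t size)
{
    h->holes[0].start = start;
    h->holes[0].size = size;
    h->count = 1;
}

/* Returns the allocated address, or 0 on failure. */
static uint64_t
sketch_heap_alloc(struct sketch_heap *h, uint64_t size, uint64_t align)
{
    if (size == 0)
        return 0;

    /* Linear walk: the first hole that fits wins. */
    for (size_t i = 0; i < h->count; i++) {
        uint64_t start = h->holes[i].start;
        uint64_t end = start + h->holes[i].size;

        if (h->holes[i].size < size)
            continue;

        /* Top-down: highest aligned address where the block still fits. */
        uint64_t addr = (end - size) & ~(align - 1);
        if (addr < start)
            continue;

        uint64_t above = end - (addr + size);
        uint64_t below = addr - start;
        size_t tail = h->count - i - 1;

        if (above && below) {
            /* The hole splits in two, which needs one more slot. */
            if (h->count == SKETCH_MAX_HOLES)
                return 0;
            memmove(&h->holes[i + 2], &h->holes[i + 1],
                    tail * sizeof(h->holes[0]));
            h->holes[i] = (struct sketch_hole){ addr + size, above };
            h->holes[i + 1] = (struct sketch_hole){ start, below };
            h->count++;
        } else if (above) {
            h->holes[i] = (struct sketch_hole){ addr + size, above };
        } else if (below) {
            h->holes[i] = (struct sketch_hole){ start, below };
        } else {
            /* The hole is used up completely. */
            memmove(&h->holes[i], &h->holes[i + 1],
                    tail * sizeof(h->holes[0]));
            h->count--;
        }
        return addr;
    }
    return 0;
}
```

Because the walk starts at the highest hole and carves from its top,
the first allocations land at the top of the range, which is what keeps
high addresses common.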

v2: - [scott-ph] remove _heap_validate() if NDEBUG is defined (Jordan)

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Tested-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-31 16:17:35 -07:00
Bas Nieuwenhuizen
b9fb2c266a radv: Add startup debug option.
This adds a RADV_DEBUG=startup option to dump more info about
instance creation and device enumeration.

A common question end users have is why the driver is not loading
for them, and this has two common reasons:
1) They did not install the driver.
2) AMDGPU is not used for the card in the kernel.

This adds some info messages so we can easily get some useful
output from end users.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen
38933c1151 radv: Add option to print errors even in optimized builds.
Errors are not that common of a case so we can eat a slight perf
hit in having to call a function and do a runtime check.

In turn this makes debugging random errors happening for end users
easier, because they don't have to have a debug build on hand.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen
729f7373de radv: Make the sem_info allocate/free functions static.
They are only used in 1 file.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-31 11:51:23 +02:00
Samuel Pitoiset
70f9e2589e nir: optimize iand(ieq(a, 0), ieq(b, 0)) to ieq(ior(a, b), 0)
Totals from affected shaders:
SGPRS: 80 -> 80 (0.00 %)
VGPRS: 48 -> 48 (0.00 %)
Code Size: 2120 -> 2096 (-1.13 %) bytes
Max Waves: 16 -> 16 (0.00 %)

Only two Rise of the Tomb Raider shaders are affected on my side.
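
The rewrite is valid because a bitwise OR is zero exactly when both of
its operands are zero. As a small sanity sketch (plain C standing in for
the NIR opcodes; the function names are made up for this example), the
two forms can be compared exhaustively over a small range:

```c
#include <stdbool.h>
#include <stdint.h>

/* iand(ieq(a, 0), ieq(b, 0)): two compares and an AND. */
static bool
sketch_before(uint32_t a, uint32_t b)
{
    return (a == 0) & (b == 0);
}

/* ieq(ior(a, b), 0): one OR and one compare. */
static bool
sketch_after(uint32_t a, uint32_t b)
{
    return (a | b) == 0;
}

/* Compare both forms for every pair of values in [0, 255]. */
static bool
sketch_forms_agree(void)
{
    for (uint32_t a = 0; a < 256; a++) {
        for (uint32_t b = 0; b < 256; b++) {
            if (sketch_before(a, b) != sketch_after(a, b))
                return false;
        }
    }
    return true;
}
```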

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-31 10:57:16 +02:00
Tapani Pälli
c983c6abaf mesa: don't call Driver.TexEnv with invalid arguments
Patch skips useless and possibly dangerous calls down to the driver
in case invalid arguments were given. I noticed this happening
with the demo of the game Darwinia. AFAIK this does not fix anything but makes
this path safer and more like how other API functions are implemented.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-31 09:24:17 +03:00
Vinson Lee
d511bba2f9 v3d: Fix automake linking error.
CXXLD    gallium_dri.la
../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_dump_packet':
src/broadcom/clif/clif_dump.c:87: undefined reference to `v3d33_clif_dump_packet'
src/broadcom/clif/clif_dump.c:85: undefined reference to `v3d41_clif_dump_packet'
../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_process_worklist':
src/broadcom/clif/clif_dump.c:140: undefined reference to `v3d41_clif_dump_gl_shader_state_record'
src/broadcom/clif/clif_dump.c:144: undefined reference to `v3d33_clif_dump_gl_shader_state_record'

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-30 11:55:09 -07:00
Jakob Bornecrantz
d6cee5a162 virgl: Update virgl_hw.h
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:07:26 +01:00
Dave Airlie
e2b6d830b2 virgl: add ARB_transform_feedback_overflow_query support
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:55 +01:00
Dave Airlie
22b072c194 virgl: add polygon offset clamp
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:51 +01:00
Dave Airlie
49204ff8ad virgl: add derivative control support
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:47 +01:00
Dave Airlie
46fe349af2 virgl: add ARB_conditional_render_inverted support
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:40 +01:00
Dave Airlie
f9eb7e8b76 virgl: update caps bitset to latest version.
This makes this use all 32 bits, so future sets need to be
defined in a new struct.

Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:19 +01:00
Timothy Arceri
e8b368ad1c nir: add unsigned comparison simplifications
This avoids loop unrolling regressions in Wolfenstein II on DXVK
with an upcoming optimisation series from Samuel.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-30 22:48:37 +10:00
Bas Nieuwenhuizen
c2799574eb radv: Only expose subgroup shuffles on VI+.
The current implementation depends on bpermute, which
is VI+.

Fixes: f2c6a55061 "radv: enable subgroup capabilities"
Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-30 13:49:46 +02:00
Samuel Pitoiset
02c7916298 radv: fix emitting descriptor pointers with LLVM < 7
This was terribly wrong, I forced use of 32-bit pointers when
emitting shader descriptor pointers. This fixes GPU hangs with
LLVM 5&6 because 32-bit pointers are only supported with LLVM 7.

Fixes: 88d1ed0f81 ("radv: emit shader descriptor pointers consecutively")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-30 11:38:54 +02:00
Ilia Mirkin
04fff21c62 nv30: add a couple of missed shader caps
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-30 02:06:28 -04:00
Ilia Mirkin
30918b77ac nv30: ensure that displayable formats are marked accordingly
Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-30 02:06:28 -04:00
Marek Olšák
858ac8942d mesa: expose ARB_tessellation_shader in the compatibility profile
Gallium drivers don't expose this yet due to:
    "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
16ac832392 mesa: expose AMD_vertex_shader_layer in the compatibility profile
This requires layered FBOs from GL 3.2.

Gallium drivers don't expose this yet due to:
    "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
518d8065ce mesa: expose ARB_gpu_shader5 in the compatibility profile
Gallium drivers don't expose this yet due to:
    "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
dd93bc4f34 st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
34ea55d820 gallium: add PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
e453fc76e7 mesa: update fixed-func state constants for TCS, TES, GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
27a9f27310 mesa: print Compatibility Profile in the version string
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
d3a87537dd glsl: parse #version XXX compatibility
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
a7d0c53ab8 st/mesa: fix assertion failures with GL_UNSIGNED_INT64_ARB (v2)
Bindless texture handles can be passed via vertex attribs using this type.
They use the double codepath, so don't use st_pipe_vertex_format.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-29 20:09:00 -04:00
Marek Olšák
a8e1413876 mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)
Bindless texture handles can be passed via vertex attribs using this type.
This fixes a bunch of bindless piglit tests on radeonsi.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-29 20:09:00 -04:00
Timothy Arceri
1f7a3a1102 mesa: add display list support for glPatchParameter{i,fv}()
This is required for tessellation shader Compat profile support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-30 09:37:35 +10:00
Dave Airlie
d3ff478732 glx/drisw: make the shm/non-shm loader extensions separately.
I disliked removing the const here; function tables are meant
to be const just to avoid having to think about them, so
make a second table for the shm vs non-shm paths to use.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
33ce3aa512 drisw/glx: implement getImageShm
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
17b27725fe drisw: use getImageShm() if available
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
9feaf33371 drisw: learn to query shmid handle type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
bcd80be49a drisw/glx: use XShm if possible
Implements putImageShm from DRIswrastLoaderExtension.

If the XShm extension is not available, or fails, it will fall back to
regular XPutImage().

Tested on Linux only with 16bpp and 32bpp visual.

(airlied: tested on 24bpp as well)

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
cf54bd5e83 drisw: use shared memory when possible
If drisw_loader_funcs implements put_image_shm, allocate display
target data with shared memory and display it with put_image_shm().

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
63c427fa71 drisw: use putImageShm if available
If the DRIswrastLoaderExtension implements putImageShm, bind it to
drisw_loader_funcs.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:53 +10:00
Marc-André Lureau
de8085e649 dri: add putImageShm and getImageShm to swrastLoader
Add new API to put and get an image using shared memory. Instead of only
passing the data pointer, 3 arguments are given: the shmid, the data
offset and the shmaddr.

Bump interface version.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:53 +10:00
Dave Airlie
b7ac0779e0 gallium/winsys: rename DRM_API_HANDLE_* to WINSYS_HANDLE_*
This just renames this as we want to add an shm handle which
isn't really drm related.

Originally by: Marc-André Lureau <marcandre.lureau@gmail.com>
(airlied: I used this sed script instead)
This was generated with:
 git grep -l 'DRM_API_' | xargs sed -i 's/DRM_API_/WINSYS_/g'

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-30 09:11:53 +10:00
Marc-André Lureau
d2eaff33d0 gallium: move winsys handle to its own file.
This will be used in the drisw interface later, which isn't
drm specific.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-30 09:11:53 +10:00
Francisco Jerez
4bd2047dee intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot.
When using multiple RT write messages to the same RT such as for
dual-source blending or all RT writes in SIMD32, we have to set the
"Last Render Target Select" bit on all write messages that target the
last RT but only set EOT on the last RT write in the shader.
Special-casing for dual-source blend works today because that is the
only case which requires multiple RT write messages per RT.  When we
start doing SIMD32, this will become much more common so we add a
dedicated bit for it.
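
A rough sketch of how the two flags are meant to differ, using a made-up
sketch_fb_write record rather than the real fs_inst fields. It assumes
the writes are listed in emission order, so the final entry is also a
write to the last render target:

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_fb_write {
    unsigned target;  /* index of the render target being written */
    bool last_rt;     /* "Last Render Target Select" */
    bool eot;         /* end of thread */
};

static void
sketch_flag_fb_writes(struct sketch_fb_write *writes, size_t count,
                      unsigned num_targets)
{
    for (size_t i = 0; i < count; i++) {
        /* Every message that targets the last RT carries last_rt... */
        writes[i].last_rt = (writes[i].target + 1 == num_targets);
        /* ...but only the final message in the shader carries EOT. */
        writes[i].eot = (i + 1 == count);
    }
}
```

For a single render target written by two messages (the dual-source or
SIMD32 case), both entries get last_rt and only the second gets eot.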

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Francisco Jerez
d3cd6b7215 intel/fs: Replace the CINTERP opcode with a simple MOV
The only reason it was its own opcode was so that we could detect it
and adjust the source register based on the payload setup.  Now that
we're using the ATTR file for FS inputs, there's no point in having a
magic opcode for this.

v2 (Jason Ekstrand):
 - Break the bit which removes the CINTERP opcode into its own patch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Francisco Jerez
39de901a96 intel/fs: Use the ATTR file for FS inputs
This replaces the special magic opcodes which implicitly read inputs
with explicit use of the ATTR file.

v2 (Jason Ekstrand):
 - Break into multiple patches
 - Change the units of the FS ATTR to be in logical scalars

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Francisco Jerez
4bfa2ac2ea intel/fs: Rename a local variable so it doesn't shadow component()
v2 (Jason Ekstrand):
 - Break the refactor into its own patch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Francisco Jerez
11c71f0e75 intel/eu: Remove brw_codegen::compressed_stack.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Jason Ekstrand
71a86d1fc6 intel/fs: Use groups for SIMD16 LINTERP on gen11+
This is better than compression control because it naturally extends to
SIMD32.

v2:
 - Push/pop instruction state around adjusted codegen (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Jason Ekstrand
a1a850cd34 intel/fs: Assert that the gen4-6 plane restrictions are followed
The fall-back does not work correctly in SIMD16 mode and the register
allocator should ensure that we never hit this case anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Jan Vesely
ed834aefa2 travis: Add clover llvm-6.0 build
v2: Don't force build using gcc-4.8
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
2018-05-29 17:36:16 -04:00
Jan Vesely
41b878e1bd clover: Cleanup compat code for llvm < 3.9
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
2018-05-29 17:36:16 -04:00
Jan Vesely
d424be0fed clover: Fix build after llvm r332881.
v2: fix whitespace and indentation

r332881 added an extra parameter to the emit function.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106619
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2018-05-29 17:36:16 -04:00
Chris Wilson
3ac5fbadfd i965: Only emit VF cache invalidations when the high bits change
Commit 92f01fc5f9 ("i965: Emit VF cache invalidates for 48-bit
addressing bugs with softpin.") tried to only emit the VF invalidate if
the high bits changed, but it accidentally always set need_invalidate to
true, causing it to unconditionally emit the pipe control before
every primitive.
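
The intended check can be sketched like this. The struct and function
names are hypothetical, and treating "high bits" as everything above the
low 32 bits of the address is an assumption of the sketch:

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_vb_slot {
    uint32_t last_high_bits;  /* high bits seen on the previous draw */
};

/* Returns true only when the high address bits differ from the ones
 * recorded for this slot, and records the new value either way. */
static bool
sketch_needs_vf_invalidate(struct sketch_vb_slot *slot, uint64_t address)
{
    uint32_t high = (uint32_t)(address >> 32);
    bool need_invalidate = (high != slot->last_high_bits);

    slot->last_high_bits = high;
    return need_invalidate;
}
```

Two buffers that differ only in their low 32 bits do not trigger an
invalidate; the flag is raised only on a change in the upper part.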

Fixes: 92f01fc5f9 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106708
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-29 12:16:26 -07:00
Eric Engestrom
e4fe2fd3bb vulkan: don't free uninitialised memory
The modifiers array hasn't been initialised by then, much less with data
that would need freeing.
Move the label after the loop to fix this.

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-29 17:44:13 +01:00
Eric Engestrom
51a17e7fee dri: replace two-way switch case with a table lookup
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
---
v2: rebased on top of 432df741e0 "dri_util: Add
R10G10B10{A,X}2 translation between DRI and mesa_format."
2018-05-29 17:44:13 +01:00
Eric Engestrom
d3ca7bd452 dri: fix error value returned by driGLFormatToImageFormat()
0 is not a valid value for the __DRI_IMAGE_FORMAT_* enum.
It is, however, the value of MESA_FORMAT_NONE, which two of the callers
(i915 & i965) checked for.

The other callers (that check for errors, ie. st/dri) already check for
__DRI_IMAGE_FORMAT_NONE.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-29 17:44:13 +01:00
Eric Engestrom
1945231b48 egl/x11: fix build with DRI3 disabled
Fixes: 473af0b541 "egl/x11: deduplicate depth-to-format logic"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>
2018-05-29 17:01:21 +01:00
Emil Velikov
63b95fb291 meson: require shared glapi when using DRI based libGL
Just like we do in the autotools build.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 16:56:19 +01:00
Emil Velikov
728d1da159 meson: remove unreachable with_glx == 'auto' check
Cannot happen, thanks to the autodetection further up.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 16:31:46 +01:00
Thierry Reding
9e539012df tegra: Treat resources with modifiers as scanout
Resources created with modifiers are treated as scanout because there is
no way for applications to specify the usage (though that capability may
be useful to have in the future). Currently all the resources created by
applications with modifiers are for scanout, so make sure they have bind
flags set accordingly.

This is necessary in order to properly export buffers for such resources
so that they can be shared with scanout hardware.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-29 16:48:37 +02:00
Thierry Reding
9603d81df0 tegra: Fix scanout resources without modifiers
Resources created for scanout but without modifiers need to be treated
as pitch-linear. This is because applications that don't use modifiers
to create resources must be assumed to not understand modifiers and in
turn won't be able to create a DRM framebuffer and pass along which
modifiers were picked by the implementation.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-29 16:48:34 +02:00
Thierry Reding
bd3e97e5aa tegra: Remove usage of non-stable UAPI
This code path is no longer required with framebuffer modifier support.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-29 16:47:45 +02:00
Eric Engestrom
f736be86bb docs: add favicon to the website
favicon.png is just gears.png resized to 64x64, and favicon.ico is
generated using this command, adapted from the ImageMagick example [1]:

  $ convert favicon.png -background black \
      \( -clone 0 -resize 16x16 \) \
      \( -clone 0 -resize 32x32 \) \
      \( -clone 0 -resize 48x48 \) \
      \( -clone 0 -resize 64x64 \) \
      -delete 0 -alpha off -colors 256 favicon.ico

We could edit every html page to add `<link rel="icon" href="favicon.ico" />`,
but there's not much point as pretty much every browser will pick it up
automatically if the file is named `favicon.ico` and is in the root folder.

[1] http://www.imagemagick.org/Usage/thumbnails/#favicon

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 14:48:21 +01:00
Eric Engestrom
e6a1aca0b2 docs: add missing html closing tag
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 14:48:21 +01:00
Eric Engestrom
3b5376330f docs: add missing html tag
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 14:48:21 +01:00
Karol Herbst
56792a0876 nir/print: fix printing of 8/16 bit constant variables
v2 (Jose Maria Casanova Crespo <jmcasanova@igalia.com>): add float16 support

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-05-29 13:43:49 +02:00
Pierre Moreau
f0e80e123c nv50/ir: Extend ImmediateValue::applyLog2 to 64-bit integers
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-29 13:37:45 +02:00
Pierre Moreau
03f592a164 util/u_math: Implement a logbase2 function for unsigned long
v2 (Karol Herbst <kherbst@redhat.com>):
* removed unneeded ll
* ll -> ull
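
For reference, a floor(log2) over 64-bit values can be sketched as
below. This is an illustrative version with a made-up name, not the
function added to u_math; it returns 0 for an input of 0, which is a
choice made for the sketch. The ull suffixes matter because the
constants must be wide and unsigned for the comparisons against a 64-bit
value to behave as intended.

```c
#include <stdint.h>

/* floor(log2(n)) for n > 0; returns 0 when n is 0. */
static unsigned
sketch_logbase2_u64(uint64_t n)
{
    unsigned pos = 0;

    if (n >= 1ull << 32) { n >>= 32; pos += 32; }
    if (n >= 1ull << 16) { n >>= 16; pos += 16; }
    if (n >= 1ull << 8)  { n >>= 8;  pos += 8; }
    if (n >= 1ull << 4)  { n >>= 4;  pos += 4; }
    if (n >= 1ull << 2)  { n >>= 2;  pos += 2; }
    if (n >= 1ull << 1)  { pos += 1; }
    return pos;
}
```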

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-29 13:37:45 +02:00
Eric Engestrom
539aa604a0 docs: trivial typo fix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 12:10:14 +01:00
Samuel Pitoiset
88d1ed0f81 radv: emit shader descriptor pointers consecutively
This reduces the number of SET_SH_REG packets which are emitted
for applications that use more than one descriptor set per stage.

We should be able to emit more SET_SH_REG packets consecutively
(like push constants and vertex buffers for the vertex stage),
but this will be improved later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-29 10:07:18 +02:00
Samuel Pitoiset
21baf33a94 radv: allow radv_emit_shader_pointer_head() to emit more pointers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-29 10:07:16 +02:00
Samuel Pitoiset
288fe7ec71 radv: split radv_emit_shader_pointer()
This will allow emitting consecutive shader pointers, reducing
the number of emitted SET_SH_REG packets, which
is recommended.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-29 10:07:13 +02:00
Rhys Perry
57e721a456 gm107/ir: prevent WaW hazards in instruction scheduling
Previously, findFirstUse() only considered reads "uses". This fixes that
by making it check both an instruction's sources and definitions. It
also shortens both findFirstUse() and findFirstDef() along the way.
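
The difference can be sketched on a toy instruction list. The types and
the function below are hypothetical and much simpler than the nv50 IR;
the include_defs switch exists only to contrast the old and new
behaviour side by side:

```c
#include <stdbool.h>

#define SKETCH_NO_REG (-1)

struct sketch_insn {
    int def;     /* register written, or SKETCH_NO_REG */
    int src[2];  /* registers read, or SKETCH_NO_REG */
};

/* Index of the first instruction after 'start' that touches 'reg', or
 * -1 if there is none. With include_defs == false only reads count,
 * which misses a second write to the same register (a WaW hazard). */
static int
sketch_find_first_use(const struct sketch_insn *insns, int count,
                      int start, int reg, bool include_defs)
{
    for (int i = start + 1; i < count; i++) {
        if (insns[i].src[0] == reg || insns[i].src[1] == reg)
            return i;
        if (include_defs && insns[i].def == reg)
            return i;
    }
    return -1;
}
```

If instruction 0 writes r0, instruction 1 writes r0 again and
instruction 2 reads it, the reads-only search reports index 2, while the
search that also checks definitions reports index 1.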

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-28 13:59:56 -04:00
Bas Nieuwenhuizen
a29bc043ae radv: Implement VK_KHR_draw_indirect_count.
Literally the same as the AMD ext.

Passes *indirect_draw_count* CTS tests.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-28 12:08:26 +02:00
Bas Nieuwenhuizen
b0002e4e05 vulkan: Update header+vk.xml to 1.1.76
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-28 12:08:20 +02:00
Bas Nieuwenhuizen
6914d5a2c0 radv: Implement alternate GFX9 scissor workaround.
This improves dota2 performance for me by 11% when I force the
GPU DPM level to low (otherwise dota2 is CPU limited for 4k on my
threadripper), which should be a large part of the radv-amdvlk gap.
(For me, that took radv from 60.3 to 66.6 fps, while AMDVLK does
about 68 fps.)

It looks like dota2 rendered the GUI with a bunch of draws with
a SetScissors before almost each draw, causing a lot of pipeline
stalls.

I'm not really happy with the duplication of code, but overriding
radeon_set_context_reg would also be messy since we have the
pre-recorded pipelines and a bunch of si_cmd_buffer code, as well
as some memory->context reg loads for which things would be more
complicated.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-28 12:04:25 +02:00
Eric Anholt
3b6dfcf7ae Revert "st/nir: use NIR for asm programs"
This reverts commit 5c33e8c772.  It broke
fixed function vertex programs on vc4 and v3d, and apparently caused
trouble for radeonsi's NIR paths as well.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
https://bugs.freedesktop.org/show_bug.cgi?id=106673
2018-05-28 14:41:03 +10:00
Scott D Phillips
4714784dae anv: move canonical_address calculation into a separate function
A later patch will make use of this in other places. Also, remove
dependency on undefined behavior of left-shifting a signed value.
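
One way to do the sign extension without shifting a signed value is
sketched below, assuming 48-bit virtual addresses with bit 47 as the
sign bit. The function names are made up for the example and are not
the helpers added by this patch; everything stays in unsigned
arithmetic, where wraparound is well defined.

```c
#include <stdint.h>

#define SKETCH_LOW48 ((UINT64_C(1) << 48) - 1)

/* Copy bit 47 into bits 48..63. */
static uint64_t
sketch_canonical_address(uint64_t v)
{
    const uint64_t sign_bit = UINT64_C(1) << 47;

    v &= SKETCH_LOW48;
    /* XOR then subtract: leaves v unchanged when bit 47 is clear and
     * sets all upper bits (via unsigned wraparound) when it is set. */
    return (v ^ sign_bit) - sign_bit;
}

/* Inverse direction: drop the upper 16 bits again. */
static uint64_t
sketch_48b_address(uint64_t v)
{
    return v & SKETCH_LOW48;
}
```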

v2: - move function into a separate header (Chris)
v3: (by Ken) Add new header to the various build systems.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-27 19:24:33 -07:00
Gert Wollny
1aec4a07d4 r600: Fix SSG when not all components are written
Make sure only those components are written to that are specified in the
write mask.

Fixes:
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_fragment
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-28 02:57:46 +01:00
Gert Wollny
42cd2810aa r600: Correct IDIV if DST and SRC use the same temporary
In cases like

  IDIV TEMP[0].xy TEMP[0].xx TEMP[1].yy

the result will be written to the same register that is also a source register.
Since the components are evaluated one by one, this may result in overwriting
the source value for a later operation. Work around this by adding another
temporary to store the result if the destination temporary index is equal to
one of the source temporary indices.
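
The hazard and the workaround can be sketched in plain C. This only models the example instruction above with integer arrays; `idiv_xy_in_place` and `idiv_xy_via_temp` are hypothetical names, not functions from the r600 driver.

```c
#include <assert.h>

/* dst.xy = src0.xx / src1.yy, evaluated one component at a time and
 * written straight into dst. Gives a wrong .y when dst aliases src0. */
static void idiv_xy_in_place(int dst[4], const int src0[4], const int src1[4])
{
    assert(src1[1] != 0);
    dst[0] = src0[0] / src1[1];   /* overwrites src0.x if dst == src0 */
    dst[1] = src0[0] / src1[1];   /* then reads the overwritten value */
}

/* Same operation, but results go to a temporary first and are copied
 * to dst only after every component has been computed. */
static void idiv_xy_via_temp(int dst[4], const int src0[4], const int src1[4])
{
    int tmp[2];

    assert(src1[1] != 0);
    tmp[0] = src0[0] / src1[1];
    tmp[1] = src0[0] / src1[1];
    dst[0] = tmp[0];
    dst[1] = tmp[1];
}
```

With TEMP[0].x = 12 and TEMP[1].y = 3, the in-place variant yields (4, 1) when destination and source are the same array, while the temporary variant yields the expected (4, 4).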

Fixes:
  dEQP-GLES2.functional.shaders.operator.binary_operator.div.*
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-28 02:57:46 +01:00
Kenneth Graunke
58fb613a51 i965: Revert recent tiled memcpy changes.
This reverts commit 79fe00efb4.
This reverts commit f5e8b13f78.
This reverts commit d21c086d81.

They broke the Android build and I'd rather not leave it broken
for the long holiday weekend.
2018-05-26 16:25:50 -07:00
Scott D Phillips
79fe00efb4 i965/miptree: Use cpu tiling/detiling when mapping
Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.

Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.

v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)

v3: Add units to parameter names of tile_extents (Nanley Chery)
    Use _mesa_align_malloc for the shadow copy (Nanley)
    Continue using gtt maps on gen4 (Nanley)

v4: Use streaming_load_memcpy when detiling

v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it
    takes precedence.  Add intel_miptree_access_raw, needed after
    rebasing on commit b499b85b0f.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-25 21:35:50 -07:00
Chris Wilson
f5e8b13f78 i915: Fix streaming loads for intel_tiled_memcpy
We stream from a tiled and aligned source into an unaligned user buffer,
so we need to use _mm_storeu_si128.
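
A minimal sketch of the aligned-load, unaligned-store pairing, assuming an SSE2-capable compiler. It uses a plain `_mm_load_si128` as a stand-in for the streaming load in the real code, and `copy16_aligned_to_unaligned` is a hypothetical name; on targets without SSE2 it falls back to `memcpy`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy one 16-byte chunk from a 16-byte-aligned source to a destination
 * that carries no alignment guarantee (e.g. a user-supplied buffer). */
static void copy16_aligned_to_unaligned(void *dst, const void *src_aligned)
{
    assert(((uintptr_t)src_aligned & 15) == 0);
#if defined(__SSE2__)
    __m128i v = _mm_load_si128((const __m128i *)src_aligned);
    _mm_storeu_si128((__m128i *)dst, v);   /* unaligned store is required */
#else
    memcpy(dst, src_aligned, 16);
#endif
}
```

Using `_mm_store_si128` for the store instead would fault whenever `dst` is not 16-byte aligned.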

Fixes: d21c086d81 (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-25 21:35:50 -07:00
Marek Olšák
18c50498db radeonsi: remove unused variable addr_vec
trivial
2018-05-25 18:37:57 -04:00
Jason Ekstrand
ae514ca695 intel/blorp: Support blits and clears on surfaces with offsets
For certain EGLImage cases, we represent a single slice or LOD of an
image with a byte offset to a tile and X/Y intratile offsets to the
given slice.  Most of i965 is fine with this but it breaks blorp.  This
is a terrible way to represent slices of a surface in EGL and we should
stop some day but that's a very scary and thorny path.  This gets blorp
to start working with those surfaces and fixes some dEQP EGL test bugs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-25 14:01:44 -07:00
Marek Olšák
2f65c67043 radeonsi: fix passing gl_ClipVertex for GS and tess
Also add the fprintf call.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:46:00 -04:00
Marek Olšák
a7d61c0753 radeonsi: fix color inputs/outputs for GS and tess
GS is tested, tessellation is untested.

Have outputs_written_before_ps for HW VS and outputs_written for other
stages. The reason is that COLOR and BCOLOR alias for HW VS, which
drives elimination of VS outputs based on PS inputs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:46:00 -04:00
Marek Olšák
92ea9329e5 radeonsi: fix incorrect parentheses around VS-PS varying elimination
I don't know if it caused issues.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:46:00 -04:00
Marek Olšák
a4ba7cd6a2 st/mesa: simplify lastLevel determination in st_finalize_texture
This fixes shader images where we always bind stObj->pt and not individual
gl_texture_images.

Roughly based on i965 commit 845ad2667a
which does a similar thing but for a different reason.

This fixes GL CTS assertion failures introduced by Ilia.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:31:36 -04:00
Scott D Phillips
d21c086d81 i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear
The reference for MOVNTDQA says:

    For WC memory type, the nontemporal hint may be implemented by
    loading a temporary internal buffer with the equivalent of an
    aligned cache line without filling this data to the cache.
    [...] Subsequent MOVNTDQA reads to unread portions of the WC
    cache line will receive data from the temporary internal
    buffer if data is available.

This hidden cache line sized temporary buffer can improve the
read performance from wc maps.

v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-25 11:05:46 -07:00
Alok Hota
fb20ae0374 swr/rast: Adjusted avx512 primitive assembly for msvc codegen
Optimize AVX-512 PA Assemble (PA_STATE_OPT). Reduced generated code by
about 4x, MSVC compiler was going crazy making temporaries and
split-loading inputs onto the stack unless explicit AVX-512 load ops
were added

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:57:02 -05:00
Alok Hota
b3360f5c8b swr/rast: Moved memory init out of core swr init
Added two new files for a wrapper function for initialization

v2: added missing include for single architecture builds

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:55 -05:00
Alok Hota
b6b114c1ae swr/rast: Removed superfluous JitManager argument from passes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:49 -05:00
Alok Hota
98d0201577 swr/rast: Renamed MetaData calls
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:43 -05:00
Alok Hota
14b5cac0be swr/rast: Use metadata to communicate between passes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:37 -05:00
Alok Hota
f09636e2e1 swr/rast: Check gCoreBuckets/CORE_BUCKETS equal length at compile time
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:01 -05:00
Alok Hota
cfe75cc7b5 swr/rast: Added in-place building to SCATTERPS
SCATTERPS previously assumed it was being used with an existing basic
block

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:55:37 -05:00
Samuel Pitoiset
45eb24fedf radv: run the EarlyCSEMemSSA LLVM pass
It's recommended by the instruction combining pass, and
RadeonSI also runs it.

This pass used to segfault with one shader of F12017 in the
past, but it no longer crashes. Maybe the LLVM IR generated
by RADV has changed.

Polaris10:
Totals from affected shaders:
SGPRS: 441352 -> 441648 (0.07 %)
VGPRS: 310888 -> 300784 (-3.25 %)
Spilled SGPRs: 13576 -> 12983 (-4.37 %)
Code Size: 22560328 -> 22420544 (-0.62 %) bytes
Max Waves: 40755 -> 41366 (1.50 %)

Vega10:
Totals from affected shaders:
SGPRS: 442848 -> 442000 (-0.19 %)
VGPRS: 310396 -> 300460 (-3.20 %)
Spilled SGPRs: 13708 -> 12906 (-5.85 %)
Code Size: 22479428 -> 22336216 (-0.64 %) bytes
Max Waves: 45783 -> 46506 (1.58 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 14:24:14 +02:00
Samuel Pitoiset
66e38654c9 radv: fix dumping compute shader on the graphics queue
The graphics pipeline can be NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 11:58:07 +02:00
Samuel Pitoiset
de06dfa9ea radv: add radv_dump_pipeline_state() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 11:58:05 +02:00
Samuel Pitoiset
6f0530ecfe radv: rework how shaders are dumped when generating a hang report
Use a flag for the active stages instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 11:58:03 +02:00
Samuel Pitoiset
8c406f0b4d radv: remove unused parameter in radv_dump_annotated_shader()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 11:57:59 +02:00
Jose Dapena Paz
6c61c31dc2 mesa: do not leak ctx->Shader.ReferencedProgram references
When glUseProgram is used, references to the included shaders are
added in ctx->Shader.ReferencedProgram. But those references are not
decreased when the shader data is deallocated. Thus, those shaders
are leaked.

Explicitly remove the pending references to these shaders.
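
The leak pattern can be sketched with a generic reference-counted slot array. This is an illustration only: `struct program`, `struct shader_state`, `reference_program` and `free_shader_state` are hypothetical and do not reproduce Mesa's actual types or helpers.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_STAGES 6   /* hypothetical number of shader stages */

struct program { int refcount; };

struct shader_state {
    struct program *referenced[MAX_STAGES];
};

/* Point *slot at prog, dropping the old reference and taking a new one. */
static void reference_program(struct program **slot, struct program *prog)
{
    if (*slot == prog)
        return;
    if (*slot && --(*slot)->refcount == 0)
        free(*slot);
    if (prog)
        prog->refcount++;
    *slot = prog;
}

/* Teardown must drop every pending reference, otherwise the programs
 * still held in the slots are never freed. */
static void free_shader_state(struct shader_state *s)
{
    for (int i = 0; i < MAX_STAGES; i++)
        reference_program(&s->referenced[i], NULL);
}
```

Skipping the loop in `free_shader_state` leaves each referenced program with a count that can never reach zero.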

Fixes: e6506b3cd2 ("mesa: retain gl_shader_programs after glDeleteProgram if they are in use")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-25 10:38:09 +10:00
Marek Olšák
508b423dd6 radeonsi: set DB_EQAA.MAX_ANCHOR_SAMPLES correctly
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:57 -04:00
Marek Olšák
07e02c8617 radeonsi: round ps_iter_samples in set_min_samples
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:57 -04:00
Marek Olšák
510c88f9d1 radeonsi: remove redundant ps_iter_samples clamp
si_get_ps_iter_samples already does this.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Marek Olšák
25cdf754e4 radeonsi: remove some old gfx 9.x registers
Leftover from bring up.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Marek Olšák
b936f9aa32 radeonsi: disable primitive binning for all blitter ops
same as amdvlk.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Marek Olšák
8c1c451a90 ac/surface/gfx6: don't overallocate mipmapped HTILE
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Eric Engestrom
473af0b541 egl/x11: deduplicate depth-to-format logic
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-24 18:01:45 +01:00
Tapani Pälli
7b54404c9d i965: enable OES_texture_view for gen8+
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-24 12:53:07 +03:00
Tapani Pälli
3ddcdcf94d mesa: changes to expose OES_texture_view extension
Functionality already covered by ARB_texture_view, patch also
adds missing 'gles guard' for enums (added in f1563e6392).

Tested via arb_texture_view.*_gles3 tests and individual app
utilizing texture view with ETC2.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-24 12:53:07 +03:00
Juan A. Suarez Romero
046b2b651e docs: update release calendar for 18.1 series
v2: extend 18.1 series (Andres)
v3: fix copy/paste typo (Engestrom)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-05-24 11:47:47 +02:00
Samuel Pitoiset
38a8c5903b radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS
Do not lower FS inputs because this moves all load_var
instructions to the beginning of shaders and because
interp_var_at_sample (and friends) seem broken. That might
be enabled later on if we really want to preload
all FS inputs at the beginning.

Polaris10:
Totals from affected shaders:
SGPRS: 54072 -> 54264 (0.36 %)
VGPRS: 38580 -> 38124 (-1.18 %)
Spilled SGPRs: 652 -> 652 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 2128116 -> 2127380 (-0.03 %) bytes
Max Waves: 8048 -> 8086 (0.47 %)

Vega10:
Totals from affected shaders:
SGPRS: 52616 -> 52656 (0.08 %)
VGPRS: 37536 -> 37116 (-1.12 %)
Spilled SGPRs: 828 -> 828 (0.00 %)
Code Size: 2043756 -> 2042672 (-0.05 %) bytes
Max Waves: 9176 -> 9254 (0.85 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-24 09:18:57 +02:00
Samuel Pitoiset
ded1509587 radv: call nir_split_var_copies() before nir_lower_var_copies()
This does nothing special currently because we don't create
any copy_var instructions, but it is needed for the next patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-24 09:18:54 +02:00
Francisco Jerez
936cd3c87a i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.
Instead of directly using intel_obj->buffer.  Among other things
intel_bufferobj_buffer() will update intel_buffer_object::
gpu_active_start/end, which are used by glBufferSubData() to decide
which path to take.  Fixes a failure in the Piglit
ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests,
which could be reproduced with a non-standard glGetTexSubImage
implementation (see bug report).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351
Reported-by: Nanley Chery <nanleychery@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-05-23 16:21:34 -07:00
Francisco Jerez
e989acb03b i965: Handle non-zero texture buffer offsets in buffer object range calculation.
Otherwise the specified surface state will allow the GPU to access
memory up to BufferOffset bytes past the end of the buffer.  Found by
inspection.

v2: Protect against out-of-range BufferOffset (Nanley).
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-05-23 16:21:28 -07:00
Francisco Jerez
156d2c6e62 i965: Move buffer texture size calculation into a common helper function.
The buffer texture size calculations (should be easy enough, right?)
are repeated in three different places, each of them subtly broken in
a different way.  E.g. the image load/store path was never fixed to
clamp to MaxTextureBufferSize, and none of them are taking into
account the buffer offset correctly.  It's easier to fix it all in one
place.
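
A minimal sketch of what such a shared helper might compute, assuming the inputs below. The function name, parameter names, and the choice to round down to whole texels are assumptions for illustration, not the i965 implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes a buffer texture may address.
 *   bo_size    - size of the backing buffer object in bytes
 *   offset     - byte offset of the texture within the buffer
 *   range      - requested byte range, or negative for "rest of the buffer"
 *   texel_size - bytes per texel
 *   max_texels - implementation limit on texels (cf. MaxTextureBufferSize) */
static uint64_t buffer_texture_bytes(uint64_t bo_size, uint64_t offset,
                                     int64_t range, uint32_t texel_size,
                                     uint32_t max_texels)
{
    assert(texel_size > 0);

    /* An out-of-range offset leaves nothing addressable. */
    if (offset >= bo_size)
        return 0;

    uint64_t avail = bo_size - offset;
    if (range >= 0 && (uint64_t)range < avail)
        avail = (uint64_t)range;

    /* Clamp to the texel limit and drop any partial trailing texel. */
    uint64_t texels = avail / texel_size;
    if (texels > max_texels)
        texels = max_texels;
    return texels * texel_size;
}
```

Subtracting the offset before clamping is what keeps the surface from extending past the end of the buffer.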

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-05-23 16:21:09 -07:00
Francisco Jerez
5a68147803 Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"
This reverts commit c0ed52f614.  It was
preventing the image format validation from being done on buffer
textures, which is required to ensure that the application doesn't
attempt to bind a buffer texture with an internal format incompatible
with the image unit format (e.g. of different texel size), which is
not allowed by the spec (it's not allowed for *any* texture target,
whether or not there is spec wording restricting this behavior
specifically for buffer textures) and will cause the driver to
calculate texel bounds incorrectly and potentially crash instead of
the expected behavior.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-05-23 16:21:09 -07:00
Bas Nieuwenhuizen
699e1f5aac ac: Use DPP for build_ddxy where possible.
WQM is pretty reliable now on LLVM 7, so let us just use
DPP + WQM.

This gives approximately a 1.5% performance increase on the
vrcompositor built-in benchmark.

v2: Use ac_build_quad_swizzle.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-23 21:02:45 +02:00
Miguel Casas
b73b340c37 i965: add {X,A}BGR2101010 to 'intel_image_formats'
This patch adds {X,A}BGR2101010 entries to the list of supported
'intel_image_formats'.

Bug: https://crbug.com/776093
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-23 10:19:04 -07:00
Miguel Casas
432df741e0 dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format.
Add R10G10B10{A,X}2 translation between mesa_format and DRI format
to driGLFormatToImageFormat() and driImageFormatToGLFormat().

Bug: https://crbug.com/776093
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-23 10:17:45 -07:00
Dylan Baker
c8acfd5ab2 bin/get-pick-listh.sh: force git --pretty=medium
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-05-23 09:54:17 -07:00
Dylan Baker
5a639bdb81 bin/bugzilla_mesa.sh: explicitly set the --pretty argument
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-05-23 09:54:00 -07:00
Eric Engestrom
ec986241f3 docs: drop unnecessary out-of-frame target
I'm guessing an earlier version of the website used to have the page
contents in <frames>, but this isn't the case anymore so just drop the
unnecessary `target="_main"` :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-23 16:52:23 +01:00
Eric Engestrom
09a6cb7be6 docs: fix various html tags mistakes
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-23 16:52:23 +01:00
Eric Engestrom
8034f5f623 docs: fix < & > used in html code
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-23 16:52:23 +01:00
Juan A. Suarez Romero
6db0660d08 docs: add news notes to 18.1.0
CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-05-23 13:06:55 +02:00
Dave Airlie
f2f464de57 tgsi/scan: add hw atomic to the list of memory accessing files
This fixes 4 out of 5 cases in:
arb_framebuffer_no_attachments-atomic on cayman.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
2018-05-23 03:51:40 +01:00
Roland Scheidegger
7b89fcec41 llvmpipe: improve rasterization discard logic
This unifies the explicit rasterization discard as well as the implicit
rasterization disabled logic (which we need for another state tracker),
which really should do the exact same thing.
We'll now toss out the prims early on in setup with (implicit or
explicit) discard, rather than do setup and binning with them, which
was entirely pointless.
(We should eventually get rid of implicit discard, which should also
enable us to discard stuff already in draw, hence draw would be
able to skip the pointless clip and fallback stages in this case.)
We still need separate logic for only null ps - this is not the same
as rasterization discard. But simplify the logic there and don't count
primitives simply when there's an empty fs, regardless of depth/stencil
tests, which seems perfectly acceptable by d3d10.
While here, also fix statistics for primitives if face culling is
enabled.
No piglit changes.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-05-23 04:23:32 +02:00
Bas Nieuwenhuizen
047438287c ac/surface/gfx6: Don't force a tile index for fmask.
The bpe of the fmask often differs from the bpe of the main
surface. On SI that means it has to get a different tile
index.

addrlib is capable of figuring this out itself, so just pass
-1 instead to let it know that it is not preset.

Fixes: 9bf3570fed "ac/surface/gfx6: compute FMASK together with the color surface"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106511
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106499
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-23 02:23:03 +02:00
Jason Ekstrand
a347a5a12c i965: Remove ring switching entirely
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:39 -07:00
Jason Ekstrand
b499b85b0f i965/miptree: Move the access_raw call to the individual map functions
The only function that doesn't need to call access_raw is map_blit.  If
it takes the blitter path, it will happen as part of intel_miptree_copy.
If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle
doing whatever resolves are needed.  This should save us resolves in
quite a few cases and will probably help performance a bit.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:37 -07:00
Jason Ekstrand
f566a1264c i965: Remove support for the BLT ring
We still support the blitter on gen4-5 but it's on the same ring as 3D.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:35 -07:00
Jason Ekstrand
33affda8bf i965/miptree: Use blorp for blit maps on gen6+
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:34 -07:00
Jason Ekstrand
0eedb0fca9 i965/miptree: Use blorp for validation tex copies on gen6+
It's faster than the blitter and can handle things like stencil properly
so it doesn't require software fallbacks.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:32 -07:00
Jason Ekstrand
80fc3896f3 i965: Delete the blitter path for CopyTexSubImage
The blorp path (called first) can do anything the blitter path can do so
it's just dead code.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:31 -07:00
Jason Ekstrand
8162256b01 i965: Don't fall back to the blitter in BlitFramebuffer
On gen4-5, we try the blitter before we even try blorp.  On newer
platforms, blorp can do everything the blitter can so there's no point
in even having the blitter fall-back path.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:29 -07:00
Jason Ekstrand
e596563b08 i965: Remove some unused includes of intel_blit.h
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:27 -07:00
Jason Ekstrand
a9499374a9 i965/blit: Delete intel_emit_linear_blit
This function is no longer used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:25 -07:00
Jason Ekstrand
7fd962093f i965: Use meta for pixel ops on gen6+
Using meta for anything is fairly awful and definitely has more CPU
overhead.  However, it also uses the 3D pipe and is therefore likely
faster in terms of GPU time than the blitter.  Also, the blitter code
has so many early returns that it's probably not buying us that much.
We may as well just use meta all the time instead of working over-time
to find the tiny case where we can use the blitter.  We keep gen4-5
using the old blit paths to avoid perturbing old hardware too much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:20 -07:00
Kenneth Graunke
92f01fc5f9 i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.
We'd like to start using soft-pin to assign BO addresses up front, and
never move them again.  Our previous plan for dealing with 48-bit VF
cache bugs was to relocate vertex buffers to the low 4GB, so we'd never
have addresses that alias in the low 32 bits.  But that requires moving
buffers dynamically.

This patch tracks the last seen BO address for each vertex/index buffer,
and emits a VF cache invalidate if the high bits change.  (Ideally, we
won't hit this case very often.)  This should work for the soft-pin
case, but unfortunately won't work in the relocation case, as we don't
actually know the addresses.  So, we have to use both methods.
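
The tracking scheme can be sketched as follows. This is a simplified model, not the i965 code: `struct vf_tracker`, `MAX_VERTEX_BUFFERS` and `vb_needs_vf_invalidate` are hypothetical, and the sketch ignores when the tracked state would be reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VERTEX_BUFFERS 34   /* hypothetical slot count */

/* Last seen high address bits (bits 47:32) per vertex/index buffer slot. */
struct vf_tracker {
    uint16_t last_high_bits[MAX_VERTEX_BUFFERS];
};

/* Record the address bound to a slot; return true if the bits above the
 * low 32 changed, i.e. a VF cache invalidate would have to be emitted. */
static bool vb_needs_vf_invalidate(struct vf_tracker *t, unsigned index,
                                   uint64_t address)
{
    assert(index < MAX_VERTEX_BUFFERS);

    uint16_t high = (uint16_t)(address >> 32);
    bool changed = t->last_high_bits[index] != high;

    t->last_high_bits[index] = high;
    return changed;
}
```

Two addresses that differ only in the low 32 bits never trigger an invalidate, which matches the observation that the aliasing problem involves only the high bits.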

v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple
    more explicitly (suggested by Scott).  Mention "single batch" too
    (suggested by Chris).

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-22 10:02:28 -07:00
Kenneth Graunke
c7259259d4 i965: Introduce a "memory zone" concept on BO allocation.
We're planning to start managing the PPGTT in userspace in the near
future, rather than relying on the kernel to assign addresses.  While
most buffers can go anywhere, some need to be restricted to within 4GB
of a base address.

This commit adds a "memory zone" parameter to the BO allocation
functions, which lets the caller specify which base address the BO will
be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA.

Eventually, I hope to create a 4GB memory zone corresponding to each
state base address.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-05-22 10:01:09 -07:00
Jason Ekstrand
417b9e5770 intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0
Fixes: d6cd14f213 "i965/fs: Define new shader opcode to..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-05-22 09:53:23 -07:00
Michel Dänzer
fe2edb25dd dri3: Stricter SBC wraparound handling
Prevents corrupting the upper 32 bits of draw->recv_sbc when
draw->send_sbc resets to 0 (which currently happens when the window is
unbound from a context and bound to one again), which in turn caused
loader_dri3_swap_buffers_msc to calculate target_msc with corrupted
upper 32 bits. This resulted in hangs with the Xorg modesetting driver
as of xserver 1.20 (older versions and other drivers ignored the upper
32 bits of the target MSC, which is why this wasn't noticed earlier).
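
One way to reconstruct a 64-bit SBC from a 32-bit serial without corrupting the upper half is sketched below. It is modeled on the description above rather than copied from the patch; `update_recv_sbc` and its parameters are hypothetical names.

```c
#include <stdint.h>

/* Combine the upper 32 bits of the last sent SBC with a 32-bit serial
 * received from the server, and return the new received SBC.
 * prev_recv_sbc is returned unchanged when the event looks stale. */
static uint64_t update_recv_sbc(uint64_t send_sbc, uint64_t prev_recv_sbc,
                                uint32_t serial)
{
    uint64_t recv_sbc = (send_sbc & 0xffffffff00000000ull) | serial;

    /* Normal case: the event refers to a swap we have already sent. */
    if (recv_sbc <= send_sbc)
        return recv_sbc;

    /* The low 32 bits wrapped between the swap and its event: accept
     * only if undoing the wrap gives exactly the previous SBC + 1. */
    if (recv_sbc == prev_recv_sbc + 0x100000001ull)
        return recv_sbc - 0x100000000ull;

    /* Otherwise the serial exceeds anything sent (for example after
     * send_sbc was reset to 0): ignore it. */
    return prev_recv_sbc;
}
```

The strict wraparound test is what prevents a serial from before a reset from setting bogus upper bits.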

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/106351
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2018-05-22 17:59:53 +02:00
Samuel Pitoiset
75e919c045 radv: fix computation of user sgprs for 32-bit pointers
With 32-bit pointers we only need one user SGPR per desc set.
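
The arithmetic behind this is simple enough to sketch. `desc_set_user_sgprs` is a hypothetical helper, not the function changed by this commit.

```c
#include <stdbool.h>

/* User SGPRs needed to pass one pointer per descriptor set.
 * An SGPR holds 32 bits, so a 64-bit pointer takes two SGPRs
 * and a 32-bit pointer takes one. */
static unsigned desc_set_user_sgprs(unsigned num_desc_sets,
                                    bool use_32bit_pointers)
{
    unsigned sgprs_per_pointer = use_32bit_pointers ? 1 : 2;

    return num_desc_sets * sgprs_per_pointer;
}
```

For four descriptor sets this gives 8 SGPRs with 64-bit pointers and 4 with 32-bit pointers.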

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:29 +02:00
Samuel Pitoiset
c5536fc813 radv: drop user_sgpr_info::sgpr_count
It's only used inside allocate_user_sgprs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:26 +02:00
Samuel Pitoiset
36a4d6d081 radv: add support for 32-bit pointers in user data SGPRs
We still use 64-bit GPU pointers for all ring buffers because
llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit
GPU pointers for now. This can be improved later anyways.

Vega10:
Totals from affected shaders:
SGPRS: 1008722 -> 1026710 (1.78 %)
VGPRS: 706580 -> 707136 (0.08 %)
Spilled SGPRs: 22555 -> 22209 (-1.53 %)
Spilled VGPRs: 75 -> 75 (0.00 %)
Code Size: 34819208 -> 35202140 (1.10 %) bytes
Max Waves: 175423 -> 175086 (-0.19 %)

Polaris10:
Totals from affected shaders:
SGPRS: 1029849 -> 1036517 (0.65 %)
VGPRS: 709984 -> 708872 (-0.16 %)
Spilled SGPRs: 22672 -> 22309 (-1.60 %)
Spilled VGPRs: 82 -> 66 (-19.51 %)
Scratch size: 76 -> 60 (-21.05 %) dwords per thread
Code Size: 34915336 -> 35309752 (1.13 %) bytes
Max Waves: 151221 -> 151677 (0.30 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:22 +02:00
Samuel Pitoiset
b654ef5808 radv: add set_loc_shader_ptr() helper
This helper will help when switching to 32-bit GPU pointers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:20 +02:00
Samuel Pitoiset
14a7547c08 radv: allocate descriptor BOs in the 32-bit addr space
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:18 +02:00
Samuel Pitoiset
0d1406ad12 radv: allocate the upload BO in the 32-bit addr space
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:17 +02:00
Samuel Pitoiset
d8a61d3232 radv: set amdgpu-32bit-address-high-bits LLVM attribute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:15 +02:00
Samuel Pitoiset
fe2649d3ad radv/winsys: allow to allocate BOs in the 32-bit addr space
This introduces a new flag called RADEON_FLAG_32BIT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:13 +02:00
Samuel Pitoiset
b60e0ee789 radv/winsys: request high address
This is needed for 32-bit GPU pointers. Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:09 +02:00
Anuj Phogat
0748383a60 i965/glk: Add l3 banks count for 2x6 configuration
2x6 configuration with pci-id 0x3185 has same number of
banks (2) as 3x6 configuration (pci-id 0x3184).

Reported-by: Clayton Craft <clayton.a.craft@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: eb23be1d97 "i965: Add and initialize l3_banks field for gen7+"
Cc: Francisco Jerez <currojerez@riseup.net>
2018-05-21 16:43:26 -07:00
Vinson Lee
85f61197df v3d: Include v3d_drm.h path.
Fix build error.

  CC       v3d_blit.lo
In file included from v3d_blit.c:27:0:
v3d_context.h:39:10: fatal error: v3d_drm.h: No such file or directory
 #include "v3d_drm.h"
          ^~~~~~~~~~~

Fixes: 8a793d42f1 ("v3d: Switch the vc5 driver to using the finalized V3D UABI.")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-21 11:15:47 -07:00
Samuel Pitoiset
73df16dcee radv: fix centroid interpolation
It's legal to set the centroid and sample interpolation modes
when MSAA is disabled. So, we have to initialize the centroid
inputs because the hardware doesn't.

This fixes rendering issues with DXVK and The Witness, World of
Warcraft, Trackmania and probably more games.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106315
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102390
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-21 13:57:46 +02:00
Bas Nieuwenhuizen
f26b008e28 radv: Cleanup unused prime blit path.
Since we have the common WSI code, we use vkCmdCopyImageToBuffer
instead.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-21 10:33:41 +02:00
Bas Nieuwenhuizen
a63a0960e3 radv: Fix SRGB compute copies.
SRGB stores are broken. We had compensation code in the
resolve path but none in the copy path. Since we don't
want any conversion and it does not matter for DCC,
just make everything UNORM instead.

This happened to cause wrong colors for the PRIME path, as
that uses image->buffer copies which always use the compute
path.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106587
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-21 10:33:41 +02:00
Tapani Pälli
63525ba730 android: enable VK_ANDROID_native_buffer
Patch changes entrypoints generator to not skip this extension even
though it is set as disabled in the xml. We also need compilation
flag VK_USE_PLATFORM_ANDROID_KHR to be enabled.

It looks like this extension got disabled in commit 69f447553c.

v2: just remove the whole 'supported' attrib check + remove
    vk_icd.h compilation fix (fix in VulkanHeaders instead)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-21 09:26:50 +03:00
Tapani Pälli
437acae704 vulkan: update vk_icd.h to current upstream
Import from commit eb0c1fd on branch 'master'
of https://github.com/KhronosGroup/Vulkan-Headers.git.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-21 09:26:50 +03:00
Dave Airlie
bfa74bb44d virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.
The host side hasn't got support for this feature yet, so don't enable it
unless we get the caps from the host.

This makes the texture buffer range piglit tests skip now.

Fixes: fe0647df5a (virgl: add offset alignment values to to v2 caps struct)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-05-21 12:44:55 +10:00
Timothy Arceri
2e6c987a85 mesa: stop hiding query parameters from OpenGL compat
Just let the extension detection do its job as we will be adding
compat profile support in future, also we want these to work
with compat profile version overrides.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-21 09:39:03 +10:00
Christoph Haag
549e54270b radv: fix VK_EXT_descriptor_indexing
GetPhysicalDeviceProperties2KHR() was crashing because features was null

Fixes: 0e10790558 "radv: Enable VK_EXT_descriptor_indexing."
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-20 13:36:07 +02:00
Bas Nieuwenhuizen
a1c87235a9 ac/surface: Only align linear power of two fmt textures.
We're not sharing 32_32_32 formats between different GPUs, so we
do not have to align for vega on pre-vega cards.

Fixes: e361970ed7 "radv: Add support for IMG_DATA_FORMAT_32_32_32."
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-20 11:57:59 +02:00
Bas Nieuwenhuizen
62e0e089d7 amd/addrlib: Use defines in autotools build.
Otherwise stuff like NDEBUG would not be passed through.

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106479
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-20 11:57:59 +02:00
Aaron Watry
cfe582f9dc r600/compute: Mark several functions as static
They're not used anywhere else, so keep them private

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-05-19 10:22:16 -05:00
Aaron Watry
d21e64c626 r600/compute: Remove unused compute_memory_pool functions
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-05-19 10:21:57 -05:00
Roland Scheidegger
6f558fb0f7 draw: get rid of special logic to not emit null tris
I've confirmed after 77554d220d we no
longer need this to pass some tests from another api (as we no longer
generate the bogus extra null tris in the first place).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-05-19 02:49:58 +02:00
Dylan Baker
c86e9a5fe5 docs: Add sha sums for release
2018-05-18 16:44:50 -07:00
Dylan Baker
1d46852830 docs: Add release notes for 18.1.0
2018-05-18 16:44:43 -07:00
Alyssa Rosenzweig
5d85a0a55b nir: Implement optional b2f->iand lowering
This pass is required by the Midgard compiler; our instruction set uses
NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.
Normally, this lowering pass would be implemented in a backend-specific
algebraic pass, but this conflicts with the existing iand->b2f pass in
nir_opt_algebraic.py, hanging the compiler. This patch thus makes the
existing pass optional (default on -- all other backends should remain
unaffected), adding an optional pass for lowering the opposite
direction.

v2: Defer lowering until late algebraic optimisations to allow
optimising the b2f instruction itself.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-05-18 22:44:09 +02:00
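As a rough illustration of the b2f->iand lowering described in 5d85a0a55b: with NIR-style booleans (0 for false, ~0 for true), the float result can be produced by ANDing the boolean with the bit pattern of 1.0f. The following is a minimal standalone sketch, not the NIR pass itself; the function name is hypothetical and it assumes IEEE-754 single-precision floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a NIR-style boolean (0 or ~0) to 0.0f or
 * 1.0f using only a bitwise AND, as a backend without b2f might do. */
static float sketch_b2f_via_iand(uint32_t nir_bool)
{
    const float one = 1.0f;
    uint32_t one_bits, masked;
    float result;

    /* Only the two canonical boolean values are meaningful here. */
    assert(nir_bool == 0u || nir_bool == ~0u);

    /* Reinterpret 1.0f as an integer (0x3f800000 on IEEE-754 targets). */
    memcpy(&one_bits, &one, sizeof one_bits);

    /* ~0 & bits(1.0f) == bits(1.0f); 0 & bits(1.0f) == 0 == bits(0.0f). */
    masked = nir_bool & one_bits;

    memcpy(&result, &masked, sizeof result);
    return result;
}
```

Calling sketch_b2f_via_iand(~0u) yields 1.0f and sketch_b2f_via_iand(0u) yields 0.0f, which is why a single iand can stand in for b2f on such hardware.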
Jan Vesely
8ed2cabd04 travis: Adapt to radeonsi dropping support for LLVM 4
meson Vulkan, Clover, and autotools Vulkan need to be switched to llvm 5

Fixes: f9eb1ef870
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-18 13:59:37 -04:00
Marek Olšák
3d64ed5785 radeonsi: skip ES output stores for undefined output components
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-18 13:38:07 -04:00
Nanley Chery
0ab25f05ab i965: isl: Move the MCS gen7+ assertion into ISL
This is useful for every user of ISL. Drop the comment along the way to
match similar functions in ISL.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-18 09:53:06 -07:00
Nanley Chery
f88caf2321 i965/miptree: Remove format assertion in alloc_aux
intel_miptree_supports_{ccs,mcs,hiz} ensures the format is valid for the
color or depth miptree before the miptree is assigned an aux_usage.
alloc_aux switches on the aux_usage so don't assert that the format is
valid.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-18 09:53:06 -07:00
Nanley Chery
8007b2d78b i965/miptree: Simplify the switch in supports_ccs
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-18 09:53:06 -07:00
Nanley Chery
da98441fef i965: Make get_ccs_surf succeed in alloc_aux
Synchronize the requirements listed in isl_surf_get_ccs_surf with
intel_miptree_supports_ccs by importing a restriction from ISL. Some
implications:
* We successfully create every aux_surf in alloc_aux
* We only return false from alloc_aux if we run out of memory

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-18 09:53:06 -07:00
Brian Paul
42aee8f4f6 llvmpipe: fix check for a no-op shader
The tgsi_info.num_tokens fix broke llvmpipe's detection of no-op shaders.
Fix the code to check for num_instructions <= 1 instead.

Fixes: 8fde9429c3 ("tgsi: fix incorrect tgsi_shader_info::num_tokens
computation")
Tested-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-05-18 09:09:41 -06:00
Samuel Pitoiset
03c4816093 radv: pass radv_nir_compiler_options directly to create_llvm_function()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-18 11:07:01 +02:00
Christian Gmeiner
2eb3f794d9 st/mesa: only define GLSL 1.4 for compat if driver supports it
Currently GLSL 1.4 is defined for all gallium drivers, even if only
GLSL 1.2 is supported, as seen on etnaviv.

v1 -> v2:
 - use _min(..) as suggested by Lucas Stach and Michel Dänzer

Fixes: 4560aad780 ("mesa: add GLSLVersionCompat constant")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-18 10:46:24 +02:00
Dave Airlie
48e28ab961 vbo: remove MaxVertexAttribStride assert check.
Some drivers (virgl) don't support GL4.4 or GLES3.1 yet,
so never fill in this const.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-05-18 14:58:15 +10:00
Timothy Arceri
c0c69bd8dd mesa: drop GL_EXT_polygon_offset support
glPolygonOffset() has been part of the GL standard since 1.1. Also
neither AMD nor Nvidia supports this in their binary drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61761
2018-05-18 09:21:24 +10:00
Brian Paul
8fde9429c3 tgsi: fix incorrect tgsi_shader_info::num_tokens computation
We were incrementing num_tokens in each loop iteration while parsing
the shader.  But each call to tgsi_parse_token() can consume more than
one token (and often does).  Instead, just call the tgsi_num_tokens()
function.

Luckily, this issue doesn't seem to affect any current users of this
field (llvmpipe just checks for <= 1, for example).

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-05-17 15:02:05 -06:00
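The miscount fixed in 8fde9429c3 can be sketched with a toy token stream. This is only an illustration under stated assumptions: the record layout (a header word giving the record's length in words) and both function names are hypothetical and do not reflect the real TGSI encoding or the tgsi_num_tokens() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stream: each record begins with a header word holding the
 * number of words in that record (header included). A zero header stops
 * the walk so a malformed stream cannot loop forever. */

/* Old approach: add one per parse step, which undercounts whenever a
 * step consumes more than one word. */
static size_t sketch_count_parse_steps(const unsigned *words, size_t total)
{
    size_t pos = 0, steps = 0;

    assert(words != NULL);
    while (pos < total && words[pos] != 0) {
        pos += words[pos];
        steps++;
    }
    return steps;
}

/* Fixed approach: report how many words were actually consumed. */
static size_t sketch_count_words(const unsigned *words, size_t total)
{
    size_t pos = 0;

    assert(words != NULL);
    while (pos < total && words[pos] != 0)
        pos += words[pos];
    return pos;
}
```

For the six-word stream {1, 3, 0, 0, 2, 0}, the walk takes three parse steps but consumes six words, so a per-iteration counter reports half the true size.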
Samuel Pitoiset
fcba3934fc radv: add radv_emit_shader_pointer() helper
For future work (support for 32-bit GPU pointers).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 21:28:59 +02:00
Samuel Pitoiset
9b2c310a70 radv: add some helpers for cleaning up radv_get_preamble_cs()
Because this function looks a bit ugly to me.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 21:28:57 +02:00
Marek Olšák
f9eb1ef870 amd: remove support for LLVM 4.0
It doesn't support GFX9.

Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-17 14:54:41 -04:00
Juan A. Suarez Romero
11a0d5563f docs: update calendar, add news and link release notes to 18.0.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-17 18:45:26 +00:00
Juan A. Suarez Romero
042e21976a docs: add sha256 checksums for 18.0.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 69ef6e4a75)
2018-05-17 18:40:53 +00:00
Juan A. Suarez Romero
bb7750e8da docs: add release notes for 18.0.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3b49ab6219)
2018-05-17 18:40:51 +00:00
Mathias Fröhlich
6fac626193 mesa: The glArrayElement api is independent of the current program.
All the shader program dependent handling is done on the level
of the gl_Context::Array._DrawVAO/_DrawVAOEnabledAttribs.
So, skip array element invalidation on _NEW_PROGRAM.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-17 20:13:40 +02:00
Mathias Fröhlich
984cb4e512 mesa: Flag _NEW_ARRAY only if we are changing ctx->Array.VAO.
For the VAO internal helper functions that may be called
with a non current VAO, flag the _NEW_ARRAY state only
if it is the current ctx->Array.VAO.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-17 20:13:39 +02:00
Mathias Fröhlich
5c7e3a90ed mesa: Remove flush_vertices argument from VAO methods.
The flush_vertices argument is now unused, remove it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-17 20:13:39 +02:00
Mathias Fröhlich
9c7be67968 mesa: Remove FLUSH_VERTICES from VAO state changes.
Pending draw calls on immediate mode or display list calls do
not depend on changes of the VAO state. So, remove calls to
FLUSH_VERTICES and flag _NEW_ARRAY as appropriate.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-17 20:13:39 +02:00
Juan A. Suarez Romero
0a2c947556 docs: add 18.0.5 in the release calendar
Mesa 18.1 series has not been released yet, so let's extend 18.0 lifetime.

v2: Add missing closing TR tags (Eric Engestrom)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-05-17 19:01:19 +02:00
Alok Hota
936ce75285 swr/rast: Added FEClipRectangles event
and also added some comments

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:53:14 -05:00
Alok Hota
a33d376133 swr/rast: Whitespace and tab-to-spaces changes
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:53:10 -05:00
Alok Hota
7970fcff25 swr/rast: fix VCVTPD2PS generation for AVX512
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:53:06 -05:00
Alok Hota
a0dddac1cb swr/rast: Rectlist support for GS
Add rectlist as an option for GS.  Needed to support some driver
optimizations.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:53:01 -05:00
Alok Hota
7926d18fa5 swr/rast: Remove unneeded virtual from methods
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:52:21 -05:00
Stefan Schake
b0acc3a562 broadcom/vc4: Native fence fd support
With the syncobj support in place, let's use it to implement the
EGL_ANDROID_native_fence_sync extension. This mostly follows previous
implementations in freedreno and etnaviv.

v2: Drop the flags (Eric)
    Handle in_fence_fd already in job_submit (Eric)
    Drop extra vc4_fence_context_init (Eric)
    Dup fds with CLOEXEC (Eric)
    Mention exact extension name (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:30 +01:00
Stefan Schake
44036c354d broadcom/vc4: Store job fence in syncobj
This gives us access to the fence created for the render job.

v2: Drop flag (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:28 +01:00
Stefan Schake
9ed05e2520 broadcom/vc4: Detect syncobj support
We need to know if the kernel supports syncobj submission since otherwise
all the DRM syncobj calls fail.

v2: Use drmGetCap to detect syncobj support (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:26 +01:00
Stefan Schake
4fc0ebdff5 broadcom/vc4: Bump libdrm requirement
Require a version of libdrm with syncobj support.

v2: Don't require a libdrm_vc4, just bump core libdrm if vc4 enabled (by
    anholt)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:24 +01:00
Stefan Schake
580d1f4c60 drm-uapi: Update vc4 header with syncobj submit support
v2: Synchronized with kernel v2
v3: Update for the finalized kernel ABI (pad2 field)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:21 +01:00
Stefan Schake
1ec01a911b broadcom/vc4: Drop libdrm_vc4 requirement
This was missed in the move back to the local uapi copy.
libdrm_vc4 only seems to consist of headers that also exist in the
Mesa tree.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:12 +01:00
Eric Anholt
97894b1267 v3d: Add support for glSampleMask / glSampleCoverage.
2018-05-17 15:09:46 +01:00
Eric Anholt
9bbc3f8cf1 v3d: Enable NaN propagation in the VS and CS as well.
Fixes piglit vs-isnan-*.shader_test at the expense of gl-1.0-spot-light.
2018-05-17 15:09:12 +01:00
Nanley Chery
edfb57c0a0 i965/blorp: Disable BLORP clear color updates
With the previous patches, we now update the indirect clear color buffer
every time the clear color changes. Avoid redundant updates.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:42 -07:00
Nanley Chery
02f5512fed intel/blorp: Add a NO_UPDATE_CLEAR_COLOR batch flag
Allow callers to handle updating the indirect clear color buffer
themselves. This can reduce the number of clear color updates in the
case where a caller performs multiple fast clears with the same clear
color.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:42 -07:00
Nanley Chery
f8ac11d69f i965/blorp: Also skip the fast clear if the clear color differs
If the aux state is CLEAR and clear color value has changed, only the
surface state must be updated. The bit-pattern in the aux buffer is
exactly the same.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:42 -07:00
Nanley Chery
43616404be i965/clear: Drop a stale comment in fast_clear_depth
This comment made more sense when it was above the calls to
intel_miptree_slice_set_needs_depth_resolve(). We stopped using these
functions at commit 554f7d6d02
("i965: Move depth to the new resolve functions").

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
82849fb6d5 i965: Update the indirect buffer in set_clear_color
For depth buffers, we avoid fast-clearing if the aux_state is already
CLEAR. We do the same for color buffers only if the clear color
doesn't change. We require that the clear colors match because, in
that case, we don't update the indirect clear color outside of BLORP.

Update the indirect clear color for color buffers as well. We'll
enable the same depth buffer optimization for color buffers in a
later patch.

Note that we're now actually updating the indirect clear color twice
in the case where we use BLORP to perform the fast-clear. This is
only temporary. In later patches, we'll prevent BLORP from performing
the update.

v2: Add more context to the commit message (Topi).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-17 07:06:41 -07:00
Nanley Chery
5b315f3ad1 i965/clear: Remove an early return in fast_clear_depth
Reduce complexity and allow the next patch to delete some code. With
this change, clear operations will still be skipped and setting the
aux_state will cause no side-effects.

Remove the associated comment which implies an early return.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
6f609ca609 i965: Use set_clear_color for depth miptrees
Reduce code duplication now and prevent it in the following commits.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
92a0a87b6f Revert "i965: Make the miptree clear color setter take a gl_color_union"
This reverts commit 1d94aa1987.

The next patch will make depth miptrees use the clear color setter that
was originally being used for color miptrees. Go back to using the
isl_color_value parameter because it's the same type as the
fast_clear_color field used by color and depth miptrees.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
bb18af82c3 i965/miptree: Unify aux buffer allocation
There isn't much that changes between the aux allocation functions.
Remove the duplicated code.

v2: Inline the switch statement (Jason).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
6c41a2ef3b i965: Prepare to delete intel_miptree_alloc_ccs()
We're going to delete intel_miptree_alloc_ccs() in the next commit. With
that in mind, replace the use of this function in
do_single_blorp_clear() with intel_miptree_alloc_aux() and move the
delayed allocation logic to its callers.

v2: Duplicate the delayed allocation comment (Topi Pohjolainen).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
beed9c4550 i965/miptree: Drop the mt param from alloc_aux_buffer
Drop an unused parameter.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
6b1836aabe i965/miptree: Drop the alloc_flags param from alloc_aux_buffer
We have enough information to determine the optimal flags internally.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
3dd7f600e0 i965/miptree: Drop the name param from alloc_aux_buffer
A name of "aux-miptree" should be sufficient.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
58d99a21f1 i965/miptree: Initialize the indirect clear color to zero
The indirect clear color isn't correctly tracked in
intel_miptree::fast_clear_color. The initial value of ::fast_clear_color
is zero, while that of the indirect clear color is undefined.

Topi Pohjolainen discovered this issue with MCS buffers. This issue is
apparent when fast-clearing an MCS buffer for the first time with
glClearColor = {0.0,}. Although the indirect clear color is undefined,
the initial aux state of the MCS is CLEAR and the tracked clear color is
zero, so we avoid updating the indirect clear color with {0.0,}.

Make the indirect clear color match the initial value of
::fast_clear_color.

Note: although we only have to drop HiZ's BO_ALLOC_BUSY flag for gen10+,
we also drop it pre-gen10 to keep things simple. We add this flag back
for pre-gen10 in a later patch.

v2: Add a note about dropping HiZ's BO_ALLOC_BUSY flag (Topi).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
b58675e93f i965/miptree: Add and use a memset option in alloc_aux_buffer
Add infrastructure for initializing the clear color BO.
intel_miptree_init_mcs is no longer needed with this change.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
8a9491058d i965/miptree: Zero-initialize CCS_D buffers
Before this patch, the aux_state was actually AUX_INVALID because the BO
was never defined. This was fine on single slice miptrees because we
would fast-clear the resource right after creation. For multi-slice
miptrees on SKL+ however, this results in undefined behavior when
accessing a non-base slice. Here's a specific example:

1) Fast clear level 0
   * Undefined CCS_D buffer allocated in "PASS_THROUGH" state.
   * Level 0 transitions to the CLEAR state.
2) Render to level 1
   * Level 1 may have a 2-bit pattern of 2's.
   * Rendering with a 2 in the CCS is undefined.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
816f2dc67d i965/miptree: Fix handling of uninitialized MCS buffers
Before this patch, if we failed to initialize an MCS buffer, we'd
end up in a state in which the miptree thinks it has an MCS buffer,
but doesn't. We also leaked the clear_color_bo if it existed.

With this patch, we now free the miptree aux buffer resources and let
intel_miptree_alloc_mcs() know that the MCS buffer no longer exists.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Samuel Pitoiset
1fba2e10b3 radv: only declare the ESGS rings for pre GFX9 chips
GFX9 uses LDS instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 14:14:20 +02:00
Samuel Pitoiset
d349d4bd24 radv: allow to print GPU info with RADV_DEBUG=info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 14:14:17 +02:00
Samuel Pitoiset
56d53ed1d6 radv: do not emit unnecessary ES output stores
GFX9:
Totals from affected shaders:
SGPRS: 472 -> 464 (-1.69 %)
VGPRS: 576 -> 584 (1.39 %)
Code Size: 45432 -> 44324 (-2.44 %) bytes
Max Waves: 40 -> 40 (0.00 %)

VI:
SGPRS: 720 -> 720 (0.00 %)
VGPRS: 728 -> 728 (0.00 %)
Code Size: 45348 -> 43992 (-2.99 %) bytes
Max Waves: 120 -> 120 (0.00 %)

This affects Rise of the Tomb Raider and the three Vulkan demos
that use a geometry shader (geometryshader, deferredshadows
and viewportarray).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 14:14:13 +02:00
Samuel Pitoiset
a6e44d1271 radv: do not emit unnecessary GS output stores
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 14:14:11 +02:00
Samuel Pitoiset
507402ada6 radv: only pass the global BO list at submit time if enabled
That way the winsys might use a faster path when the global
BO list is NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 13:48:27 +02:00
Samuel Pitoiset
6211799aff radv: remove the radv_finishme() when compiling shaders
Having an entrypoint different than "main" doesn't mean we
have multiple shaders per module.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 13:48:24 +02:00
Samuel Pitoiset
1e86eaf7d8 radv: remove radv_device::llvm_supports_spill
It's always true.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 13:48:21 +02:00
Timothy Arceri
f71714022b mesa: add glUniform*ui{v} support to display lists
Fixes: a017c7ecb7 "mesa: display list support for uint uniforms"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78097
2018-05-17 13:07:48 +10:00
Dieter Nützel
7f1dc93357 radeonsi: create .gitignore
Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-05-16 21:48:17 -04:00
Dave Airlie
eba4cf797c ac/llvm: use amdgcn.tbuffer.store instead of SI.tbuffer.store intrinsic
Drop the use of the old intrinsic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-17 11:46:53 +10:00
Eric Anholt
b2e7c32703 v3d: Fix wiring filters to NEAREST for 32-bit texture returns.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104626
2018-05-16 21:19:07 +01:00
Eric Anholt
795488d2bf v3d: Enable the driver by default.
Now that we have a stabilized ABI and a fairly conformant driver, turn it
on.
2018-05-16 21:19:07 +01:00
Eric Anholt
01ae6a9181 v3d: Rename driver functions from vc5 to v3d.
This is the final step of the driver rename.
2018-05-16 21:19:07 +01:00
Eric Anholt
8c47ebbd23 v3d: Rename the driver files from "vc5" to "v3d".
2018-05-16 21:19:07 +01:00
Eric Anholt
c4c488a2ae v3d: Rename the vc5_dri.so driver to v3d_dri.so.
This allows the driver to load against the merged kernel DRM driver.  In
the process, rename most of the build system variables and gallium
plumbing functions.
2018-05-16 21:19:07 +01:00
Eric Anholt
8a793d42f1 v3d: Switch the vc5 driver to using the finalized V3D UABI.
In the process of merging to the kernel, I renamed the driver to the
general product line's name (since we have both vc5 and vc6 supported
already).  Since the ABI is finalized, move the header to include/drm-uapi.
2018-05-16 21:19:07 +01:00
Charmaine Lee
33a86acd78 svga: fix incompatible bind flags at buffer validation time
At buffer resource validation time, if the resource handle is not yet
created and if the initial buffer bind flags and the tobind flags are
incompatible, just use the tobind flags to create the resource handle.
On the other hand, if the bind flags are compatible, we can combine
the bind flags for the resource handle creation.

Fixes piglit gl-3.1-buffer-bindings crash.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-05-16 13:04:16 -06:00
jenny.q.cao
1261b34cd5 mesa: cast the GLenum16 to GLint to avoid compile warning on android
Cast the enum to GLint to avoid the compile warning:
/src/mesa/main/get.c:3005:19:
warning: comparison of constant -32768 with expression of type
'GLenum16' (aka 'unsigned short') is always false
-Wtautological-constant-out-of-range-compare

Tests: compilation without this warning
Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-05-16 13:02:43 -06:00
Stuart Young
f806cc9eb6 etnaviv: Fix missing rnndb file in tarballs
Seems that when the rnndb files for etnaviv were updated/included back
in Nov 2017, hw/texdesc_3d.xml.h was missed from Makefile.sources and
meson.build. This was all during the conversion to meson, so it appears
to have slipped through the cracks. As such, this file has been missing
from the official tarballs since inclusion in Mesa, so the git trees
and tarballs differ.

Found due to lintian errors in the Debian packages.

Fixes: f1e1c60ff6 ("etnaviv: Update from rnndb")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-05-16 19:36:10 +02:00
Matthias Groß
71892fbe19 gallium/hud: add frametime graph (v2)
Thanks for your comment. This version has an additional boolean in the
fps_info struct to distinguish between fps and frame time calculation.
The struct is initialised in the respective install functions for this
purpose.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-05-15 19:30:12 -04:00
Jan Vesely
f3521ce2c4 eg/compute: Use reference counting to handle compute memory pool.
Use pipe_reference to release old RAT surfaces.
RAT surface adds a reference to pool bo, so use reference counting for pool->bo
as well.

v2: Use the same pattern for both defrag paths
    Drop confusing comment

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-15 19:01:47 -04:00
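The ownership pattern described in f3521ce2c4 (a surface holds its own reference to the pool's buffer, so the buffer survives the pool dropping it) can be sketched as follows. All names here are hypothetical stand-ins, not the gallium pipe_reference API or the r600 compute memory pool code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer object. */
struct sketch_bo {
    int refcount;
    int *destroy_counter;   /* bumped on destruction, for demonstration */
};

static struct sketch_bo *sketch_bo_create(int *destroy_counter)
{
    struct sketch_bo *bo = malloc(sizeof *bo);

    if (bo) {
        bo->refcount = 1;
        bo->destroy_counter = destroy_counter;
    }
    return bo;
}

/* Make *dst point at src, taking a reference on src and releasing the
 * reference previously held through *dst. */
static void sketch_bo_reference(struct sketch_bo **dst, struct sketch_bo *src)
{
    struct sketch_bo *old;

    assert(dst != NULL);
    old = *dst;
    if (src)
        src->refcount++;
    if (old && --old->refcount == 0) {
        (*old->destroy_counter)++;
        free(old);
    }
    *dst = src;
}

/* Scenario: pool and surface share one buffer; the pool lets go first.
 * Returns the number of destructions, or -1 on allocation failure. */
static int sketch_demo(void)
{
    int destroyed = 0;
    struct sketch_bo *pool_bo = sketch_bo_create(&destroyed);
    struct sketch_bo *surface_bo = NULL;

    if (!pool_bo)
        return -1;

    sketch_bo_reference(&surface_bo, pool_bo);  /* surface takes a ref */
    sketch_bo_reference(&pool_bo, NULL);        /* pool drops its ref */
    assert(destroyed == 0);                     /* surface keeps it alive */

    sketch_bo_reference(&surface_bo, NULL);     /* last ref goes away */
    return destroyed;
}
```

In this sketch the buffer is destroyed exactly once, and only after the last holder releases it, regardless of which holder lets go first.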
Roland Scheidegger
e01af38d6f gallivm: Use alloca_undef with array type instead of alloca_array
Use a single allocation of array type instead of the old-style array
allocation for the temp and immediate arrays.
Probably only makes a difference if they aren't used indirectly (so,
if we used them solely because there's too many temps or immediates).
In this case the sroa and early-cse passes can sometimes do some
optimizations which they otherwise cannot.
(As a side note, for the temp reg array, we actually really should
use one allocation per array id, not just one for everything.)
Note that the instcombine pass would actually promote such
allocations to single alloc of array type as well, but it's too late
for some artificial shaders we've seen to help (we don't want to run
instcombine at the beginning due to its cost, hence would need
another sroa/cse pass after instcombine). sroa/early-cse help there
because they can actually eliminate all of the huge shader, reducing
it to a single const output (don't ask...).
(Interestingly, instcombine also removes all the bitcasts we do on that
allocation for single-value gathering, and in the end directly indexes
into the single vector elements, which according to spec is only
semi-valid, but this happens regardless. Another thing instcombine also
does is use inbound GEPs, which is probably something we should do
manually as well - for indirectly indexed reg files llvm may not be
able to figure it out on its own, but we should be able to guarantee
all pointers are always inbound. In any case, by the looks of it
using single allocation with array type seems to be the right thing
to do even for ordinary shaders.)
No piglit change.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-05-16 00:04:48 +02:00
Dieter Nützel
bd0b6b9f17 radv: add generated files to .gitignore(s)
Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-15 22:53:55 +02:00
Samuel Pitoiset
6bde8c5608 spirv: fix visiting inner loops with same break/continue block
We should stop walking through the CFG when the inner loop's
break block ends up as the same block as the outer loop's
continue block because we are already going to visit it.

This fixes the following assertion failure, which ends up crashing
in RADV or ANV:

SPIR-V parsing FAILED:
In file ../src/compiler/spirv/vtn_cfg.c:381
block->node.link.next == NULL
0 bytes into the SPIR-V binary

This also fixes a crash with a camera shader from SteamVR.

v2: make use of vtn_get_branch_type() and add an assertion

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106090
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106504
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-15 21:38:19 +02:00
Rob Clark
d89f58a6b8 mesa/st: handle vert_attrib_mask in nir case too
Note, actually fixes 9987a072cb, but the problems don't show up until
19a91841c3.

Fixes: 19a91841c3 st/mesa: Use Array._DrawVAO in st_atom_array.c.
Fixes: 9987a072cb st/mesa: Make the input_to_index array available.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-05-15 15:15:33 -04:00
Marek Olšák
3e27b377f2 cso: check count == 0 in cso_set_vertex_buffers
The code didn't expect that, leading to crashes.

Fixes: 86d63b53a2 "gallium: remove aux_vertex_buffer_slot code"

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-05-15 12:36:27 -04:00
Rob Clark
dace607245 vc5: use util_copy_framebuffer_state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-15 08:48:13 -04:00
Rob Clark
dae4c98dd7 vc4: use util_copy_framebuffer_state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-15 08:47:35 -04:00
Rob Clark
f897b67dc1 freedreno/a5xx: remove fd5_shader_stateobj
Extra level of indirection that serves no purpose.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:46 -04:00
Rob Clark
d48a2404a2 freedreno/a4xx: remove fd4_shader_stateobj
Extra level of indirection that serves no purpose.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:46 -04:00
Rob Clark
2c40f2ba32 freedreno/a3xx: remove fd3_shader_stateobj
Extra level of indirection that serves no purpose.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:46 -04:00
Rob Clark
273f7d8404 freedreno: fence should hold a ref to pipe
Since the fence can outlive the context, and all it really needs to wait
on a fence is the pipe, use the new fd_pipe reference counting to hold a
ref to the pipe and drop the ctx pointer.

This fixes a crash seen with (for example) glmark2:

  #0  fd_pipe_wait_timeout (pipe=0xbf48678b3cd7b32b, timestamp=0, timeout=18446744073709551615) at freedreno_pipe.c:101
  #1  0x0000ffffbdf75914 in fd_fence_finish (pscreen=0x561110, ctx=0x0, fence=0xc55c10, timeout=18446744073709551615) at ../src/gallium/drivers/freedreno/freedreno_fence.c:96
  #2  0x0000ffffbde154e4 in dri_flush (cPriv=0xb1ff80, dPriv=0x556660, flags=3, reason=__DRI2_THROTTLE_SWAPBUFFER) at ../src/gallium/state_trackers/dri/dri_drawable.c:569
  #3  0x0000ffffbecd8b44 in loader_dri3_flush (draw=0x558a28, flags=3, throttle_reason=__DRI2_THROTTLE_SWAPBUFFER) at ../src/loader/loader_dri3_helper.c:656
  #4  0x0000ffffbecbc36c in glx_dri3_flush_drawable (draw=0x558a28, flags=3) at ../src/glx/dri3_glx.c:132
  #5  0x0000ffffbecd91e8 in loader_dri3_swap_buffers_msc (draw=0x558a28, target_msc=0, divisor=0, remainder=0, flush_flags=3, force_copy=false) at ../src/loader/loader_dri3_helper.c:827
  #6  0x0000ffffbecbcfc4 in dri3_swap_buffers (pdraw=0x5589f0, target_msc=0, divisor=0, remainder=0, flush=1) at ../src/glx/dri3_glx.c:587
  #7  0x0000ffffbec98218 in glXSwapBuffers (dpy=0x502bb0, drawable=2097154) at ../src/glx/glxcmds.c:840
  #8  0x000000000040994c in CanvasGeneric::update (this=0xfffffffff400) at ../src/canvas-generic.cpp:114
  #9  0x0000000000411594 in MainLoop::step (this=this@entry=0x5728f0) at ../src/main-loop.cpp:108
  #10 0x0000000000409498 in do_benchmark (canvas=...) at ../src/main.cpp:117
  #11 0x00000000004071b0 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.cpp:210

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:46 -04:00
Rob Clark
a8c0daa172 freedreno: batch cache doesn't hold a ref to batch
The cache doesn't hold a (strong) reference to the batch.  So we
shouldn't be trying to drop a reference, as that leads to:

   #0  0x0000ffffbecb37a0 in raise () from /lib64/libc.so.6
   #1  0x0000ffffbeca159c in abort () from /lib64/libc.so.6
   #2  0x0000ffffbecacf48 in __assert_fail_base () from /lib64/libc.so.6
   #3  0x0000ffffbecacfa8 in __assert_fail () from /lib64/libc.so.6
   #4  0x0000ffffbd28def0 in pipe_reference_described (ptr=0x4f47130, reference=0x0, get_desc=0xffffbd2e0f08 <__fd_batch_describe>) at ../src/gallium/auxiliary/util/u_inlines.h:88
   #5  0x0000ffffbd28e188 in fd_batch_reference_locked (ptr=0x4f40de0, batch=0x0) at ../src/gallium/drivers/freedreno/freedreno_batch.h:258
   #6  0x0000ffffbd28e9a8 in fd_bc_invalidate_resource (rsc=0x4f40ca0, destroy=true) at ../src/gallium/drivers/freedreno/freedreno_batch_cache.c:244
   #7  0x0000ffffbd293778 in fd_resource_destroy (pscreen=0xedc170, prsc=0x4f40ca0) at ../src/gallium/drivers/freedreno/freedreno_resource.c:644
   #8  0x0000ffffbd922674 in u_transfer_helper_resource_destroy (pscreen=0xedc170, prsc=0x4f40ca0) at ../src/gallium/auxiliary/util/u_transfer_helper.c:144
   #9  0x0000ffffbd29527c in pipe_resource_reference (ptr=0x4f455d8, tex=0x0) at ../src/gallium/auxiliary/util/u_inlines.h:144
   #10 0x0000ffffbd29548c in fd_surface_destroy (pctx=0x1012720, psurf=0x4f455d0) at ../src/gallium/drivers/freedreno/freedreno_surface.c:78
   #11 0x0000ffffbd1f9c48 in pipe_surface_reference (ptr=0x4f471d0, surf=0x0) at ../src/gallium/auxiliary/util/u_inlines.h:113
   #12 0x0000ffffbd1f9ef4 in util_copy_framebuffer_state (dst=0x4f471c8, src=0x0) at ../src/gallium/auxiliary/util/u_framebuffer.c:114
   #13 0x0000ffffbd2e0e30 in __fd_batch_destroy (batch=0x4f47130) at ../src/gallium/drivers/freedreno/freedreno_batch.c:225
   #14 0x0000ffffbd28e1b0 in fd_batch_reference_locked (ptr=0xfffffffff010, batch=0x0) at ../src/gallium/drivers/freedreno/freedreno_batch.h:262
   #15 0x0000ffffbd28e6b0 in fd_bc_invalidate_context (ctx=0x1012720) at ../src/gallium/drivers/freedreno/freedreno_batch_cache.c:190
   #16 0x0000ffffbd2e2b6c in fd_context_destroy (pctx=0x1012720) at ../src/gallium/drivers/freedreno/freedreno_context.c:139
   #17 0x0000ffffbd2c3280 in fd5_context_destroy (pctx=0x1012720) at ../src/gallium/drivers/freedreno/a5xx/fd5_context.c:56
   #18 0x0000ffffbd5b7a8c in st_destroy_context_priv (st=0xfd72f0, destroy_pipe=true) at ../src/mesa/state_tracker/st_context.c:281

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:26 -04:00
Eric Engestrom
37d44e2608 docs/meson: mark code/commands as <code>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:33:39 +01:00
Eric Engestrom
5829f616ec docs/meson: replace plaintext url with a link
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:33:36 +01:00
Eric Engestrom
67c550708a docs/meson: fix various html issues
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:33:34 +01:00
Eric Engestrom
dc2dc1fa30 docs/meson: fix various typos
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:33:28 +01:00
Eric Engestrom
6c5df78d8b meson: fix copyright symbol
Fixes: bd68f1013c "autotools, meson: add tileset.h"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:31:46 +01:00
Juan A. Suarez Romero
bd68f1013c autotools, meson: add tileset.h
Fixes: 4e52cb51b5 ("swr/rast: Thread locked tiles improvement")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:00:11 +02:00
Thomas Hellstrom
3d0b4979ee st/xa: Bump minor
Bump xa minor to signal that the underlying mesa version is suitable for dri3.

This is a bit ugly since it doesn't relate to a specific xa interface change.
Recently there have been a number of fixes in mesa that help enable dri3
without any significant regressions in automated testing and common desktop
usage latency. However, the xf86-video-vmware driver has no other way to tell
than by inspecting the xa version.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-05-15 09:27:46 +02:00
Dave Airlie
9585e70206 virgl: enable vertex streams when glsl level is high enough.
This enables vertex streams when the host supports
GL 4.0.
2018-05-15 14:56:57 +10:00
Kai Wasserbäch
b691d9192c opencl: autotools: Fix linking order for OpenCL target
Otherwise the build fails with an undefined reference to
clang::FrontendTimesIsEnabled.

Bugzilla: https://bugs.freedesktop.org/106209
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Acked-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: Aaron Watry <awatry@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-05-14 22:45:01 -04:00
Samuel Pitoiset
97b179570c radv: reduce the number of parameters exported by the GS copy shader
By using the geometry shader output usage mask.

This improves all Vulkan demos that use a geometry shader
(i.e. geometryshader, deferredshadows, viewportarray).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-14 21:38:23 +02:00
Samuel Pitoiset
560bd9eb67 radv: scan the geometry shader output usage mask
For reducing the number of parameters that are exported by
the GS copy shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-14 21:38:21 +02:00
Samuel Pitoiset
ea43d935ab radv: run the shader info pass before emitting the GS copy shader
For further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-14 21:38:19 +02:00
Samuel Pitoiset
7cbc6f2621 radv: check that layout isn't NULL in radv_nir_shader_info_pass()
An upcoming patch will run the shader info pass on the
geometry shader just before emitting the GS copy shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-14 21:38:17 +02:00
Jason Ekstrand
18f8200a99 intel/blorp: Use linear formats for CCS_E clear colors in copies
It's clear that the original code meant to do this and there is even a
10-line comment explaining why.  Originally, we had a simple function
for packing the clear colors which was unaware of sRGB.  However, in
a6b66a7b26, when we started using ISL to do the packing, the wrong
format was used.

Fixes: a6b66a7b26 "intel/blorp: Use ISL instead of bitcast_color..."
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-14 10:41:26 -07:00
Bas Nieuwenhuizen
f944a59996 radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.
The hardware always interprets the alpha as unsigned and fixing it
in the shader is going to add unacceptable overheads.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 18:58:30 +02:00
Bas Nieuwenhuizen
3d4d388e39 radv: Fix up 2_10_10_10 alpha sign.
Pre-Vega HW always interprets the alpha for this format as unsigned,
so we have to implement a fixup to do the sign correctly for signed
formats.
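
As a rough illustration of the fixup described above (a minimal Python sketch, not the shader code radv actually emits; the helper names are hypothetical, and it assumes the raw 2-bit alpha field is available as an unsigned integer in 0..3), the signed interpretation can be recovered by sign-extending the field:

```python
def fix_alpha_sint(alpha_unsigned: int) -> int:
    """Reinterpret a 2-bit field read as unsigned (0..3) as signed (-2..1)."""
    assert 0 <= alpha_unsigned <= 3
    # Values with the top bit set (2 and 3) are negative in two's complement.
    return alpha_unsigned - 4 if alpha_unsigned & 0x2 else alpha_unsigned


def fix_alpha_snorm(alpha_unsigned: int) -> float:
    """Signed-normalized view of the same field: -2 and -1 both clamp to -1.0."""
    return float(max(fix_alpha_sint(alpha_unsigned), -1))
```

For a 2-bit field the SNORM divisor is 1, so the only extra step compared to SINT is the clamp to -1.0.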

v2: Improve indexing mess.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 18:58:20 +02:00
Bas Nieuwenhuizen
e361970ed7 radv: Add support for IMG_DATA_FORMAT_32_32_32.
Basic sampling support for linear tiling.

No CTS regressions, but it seems the blitting coverage is not very
extensive.

https://bugs.freedesktop.org/show_bug.cgi?id=106331
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 18:58:12 +02:00
Bas Nieuwenhuizen
dd102405de radv: Translate logic ops.
radeonsi could pass them through but the enum changed between
Gallium and Vulkan, so we have to translate.

In the process, I made the register defines a bit more readable.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100430
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 16:49:06 +02:00
Bas Nieuwenhuizen
62f50df7b7 radv: Fix multiview queries.
This moves the extra queries to after the main query ended, instead
of doing it after the begin and hence doing nesting.

We also emit only (view count - 1) extra queries, as the main query
is already there for the first view.

This fixes the CTS occasionally getting stuck in
dEQP-VK.multiview.queries* waiting on results.

Fixes: 32b4f3c38d "radv/query: handle multiview queries properly. (v3)"
CC: 18.1 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 16:49:06 +02:00
Eric Engestrom
f0cdc39b13 meson: remove dependency antipattern
`dep_valgrind != []` now (0.45) produces a warning that is quite explicit:
  WARNING: Trying to compare values of different types (DependencyHolder, list) using !=.
  The result of this is undefined and will become a hard error in a future Meson release.

`dep_valgrind = []` used to be the recommended way to deal with
non-existent dependencies, but these don't work with `.found()`, so now
the recommended way is to declare an impossible dependency, which
null_dep does for us in Mesa.

In short, we don't need and shouldn't check for `!= []` anywhere anymore.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-14 14:55:36 +01:00
Samuel Pitoiset
ece398277c radv: remove useless check in radv_create_shaders()
radv_can_dump_shader() already handles if module is NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-14 12:38:01 +02:00
Samuel Pitoiset
8ade3e4684 radv: allow to dump the GS copy shader with RADV_DEBUG="shaders"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-14 12:38:00 +02:00
Samuel Pitoiset
553418af1e radv: move {load,store}_var intrinsics scanning in different functions
These are going to be crazy and we are probably going to add
more scan stuff in the future. Also use switch cases instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-14 12:37:58 +02:00
jenny.q.cao
ff7521c9ba android: change include "cutils/log.h" to "log/log.h" on Android API >=26
Since Android 8 (API version 26), including "cutils/log.h" produces a compile warning:
warning: "Deprecated: don't include cutils/log.h, use either android/log.h or log/log.h"-W#warnings,
Include "log/log.h" instead on Android 8 and later to avoid this warning.

Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-14 08:08:31 +03:00
Roland Scheidegger
cf3fb42fb5 llvmpipe: Fix random number generation for unit tests
We were never producing negative numbers for signed types.
Also fix only producing half the valid range for uint32, and
properly clamp signed values.

Because this now also properly tests snorm with actually negative
values, need to increase eps for such conversions. I believe these
cannot actually be hit in ordinary operation (e.g. if a snorm texture
is sampled and output to snorm RT, it will still go through snorm->float
and float->snorm conversion), so don't bother to do anything to fix
the bad accuracy (might be quite complex).
Basically, the issue is that something like snorm16->snorm8 ends up
using just an 8-bit arithmetic right shift.
But the math behind it says we should actually divide by 32767 / 127, which
is ~258, not 256. So the result can be off by one (values have too large a
magnitude), and furthermore, the shift has incorrect rounding (it always
rounds down). For positive numbers, these errors go in opposite directions,
but for negative ones they go in the same direction, hence for some values
the error will be 2 in the end.
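
The arithmetic above can be checked with a small Python sketch (an illustration only, not the llvmpipe code; the function names are made up for this example). It compares the plain shift against a round-to-nearest reference computed in exact integer math:

```python
def snorm16_to_snorm8_shift(x: int) -> int:
    """Approximate conversion: arithmetic right shift by 8 (rounds toward -inf)."""
    return x >> 8


def snorm16_to_snorm8_exact(x: int) -> int:
    """Reference conversion: round(x * 127 / 32767) using integers only."""
    return (2 * x * 127 + 32767) // (2 * 32767)


def max_error(values) -> int:
    """Largest absolute difference between the shift and the reference."""
    return max(abs(snorm16_to_snorm8_shift(x) - snorm16_to_snorm8_exact(x))
               for x in values)
```

Over the non-negative inputs 0..32767 the largest difference is 1, while over the negative inputs -32767..-1 it reaches 2 (for example at x = -32513, where the shift gives -128 and the reference gives -126).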

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=106232
2018-05-14 03:14:00 +02:00
Dave Airlie
5978d54a09 radv: use compute path for multi-layer images.
I don't think the hw resolve path can handle multi-layer images.

This fixes all the:
dEQP-VK.renderpass.multisample_resolve.layers_*
tests on my VI card.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
2018-05-14 08:57:54 +10:00
Dave Airlie
98dbaa445a radv: resolve all layers in compute resolve path.
This path should iterate across all layers. I have some ideas
for doing this in a single pass, but this is simpler for now.

This passes the tests because we don't use the fragment path
unless we have DCC, and we don't have DCC on layered images.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
2018-05-14 08:57:27 +10:00
Dave Airlie
b16fc6cda1 radv/resolve: do fmask decompress on all layers.
For a multi-layer subpass resolve we want to make sure we flush all
the layers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
2018-05-14 08:56:47 +10:00
Rhys Perry
8f6cbb8c7d nvc0: fix setting of subpixel precision during conservative rasterization
Fixes: 07dac3e040 ("nvc0: add conservative rasterization support")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-13 13:21:41 -04:00
Rhys Perry
c879011c72 anv,nir: add generated files to .gitignore(s)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-12 20:14:49 -07:00
Marek Olšák
86d63b53a2 gallium: remove aux_vertex_buffer_slot code
The slot index is always 0, and is pretty unlikely to change in the future.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-05-12 21:08:09 -04:00
Timothy Arceri
ce188813bf radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT
When VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT is set we skip NIR
linking optimisations and only run over the NIR optimisation loop
once, similar to the GLSLOptimizeConservatively constant used by
some GL drivers.

We need to run over the opts at least once to avoid errors in LLVM
(e.g. dead vars it can't handle) and also to reduce the time spent
compiling the IR in LLVM.

With this change the Blacksmith Unity demos compilation times
go from 329760 ms -> 299881 ms when using Wine and DXVK.

V2: add bit to radv_pipeline_key

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106246
2018-05-13 09:58:33 +10:00
Vinson Lee
26ddc4f9e1 scons: Add PROGRAM_NIR_FILES.
Fix SCons build error.

  Linking build/linux-x86_64-debug/gallium/targets/libgl-xlib/libGL.so.1.5 ...
build/linux-x86_64-debug/mesa/libmesa.a(st_program.os): In function `st_translate_prog_to_nir':
src/mesa/state_tracker/st_program.c:392: undefined reference to `prog_to_nir'

Fixes: 5c33e8c772 ("st/nir: use NIR for asm programs")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2018-05-12 00:50:05 -07:00
Timothy Arceri
5c33e8c772 st/nir: use NIR for asm programs
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-12 14:48:21 +10:00
Timothy Arceri
0b3e9564bd st/nir: make st_nir_opts() available externally
The following patch will make use of this for asm style programs.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-12 14:48:21 +10:00
Boyuan Zhang
0907d3ab9c radeon/vce: add firmware support for ver 53 and up
All vce firmwares with major version greater than or equal to 53 are supported

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-05-11 14:59:00 -04:00
Rob Clark
a7c81a7f67 etnaviv: remove pipe_fence_handle::ctx
A fence can outlive the ctx it was created from (see glmark2). etnaviv
doesn't actually need fence->ctx, so let's remove it before someone makes
the mistake of assuming it is a valid pointer.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-05-11 18:42:13 +02:00
George Kyriazis
4e52cb51b5 swr/rast: Thread locked tiles improvement
- Change tilemgr TILE_ID encoding to use Morton-order (Z-order).
- Change locked tiles set to bitset.  Makes clear, set, get much faster.
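
For readers unfamiliar with the two ideas above, here is a minimal Python sketch (not the SWR implementation; the bit layout, the `bits` parameter and the class name are assumptions made for illustration). Morton order interleaves the bits of the tile coordinates, and a bitset stores membership as individual bits of one integer:

```python
def morton_encode(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y (x lands in even positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


class TileBitset:
    """Set of tile IDs stored as bits of a single integer."""

    def __init__(self) -> None:
        self.bits = 0

    def set(self, tile_id: int) -> None:
        self.bits |= 1 << tile_id

    def get(self, tile_id: int) -> bool:
        return bool((self.bits >> tile_id) & 1)

    def clear(self) -> None:
        self.bits = 0
```

Because Morton IDs of neighbouring tiles stay numerically close, a dense bitset indexed by them is a natural fit, and clearing it is a single assignment rather than a walk over a container.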

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:26:35 -05:00
George Kyriazis
8238c791dc swr/rast: Add Builder::GetVectorType()
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:25:47 -05:00
George Kyriazis
8cb55dae2e swr/rast: Prepend the console output with a newline
It can get jumbled with output from other threads.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:25:24 -05:00
George Kyriazis
db25fcfcde swr/rast: Add ConcatLists()
for concatenating lists

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:22:57 -05:00
George Kyriazis
dcaca3c7b3 swr/rast: Add constant initializer for uint64_t
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:22:17 -05:00
George Kyriazis
70f0a28b83 swr/rast: Use binner topology to assemble backend attributes
Previously we were using the draw topology, which may change if GS or Tess
are active. Only affected attributes marked with constant interpolation,
which limited the impact.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:21:52 -05:00
George Kyriazis
b3b0f0e0ec swr/rast: Change formatting
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:21:22 -05:00
Ville Syrjälä
659910eda0 meson: Fix build for egl platform_x11 with dri3
platform_x11 with dri3 needs inc_loader.

In file included from ../src/egl/drivers/dri2/platform_x11_dri3.c:35:0:
../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory
In file included from ../src/egl/drivers/dri2/platform_x11.c:46:0:
../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory
In file included from ../src/egl/drivers/dri2/egl_dri2.c:61:0:
../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory

Cc: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2018-05-11 17:41:57 +03:00
Samuel Pitoiset
efc10949cc radv: move ac_build_if_state on top of radv_nir_to_llvm.c
These helpers will be needed for future work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-11 12:35:07 +02:00
Samuel Pitoiset
3a410f0afc radv: minor cleanups in radv_fill_shader_variant()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-11 12:35:05 +02:00
Jan Vesely
58272c1ad7 winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.
Fixes memory leak on module unload.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-10 23:23:50 -04:00
Marek Olšák
a2e9d9b4c1 ac/gpu_info: add has_read_registers_query
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:11 -04:00
Marek Olšák
9b1fdfc541 ac/gpu_info: add has_2d_tiling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:10 -04:00
Marek Olšák
d26696283d ac/gpu_info: add has_sparse_vm_mappings
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:08 -04:00
Marek Olšák
125adc92ad ac/gpu_info: add has_unaligned_shader_loads
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:07 -04:00
Marek Olšák
8b9694da4b radeonsi: expose ARB_query_buffer_object on ancient kernels too
It doesn't use indirect dispatches.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:04 -04:00
Marek Olšák
e9c08bc658 ac/gpu_info: add has_indirect_compute_dispatch
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:03 -04:00
Marek Olšák
64265ac8d5 ac/gpu_info: add kernel_flushes_tc_l2_after_ib
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:01 -04:00
Marek Olšák
14c5a93bfa ac/gpu_info: add has_format_bc1_through_bc7
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:00 -04:00
Marek Olšák
2bd2c173e8 ac/gpu_info: add has_eqaa_surface_allocator
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:58 -04:00
Marek Olšák
e720cb6135 radeonsi: clean up the reset status query implementation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:57 -04:00
Marek Olšák
3060f62340 ac/gpu_info: add has_bo_metadata
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:56 -04:00
Marek Olšák
09f1bab483 ac/gpu_info: add si_TA_CS_BC_BASE_ADDR_allowed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:54 -04:00
Marek Olšák
8b58a14ef7 ac/gpu_info: add htile_cmask_support_1d_tiling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:53 -04:00
Marek Olšák
b81149e258 ac/gpu_info: add kernel_flushes_hdp_before_ib
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:47 -04:00
Marek Olšák
a969f184cf radeonsi: add an environment variable that forces EQAA for MSAA allocations
This is for testing and experiments.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:37 -04:00
Marek Olšák
2309cedf44 radeonsi: set up EQAA image descriptors properly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:36 -04:00
Marek Olšák
7ac4ef097d radeonsi: add EQAA SC,DB,CB register programming
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:34 -04:00
Marek Olšák
9d00580e75 radeonsi: support creating EQAA color textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:32 -04:00
Marek Olšák
912b0163dc ac/surface: add EQAA support
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:31 -04:00
Marek Olšák
ee31762ef5 radeonsi: use better sample locations for 8x EQAA
Verified with the piglit MSAA accuracy test.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:32:57 -04:00
Marek Olšák
4b6df225f7 radeonsi: improve quality of 16 sample locations
This results in better 16x and 8x quality when using these locations.
Verified with the piglit MSAA accuracy test.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:29:02 -04:00
Marek Olšák
01fd543c82 radeonsi: use better sample locations for 4x MSAA
Discovered by luck. Verified with the piglit MSAA accuracy test.
It also shows that the worst case EQAA 16s4f results in very good 4x MSAA
in the worst case.

Nine might not like these positions, but they are prettier to the eye and
GL doesn't care.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:28:12 -04:00
Marek Olšák
8d8b71ccfa radeonsi: reorder sample locations as required by EQAA
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:27:46 -04:00
Marek Olšák
5769a5ec01 radeonsi: simplify si_get_sample_position
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
9f456b3a3c radeonsi: simplify arrays of sample locations
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
3d70b5beae radeonsi: set DB_EQAA the same as Vulkan
These never change, but they only affect EQAA, which isn't implemented.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
b5ed039325 radeonsi: remove CM_ prefixes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
656fd607be radeonsi: don't update clear color registers if they don't change
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
835095973d radeonsi: remove r600_fmask_info
radeon_surf contains almost everything.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
bdc3e410f7 ac/surface: unify common legacy and gfx9 fmask fields
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
9bf3570fed ac/surface/gfx6: compute FMASK together with the color surface
instead of invoking FMASK computation separately.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
276acda835 ac/surface/gfx9: fix a typo in CMASK RB/pipe alignment
No change in behavior because it's always aligned.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
6841845b00 ac: set correct LLVM processor names for Raven & Vega12
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
6f7f10d285 ac: sort raster configs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
e7b82a9978 ac: remove 1 RB raster config for Iceland
Iceland always reports 2 RBs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
cb0f5cddcc ac: move the Fiji kernel workaround for raster config out of the switch
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
ce954ac6f3 ac: enable both RBs on Kaveri
This can result in 2x increase in performance on non-harvested Kaveris.

v2: don't do it on radeon

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
597b9e8810 radeonsi/gfx9: work around a GPU hang due to broken indirect indexing in LLVM
Fixes: 6d19120da8 "radeonsi/gfx9: workaround for INTERP with indirect indexing"
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Jason Ekstrand
b784561c1a intel/isl/storage: Don't lower most UNORM formats on gen11+
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-05-10 14:13:24 -07:00
Jason Ekstrand
399962e7c6 intel/isl: Several UNORM formats support typed writes on gen11+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-05-10 14:12:55 -07:00
Brian Paul
e4211b36bb mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT
Since size can be 3, 4 or GL_BGRA we need to keep these glGet types
as TYPE_INT, not TYPE_UBYTE.
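
A tiny Python sketch of why an unsigned byte is not enough here (illustrative only; `fits_in_ubyte` and `truncate_to_ubyte` are hypothetical helpers, and only the GL_BGRA value is taken from the OpenGL headers):

```python
GL_BGRA = 0x80E1  # enum value from the OpenGL headers


def fits_in_ubyte(value: int) -> bool:
    """True if the value can be represented in an unsigned 8-bit field."""
    return 0 <= value <= 0xFF


def truncate_to_ubyte(value: int) -> int:
    """What is left of the value after it is squeezed into 8 bits."""
    return value & 0xFF
```

The sizes 3 and 4 fit in a byte, but GL_BGRA does not: truncated to 8 bits it becomes 0xE1, which is neither a valid size nor the original enum.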

Fixes: d07466fe18 ("mesa: fix glGetInteger/Float/etc queries for
vertex arrays attribs")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106462
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-05-10 09:49:40 -06:00
Andres Rodriguez
34e9e4023f radv: disable DCC for shareable images on GFX9+
This seems to be broken at the moment for opengl interop.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-10 11:27:12 -04:00
Thomas Petazzoni
54bbe600ec configure.ac: rework -latomic check
The configure.ac logic added in commit 2ef7f23820 ("configure: check if
-latomic is needed for __atomic_*") makes the assumption that if a
64-bit atomic intrinsic test program fails to link without -latomic,
it is because we must use -latomic.

Unfortunately, this is not completely correct: libatomic only appeared
in gcc 4.8, so older gcc versions have no libatomic and therefore
don't provide atomic intrinsics for all architectures. This issue was,
for example, encountered on PowerPC with
a gcc 4.7 toolchain, where the build fails with:

powerpc-ctng_e500v2-linux-gnuspe/bin/ld: cannot find -latomic

This commit aims at fixing that, by not assuming -latomic is
available. The commit re-organizes the atomic intrinsics detection as
follows:

 (1) Test if a program using 64-bit atomic intrinsics links properly,
     without -latomic. If this is the case, we have atomic intrinsics,
     and we're good to go.

 (2) If (1) has failed, then try to link the same program, but this
     time with -latomic in LDFLAGS. If this succeeds, then we have
     atomic intrinsics, provided we link with -latomic. Otherwise,
     atomic intrinsics are considered unavailable.

This has been tested in three situations:

 - On x86-64, where atomic intrinsics are all built-in, with no need
   for libatomic. In this case, config.log contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#'
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE=''
   LIBATOMIC_LIBS=''

   This means: atomic intrinsics are available, and we don't need to
   link with libatomic.

 - On NIOS2, where atomic intrinsics are available, but some of them
   (64-bit ones) require using libatomic. In this case, config.log
   contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#'
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE=''
   LIBATOMIC_LIBS='-latomic'

   This means: atomic intrinsics are available, and we need to link
   with libatomic.

 - On PowerPC with an old gcc 4.7 toolchain, where 32-bit atomic
   intrinsics are available, but not 64-bit atomic intrinsics, and
   there is no libatomic. In this case, config.log contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE=''
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='#'

   This means that atomic intrinsics are not usable.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
2018-05-10 08:13:57 -07:00
Brian Paul
d07466fe18 mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs
The vertex array Size and Stride attributes are now ubyte and short,
respectively.  The glGet code needed to be updated to handle those
types, but wasn't.

Fixes the new piglit test gl-1.5-get-array-attribs.

v2: fix inadvertent whitespace change, change COLOR_ARRAY_SIZE to UBYTE,
misc fixes suggested by Justin

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106450
Fixes: d5f42f96e1 ("mesa: shrink size of gl_array_attributes (v2)")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-05-10 08:08:11 -06:00
Jan Vesely
45dfa6f4e7 winsys/radeon: Destroy fd_hash table when the last winsys is removed.
Fixes memory leak on module unload.
v2: Use util_hash_table helper function

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-05-10 05:12:48 -04:00
Jan Vesely
d146768d13 gallium/auxiliary: Add helper function to count the number of entries in hash table
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-05-10 05:12:43 -04:00
Samuel Pitoiset
0defc55547 radv: move handling nosisched option in a better place
It's a per-application optimization, so it makes more sense
to do that in radv_handle_per_app_options().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-10 10:57:41 +02:00
Grazvydas Ignotas
4fdce205dd radv: assorted typo fixes
Trivial.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-10 11:50:46 +03:00
Mathias Fröhlich
f660683027 mesa/vbo/tnl: Move gl_vertex_array related stuff to tnl.
The only remaining users of gl_vertex_array are tnl based
drivers. So move everything related to that into tnl and
rename it accordingly.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
881d2fcafa mesa: Remove Array._DrawArrays.
Only tnl based drivers still use this array. So remove it
from core mesa and use Array._DrawVAO instead.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
899476b6b1 i965: Remove the now unused gl_vertex_array.
Was meant to be temporary in i965.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
0fabd55306 i965: Remove the gl_vertex_array indirection.
For now store binding and attrib in brw_vertex_element.
The i965 driver still provides lots of opportunity to make use
of the unique binding information in the VAO which is currently not
taken from the VAO.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
172c9a908f i965: Implement all_varyings_in_vbos in terms of Array._DrawVAO.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
79eb6ab7b6 st/mesa: Remove the now unused gl_vertex_array.
Was meant to be temporary in gallium.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
4c77f0d065 st/mesa: Make feedback draw and rasterpos use _DrawVAO.
Instead of playing with Array._DrawArrays, make the feedback draw
path use Array._DrawVAO. Also st_RasterPos needs to use the VAO then.

v2: Use helper methods to get the offset values for array and binding.
    Update comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
19a91841c3 st/mesa: Use Array._DrawVAO in st_atom_array.c.
Finally make use of the binding information in the VAO when
setting up arrays for draw.

v2: Also emit fewer relocations for interleaved userspace arrays.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:15 +02:00
Mathias Fröhlich
9987a072cb st/mesa: Make the input_to_index array available.
The input_to_index array is already available internally
when preparing vertex programs. Store the map in
struct st_vertex_program.
Also store the bitmask of mesa vertex processing inputs in
struct st_vp_variant.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:15 +02:00
Mathias Fröhlich
f24bf45210 st/mesa: Use _DrawVAO for edgeflag enabled check.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:15 +02:00
Mathias Fröhlich
d1698d4311 mesa: Compute effective buffer bindings in the vao.
Compute VAO buffer binding information past the position/generic0 mapping.
Scan for duplicate buffer bindings and collapse them into derived
effective buffer binding index and effective attribute mask variables.
Provide a set of helper functions to access the distilled
information in the VAO. All of them prefixed with _mesa_draw_...
to indicate that they are meant to query draw information.
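
The collapsing idea can be sketched roughly as follows. This is an assumption-laden illustration, not the Mesa implementation: `collapse_bindings` is a hypothetical name, a binding is reduced to a `(buffer_id, stride)` pair, and the real code considers more state than this.

```python
from typing import Dict, List, Tuple

Binding = Tuple[int, int]  # hypothetical (buffer_id, stride) pair


def collapse_bindings(
    attribs: Dict[int, Binding],
) -> Tuple[Dict[int, int], List[Tuple[Binding, int]]]:
    """Group attributes that share a binding.

    Returns a map from attribute index to effective binding index, and
    a list of (binding, attribute_mask) for each effective binding.
    """
    index_of: Dict[Binding, int] = {}
    masks: List[int] = []
    bindings: List[Binding] = []
    eff_index: Dict[int, int] = {}
    for attr in sorted(attribs):
        binding = attribs[attr]
        if binding not in index_of:
            # First time this binding is seen: allocate a new slot.
            index_of[binding] = len(bindings)
            bindings.append(binding)
            masks.append(0)
        slot = index_of[binding]
        masks[slot] |= 1 << attr  # record the attribute in the mask
        eff_index[attr] = slot
    return eff_index, list(zip(bindings, masks))
```

For example, attributes 0 and 1 reading from the same buffer end up sharing effective binding 0 with mask 0b11, while an attribute on a different buffer gets its own binding.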

v2: Also group user space arrays containing interleaved arrays.
    Add _Eff*Offset to be copied on attribute and binding copy.
    Update comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:15 +02:00
Gert Wollny
fb4011ace9 virgl: Add support for passing GL_ANY_SAMPLES_PASSED_CONSERVATIVE
This is needed for fixing CTS:
   dEQP-GLES3.functional.occlusion_query.conservative*

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-05-10 12:26:57 +10:00
Dave Airlie
ce027ac5c7 r600: fix constant buffer bounds.
Indirect accesses to a constant buffer on r600/eg use a vertex fetch
in the shader. However, apps rely on the expected behaviour of
out-of-bounds accesses (even if illegal).

If the constants were being uploaded as part of a larger
upload buffer, we'd set the range of allowed access to a lot
larger than required so apps would get values back from
other parts of the upload buffer instead of the expected out
of bounds access.

This fixes rendering bugs in Trine and Witcher 1, thanks
to iive for nagging me effectively until I figured it out :-)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91808
Cc: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-05-10 02:14:32 +01:00
Jason Ekstrand
a8a740f272 i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL
From the bspec docs for "Indirect State Pointers Disable":

    "At the completion of the post-sync operation associated with this
    pipe control packet, the indirect state pointers in the hardware are
    considered invalid"

So the ISP disable is a post-sync type of operation which means that it
should be combined with a CS stall.  Without this, the simulator throws
an error.

Fixes: 766d801ca "anv: emit pixel scoreboard stall before ISP disable"
Fixes: f536097f6 "i965: require pixel scoreboard stall prior to ISP disable"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-09 18:03:28 -07:00
Dave Airlie
56766b8515 radv: handle arrays in the fmask descriptor.
This fixes the fmask descriptor generation to handle 2d ms arrays.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-10 10:42:49 +10:00
Matt Turner
0f959215c3 gallium/tests: Fix assignment of EXTRA_DIST
Fixes: 6754c2e83d ("autotools: Include new meson files")
2018-05-09 16:38:47 -07:00
Matt Turner
0097940223 configure.ac: Check for grep with AC_PROG_GREP
Perhaps with a new version of autoconf, I began seeing:

| checking the name lister (/usr/bin/nm -B) interface... ./configure: line 6973: External.*some_variable: command not found
| BSD nm

This is because AC_PROG_NM expands to

	...
	if $GREP 'External.*some_variable' conftest.out > /dev/null; then
	    lt_cv_nm_interface="MS dumpbin"
	fi
	...

I'm not sure if it's a bug in AC_PROG_NM that it doesn't call
AC_PROG_GREP, but it's easy enough for us to do it.
2018-05-09 16:38:47 -07:00
Xiong, James
0ab266dc1b main: fail texture_storage() call if the size is not okay
Signed-off-by: Xiong, James <james.xiong@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 09:34:31 +10:00
Xiong, James
08c1444c95 main: return 0 length when the queried program object's not linked
Signed-off-by: Xiong, James <james.xiong@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-10 09:34:19 +10:00
Kenneth Graunke
a83face48a i965: Shut up unused variable warnings.
These are only used in assertions.
2018-05-09 16:20:50 -07:00
Ross Burton
1755654d9f src/intel/Makefile.vulkan.am: add missing MKDIR_GEN
Out of tree builds can try to write into a directory that doesn't exist yet:

| Traceback (most recent call last):
|   File "../../../mesa-18.0.2/src/intel/vulkan/anv_icd.py", line 46, in <module>
|     with open(args.out, 'w') as f:
| IOError: [Errno 2] No such file or directory: 'vulkan/intel_icd.x86_64.json'
| Makefile:4882: recipe for target 'vulkan/intel_icd.x86_64.json' failed

Add missing MKDIR_GEN calls to solve this.
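
For illustration, the same failure can be avoided on the script side by creating the parent directory before opening the output file. This is a sketch of the general technique only; the commit itself fixes the Makefile, and `write_generated` is a hypothetical helper, not part of anv_icd.py.

```python
import os


def write_generated(path: str, text: str) -> None:
    """Write text to path, creating the parent directory if needed."""
    parent = os.path.dirname(path)
    if parent:
        # exist_ok avoids an error when the directory is already there,
        # for example on an in-tree build or a second run.
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as f:
        f.write(text)
```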

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-09 16:08:52 -07:00
Rhys Perry
5ac16ed047 mesa: fix error handling in get_framebuffer_parameteriv
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-09 14:32:40 -07:00
Lionel Landwerlin
766d801ca3 anv: emit pixel scoreboard stall before ISP disable
We want to make sure that all indirect state data has been loaded into
the EUs before disabling the pointers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: 78c125af39 ("anv/gen10: Ignore push constant packets during context restore.")
2018-05-09 20:11:57 +01:00
Lionel Landwerlin
f536097f67 i965: require pixel scoreboard stall prior to ISP disable
Invalidating the indirect state pointers might affect a previously
scheduled & still running 3DPRIMITIVE (causing page fault). So stall
on pixel scoreboard before that.

v2: Fix compile issue :(

v3: Stall on pixel scoreboard

v4: Drop the post sync operation (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: ca19ee33d7 ("i965/gen10: Ignore push constant packets during context restore.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106243
2018-05-09 20:11:51 +01:00
Jason Ekstrand
561348caa1 intel/isl: Allow CCS_E on 1010102 formats
On CNL and above, CCS_E supports 1010102 formats and R11G11B10F.  We had
shut them off during early enabling because blorp_copy couldn't handle
them.  Now it can handle 1010102 formats so we can turn them back on.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
ccb44b8a94 intel/blorp: Allow CCS copies of 1010102 formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
1978de66f7 intel/blorp: Add support for more format bitcasting
nir_format_bitcast_uint_vec_unmasked can only be used to cast between
formats with uniform channel sizes.  In particular, it cannot handle
10_10_10_2 formats.  By making use of the NIR helper for uint vector
casts, we should now be able to bitcast between any two uint formats so
long as their channels are in RGBA order (possibly with channels
missing).  In order to do this we need to rework the key a bit to pass
the actual formats instead of just the number of bits in each.
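
To make the idea of bitcasting between formats with non-uniform channel sizes concrete, here is a host-side sketch in Python. It is not the NIR helper: the function names are hypothetical, and packing channels starting from the least significant bit is an assumption made for the example.

```python
from typing import List, Sequence


def pack_uint(channels: Sequence[int], bits: Sequence[int]) -> int:
    """Pack channels into one integer, first channel in the low bits."""
    value, shift = 0, 0
    for chan, width in zip(channels, bits):
        assert 0 <= chan < (1 << width), "channel does not fit its width"
        value |= chan << shift
        shift += width
    return value


def unpack_uint(value: int, bits: Sequence[int]) -> List[int]:
    """Split an integer back into channels of the given widths."""
    out, shift = [], 0
    for width in bits:
        out.append((value >> shift) & ((1 << width) - 1))
        shift += width
    return out


def bitcast(channels: Sequence[int], src_bits: Sequence[int],
            dst_bits: Sequence[int]) -> List[int]:
    """Reinterpret the same bits under a different channel layout."""
    assert sum(src_bits) == sum(dst_bits), "formats must have equal size"
    return unpack_uint(pack_uint(channels, src_bits), dst_bits)
```

Casting a 10_10_10_2 value to four 8-bit channels and back returns the original channels, which is the property a copy through a differently laid out format depends on.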

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
7998fe268e intel/blorp: Use nir_format_bitcast_uint_vec_unmasked
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
047e68389f nir/format_convert: Add code for bitcasting vectors
This is a fairly direct port from blorp.  The only real change is that
the nir_format_convert version doesn't assume that everything is a vec4.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
a6b66a7b26 intel/blorp: Use ISL instead of bitcast_color_value_to_uint
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
09ced65420 intel/isl: Add format conversion code
This adds helpers to ISL to convert an isl_color_value to and from
binary data encoded with a given isl_format.  The conversion is done
using ISL's built-in format introspection so it's fairly slow as format
conversions go but it should be fine for a single pixel value.  In
particular, we can use this to convert clear colors.

As a side-effect, we now rely on the sRGB helpers in libmesautil so we
need to tweak the build system a bit.  All prior uses of src/util in ISL
were header-only.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
8152c60e01 intel/isl/format: Get rid of the ALPHA colorspace
Alpha-only formats are just linear.  There's no need to specially
delineate them as being in their own colorspace.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
8ab73790ef intel/isl/format: Add field locations informations to channel_layout
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
96598fbc02 intel/isl/format: Add a column for channel order to the table
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
d08d6a3da8 i965/blorp: Remove a pile of blorp_blit restrictions
Previously, blorp could only blit into something that was renderable.
Thanks to recent additions to blorp, it can now blit into basically
anything so long as it isn't compressed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
465d8566cd i965/blorp: Allow blorp blits for 16x MSAA
BLORP has supported 16x MSAA for quite a while now, we just never
bothered to enable it for CopyTexSubImage.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
09eede9c9d anv: Allow blitting to/from any supported format
Now that blorp handles all the cases, why not?  The only real change we
have to make is to stop using anv_swizzle_for_render() in blorp_blit
because it doesn't work for B4G4R4A4 and blorp now natively handles that.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
8ce31c9cc5 intel/blorp: Support the RGB workaround on more formats
Previously we only supported UINT formats because that's what blorp_copy
required.  If we want to use it in blorp_blit, however, we need to
support everything.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
4e26e3dea9 intel/blorp: Silently convert RGBX destination formats to RGBA
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
08cd834996 intel/isl: Add some helpers for working with RGBX formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
804856fa57 intel/blorp: Handle more exotic destination formats
This commit adds support for the following formats as destination
formats even though the hardware does not support rendering to them:

 - ISL_FORMAT_R24_UNORM_X8_TYPELESS
 - ISL_FORMAT_A4B4G4R4_UNORM
 - ISL_FORMAT_L8_UNORM_SRGB
 - ISL_FORMAT_R9G9B9E5_SHAREDEXP

This is done by using a different format and emitting shader code to
fake it the rest of the way.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
9e492bb92e intel/blorp: Include nir_format_convert.h in blorp_blit.c
nir_mask_shift_or is now defined in nir_format_convert.h so we can
delete the copy in blorp_blit.c.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
9981709d8f nir/format_convert: Add a function to pack RGB9_E5 formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
4e337b42f9 nir/format_convert: Add pack/unpack for R11F_G11F_B10F
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
98156b0019 nir/format_convert: Add linear <-> sRGB helpers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
2fdd966e3d nir: Add the start of a format conversion helper header
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
906c32ce87 intel/blorp: Add swizzle support for all hardware
This commit makes blorp capable of swizzling anything even on hardware
that doesn't support texture swizzle.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
1ef4f5aff1 intel/isl: Add a helper for inverting swizzles
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
242f6f7492 intel/isl: Add a helper for composing swizzles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
dad67cc245 intel/isl: Add an isl_swizzle_supports_rendering helper
This helper encodes more details, specifically about Haswell, than the
previous asserts in isl_surface_state.c.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
23d703de1f i965/surface_state: Use an identity swizzle pre-Haswell
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
293b8de161 blorp: Handle the RGB workaround more like other workarounds
The previous version was sort-of strapped on in that it just adjusted
the blit rectangle and trusted in the fact that we would use texelFetch
and round to the nearest integer to ensure that the component positions
matched.  This new version, while slightly more complicated, is more
accurate because all three components end up with exactly the same
dst_pos and so they will get interpolated and sampled at the same
texture coordinate.  This makes the workaround suitable for using with
scaled blits.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Lionel Landwerlin
3853f1c6f4 i965: silence unused variable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2dc29e095f ("i965: Don't leak blorp on Gen4-5.")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-05-09 18:12:10 +01:00
Lionel Landwerlin
11d36c373a intel: devinfo: silence coverity warning
It's just not possible to have a device with no subslices.

CID: 1433511
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-05-09 15:21:01 +01:00
Michel Dänzer
6f81e07ecb dri3: Only update number of back buffers in loader_dri3_get_buffers
And only free no longer needed back buffers there as well.

We want to stick to the same back buffer throughout a frame, otherwise
we can run into various issues.

Bugzilla: https://bugs.freedesktop.org/105906
Bugzilla: https://bugs.freedesktop.org/106399
Fixes: 3160cb86aa "egl/x11: Re-allocate buffers if format is suboptimal"
Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-05-09 15:40:41 +02:00
Samuel Iglesias Gonsálvez
2cf64fdb46 anv: ignore pColorBlendState if all color attachments of the subpass are unused
According to Vulkan spec:

  "pColorBlendState is a pointer to an instance of the
   VkPipelineColorBlendStateCreateInfo structure, and is ignored if the
   pipeline has rasterization disabled or if the subpass of the render pass the
   pipeline is created against does not use any color attachments."

Fixes tests from CL#2505:

   dEQP-VK.renderpass.*.simple.color_unused_omit_blend_state

v2:
- Check that blend is not NULL before usage.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-09 07:01:10 +02:00
Timothy Arceri
e7a7b712fe mesa: remove hard-coded OpenGL 3.2 compat limit
Just let validate_context_version() do it instead. This fixes
MESA_GL_VERSION_OVERRIDE for compat and will also allow us to
enable new compat versions on a per-driver basis in the future.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-09 14:24:43 +10:00
Timothy Arceri
4560aad780 mesa: add GLSLVersionCompat constant
This allows drivers to define what version of GLSL they support
in compat. This will be needed in order to support compat 3.2
without breaking drivers that won't support it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-09 14:24:36 +10:00
Timothy Arceri
be3ee9d141 mesa: dont call _mesa_override_glsl_version() in _mesa_init_constants()
All drivers that support GLSL will later set their default GLSL versions,
overriding this override call. They currently all call
_mesa_override_glsl_version() again later in order to support overrides.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-09 14:24:29 +10:00
Timothy Arceri
2a621acc8d mesa: dont set GLSLVersion in _mesa_init_constants()
Just leave it as 0 and let the drivers set it (as they already do)
to avoid redundantly initialising it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-09 14:24:22 +10:00
Jan Vesely
0783399d79 pipe-loader: Free driver_name in error path
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-08 21:35:07 -04:00
Brian Paul
901db25d5b glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug
Change the size of the bitset from 128 bits to 96.  This works around an
apparent GCC 5.4 bug in which bad SSE code is generated, leading to a
crash in ast_type_qualifier::validate_in_qualifier() (ast_type.cpp:654).

This can be repro'd with the Piglit test tests/spec/glsl-1.50/execution/
varying-struct-basic-gs-fs.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105497
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-08 19:06:09 -06:00
Kenneth Graunke
20f06bc72b i965: Dump validation list on INTEL_DEBUG=bat,submit.
This is really useful when debugging any sort of buffer management
issues, so just printing it during INTEL_DEBUG=bat,submit seems
reasonable.  With bat, we're already spamming so much output that
it doesn't really hurt.  With submit, it's still easy to grep for
the older information, and the new information is nice too.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-08 10:08:16 -07:00
Jason Ekstrand
06d3841882 i965/miptree: Remove redundant fields from intel_miptree_aux_buffer
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:46 -07:00
Jason Ekstrand
4f4779b367 i965: Simplify brw_emit_depthbuffer and brw_emit_depth_stencil_hiz
Now that we're using ISL, a good chunk of brw_emit_depthstencil is
pointless checks which ISL will do for us anyway.  Since we only have
one manual depth buffer emit function, move the useful bits into it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:45 -07:00
Jason Ekstrand
96f01501d7 i965: Move brw_emit_depth_stencil_hiz higher up in the file
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:45 -07:00
Jason Ekstrand
bdbb527a65 i965: Use ISL for emitting depth/stencil/hiz state on gen6+
We leave gen4-5 alone because the ISL code hasn't really been well-
tested on gen4-5 or with combined depth-stencil because we don't use
BLORP for depth operations on gen4-5.  Also, the gen4-5 code has to deal
with intratile offsets for LOD hacks and ISL doesn't handle those yet.
We could make ISL gen4-5 capable or we could just not bother.

Among other things, this should make future platform enabling easier
because it means we don't have to update multiple (or hand-rolled!)
depth stencil emit paths.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:44 -07:00
Jason Ekstrand
ccd3dce3c0 i965: Use the brw_depthbuffer atom on all gens
The only reason why we had two atoms was that the one we used for gen7+
depended on _NEW_DEPTH and _NEW_STENCIL as well as _NEW_BUFFERS.  Since
this is no longer true, we can combine them into one atom.  We do add a
dependence on BRW_NEW_AUX_STATE but that should never get set on gen4-5
so adding it is a no-op for those platforms.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:44 -07:00
Jason Ekstrand
514bb6f41e i965: Always set depth/stencil write enables on gen7+
The hardware will AND these fields with the corresponding fields in
DEPTH_STENCIL_STATE so there's no real reason to toggle them on and off
based on state bits.  This removes our reliance on the _NEW_DEPTH and
_NEW_STENCIL state bits and better matches what ISL does.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:43 -07:00
Jason Ekstrand
c4d00da7b7 i965: Re-order depth/stencil/hiz/clear packets to match ISL
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:42 -07:00
Jason Ekstrand
6fc3404911 i965: Re-emit depth/stencil/hiz on BRW_NEW_AUX_STATE
Certain things can change the aux usage or fast clear color of a depth
surface and we want to re-emit if that happens.  For instance, if you do
a fast depth clear of an already clear depth surface, we will just set
the clear color and not do anything else.  In that case, we could fail
to re-emit 3DSTATE_CLEAR_PARAMS and not get the new fast-clear color.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:23:55 -07:00
Lionel Landwerlin
3cdf1bf97d intel: devinfo: fix assertion on devices with odd number of EUs
I forgot to change the assert in the second helper function in a
previous change.

This hit the assert() on a Broadwell platform with 1 slice, 3
subslices but all EUs disabled in subslice 1 & 2.

Fixes: c1900f5b0f ("intel: devinfo: add helper functions to fill fusing masks values")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 15:15:54 +01:00
Bas Nieuwenhuizen
b17cfb08a3 vulkan/wsi: Only use LINEAR modifier for prime if supported.
This was setting the LINEAR modifier if neither the
X server nor the driver supported modifiers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106180
Fixes: c80c08e226 "vulkan/wsi/x11: Add support for DRI3 v1.2"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Abel Garcia Dorta <mercuriete@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-08 15:47:16 +02:00
Jan Vesely
a9e4be9212 eg/compute: Drop reference to kernel_param bo in destructor
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-08 09:02:38 -04:00
Jan Vesely
a1e8fcce3e r600: Cleanup constant buffers on context destruction
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-08 09:02:30 -04:00
Alejandro Piñeiro
b6648798cf mesa/formatquery: remove online compression check on is_resource_supported
is_resource_supported returns whether the combination of
target/internalformat is supported in at least one operation. Online
compression is only mandatory for glTexImage2D. Some formats don't
support online compression but can still be used with the
glCompressed*D methods.

Without this commit, ETC2 internalformats were returning FALSE, even
for the drivers supporting it. So any other query (like
TEXTURE_COMPRESSED) was returning FALSE/NONE instead of the proper
value.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-08 08:19:38 +02:00
Kenneth Graunke
e6fb8196ce intel/genxml: Assert that genxml field start and ends are sane.
Chris recently fixed a bunch of genxml end < start bugs, as well as
booleans that are wider than a bit.  These are way too easy to write, so
asserting that the fields are sane is a good plan.
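
The kind of sanity check described here might look like the sketch below. It is only an illustration of the two conditions named in the message (end not before start, booleans exactly one bit wide); `check_field` and its parameters are hypothetical and do not reproduce the genxml scripts.

```python
def check_field(name: str, start: int, end: int, field_type: str) -> None:
    """Raise AssertionError if a field's bit range is not sane."""
    # A field must not end before it starts.
    assert end >= start, "%s: end (%d) < start (%d)" % (name, end, start)
    # A boolean occupies exactly one bit.
    if field_type == "bool":
        width = end - start + 1
        assert width == 1, "%s: bool field is %d bits wide" % (name, width)
```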

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 23:06:52 -07:00
Kenneth Graunke
f83fd929b7 intel/genxml: Fix some more fake booleans in genxml.
None of these are actually booleans.  Tile Parameter is a tiling mode
enum.  Display pipes take plane numbers.  Predicate Enable has some
operations (and the default value of 6 was particularly bogus).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 23:06:52 -07:00
Kenneth Graunke
33906eeaca intel/genxml: Make assert in gen_pack_header print a message.
Python's assert can take both a condition and a string, which will cause
it to print the string if the assertion trips.  (You can't use parens as
that creates a tuple.)  Doing "condition and string" works in C, but
doesn't have the desired effect in Python.
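
The three forms mentioned above behave differently, as this small sketch shows. The function names and the start/end values are made up for the demonstration; the tuple is built in a separate variable so the example runs without the compiler warning that a literal parenthesised assert triggers.

```python
from typing import Callable, Optional


def failure_message(check: Callable[[], None]) -> Optional[str]:
    """Run check(); return the AssertionError text, or None if it passed."""
    try:
        check()
    except AssertionError as exc:
        return str(exc)
    return None


def with_message(start: int = 8, end: int = 4) -> None:
    # Python form: condition, then a comma, then the message.
    assert end >= start, "field ends before it starts"


def c_style(start: int = 8, end: int = 4) -> None:
    # C idiom: the string only takes part in the boolean expression,
    # so the assertion fails without any message.
    assert end >= start and "field ends before it starts"


def tuple_form(start: int = 8, end: int = 4) -> None:
    # Wrapping both parts in parentheses builds a two-element tuple,
    # which is always truthy, so this assertion can never fail.
    pair = (end >= start, "field ends before it starts")
    assert pair
```

Only `with_message` reports the text; `c_style` fails with an empty message, and `tuple_form` silently passes even though the condition is false.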

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 23:06:52 -07:00
Kenneth Graunke
2dc29e095f i965: Don't leak blorp on Gen4-5.
We used to only initialize BLORP on Gen6+.  When we added it on Gen4-5,
we forgot to destroy it unconditionally.

Fixes: 752d7af77a (i965: Add blorp support for gen4-5)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-07 23:05:59 -07:00
Matt Turner
ed5af94373 nir: Transform discard_if(true) into discard
Noticed while reviewing Tim Arceri's NIR inlining series.

Without his series:

instructions in affected programs: 16 -> 14 (-12.50%)
helped: 2

With his series:

instructions in affected programs: 196 -> 174 (-11.22%)
helped: 22

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 13:50:23 -07:00
Jan Vesely
ea1fff4416 eg/compute: Drop reference on code_bo in destructor.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-07 15:04:03 -04:00
Nicolas Boichat
54ba73ef10 configure.ac/meson.build: Fix -latomic test
When compiling with LLVM 6.0 on x86 (32-bit) for Android, the test
fails to detect that -latomic is actually required, as the atomic
call is inlined.

In the code itself (src/util/disk_cache.c), we see this pattern:
p_atomic_add(cache->size, - (uint64_t)size);
where cache->size is an uint64_t *, and results in the following
link time error without -latomic:
src/util/disk_cache.c:628: error: undefined reference to '__atomic_fetch_add_8'

Fix the configure/meson test to replicate this pattern, which then
correctly realizes the need for -latomic.
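
A minimal sketch of the pattern such a test has to exercise, assuming
GCC/Clang atomic builtins. The struct and function names are hypothetical,
and the use of __atomic_add_fetch is an assumption about what the macro
expands to, not a copy of the actual configure/meson test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the disk cache: the 64-bit counter is reached
 * through a pointer, as in the pattern quoted above. */
struct cache_sketch {
    uint64_t *size;
};

/* Subtract `size` from the shared counter by atomically adding its
 * unsigned negation. On targets without native 64-bit atomics the compiler
 * may turn this into a libatomic call instead of inline code. */
uint64_t cache_sketch_shrink(struct cache_sketch *cache, uint64_t size)
{
    assert(cache && cache->size && "cache counter must be set");
    return __atomic_add_fetch(cache->size, -size, __ATOMIC_SEQ_CST);
}
```

Going through a pointer stored in a struct is what keeps the operation
close to the real call site; whether a given toolchain inlines it is
exactly what the test is meant to find out.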

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
2018-05-07 10:14:53 -07:00
Scott D Phillips
8b519075ea anv: remove unused field anv_queue::pool
The last use of the field was removed in 2015 by 48a87f4ba06
("anv/queue: Get rid of the serial").

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 09:03:46 -07:00
Kenneth Graunke
0b1cfd01ff i965: Set initial kflags on BO creation.
This simplifies kflag initialization, by creating a bufmgr-wide setting
for initial kflags, and just applying it whenever we create a new BO.

This also properly allows 48-bit addresses for imported BOs (via prime
or flink), which I had missed in my earlier 48-bit support series.

This will be useful when adding softpin support, as we'll want to add
EXEC_OBJECT_PINNED to initial_kflags as well.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-05-07 08:47:21 -07:00
Juan A. Suarez Romero
7ee54fc33d docs: update calendar, add news and link release notes to 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-07 11:25:54 +00:00
Juan A. Suarez Romero
78e103da8b docs: add sha256 checksums for 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ae12c5e990)
2018-05-07 11:19:36 +00:00
Juan A. Suarez Romero
6c06d4e17b docs: add sha256 checksums for 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 6dc2658fd6)
2018-05-07 11:19:34 +00:00
Chris Wilson
cf440d85db intel/genxml: Fix a few invalid field widths
A couple of typos found by inspecting field.end - field.start, revealed
a few wide integers declared as bool and some that ended before they
started.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-07 11:34:13 +01:00
Vinson Lee
cd5319a64f swr/rast: Fix include for createInstructionCombiningPass with llvm-7.0.
Fix build error after llvm-7.0.0svn r330669 ("InstCombine: Fix layering
by not including Scalar.h in InstCombine").

  CXX      rasterizer/jitter/libmesaswr_la-blend_jit.lo
rasterizer/jitter/blend_jit.cpp:816:20: error: use of undeclared identifier 'createInstructionCombiningPass'; did you mean 'createInstructionSimplifierPass'?
        passes.add(createInstructionCombiningPass());
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                   createInstructionSimplifierPass

Suggested-by: George Kyriazis <george.kyriazis@intel.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-05 13:20:53 -07:00
Jan Vesely
2f1ad72ac1 clover: Add explicit virtual destructor to argument class
It is needed to destroy the v vector in scalar_argument
Fixes memory leaks on parameter set/bind.

v2: Drop redundant scalar_argument destructor

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-05-05 13:17:08 -04:00
Iago Toral Quiroga
e4c667b9e8 anv/device: expose shaderInt16 support in gen8+
This rolls back the revert of this patch introduced with
commit 7cf284f18e.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-05 12:41:14 +02:00
Iago Toral Quiroga
5a12bdac09 i965/compiler: handle conversion to smaller type in the lowering pass for that
This rolls back the revert of this same patch introduced in
commit 7b9c15628a.

And also squashes the following patch to prevent a piglit regression caused
by this change:

intel/compiler: Fix lower_conversions for 8-bit types.
Author: Jose Maria Casanova Crespo <jmcasanova@igalia.com>

For 8-bit types the execution type is word. A byte raw MOV has 16-bit
execution type and 8-bit destination and it shouldn't be considered
a conversion case. So there is no need to change alignment and enter
in lower_conversions for these instructions.

Fixes a regression in the piglit test "glsl-fs-shader-stencil-export"
that is introduced with this patch from the Vulkan shaderInt16 series:
'i965/compiler: handle conversion to smaller type in the lowering
pass for that'. The problem is caused because there is already a case
in the driver that injects Byte instructions like this:

mov(8)          g127<1>UB       g2<32,8,4>UB

And the aforementioned pass was not accounting for the special
handling of the execution size of Byte instructions. This patch
fixes this.

v2: (Jason Ekstrand)
   - Simplify is_byte_raw_mov, include reference to PRM and not
   consider B <-> UB conversions as raw movs.

v3: (Matt Turner)
   - Indentation style fixes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-05 12:41:02 +02:00
Iago Toral Quiroga
a75f967388 intel/compiler: handle 16-bit to 64-bit conversions in BSW platforms
These are subject to the general restriction that anything that is converted
to 64-bit needs to be aligned to 64-bit.  We had this already in place for
32-bit to 64-bit conversions, so this patch generalizes the implementation
to take effect on any conversion to 64-bit from a source smaller than
64-bit.

Fixes assembly validation errors in the following CTS tests in BSW:
dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_int64
dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint16_to_uint64
dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_uint64

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-05 12:26:37 +02:00
Caio Marcelo de Oliveira Filho
9d1ff2261c intel/genxml: recognize 0x, 0o and 0b when setting default value
Remove the need to convert values that are documented in
hexadecimal. This patch would allow writing

    <field name="3D Command Sub Opcode" ... default="0x1B"/>

instead of

    <field name="3D Command Sub Opcode" ... default="27"/>
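
The prefix handling can be sketched as below. The generator itself is a
Python script, so this C version is only an analogue written to match the
rest of the tree; parse_default_sketch is a hypothetical name and the code
is not taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a default="..." attribute value that may
 * carry a 0x (hex), 0o (octal) or 0b (binary) prefix. */
unsigned long parse_default_sketch(const char *text)
{
    int base = 10;

    assert(text && "default value must not be NULL");

    if (strncmp(text, "0x", 2) == 0 || strncmp(text, "0X", 2) == 0) {
        base = 16;
        text += 2;
    } else if (strncmp(text, "0o", 2) == 0 || strncmp(text, "0O", 2) == 0) {
        base = 8;
        text += 2;
    } else if (strncmp(text, "0b", 2) == 0 || strncmp(text, "0B", 2) == 0) {
        base = 2;
        text += 2;
    }

    /* strtoul() accepts any base from 2 to 36, so the prefix only has to
     * be stripped and mapped to the matching base. */
    return strtoul(text, NULL, base);
}
```

With this sketch "0x1B", "0o33", "0b11011" and "27" all parse to the same
value.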

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-04 23:58:10 +01:00
Ian Romanick
9a10a2fd5f r200: Enable NV_fog_distance
With the previous fixes in place, it appears to just work.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:29:30 -07:00
Ian Romanick
9d0bf720ed i965: Enable NV_fog_distance
With the previous fixes in place, it appears to just work.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:29:28 -07:00
Ian Romanick
df80ffa4aa ffvertex: Don't try to read output registers in fog calculation
Gallium drivers use _mesa_remove_output_reads() via st_program to lower
output reads away.  It seems better to just generate the right thing in
the first place.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:27:50 -07:00
Ian Romanick
f2db3be620 mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)
Found by inspection, so I made a piglit test too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:27:44 -07:00
Ian Romanick
d350276b03 mesa: Silence an unused parameter warning
main/framebuffer.c: In function ‘update_color_draw_buffers’:
main/framebuffer.c:629:46: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 update_color_draw_buffers(struct gl_context *ctx, struct gl_framebuffer *fb)
                                              ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:27:40 -07:00
Gert Wollny
e695a35f40 mesa/main/readpix: Correct handling of packed floating point values
Make sure that clamping in the pixel transfer operations is enabled/disabled
for packed floating point values just like it is done for single normal and
half precision floating point values.

This fixes a series of CTS tests with virgl that use r11f_g11f_b10f
buffers as target, and where virglrenderer reads these surfaces back
using the format GL_UNSIGNED_INT_10F_11F_11F_REV.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-04 10:47:46 -07:00
Scott D Phillips
5c075b0855 util/set: add a set_clear function
Clear a set back to the state of having zero entries.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-04 10:13:33 -07:00
Tapani Pälli
affe63b1da egl: add EGL_BAD_MATCH error case for surfaceless and android
As is done for other backends when a suitable config is not
found (added in fd4eba4929).

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-05-04 14:04:03 +03:00
Nicolai Hähnle
c0acb596f4 amd/common: use llvm.amdgcn.wqm for explicit derivatives
To comply with an upcoming change in LLVM, see
https://reviews.llvm.org/D46051

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-04 11:02:48 +02:00
Rhys Perry
b30949a9c2 nv50/ir: fix printing of pixld
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-03 22:57:46 -04:00
Drew Davenport
4373dd3215 st/va: Support YUV formats in vaCreateSurfaces
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2018-05-03 15:48:35 -07:00
Mark Janes
7cf284f18e Revert "anv/device: expose shaderInt16 support in gen8+"
This reverts commit 0ba0ac815e.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-03 15:26:59 -07:00
Mark Janes
7b9c15628a Revert "i965/compiler: handle conversion to smaller type in the lowering pass for that"
This reverts commit 96b5153790.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-03 15:26:59 -07:00
Vinson Lee
589622a2fe swr/rast: Fix WriteBitcodeToFile usage with llvm-7.0.
Fix build error after llvm-7.0svn r325155 ("Pass a reference to a module
to the bitcode writer.").

  CXX      rasterizer/jitter/libmesaswr_la-JitManager.lo
rasterizer/jitter/JitManager.cpp:548:30: error: reference to type 'const llvm::Module' could not bind to an lvalue of type 'const llvm::Module *'
    llvm::WriteBitcodeToFile(M, bitcodeStream);
                             ^

Suggested-by: George Kyriazis <george.kyriazis@intel.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-03 14:06:09 -07:00
Deepak Rawat
9a21c96126 egl/x11: Send invalidate to driver on copy_region path in swap_buffer
As in the swap_available path, send invalidate to the driver because
egl/X11 is not watching for the server's invalidate events. The
dri2_copy_region path is triggered when the server supports DRI2 minor
version 1.

Tested with piglit egl tests for regression.

V2: Move invalidate from dri2_copy_region to swap_buffer common.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2018-05-03 13:55:58 +02:00
Juan A. Suarez Romero
fd4eba4929 egl: check if colorspace/surface type is supported
According to EGL 1.4 spec, section 3.5.1 ("Creating On-Screen Rendering
Surfaces"), if config does not support the colorspace or alpha format
attributes specified in attrib_list (as defined for
eglCreateWindowSurface), an EGL_BAD_MATCH error is generated.

This fixes dEQP-EGL.functional.wide_color.*_888_colorspace_srgb (still
not merged,
https://android-review.googlesource.com/c/platform/external/deqp/+/667322),
which is crashing when trying to create a window surface with RGB888
configuration and sRGB colorspace.

v2: Handle the fix in other backends (Tapani)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-03 12:26:12 +02:00
Iago Toral Quiroga
0ba0ac815e anv/device: expose shaderInt16 support in gen8+
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
002cb6f2b3 anv/pipeline: support SpvCapabilityInt16 in gen8+
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
f07c05576f compiler/spirv: add implementation to check for SpvCapabilityInt16 support
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
dd41630d9a intel/compiler: implement 16-bit pack/unpack opcodes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
1dacb56279 compiler/spirv: implement 16-bit bitcasts
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
2d648e5ba3 compiler/lower_64bit_packing: rename the pass to be more generic
It can do 32-bit packing too now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
d2564af842 nir/lower_64bit_packing: extend the pass to handle packing from / to 16-bit.
With 16-bit support we can now do 32-bit packing, a follow-up patch will
rename the pass to something more generic.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
c9653cc14c nir: add opcodes for 16-bit packing and unpacking
Notice that we don't need 'split' versions of the 64-bit to / from
16-bit opcodes which we require during pack lowering to implement these
operations. This is because these operations can be expressed as a
collection of 32-bit from / to 16-bit and 64-bit to / from 32-bit
operations, so we don't need new opcodes specifically for them.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
6318808a05 intel/compiler: fix 16-bit comparisons
NIR assumes that booleans are always 32-bit, but Intel hardware produces
16-bit booleans for 16-bit comparisons. This means that we need to convert
the 16-bit result to 32-bit.
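
A rough sketch of the widening step, written as plain C rather than as
backend code. The all-ones/zero encoding of booleans and both function
names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Models a 16-bit comparison whose result is also 16 bits wide:
 * all ones for true, zero for false (assumed encoding). */
uint16_t cmp_lt16_sketch(int16_t a, int16_t b)
{
    return a < b ? 0xffffu : 0u;
}

/* Widen the 16-bit boolean to the 32-bit form the rest of the compiler
 * expects, keeping the all-ones/zero convention. */
uint32_t widen_bool16_sketch(uint16_t b16)
{
    assert((b16 == 0 || b16 == 0xffffu) && "expected a canonical boolean");
    return b16 ? 0xffffffffu : 0u;
}
```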

In the future we want to add an optimization pass to clean this up and
hopefully remove the conversions.

v2 (Jason): use the type of the source for the temporary and use
            brw_reg_type_from_bit_size for the conversion to 32-bit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
b11e9425df intel/compiler: lower some 16-bit integer operations to 32-bit
These are not supported in hardware for 16-bit integers.

We do the lowering pass after the optimization loop to ensure that we
lower ALU operations injected by algebraic optimizations too.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
b9a3d8c23e compiler/nir: add a lowering pass to convert the bit size of ALU operations
Not all bit-sizes may be supported natively in hardware for all operations.
This pass allows drivers to lower such operations to a bit-size that is
actually supported and then converts the result back to the original
bit-size.

Compiler backends control which operations and which bit-sizes require
the lowering through a callback function.
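
The callback contract can be pictured with the toy model below. None of
these types or names are the NIR API; they are hypothetical stand-ins, and
the convention that the callback returns 0 for "leave alone" is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum alu_op_sketch { OP_IADD, OP_IDIV };

struct alu_instr_sketch {
    enum alu_op_sketch op;
    unsigned bit_size;      /* bit size the shader asked for */
    unsigned exec_bit_size; /* bit size the operation will run at */
    bool convert_result;    /* result must be converted back to bit_size */
};

/* Backend hook: returns the bit size to lower to, or 0 to leave the
 * instruction alone. */
typedef unsigned (*lower_bit_size_cb)(const struct alu_instr_sketch *instr);

/* Example backend policy: only 16-bit integer division needs lowering. */
unsigned lower_idiv16_cb(const struct alu_instr_sketch *instr)
{
    return (instr->op == OP_IDIV && instr->bit_size == 16) ? 32 : 0;
}

/* Walk the instructions, ask the backend about each one, and record where
 * a widen/operate/narrow sequence is required. Returns how many
 * instructions were lowered. */
size_t lower_bit_size_sketch(struct alu_instr_sketch *instrs, size_t count,
                             lower_bit_size_cb cb)
{
    size_t lowered = 0;

    assert(cb && "a callback is required");

    for (size_t i = 0; i < count; i++) {
        unsigned target = cb(&instrs[i]);

        if (target == 0 || target == instrs[i].bit_size) {
            instrs[i].exec_bit_size = instrs[i].bit_size;
            instrs[i].convert_result = false;
            continue;
        }

        instrs[i].exec_bit_size = target;
        instrs[i].convert_result = true;
        lowered++;
    }
    return lowered;
}
```

Keeping the policy in the callback means the pass itself stays free of
hardware knowledge.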

v2: generalize this pass and make it available in NIR core (Rob, Jason)
v3: remove some temporaries and reduce nesting in instruction loop using
    a continue statement (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo
f575277f7e intel/compiler: support negate and abs of half float immediates
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo
f0e6dacee5 intel/compiler: fix brw_imm_w for negative 16-bit integers
16-bit immediates need to replicate the 16-bit immediate value
in both words of the 32-bit value. This needs to be careful
to avoid sign-extension, which the previous implementation was
not handling properly.

For example, with the previous implementation, storing the value
-3 would generate imm.d = 0xfffffffd due to signed integer sign
extension, which is not correct. Instead, we should cast to
uint16_t, which gives us the correct result: imm.ud = 0xfffdfffd.

We only had a couple of cases hitting this path in the driver
until now, one with value -1, which would work since all bits are
one in this case, and another with value -2 in brw_clip_tri(),
which would hit the aforementioned issue (this case only affects
gen4 although we are not aware of whether this was causing an
actual bug somewhere).
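
The difference between the two behaviours can be reproduced with a small
standalone sketch. The function names are hypothetical and this is not the
driver's brw_imm_w; it only shows the arithmetic described above.

```c
#include <assert.h>
#include <stdint.h>

/* What plain sign extension yields: the upper word is filled with copies
 * of the sign bit instead of a copy of the value. */
uint32_t imm_w_sign_extended_sketch(int16_t w)
{
    return (uint32_t)(int32_t)w;
}

/* Replicate the 16-bit pattern into both words. Going through uint16_t
 * first discards the sign, so nothing is sign-extended. */
uint32_t imm_w_replicated_sketch(int16_t w)
{
    uint32_t lo = (uint16_t)w;
    uint32_t imm = lo | (lo << 16);

    assert((imm & 0xffffu) == (imm >> 16) && "both words must match");
    return imm;
}
```

For -1 the two functions agree, which matches the observation that the -1
case happened to work.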

v2: Make explicit uint32_t casting for left shift (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo
2a76f03c90 intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate
From Intel Skylake PRM, vol 07, "Immediate" section (page 768):

"For a word, unsigned word, or half-float immediate data,
software must replicate the same 16-bit immediate value to both
the lower word and the high word of the 32-bit immediate field
in a GEN instruction."

This fixes the int16/uint16 negate and abs immediates that weren't
taking into account the replication in lower and upper words.
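
A sketch of negate and abs that keep the replication intact, assuming the
immediate already holds the same 16-bit value in both words. All names are
hypothetical; this is not the code of brw_negate_immediate or
brw_abs_immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Copy a 16-bit pattern into both words of a 32-bit immediate. */
uint32_t replicate_w_sketch(uint16_t w)
{
    return (uint32_t)w | ((uint32_t)w << 16);
}

/* Negate the 16-bit value (two's complement, modulo 2^16) and replicate
 * the result, instead of negating the 32-bit field as a whole. */
uint32_t negate_imm_w_sketch(uint32_t imm)
{
    assert((imm & 0xffffu) == (imm >> 16) && "both words must match");
    return replicate_w_sketch((uint16_t)(0u - (imm & 0xffffu)));
}

/* Absolute value of the 16-bit value, replicated. 0x8000 has no positive
 * counterpart in 16 bits and is returned unchanged. */
uint32_t abs_imm_w_sketch(uint32_t imm)
{
    uint16_t w = (uint16_t)(imm & 0xffffu);

    assert((imm & 0xffffu) == (imm >> 16) && "both words must match");
    if (w & 0x8000u)
        w = (uint16_t)(0u - w);
    return replicate_w_sketch(w);
}
```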

v2: Integer cases are different to Float cases. (Jason Ekstrand)
    Included reference to PRM (Jose Maria Casanova)
v3: Make explicit uint32_t casting for left shift (Jason Ekstrand)
    Split half float implementation. (Jason Ekstrand)
    Fix brw_abs_immediate (Jose Maria Casanova)

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo
e5fc3c0717 intel/compiler: implement nir_instr_type_load_const for 16-bit constants
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
939501c8ed intel/compiler: implement conversions from 16-bit int/float to bool
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
d5a419176f intel/compiler: implement conversion between float/int 16-bit types
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
96b5153790 i965/compiler: handle conversion to smaller type in the lowering pass for that
The lowering pass was specialized to act on 64-bit to 32-bit conversions only,
but the implementation is valid for other cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
5361a87ee7 intel/compiler: fix isign for 16-bit integers
We need to use 16-bit constants with 16-bit instructions,
otherwise we get the following validation error:

"Destination stride must be equal to the ratio of the sizes of
 the execution data type to the destination type"

Because the execution data type is 4B due to the 32-bit integer
constant.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Chris Wilson
b5e266765a i965: Always try to create a logical context
Always enable use of HW logical contexts to preserve GPU state between
batches when the kernel supports such constructs, continuing to enforce
the required support for gen6+.

At runtime, this effectively removes the BRW_NEW_CONTEXT flag (and the
upload of invariant state) from the start of every batch for any kernel
supporting contexts. So long as the older atoms are correctly listening
to the right flag (NEW_CONTEXT rather than NEW_BATCH) this should
eliminate a few redundant state uploads for the older platforms.

No piglits were harmed on ctg and ilk, both with and without logical
contexts.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-03 01:39:33 -07:00
Neil Roberts
e17d0ccbbd spirv: Apply OriginUpperLeft to FragCoord
This behaviour was changed in 1e5b09f42f. The commit message
for that says it is just a “tidy up” so my assumption is that the
behaviour change was a mistake. It’s a little hard to decipher looking
at the diff, but the previous code before that patch was:

  if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
     nir_var->data.origin_upper_left = b->origin_upper_left;

  if (builtin == SpvBuiltInFragCoord)
     nir_var->data.pixel_center_integer = b->pixel_center_integer;

After the patch the code was:

  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     /* fallthrough */
  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     break;

Before the patch origin_upper_left affected both builtins and
pixel_center_integer only affected FragCoord. After the patch
origin_upper_left only affects SamplePosition and pixel_center_integer
affects both variables.

This patch tries to restore the previous behaviour by changing the
code to:

  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     /* fallthrough */
  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     break;

This change will be important for ARB_gl_spirv which is meant to
support OriginLowerLeft.
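
The effect of the case order can be checked with a self-contained model of
the restored switch. The enum, struct and function below are hypothetical
stand-ins for the SPIR-V builtin and the NIR variable data, used only to
make the fallthrough behaviour testable.

```c
#include <assert.h>
#include <stdbool.h>

enum builtin_sketch {
    BUILTIN_FRAG_COORD,
    BUILTIN_SAMPLE_POSITION,
    BUILTIN_OTHER
};

struct var_data_sketch {
    bool origin_upper_left;
    bool pixel_center_integer;
};

/* FragCoord sets pixel_center_integer and then falls through, so it also
 * receives origin_upper_left. SamplePosition only gets origin_upper_left. */
void apply_builtin_sketch(enum builtin_sketch builtin,
                          bool origin_upper_left, bool pixel_center_integer,
                          struct var_data_sketch *data)
{
    assert(data && "output must not be NULL");

    switch (builtin) {
    case BUILTIN_FRAG_COORD:
        data->pixel_center_integer = pixel_center_integer;
        /* fallthrough */
    case BUILTIN_SAMPLE_POSITION:
        data->origin_upper_left = origin_upper_left;
        break;
    default:
        break;
    }
}
```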

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: 1e5b09f42f "spirv: Tidy some repeated if checks..."
2018-05-03 10:08:42 +02:00
Samuel Iglesias Gonsálvez
b291a3a4a3 spirv: convert some operands for bitwise shift and bitwise ops to uint32
SPIR-V allows defining the shift, offset and count operands for
shift and bitfield opcodes with a bit-size different than 32 bits,
but in NIR the opcodes have that limitation. As agreed in the
mailing list, this patch adds a conversion to 32 bits to fix this.

For more info, see:

https://lists.freedesktop.org/archives/mesa-dev/2018-April/193026.html

v2:
- src_bit_size will have zero value for variable bit-size operands (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 07:07:24 +02:00
Timothy Arceri
58c05ede96 mesa: enable geom shaders in OpenGL 3.2 Compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-03 12:08:21 +10:00
Bas Nieuwenhuizen
ffa15861ef radv: Use EnumerateInstanceVersion for the default version.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-02 21:57:08 +02:00
Bas Nieuwenhuizen
467c562a29 radv: Don't check the incoming apiVersion on CreateInstance.
This fixes

dEQP-VK.api.device_init.create_instance_invalid_api_version

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-02 21:57:08 +02:00
Bas Nieuwenhuizen
9267ff9883 radv: Allow vkEnumerateInstanceVersion ProcAddr without instance.
Apparently somewhere between 1.1.70 and 1.1.73 the loader started
depending on this. The loader then creates a 1.0 instance, which gets
into a funny situation because we have a 1.1 device.

No idea how to do line wrapping in Mako though, my random guesses
did not work.

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 21:57:08 +02:00
Lionel Landwerlin
336decd67e intel: aubinator: add an option to limit the number of decoded VBO lines
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 19:46:47 +01:00
Lionel Landwerlin
000452aebc intel: decoder: limit the number of decoded lines from VBO
By default we set no limit, but the debug batch decoder in i965 sets
it to 100.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 19:46:47 +01:00
Jason Ekstrand
bd35345e85 anv: Advertise variableMultisampleRate
Initially, I didn't understand this feature.  Turns out that all it
means is that you can switch multisample rates in the middle of a
zero-attachment subpass.  We've been able to do this since forever.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-05-02 10:59:03 -07:00
Rob Clark
28e410f6a5 nir: add missing dependency in meson.build
nir_builder_opcodes.h also depends on nir_intrinsics.py for generating
the system-value builders.

Reported-by: Christoph Haag <haagch@frickel.club>
Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 13:57:51 -04:00
Matthew Nicholls
97d57ef917 radv: fix multisample image copies
Before fb077b0728, the LOD parameter was being used in place of the
sample index, which would only copy the first sample to all samples in the
destination image. After that commit, multisample image copies wouldn't copy
anything, from my observations.

This fixes some copy_and_blit CTS tests.

v3.1: - set lod to 0 for nir_txf_ms (Samuel)
v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel)
    - updated commit description (Samuel)

Fix this properly by copying each sample in a separate radv_CmdDraw and using a
pipeline with the correct rasterizationSamples for the destination image.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-02 19:32:00 +02:00
Kenneth Graunke
169d8e011a intel: Fix 3DSTATE_CONSTANT buffer decoding.
First, this was iterating over the 3DSTATE_CONSTANT_* instruction
but trying to process fields of the 3DSTATE_CONSTANT_BODY substructure.

Secondly, the fields have been called Buffer[0] and Read Length[0],
for a while now, and we were not handling the subscripts correctly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 10:09:28 -07:00
Lionel Landwerlin
cf1d587879 intel: fix aubinator include
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7c22c150c4 ("intel: Move batch decoder/disassembler from tools/ to common/")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:54:29 +01:00
Kenneth Graunke
0ab423388c i965: Reuse batch decoder infrastructure rather than open coding it.
With the new callback, Jason's newer batch decoder infrastructure
should be able to do just as well as the old open coded INTEL_DEBUG=bat
handling, with much less code.  If there are any limitations, we'd like
to improve the common code rather than doing one-off hacks here.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 09:27:56 -07:00
Kenneth Graunke
bf91b81a0b intel: Give the batch decoder a callback to ask about state size.
Given an arbitrary batch, we don't always know what the size of certain
things are, such as how many entries are in a binding table.  But it's
easy for the driver to track that information, so with a simple callback
we can calculate this correctly for INTEL_DEBUG=bat.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 09:27:56 -07:00
Kenneth Graunke
7c22c150c4 intel: Move batch decoder/disassembler from tools/ to common/
Making these part of libintel_common allows us to use them in the DRI
driver.  The standalone tool binaries already link against the common
library, too, so it's no harder for them.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 09:27:56 -07:00
Kenneth Graunke
5c04971831 i965: Allocate shadow batches to explicitly be the BO size.
This unfortunately makes it malloc/realloc on every new batch, rather
than once at startup.  But it ensures that the shadow buffer's size will
absolutely match the BO size.  Otherwise, as we tune BATCH_SZ/STATE_SZ
or bufmgr cache bucket sizes, we may get a BO size that's rounded up,
and fail to allocate the shadow buffer large enough.

This doesn't fix any bugs today, as BATCH_SZ/STATE_SZ are the size of
a cache bucket, but it's better to be safe than sorry.

Reported-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 09:26:55 -07:00
Lionel Landwerlin
ec5df73803 intel: batch-decoder: iterate VERTEX_BUFFER_STATE fields
The gen_field_iterator only iterates the fields of a given gen_group.
If we want to iterate the fields of another gen_group contained as
field, we need to do it manually.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:11:28 +01:00
Lionel Landwerlin
acbce2ac57 intel: decoder: fix starting dword of struct fields
Struct fields might span several dwords, but iter_dword is incremented
up to the last dword of the current field before we print out the
struct's fields. We can't use iter_dword for computing the offset into
the pointer of data to decode.

v2: Fix displayed offset number (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:11:28 +01:00
Lionel Landwerlin
467430ddcc intel: decoder: document when fields should be used
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:10:37 +01:00
Lionel Landwerlin
4f128f7850 intel: decoder: identify groups with fixed length
<register> & <struct> elements always have fixed length. The
get_length() method implies that we're dealing with an instruction in
which the length is encoded into the variable data but the field
iterator uses it without checking what kind of gen_group it is dealing
with.

Let's make get_length() report the correct length regardless of the
gen_group (register, struct or instruction).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:10:37 +01:00
Lionel Landwerlin
3c416a50d8 intel: decoder: make the field iterator use more natural
while (iter_next()) { ... }

instead of

do { ... } while (iter_next());
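
A toy iterator with the first shape might look like this. The struct and
function names are hypothetical and far simpler than the real field
iterator; the point is only that the advance step doubles as the loop
condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct field_iter_sketch {
    const int *values;
    size_t count;
    size_t next;    /* index of the element iter_next will load */
    int current;    /* valid only after iter_next returned true */
};

void field_iter_sketch_init(struct field_iter_sketch *it,
                            const int *values, size_t count)
{
    assert(it && "iterator must not be NULL");
    it->values = values;
    it->count = count;
    it->next = 0;
    it->current = 0;
}

/* Advance to the next element. Returns false once the sequence is
 * exhausted, including when it was empty to begin with. */
bool field_iter_sketch_next(struct field_iter_sketch *it)
{
    if (it->next >= it->count)
        return false;
    it->current = it->values[it->next++];
    return true;
}

int sum_fields_sketch(const int *values, size_t count)
{
    struct field_iter_sketch it;
    int sum = 0;

    field_iter_sketch_init(&it, values, count);
    while (field_iter_sketch_next(&it))
        sum += it.current;
    return sum;
}
```

In this sketch an empty sequence needs no special case, because the body
never runs before the first successful advance.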

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:10:37 +01:00
Vlad Golovkin
967aabca06 nv50: Extract needed value bits without shifting them before calling bitcount
This can save one instruction since bitcount doesn't care about specific
bits' positions.
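
The equivalence being relied on can be sketched in plain C. The helper
names are hypothetical, and the portable popcount loop stands in for
whatever bitcount operation the backend actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count: clears the lowest set bit each iteration. */
unsigned popcount32_sketch(uint32_t v)
{
    unsigned n = 0;

    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Shift the field down to bit 0, mask it, then count. */
unsigned count_shifted_sketch(uint32_t value, unsigned shift, uint32_t mask)
{
    assert(shift < 32 && "shift out of range");
    return popcount32_sketch((value >> shift) & mask);
}

/* Mask the field where it already sits and count. The same bits are
 * selected, only at different positions, so the count is identical and
 * the shift of the value is not needed. */
unsigned count_in_place_sketch(uint32_t value, unsigned shift, uint32_t mask)
{
    assert(shift < 32 && "shift out of range");
    return popcount32_sketch(value & (mask << shift));
}
```

When shift and mask are compile-time constants, mask << shift folds into a
single constant, which is where the saved instruction comes from.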

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-05-02 15:12:48 +02:00
Antia Puentes
3a1df14a7b intel: activate the gl_BaseVertex lowering
Surplus code related to the basevertex is removed.

The Vertex Elements contain now:
* VE 1: <firstvertex, BaseInstance, VertexID, InstanceID>
* VE 2: <DrawID, is_indexed_draw, 0, 0>

Also fixes unreachable message.

Fixes OpenGL CTS tests:
* KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
* KHR-GL46.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters

Fixes Piglit tests:
* arb_shader_draw_parameters-drawid-indirect baseinstance
* arb_shader_draw_parameters-basevertex

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
2018-05-02 11:24:46 +02:00
Antia Puentes
0fb204fac1 compiler/nir: Add conditional lowering for gl_BaseVertex
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 11:24:31 +02:00
Antia Puentes
0cbf29fa55 intel: emit is_indexed_draw in the same VE as gl_DrawID
The Vertex Elements are now:
* VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID>
* VE 2: <DrawID, is-indexed-draw, 0, 0>

VE1 is kept as it was before, VE2 additionally contains the new
system value.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 11:23:34 +02:00
Antia Puentes
6ba9088d9c intel/compiler: Add uses_is_indexed_draw flag
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 11:20:48 +02:00
Antia Puentes
9e6b886cf2 compiler: Add SYSTEM_VALUE_IS_INDEXED_DRAW and intrinsics
This VS system value indicates whether the draw command used to start the
rendering was an indexed draw command or a non-indexed one
(~0/0 respectively). Useful to calculate the gl_BaseVertex as:
(SYSTEM_VALUE_IS_INDEXED_DRAW & SYSTEM_VALUE_FIRST_VERTEX).
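
The masking can be sketched in plain C. The function names below are made up for illustration; only the ~0/0 encoding and the AND come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encode the system value as described: ~0 for indexed draws, 0 otherwise. */
static uint32_t is_indexed_draw_value(bool indexed)
{
    return indexed ? ~0u : 0u;
}

/* gl_BaseVertex is firstvertex for indexed draws and 0 for non-indexed
 * ones, which a bitwise AND with the all-ones/all-zeros mask produces. */
static uint32_t lower_base_vertex(uint32_t is_indexed_draw, uint32_t first_vertex)
{
    return is_indexed_draw & first_vertex;
}
```

Using an all-ones mask rather than a 0/1 boolean is what lets a single AND replace a select.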

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 11:20:40 +02:00
Samuel Pitoiset
0737c1e3a6 radv: enable out-of-order rasterization by default
As the implementation is conservative, we can now enable it
by default. It can be disabled with RADV_DEBUG=nooutoforder.

Don't expect much more than a 1% improvement, but the gain
seems consistent.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-02 10:33:24 +02:00
Samuel Pitoiset
1d766b0196 radv: only disable out-of-order rast for perfect occlusion queries
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-02 10:33:22 +02:00
Kenneth Graunke
1122fb2d98 i965: Drop unused gen5 sampler default color struct.
Trivial.
2018-05-01 23:09:25 -07:00
Kenneth Graunke
9f6082f6c7 i965: Make brw_vs_outputs_written static.
Drop a prototype.  Trivial.
2018-05-01 23:09:16 -07:00
Nanley Chery
3e56e4642f i965/tex_image: Avoid the ASTC LDR workaround on gen9lp
Both the internal documentation and the results of testing this in the
CI suggest that this is unnecessary. Add the fixes tag because this
reduces an internal benchmark's startup time by about 17 seconds
(reported by Eero).

Fixes: 710b1d2e66 "i965/tex_image: Flush certain subnormal ASTC channel values"
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-01 16:47:39 -07:00
Eric Anholt
800be7f277 freedreno: Fix ir3_cmdline.c build.
Fixes: 6487e7a30c ("nir: move GL specific passes to src/compiler/glsl")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-05-01 16:38:37 -07:00
Jason Ekstrand
d216ffc604 anv: Allow lookup of vkEnumerateInstanceVersion without an instance
Fixes: cbab2d1da5
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-01 14:45:51 -07:00
Jason Ekstrand
d5a0787f03 anv: Don't advertise Float64 or Int64 on HW without 64-bit types
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-05-01 14:45:50 -07:00
Samuel Pitoiset
d8db5986ce radv: compute the number of subpass attachments correctly
Only count color attachments twice if resolves are used, also
account for the depth stencil attachment if present.
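
A rough sketch of the counting rule as worded above. The function and parameter names are hypothetical, and the sketch covers only the terms this message names; the real driver may count other attachment kinds as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count subpass attachments: color attachments once, a second time only
 * when resolve attachments are used, plus one for depth/stencil if present. */
static uint32_t subpass_attachment_count(uint32_t color_count,
                                         bool has_resolve,
                                         bool has_depth_stencil)
{
    uint32_t count = color_count;

    if (has_resolve)
        count += color_count;
    if (has_depth_stencil)
        count += 1;
    return count;
}
```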

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-01 22:18:03 +02:00
Dave Airlie
e66f64c285 radv: set fmask_surf_index on fmask surfaces.
This is needed on gfx9 and later for all fmask surfaces.

(Mentioned by Marek on irc)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-02 06:01:42 +10:00
Brian Paul
f298ed93d9 gallium/i915: fix PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE typo
Fixes: fffe5e2d14 ("gallium: add initial support for conservative
rasterization")
Trivial.
2018-05-01 09:52:22 -06:00
Rhys Perry
07dac3e040 nvc0: add conservative rasterization support
Subpixel precision bias, dilation and the post-snap mode are supported on
GM200 and newer. The pre-snap mode is supported for triangle primitives on
GP100.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-30 21:13:53 -06:00
Rhys Perry
97f5f399ef st/mesa: add support for nvidia conservative rasterization extensions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-30 21:13:53 -06:00
Rhys Perry
fffe5e2d14 gallium: add initial support for conservative rasterization
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-30 21:13:53 -06:00
Rhys Perry
4580617509 mesa: add support for nvidia conservative rasterization extensions
Although the specs are written against compatibility GL 4.3 and allow core
profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-30 21:13:53 -06:00
Brian Paul
31ab0427a7 glsl/tests: add GLSL_TYPE_UINT8, GLSL_TYPE_INT8 cases to switch statements
To silence warnings about unhandled switch values.
Untested otherwise.

v2: move the INT/UINT8 cases after the INT/UINT16 cases, per Eric.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-30 21:13:53 -06:00
Brian Paul
efec712d51 tgsi: use enums instead of unsigned in ureg code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-30 21:13:53 -06:00
Timothy Arceri
6487e7a30c nir: move GL specific passes to src/compiler/glsl
With this we should have no passes in src/compiler/nir with any
dependencies on headers from core GL Mesa.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-05-01 12:39:33 +10:00
Andres Rodriguez
f56e22e496 radv/winsys: fix leaking resources from bo's imported by fd
A bo's ref_count was not being initialized when imported from an fd.
Therefore, we would fail to free the resource during vkFreeMemory().
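
The failure mode can be sketched with a toy buffer object. struct demo_bo and its helpers are hypothetical stand-ins, not the radv winsys types. The release path only frees when the count drops to exactly zero, so an import path that leaves ref_count at 0 would decrement to -1 and never free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical buffer object; calloc() leaves ref_count at 0. */
struct demo_bo {
    int ref_count;
    int fd;
};

/* Import path: the reference held by the importer must be set explicitly. */
static struct demo_bo *demo_bo_import(int fd)
{
    struct demo_bo *bo = calloc(1, sizeof(*bo));

    if (!bo)
        return NULL;
    bo->fd = fd;
    bo->ref_count = 1;   /* the initialization the import path was missing */
    return bo;
}

/* Drop one reference; free and return true when it was the last one. */
static bool demo_bo_unref(struct demo_bo *bo)
{
    assert(bo != NULL);
    if (--bo->ref_count == 0) {
        free(bo);
        return true;
    }
    return false;
}
```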

This patch fixes applications like hifi VR in threaded mode, which
perform frequent imports/releases of IPC shared memory.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-04-30 18:20:30 -04:00
Scott D Phillips
2a08ae3c7c i965/tiled_memcpy: ytiled_to_linear a cache line at a time
Similar to the transformation applied to linear_to_ytiled, also align
each readback from the ytiled source to a cacheline (i.e. transfer a
whole cacheline from the source before moving on to the next column).
This will allow us to utilize movntdqa (_mm_stream_load_si128) in a
subsequent patch to obtain near WB readback performance when accessing
the uncached ytiled memory, an order of magnitude improvement.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-30 15:18:36 -07:00
Chris Wilson
682bdaa658 i965: Record mipmap resolver for unmapping
When mapping a region of the mipmap_tree, record which complementary
method to use to unmap it afterwards. By doing so we can avoid
duplicating the decision tree used when mapping and thereby eliminate
trivial errors that can be introduced if the two if-chains become out of
sync.
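
One way to picture the idea is a function pointer stored at map time. Everything below is a hypothetical sketch (struct demo_map, the two paths and the use_blit switch are invented for illustration) and not the i965 miptree code.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_map;
typedef void (*demo_unmap_fn)(struct demo_map *map);

enum { DEMO_PATH_NONE, DEMO_PATH_BLIT, DEMO_PATH_GTT };

struct demo_map {
    void *ptr;
    demo_unmap_fn unmap;   /* complementary unmap method, recorded at map time */
    int unmapped_by;       /* demonstration only: which unmap path ran */
};

static void demo_unmap_blit(struct demo_map *map)
{
    map->ptr = NULL;
    map->unmapped_by = DEMO_PATH_BLIT;
}

static void demo_unmap_gtt(struct demo_map *map)
{
    map->ptr = NULL;
    map->unmapped_by = DEMO_PATH_GTT;
}

/* The decision about which path to use is made exactly once, here. */
static void demo_map_region(struct demo_map *map, void *storage, bool use_blit)
{
    map->ptr = storage;
    map->unmapped_by = DEMO_PATH_NONE;
    map->unmap = use_blit ? demo_unmap_blit : demo_unmap_gtt;
}

/* Unmapping never repeats the decision; it calls what was recorded. */
static void demo_unmap_region(struct demo_map *map)
{
    if (map->unmap)
        map->unmap(map);
    map->unmap = NULL;
}
```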

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
5367295e1a i965: Move unmap_depthstencil before map_depthstencil
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
ab2825c898 i965: Move unmap_etc before map_etc
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
9e7e88049f i965: Move unmap_s8 before map_s8
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
b3ad6f5ca6 i965: Move unmap_movntdqa before map_movntdqa
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
f348d07a62 i965: Move unmap_blit before map_blit
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
359624142d i965: Move unmap_gtt before map_gtt
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Dave Airlie
8d3529872c ac/nir: expand 64-bit vec3 loads to fix shuffling.
If loading 64-bit vec3 values, a 4 component load would be followed
by a 2 component load and the resulting shuffle would fail as it
requires two 4 component vectors. This just expands the second result
vector out to 4 components.

This fixes 100 CTS tests:
dEQP-VK.spirv_assembly.type.vec3.*64*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-01 05:58:14 +10:00
Kenneth Graunke
bde12f75e1 i965: Don't stomp initial kflags for program cache.
We want to flag EXEC_OBJECT_CAPTURE, but we ought to preserve any
existing kflags.  Today, there are none (as the program cache doesn't
support 48-bit addressing), but once we start using softpin, we'll
need to preserve EXEC_OBJECT_PINNED.
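
The difference between stomping and preserving flags comes down to assignment versus OR. In this sketch the DEMO_ constants are local stand-ins with illustrative bit positions; they are not copied from the kernel header.

```c
#include <stdint.h>

/* Stand-in flag bits for illustration only. */
#define DEMO_EXEC_OBJECT_PINNED  (1u << 4)
#define DEMO_EXEC_OBJECT_CAPTURE (1u << 7)

/* Old behaviour: plain assignment discards whatever was set before. */
static uint32_t kflags_stomp(uint32_t kflags)
{
    (void)kflags;
    return DEMO_EXEC_OBJECT_CAPTURE;
}

/* New behaviour: OR the capture bit into the existing flags. */
static uint32_t kflags_preserve(uint32_t kflags)
{
    return kflags | DEMO_EXEC_OBJECT_CAPTURE;
}
```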

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-30 11:34:19 -07:00
Kenneth Graunke
0cc98522f9 i965: Let batchbuffers be placed anywhere in the 48-bit address space.
We were trying to mark batch buffers with EXEC_OBJECT_CAPTURE, and
accidentally stomped EXEC_OBJECT_SUPPORTS_48B_ADDRESS in the process.

There's no reason to restrict batch buffers to the lower 4GB.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-30 11:34:19 -07:00
Scott D Phillips
8ffc6ee251 intel: fix check for 48b ppgtt support
The previous logic of the supports_48b_addresses wasn't actually
checking if i915.ko was running with full_48bit_ppgtt. The ENOENT
it was checking for was actually coming from the invalid context
id provided in the test execbuffer.  There is no path in the
kernel driver where the presence of
EXEC_OBJECT_SUPPORTS_48B_ADDRESS leads to an error.

Instead, check the default context's GTT_SIZE param for a value
greater than 4 GiB.
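
The comparison itself is simple once the size is known. This sketch assumes the GTT size has already been queried from the default context and is passed in as a plain number; the query is deliberately left out, and the function name is only illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Treat the device as 48-bit capable only when the reported GTT size
 * is strictly greater than 4 GiB. */
static bool gtt_size_implies_48b(uint64_t gtt_size)
{
    const uint64_t four_gib = 4ull << 30;

    return gtt_size > four_gib;
}
```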

v2 (Ken): Fix in i965 as well.
v3: Check GTT_SIZE instead of HAS_ALIASING_PPGTT (Chris Wilson)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 11:34:19 -07:00
Leo Liu
1c5f4f4e17 st/omx/enc: fix blit setup for YUV LoadImage
The blit here involves scaling, since it copies from the I8 format to the
R8G8 format. Half of the source is filtered out by PIPE_TEX_FILTER_NEAREST,
and it appears that the GPU always uses the second half as the source.
Currently we use "1" as the start point of x for R, which shifts the U
component one source pixel to the right. So "-1" should be the start point
for the U component.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-30 11:55:36 -04:00
Juan A. Suarez Romero
4d449c94e4 autotools, meson: bump up required VA version
Because we use a new VP9 config, VA API 0.39 is required.

Fixes: 413c5ca372 ("travis: update libva required version")
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-30 13:59:37 +02:00
Juan A. Suarez Romero
96ed3714fc docs: update calendar, add news and link release notes to 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-28 17:01:48 +00:00
Juan A. Suarez Romero
8f1159bf9a docs: add sha256 checksums for 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit b3eed3ad03)
2018-04-28 16:58:39 +00:00
Juan A. Suarez Romero
14f85260de docs: add release notes for 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit d38da7bd2d)
2018-04-28 16:58:36 +00:00
Marek Olšák
8b7358fe43 radeonsi: increase the number of compiler threads depending on the CPU
The compiler queue was limited to 3 threads, so shader-db running
on a 16-thread CPU would have a bottleneck on the 3-thread queue.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
3f0eaaf6d9 radeonsi: avoid a crash in gallivm_dispose_target_library_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
e75fc8d033 radeonsi: move data_layout into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
797d673c9a radeonsi: move passmgr into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
c1823ff661 radeonsi: move target_library_info into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
5a94f15aa7 radeonsi: use si_compiler::triple in si_llvm_optimize_module
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
43f0a10051 radeonsi: add triple into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
87eb597758 radeonsi: add struct si_compiler containing LLVMTargetMachineRef
It will contain more variables.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
788d66553a radeonsi: rename r600_texture::resource to buffer
r600_resource could be renamed to si_buffer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
6fadfc01c6 radeonsi: use r600_resource() typecast helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
3160ee876a radeonsi: remove unused atom parameter from si_atom::emit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
de344209ad radeonsi: inline 2 trivial state structures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
e395475096 radeonsi: remove function si_init_atom
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
ccebcba893 radeonsi: remove si_atom::id
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
639b673fc3 radeonsi: don't use an indirect table for state atoms
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
9054799b39 radeonsi: rename r600_atom -> si_atom
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
a8abbbb172 radeonsi: remove r600_pipe_common.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
6d19120da8 radeonsi/gfx9: workaround for INTERP with indirect indexing
and clean up the conditions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-04-27 17:56:04 -04:00
Marek Olšák
2d69b485f5 radeonsi: rewrite DCC format compatibility checking code
It might be better to use a slow compressed clear when clearing to 1.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
c732d069b3 radeonsi: implement DCC fast clear swizzle constraints more accurately
Reduce swizzle constraints to the ALPHA_IS_ON_MSB constraint and the clear
value of 1.

This significantly changes the DCC fast clear code, and fixes fast clear
for RGB formats without alpha.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
9ef423f720 radeonsi: rename variables and document stuff around DCC fast clear
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
1cc2e0cc6b radeonsi: fully enable 2x DCC MSAA for array and non-array textures
The clear code is exactly the same as for 1 sample buffers -
just clear the whole thing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
ca33d961a4 radeonsi: enable fast color clear for level 0 of mipmapped textures on <= VI
GFX9 is more complicated and needs a compute shader that we should just
copy from amdvlk.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
174e11c3f5 ac/surface: handle DCC subresource fast clear restriction on VI
v2: require the previous level to be clearable for determining whether
    the last unaligned level is clearable

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
George Kyriazis
838f15650e swr/rast: No need to export GetSimdValidIndicesGfx
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
7caeee3432 swr/rast: Small editorial changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
f276517ebf swr/rast: Use new processor detection mechanism
Use specific avx512 selection mechanism based on avx512er bit instead of
getHostCPUName().  LLVM 6.0.0 has a bug that reports wrong string for KNL
(fixed in 6.0.1).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
8ace547e8d swr/rast: Output rasterizer dir to console since it's process specific
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
c328c5d0f4 swr/rast: Add TranslateGfxAddress for shader
Also add GFX_MEM_CLIENT_SHADER

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
edc41f73b8 swr/rast: jit PRINT improvements.
Sign-extend integer types to 32bit when specifying "%d" and add new %u
which zero-extends to 32bit. Improves printing of sub 32bit integer types
(i1 specifically).
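
Why the extension mode matters for i1 can be shown with two small helpers. This is a plain C illustration of sign versus zero extension, not the JIT code; sext32() and zext32() are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of v to 32 bits (1 <= bits <= 31). */
static int32_t sext32(uint32_t v, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    uint32_t mask = (1u << bits) - 1u;
    uint32_t sign = 1u << (bits - 1);

    v &= mask;
    if (v & sign)
        return (int32_t)v - (int32_t)mask - 1;   /* v - 2^bits */
    return (int32_t)v;
}

/* Zero-extend the low `bits` bits of v to 32 bits (1 <= bits <= 31). */
static uint32_t zext32(uint32_t v, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    return v & ((1u << bits) - 1u);
}
```

A true i1 therefore reads as -1 when sign-extended and as 1 when zero-extended, which is the practical difference between the two format specifiers for sub 32bit types.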

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
5d403178e6 swr/rast: Fix regressions.
Bump jit cache revision number to force recompile.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
577af2bed4 swr/rast: Cleanup old cruft.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
aeab9db50a swr/rast: Package events.proto with core output
However only if the file exists in DEBUG_OUTPUT_DIR. The expectation is
that AR rasterizerLauncher will start placing it there when launching
a workload (which is in a subsequent checkin)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
b97bb0ea6d swr/rast: Fix init in EventHandlerWorkerStats
Make sure we initialize variables.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
9a72d4c03e swr/rast: Fix return type of VCVTPS2PH.
expecting <8xi16> return.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
3f008c5505 swr/rast: WIP Translation handling
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
7986519d50 swr/rast: Use different handling for stream masks
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
6b1c852ebc swr/rast: Silence warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
e6daa62a48 swr/rast: Add support for TexelMask evaluation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
cec1b52cac swr/rast: Internal core change
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
7b343a215e swr/rast: Fix x86 lowering 64-bit float handling
- 64-bit cvt-to-float needs to be explicitly handled
- gathers need the right parameter types to work with doubles

Fixes draw-vertices piglit tests

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
fa4ab7910e swr/rast: Add some SIMD_T utility functors
VecEqual and VecHash

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
18c9cb85d1 swr/rast: Fix wrong type allocation
ALLOCA pointer elements, not pointers.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
1cdbce8805 swr: touch generated files to update timestamp
previous change in generators necessitates this change

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
9ceeb671a3 swr/rast: Fix byte offset for non-indexed draws
for the case when USE_SIMD16_SHADERS == FALSE

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
Marek Olšák
7083ac7290 util/u_queue: fix a deadlock in util_queue_finish
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 13:28:17 -04:00
Dylan Baker
7772de5283 meson: fix race condition revealed by using 0.44
Previously there was a special target that blocked for the generation of
anv_entrypoints.h, with meson 0.44 we don't need this, we can use a new
language feature instead. The problem is that previously that blocking
target would hide a race condition for the generation of another header,
anv_extensions.h. Now the build sometimes fails when anv_extensions.h is
not generated in time.

v2: - clarify the race condition in the commit message (Emil)

CC: Mark Janes <mark.a.janes@intel.com>
Fixes: 92550d9b16
       ("meson: remove workaround for custom target creating .h and .c files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-27 10:24:51 -07:00
Dylan Baker
0c23bd76d1 bin: force git show to use default pretty setting
I have my default pretty format set to short, which breaks this script.

v2: - Fix both places that don't define a --pretty (Emil)

cc: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-27 10:19:55 -07:00
Tapani Pälli
b3ad4b6971 mesa: add TBO support for GL_EXT_texture_norm16
Earlier plumbing missed interaction with texture buffer objects.

Fixes: 7f467d4f73 "mesa: GL_EXT_texture_norm16 extension plumbing"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-27 14:34:43 +03:00
Samuel Pitoiset
d38425ce87 ac: fix texture query LOD for 1D textures on GFX9
1D textures are allocated as 2D which means we only need
one coordinate for texture query LOD.

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 11:15:35 +02:00
Christian Gmeiner
3e69127939 etnaviv: remove not needed includes
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-04-27 09:04:56 +02:00
Christian Gmeiner
2ba587aac7 etnaviv: remove redundant include
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-04-27 09:04:53 +02:00
Timothy Arceri
79b0556f29 glsl: replace some asserts with unreachable when processing the ast
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-04-27 10:18:47 +10:00
Timothy Arceri
410f901bee mesa: drop the buffer mode param from the DrawBuffer driver function
No drivers used it.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-27 10:09:10 +10:00
Anuj Phogat
b695a7bd8e anv/icl: Enable Vulkan on Ice Lake
This patch enables the Vulkan driver on Ice Lake h/w
with added warning about preliminary support.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-26 16:31:27 -07:00
Caio Marcelo de Oliveira Filho
c9bdc7f7e2 anv: enable VK_EXT_shader_viewport_index_layer
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 15:32:05 -07:00
Jason Ekstrand
3db93f9128 anv/allocator: Don't shrink either end of the block pool
Previously, we only tried to ensure that we didn't shrink either end
below what was already handed out.  However, due to the way we handle
relocations with block pools, we can't shrink the back end at all.  It's
probably best to not shrink in either direction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105374
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106147
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2018-04-26 13:17:14 -07:00
Eric Anholt
76ee9edcb4 broadcom/vc5: Add support for centroid varyings.
It would be nice to share the flags packet emit logic with flat shade
flags, but I couldn't come up with a good way while still using our pack
macros.  We need to refactor this to shader record setup at compile time,
anyway.

Fixes ext_framebuffer_multisample-interpolation * centroid-*
2018-04-26 11:30:22 -07:00
Eric Anholt
e2f3317801 broadcom/vc5: Add an assert about GFXH-1559.
Our TF outputs always start at 6 or 7 currently, so we don't hit the
broken 8 case.  Let's make sure that doesn't change somehow.
2018-04-26 11:30:22 -07:00
Eric Anholt
77b4f30bae broadcom/vc5: Add validation that we don't violate GFXH-1633 requirements.
We don't use ldunifa yet, but we will eventually for UBOs.
2018-04-26 11:30:22 -07:00
Eric Anholt
089c32eefd broadcom/vc5: Add validation that we don't violate GFXH-1625 requirements.
We don't use TMUWT yet, but we will once we do SSBOs.
2018-04-26 11:30:22 -07:00
Eric Anholt
57ceb95c84 broadcom/vc5: Implement GFXH-1742 workaround (emit 2 dummy stores on 4.x).
This should help with intermittent GPU hangs in tests switching
formats while rendering small frames.  Unfortunately, it didn't help with
the tests I'm having troubles with.
2018-04-26 11:30:22 -07:00
Eric Anholt
dc4cb04ee5 broadcom/vc5: Add QPU validation for register writes after thrend.
The next shader gets to start writing the register file during these
slots, so make sure we don't stomp over them.

The only case of hitting this that I could imagine would be dead writes.
2018-04-26 11:30:22 -07:00
Eric Anholt
8adf813f83 st: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type.
GLES's GL_EXT_texture_type_2_10_10_10_REV allows uploading this type to an
unsized internalformat, and it should be non-color-renderable.
fbobject.c's implementation of the check for color-renderable checks
that the texture has a 2101010 mesa format, so make sure that we have
chosen a 2101010 format so that check can do what it meant to.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgb on vc5.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-26 11:30:22 -07:00
Charmaine Lee
8aef7fccb7 st/mesa: fix missing setting of _ElementSize in new_draw_rasterpos_stage
With this patch, _ElementSize is initialized along with the rest
of the vertex array attributes in new_draw_rasterpos_stage().
This fixes a crash in st_pipe_vertex_format() when running
topogun-1.06-orc-84k-resize trace file with VMware svga driver.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-26 10:29:02 -07:00
Drew Davenport
e923e8151d st/va: Fix typos
s/attibute/attribute/
s/suface/surface/

v2: rebased(Leo)

Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-04-26 11:16:05 -04:00
Drew Davenport
893808006a st/va: Fix potential buffer overread
VASurfaceAttribExternalBuffers.pitches is indexed by
plane. Current implementation only supports single plane layout.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-04-26 11:16:05 -04:00
Boyuan Zhang
deba56accf radeon/vcn: fix mpeg4 msg buffer settings
The previous bit-field assignments are incorrect and cause certain mpeg4
decodes to fail due to wrong flag values. This patch fixes these assignments.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-04-26 11:16:05 -04:00
Ian Romanick
bf5e0276b6 radeon: Drop broken front_buffer_reading/drawing optimization
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-26 09:38:51 -04:00
Ian Romanick
0b3231966f radeon: Use _mesa_is_front_buffer_drawing
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-26 09:38:51 -04:00
Samuel Pitoiset
d7ffe3b384 radv: set ac_surf_info::num_channels correctly
num_channels was introduced by "ac/surface: don't set
the display flag for obviously unsupported cases".

Based on RadeonSI.

Fixes: e29facff31 ("ac/surface: don't set the display flag for obviously unsupported cases (v2)")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-26 15:34:14 +02:00
Samuel Pitoiset
a6fbefa67b radv: fix DCC enablement since partial MSAA implementation
dcc_msaa_allowed is always false on GFX9+ and only true on VI
if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled
in some situations where it should not have been.

This is likely going to fix a performance regression.

Fixes: 2f63b3dd09 ("radv: enable DCC for MSAA 2x textures on VI under an option")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-26 15:34:11 +02:00
Karol Herbst
227b1af866 nir/opt_constant_folding: fix folding of 8 and 16 bit ints
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 11:16:15 +02:00
Karol Herbst
14943add44 nir: print 8 and 16 bit constants correctly
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 11:16:15 +02:00
Karol Herbst
543a8c66a7 nir: support converting to 8-bit integers in nir_type_conversion_op
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 11:16:15 +02:00
Neil Roberts
c4ab1bdcc9 spirv: Don’t check for NaN for most OpFOrd* comparisons
For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware
should probably already return false if one of the operands is NaN so
we don’t need to have an explicit check for it. This seems to at least
work on Intel hardware. This should reduce the number of instructions
generated for the most common comparisons.

For what it’s worth, the original code to handle this was added in
e062eb6415. The commit message for that says that it was to fix
some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t
handle NaNs this patch shouldn’t affect those tests. At any rate they
have since been moved out of the mustpass list. Incidentally those
tests fail on the nvidia proprietary driver so it doesn’t seem like
handling NaNs correctly is a priority.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-26 10:08:14 +02:00
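The NaN reasoning in c4ab1bdcc9 can be illustrated in plain C. This is only a sketch under the assumption that the target follows IEEE 754 comparison rules, as C does on common hardware; the function names are hypothetical and are not taken from the SPIR-V front end. An ordered "less than" needs no extra check because the comparison is already false for NaN, while C's != is true for NaN and so needs an explicit guard to behave like OpFOrdNotEqual.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for an ordered "less than": the plain comparison
 * is already false when either operand is NaN, so no extra check is needed. */
static bool ford_less_than(float a, float b)
{
    return a < b;
}

/* Hypothetical stand-in for an ordered "not equal": a plain != is true
 * when an operand is NaN, so the NaN check has to stay. */
static bool ford_not_equal(float a, float b)
{
    return !isnan(a) && !isnan(b) && a != b;
}
```

Calling ford_less_than(NAN, 1.0f) yields false without any guard, whereas the bare expression NAN != 1.0f is true, which is why only the not-equal case keeps its check.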
Matt Atwood
3ba5a646e5 Intel: Add a Kaby Lake PCI ID
v2: Branding changed

Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-04-25 13:31:55 -07:00
Eric Anholt
069c409f43 gallium/util: Fix incorrect refcounting of separate stencil.
The driver may have a reference on the separate stencil buffer for some
reason (like an unflushed job using it), so we can't directly free the
resource and should instead just decrement the refcount that we own.
Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
on vc5.

Fixes: e94eb5e600 ("gallium/util: add u_transfer_helper")
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-04-25 12:14:33 -07:00
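The ownership rule behind 069c409f43 can be sketched with a toy reference count. Everything below is hypothetical (struct resource and its helpers are not the gallium API); it only shows why a holder should drop its own reference instead of freeing an object that someone else may still reference.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy refcounted object; hypothetical, not struct pipe_resource. */
struct resource {
    int refcount;
};

/* Create an object owned by the caller (one reference). */
static struct resource *resource_create(void)
{
    struct resource *r = malloc(sizeof(*r));
    if (r)
        r->refcount = 1;
    return r;
}

/* Another user (for example an unflushed job) takes a reference. */
static void resource_ref(struct resource *r)
{
    r->refcount++;
}

/* Drop one reference. Returns true only if this was the last reference
 * and the object was actually freed. */
static bool resource_unref(struct resource *r)
{
    if (--r->refcount > 0)
        return false;
    free(r);
    return true;
}
```

If a second user has called resource_ref(), the first resource_unref() leaves the object alive and only the final one frees it, so no double-free can occur.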
Eric Anholt
0d4ce00d70 broadcom/vc5: Fix reloads of separate stencil buffers.
Like for stores, we need to emit a separate load_general packet.
2018-04-25 09:21:54 -07:00
Eric Anholt
9f3f4284c0 broadcom/vc5: Fix cpp of MSAA surfaces on 4.x.
The internal-type-bpp path is for surfaces that get stored in the raw TLB
format.  For 4.x, we're storing MSAA as just 2x width/height at the
original format.
2018-04-25 09:21:54 -07:00
Eric Anholt
ac207acb97 broadcom/vc5: Implement stencil blits using RGBA.
Fixes piglit fbo-depthstencil blit default_fb
2018-04-25 09:21:54 -07:00
Eric Anholt
503716fa86 broadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key.
2018-04-25 09:21:54 -07:00
Eric Anholt
5710532e9e broadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x.
For single-sample we have to always program SAMPLE_0, but for multisample
we want to store all the samples.
2018-04-25 09:21:54 -07:00
Juan A. Suarez Romero
413c5ca372 travis: update libva required version
Commit fa328456e8 added VP9 config support, but this needs a newer
libva version, 1.7.0 or above.

Fixes: fa328456e8 ("st/va: add VP9 config to enable profile2")
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-25 16:09:20 +02:00
Tapani Pälli
7f467d4f73 mesa: GL_EXT_texture_norm16 extension plumbing
Patch enables use of short and unsigned short data for texture uploads,
rendering and reading of framebuffers within the restrictions specified
in GL_EXT_texture_norm16 spec.

Patch also enables those 16bit format layout qualifiers listed in
GL_NV_image_formats that depend on EXT_texture_norm16.

v2: expose extension with dummy_true
    fix layout qualifier map changes (Ilia Mirkin)

v3: use _mesa_has_EXT_texture_norm16, other fixes
    and cleanup (Ilia Mirkin)

v4: fix rest of the issues found

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-25 14:26:20 +03:00
Jordan Justen
b0c5774027 meson: Fix with_intel_vk and with_amd_vk variables
Fixes: 5608d0a2ce "meson: use array type options"
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-24 23:12:42 -07:00
Roland Scheidegger
77554d220d draw: fix different sign logic when clipping
The logic was flawed, since mul(x,y) will be <= 0 (exactly 0) when
the sign is the same but both numbers are sufficiently small
(if the product is smaller than 2^-128).
This could apparently lead to emitting a sufficient amount of
additional bogus vertices to overflow the allocated array for them,
hitting an assertion (still safe with release builds since we just
aborted clipping after the assertion in this case - I'm however unsure
if this is now really no longer possible, so that code stays).
Not sure if the additional vertices could cause other grief, I didn't
see anything wrong even when hitting the assertion.

Essentially, both +-0 are treated as positive (the vertex is considered
to be inside the clip volume for this plane), so integrate the logic
determining different sign into the branch there.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-25 04:50:20 +02:00
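The flaw described in 77554d220d can be reproduced in a few lines of C. This is a sketch, not the draw module code: the function names are hypothetical, and the exact underflow threshold depends on whether denormals are flushed. The inputs suggested below (1e-30f times 1e-30f) are far below either threshold, so the product becomes exactly 0 even though both operands are positive.

```c
#include <stdbool.h>

/* Flawed test: infers "different sign" from the product being <= 0.
 * The volatile forces the product to be rounded to float, so two tiny
 * same-sign values underflow to exactly 0 and are misclassified. */
static bool different_sign_by_product(float a, float b)
{
    volatile float product = a * b;
    return product <= 0.0f;
}

/* Direct test: compare the two "is negative" results. Both +0 and -0
 * count as positive here, matching the description in the commit. */
static bool different_sign_direct(float a, float b)
{
    return (a < 0.0f) != (b < 0.0f);
}
```

With a = b = 1e-30f the product-based test reports a sign change while the direct test does not; for a = -1.0f and b = 1.0f both agree.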
Roland Scheidegger
98578df27b draw: simplify clip null tri logic
Simplifies the logic when to emit null tris (albeit the reasons why we
have to do this remain unclear).
This is strictly just logic simplification, the behavior doesn't change
at all.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-25 04:50:20 +02:00
Ilia Mirkin
c17ddcb4b4 nvc0/ir: all short immediates are sign-extended, adjust LIMM test
Some analysis suggests that all short immediates are sign-extended. The
insnCanLoad logic already accounted for this, but we could still pick
the wrong form when emitting actual instructions that support both short
and long immediates (with the long form usually having additional
restrictions that insnCanLoad should be aware of).

This also reverses a bunch of commits that had previously "worked
around" this issue in various emitters:

9c63224540: gm107/ir: make use of ADD32I for all immediates
83a4f28dc2: gm107/ir: make use of LOP32I for all immediates
b84c97587b: gm107/ir: make use of IMUL32I for all immediates
d30768025a: gk110/ir: make use of IMUL32I for all immediates

as well as the original import for UMUL in the nvc0 emitter.

Reported-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-04-24 21:37:44 -04:00
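For c17ddcb4b4, the question "can this 32-bit value be encoded as a sign-extended short immediate?" can be sketched as follows. The helper name and the bits parameter are hypothetical; the commit does not state the width of the short immediate field, so it is left as an argument rather than guessed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if 'value' survives being truncated to 'bits' bits and
 * then sign-extended back to 32 bits. 'bits' is a hypothetical field
 * width and must be in the range 1..32.
 *
 * The value fits when bit (bits - 1) and every bit above it are equal,
 * i.e. the shifted remainder is all zeros or all ones. */
static bool fits_sign_extended(uint32_t value, unsigned bits)
{
    uint32_t top = value >> (bits - 1);
    uint32_t all_ones = UINT32_MAX >> (bits - 1);
    return top == 0 || top == all_ones;
}
```

Taking a 20-bit field purely as an example, 0x7FFFF fits, while 0x80000 does not: it would read back as a negative number after sign extension, so the long form would have to be used.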
Boyan Ding
6695f9d5c5 mesa: call DrawBufferAllocate driver hook in update_framebuffer for window-system FB
When draw buffers are changed on a bound framebuffer, DrawBufferAllocate()
hook should be called. However, it is missing in update_framebuffer with
window-system framebuffer, in which FB's draw buffer state should match
context state, potentially resulting in a change.

Note: This is needed because gallium delays creating the front buffer,
      i965 works fine without this change.

V2 (Timothy Arceri):
 - Rebased on merged/simplified DrawBuffer driver function
 - Move DrawBuffer call outside fb->ColorDrawBuffer[0] !=
   ctx->Color.DrawBuffer[0] check to make piglit pass.

v3 (Timothy Arceri):
 - Call new DrawBufferAllocate() driver function.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v2)
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99116
2018-04-25 09:08:26 +10:00
Timothy Arceri
6ca09f3a60 st/mesa: add new driver function DrawBufferAllocate
Unlike some of the classic drivers the st was only using DrawBuffer()
to allocate some buffers on-demand. Creating a separate function
will allow us to call it from update_framebuffer() in the following
patch without regressing some of the older classic drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-25 09:08:26 +10:00
Timothy Arceri
2554b8cb00 mesa: some C99 tidy ups for framebuffer.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-04-25 09:08:26 +10:00
Dylan Baker
1d01b52d76 meson: Fix no-rtti in llvm detection
Because I clearly wasn't thinking and clearly didn't do a good job
testing. Sigh

Fixes: c5a97d658e
       ("meson: fix builds against LLVM built without rtti")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-24 15:26:51 -07:00
Dylan Baker
be0a2cfc65 meson: use new warning function
Instead of emulating it with message.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
5608d0a2ce meson: use array type options
This option type is nice since it involves less converting strings into
lists, and because it validates the values that are provided.

v2: - Set with_any_vk to true if any vulkan driver is built (Eric)

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
c5a97d658e meson: fix builds against LLVM built without rtti
Building without rtti is fraught with peril, but it's something that
autotools supports so we need to support it too.

Since we've moved to version 0.44 as a whole we can use the meson
functionality for accessing random llvm-config options we can check for
rtti and add -fno-rtti to all C++ code accordingly.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-24 14:08:15 -07:00
Dylan Baker
595021bf1a meson: remove dummy_cpp
meson has gotten pretty smart about tracking C and C++ dependencies
(internal and external), and using the right linker. This wasn't always
the case and we created empty c++ files to force the use of the c++
linker. We don't need that any more.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
db90c8627c meson: allow empty sources when using link_whole
meson used to get grumpy if the sources list was empty, even when using
--whole-archive (link_whole). In more recent versions that's not true,
so remove the workaround.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
92550d9b16 meson: remove workaround for custom target creating .h and .c files
In more modern versions of meson a custom_target returns an index-able
object. This allows us to create accurate dependency models for targets
that rely only on the header and not on the code from anv_entrypoints.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
5a670d08c0 meson: raise required version to 0.44.1
We have already required 0.44 for building clover and swr, so it was
already partially required. This just makes it required across the board
instead of just for clover and swr.

There is a bug in 0.44 which makes it impossible to build mesa in some
configurations, so require 0.44.1 which fixes this.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
1546f76a39 meson: fix graw-xlib after auxiliary consolidation
This one's completely my fault, I didn't do good enough testing after
rebasing and this got missed.

Fixes: d28c246501
       ("meson: build graw tests")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
c73abb4f82 meson: only build mesa_st tests when build-tests is true
Since we have an option to turn test building on and off, we should
honor that.

Fixes: 34cb4d0ebc
       ("meson: build tests for gallium mesa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
aaab624245 meson: don't build classic mesa tests without dri_drivers
Since mesa_classic is build-on-demand the tests will create a demand and
add a bunch of extra compilation.

Fixes: 43a6e84927
       ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Nanley Chery
0e8b16e0a2 i965/meta_util: Re-enable sRGB-encoded fast-clears on CNL
The paths which sample with the clear color are now using a getter which
performs the sRGB decode needed to enable this fast clear.

This path can be exercised by fast-clearing a texture, then performing
an operation which requires sRGB decoding. Test coverage for this
feature is provided with the following tests:

* Shader texture calls:
  - spec@ext_texture_srgb@tex-srgb

* Shader texelfetch calls:
  - spec@arb_framebuffer_srgb@fbo-fast-clear
  - spec@arb_framebuffer_srgb@msaa-fast-clear

* Blending:
  - spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend

* Blitting:
  - spec@arb_framebuffer_srgb@blit texture srgb msaa enabled clear

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
129ad66dd5 i965/miptree: Extend the sRGB-blending WA to future platforms
The blending issue seems to be present on CNL as well.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
7ea013c6d3 i965: Add and use a getter for the clear color
It returns both the inline clear color and a clear address which points
to the indirect clear color buffer (or NULL if unused/non-existent).
This getter allows CNL to sample from fast-cleared sRGB textures
correctly by doing the needed sRGB-decode on the clear color (inline)
and making the indirect clear color buffer unused.

v2 (Rafael):
* Have a more detailed commit message.
* Add a comment on the sRGB conversion process.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Jason Ekstrand
b55077a8bc util/srgb: Add a float sRGB -> linear helper
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
cd5ce363e3 i965/wm_surface_state: Use the clear address if clear_bo is non-NULL
We want to add and use a getter that turns off the indirect path by
returning zero for the clear color bo and offset.

v2: Fix usage of "clear address" in commit message (Jason).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
af4e9295fe i965: Add and use a single miptree aux_buf field
We want to add and use a function that accesses the auxiliary buffer's
clear_color_bo and doesn't care if it has an MCS or HiZ buffer
specifically.

v2 (Jason Ekstrand):
* Drop intel_miptree_get_aux_buffer().
* Mention CCS in the aux_buf field.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
5503b65103 i965: Add and use a getter for the miptree aux buffer
Make the next patch easier to read by eliminating most of the would-be
duplicate field accesses now.

v2: Update the HiZ comment instead of deleting it (Rafael).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-04-24 13:41:14 -07:00
Karol Herbst
e4f675dc42 gm107/ir/lib: fix sched in div u32 builtin
Imad needs to set a read barrier.

With sufficiently large work groups I was getting wrong results for div u32. Turns
out the issue was with the sched opcodes.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-24 22:31:59 +02:00
Ian Romanick
0d5ce25c1c intel/compiler: Add scheduler deps for instructions that implicitly read g0
Otherwise the scheduler can move the writes after the reads.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95009
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95012
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Cc: Clayton A Craft <clayton.a.craft@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2018-04-24 14:31:21 -04:00
Ian Romanick
cd32a4e5f4 intel/compiler: Silence unused parameter warnings in empty vec4_instruction_scheduler methods
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::count_reads_remaining(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:764:72: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::count_reads_remaining(backend_instruction *be)
                                                                        ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::setup_liveness(cfg_t*)’:
src/intel/compiler/brw_schedule_instructions.cpp:769:51: warning: unused parameter ‘cfg’ [-Wunused-parameter]
 vec4_instruction_scheduler::setup_liveness(cfg_t *cfg)
                                                   ^~~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::update_register_pressure(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:774:75: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::update_register_pressure(backend_instruction *be)
                                                                           ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:779:80: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction *be)
                                                                                ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::issue_time(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:1550:61: warning: unused parameter ‘inst’ [-Wunused-parameter]
 vec4_instruction_scheduler::issue_time(backend_instruction *inst)
                                                             ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 14:31:21 -04:00
Ian Romanick
bdb15c2344 intel/compiler: Silence unused parameter warning in compile_cs_to_nir
src/intel/compiler/brw_fs.cpp: In function ‘nir_shader* compile_cs_to_nir(const brw_compiler*, void*, const brw_cs_prog_key*, brw_cs_prog_data*, const nir_shader*, unsigned int)’:
src/intel/compiler/brw_fs.cpp:7205:44: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
                   struct brw_cs_prog_data *prog_data,
                                            ^~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 14:31:21 -04:00
Ian Romanick
d84b2ed1d7 intel/compiler: Silence unused parameter warnings in generate_foo methods
Since all of the fs_generator::generate_foo methods take a fs_inst * as
the first parameter, just remove the name to quiet the compiler.

src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_barrier(fs_inst*, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:743:41: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_barrier(fs_inst *inst, struct brw_reg src)
                                         ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_discard_jump(fs_inst*)’:
src/intel/compiler/brw_fs_generator.cpp:1326:46: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_discard_jump(fs_inst *inst)
                                              ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_pack_half_2x16_split(fs_inst*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:1675:54: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_pack_half_2x16_split(fs_inst *inst,
                                                      ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_shader_time_add(fs_inst*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:1743:49: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_shader_time_add(fs_inst *inst,
                                                 ^~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_set_simd4x2_header_gen9(brw_codegen*, brw::vec4_instruction*, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1412:52: warning: unused parameter ‘inst’ [-Wunused-parameter]
                                  vec4_instruction *inst,
                                                    ^~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_mov_indirect(brw_codegen*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1430:41: warning: unused parameter ‘inst’ [-Wunused-parameter]
                       vec4_instruction *inst,
                                         ^~~~
src/intel/compiler/brw_vec4_generator.cpp:1432:63: warning: unused parameter ‘length’ [-Wunused-parameter]
                       struct brw_reg indirect, struct brw_reg length)
                                                               ^~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 14:31:21 -04:00
Eric Anholt
3d21fc193e broadcom/vc5: Set up internal_format for imported resources.
Without this, we'd assertion fail in u_transfer_helper when mapping an
imported resource.
2018-04-24 10:37:29 -07:00
Eric Anholt
f08f477a93 broadcom/vc5: Assert that created BOs have offset != 0.
The kernel shouldn't return a bo at NULL, and the HW special-cases NULL
address values for things like OQs.
2018-04-24 10:37:29 -07:00
Eric Anholt
482f2e24b5 broadcom/vc5: Don't allocate simulator BOs at offset 0.
The kernel won't return us BOs at offset 0 (because things like OQs
wouldn't work there), so we shouldn't in the simulator either.
2018-04-24 10:37:29 -07:00
Eric Anholt
82cdb801fd broadcom/vc5: Add sim support for the GET_BO_OFFSET ioctl.
Otherwise we'd crash immediately upon importing a BO through EGL
interfaces.
2018-04-24 10:37:29 -07:00
Eric Anholt
3cdd055ed2 broadcom/vc5: Treat imports of DRM_FORMAT_MOD_INVALID BOs as linear.
We don't have any kernel metadata about BO tiling, so this probably is all
we should do for the moment.
2018-04-24 10:37:29 -07:00
Tapani Pälli
c2e159d050 i965: expose MESA_FORMAT_R8G8B8A8_SRGB visual
Exposing the visual makes the following dEQP tests pass on Android:

   dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb
   dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb

Visual is exposed only when DRI_LOADER_CAP_RGBA_ORDERING is set.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-24 14:55:18 +03:00
Tapani Pälli
fa4d4d97f3 dri: Add __DRI_IMAGE_FORMAT_SABGR8
Add format definition and required plumbing to create images.
Note that there is no match to drm_fourcc definition, just like
with existing _DRI_IMAGE_FOURCC_SARGB8888.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-24 14:55:18 +03:00
Marek Olšák
4559aefb5c Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"
This reverts commit dab02dea34.

It causes crashes of qtcreator and firefox.

Fixes: dab02de "st/dri: Fix dangling pointer to a destroyed dri_drawable"

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-04-24 00:00:20 -04:00
Roland Scheidegger
e8e1d287a3 gallivm: dump bitcode before optimization
If we dump the bitcode for off-line debug purposes, we really want the
pre-optimized bitcode, otherwise it's useless in identifying problems
with IR optimization (if you have a shader which takes an hour to do
IR optimization, it's also nice you don't have to wait that hour...).
Also, print out the function passes for opt which correspond to what
was used for jit compilation (and also the opt level for codegen).
Using opt/llc this way should then pretty much mimic what was done
for jit. (When specifying something like -time-passes
-debug-pass=[Structure|Arguments] (for either opt or llc) that also
gives very useful information in which passes all the time was spent,
and which passes are really run along with the order - llvm will add
passes due to dependencies on its own, and of course -O2 for llc
comes with a ~100 pass list.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-24 04:49:39 +02:00
Roland Scheidegger
e89cf59c27 gallivm: (trivial) do division by 1000 with int64
Conversion to int can otherwise overflow if compile times are over
~71min. (Yes this can happen...)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-24 04:49:39 +02:00
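The "~71min" figure in e89cf59c27 is consistent with a microsecond count being narrowed to 32 bits before the division: 2^32 microseconds is about 71.6 minutes. The sketch below assumes exactly that (a 64-bit time in microseconds, truncated to uint32_t); the function names are hypothetical and the real gallivm code may differ in detail.

```c
#include <stdint.h>

/* Buggy order of operations (assumed): narrow to 32 bits first, then
 * divide. Anything above 2^32 us wraps around before the division. */
static uint32_t us_to_ms_truncating(int64_t time_us)
{
    return (uint32_t)time_us / 1000u;
}

/* Fixed order: divide in 64 bits, narrow (if at all) afterwards. */
static int64_t us_to_ms_wide(int64_t time_us)
{
    return time_us / 1000;
}
```

For a 72 minute compile (4,320,000,000 us) the wide version gives 4,320,000 ms, while the truncating version wraps and reports 25,032 ms.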
Roland Scheidegger
45b8f620a5 gallivm: remove LICM pass
LICM is simply too expensive, even though it presumably can help quite
a bit in some cases.
It was definitely cheaper in llvm 3.3, though as far as I can tell with
llvm 3.3 it failed to do anything in most cases. early-cse also actually
seems to cause licm to be able to move things when it previously couldn't,
which causes noticeable compile time increases.
There's more loop passes in llvm, but I'm not sure which ones are helpful,
and I couldn't find anything which would roughly do what the old licm in
llvm 3.3 did, so ditch it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-24 04:49:39 +02:00
Roland Scheidegger
8b9ab674b9 gallivm: add early cse pass
This pass is quite cheap, and can simplify the IR quite a bit for our
generated IR.
In particular on a variety of shaders I've found the time saved by
other passes due to the simplified IR more than makes up for the cost
of this pass, and on top of that the end result is actually better.
The only downside I've found is this enables the LICM pass to move some
things out of the main shader loop (in the case I've seen, instanced
vertex fetch (which is constant within the jit shader) plus the derived
instructions in the shader) which it couldn't do before for some reason.
This would actually be desirable but can increase compile time
considerably (licm seems to have considerable cost when it actually can
move things out of loops, due to alias analysis). But blaming early cse
for this seems inappropriate. (Note that the first two sroa / earlycse
passes are similar to what a standard llvm opt -O1/-O2 pipeline would
do, albeit this has some more passes even before but I don't think
they'd do much for us.)
It also in particular helps some crazy shader used for driver
verification (don't ask...) a lot (about factor of 6 faster in compile
time) (due to simplifying the IR before LICM is run).
While here, also move licm behind simplifycfg. For some shaders there
seems to be very significant compile time gains (we've seen a factor
of 10000 albeit that was a really crazy shader you'd certainly never
see in a real app), because LICM is quite expensive and there are cases
where running simplifycfg (along with sroa and early-cse) before licm
reduces IR complexity significantly. (I'm not entirely sure if it would
make sense to also run it afterwards.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-24 04:49:39 +02:00
Vlad Golovkin
1ff1dc1c63 glsl/glcpp: Handle hex constants with 0X prefix
GLSL 4.6 spec describes hex constant as:

hexadecimal-constant:
    0x hexadecimal-digit
    0X hexadecimal-digit
    hexadecimal-constant hexadecimal-digit

Right now if you have a shader with the following structure:

    #if 0X1 // or any hex number with the 0X prefix
    // some code
    #endif

the code between #if and #endif gets removed because the checking is performed
only for "0x" prefix which results in strtoll being called with the base 8 and
after encountering the 'X' char the strtoll returns 0. Letting strtoll detect
the base makes this limitation go away and also makes code easier to read.

From the strtoll Linux man page:

"If base is zero or 16, the string may then include a "0x" prefix, and the
number will be read in base 16; otherwise, a zero base is taken as 10 (decimal)
unless the next character is '0', in which case it is taken as 8 (octal)."

This matches the behaviour in the GLSL spec.

This patch also adds a test for uppercase hex prefix.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-24 09:55:05 +10:00
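The strtoll behaviour that 1ff1dc1c63 relies on can be checked in isolation. The two helpers below are hypothetical and are not the glcpp code; they only contrast base 0 (prefix detection by strtoll itself) with the base 8 parse that the old "0x"-only check fell back to for an "0X" literal.

```c
#include <stdlib.h>

/* Let strtoll() pick the base from the prefix: "0x"/"0X" means hex,
 * a leading "0" means octal, anything else is decimal. */
static long long parse_int_literal(const char *text)
{
    return strtoll(text, NULL, 0);
}

/* The failure mode described in the commit: an "0X" literal handed to
 * strtoll() with base 8 parses the leading "0" and stops at 'X'. */
static long long parse_as_octal(const char *text)
{
    return strtoll(text, NULL, 8);
}
```

With these, parse_as_octal("0X1") evaluates to 0, which is why the #if block was dropped, while parse_int_literal("0X1") evaluates to 1.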
Timothy Arceri
295f57e09a mesa: rename api_validate.{c,h} -> draw_validate.{c,h}
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65422
2018-04-24 09:23:30 +10:00
Dave Airlie
a90c9f33cf ac/radv/radeonsi: refactor harvest config register getters.
This refactors the code out to share it between radv and radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-24 09:08:34 +10:00
Dave Airlie
8e4d54505a radv: only set raster_config_1 outside the index registers.
This follows what radeonsi does.

Ported from radeonsi:
    radeonsi: emit PA_SC_RASTER_CONFIG_1 only once

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-24 09:08:34 +10:00
Dave Airlie
f77caa7411 ac/radv/radeonsi: refactor max simd waves into common code.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:08:33 +10:00
Dave Airlie
899df55ee0 ac/radv/radeonsi: refactor raster_config default values getters.
This just makes this common code between the two drivers.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:07:51 +10:00
Dave Airlie
8de7ff91be radeonsi: use common gs_table_depth code
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:05:43 +10:00
Dave Airlie
9afe9c0fe2 radv: use common gs_table_depth code.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:05:43 +10:00
Dave Airlie
5e2ef28390 ac/info: move gs table depth to common code.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:05:38 +10:00
Dave Airlie
b25f6cde89 radeonsi: don't runtime check gs table info
We can just use unreachable() here; this aligns with the radv code and makes
it easier to move to common code.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:05:29 +10:00
Dave Airlie
40783a7fa3 radv/gfx9: don't use gs_table_depth on gfx9.
Missed this in the initial radeonsi port: we shouldn't use this value
on gfx9, and on gfx8 only when we have a geometry shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-04-24 09:04:42 +10:00
Jason Ekstrand
de1f22d595 i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*
They are send messages and this makes size_read() and mlen agree.  For
both of these opcodes, the payload is just a dummy so mlen == 1 and this
should decrease register pressure a bit.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
2018-04-23 14:04:42 -07:00
Samuel Pitoiset
d136a5fad9 ac: fix the number of coordinates for ac_image_get_lod and arrays
This fixes crashes for the following CTS:
dEQP-VK.glsl.texture_functions.query.texturequerylod.*

Cubemaps are the same as 2D arrays.

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 21:48:38 +02:00
Lionel Landwerlin
2964e16e51 i965: perf: enable GPA query statistics
The combination of GPA/MDAPI components expects a particular name &
layout for their pipeline statistics query.

v2: Limit the query GPA/MDAPI statistics to gen7->9 (Lionel)

v3: Add curly braces (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Lionel Landwerlin
2e3025c817 i965: perf: add support for raw queries
The INTEL_performance_query extension provides a list of queries that
a user can select to monitor a particular workload. Each query reports
different sets of counters (roughly looking at different parts of the
hardware, i.e. caches/fixed functions/etc...).

Each query has an associated configuration that we need to program
into the hardware before using the query. Up to now, we provided
predefined queries. This change allows the user to build their own query
(and associated configuration) externally, and have the i965 driver
use that configuration through a new query named:

   Intel_Raw_Hardware_Counters_Set_0_Query

When this query is selected, the i965 driver will report raw counter
deltas (meaning their values need to be interpreted by the user, as
opposed to existing queries that provide human readable values).

This change is also useful for debug purposes for building new
pre-defined queries and verifying the underlying numbers make sense
before writing equations for user readable output.

This change's purpose is also to enable GPA. GPA uses a library called
MDAPI that processes raw counter data. MDAPI expects raw data to have
a certain layout (per generation which is a bit unfortunate...). This
change also embeds the expected data layouts.

v2: Enable raw queries on gen 7->11, v1 had 7->9 (Lionel)

v3: Don't assert on cherryview for gen7... (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Lionel Landwerlin
c61d445a5a i965: perf: read slice/unslice frequencies from OA reports
v2: Add comment breaking down where the frequency values come from (Ken)

v3: More documentation (Ken/Lionel)
    Adjust clock ratio multiplier to reflect the divider's behavior (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Lionel Landwerlin
43fcb72d2c i965: perf: snapshot RPSTAT register
This register contains the current/previous frequency of the GT, it's
one of the values GPA would like to have as part of their queries.

v2: Don't use this register on baytrail/cherryview (Ken)
    Use GET_FIELD() macro (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Lionel Landwerlin
d71b442416 i965: perf: extract utility functions
We would like to reuse a number of the functions and structures in
another file in a future commit.

We also move the previous content of brw_performance_query.h into
brw_performance_query_metrics.h to be included by generated metrics
files.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Samuel Pitoiset
e37e643589 ac: teach get_ac_sampler_dim() about subpass attachments
Suggested by Nicolai.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 19:10:56 +02:00
Samuel Pitoiset
84fef802fb ac/nir: add missing round_slice for 1D arrays
This fixes a bunch of CTS fails with 1D arrays:

dEQP-VK.glsl.texture_functions.texture*.sampler1darray_*

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 19:10:52 +02:00
Dylan Baker
10e4290524 bin/install_megadrivers: rename a few variables to make things clearer
Originally the "each" variable was just a part of the "drivers"
variable. It's not anymore so it's a bit ambiguous.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-23 09:57:35 -07:00
Dylan Baker
ae3f45c11e bin/install_megadrivers: fix DESTDIR and -D*-path
This fixes -Ddri-drivers-path, -Dvdpau-libs-path, etc. with DESTDIR when
those paths are absolute. Currently due to the way python's os.path.join
handles absolute paths these will ignore DESTDIR, which is bad. This
fixes them to be relative to DESTDIR if that is set.
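
The join behaviour described above can be reproduced in a few lines. This is
a minimal sketch, not the code from bin/install_megadrivers: the helper names
naive_join and destdir_join are hypothetical, and posixpath is used instead of
os.path so the result does not depend on the host platform.

```python
import posixpath


def naive_join(destdir, path):
    # posixpath.join discards earlier components when a later one is absolute.
    return posixpath.join(destdir, path)


def destdir_join(destdir, path):
    # Hypothetical helper: re-root an absolute install path under DESTDIR.
    if destdir and posixpath.isabs(path):
        path = path.lstrip("/")
    return posixpath.join(destdir, path)


print(naive_join("/tmp/stage", "/usr/lib/dri"))    # DESTDIR is dropped
print(destdir_join("/tmp/stage", "/usr/lib/dri"))  # DESTDIR is kept
```

When DESTDIR is empty the helper leaves the absolute path untouched, so a
plain install is unaffected.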

Fixes: 3218056e0e
       ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-23 09:57:35 -07:00
Dylan Baker
dbf5b772b3 compiler/glsl: close fd's in glcpp_test.py
I would have thought falling out of scope would allow the gc to collect
these, but apparently it doesn't, and this hits an fd limit on macos.
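
As a rough illustration of closing descriptors explicitly, here is a small
sketch. It assumes the leaked descriptors are of the kind returned by
tempfile.mkstemp, which is a guess about glcpp_test.py rather than something
stated here; write_temp and read_file are hypothetical helper names.

```python
import os
import tempfile


def write_temp(data):
    # mkstemp returns a raw integer descriptor plus the file's path.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, data)
    finally:
        # A plain int has no finalizer, so the descriptor is closed by hand.
        os.close(fd)
    return path


def read_file(path):
    # A context manager closes the file object as soon as the block exits.
    with open(path, "rb") as f:
        return f.read()
```

Either form releases the descriptor at a known point instead of relying on
the garbage collector.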

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106133
Fixes: db8cd8e367
       ("glcpp/tests: Convert shell scripts to a python script")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2018-04-23 09:55:17 -07:00
Bas Nieuwenhuizen
0e945fdf23 nir: Do not use progress for unreachable code in return lowering.
We seem to use progress for two cases:
1) When we lowered some returns.
2) When we remove unreachable code.

If only case 2 happens, we assert because state->return_flag has not
been allocated yet, but we still try to insert all
predicates based on it.

This splits the concerns. We only use progress internally for case 1
and then keep track of 2 in a separate variable to indicate progress
in the return value of the pass.

This is slightly better than transforming the assert into
if (!state->return_flag) return, as the solution in this patch avoids
inserting predicates even if some other part of the might need them.

Fixes: 6e22ad6edc "nir: return early when lowering a return at the end of a function"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106174
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-23 16:55:15 +02:00
Józef Kucia
8328c64eb1 radv: advertise 8 bits of subpixel precision for viewports
This is what radeonsi does.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-23 11:16:11 +02:00
Johan Klokkhammer Helsing
dab02dea34 st/dri: Fix dangling pointer to a destroyed dri_drawable
If an EGLSurface is created, made current and destroyed, and then a second
EGLSurface is created, the second malloc in driCreateNewDrawable may
return the same pointer address the first surface's drawable had.
Consequently, when dri_make_current later tries to determine if it should
update the texture_stamp it compares the surface's drawable pointer against
the drawable in the last call to dri_make_current and assumes it's the same
surface (which it isn't).

When texture_stamp is left unset, then dri_st_framebuffer_validate thinks
it has already called update_drawable_info for that drawable, leaving it
unvalidated and this is when bad things start to happen. In my case it
manifested itself by the width and height of the surface being unset.

This is fixed by setting the pointer to NULL before freeing the
surface.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106126
Signed-off-by: Johan Klokkhammer Helsing <johan.helsing@qt.io>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-04-23 04:25:40 -04:00
Ilia Mirkin
5428066f5e nv50/ir: make a copy of tex src if it's referenced multiple times
For nv50 we coalesce the srcs and defs into a single node. As such, we
can end up with impossible constraints if the source is referenced
after the tex operation (which, due to the coalescing of values, will
have overwritten it).

This logic already exists for inserting moves for MERGE/UNION sources.
It's the exact same idea here, so leverage that code, which also
includes a few optimizations around not extending live ranges
unnecessarily.

Fixes tests/spec/glsl-1.30/execution/fs-textureSize-components.shader_test

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-22 23:03:16 -04:00
Lepton Wu
6c5abb68c7 virgl: disable virgl when no 3D for virtio gpu.
If users are running mesa under an old version of qemu or have turned off
GL at runtime, the virtio gpu driver doesn't actually work. Add a detection
here so mesa can fall back to software rendering.

v2:
 - move detection from loader to virgl (Ilia, Emil)

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-04-23 12:35:29 +10:00
Dave Airlie
a8420e2530 radv: mark const structs as extern in header file to avoid lto damage
The copr repo from che was using LTO and he reported radv broke
recently with it. When testing with lto builds here I noticed
that we weren't seeing any instance extensions reported.

It appears LTO was treating the const without extern as an empty
struct, this is possibly a gcc bug, but we can work around it
just by marking these with extern.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-04-23 05:55:22 +10:00
Dylan Baker
f8c4716854 Bump version after 18.1
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-22 09:35:56 -07:00
Ilia Mirkin
3f1cad48b8 gallium/tests/trivial: fix viewport depth transform
These were getting mapped off into outer space, which would cause nv50
and nvc0 to clip the primitives (as depth_clip was enabled).

These drivers are configured to clip everything outside the [0, 1]
range, even though the hardware supports other view settings.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-04-21 23:31:48 -04:00
Ilia Mirkin
fe8b6d7e1f trace: allow image resource to be null
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-21 23:29:39 -04:00
Karol Herbst
63572091b5 nv50/ir/ra: prefer def == src2 for fma with immediates on nvc0
This helps with the PostRALoadPropagation pass moving long immediates into
FMA/MAD instructions.

changes in shader-db:
total instructions in shared programs : 5894114 -> 5886074 (-0.14%)
total gprs used in shared programs    : 666558 -> 666563 (0.00%)
total shared used in shared programs  : 520416 -> 520416 (0.00%)
total local used in shared programs   : 53524 -> 53524 (0.00%)
total bytes used in shared programs   : 54006744 -> 53932472 (-0.14%)

                local     shared        gpr       inst      bytes
    helped           0           0           2        4192        4192
      hurt           0           0           7           9           9

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
[imirkin: minor edits to separate nv50 and nvc0+ cases]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-21 10:53:59 -04:00
Rhys Perry
cc35b76e99 docs/features: mark GL_ARB_post_depth_coverage as DONE for nvc0
This was done a while ago but never marked on features.txt. Note that
this is only supported on GM200+.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-21 10:02:55 -04:00
Dylan Baker
6754c2e83d autotools: Include new meson files
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:56 -07:00
Dylan Baker
5c8e2501a6 autotools: Add passes.h to sources so it will be included in the tarball
This was introduced in commit 8f848ada8a
but not added to the sources list, which is necessary for it to be
included in release tarballs.

Fixes: 8f848ada8a
       ("swr/rast: Start refactoring of builder/packetizer.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:54 -07:00
Dylan Baker
cfd7d2ba0d autotools: include include/vulkan headers
This is needed to provide vk_android_native_buffer.h for vk_enum_to_str.

v2: - remove accidentally included changes

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:49 -07:00
Rhys Perry
a0e57432b7 nvc0: fix line width on GM20x+
This has the side-effect of fixing polygon-offset piglit test failures.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-20 20:43:59 -04:00
Nanley Chery
7b20329107 i965/miptree: Delete an unused function
We're going to combine ::mcs_buf and ::hiz_buf in later commits. Once
that happens, this function no longer makes sense.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-20 17:14:37 -07:00
Nanley Chery
010abacc95 i965/miptree: Don't leak the clear_color_bo
Free the clear_color_bo in addition to freeing the
intel_miptree_aux_buffer which holds the reference to it.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-20 17:14:37 -07:00
Jason Ekstrand
9d2ef3c9ec i965/blorp: Do the gen11 BTI flush
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 16:30:14 -07:00
Jason Ekstrand
185630c6bc anv/blorp: Do the gen11 BTI flush
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 16:30:14 -07:00
Lucas Stach
52e93e309f etnaviv: fix texture_format_needs_swiz
memcmp returns 0 when both swizzles are the same, which means we don't
need any hardware swizzling. texture_format_needs_swiz should return
true when the return value of the memcmp is non-zero.
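
The inverted test is easy to see in a small model. This is only a sketch in
Python, not the etnaviv code: memcmp below is a stand-in for the C function,
and the identity swizzle encoding IDENTITY_SWIZ is an assumption made for
illustration.

```python
IDENTITY_SWIZ = bytes([0, 1, 2, 3])  # hypothetical encoding of X, Y, Z, W


def memcmp(a, b):
    # Stand-in for C memcmp on equal-length buffers: 0 means "identical".
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    return 0


def texture_format_needs_swiz(swiz):
    # Swizzling is needed only when the swizzle differs from the identity.
    return memcmp(swiz, IDENTITY_SWIZ) != 0
```

Returning the raw comparison result as a boolean would report "needs
swizzle" exactly when the two swizzles match, which is the bug described.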

Fixes: 751ae6afbe ("etnaviv: add support for swizzled texture formats")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2018-04-20 18:54:10 +02:00
Samuel Pitoiset
8f13975713 ac/nir: fix image dimension for subpass attachments
For subpass attachments we need one more coordinate with
the layer, so make them array types.

This fixes a bunch of CTS fails with RADV.

Fixes: 24fb3e6aa1 ("ac/nir: use ac_build_image_opcode for image intrinsics")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 18:44:51 +02:00
Bas Nieuwenhuizen
e1df849c3c radv: Mark GTT memory as device local for APUs.
Otherwise a lot of games complain about not having enough memory,
and it is sort of local so this seems reasonable to me.

CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-20 18:16:16 +02:00
Samuel Pitoiset
fedd0a4215 radv/winsys: allow to submit up to 4 IBs for chips without chaining
The SI family doesn't support chaining which means the maximum
size in dwords per CS is limited. When that limit was reached
we failed to submit the CS and the application crashed.

This patch allows submitting up to 4 IBs, which is currently the
limit, but recent amdgpu supports more than that.

Please note that we can reach the limit of 4 IBs per submit
but currently we can't improve that. The only solution is to
upgrade libdrm. That will be improved later but for now this
should fix crashes on SI or when using RADV_DEBUG=noibs.

Fixes: 36cb5508e8 ("radv/winsys: Fail early on overgrown cs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 18:12:26 +02:00
Stefan Schake
ff904978a1 gallium/util: Android backtrace support
We can't use any of the existing implementations in u_debug_stack.
Android technically has libunwind, but it's been modified to the point
where it no longer compiles with the Mesa usage. The library is also
not meant to be referenced by vendor libraries. The officially sanctioned
way of obtaining backtraces is through Android's own libbacktrace, a
C++ library. Access it through a separate C++ source file on Android only.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-20 18:49:49 +03:00
Stefan Schake
2abd4f4b49 gallium/util: Don't stub u_debug_stack on Android
The fallback path for no libunwind ends up being stubs for Android.
Don't compile them in so we can provide our own implementation.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-20 18:49:37 +03:00
Samuel Pitoiset
dd069e9b41 ac/nir: handle nir_intrinsic_load_first_vertex like base_vertex
This fixes a ton of CTS crashes.

Fixes: c366f422f0 ("nir: Offset vertex_id by first_vertex instead of base_vertex")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 17:07:38 +02:00
Samuel Pitoiset
b21a4efb55 radv/winsys: allow local BOs on APUs
Ported from RadeonSI.

Local BOs ignore BO priorities, and we don't need those on APUs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:24 +02:00
Samuel Pitoiset
5c1233ed62 radv: use a global BO list only for VK_EXT_descriptor_indexing
Maintaining two different paths is annoying but this gets
rid of the performance regression introduced by the global
BO list.

We might find a better solution in the future, but for now
just keep two paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:18 +02:00
Samuel Pitoiset
7bd5367546 Revert "radv: Don't store buffer references in the descriptor set."
In order to reduce a performance regression introduced by
4b13fe55a4 ("radv: Keep a global BO list for VkMemory."),
we are going to maintain two different paths.

One when VK_EXT_descriptor_indexing is enabled by the
application because we need to have a global BO list, and
one (the old one) when it's not enabled.

With Talos on Polaris, the global BO list reduces performance
by 10% which is too much for me.

This reverts commit ab6cadd3ec.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:13 +02:00
Jose Maria Casanova Crespo
eb96bd57c7 i965/fs: retype offset_reg to UD at load_ssbo
All operations with offset_reg at do_vector_read are done
with UD type. So copy propagation was not working through
the generated MOVs:

mov(8) vgrf9:UD, vgrf7:D

This change allows removing the MOV generated for reading the
first components for 16-bit and 64-bit ssbo reads with
non-constant offsets.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-20 13:30:12 +02:00
Nicolai Hähnle
24fb3e6aa1 ac/nir: use ac_build_image_opcode for image intrinsics
So that we'll use the dimension-aware intrinsics in the future.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:30:07 +02:00
Nicolai Hähnle
74063431f1 radeonsi: generate image load/store/atomic ops using ac_build_image_opcode
In preparation of dimension-aware LLVM image intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:29:57 +02:00
Nicolai Hähnle
625dcbbc45 amd/common: pass address components individually to ac_build_image_intrinsic
This is in preparation for the new image intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:52 +02:00
Nicolai Hähnle
f931583828 amd/common: pass new enum ac_image_dim to ac_build_image_opcode
This is in preparation for the new, dimension-aware LLVM image
intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:40 +02:00
Nicolai Hähnle
9cb52d470a radeonsi/nir: fix crash in test involving the sample mask
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:50 +02:00
Nicolai Hähnle
552bc37c6f radeonsi/nir: set FS properties only when scanning a fragment shader
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:47 +02:00
Nicolai Hähnle
a807a9b215 ac/nir: fix atomic compare-and-swap
The LLVM instruction returns { i32, i1 }, where the i1 indicates success.
We're only interested in the first part, which is the loaded value.
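
A toy model of the semantics, written in Python rather than LLVM IR: cmpxchg
and atomic_comp_swap are hypothetical names, and a list stands in for shared
memory. The first function returns the (loaded value, success) pair, and the
second keeps only the loaded value.

```python
def cmpxchg(memory, index, expected, new):
    # Model of a compare-and-swap that returns (loaded value, success flag).
    loaded = memory[index]
    success = loaded == expected
    if success:
        memory[index] = new
    return loaded, success


def atomic_comp_swap(memory, index, expected, new):
    # The shader-level operation only wants the value that was loaded.
    loaded, _success = cmpxchg(memory, index, expected, new)
    return loaded
```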

Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.*

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:40 +02:00
Nicolai Hähnle
e788b987d8 radeonsi: fix error paths of si_texture_transfer_map
trans is zero-initialized, but trans->resource is set up immediately, so
it needs to be dereferenced.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:33 +02:00
Nicolai Hähnle
68ee1d5796 glsl: prevent spurious Valgrind errors when serializing NIR
It looks as if the structure fields array is fully initialized below,
but in fact at least gcc in debug builds will not actually overwrite
the unused bits of bit fields.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:23 +02:00
Aaron Watry
354b12681b clover: Fix host access validation for sub-buffer creation
From CL 1.2 Section 5.2.1:
    CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and
    flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with
    CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if
    buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify
    CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY .
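
The rule quoted above can be sketched as a small predicate. This is a
Python illustration, not clover's C++ code; the function name
sub_buffer_host_flags_valid is hypothetical, while the flag values follow the
bit positions defined in CL/cl.h.

```python
CL_MEM_HOST_WRITE_ONLY = 1 << 7
CL_MEM_HOST_READ_ONLY = 1 << 8
CL_MEM_HOST_NO_ACCESS = 1 << 9


def sub_buffer_host_flags_valid(parent_flags, sub_flags):
    # Returns False for the combinations that must yield CL_INVALID_VALUE.
    if parent_flags & CL_MEM_HOST_WRITE_ONLY and sub_flags & CL_MEM_HOST_READ_ONLY:
        return False
    if parent_flags & CL_MEM_HOST_READ_ONLY and sub_flags & CL_MEM_HOST_WRITE_ONLY:
        return False
    host_rw = CL_MEM_HOST_READ_ONLY | CL_MEM_HOST_WRITE_ONLY
    if parent_flags & CL_MEM_HOST_NO_ACCESS and sub_flags & host_rw:
        return False
    return True
```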

Fixes CL 1.2 CTS test/api get_buffer_info

v2: Correct host_access_flags check (Francisco)

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-04-19 20:57:37 -05:00
Neil Roberts
c366f422f0 nir: Offset vertex_id by first_vertex instead of base_vertex
base_vertex will be zero for non-indexed calls and in that case we
need vertex_id to be offset by the ‘first’ parameter instead. That is
what we get with first_vertex. This is true for both GL and Vulkan.

The freedreno driver is also setting vertex_id_zero_based on
nir_options. In order to avoid breakage this patch switches the
relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can
retain the same behavior.

v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from
SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-19 15:57:45 -07:00
Neil Roberts
c4f30a9100 spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX
The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.

v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used
for SpvBuiltInBaseVertex.  Suggested by Jason.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-19 15:57:45 -07:00
Antia Puentes
c32e1035cb intel: Handle firstvertex in an identical way to BaseVertex
Until we set gl_BaseVertex to zero for non-indexed draw calls
both have an identical value.

The Vertex Elements are kept like this:
* VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID>
* VE 2: <Draw ID, 0, 0, 0>

v2 (idr): Mark nir_intrinsic_load_first_vertex as "unreachable" in
emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic.
2018-04-19 15:57:45 -07:00
Neil Roberts
0c8395e15d intel/compiler: Add a uses_firstvertex flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-19 15:57:45 -07:00
Antia Puentes
5ff848df7b compiler: Add SYSTEM_VALUE_FIRST_VERTEX and intrinsics
This VS system value will contain the value passed as <basevertex> for
indexed draw calls or the value passed as <first> for non-indexed draw
calls. It can be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.

From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays":

-  Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance
is referred to as its vertex ID, and may be read by a vertex shader as
gl_VertexID.  The vertex ID of the ith element transferred is first +
i."

- Page 355:
"The index of any element transferred to the GL by
DrawElementsOneInstance is referred to as its vertex ID, and may be read
by a vertex shader as gl_VertexID.  The vertex ID of the ith element
transferred is the sum of basevertex and the value stored in the
currently bound element array buffer at offset indices + i."

Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but
this will have to change when the value of gl_BaseVertex is
fixed. Currently its value is broken for non-indexed draw calls because
it must be zero but we are setting it to <first>.
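
The relationship between these values can be worked through with a short
sketch. It is a Python model of the arithmetic in the spec quotes above, not
compiler code, and all function names are hypothetical.

```python
def first_vertex(indexed, first=0, basevertex=0):
    # <basevertex> for indexed draws, <first> for non-indexed draws.
    return basevertex if indexed else first


def base_vertex(indexed, basevertex=0):
    # gl_BaseVertex is expected to be zero for non-indexed draws.
    return basevertex if indexed else 0


def draw_arrays_vertex_ids(first, count):
    # Zero-based ID is i; add the first-vertex value to get gl_VertexID.
    fv = first_vertex(False, first=first)
    return [i + fv for i in range(count)]


def draw_elements_vertex_ids(indices, basevertex):
    # Zero-based ID is the fetched index; add the first-vertex value.
    fv = first_vertex(True, basevertex=basevertex)
    return [idx + fv for idx in indices]
```

For a non-indexed draw with first = 10, the vertex IDs start at 10 while the
base vertex stays 0, which is why the two values have to be tracked
separately.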

v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).

v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be
generated.  Reformat commit message to 72 columns.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-19 15:57:45 -07:00
Mike Lothian
051fddb4a9 meson: Build st_tests_common with gtest
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106131
Fixes: 34cb4d0ebc ("meson: build tests for gallium mesa state tracker")
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-19 09:04:51 -07:00
Bas Nieuwenhuizen
dffdef6737 radv: Add Vega M support.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-19 16:36:21 +02:00
Bas Nieuwenhuizen
d1ce31d36c radv: Add bound checking workaround for dynamic buffers.
I have seen a few applications and games do the dynamic buffer bounds
incorrectly; this makes it easier to work around, e.g. for debugging.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-19 16:13:25 +02:00
Thomas Hellstrom
e0c08183fb svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace
When advertizing this extension, egl_dri2 uses the DRI2_RENDERER_QUERY
extension to query whether an sRGB format is supported. That extension will
query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than
PIPE_BIND_DISPLAY_TARGET which is used when building the configs.
We only return the correct value for PIPE_BIND_DISPLAY_TARGET.

The inconsistency causes EGL to crash at surface initialization if sRGB is
not supported. Fix this by supporting both bind flags.

Testing done:
piglit egl_gl_colorspace srgb

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-19 13:42:51 +02:00
Mike Lothian
79487c427e swr: Fix include for createPromoteMemoryToRegisterPass
Include llvm/Transforms/Utils.h with the newest LLVM 7

v2: Include with " " rather than < > (Vinson Lee)

v3: Use LLVM_VERSION_MAJOR rather than HAVE_LLVM (George Kyriazis)

Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-04-19 00:39:04 -07:00
Samuel Pitoiset
2f63b3dd09 radv: enable DCC for MSAA 2x textures on VI under an option
This can be enabled with RADV_PERFTEST=dccmsaa.

DCC for MSAA textures is actually not as easy to implement. It
looks like there is some corner cases. I will improve support
incrementally.

Vega support, as well as Polaris improvements, will be added later.

No CTS changes on Polaris using RADV_DEBUG=zerovram and
RADV_PERFTEST=dccmsaa.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:55 +02:00
Samuel Pitoiset
dc3d39771f radv: decompress DCC for multisampled source images before resolving
Multisampled source images (i.e. color attachments) can now be
DCC compressed, so the driver needs to perform a DCC decompression
pass before resolving.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:52 +02:00
Samuel Pitoiset
1aefb62f1e radv: add a workaround for fast clears with DCC and MSAA textures
This should be fixed at some point in order to improve
performance.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:50 +02:00
Samuel Pitoiset
373fa0b599 radv: allocate CMASK for DCC fast clear with MSAA
CMASK is required because it should be cleared to
0xCCCCCCCC for MSAA textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:48 +02:00
Samuel Pitoiset
255506c4e0 radv: implement fast color clear for DCC with MSAA
When DCC is enabled with MSAA textures, CMASK should be
cleared to 0xCCCCCCCC.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:45 +02:00
Samuel Pitoiset
796b6f4aab radv: make sure to sync after resolving using the compute path
This fixes some random CTS failures:

dEQP-VK.renderpass.multisample.*.

Performing a fast-clear eliminate is still useless, but it
seems that we need to sync.

Found while running CTS with RADV_DEBUG=zerovram.

Fixes: 56a171a499 ("radv: don't fast-clear eliminate after resolving a subpass with compute")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:09:55 +02:00
Samuel Pitoiset
4a698660ae radv: dump the SHA1 of SPIRV in the hang report
Might be useful for debugging purposes, especially when we
want to replace a shader on the fly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:09:52 +02:00
Bas Nieuwenhuizen
0e10790558 radv: Enable VK_EXT_descriptor_indexing.
This adds everything except non-uniform indexing, which needs a bit
more work and testing.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
5f7ebb5206 spirv: Add support for runtime descriptor array cap.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
c48feaf2d1 spirv: Add support for VK_EXT_descriptor_indexing uniform indexing caps.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
b5e04e9217 radv: Support allocating variable size descriptor sets.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
78c54acbe8 radv: Add support for variable descriptor set layouts.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
082c11e8a5 radv: Fix GetDescriptorSetLayoutSupport.
The continue means we do alignment differently than during creation,
making the buffer smaller than expected.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
d02bbde1a8 radv: Use sorted bindings for set layout creation.
Previously we did not care about having the set storage in order,
but for variable descriptor count we want the highest binding
at the end of the storage.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
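The reasoning in d02bbde1a8 can be illustrated with a small Python model: if bindings are laid out in ascending binding order, the highest binding lands at the end of the storage, so a variable descriptor count only changes the tail. This is a sketch, not the radv code; the function name `layout_offsets` and the byte sizes are hypothetical.

```python
def layout_offsets(bindings):
    """Assign storage offsets in ascending binding order.

    bindings: list of (binding_number, size_in_bytes) in any order.
    Returns ({binding_number: offset}, total_size).
    """
    offsets = {}
    offset = 0
    # Sorting by binding number puts the highest binding last in storage.
    for number, size in sorted(bindings):
        offsets[number] = offset
        offset += size
    return offsets, offset


# Bindings arrive unsorted; binding 2 is the (hypothetical) variable-count one.
offsets, total = layout_offsets([(2, 64), (0, 16), (1, 32)])
```

Because binding 2 starts at the highest offset, shrinking its descriptor count in this model only shortens the total size and leaves the other offsets untouched.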
Bas Nieuwenhuizen
ab6cadd3ec radv: Don't store buffer references in the descriptor set.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
4b13fe55a4 radv: Keep a global BO list for VkMemory.
With update after bind we can't attach bo's to the command buffer
from the descriptor set anymore, so we have to have a global BO
list.

I am somewhat surprised this works really well even though we have
implicit synchronization in the WSI based on the bo list associations
and with the new behavior every command buffer is associated with
every swapchain image. But I could not find slowdowns in games because
of it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
22d6b89e39 spirv: Update spirv.h to 12f8de9f04327336b699b1b80aa390ae7f9ddbf4
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Kenneth Graunke
da25ae92be i965: Fix shadow batches to be the same size as the real BO.
brw_bo_alloc may round up our allocation size to the next bucket size.
In this case, we would malloc a shadow buffer that was the original
intended size, but use bo->size (the larger size) for all of our checks.

This could cause us to run off the end of the shadow buffer.

v2: Actually use the new BO size (caught by Lionel)

Reported-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c7dcee58b5 (i965: Avoid problems from referencing orphaned BOs after growing.)
2018-04-18 13:55:08 -07:00
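The overrun described in da25ae92be can be modelled in a few lines of Python. This is a sketch under stated assumptions: the bucket sizes and the names `bo_alloc_size`, `make_shadow_buggy` and `make_shadow_fixed` are hypothetical and only mirror the behaviour the commit message describes (the allocator may round up, and the checks use the rounded size).

```python
# Hypothetical bucket sizes; the real allocator's buckets are not in the log.
BUCKETS = [4096, 8192, 16384, 32768]


def bo_alloc_size(requested):
    """Round a request up to the next bucket, as a bucketing allocator may."""
    for size in BUCKETS:
        if requested <= size:
            return size
    return requested


def make_shadow_buggy(requested):
    bo_size = bo_alloc_size(requested)
    shadow = bytearray(requested)  # sized from the original request
    return shadow, bo_size         # ...but bounds checks use bo_size


def make_shadow_fixed(requested):
    bo_size = bo_alloc_size(requested)
    shadow = bytearray(bo_size)    # sized from the real BO size
    return shadow, bo_size
```

In the buggy variant a 5000-byte request yields a 5000-byte shadow checked against an 8192-byte limit, which is the mismatch that lets writes run past the end of the shadow buffer.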
Marek Olšák
7bd24d951a glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract
This fixes some piglits.

Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 15:34:52 -04:00
Leo Liu
90de03708f radeon/vce: disable vce dual pipe on VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:35 -04:00
Marek Olšák
c6f1d36019 radeonsi: add support for VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:33 -04:00
Marek Olšák
d6a66bc8db amd/addrlib: add support for VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:32 -04:00
Marek Olšák
d15fb766aa radeonsi/gfx9: fix a hang with an empty first IB
This packet causes the no-op IB detection to fail, so the IB is always
submitted. Also fix the no-op IB detection by moving the begin call.

Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:42:06 -04:00
Dylan Baker
d28c246501 meson: build graw tests
This only enables the null and xlib target, so no windows support yet.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
34cb4d0ebc meson: build tests for gallium mesa state tracker
v2: - Fix typo

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
de01018293 meson: build gallium unit tests
v2: - gate unit tests on swrast being enabled (Eric A)
v3: - rebase on libtrace being merged with gallium auxiliary

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
2018-04-18 09:03:57 -07:00
Dylan Baker
4c794c7834 meson: Build gallium trivial tests
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
7fee8fed16 meson: Remove TODO about mesa/main tests
They're already done.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
5d16c86add meson: enable glcpp test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
db8cd8e367 glcpp/tests: Convert shell scripts to a python script
This ports glcpp-test.sh and glcpp-test-cr-lf.sh to a python script that
accepts arguments for each line ending type. This should allow for
better reporting to users.

v2: - Use $PYTHON2 to be consistent with other tests in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
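As a rough illustration of what "accepts arguments for each line ending type" in db8cd8e367 could look like, here is a minimal Python sketch. The flag names, the `LINE_ENDINGS` table and both helper functions are assumptions made for this example and are not taken from the actual script.

```python
import argparse

# Assumed set of line-ending types; the log only says "each line ending type".
LINE_ENDINGS = {
    "unix": "\n",
    "windows": "\r\n",
    "oldmac": "\r",
    "bizarro": "\n\r",
}


def with_line_ending(text, kind):
    """Rewrite a Unix-style test input to use the requested line ending."""
    return text.replace("\n", LINE_ENDINGS[kind])


def parse_kinds(argv):
    """Return the line-ending types selected on the command line."""
    parser = argparse.ArgumentParser()
    for kind in LINE_ENDINGS:
        parser.add_argument("--" + kind, action="store_true")
    args = parser.parse_args(argv)
    return [kind for kind in LINE_ENDINGS if getattr(args, kind)]
```

One flag per type lets a build system register each line-ending variant as a separate test, which is one way to get the more detailed reporting the commit mentions.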
Dylan Baker
8cb96c4031 glsl/tests: Remove unused compare_ir.py script
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
Dylan Baker
877d250ea1 meson: enable optimization-test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
Dylan Baker
97c28cb082 glsl/tests: Convert optimization-test.sh to pure python
This patch converts optimization-test.sh to python; in the process it
removes external shell dependencies including diff. It replaces the
python script that generates shell scripts with a python library that
generates test cases and runs them using subprocess.

v2: - use $PYTHON2 to be consistent with other tests in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
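Dropping the external diff dependency, as 97c28cb082 describes, can be done with the standard library alone. The sketch below uses `difflib.unified_diff`; whether the real test uses difflib is an assumption, and the helper name `compare_output` is hypothetical.

```python
import difflib


def compare_output(expected, actual):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


# A test passes when the diff is empty; otherwise the lines can be printed.
diff = compare_output("(declare x)\n(return)", "(declare x)\n(discard)")
```

In a full test runner the `actual` text would come from running the compiler through subprocess, as the commit message says; that part is omitted here because the command line is not given in the log.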
Dylan Baker
ad9c2f2018 meson: run glsl compiler warnings test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
3b52d29227 glsl/tests: reimplement warnings-test in python
This reimplements the test in python with a shell script wrapper that
allows autotools to continue to run the test without realizing that
anything has changed.

Using python has two advantages. First, it's portable, so this test can be
run on windows as well as Linux since it just requires python, no more
diff, pwd or sh. Second, it's no longer tied to autotools implementation
details, like the environment variables $srcdir and $abs_builddir
(though the autotools shell wrapper still uses those), which makes it
possible to run the test in meson.

v2: - Use $PYTHON2 in script to be consistent with other scripts in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
George Kyriazis
12a002a3a1 swr/rast: Fix VGATHERPD lowering
Also implement VHSUBPS in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
99fe90722d swr/rast: Replace x86 VMOVMSK with llvm-only implementation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
0899122c03 swr/rast: Optimize late/bindless JIT of samplers
Add per-worker thread private data to all shader calls
Add per-worker sampler cache and jit context
Add late LoadTexel JIT support
Add per-worker-thread Sampler / LoadTexel JIT

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
ec7154abc0 swr/rast: Implement VROUND intrinsic in x86 lowering pass
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
bb02da3c1b swr/rast: Refactor to improve code sharing.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
94ca1c018f swr/rast: minimize codegen redundant work
Move filtering of redundant codegen operations into gen scripts themselves

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
7f34860125 swr/rast: double-pump in x86 lowering pass
Add support for double-pumping a smaller SIMD width intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
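"Double-pumping" in 7f34860125 means issuing a narrower SIMD operation twice to cover a wider vector. A minimal Python model, assuming an 8-wide operation is used to build a 16-wide one; `add_simd8` and `double_pump` are hypothetical names and lists stand in for SIMD registers.

```python
def add_simd8(a, b):
    """Stand-in for an 8-wide intrinsic: element-wise add of 8 lanes."""
    assert len(a) == 8 and len(b) == 8
    return [x + y for x, y in zip(a, b)]


def double_pump(op8, a, b):
    """Apply an 8-wide operation to 16-wide inputs by running it twice."""
    assert len(a) == 16 and len(b) == 16
    lo = op8(a[:8], b[:8])   # lower half of the lanes
    hi = op8(a[8:], b[8:])   # upper half of the lanes
    return lo + hi           # concatenate back into a 16-wide result


result = double_pump(add_simd8, list(range(16)), [1] * 16)
```

This only works directly for lane-wise operations; anything that moves data across the two halves would need extra handling, which this sketch does not attempt.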
George Kyriazis
96ad8f5a23 swr/rast: Fix 64bit float loads in x86 lowering pass
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
1ffbbbee97 swr/rast: Add shader stats infrastructure (WIP)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a81c625cb7 swr/rast: Type-check TemplateArgUnroller
Allows direct use of enum values in conversion to template args.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
2966ee1028 swr/rast: Add vgather to x86 lowering pass.
Add support for generic VGATHERPD intrinsic in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
e4929b5d26 swr/rast: fix comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
670a99c233 swr/rast: add cvt instructions in x86 lowering pass
Support generic VCVTPD2PS and VCVTPH2PS in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
aa482014e5 swr/rast: Fix alloca usage in jitter
Fix issue where temporary allocas were getting hoisted to function entry
unnecessarily. We now explicitly mark temporary allocas and skip hoisting
during the hoist pass. Should reduce stack usage.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
81371a5909 swr/rast: Change gfx pointers to gfxptr_t
Changing type to gfxptr for indices and related changes to fetch and mem
builder code.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
71239478d3 swr/rast: Fix byte offset for non-indexed draws
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c57b594317 swr/rast: Add support for setting optimization level
for JIT compilation

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
4f0df5e2f7 swr/rast: Adding translate call to builder_gfx_mem.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f135f54b18 swr/rast: Fix codegen for typedef types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c5d7b37fe7 swr: add x86 lowering pass to fragment shader
Needed because some FP paths (namely stipple) use gather intrinsics
that now need to be lowered to x86.

v2: fix typo in commit message
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
9161c40d14 swr/rast: Enable generalized fetch jit
Enable generalized fetch jit with 8 or 16 wide SIMD target. Still some
work needed to remove some simd8 double pumping for 16-wide target.

Also removed unused non-gather load vertices path.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d73082b98b swr/rast: Add builder_gfx_mem.{h|cpp}
Abstract usage scenarios for memory accesses into builder_gfx_mem.
Builder_gfx_mem will convert gfxptr_t from 64-bit int to regular pointer
types for use by builder_mem.

v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
1eb72673fc swr/rast: Lower VGATHERPS and VGATHERPS_16 to x86.
Some more work to do before we can support simultaneous 8-wide and
16-wide and remove the VGATHERPS_16 version.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
b15fb78df5 swr/rast: Cleanup of JitManager convenience types
Small cleanup. Remove convenience types from JitManager and standardize
on the Builder's convenience types.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d68694016c swr/rast: Lower PERMD and PERMPS to x86.
Add support for providing an emulation callback function for arch/width
combinations that don't map cleanly to an x86 intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
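The emulation-callback idea in d68694016c can be pictured as a lookup keyed by architecture and SIMD width, where each entry is either a direct mapping or a fallback function. The Python model below is a sketch only: the table contents, the names `LOWERING` and `lower_permute`, and which combinations are marked as emulated are assumptions for illustration, not the swr implementation.

```python
def native_permute(src, idx):
    """Stands in for a single hardware permute instruction."""
    return [src[i] for i in idx]


def emulated_permute(src, idx):
    """Fallback: build the result one lane at a time."""
    out = []
    for i in idx:
        out.append(src[i])
    return out


# (arch, width) -> (kind, implementation); entries are hypothetical.
LOWERING = {
    ("AVX2", 8): ("intrinsic", native_permute),
    ("AVX", 8): ("emulated", emulated_permute),
}


def lower_permute(arch, width, src, idx):
    """Pick the lowering for this target and apply it."""
    kind, impl = LOWERING[(arch, width)]
    return kind, impl(src, idx)
```

Both paths must produce the same lanes; only the way the result is obtained differs per target.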
George Kyriazis
8f848ada8a swr/rast: Start refactoring of builder/packetizer.
Move x86 intrinsic lowering to a separate pass. Builder now instantiates
generic intrinsics for features not supported by llvm. The separate x86
lowering pass is responsible for lowering to valid x86 for the target
SIMD architecture. Currently it's a port of existing code to get it
up and running quickly. Will eventually support optimized x86 for AVX,
AVX2 and AVX512.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
ffc0aeb4ec swr/rast: Simplify #define usage in gen source file
Removed preprocessor defines from structures passed to LLVM jitted code.

The python scripts do not understand the preprocessor defines and ignore
them. So for fields that are compiled out due to a preprocessor define
the LLVM script accounts for them anyway because it doesn't know what
the defines are set to. The sanitize defines for open source are fine
in that they're safely used.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f36026ce2e swr/rast: Move CallPrint() to a separate file
Needed work for jit code debug.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
67c8bb4db7 swr/rast: Fix name mangling for LLVM pow intrinsic
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
7a5054aa1c swr/rast: Add some archrast counters
Hook up archrast counters for shader stats: instructions executed.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f52a501716 swr/rast: Code cleanup
Removing some code that doesn't seem to do anything meaningful.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
093c1aee88 swr/rast: Add "Num Instructions Executed" stats intrinsic.
Added a SWR_SHADER_STATS structure which is passed to each shader. The
stats pass will instrument the shader to populate this.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
5fbee5e4ef swr/rast: Add MEM_ADD helper function to Builder.
mem[offset] += value

This function will be heavily used by all stats intrinsics.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
9103119cb3 swr/rast: Permute work for simd16
Fix slow permutes in PA tri lists under SIMD16 emulation on AVX

Added missing permute (interlane, immediate) to SIMDLIB

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
4c69823d15 swr/rast: WIP builder rewrite (2)
Finish up the remaining explicit intrinsic uses. At this point all
explicit Intrinsic::getDeclaration() usage has been replaced with auto
generated macros generated with gen_llvm_ir_macros.py. Going forward,
make sure to only use the intrinsics here, adding new ones as needed.

Next step is to remove all references to x86 intrinsics to keep the
builder target-independent. Any x86 lowering will be handled by a
separate pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c2163dc56a swr/rast: Add autogen of helper llvm intrinsics.
Replace sqrt, maskload, fp min/max, cttz, ctlz with llvm equivalent.
Replace AVX maskedstore intrinsic with LLVM intrinsic. Add helper llvm
macros for stacksave, stackrestore, popcnt.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
6427315e43 swr/rast: WIP builder rewrite.
Start removing avx2 macros for functionality that exists in llvm.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a16f8e0554 swr/rast: LLVM 6 fix
for getting masked gather intrinsic (also compatible with LLVM 4)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a92cc09c7a swr/rast: Changes to allow jitter to compile with LLVM5
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
0f6fef9632 swr/rast: Add some archrast stats
Add stats for degenerate and backfacing primitive counts

Wire archrast stats for alpha blend and alpha test:
pass a value to the jitter and, upon return, have an archrast event increment a value.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
b488028854 swr/rast: Silence some unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
e84bfec4ab swr/rast: Add debug type info for i128
Help support debug info in 16 wide shaders.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a3edcfe1fb swr/rast: Use blend context struct to pass params
Stuff parameters into a blend context struct before passing down through
the PFN_BLEND_JIT_FUNC function pointer. Needed for stat changes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
be6cf0fd7c swr/rast: Introduce JIT_MEM_CLIENT
Add assert for correct usage of memory accesses

v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d34edffe48 swr/rast: Add some instructions to jitter
VPHADDD, PMAXUD, PMINUD

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
Juan A. Suarez Romero
4aa03581b5 docs: update calendar, add news and link release notes to 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-18 15:29:12 +00:00
Juan A. Suarez Romero
ad51d8871e docs: add sha256 checksums for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit a1c421c638)
2018-04-18 15:25:32 +00:00
Juan A. Suarez Romero
76cadaa1de docs: add release notes for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 8bd719e3fa)
2018-04-18 15:25:30 +00:00
Juan A. Suarez Romero
193d615917 docs: update calendar, add news and link release notes to 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-18 09:45:11 +00:00
Juan A. Suarez Romero
6372227209 docs: add sha256 checksums for 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cf0864dc63)
2018-04-18 09:40:44 +00:00
Juan A. Suarez Romero
6a1261bd09 docs: add release notes for 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 6d88ea9dd4)
2018-04-18 09:40:42 +00:00
Dylan Baker
b9ad5282ba Revert "meson: add wrap for libdrm"
This reverts commit 6217eedc9b.

I was using this for testing and accidentally put it on master

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-17 13:48:55 -07:00
Dylan Baker
efcbcfa7c8 Revert "Add subprojects directory and git ignore"
This reverts commit 21e2e73f71.

I was using this for testing and accidentally put it on master

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-17 13:48:43 -07:00
Jan Alexander Steffens (heftig)
5cf752b18b meson: Version libMesaOpenCL like autotools does
This is for parity with autotools. It names the library
libMesaOpenCL.so.1.0.0 and points mesa.icd to the .1 symlink.

opencl_version now matches configure.ac's OPENCL_VERSION.

Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-17 13:46:15 -07:00
Jan Alexander Steffens (heftig)
5bb98cfd92 meson: Add library versions to swr drivers
This is for parity with autotools.

Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-17 13:46:15 -07:00
Dylan Baker
6217eedc9b meson: add wrap for libdrm
Currently this requires libdrm from git, since the version reported by
meson is wrong.
2018-04-17 13:46:15 -07:00
Dylan Baker
21e2e73f71 Add subprojects directory and git ignore
For meson wraps.
2018-04-17 13:46:15 -07:00
Samuel Pitoiset
893e19efb7 radv: fix scissor computation when using half-pixel viewport offset
'scale[i]' can be non-integer.

Original patch by Philip Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074
Fixes: 0f3de89a56 ("radv: Use the guard band.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-17 22:12:14 +02:00
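Why a non-integer `scale[i]` matters in 893e19efb7 can be shown with a small Python model. It assumes the usual viewport transform (scale is half the viewport size, translate is its centre) and contrasts integer truncation with explicit floor/ceil rounding; the function names are hypothetical and the exact rounding used by the radv fix is not stated in the log.

```python
import math


def viewport_xform(x, y, width, height):
    """Assumed transform: scale = half size, translate = centre."""
    scale = (width / 2.0, height / 2.0)
    translate = (x + scale[0], y + scale[1])
    return scale, translate


def scissor_truncating(x, y, width, height):
    """Naive version: converting to int drops the fractional part."""
    scale, translate = viewport_xform(x, y, width, height)
    off_x = int(translate[0] - abs(scale[0]))
    off_y = int(translate[1] - abs(scale[1]))
    return (off_x, off_y, int(2 * abs(scale[0])), int(2 * abs(scale[1])))


def scissor_covering(x, y, width, height):
    """Round the minimum down and the maximum up so no pixel is lost."""
    scale, translate = viewport_xform(x, y, width, height)
    min_x = math.floor(translate[0] - abs(scale[0]))
    min_y = math.floor(translate[1] - abs(scale[1]))
    max_x = math.ceil(translate[0] + abs(scale[0]))
    max_y = math.ceil(translate[1] + abs(scale[1]))
    return (min_x, min_y, max_x - min_x, max_y - min_y)
```

For a viewport starting at x = 0.5 with width 100, the truncating version covers pixels 0..99 while the viewport reaches into pixel 100; the covering version widens the scissor to include it.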
Neil Roberts
608d70bc02 spirv: Accept doubles in FaceForward, Reflect and Refract
The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32- or 16-bit
float regardless of the other operands. However in practice it seems
there is a bug in glslang that makes it convert the argument to 64-bit
if you actually try to pass it a 32-bit value while the other
arguments are 64-bit. This adds an optional conversion of the final
argument in order to support any type.

These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.

The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279

v2: Convert the eta operand of Refract from any size in order to make
    it eventually cope with 16-bit floats.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:11 +02:00
Neil Roberts
6e499572b9 spirv: Add a 64-bit implementation of OpIsInf
The only change necessary is to change the type of the constant used
to compare against.

This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.

v2: Use nir_imm_floatN_t for the constant.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:06 +02:00
Neil Roberts
696f4abcbc spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
There is an existing macro that is used to choose between either a
float or a double immediate constant based on the bit size of the
first operand to the builtin. This is now changed to use the new
nir_imm_floatN_t helper function to reduce the number of places that
make this decision.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:03 +02:00
Neil Roberts
e7b2c125c3 nir/builder: Add a nir_imm_floatN_t helper
This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:57:36 +02:00
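The point of e7b2c125c3 is to keep the "which float type for this bit size" decision in one place. A Python analogue, using `struct` to encode the immediate, is sketched below; the name `imm_float_n` and the byte-string return value are illustrative assumptions and do not reflect the NIR builder API.

```python
import struct

# One table holds the bit-size decision; supporting 16-bit floats later
# would mean adding a single entry here (struct format "e").
_FORMATS = {
    32: "<f",
    64: "<d",
}


def imm_float_n(value, bit_size):
    """Encode a float immediate at the requested bit size."""
    fmt = _FORMATS.get(bit_size)
    if fmt is None:
        raise ValueError("unsupported bit size: %d" % bit_size)
    return struct.pack(fmt, value)


half = imm_float_n(0.5, 64)  # 8 bytes
one = imm_float_n(1.0, 32)   # 4 bytes
```

Callers pass the bit size of their operand instead of choosing between a float and a double constructor themselves, which is the duplication the later GLSL450 commit removes.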
Timothy Arceri
6e22ad6edc nir: return early when lowering a return at the end of a function
Otherwise we create unused conditional return flags and things
get unnecessarily ugly fast when lowering nested functions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 14:17:56 +10:00
Timothy Arceri
d3cafc18fc mesa: merge the driver functions DrawBuffers and DrawBuffer
The extra params were unused by the drivers that used DrawBuffers.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-17 14:17:48 +10:00
Marc Dietrich
268d8f244b glsl: fix gcc 8 parenthesis warning
fixes warnings like this:
[184/1137] Compiling C++ object 'src/compiler/glsl/glsl@sta/lower_jumps.cpp.o'.
In file included from ../src/mesa/main/mtypes.h:48,
                 from ../src/compiler/glsl_types.h:149,
                 from ../src/compiler/glsl/lower_jumps.cpp:59:
../src/compiler/glsl/lower_jumps.cpp: In member function '{anonymous}::block_record {anonymous}::ir_lower_jumps_visitor::visit_block(exec_list*)':
../src/compiler/glsl/list.h:650:17: warning: unnecessary parentheses in declaration of 'node' [-Wparentheses]
    for (__type *(__inst) = (__type *)(__list)->head_sentinel.next; \
                 ^
../src/compiler/glsl/lower_jumps.cpp:510:7: note: in expansion of macro 'foreach_in_list'
       foreach_in_list(ir_instruction, node, list) {
       ^~~~~~~~~~~~~~~

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-17 11:53:59 +10:00
Rob Clark
2a55344e7d compiler: int8/uint8 fixes
A couple spots were missed for handling of the new INT8/UINT8 base type.

Also de-duplicate get_base_type(): get_scalar_type() had nearly the
same switch statement, with the exception that anything with base_type
that was not scalar would return error_type.  So just handle that one
special case in get_scalar_type().

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-16 20:41:18 -04:00
Marek Olšák
60299e9abe radeonsi: don't emit partial flushes for internal CS flushes only
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Marek Olšák
692f550740 winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE
There is a kernel patch that adds the new flag.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Marek Olšák
1b3199d14d radeonsi: implement mechanism for IBs without partial flushes at the end (v6)
(This patch doesn't enable the behavior. It will be enabled in a later
commit.)

Draw calls from multiple IBs can be executed in parallel.

v2: do emit partial flushes on SI
v3: invalidate all shader caches at the beginning of IBs
v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed,
    only do this for flushes invoked internally
v5: empty IBs should wait for idle if the flush requires it
v6: split the commit

If we artificially limit the number of draw calls per IB to 5, we'll get
a lot more IBs, leading to a lot more partial flushes. Let's see how
the removal of partial flushes changes GPU utilization in that scenario:

With partial flushes (time busy):
    CP: 99%
    SPI: 86%
    CB: 73%

Without partial flushes (time busy):
    CP: 99%
    SPI: 93%
    CB: 81%

Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Erico Nunes
d19b488339 nir: fix ir_binop_gequal glsl_to_nir conversion
ir_binop_gequal needs to be converted to nir_op_sge when native integers
are not supported in the driver.
Otherwise it becomes no different than ir_binop_less after the
conversion.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-16 07:59:25 -07:00
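The distinction fixed in d19b488339 can be modelled in Python. Without native integers, comparisons produce float results (1.0 or 0.0), so "greater or equal" must select a set-on-greater-equal operation rather than the set-on-less-than one. The op names follow the commit message; the selection table itself is a sketch, not the glsl_to_nir code.

```python
def sge(a, b):
    """Float-result 'set on greater or equal'."""
    return 1.0 if a >= b else 0.0


def slt(a, b):
    """Float-result 'set on less than'."""
    return 1.0 if a < b else 0.0


def fge(a, b):
    """Boolean-result comparison for drivers with native integers."""
    return a >= b


def flt(a, b):
    return a < b


def select_op(glsl_op, native_integers):
    """Choose the comparison op for a GLSL IR binop."""
    table = {
        ("less", True): flt,
        ("gequal", True): fge,
        ("less", False): slt,
        ("gequal", False): sge,  # the entry the commit corrects
    }
    return table[(glsl_op, native_integers)]
```

If the last entry pointed at `slt`, `gequal` and `less` would behave identically on such drivers, which is the symptom the commit message describes.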
Jason Ekstrand
72ab499c9f anv,radv: Drop XML workarounds for VK_ANDROID_native_buffer
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-16 07:59:25 -07:00
Jason Ekstrand
35ef0f767e vulkan: Update the XML and headers to 1.1.73
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-16 07:59:25 -07:00
Samuel Pitoiset
62510846b6 radv: clean up radv_decompress_resolve_subpass_src()
To handle the source color image transitions in the same place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:21:05 +02:00
Samuel Pitoiset
56a171a499 radv: don't fast-clear eliminate after resolving a subpass with compute
That looks useless, and I think radv_handle_image_transition()
will do a fast-clear eliminate because it's called after the
resolve.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:21:02 +02:00
Samuel Pitoiset
7e84d69861 radv: handle CMASK/FMASK transitions only if DCC is disabled
DCC implies a fast-clear eliminate, so I think this sounds
reasonable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:59 +02:00
Samuel Pitoiset
584d1f2711 radv: merge radv_handle_{dcc,cmask}_image_transition() functions
Into radv_handle_color_image_transition().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:56 +02:00
Samuel Pitoiset
d5812b900b radv: add radv_init_color_image_metadata() helper
In order to separate initialization from decompression. In the
future, that will allow us to init DCC/FMASK/CMASK in one shot.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:54 +02:00
Samuel Pitoiset
fde7b90ecf radv: make radv_initialise_cmask() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:51 +02:00
Samuel Pitoiset
790f6e4718 radv: clean up radv_handle_image_transition() a bit
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:49 +02:00
Samuel Pitoiset
6967d32beb radv: add radv_handle_color_image_transition() helper
To handle CMASK, FMASK and DCC transitions in the same place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:45 +02:00
Samuel Pitoiset
c6b1f1c97a radv: handle DCC image transitions before CMASK/FMASK transitions
Mostly because DCC implies a fast-clear eliminate and we
should be able to skip some DCC decompressions by setting
a predicate like for CMASK and FMASK.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:42 +02:00
Samuel Pitoiset
79c87a45b6 radv: disable predication only if it has been enabled
When decompressing DCC we don't enable it, so it's useless
to disable it. This reduces the number of predication packets
sent to the GPU when performing color decompression passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:39 +02:00
Bas Nieuwenhuizen
b0e3a9b19f ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.
No clue how I missed those ...

Fixes: 4503ff760c "ac/nir: Add workaround for GFX9 buffer views."
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 11:55:48 +02:00
Brian Paul
6a519a157b gallium/osmesa: link with winsock2 library on Windows
To fix the MSVC build.  The build broke because we started to compile
the ddebug code on Windows after the mtypes.h changes.  Building ddebug
caused us to also use the u_network.c code for the first time.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
201c08c463 gallium/util: put (void) in a few function signatures
To match the header file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
65d1040435 ddebug: add PIPE_OS_UNIX/LINUX checks to fix MSVC build
Don't include Unix headers or use Unix functions when building with MSVC.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
6d41edbf8a mesa: protect #include of unistd.h with _MSC_VER check
unistd.h is unix only.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
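The guard described in the entry above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `HAVE_UNISTD_SKETCH` and `current_pid()` are hypothetical names introduced here, and the only real pieces are the compiler-defined `_MSC_VER` macro and the POSIX `getpid()` call.

```c
/* Sketch only: HAVE_UNISTD_SKETCH and current_pid() are hypothetical names. */
#ifndef _MSC_VER
#include <unistd.h>   /* Unix-only header; MSVC does not ship it */
#define HAVE_UNISTD_SKETCH 1
#else
#define HAVE_UNISTD_SKETCH 0
#endif

/* Returns the process id where unistd.h is available, -1 otherwise. */
long current_pid(void)
{
#if HAVE_UNISTD_SKETCH
   return (long)getpid();
#else
   return -1;
#endif
}
```

Keeping both the include and every use of its functions behind the same condition is what lets the file compile on either toolchain.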
Brian Paul
bf67fec235 mesa: remove unused 'i' in dimensions_error_check()
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-13 19:06:55 -06:00
Marek Olšák
976db661ff radeonsi: restore si_emit_cache_flush call at the end of IBs
Fixes: 918b798668 "radeonsi: make sure CP DMA is idle at the end of IBs"
2018-04-13 20:05:53 -04:00
Daniel Schürmann
f2c6a55061 radv: enable subgroup capabilities
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:15 +02:00
Daniel Schürmann
4b0616e533 ac: handle subgroup intrinsics
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:15 +02:00
Daniel Schürmann
d5f7ebda3e ac: add LLVM build functions for subgroup intrinsics
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:09 +02:00
Daniel Schürmann
d19f20e793 ac: make ballot and umsb capable of 64bit inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
79701b414c nir: lower 64bit subgroup shuffle intrinsics
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
fd5b0e0a64 nir/spirv: Fix warning and add missing breaks.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
54937d820d nir: use ballot_bit_size when lowering ballot_bitfield_extract
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
4d802df3aa nir: subgroups instructions for 64bit ballot sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Brian Paul
1098c18af3 glsl: #undef THIS macro to fix MSVC build
THIS is a macro in one of the MSVC header files.  It's also a token
in the GLSL lexer.  This causes a compilation failure with MSVC.
This issue seems to be newly exposed after the recent mtypes.h removal
patches.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:53:12 -06:00
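The clash described in the entry above, and the `#undef` workaround, can be illustrated with a small sketch. The `#define THIS void` line only stands in for whatever the MSVC header actually defines (an assumption), and `enum sketch_token` with its values is hypothetical rather than the GLSL lexer's real token list.

```c
/* Stand-in for a platform header that defines THIS as a macro
 * (assumption: the real definition lives in an MSVC header). */
#define THIS void

/* Drop the macro so THIS can be used as an ordinary identifier. */
#ifdef THIS
#undef THIS
#endif

/* Hypothetical token enum; a real lexer's values come from the
 * parser generator. */
enum sketch_token {
   THIS = 258,
   IDENTIFIER = 259
};

int this_token_value(void)
{
   return THIS;
}
```

Without the `#undef`, the enumerator would expand to `void = 258` and fail to compile, which is the kind of failure the commit message describes.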
Brian Paul
5dc7233f44 glsl: rename 'interface' var to 'iface' to fix MSVC build
The recent mtypes.h removal patches seem to have exposed an MSVC
issue where 'interface' is defined as a macro in an MSVC header file.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:53:08 -06:00
Brian Paul
73f1e33d34 mesa: remove snprintf macro in imports.h to fix MSVC build
snprintf is a macro in the MSVC stdio.h header and we needed to
include that header before imports.h where we also defined an
snprintf macro.  Otherwise, the MSVC build would fail.  The recent
mtypes.h removal patches seem to have exposed this issue.

This patch simply removes our snprintf macro and replaces one use
of it in teximage.c with _mesa_snprintf().  There are other calls
to snprintf() in DRI drivers, but none of them are built on Windows.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:52:57 -06:00
Lionel Landwerlin
0a6547014f anv: fix number of planes for depth & stencil
We're not counting correctly with depth & stencil images.

Additionally we need to move an assert that is meant just for color
attachments.

v2: Move an assert() (Reported by Craig)
    Change aspect mask checks (Francesco)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a62a979335 ("anv: enable multiple planes per image/imageView")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105994
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-13 11:44:53 -07:00
Marek Olšák
6ff0c6f4eb gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times
which also simplifies the build scripts.
2018-04-13 14:08:14 -04:00
Marek Olšák
918b798668 radeonsi: make sure CP DMA is idle at the end of IBs 2018-04-13 14:07:20 -04:00
Marek Olšák
b6ad7075b9 gallium/hud: add a simple HUD view that only draws text
Add the "simple," prefix to the env var. For example:
    GALLIUM_HUD=simple,fps

The X coordinates are the same, but the Y coordinates are different, because
there is only text.

'+' happens to behave the same as "\n".
',' happens to behave the same as "\n\n".
2018-04-13 14:07:20 -04:00
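How a "simple," prefix on the GALLIUM_HUD value might be detected can be sketched as below. This is an assumption-laden illustration: `hud_strip_simple_prefix()` is a hypothetical helper, not the function the HUD code uses, and it ignores the '+' and ',' separators mentioned above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when the value starts with "simple,"
 * and stores a pointer to the remaining text in *rest. When the prefix
 * is absent, *rest points at the unmodified value. */
bool hud_strip_simple_prefix(const char *value, const char **rest)
{
   static const char prefix[] = "simple,";
   size_t len = sizeof(prefix) - 1;

   if (strncmp(value, prefix, len) == 0) {
      *rest = value + len;
      return true;
   }

   *rest = value;
   return false;
}
```

With the value from the example above, "simple,fps", the helper would report the simple view and leave "fps" as the remaining graph list.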
Dylan Baker
506671594a mesa: Include unistd.h in program_lexer
Which was previously provided implicitly by mtypes.h

CC: Marek Olšák <marek.olsak@amd.com>
CC: Mark Janes <mark.a.janes@intel.com>
Fixes: 43d66c8c2d
       ("mesa: include mtypes.h less")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-13 11:03:37 -07:00
Marek Olšák
9a1363427e radeonsi: always prefetch later shaders after the draw packet
so that the draw is started as soon as possible.

v2: only prefetch the API VS and VBO descriptors

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
e4b7974ec7 radeonsi: emit shader pointers before cache flushes & waits
This code was written with the constant engine in mind.
We can simplify it now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
82799c5035 radeonsi/gfx9: don't use the workaround for gather4 + stencil
it doesn't seem to be needed.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
1372ccfe6f radeonsi: disable TC-compat HTILE on Tonga and Iceland
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
afe0bd2c55 radeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabled
just pass the flag that indicates it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
29a09e1d38 radeonsi: don't flush HTILE if there is no HTILE clear
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
5fb31a1734 radeonsi: merge 2 identical if statements in si_clear
and other cleanups

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
8a28679987 radeonsi: don't do GFX-specific texture decompression for compute
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
307bccc6df radeonsi: simplify generating the renderer string
HAVE_LLVM > 0 is a tautology.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
a3b785be4d winsys/amdgpu: allow local BOs on APUs
Local BOs ignore BO priorities, and we don't need those on APUs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Juan A. Suarez Romero
b37b35a5d2 getteximage: assume texture image is empty for non defined levels
The current code returns INVALID_OPERATION when GetTextureImage() is
called on a level that has not been explicitly defined.

That is, if we define a mipmapped Texture2D with 3 levels and call
GetTextureImage() for the 4th level, INVALID_OPERATION is returned.

Nevertheless, such a case is not listed as an error in the OpenGL 4.6
spec, section 8.11.4 ("Texture Image Queries"), where all the error cases
for this function are defined. So it seems this is a valid operation.

On the other hand, in section 8.22 ("Texture State and Proxy State") it
states:

  "Each initial texture image is null. It has zero width, height, and
   depth, internal format RGBA, or R8 for buffer textures, component
   sizes set to zero and component types set to NONE, the compressed
   flag set to FALSE, a zero compressed size, and the bound buffer
   object name is zero."

We can assume that we are reading this initialized empty image when
calling GetTextureImage() with a non defined level.

With this assumption, we will reach one of the other error cases defined
for the functions. In the end this means that we would end up returning
INVALID_VALUE to the caller.

This fixes arb_get_texture_sub_image piglit tests.

v2: just return INVALID_VALUE if there is no defined level (Iago)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:37 +02:00
Juan A. Suarez Romero
8d411eb6b3 gettextureimage: verify cube map is complete
According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"),
relative to errors for GetTexImage, GetTextureImage, and GetnTexImage:

  "An INVALID_OPERATION error is generated by GetTextureImage if the
   effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY, and
   the texture object is not cube complete or cube array complete,
   respectively."

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:27 +02:00
Juan A. Suarez Romero
42891dbaa1 gettextsubimage: verify zoffset and depth are correct
According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"),
relative to errors for GetTextureSubImage() function:

  "An INVALID_VALUE error is generated if the effective target is
   TEXTURE_1D and either yoffset is not zero, or height is not one.

   An INVALID_VALUE error is generated if the effective target is
   TEXTURE_1D, TEXTURE_1D_ARRAY, TEXTURE_2D or TEXTURE_RECTANGLE, and
   either zoffset is not zero, or depth is not one."

The commit fixes the check for height and depth.

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:27 +02:00
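The two INVALID_VALUE rules quoted in the entry above can be sketched as a single predicate. This is only an illustration of the quoted spec text: `enum sketch_target` and `subimage_dims_invalid()` are hypothetical names, and the enum values are not the real GL enum values.

```c
#include <stdbool.h>

/* Hypothetical target enum for the sketch; not the GL enum values. */
enum sketch_target {
   SKETCH_TEXTURE_1D,
   SKETCH_TEXTURE_1D_ARRAY,
   SKETCH_TEXTURE_2D,
   SKETCH_TEXTURE_RECTANGLE,
   SKETCH_TEXTURE_3D
};

/* Returns true when the arguments would trigger INVALID_VALUE under the
 * two rules quoted from section 8.11.4. */
bool subimage_dims_invalid(enum sketch_target target,
                           int yoffset, int zoffset,
                           int height, int depth)
{
   /* Rule 1: a 1D texture has a single row. */
   if (target == SKETCH_TEXTURE_1D && (yoffset != 0 || height != 1))
      return true;

   /* Rule 2: these targets have a single slice. */
   switch (target) {
   case SKETCH_TEXTURE_1D:
   case SKETCH_TEXTURE_1D_ARRAY:
   case SKETCH_TEXTURE_2D:
   case SKETCH_TEXTURE_RECTANGLE:
      return zoffset != 0 || depth != 1;
   default:
      return false;
   }
}
```

Note that a 1D array target is only constrained by the second rule, since its y dimension selects the array layer.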
Timothy Arceri
a63e69f5f0 mesa: free debug messages when destroying the debug state
Fixes: 04a8baad37 "mesa: refactor _mesa_PopDebugGroup and _mesa_free_errors_data"

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98281
2018-04-13 22:20:48 +10:00
Timothy Arceri
c500ab2735 mesa: fix x86 builds
Fixes: 43d66c8c2d "mesa: include mtypes.h less"
2018-04-13 22:13:46 +10:00
Marek Olšák
e961824ba8 Fix make check 2018-04-12 20:03:13 -04:00
Marek Olšák
6d6b1b3890 Fix scons build 2018-04-12 19:55:01 -04:00
Marek Olšák
43d66c8c2d mesa: include mtypes.h less
- remove mtypes.h from most header files
- add main/menums.h for often used definitions
- remove main/core.h

v2: fix radv build

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-12 19:31:30 -04:00
Marek Olšák
57f4268da4 mesa: include dispatch.h less
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-12 19:31:28 -04:00
Bas Nieuwenhuizen
6ff98dbf7c radv: Implement VK_EXT_vertex_attribute_divisor.
Pretty straightforward: just pass the divisors through the shader
key and then do an LLVM divide.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-12 22:57:23 +02:00
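The divide mentioned in the entry above can be illustrated on the CPU side. This is a sketch under stated assumptions, not the shader code from the commit: it assumes the fetch index is the first instance plus the zero-based instance ID divided by the divisor, that the divisor is at least 1, and `instanced_fetch_index()` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical helper. instance_id is zero-based, i.e. it does not yet
 * include first_instance. Assumes divisor >= 1. */
uint32_t instanced_fetch_index(uint32_t first_instance,
                               uint32_t instance_id,
                               uint32_t divisor)
{
   /* Integer division: every `divisor` consecutive instances share
    * the same attribute element. */
   return first_instance + instance_id / divisor;
}
```

With a divisor of 1 this reduces to ordinary per-instance stepping.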
Bas Nieuwenhuizen
7eff8d7d35 ac/surface: Allow S swizzle for displayable surfaces.
For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts
S swizzles.

At the same time addrlib prefers D swizzles if allowed, so we can
just allow S swizzles as fallback.

Fixes: b64b712558 "ac/surface/gfx9: request desired micro tile mode explicitly"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-12 21:24:55 +02:00
Eric Anholt
7bc77dbb00 broadcom/vc5: Fix a stray '`' in a comment. 2018-04-12 11:20:50 -07:00
Eric Anholt
b225cdcecc broadcom/vc5: Update the UABI for in/out syncobjs
This is the ABI I'm hoping to stabilize for merging the driver.  seqnos
are eliminated, which allows for the GPU scheduler to task-switch between
DRM fds even after submission to the kernel.  In/out sync objects are
introduced, to allow the Android fencing extension (not yet implemented,
but should be trivial), and to also allow the driver to tell the kernel to
not start a bin until a previous render is complete.
2018-04-12 11:20:50 -07:00
Eric Anholt
d9c525ed22 broadcom/vc5: Drop the finished_seqno optimization.
With the DRM scheduler changes, I'm about to remove all seqnos from the
UABI.
2018-04-12 11:20:50 -07:00
Eric Anholt
aedfd8ede4 broadcom/vc5: Drop the throttling code.
Since I'll be using the DRM scheduler, we won't run into the problem of a
runaway client starving other clients of GPU time.
2018-04-12 11:20:50 -07:00
Eric Anholt
dd9c476165 broadcom/vc5: Move flush_last_load into load_general, like for stores.
This should avoid mistakes with not flushing as we change the series of
loads.  Already, it fixes a hopefully unreachable case where we were
emitting just the TILE_COORDINATES and not the dummy store that needs to
go with it.
2018-04-12 11:20:50 -07:00
Eric Anholt
6a21a582fb broadcom/vc5: Rename read_but_not_cleared to loads_pending.
This is a more obvious name for what the variable means, and matches what
it's called for stores.
2018-04-12 11:20:50 -07:00
Eric Anholt
b946218c48 broadcom/vc5: Refactor the implicit coords/stores_pending logic.
Since I just fixed a bug due to forgetting to do these right, do it once
in the helper func.
2018-04-12 11:20:50 -07:00
Eric Anholt
ec60559f97 broadcom/vc5: Emit missing TILE_COORDINATES_IMPLICIT in separate z/s stores.
Fixes a simulator assertion failure in
KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
2018-04-12 11:20:50 -07:00
Eric Anholt
8f2999120d broadcom/vc5: Add checks that we don't try to do raw Z+S load/stores.
This was dying in the simulator on
GTF-GLES3.gtf.GL3Tests.packed_depth_stencil.packed_depth_stencil_blit.
We'll need to do basically the same thing as Z32F/S8 does in the MSAA
Z24S8 case.
2018-04-12 11:20:50 -07:00
Eric Anholt
7553cbfc9d broadcom/vc5: Fix MSAA depth/stencil size setup.
The v3dX(get_internal_type_bpp_for_output_format)() call only handles
color output formats (which overlap in enum numbers with depth output
formats), so for depth we just need to take the normal cpp times the
number of samples.
2018-04-12 11:20:50 -07:00
Leo Liu
fa328456e8 st/va: add VP9 config to enable profile2
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
dac0024b58 radeonsi: use PIPE_FORMAT_P016 format for VP9 profile2
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
f1277dabbc radeon/vcn: add VP9 profile2 support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
e8724bd1e3 vl: add VP9 profile2 support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
d9a31341ec st/va: add VP9 config to enable profile0
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
ef52ba8aa0 st/va: parse VP9 uncompressed frame header
To get some of the parameters required by UVD.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
bf0f5fe929 st/va: add slice parameter handling for VP9
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
05176fe65e st/va: add picture parameter handling for VP9
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
9ff83d13e5 st/va: add handles for VP9 buffers
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
30438fbf46 st/va: add VP9 picture to context
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
0f373a65e5 radeonsi: cap VP9 support to progressive buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
6adaf6de6d radeonsi: cap VP9 support to Raven
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
905368669d radeon/vcn: add VP9 context buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
e2ce7c0a62 radeon/vcn: get VP9 msg buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
6000bdb75b radeon/vcn: fill probability table to prob buffers
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
93c0f3cc13 radeon/vcn: add VP9 message buffer interface
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
caaecf3d3b radeon/vcn: add VP9 prob table buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
b628ea039f vl: add VP9 probability tables
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
eb22785bd8 radeon/vcn: add VP9 dpb buffer size
The current FW restricts the size to the worst case. Dynamic dpb
buffer support is on the way from the firmware side, and we will
change this accordingly.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
f73befdd9b radeon/vcn: add VP9 stream type for decoder
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
ca1646db89 vl: add VP9 picture description
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
29bc354684 vl: add VP9 profile0 and format
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Samuel Pitoiset
9eac49246c radv: fix radv_layout_dcc_compressed() when image doesn't have DCC
num_dcc_levels means that DCC is supported, but this doesn't
mean that it's enabled by the driver. Instead, we should rely
on radv_image_has_dcc().

This fixes some multisample regressions since 0babc8e5d6
("radv: fix picking the method for resolve subpass") on Vega.
This is because the resolve method changed from HW to FS, but
those failures are totally unexpected, so there might be some
differences between Polaris and Vega here.

Fixes: 44fcf58744 ("radv: Disable DCC for GENERAL layout and compute transfer dest.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:46 +02:00
Samuel Pitoiset
ab0e625a67 radv: add radv_decompress_resolve_{subpass}_src() helpers
These helpers share common code before resolving using either
a fragment or a compute shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:44 +02:00
Samuel Pitoiset
ed93d90a67 radv: add radv_init_dcc_control_reg() helper
And add some comments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:41 +02:00
Timothy Arceri
c7e3d31b0b glsl: fix compat shaders in GLSL 1.40
The compatibility and core tokens were not added until GLSL 1.50,
for GLSL 1.40 just assume all shaders built with a compat profile
are compat shaders.

Fixes rendering issues in Dawn of War II on radeonsi which has
enabled OpenGL 3.1 compat support.

Fixes: a0c8b49284 "mesa: enable OpenGL 3.1 with ARB_compatibility"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
2018-04-12 11:51:08 +10:00
Ian Romanick
f3b14ca2e1 mesa: Silence remaining unused parameter warnings in teximage.c
src/mesa/main/teximage.c: In function ‘_mesa_test_proxy_teximage’:
src/mesa/main/teximage.c:1301:51: warning: unused parameter ‘level’ [-Wunused-parameter]
                           GLuint numLevels, GLint level,
                                                   ^~~~~
src/mesa/main/teximage.c: In function ‘texsubimage_error_check’:
src/mesa/main/teximage.c:2186:30: warning: unused parameter ‘dsa’ [-Wunused-parameter]
                         bool dsa, const char *callerName)
                              ^~~
src/mesa/main/teximage.c: In function ‘copytexture_error_check’:
src/mesa/main/teximage.c:2297:32: warning: unused parameter ‘width’ [-Wunused-parameter]
                          GLint width, GLint height, GLint border )
                                ^~~~~
src/mesa/main/teximage.c:2297:45: warning: unused parameter ‘height’ [-Wunused-parameter]
                          GLint width, GLint height, GLint border )
                                             ^~~~~~
src/mesa/main/teximage.c: In function ‘check_rtt_cb’:
src/mesa/main/teximage.c:2679:21: warning: unused parameter ‘key’ [-Wunused-parameter]
 check_rtt_cb(GLuint key, void *data, void *userData)
                     ^~~
src/mesa/main/teximage.c: In function ‘override_internal_format’:
src/mesa/main/teximage.c:2756:55: warning: unused parameter ‘width’ [-Wunused-parameter]
 override_internal_format(GLenum internalFormat, GLint width, GLint height)
                                                       ^~~~~
src/mesa/main/teximage.c:2756:68: warning: unused parameter ‘height’ [-Wunused-parameter]
 override_internal_format(GLenum internalFormat, GLint width, GLint height)
                                                                    ^~~~~~
src/mesa/main/teximage.c: In function ‘texture_sub_image’:
src/mesa/main/teximage.c:3293:24: warning: unused parameter ‘dsa’ [-Wunused-parameter]
                   bool dsa)
                        ^~~
src/mesa/main/teximage.c: In function ‘can_avoid_reallocation’:
src/mesa/main/teximage.c:3788:53: warning: unused parameter ‘x’ [-Wunused-parameter]
                        mesa_format texFormat, GLint x, GLint y, GLsizei width,
                                                     ^
src/mesa/main/teximage.c:3788:62: warning: unused parameter ‘y’ [-Wunused-parameter]
                        mesa_format texFormat, GLint x, GLint y, GLsizei width,
                                                              ^
src/mesa/main/teximage.c: In function ‘valid_texstorage_ms_parameters’:
src/mesa/main/teximage.c:5987:40: warning: unused parameter ‘samples’ [-Wunused-parameter]
                                GLsizei samples, unsigned dims)
                                        ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-11 16:20:56 -07:00
Ian Romanick
fa44941072 mesa: Silence unused parameter warning in compressedteximage_only_format
Passing ctx to compressedteximage_only_format was the only use of the
ctx parameter in _mesa_format_no_online_compression, so that parameter
had to go too.

../../SOURCE/master/src/mesa/main/teximage.c: In function ‘compressedteximage_only_format’:
../../SOURCE/master/src/mesa/main/teximage.c:1355:57: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 compressedteximage_only_format(const struct gl_context *ctx, GLenum format)
                                                         ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-11 16:20:42 -07:00
Nanley Chery
377da9eb78 blorp: Silence unused function warnings
vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from vulkan/genX_blorp_exec.c:35:0:
./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^~~~~~~~~~~~~~~~~
genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from genX_blorp_exec.c:33:0:
../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^~~~~~~~~~~~~~~~~

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-11 13:04:49 -07:00
Caio Marcelo de Oliveira Filho
89542c9ce6 nir/vars_to_ssa: Simplify node matching code
The matching code doesn't make real use of the return value. The main
function return value is ignored, and while the worker function
propagates its return value, the actual callback never returns false.

v2: Style fixes. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho
fac9dd1b93 nir/vars_to_ssa: Remove an unnecessary deref_array_type check
Only fully-qualified direct derefs, collected in direct_deref_nodes,
are checked for aliasing, so it is already known up front that they
have only array derefs of type direct.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho
1c9bccdeb8 nir/vars_to_ssa: Rework register_variable_uses()
The return value was needed to make use of the old nir_foreach_block
helper, but is no longer needed with the macro version. Then go one
step further and move the foreach directly into the register variable
uses function.

v2: Move foreach to register_variable_uses(). (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Jason Ekstrand
bc2b170d68 nir: Use nir_builder in lower_io_to_temporaries
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-11 11:03:22 -07:00
Bas Nieuwenhuizen
bd95397d65 radv: Enable RB+ on Raven.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-11 18:46:55 +02:00
Tapani Pälli
9f29b1a4c8 vulkan: fix build issue on android (both anv/radv)
Fixes linking errors against:

   anv_GetPhysicalDeviceImageFormatProperties2KHR
   radv_GetPhysicalDeviceImageFormatProperties2KHR

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-11 13:55:49 +03:00
Nicolai Hähnle
41e6ffee49 radeonsi: correctly parse disassembly with labels
LLVM now emits labels as part of the disassembly string, which is very
useful but breaks the old parsing approach.

Use the semicolon to detect the boundary of instructions instead of going
by line breaks.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-11 12:44:30 +02:00
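The idea of using the semicolon as the instruction boundary can be sketched as below. This is a simplified illustration, not the radeonsi parser: it assumes every instruction carries exactly one semicolon and that label lines carry none, and `count_instructions()` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Counts instructions by semicolons instead of line breaks, so label
 * lines (which have no semicolon) are not counted.
 * Assumption: one semicolon per instruction. */
size_t count_instructions(const char *disasm)
{
   size_t count = 0;

   for (const char *p = strchr(disasm, ';'); p != NULL;
        p = strchr(p + 1, ';'))
      count++;

   return count;
}
```

A line-based count would treat each label as an instruction, which is the mismatch the commit message points at.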
Nicolai Hähnle
0630e52c9e radeonsi: pass -O halt_waves to umr for hang debugging
This will give us meaningful wave information in the case of a hang where
shaders are still running in an infinite loop.

Note that we call umr multiple times for different sections of the ddebug
hang dump, and so the wave information will not necessarily match up
between sections.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-11 12:44:24 +02:00
Jason Ekstrand
69f447553c vulkan: Drop vk_android_native_buffer.xml
All the information in vk_android_native_buffer.xml is now in vk.xml.
The only exception is the extension type attribute which we can work
around in the generators while we wait for the XML to be fixed.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-10 19:29:49 -07:00
Jason Ekstrand
ae3a856c34 nir/lower_atomics: Rework the main walker loop a bit
This replaces some "if (...) { }" with "if (...) continue;" to reduce
nesting depth and makes nir_metadata_preserve conditional on progress
for the given impl.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-10 19:28:49 -07:00
Bas Nieuwenhuizen
ed94638156 radv: Enable RB+ where possible.
According to Marek, not enabling it on Stoney has a significant
negative performance impact. (And I guess this might impact
performance on Raven as well)

The register settings are pretty much copied from radeonsi. I did
not put this in the pipeline as that would make the pipeline more
dependent on the format, which means we would have to have more
pipelines for the meta shaders.

v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet
    does already.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-11 01:19:10 +02:00
Topi Pohjolainen
5d895a1f37 nir: Check if u_vector_init() succeeds
However, it only fails when running out of memory. Now, if we
are about to check that, we should be consistent and check
the allocation of the worklist as well.

CID: 1433512
Fixes: edb18564c7 nir: Initial implementation of a nir_instr_worklist
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
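The consistency argument in the entry above (check every allocation, not just one) can be sketched with a stand-alone type. Everything here is hypothetical: `struct sketch_worklist` and its functions are not the NIR worklist or `u_vector` API, and plain `malloc()` stands in for whatever allocator the real code uses.

```c
#include <stdlib.h>

/* Hypothetical worklist type; not the NIR structure. */
struct sketch_worklist {
   int *items;
   size_t capacity;
   size_t count;
};

/* Returns NULL if either allocation fails. capacity must be > 0. */
struct sketch_worklist *sketch_worklist_create(size_t capacity)
{
   struct sketch_worklist *wl = malloc(sizeof(*wl));
   if (wl == NULL)
      return NULL;

   wl->items = malloc(capacity * sizeof(*wl->items));
   if (wl->items == NULL) {
      /* Do not leak the outer struct when the inner buffer fails. */
      free(wl);
      return NULL;
   }

   wl->capacity = capacity;
   wl->count = 0;
   return wl;
}

void sketch_worklist_destroy(struct sketch_worklist *wl)
{
   if (wl == NULL)
      return;
   free(wl->items);
   free(wl);
}
```

The caller then only has one failure condition to test, a NULL return, regardless of which allocation failed.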
Topi Pohjolainen
98d3874754 mesa: Assert base format before truncating to unsigned short
CID: 1433709
Fixes: ca721b3d8: mesa: use GLenum16 in a few more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
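Asserting before a narrowing conversion, as the entry above describes, might look like the following sketch. `narrow_enum16()` is a hypothetical helper and `uint16_t` stands in for the 16-bit enum type; neither is taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: narrows an enum value to 16 bits, asserting
 * first that no bits would be lost by the conversion. */
uint16_t narrow_enum16(unsigned value)
{
   assert(value <= UINT16_MAX);
   return (uint16_t)value;
}
```

The assert documents the expected range for static analysis tools and for readers, and it compiles away when NDEBUG is defined.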
Topi Pohjolainen
26f48fe010 intel/dev: Assert the number of slices is not zero
Fixes: c1900f5b intel: devinfo: add helper functions to fill...
CID: 1433511
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
Kenneth Graunke
8960903c90 i965: Remove brw_bo_alloc_tiled_2d from intel_detect_swizzling.
I'd like to drop this pre-isl function.  This drops one of the two uses.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-10 15:31:31 -07:00
Timothy Arceri
a05faf80c3 mesa: fix glsl version mismatch in compat profile
Drivers that only support compat 3.0 were reporting GLSL 1.40
support. This fixes issues with the menu of Dawn of War II.

Fixes: a0c8b49284 "mesa: enable OpenGL 3.1 with ARB_compatibility"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
2018-04-11 08:05:19 +10:00
Samuel Pitoiset
0babc8e5d6 radv: fix picking the method for resolve subpass
The source and destination image parameters were swapped.

No CTS changes on Polaris10, but I suspect this might
fix something.

Fixes: 2a04f5481d ("radv/meta: select resolve paths")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-10 21:55:28 +02:00
Samuel Pitoiset
9f6a28eb27 radv: add shader BOs to the list at pipeline bind time
Otherwise, the shader BOs are not added to the list on SI because
prefetching isn't supported. Calling radv_cs_add_buffer() in the
prefetch codepath was a bad idea.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105952
Fixes: 4ad7595f35 ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Turo Lamminen <turo@alternativegames.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-10 21:55:28 +02:00
Marek Olšák
e29facff31 ac/surface: don't set the display flag for obviously unsupported cases (v2)
This enables the tile swizzle for some cases of the displayable micro mode,
and it also fixes an addrlib assertion failure on Vega.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-04-10 13:06:03 -04:00
Marek Olšák
19ce5048ee radeonsi: add shader binary padding for UMR
2018-04-10 13:05:20 -04:00
Marek Olšák
b64b712558 ac/surface/gfx9: request desired micro tile mode explicitly
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-10 12:44:41 -04:00
Emil Velikov
5dd02123a0 docs/release-calendar: update to include 18.1 and 18.2
Dylan has kindly stepped up to help with 18.1.0, while I've taken the
liberty to nominate Andres for 18.2.0 ;-)

As always, people are welcome to swap/adjust where needed.

v2: Add Juan for 18.0.x (Juan)

Cc: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 16:08:54 +01:00
Emil Velikov
8eceac9de7 glsl: remove unreachable assert()
Earlier commit enforced that we'll bail out if the number of terminators
is different than 2. With that in mind, the assert() will never trigger.

Fixes: 56b867395d ("glsl: fix infinite loop caused by bug in loop
unrolling pass")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 16:04:50 +01:00
Juan A. Suarez Romero
0d0ef8ae33 spirv: autotools: add vtn_gather_types_c.py in distribution tarball
Fixes: 042ee4bea2 ("spirv: Move SPIR-V building to Makefile.spirv.am and
spirv/meson.build")

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 10:37:46 +02:00
Juan A. Suarez Romero
15ed757834 radeonsi: autotools: add si_build_pm4.h in dist tarball
Fixes: 5777488406 ("radeonsi: move r600_cs.h contents into si_pipe.h,
si_build_pm4.h")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 10:33:28 +02:00
Bas Nieuwenhuizen
4381be4648 ac/nir: Use an array instead of hashtable for SSA defs.
Saves about 2% of compile time for F1 2017, as well as reduce code
size of an optimized libvulkan_radeon.so by about 1 KiB.

This still keeps the hashtable, as we also stored blocks in there.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-10 09:53:16 +02:00
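A minimal sketch of the idea in 4381be4648, assuming SSA defs carry a dense index known up front: values are stored in a plain array indexed by that number instead of being looked up in a hash table. The names example_ctx, example_ctx_init, example_set_def and example_get_def are hypothetical and are not taken from ac/nir.

```c
#include <stdlib.h>

/* Hypothetical context; the real ac/nir context has many more members. */
struct example_ctx {
    void **ssa_defs;       /* one slot per SSA def, indexed by def index */
    unsigned num_ssa_defs;
};

/* Allocate one zeroed slot per SSA def. Returns 1 on success, 0 on failure. */
static int example_ctx_init(struct example_ctx *ctx, unsigned num_ssa_defs)
{
    ctx->num_ssa_defs = num_ssa_defs;
    ctx->ssa_defs = (void **)calloc(num_ssa_defs ? num_ssa_defs : 1,
                                    sizeof(*ctx->ssa_defs));
    return ctx->ssa_defs != NULL;
}

/* Store a value: a bounds check and an array write, no hashing. */
static void example_set_def(struct example_ctx *ctx, unsigned index, void *value)
{
    if (index < ctx->num_ssa_defs)
        ctx->ssa_defs[index] = value;
}

/* Fetch a value, or NULL if the slot is empty or out of range. */
static void *example_get_def(const struct example_ctx *ctx, unsigned index)
{
    return index < ctx->num_ssa_defs ? ctx->ssa_defs[index] : NULL;
}

static void example_ctx_finish(struct example_ctx *ctx)
{
    free(ctx->ssa_defs);
    ctx->ssa_defs = NULL;
    ctx->num_ssa_defs = 0;
}
```

Objects without a dense index (the commit mentions blocks) would still need a keyed container, which matches the note that the hashtable is kept.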
Timothy Arceri
6066f08ee9 st/mesa: finalise tcs/tes/geom NIR before storing it to the cache
We don't create variants of the NIR so here we finalise it before
caching to avoid unnecessary processing when restoring it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 15:10:16 +10:00
Timothy Arceri
bc71e20993 st/mesa: exit st_translate_fragment_program() earlier for NIR path
This avoids a bunch of scanning that is only used by the TGSI path.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 15:10:16 +10:00
Timothy Arceri
494a5c3501 radeonsi/nir: tidy up si_nir_load_sampler_desc()
This makes it easier to follow the code, and also initialises
dynamic_index which will be useful for adding bindless textures
support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
d7cbe795ed radeonsi/nir: set uses_bindless_images for images
V2: add missing intrinsics (Spotted-by: Samuel Pitoiset)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
74b3fc2ce0 nir: dont lower bindless samplers
We need to skip the var if it's not a uniform here as well as checking
the bindless flag since UBOs can contain bindless samplers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
bd4cc54c8b st/glsl_to_nir: set paramater value offset as driver location for packed uniforms
This allows us to simplify the code and will also be useful for supporting
bindless textures.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
222d862cd3 radeonsi/nir: don't add bindless samplers/images to declared bitmasks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
f33d9036b9 st/mesa: stop calling _mesa_init_shader_object_functions()
This sets the LinkShader function for the driver, but for the st we
set it properly with the following call to st_init_program_functions().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Jason Ekstrand
c3f9d5c235 anv/pipeline: Lower more constant initializers earlier
Once we've gotten rid of everything but the main entrypoint, there's no
reason why we shouldn't go ahead and lower them all.  This is what radv
does and it will make future work easier.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Jason Ekstrand
14e0a222d9 spirv: Use the LOCAL_GROUP_SIZE system value
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Jason Ekstrand
131d454c35 nir/lower_system_values: Support SYSTEM_VALUE_LOCAL_GROUP_SIZE
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Lionel Landwerlin
f3353e53db intel: aubinator: print out addresses of invalid instructions
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-10 00:58:38 +01:00
Bas Nieuwenhuizen
41fbcc7901 radv: Always reset draw user SGPRs after secondary command buffer.
As we sometimes reset them to -1, -1 does not mean that they are
not written by the secondary command buffer.

Fixes: ad11fc3571 "radv: don't emit unneeded vertex state."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 23:04:42 +02:00
Bas Nieuwenhuizen
74b0b869dd radv: Don't set instance count using predication.
The packet can sometimes be skipped, but we still think the change takes effect.

This just makes the packet always take effect.

Fixes: ad11fc3571 "radv: don't emit unneeded vertex state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105942
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 23:04:35 +02:00
Rob Clark
d66dc34316 mesa/st/nir: fix instruction removal
At one point this kinda worked (or at least didn't cause problems).  But
with deref-instructions it results in dangling deref instructions not
being properly removed.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Rob Clark
becf2d1fac mesa/st/nir: fix naked lowering pass call
Not using the macro means no nir_validate in debug builds, resulting in
problems showing up only after later passes.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Rob Clark
c4457113e9 nir: add comment about nir_src_copy()
So it is more clear about when to use nir_instr_rewrite_src()

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Nanley Chery
1d94aa1987 i965: Make the miptree clear color setter take a gl_color_union
We want to hide the internal details of how the miptree's clear color
is calculated.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Nanley Chery
3dbb49a978 i965/miptree: Move the clear color and value setter implementations
These will get more complex in later commits.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Nanley Chery
1ce7ae391e i965: Use the brw_context for the clear color and value setters
Do what all the other functions in the miptree API do.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Bas Vermeulen
c63bef15fc radeonsi: convert dispatch packet to little endian
The parameters for the compute engine are wrong when using
an E8860 on a big endian machine.
To fix this, convert the contents of struct dispatch_packet
to little endian.

This ensures that get_global_id(0) and similar functions
in the OpenCL code get the correct endian values, and
makes my simple OpenCL program work correctly.

Signed-off-by: Bas Vermeulen <bas@daedalean.ai>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-04-09 13:47:52 -04:00
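A minimal sketch of the kind of conversion c63bef15fc describes, assuming the GPU reads the packet as little endian. The struct example_packet layout and the helpers cpu_to_le16, cpu_to_le32 and fill_packet are hypothetical; they are not the real struct dispatch_packet or Mesa's own helpers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical packet layout, for illustration only. */
struct example_packet {
    uint16_t header;
    uint16_t workgroup_size_x;
    uint32_t grid_size_x;
};

/* Return v arranged so its in-memory bytes are little endian on any host. */
static uint16_t cpu_to_le16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)((v >> 8) & 0xff) };
    uint16_t out;
    memcpy(&out, b, sizeof(out));
    return out;
}

static uint32_t cpu_to_le32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v & 0xff),
        (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff),
        (uint8_t)((v >> 24) & 0xff),
    };
    uint32_t out;
    memcpy(&out, b, sizeof(out));
    return out;
}

/* Every multi-byte field is converted before the packet is handed off. */
static void fill_packet(struct example_packet *p, uint16_t wg_x, uint32_t grid_x)
{
    p->header = cpu_to_le16(0);
    p->workgroup_size_x = cpu_to_le16(wg_x);
    p->grid_size_x = cpu_to_le32(grid_x);
}
```

On a little endian host both helpers return their input unchanged, so the same code path serves both byte orders.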
Bas Vermeulen
be628e4749 radeonsi: correct si_vgt_param_key on big endian machines
Using mesa OpenCL failed on a big endian PowerPC machine because
si_vgt_param_key is using bitfields and a 32 bit int for an
index into an array.

Fix si_vgt_param_key to work correctly on both little endian
and big endian machines.

Signed-off-by: Bas Vermeulen <bas@daedalean.ai>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-09 13:42:30 -04:00
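The problem in be628e4749 comes from reading bitfields back through a 32-bit integer: the compiler places bitfields differently on big and little endian targets, so the resulting index differs. The sketch below shows one endian-independent alternative, packing the fields with explicit shifts. It is not necessarily how the commit fixes si_vgt_param_key, and struct example_key with its three members is hypothetical.

```c
#include <stdint.h>

/* Hypothetical key fields; the real si_vgt_param_key has different members. */
struct example_key {
    unsigned prim;            /* 4 bits, 0..15 */
    unsigned uses_instancing; /* 1 bit */
    unsigned multi_instances; /* 1 bit */
};

/* Build the table index with explicit shifts, so the result does not
 * depend on how the compiler lays out bitfields on the host. */
static uint32_t example_key_index(const struct example_key *k)
{
    return (k->prim & 0xfu) |
           ((k->uses_instancing & 0x1u) << 4) |
           ((k->multi_instances & 0x1u) << 5);
}
```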
Marek Olšák
f33e4482b3 radeonsi: don't set RB+ registers on GFX9 chips without RB+
CLEAR_STATE initializes them properly.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 13:40:25 -04:00
Emil Velikov
ea2536cd26 etnaviv: meson: add etnaviv_query_pm.[ch] to the sources
Otherwise building the driver will fail with unresolved symbols.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105960
Fixes: 72d2043be0 ("etnaviv: add perfmon query implementation")
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Clayton Craft <clayton.a.craft@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-09 19:09:24 +02:00
Xiong, James
f23b45dce3 i965: return the fourcc saved in __DRIimage when possible
When creating an image from a texture, the image's dri_format is
set to the first plane's format, and used to look up the
fourcc. e.g. for FOURCC_NV12 texture, the dri_format is set to
__DRI_IMAGE_FORMAT_R8, we end up with a wrong entry in function
intel_lookup_fourcc():
   { __DRI_IMAGE_FOURCC_R8, __DRI_IMAGE_COMPONENTS_R, 1,
     { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 }, } },
instead of the correct one:
   { __DRI_IMAGE_FOURCC_NV12, __DRI_IMAGE_COMPONENTS_Y_UV, 2,
     { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 },
       { 1, 1, 1, __DRI_IMAGE_FORMAT_GR88, 2 } } },
as a result, a wrong fourcc __DRI_IMAGE_FOURCC_R8 was returned.

To fix this bug, the image inherits the texture's planar_format that
has the original fourcc; Upon querying, if planar_format is set,
return the saved fourcc; Otherwise fall back to the old way.

v3: add a bug description and "cc mesa-stable" tag (Jason)
  remove redundant null pointer check (Tapani)
  squash 2 patches into one (James)
v2: fall back to intel_lookup_fourcc() when planar_format is NULL
  (Dongwon & Matt Roper)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Xiong, James <james.xiong@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-09 18:16:59 +03:00
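The query logic described in f23b45dce3 can be sketched as follows: prefer the fourcc saved with the planar format and fall back to a lookup by dri_format only when none was saved. All types, constants and function names here are hypothetical stand-ins with placeholder values; they are not the real __DRIimage fields or DRI constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration; not the real DRI constants. */
enum { EXAMPLE_FORMAT_R8 = 1 };
enum { EXAMPLE_FOURCC_R8 = 100, EXAMPLE_FOURCC_NV12 = 200 };

struct example_planar_format {
    uint32_t fourcc;
};

struct example_image {
    int dri_format;                                    /* first plane's format */
    const struct example_planar_format *planar_format; /* NULL if not inherited */
};

/* Stand-in for the old path: map a single-plane format to a fourcc. */
static int example_lookup_fourcc(int dri_format, uint32_t *fourcc)
{
    if (dri_format == EXAMPLE_FORMAT_R8) {
        *fourcc = EXAMPLE_FOURCC_R8;
        return 1;
    }
    return 0;
}

/* Return the saved fourcc when available, otherwise fall back. */
static int example_query_fourcc(const struct example_image *img, uint32_t *fourcc)
{
    if (img->planar_format) {
        *fourcc = img->planar_format->fourcc;
        return 1;
    }
    return example_lookup_fourcc(img->dri_format, fourcc);
}
```

With an NV12 texture, the first plane's format alone would resolve to the R8 fourcc; the saved planar format is what preserves NV12.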
Bastien Orivel
42c2f5b579 nir: Fix a typo in src/compiler/Makefile.nir.am
Since 31d91f019b, the makefile tries to
find the file SConstript.spirv instead of SConscript.spirv which breaks
the make dist command.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-09 08:32:45 -06:00
Samuel Pitoiset
04e609f1f8 radv: fix prefetching of vertex shader and VBOs on SI
Forgot one check... Too many mistakes for a simple change.

Fixes: f1d7c16e85 ("radv: fix prefetching compute shaders on CIK and older chips")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 16:14:12 +02:00
Samuel Pitoiset
56a4d03b0c radv: implement VK_AMD_shader_core_properties
Simple extension that only returns information for AMD hw.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
466aba9fa2 radv: add RADV_NUM_PHYSICAL_VGPRS constant
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
2f7bb93146 radv: add radv_get_num_physical_sgprs() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
b30dec738a vulkan: Update the XML and headers to 1.1.72
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Andres Gomez
a055f5108d docs: properly escape characters
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-04-09 13:47:40 +03:00
Andres Gomez
7cf3932098 mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage
Fixes: 03fd6704db ("mesa: Add support for a new override string
MESA_GLES_VERSION_OVERRIDE")

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-09 13:47:40 +03:00
Marek Olšák
806ab42c0f mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override
v2:
 - Provide a correct explanation on the envvars documentation (Ian).
 - Provide a more correct explanation on the function comments (Andres).
v3:
 - Homogenize documentation and inline comments (Emil).
 - Correct a typo (Emil).

Fixes: 2599b92eb9 ("mesa: allow forcing >=3.1 compatibility contexts
with MESA_GL_VERSION_OVERRIDE")

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-09 13:47:40 +03:00
Andres Gomez
c6067fcd07 dri_util: don't fail when not supporting ARB_compatibility with GL3.1
Currently, any driver that does not support the ARB_compatibility
extension will fail on GL3.1 context creation if the application does
not request the forward-compatibility flag.

Restore the original check which changes mesa_api to API_OPENGL_CORE,
only when:
 - GL3.1 is requested, without the forward-compatibility flag.
 - driver does not support ARB_compatibility - as deduced by
max_gl_compat_version.

Fixes: a0c8b49284 ("mesa: enable OpenGL 3.1 with ARB_compatibility")

v2:
 - Improve commit log (Emil).
 - Provide a correct explanation on the features documentation (Ian).

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-09 13:46:34 +03:00
Andres Gomez
044acd3569 dri_util: when overriding, always reset the core version
This way we won't fail when validating just because we may have a non
overridden core version that is lower than the requested one, even when
the compat version is high enough.

For example, running glcts from VK-GL-CTS with i965, this will
succeed:

$ MESA_GL_VERSION_OVERRIDE=4.6 ./glcts --deqp-case=KHR-GL46.info.vendor

While, this will fail:

$ MESA_GL_VERSION_OVERRIDE=4.6COMPAT ./glcts --deqp-case=KHR-GL46.info.vendor

Fixes: 464c56d3d5 ("dri_util: Use
_mesa_override_gl_version_contextless")

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-09 13:18:16 +03:00
Samuel Pitoiset
b0f8ad189c radv: add radv_image_is_tc_compat_htile() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:26 +02:00
Samuel Pitoiset
95d5ad80e9 radv: add radv_use_dcc_for_image() helper
And add some TODOs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:24 +02:00
Samuel Pitoiset
fab5fe4284 radv: rename radv_image_is_tc_compat_htile()
... to radv_use_tc_compat_htile_for_image(). This function
name makes more sense to me because we want to know if and
only if TC-compat HTILE should be used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:21 +02:00
Samuel Pitoiset
2692736cee radv: simplify a check in radv_initialise_color_surface()
If the image has FMASK metadata, the number of samples is > 1
because radv_image_can_enable_fmask() handles that already.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:16 +02:00
Samuel Pitoiset
ed41e776d0 radv: clean up radv_vi_dcc_enabled()
And rename to radv_dcc_enabled() to be consistent.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:14 +02:00
Samuel Pitoiset
e213f19907 radv: clean up radv_htile_enabled()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:12 +02:00
Samuel Pitoiset
0fc9113ac5 radv: add radv_image_has_{cmask,fmask,dcc,htile}() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:10 +02:00
Samuel Pitoiset
32f5174ce8 radv: add radv_get_cmask_fast_clear_value() helper
DCC for MSAA textures is currently unsupported but that will
be used later on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:08 +02:00
Samuel Pitoiset
f882c62218 radv: add radv_clear_{cmask,dcc} helpers
They will help for DCC MSAA textures and if we support mipmaps
in the future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:05 +02:00
Axel Davy
d899826733 st/nine: Do not use scratch for face register
Scratch registers are reused every instruction.
Since vFace is reused, a new temporary register
should be used.

Fixes: https://github.com/iXit/Mesa-3D/issues/311

Signed-off-by: Axel Davy <davyaxel0@gmail.com>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-04-08 22:49:43 +02:00
Christian Gmeiner
9e80273693 etnaviv: expose perfmon query groups
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:45 +02:00
Christian Gmeiner
c320b158f5 etnaviv: add query_group_info for perfmon counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:38 +02:00
Christian Gmeiner
5a3b744ed2 etnaviv: assign group_ids to perfmon queries
Prep work for AMD_performance_monitor support.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:34 +02:00
Christian Gmeiner
4020fa3e08 etnaviv: support MC performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:40 +02:00
Christian Gmeiner
3c3f936ae1 etnaviv: support TX performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:12 +02:00
Christian Gmeiner
f380ce13f0 etnaviv: support RA performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:04 +02:00
Christian Gmeiner
3af0e228e5 etnaviv: support SE performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:50 +02:00
Christian Gmeiner
9ae86c1306 etnaviv: support PA performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:46 +02:00
Christian Gmeiner
69bebe06e3 etnaviv: support SH performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:42 +02:00
Christian Gmeiner
1f603402f6 etnaviv: support PE performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:37 +02:00
Christian Gmeiner
d0bed0b494 etnaviv: support HI performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:32 +02:00
Christian Gmeiner
72d2043be0 etnaviv: add perfmon query implementation
Add needed infrastructure to use performance monitor
requests for queries.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:25 +02:00
Christian Gmeiner
7e3dba301e etnaviv: sw queries: return correct number of groups
Fixes: 3d912bd742 ("etnaviv: add query_group_info for sw counters")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:13:04 +02:00
Lucas Stach
208891650b etnaviv: advertise YUV formats as external only
We only support importing YUV as OES external resources.
This will change in the future, but for now this fixes the
advertised capabilities in eglQueryDmaBufModifiersEXT.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:11:46 +02:00
Lucas Stach
dfe4a08ccd gallium/util: implement util_format_is_yuv
This adds a helper to check if a pipe format is in YUV color space.
Drivers want to know about this, as YUV mostly needs special handling.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:10:57 +02:00
Rhys Perry
19254a977b nvc0: finish implementation of PIPE_QUERY_SO_OVERFLOW_PREDICATE
This also removes some useless code leftover from old changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Rhys Perry
14cc8c55ea nvc0: change ACQUIRE_EQUAL to ACQUIRE_GEQUAL in nvc0_hw_query_fifo_wait
If a fence is created in between nvc0_hw_end_query and
nvc0_hw_query_fifo_wait, the sequence number in nvc0->screen->fence.bo can
be larger than hq->fence->sequence before the semaphore is created,
resulting in the semaphore never being triggered.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
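The race described in 14cc8c55ea can be modelled with two predicates, assuming the fence counter only ever increases and ignoring wraparound. acquire_equal and acquire_gequal are hypothetical names for the two comparison modes, not nouveau functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Passes only when the counter holds exactly the wanted sequence number. */
static bool acquire_equal(uint32_t current, uint32_t wanted)
{
    return current == wanted;
}

/* Passes once the counter has reached or gone past the wanted number. */
static bool acquire_gequal(uint32_t current, uint32_t wanted)
{
    return current >= wanted;
}
```

If another fence bumps the counter from 5 to 6 before the wait is set up, an equality wait for 5 can never succeed, while the greater-or-equal wait is satisfied immediately.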
Rhys Perry
98d15e0550 nvc0: ensure the query's fence has been emitted in nvc0_hw_query_fifo_wait
If the fence has not been emitted, hq->fence->sequence would be zero. This
would result in the semaphore never being triggered, blocking all later
commands in the pushbuf.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
[imirkin: use nouveau_fence_emit instead]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Ilia Mirkin
90bb2d7152 st/mesa: tex offsets can't be in a const or 2d-indexed
All consts are now implicitly 2d (they set .Dimension), so trigger
asserts. Also, the texture offset can't handle any sort of 2d indexing.
While this could be tacked on, this seems unnecessary, just move it off
into a separate temp.

Fixes assertion failure in
tests/spec/arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag

Note that this was an issue even before the const-always-2d thing, since
there was no detection of when even a proper second dimension was used,
e.g. for UBO or geom/tess inputs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-07 16:45:00 -04:00
Ilia Mirkin
2a2b22e9b1 nvc0: restore image binding on RGB10A2, remove from BGR10A2
Fixes a bunch of new CTS pbo tests that use those as an output format,
which the state tracker converts into buffer image writes.

No part of the driver is ready for BGR10A2. It could probably be enabled
on Maxwell+, but seems unnecessary. This error was introduced when
flipping the displayable bit on those formats, which accidentally also
moved the image bit.

Fixes: e1a70aed10 (nv50,nvc0: mark ABGR format as displayable instead of ARGB format)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Rob Clark
684f7cd7e3 freedreno/ir3: use lower_global_vars_to_local in cmdline compiler
tgsi_to_nir emits things with arrays as global vars.. and nir->ir3 does
lower_locals_to_regs.  But nothing was lowering global to local, which
breaks compiling tgsi shaders

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-07 11:33:41 -04:00
Kenneth Graunke
a3782a612f i965: Use %x instead of %u in debug print.
I mistakenly printed out the address as 0x<decimal number> instead of
printing a proper hex number.  This was...surprising.
2018-04-06 22:57:48 -07:00
Dylan Baker
b5f92b6fd4 meson: fix warnings about comparing unlike types
In the old days (0.42.x), when mesa's meson system was written, the
recommendation for handling conditional dependencies was to define them
as empty lists. When meson would evaluate the dependencies of a target
it would recursively flatten all of the arguments, and empty lists would
be removed. There are some problems with this, among them that lists and
dependencies have different methods (namely .found()), so the
recommendation changed to use `dependency('', required : false)` for
such cases.  This has the advantage of providing a .found() method, so
there is no need to do things like `dep_foo != [] and dep_foo.found()`,
such a dependency should never exist.

I've tested this with 0.42 (the minimum we claim to support) and 0.45.
On 0.45 this removes warnings about comparing unlike types, such as:

meson.build:1337: WARNING: Trying to compare values of different types
(DependencyHolder, list) using !=.

v2: - Use dependency('', required : false) instead of
      declare_dependency(), the latter will always report that it is
      found, which is not what we want.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-06 15:29:53 -07:00
Ian Romanick
81ed629b38 intel/compiler: Explicitly cast register type in switch
brw_reg::type is "enum brw_reg_type type:4".  For whatever reason, GCC
is treating this as an int instead of an enum.  As a result, it doesn't
detect missing switch cases and it doesn't detect that flow can get out
of the switch.

This silences the warning:

src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-06 15:22:10 -07:00
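A small sketch of the pattern in 81ed629b38: when an enum lives in a bitfield, casting the switch operand back to the enum type lets the compiler reason about case coverage. The enum example_reg_type, struct example_reg and example_type_is_float are hypothetical and much smaller than the real brw_reg definitions; enum-typed bitfields are a compiler extension in C that GCC and Clang accept.

```c
#include <stdbool.h>

enum example_reg_type {
    EXAMPLE_TYPE_F,
    EXAMPLE_TYPE_D,
    EXAMPLE_TYPE_UD,
};

struct example_reg {
    enum example_reg_type type:4; /* enum stored in a 4-bit bitfield */
    unsigned nr:8;
};

static bool example_type_is_float(const struct example_reg *r)
{
    /* The cast gives the switch operand the enum type rather than the
     * bitfield's promoted integer type. */
    switch ((enum example_reg_type) r->type) {
    case EXAMPLE_TYPE_F:
        return true;
    case EXAMPLE_TYPE_D:
    case EXAMPLE_TYPE_UD:
        return false;
    }
    return false; /* only reached for values outside the enum */
}
```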
Axel Davy
39240926cd st/nine: Declare lighting consts for ff shaders
The lighting constants were not declared previously,
but were accessed with indirect addressing, which is
illegal.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105442

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-04-06 23:34:31 +02:00
Caio Marcelo de Oliveira Filho
67c728f7a9 nir: rename variables in nir_lower_io_to_temporaries for clarity
In the emit_copies() function, the use of "newv" and "temp" names made
sense when only copies from temporaries to the new variables were
being done. But now there are other calls to copy with other pairings,
and "temp" doesn't always refer to a temporary created in this
pass. Use the names "dest" and "src" instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-06 11:08:08 -07:00
Samuel Pitoiset
8f9f62c2db radv: don't pass the pipeline to radv_flush_constants()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:27 +02:00
Samuel Pitoiset
2bd50cceff radv: rename radv_cmd_buffer_update_vertex_descriptors()
... to radv_flush_vertex_descriptors().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-06 19:46:23 +02:00
Samuel Pitoiset
e829a0cc1e radv: do not try to skip draw calls when VBOs upload failed
This is unnecessary because we record an error which should
be returned by vkEndCommandBuffer(), and the app shouldn't
submit a command buffer when this happens.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:21 +02:00
Samuel Pitoiset
f1d7c16e85 radv: fix prefetching compute shaders on CIK and older chips
Because the check was moved to radv_emit_prefetch_L2().

Fixes: 4ad7595f35 ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:18 +02:00
Samuel Pitoiset
7fe586f6fb radv: only enable PERFECT_ZPASS_COUNTS for precision occlusion queries
This is unnecessary when the precision bit flag is not set, and it
might hurt performance. The Vulkan spec explains that not setting
VK_QUERY_CONTROL_PRECISE_BIT might be more efficient on some
implementations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 09:07:34 +02:00
Samuel Pitoiset
d53dff3bfc radv: enable the Polaris small primitive filter control
Enable it directly in the preamble, but do not enable it for lines
on Polaris10/11/12 because there is a hw bug.

There is possibly an issue when MSAA is off, but this doesn't
regress any CTS and AMDVLK doesn't have a workaround either.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 09:07:31 +02:00
Jason Ekstrand
c5b87c94d8 anv: Add WSI support for the I915_FORMAT_MOD_Y_TILED_CCS
v2 (Jason Ekstrand):
 - Return the correct enum values from anv_layout_to_fast_clear_type

v3 (Jason Ekstrand):
 - Always return ANV_FAST_CLEAR_NONE and leave doing the right thing for
   the patch which adds a modifier which supports fast-clears.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Daniel Stone <daniels@collabora.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-05 21:17:02 -07:00
Anuj Phogat
ff8b82666a Add more Coffee Lake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-04-05 14:50:11 -07:00
Jan Vesely
2406e8848e radeonsi: Reorder checks in si_check_render_feedback
si_get_total_colormask accesses NULL pointer on compute shaders
Fixes crashes on clover
Fixes: 0669dca9c0 ("radeonsi: skip DCC render feedback checking if color writes are disabled")
CC: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-05 17:11:18 -04:00
Kevin Rogovin
cc41603d6d intel/tools: new intel_sanitize_gpu tool
Adds a new debug tool to pad each GEM BO allocated with (weak)
pseudo-random noise values which are then checked after each
batchbuffer dispatch to the kernel. This can be quite valuable to
find difficult-to-track-down Heisenbug-style bugs.

[scott.d.phillips@intel.com: split to separate tool]

v2: (by Scott D Phillips)
    - track gem handles per fd (Kevin)
    - remove handles on GEM_CLOSE (Kevin)
    - ignore prime handles
    - meson & shell script

v3: (by Scott D Phillips)
    - don't track prime bos at all (Kevin)
    - protect the hash table with a mutex (Kevin)
    - hook fds by drm_version.name, not path (Chris Wilson)

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-05 13:52:49 -07:00
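The padding check described in cc41603d6d above can be sketched roughly as follows. This is a minimal illustration, not the tool's code: `noise_byte`, `fill_padding` and `padding_intact` are hypothetical names, and the noise generator is an arbitrary weak hash chosen only so that the expected bytes can be recomputed later without storing them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Weak, deterministic pseudo-random byte for position i (hypothetical
 * generator, chosen only so the pattern can be regenerated on demand). */
static uint8_t noise_byte(uint32_t seed, size_t i)
{
    uint32_t x = seed + (uint32_t)i * 2654435761u;
    x ^= x >> 15;
    return (uint8_t)x;
}

/* Fill the padding region that follows the real buffer contents. */
void fill_padding(uint8_t *pad, size_t pad_size, uint32_t seed)
{
    assert(pad != NULL);
    for (size_t i = 0; i < pad_size; i++)
        pad[i] = noise_byte(seed, i);
}

/* After a batch has executed, verify that nothing wrote into the padding. */
bool padding_intact(const uint8_t *pad, size_t pad_size, uint32_t seed)
{
    assert(pad != NULL);
    for (size_t i = 0; i < pad_size; i++) {
        if (pad[i] != noise_byte(seed, i))
            return false;
    }
    return true;
}
```

A caller would over-allocate each buffer, call `fill_padding` on the tail, and call `padding_intact` after every submission; a mismatch suggests an out-of-bounds write by the GPU or the driver.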
Jason Ekstrand
e85b95269e prog/nir: Simplify some load/store operations
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-05 13:20:39 -07:00
Marek Olšák
c7dd59b06d radeonsi: fix a crash if ps_shader.cso is NULL in si_get_total_colormask
2018-04-05 15:53:52 -04:00
Marek Olšák
be4250aa88 radeonsi: remove more R600 references
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c0dfc0c6df radeonsi: try to fix android
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f55d1f806e radeonsi: try to fix meson
This is not fully tested. Meson can't link LLVM even though automake can.

PATH=/usr/llvm/x86_64-linux-gnu/bin:$PATH meson build/ -Dgallium-va=false \
    -Dplatforms=x11,drm -Dgallium-drivers=radeonsi -Ddri-drivers= \
    -Dgallium-omx=disabled -Dgallium-xvmc=false -Dgles1=false \
    -Dtexture-float=true -Dvulkan-drivers=

src/gallium/auxiliary/libgallium.a(gallivm_lp_bld_misc.cpp.o):
(.data.rel.ro._ZTI26DelegatingJITMemoryManager[_ZTI26DelegatingJITMemoryManager]+0x10):
undefined reference to `typeinfo for llvm::RTDyldMemoryManager'

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
38faac43e3 radeonsi: don't build libradeon.la separately
for better parallelism

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f9323ddbb9 radeonsi: clean up GET_MAX_VIEWPORT_RANGE definition
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
6a93441295 radeonsi: remove r600_common_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5f77361d2e radeonsi: remove r600_pipe_common::screen
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
321bd6c280 radeonsi: move r600_buffer_common.c and r600_texture.c into radeonsi
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
d58080b318 radeonsi: move r600_gpu_load.c to si_gpu_load.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f7f4ba5306 radeonsi: move r600_query.c/h files to si_query.c/h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5777488406 radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
eced536ed6 radeonsi: rename query definitions R600_ -> SI_
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
72e9e98076 radeonsi: move and rename R600_ERR out of r600_pipe_common.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
076afb4f0e radeonsi: rename a few R600/r600_ -> SI_/si_
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5f1cddde78 radeonsi: move definitions out of r600_pipe_common.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
a67ee02388 radeonsi: move functions out of and remove r600_pipe_common.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
90d12f1d77 radeonsi: rename r600 -> si in some places
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
50c7aa6756 radeonsi: use si_context instead of pipe_context in parameters pt3
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e332ba61f4 radeonsi: use si_context instead of pipe_context in parameters pt2
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c424f86180 radeonsi: use si_context instead of pipe_context in parameters pt1
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2a62e5eec9 radeonsi: pass sctx to si_rebind_buffer and clean up
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
605ba1b9ae radeonsi: use r600_common_context less pt7
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0b2f2a6a18 radeonsi: use r600_common_context less pt6
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
4c5efc40f4 radeonsi: update copyrights
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
95bc30275b radeonsi: switch radeon_add_to_buffer_list parameter to si_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e5053060eb radeonsi: use r600_common_context less pt5
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
884fd97f6b radeonsi: use r600_common_context less pt4
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
a8291a23c5 radeonsi: use r600_common_context less pt3
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3069cb8b78 radeonsi: use r600_common_context less pt2
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
71d9028b7a radeonsi: use r600_common_context less pt1
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0606190059 radeonsi: don't use r600_common_context in si_emit_cache_flush
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3de323f9bb radeonsi: switch r600_atom::emit parameter to si_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2b70dd8c8a radeonsi: flatten / remove struct r600_ring
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f7de8686de radeonsi: remove r600_ring::flush callback
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
4598ad6a00 radeonsi: make radeon_add_to_buffer_list_check_mem be gfx-only
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
426ef367f3 radeonsi: add_to_buffer_list functions can return void
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c0987d8adf radeonsi: move saved_cs functions from r600_pipe_common.c to si_debug.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
37ef4765ff radeonsi: move DMA CS functions from r600_pipe_common.c to si_dma_cs.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
19f550f1d2 radeonsi: move EOP event code from r600_pipe_common.c to si_fence.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fc6a44e169 radeonsi: rename si_hw_context.c -> si_gfx_cs.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
42500d1dab radeonsi: move si_destroy_saved_cs to si_debug.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
02a61e71a2 radeonsi: rename si_begin_new_cs -> si_begin_new_gfx_cs
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fa09388704 radeonsi: rename si_need_cs_space -> si_need_gfx_cs_space
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
85e75b2da5 radeonsi: remove r600_pipe_common::blit_decompress_depth
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e04389cc2a radeonsi: remove r600_pipe_common::decompress_dcc
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
9d7f809c03 radeonsi: remove r600_pipe_common::invalidate_buffer
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
898500c440 radeonsi: remove r600_pipe_common::rebind_buffer
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fbf1bf9b8f radeonsi: remove r600_common_context::set_occlusion_query_state
and remove unused old_enable parameter.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5ed8b54ffe radeonsi: remove r600_pipe_common::save_qbo_state
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
72842d15ac radeonsi: remove unused query code
The get_size perf counter callback is also inlined and removed.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3f55fe99d6 radeonsi: use num_cs_dw_queries_suspend
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
54f28359b5 radeonsi: remove r600_pipe_common::need_gfx_cs_space
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0447e8e59e radeonsi: remove r600_pipe_common::set_atom_dirty
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5c125ab1ba radeonsi: remove r600_pipe_common::check_vm_faults
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
17e8f1608e radeonsi: call CS flush functions directly whenever possible
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0669dca9c0 radeonsi: skip DCC render feedback checking if color writes are disabled
2018-04-05 15:34:58 -04:00
Dylan Baker
6ac87c1769 meson: fix megadriver symlinking
Which should be relative instead of absolute.

Fixes: f7f1b30f81
       ("meson: extend install_megadrivers script to handle symmlinking")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105567
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-05 10:48:38 -07:00
Dylan Baker
19dbed6477 meson: Set .so version for xa like autotools does
Fixes: 0ba909f0f1
       ("meson: build gallium xa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-05 10:46:14 -07:00
Rafael Antognolli
7728720f07 anv: Make blorp update the clear color.
Instead of updating the clear color in anv before a resolve, just let
blorp handle that for us during fast clears.

v5: Update comment about HiZ clear color (Jordan).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
e8cadb673d anv: Use clear address for HiZ fast clears too.
Store the default clear address for HiZ fast clears on a global bo, and
point to it when needed.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
021e1885d0 anv: Emit the fast clear color address, instead of value.
On Gen10+, instead of copying the clear color from the state buffer to
the surface state, just use the address of the state buffer in the
surface state directly. This way we can avoid the copy from state buffer
to surface state.

v4:
 - Remove use_clear_address from anv code. (Jason)
 - Use the helper to extract clear color from attachment (Jason)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
3f96b459f4 anv: Add a helper to extract clear color from the attachment.
Extract the code from color_attachment_compute_aux_usage, so we can
later reuse it to update the clear color state buffer.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
7987d041fd i965/surface_state: Emit the clear color address instead of value.
On Gen10, when emitting the surface state, use the value stored in the
clear color entry buffer by using a clear color address in the surface
state.

v4: Use the clear color offset from the clear_color_bo, when available.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
2efe8309d3 i965/blorp: Update the fast clear value buffer.
On Gen10, whenever we do a fast clear, blorp will update the clear color
state buffer for us, as long as we set the clear color address
correctly.

However, on a HiZ clear, if the surface is already in the fast clear
state we skip the actual fast clear operation and, before gen10, only
updated the miptree. On gen10+ we need to update the clear value state
buffer too, since blorp will not be doing a fast clear and updating it
for us.

v4:
 - do not use clear_value_size in the for loop
 - Get the address of the clear color from the aux buffer or the
 clear_color_bo, depending on which one is available.
 - let core blorp update the clear color, but also update it when we
 skip a fast clear depth.

v5: Better subject (Jordan).
v6: Remove outdated comment (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
5449f942f2 i965: Add aux_buf variable to simplify code.
In a follow up patch, we make use of clear_color_bo, which is in
mt->mcs_buf or mt->hiz_buf. To avoid duplicating more code that does the
same thing on both aux buffers, just use aux_buf already.

v5: Add aux_buf to brw_wm_surface_state too.
v6: Drop aux_surf and use aux_buf->surf instead (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
8735c86ce0 i965/miptree: Add new clear color BO for winsys aux buffers
Add an extra BO to store clear color when we receive the aux buffer from
the window system. Since we have no control over the aux buffer size in
this case, we need the new BO to store only the clear color.

v5:
 - Better subject (Jordan).
 - Drop alignment from brw_bo_alloc().

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
ab633c2d61 i965/miptree: Add space to store the clear value in the aux surface.
Similarly to vulkan where we store the clear value in the aux surface,
we can do the same in GL.

v2: Remove unneeded extra function.
v3: Use clear_value_state_size instead of clear_value_size.
v4:
 - rename to clear_color_state_size
 - store clear_color_bo and clear_color_offset in the aux buf struct
v5: Unreference clear color bo (Jordan)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
14260e7c60 intel/blorp: Update clear color state buffer during fast clears.
We always want to update the fast clear color during a fast clear on
i965. On anv, we are doing that before a resolve, but by adding support
to blorp, we can do a similar thing and update it during a fast clear
instead.

The goal is to remove some code from anv that does such update, and
centralize everything in blorp, hopefully removing a lot of code
duplication. It also allows us to have a similar behavior on gen < 9 and
gen >= 10.

v5: s/we/we are/ (Jordan)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
92eb5bbc68 intel/blorp: Only copy clear color when doing a resolve.
We only need to copy the clear color from the state buffer to the
inlined surface state when doing a resolve.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
188a473b9a intel/blorp: Add support for fast clear address.
On gen10+, if surface->clear_color_addr is present, use it directly
instead of copying it to the surface state.

v4: Remove redundant #if clause for GEN <= 10 (Jason)
v5: Move flush after the reloc, and keep lower bits (Topi).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
b8f45cf967 intel/isl: Add support to emit clear value address.
gen10 can emit the clear color by setting it on a buffer somewhere, and
then adding only the address to the surface state.

This commit adds support for that in isl_surf_fill_state and, if that is
requested, skips setting the clear value itself.

v2: Add assert to make sure we are at least on gen10.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
94675edcfd intel: Use Clear Color struct size.
The size of the clear color struct (expected by the hardware) is 8
dwords (isl_dev.ss.clear_value_state_size here). But we still need to
track the size of the clear color, used when memcopying it to/from the
state buffer. For that we keep isl_dev.ss.clear_value_size.

v4:
 - Add struct to gen11 too (Jason, Jordan)
 - Add field for Converted Clear Color to gen11 (Jason)
 - Add clear_color_state_offset to differentiate from
   clear_value_offset.
 - Fix all the places where clear_value_size was used.

v5 (Jason):
 - Split genxml changes to another commit.
 - Remove unnecessary gen checks.
 - Bring back missing offset increment to init_fast_clear_color().

v6 (Jason):
 - On init_fast_clear_color, change:
   addr.offset += 4 => sdi.Address.offset += i * 4
 - Use GEN_GEN instead of GEN_VERSIONx10.

[jordan.l.justen@intel.com: isl_device_init changes]
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
f77789a3f0 intel/genxml: Add Clear Color struct to gen10+.
v5: Split genxml changes into its own commit (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
7e616ae201 intel/genxml: Use a single field for clear color address on gen10.
genxml does not support having two address fields with different names
but same position in the state struct. Both "Clear Color Address"
and "Clear Depth Address Low" mean the same thing, only for different
surface types.

To work around this genxml limitation, rename "Clear Color Address"
to "Clear Value Address" and use it for both color and depth. Do the
same for the high bits.

TODO: add support for multiple addresses at the same position in the
xml.

v2: Combine high and low order bits into a single address field.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
8e1f2e1d2d genxml: Preserve fields that share dword space with addresses.
Some instructions contain fields that are either an address or a value
of some type based on the content of other fields, such as clear color
values vs address. That works fine if these fields are in the less
significant dword, the lower 32 bits of the address, because they get
OR'ed with the address. But if they are in the higher 32 bits, they get
discarded.

On Gen10 we have fields that share space with the higher 16 bits of the
address too. This commit makes sure those fields don't get discarded.

v5: Remove spurious whitespace (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
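As a rough illustration of the problem fixed in 8e1f2e1d2d above: when a 48-bit address and other fields share the same pair of dwords, both halves need the non-address bits OR'ed in. The helper below is a hypothetical sketch, not the code genxml generates; `pack_address`, `fields_lo` and `fields_hi` are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 48-bit address into two dwords while preserving the bits of any
 * other fields that share those dwords. fields_lo and fields_hi are assumed
 * to be already shifted into position and not to overlap the address bits. */
void pack_address(uint32_t dw[2], uint64_t addr,
                  uint32_t fields_lo, uint32_t fields_hi)
{
    assert((addr >> 48) == 0);               /* address must fit in 48 bits */

    dw[0] = (uint32_t)addr | fields_lo;      /* low dword: OR in the fields */

    /* High dword: the behaviour described in the commit message corresponds
     * to writing only (uint32_t)(addr >> 32) here, which drops fields_hi. */
    dw[1] = (uint32_t)(addr >> 32) | fields_hi;
}
```

With `addr = 0x123456789000`, `fields_lo = 0x3` and `fields_hi = 0xabcd0000`, the two dwords come out as `0x56789003` and `0xabcd1234`, so the upper 16 bits survive alongside the address.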
Rafael Antognolli
f421a31637 anv/image: Do not override lower bits of dword.
The lower bits seem to have extra fields in every platform but gen8
(even though we don't use them in gen9). So just go ahead and avoid
using them for the address.

v4: Use Jason's suggestion for comment explaining the change.
v5: Fix aux_address comment in anv_private.h (Jason)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-05 07:42:45 -07:00
Samuel Pitoiset
942fdfe357 radv: implement a fast prefetch path for the vertex stage
This allows draws to start as soon as possible.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:48 +02:00
Samuel Pitoiset
4ad7595f35 radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:45 +02:00
Samuel Pitoiset
a8a696a38f radv: use a mask for VBOs and shaders prefetching
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:42 +02:00
Marek Olšák
8cd58df2f2 gallium/pp: fix MLAA shaders
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549
2018-04-04 20:01:43 -04:00
Marek Olšák
096942be2c gallium/pp: use user constant buffers
This fixes a radeonsi crash.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026
2018-04-04 20:01:43 -04:00
Marek Olšák
d9dc26c94e st/mesa: set stencil border color the same as intensity
This fixes some stencil border color tests on Vega and Raven chips.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-04 16:55:52 -04:00
Jon Turney
498d9d0f4d Fix use of alloca() without #include <c99_alloca.h>
Fix use of alloca() without #include <c99_alloca.h> in 1da345e5

vbo/vbo_context.c: In function '_vbo_draw_indirect':
vbo/vbo_context.c:284:34: error: implicit declaration of function 'alloca' [-Werror=implicit-function-declaration]
       struct _mesa_prim *space = alloca(draw_count*sizeof(struct _mesa_prim));
                                  ^~~~~~
vbo/vbo_context.c:284:34: warning: initialization makes pointer from integer without a cast [-Wint-conversion]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-04-04 14:34:07 +01:00
Samuel Pitoiset
922cd38172 radv: implement out-of-order rasterization when it's safe on VI+
Disabled by default for now, it can be enabled with
RADV_PERFTEST=outoforder.

No CTS regressions on Polaris, and all Vulkan games I tested
look good as well.

Expect small performance improvements for applications where
out-of-order rasterization can be enabled by the driver.

Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
d6709c91a6 radv: change blend_enable field to use four bits per CB
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
a8818d1af2 radv: scan which color blend attachments are enabled
With cb_target_enabled_4bit in order to have four bits per CB.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
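One way to read "four bits per CB" in a8818d1af2 above is that each enabled color buffer contributes a full nibble to a 32-bit mask. The sketch below assumes that reading and assumes eight color buffers; `cb_enabled_4bit` and `MAX_CBS` are hypothetical names, not radv's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CBS 8   /* assumed number of color buffers */

/* Build a mask with one nibble (0xf) per enabled color attachment. */
uint32_t cb_enabled_4bit(const bool enabled[], unsigned count)
{
    assert(count <= MAX_CBS);

    uint32_t mask = 0;
    for (unsigned i = 0; i < count; i++) {
        if (enabled[i])
            mask |= UINT32_C(0xf) << (4 * i);
    }
    return mask;
}
```

A nibble-per-attachment layout lines up with registers that carry four channel bits per color buffer, so masks of this shape can be combined with a per-channel write mask using a plain bitwise AND.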
Samuel Pitoiset
ac456d0d1b radv: put more fields in radv_blend_state
Some will be used for further optimizations (i.e. out-of-order rasterization).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
e4976ca33b radv: do not always disable dual quad mode when chip has RbPlus
For GFX9+ only, RadeonSI does this too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
b8c06a961c radv: don't use the SPI barrier management bug workaround
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
ab147cba77 radv: mask out high VM address bits in registers where needed
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Lionel Landwerlin
1beb80cb56 intel: compiler: silence compiler warning
../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]

Introduced by 8f83eea71e ("i965: Add negative_equals methods").

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-04 11:57:39 +01:00
Iago Toral Quiroga
41ac0b1443 compiler/spirv: set is_shadow for depth comparator sampling opcodes
From the SPIR-V spec, OpTypeImage:

"Depth is whether or not this image is a depth image. (Note that
 whether or not depth comparisons are actually done is a property of
 the sampling opcode, not of this type declaration.)"

The sampling opcodes that specify depth comparisons are
OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set
is_shadow only for these (we were using the depth property of the
image until now).

v2:
 - Do the same for OpImageDrefGather.
 - Set is_shadow to false if the sampling opcode is not one of these (Jason)
 - Reuse an existing switch statement instead of adding a new one (Jason)

Fixes crashes in:
dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2018-04-04 07:57:58 +02:00
Sergii Romantsov
98b860e311 i965: Extend the negative 32-bit deltas to 64-bits
Gen8+ uses 48-bit address relocations, so we need to sign-extend the
delta to the 64-bit return value. Without that, the higher bits are
zeroed and negative values are lost.
Haswell and older use 32-bit deltas so are unaffected by this issue.

v2:
  use an int32_t function parameter instead of an explicit type conversion.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Andriy Khulap <andriy.khulap@globallogic.com>
Tested-by: Stuart Young <cefiar@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
2018-04-03 22:48:09 -07:00
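The difference between sign-extending and zero-extending a 32-bit delta, as discussed in 98b860e311 above, can be shown with a small sketch. The function names are hypothetical and this is not the i965 relocation code; it only demonstrates why taking the delta as `int32_t` matters when the result is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Signed delta: converting int32_t to a 64-bit type sign-extends, so a
 * negative delta moves the address backwards as intended. */
uint64_t apply_delta(uint64_t target, int32_t delta)
{
    assert((target >> 48) == 0);   /* 48-bit address, per the commit message */
    return target + (uint64_t)(int64_t)delta;
}

/* Unsigned delta: the upper 32 bits are zero after widening, so what was
 * meant as a small negative offset becomes a large positive one. */
uint64_t apply_delta_zero_extended(uint64_t target, uint32_t delta)
{
    assert((target >> 48) == 0);
    return target + (uint64_t)delta;
}
```

For a target of `0x1000` and a delta of -16, the first function yields `0xff0`, while the second yields `0x1000 + 0xfffffff0`, an address almost 4 GiB too high.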
Jason Ekstrand
800df942ea nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination
Otherwise we may end up trying to coalesce in a case such as

ssa_1 = fadd r1, r2
r3.x = fneg(r2);
r3 = vec4(ssa_1, ssa_1.y, ...)

and that would cause us to move the writes to r3 from the vec to the
fadd which would re-order them with respect to the write from the fneg.
In order to solve this, we just don't coalesce if the destination of the
vec is not SSA.  We could try to get clever and still coalesce if there
are no writes to the destination of the vec between the vec and the ALU
source.  However, since registers only come from phi webs and indirects,
the chances of having a vec with a register destination that is actually
coalescable into its source is very slim.

Shader-db results on Haswell:

    total instructions in shared programs: 13657906 -> 13659101 (<.01%)
    instructions in affected programs: 149291 -> 150486 (0.80%)
    helped: 0
    HURT: 592

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
Fixes: 2458ea95c5 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"
Reported-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-04-03 22:21:23 -07:00
Kevin Strasser
5bbde9b80f anv: Fix close(fd) before import issue in vkCreateDmaBufImageINTEL
If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel
will hit a -EBADF error. Move the close(fd) call to the end of
anv_CreateDmaBufImageINTEL().

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-03 18:33:17 -07:00
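The ordering issue in 5bbde9b80f above is generic to file descriptors: any call made on an fd after close() fails with EBADF. The sketch below uses fstat() on a pipe as a stand-in for the DRM_IOCTL_PRIME_FD_TO_HANDLE import, purely so that it runs without a GPU; the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Stand-in for the import step: returns 0 on success, else the errno. */
int use_fd(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 ? 0 : errno;
}

/* Wrong order: the fd is closed before it is used. */
int import_after_close(void)
{
    int fds[2];
    int ret = pipe(fds);
    assert(ret == 0);
    (void)ret;
    close(fds[1]);
    close(fds[0]);
    return use_fd(fds[0]);          /* fails: the descriptor is gone */
}

/* Fixed order: use the fd first, close it at the end. */
int import_before_close(void)
{
    int fds[2];
    int ret = pipe(fds);
    assert(ret == 0);
    (void)ret;
    close(fds[1]);
    int result = use_fd(fds[0]);
    close(fds[0]);
    return result;
}
```

In a single-threaded program `import_after_close()` returns EBADF and `import_before_close()` returns 0. In a multi-threaded program the wrong order is worse, because the fd number may already have been reused by another thread.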
Timothy Arceri
b42633db8e glsl: always call do_lower_jumps() after loop unrolling
This fixes a bug in radeonsi where LLVM cannot handle the case where
a break exists but it's not the last instruction in the block.

LLVM would fail with:
Terminator found in the middle of a basic block!
LLVM ERROR: Broken function found, compilation aborted!

Fixes: 96fe8834f5 "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively"

Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317
2018-04-04 08:40:16 +10:00
James Legg
a58fdc61e9 vulkan/wsi/wayland: fix leaks
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: Jason Ekstrand <jason@jlekstrand.net>
2018-04-03 22:09:57 +01:00
Juan A. Suarez Romero
06076ead28 docs: update calendar, add news and link release notes to 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-03 17:38:36 +00:00
Juan A. Suarez Romero
ca71b7bab8 docs: add sha256 checksums for 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ba371c7262)
2018-04-03 17:34:16 +00:00
Juan A. Suarez Romero
d89ef8ce62 docs: add release notes for 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3bf5c10c5c)
2018-04-03 17:34:16 +00:00
Jakob Bornecrantz
88e958257c st/mesa: Also use PIPE_FORMAT_R8G8B8A8_SRGB for framebuffer_sRGB.
When running virgl on a GLES host, the only sRGB formats that support
rendering are RGBA and RGBX. That pipe format is in the sRGB default
lists that the state tracker uses when mapping mesa formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-04-03 17:48:52 +01:00
Lionel Landwerlin
78c18d99dc intel: gen-decoder: print all dword a field belongs to
Prior to printing a decoded field, print out all dwords that field
belongs to. In particular with address fields spanning multiple
dwords, we want to have all the dwords presented before the field is
decoded to make it easier to read.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
4d59127213 intel: genxml: decode variable length MI_LRI
MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one
command. In our drivers we only use one tuple at a time, but the
kernel might load more than one at a time.

Instead of making all the tuples part of a group, we leave out the
first tuple (the one we use in the generated packing structures).

This is particularly useful for looking at error stats generated by
the kernel.
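
A rough sketch of walking such a variable-length command, for
illustration only. It assumes the header's low 8 bits hold "total dwords
minus 2" (so n tuples give a length field of 2n - 1); lri_tuple,
lri_tuple_count and lri_decode are hypothetical names, not the decoder's
actual API.

```c
#include <stddef.h>
#include <stdint.h>

/* One (register offset, value) pair from the command payload. */
struct lri_tuple {
   uint32_t reg;
   uint32_t value;
};

/* Assumption: bits 7:0 of the header hold "total dwords - 2". */
static inline size_t
lri_tuple_count(uint32_t header)
{
   uint32_t total_dwords = (header & 0xff) + 2;
   return (total_dwords - 1) / 2;   /* one header dword, then pairs */
}

/* Copy up to max tuples out of the command; returns the number copied. */
static inline size_t
lri_decode(const uint32_t *cmd, struct lri_tuple *out, size_t max)
{
   size_t n = lri_tuple_count(cmd[0]);
   if (n > max)
      n = max;
   for (size_t i = 0; i < n; i++) {
      out[i].reg = cmd[1 + 2 * i];
      out[i].value = cmd[2 + 2 * i];
   }
   return n;
}
```

With this layout a header whose length field is 1 carries a single
tuple, which is the case the userspace drivers emit.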

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
2841af6238 intel: gen-decoder: don't decode fields beyond a dword length
For example, a PIPE_CONTROL with DWordLength = 2 should look like
this :

0xffffe374:  0x7a000002:  PIPE_CONTROL
0xffffe374:  0x7a000002 : Dword 0
    DWord Length: 2
0xffffe378:  0x00800000 : Dword 1
    Depth Cache Flush Enable: false
    Stall At Pixel Scoreboard: false
    State Cache Invalidation Enable: false
    Constant Cache Invalidation Enable: false
    VF Cache Invalidation Enable: false
    DC Flush Enable: false
    Pipe Control Flush Enable: false
    Notify Enable: false
    Indirect State Pointers Disable: false
    Texture Cache Invalidation Enable: false
    Instruction Cache Invalidate Enable: false
    Render Target Cache Flush Enable: false
    Depth Stall Enable: false
    Post Sync Operation: 0 (No Write)
    Generic Media State Clear: false
    TLB Invalidate: false
    Global Snapshot Count Reset: false
    Command Streamer Stall Enable: false
    Store Data Index: 0
    LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
    Destination Address Type: 0 (PPGTT)
    Flush LLC: false
0xffffe37c:  0x00000000 : Dword 2
    Address: 0x00000000
0xffffe384:  0x05000000:  MI_BATCH_BUFFER_END

Prior to this change, fields beyond the length of the command would be
decoded (notice the MI_BATCH_BUFFER_END decoded as part of the
previous PIPE_CONTROL) :

0xffffe374:  0x7a000002:  PIPE_CONTROL
0xffffe374:  0x7a000002 : Dword 0
    DWord Length: 2
0xffffe378:  0x00800000 : Dword 1
    Depth Cache Flush Enable: false
    Stall At Pixel Scoreboard: false
    State Cache Invalidation Enable: false
    Constant Cache Invalidation Enable: false
    VF Cache Invalidation Enable: false
    DC Flush Enable: false
    Pipe Control Flush Enable: false
    Notify Enable: false
    Indirect State Pointers Disable: false
    Texture Cache Invalidation Enable: false
    Instruction Cache Invalidate Enable: false
    Render Target Cache Flush Enable: false
    Depth Stall Enable: false
    Post Sync Operation: 0 (No Write)
    Generic Media State Clear: false
    TLB Invalidate: false
    Global Snapshot Count Reset: false
    Command Streamer Stall Enable: false
    Store Data Index: 0
    LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
    Destination Address Type: 0 (PPGTT)
    Flush LLC: false
0xffffe37c:  0x00000000 : Dword 2
    Address: 0x00000000
0xffffe380:  0x00000000 : Dword 3
0xffffe384:  0x05000000 : Dword 4
    Immediate Data: 83886080
0xffffe384:  0x05000000:  MI_BATCH_BUFFER_END
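
The check can be sketched roughly as below. This is an illustration, not
the decoder's code: field_desc and count_fields_in_range are hypothetical
names, the bit offsets are made up apart from the dword each field falls
in, and the 4-dword total is read off the dump above (the next command
starts 16 bytes after the PIPE_CONTROL header).

```c
#include <stddef.h>
#include <stdint.h>

struct field_desc {
   const char *name;
   uint32_t start_bit;   /* bit offset from the start of the command */
};

/* Count the fields that start inside a command of cmd_dwords dwords. */
static inline size_t
count_fields_in_range(const struct field_desc *fields, size_t n,
                      uint32_t cmd_dwords)
{
   size_t decoded = 0;
   for (size_t i = 0; i < n; i++) {
      uint32_t dword = fields[i].start_bit / 32;
      if (dword >= cmd_dwords)
         continue;   /* this dword belongs to the next command */
      decoded++;
   }
   return decoded;
}

/* Four sample fields; "Immediate Data" sits in dword 4 and is skipped. */
static inline size_t
demo_pipe_control_fields(void)
{
   static const struct field_desc fields[] = {
      { "DWord Length",   0 },
      { "Flush LLC",      32 },
      { "Address",        64 },
      { "Immediate Data", 128 },
   };
   return count_fields_in_range(fields, 4, 4);
}
```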

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
81375516b2 intel: error_decode: add an option to decode all buffers
The kernel reports workaround batch buffers, but we're not presenting
them currently. Although they might not be useful for debugging purely
userspace driver issues, when problems arise because of interactions
between kernel & userspace drivers, it's nice to be able to decode
them.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
b3aa18dfd6 intel: genxml: add preemption control instructions
Helpful to debug kernel workaround batchbuffers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Dylan Baker
6f6e711c72 mesa: ensure that variable is initialized
This variable controls whether we link using the glsl code path or the
spirv path. It's set when we validate that all shaders are glsl or
spirv, but if there are no shaders attached to the program it will
remain unset, resulting in undefined behavior. We want to go down the
glsl path in that case, so initialize to false.
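
A minimal sketch of the failure mode, with hypothetical names (shader,
program_uses_spirv) rather than the real Mesa structures: the flag is
only written inside the loop, so with zero shaders it needs a default.

```c
#include <stdbool.h>
#include <stddef.h>

struct shader {
   bool is_spirv;
};

static inline bool
program_uses_spirv(const struct shader *shaders, size_t num_shaders)
{
   /* Without the initializer, num_shaders == 0 would return an
    * indeterminate value. false selects the glsl path. */
   bool spirv = false;

   for (size_t i = 0; i < num_shaders; i++)
      spirv = shaders[i].is_spirv;

   return spirv;
}
```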

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820
Fixes: 16f6634e7f
       ("mesa/program: Link SPIR-V shaders using the SPIR-V code-path")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-04-03 08:47:59 -07:00
Marek Olšák
d3e96b1063 radeonsi/gfx9: fix bad LLVM params in monolithic LS+HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-03 11:07:28 -04:00
Samuel Pitoiset
acf60abc54 radv: enable VK_EXT_shader_viewport_index_layer
The driver already supports exporting the Layer and ViewportIndex
built-ins from vertex or tessellation shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-03 14:05:46 +02:00
Rob Clark
51888bf07d nir+drivers: add helpers to get # of src/dest components
Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-03 06:08:56 -04:00
Rob Clark
91f9450b32 freedreno/ir3: fix fallout of unused false-depth elimination
Since we were using the MARK flag both for preventing loops and for
tracking whether instructions were used, we could end up in an infinite
loop due to bd2ca2bcdd.  Instead invert the logic: mark all instructions
UNUSED up front and clear the flag as we visit them.
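
The inverted scheme, sketched on a toy graph. This is not the ir3 code;
instr, INSTR_UNUSED and mark_used are hypothetical stand-ins. Clearing
the flag on first visit doubles as the loop guard, so a cycle terminates.

```c
#include <stddef.h>

#define INSTR_UNUSED 0x1u   /* hypothetical flag name */
#define MAX_SRCS 2

struct instr {
   unsigned flags;
   struct instr *srcs[MAX_SRCS];
};

static void
mark_used(struct instr *instr)
{
   if (!instr || !(instr->flags & INSTR_UNUSED))
      return;                        /* NULL source or already visited */
   instr->flags &= ~INSTR_UNUSED;
   for (size_t i = 0; i < MAX_SRCS; i++)
      mark_used(instr->srcs[i]);
}

/* a -> b -> c -> a forms a cycle; "dead" is never reached from the root. */
static inline size_t
demo_count_unused(void)
{
   struct instr a = {0}, b = {0}, c = {0}, dead = {0};
   struct instr *all[] = { &a, &b, &c, &dead };
   size_t unused = 0;

   a.srcs[0] = &b;
   b.srcs[0] = &c;
   c.srcs[0] = &a;
   dead.srcs[0] = &a;

   for (size_t i = 0; i < 4; i++)
      all[i]->flags |= INSTR_UNUSED;   /* everything starts out UNUSED */

   mark_used(&a);                      /* a is the root */

   for (size_t i = 0; i < 4; i++)
      if (all[i]->flags & INSTR_UNUSED)
         unused++;
   return unused;
}
```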

Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-03 06:08:56 -04:00
Timothy Arceri
7e9b7ec094 gallium/pipebuffer: fix parenthesis location
Without this the return value will never get set to -1. This
was first added in 49866c8f34 and copied in 2b396eeed9.

Fixes: 2b396eeed9 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342
2018-04-03 16:05:59 +10:00
Tapani Pälli
6b21391729 Revert "mesa: add GL_HALF_FLOAT as supported type to readpixels"
This reverts commit 41cf30b8bc.

Commit caused regressions with KHR-GLES3.packed_pixels.* tests.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eric Anholt <eric@anholt.net>
2018-04-03 08:43:30 +03:00
Mike Lothian
0bdbe4583f gallivm: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Mike Lothian
5e07881305 radeonsi: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Mike Lothian
7e144ace95 ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Daniel Stone
4cbecb6168 st/dri: Initialise modifier to INVALID for DRI2
When allocating a buffer for DRI2, set the modifier to INVALID to inform
the backend that we have no supplied modifiers and it should do its own
thing. The missed initialisation forced linear, even if the
implementation had made other decisions.

This resulted in VC4 DRI2 clients failing with:
  Modifier 0x0 vs. tiling (0x700000000000001) mismatch
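
Why the missed initialisation matters, as a small sketch. The two
constants mirror the values in drm_fourcc.h; struct handle and the
helper names are hypothetical stand-ins, not the winsys_handle API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in drm_fourcc.h. */
#define MOD_LINEAR  UINT64_C(0)
#define MOD_INVALID UINT64_C(0x00ffffffffffffff)

/* Hypothetical stand-in for the handle passed to the backend. */
struct handle {
   uint64_t modifier;
};

static inline bool
backend_may_choose_tiling(const struct handle *h)
{
   return h->modifier == MOD_INVALID;
}

/* A zero-filled handle reads as "linear", not as "no modifier given". */
static inline bool
demo_zeroed_handle_forces_linear(void)
{
   struct handle h = {0};
   return h.modifier == MOD_LINEAR && !backend_may_choose_tiling(&h);
}

static inline bool
demo_initialised_handle_lets_backend_choose(void)
{
   struct handle h = {0};
   h.modifier = MOD_INVALID;
   return backend_may_choose_tiling(&h);
}
```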

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Andreas Müller <schnitzeltony@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 3f8513172f ("gallium/winsys/drm: introduce modifier field to winsys_handle")
2018-04-02 19:07:57 +01:00
Marek Olšák
2be6143032 radeonsi: implement GL_KHR_blend_equation_advanced
MSAA is supported using sample shading. Layered rendering and all texture
targets are also supported.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:25 -04:00
Marek Olšák
e04631b0f2 radeonsi: rename unpack_param -> si_unpack_param
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:23 -04:00
Marek Olšák
dc04e4bba2 radeonsi: move FMASK shader logic to shared code
We'll need it for FBFETCH in both TGSI and NIR paths.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:22 -04:00
Marek Olšák
eb77961292 radeonsi: add R600_DEBUG=nofmask to disable MSAA compression
For testing.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:20 -04:00
Marek Olšák
56342c97ee gallium/u_tests: test FBFETCH and shader-based blending with MSAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:18 -04:00
Marek Olšák
5d91c2ccea ac/gpu_info: print GB_ADDR_CONFIG
2018-04-02 13:10:37 -04:00
Marek Olšák
b1f33086ec ac/gpu_info: reorder the fields and print them nicely
2018-04-02 13:10:37 -04:00
Marek Olšák
a0a96819e1 ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memory
2018-04-02 13:10:37 -04:00
Marek Olšák
32b3932de1 ac/gpu_info: don't print irrelevant fields
2018-04-02 13:10:37 -04:00
Marek Olšák
f754217517 st/mesa: don't draw if the bound element array buffer is not allocated
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:10:36 -04:00
Iago Toral Quiroga
31881079af anv/cmd_buffer: honor pending clear views for depth/stencil attachments
v2: rebased on top of subpass rework.

v3: rebased

v4:
 - rebased
 - reset pending clear views in one go rather than one bit at a time (Caio)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-02 09:53:24 +02:00
Iago Toral Quiroga
f60c5fc17e anv/cmd_buffer: consider multiview masks for tracking pending clear aspects
When multiview is active a subpass clear may only clear a subset of the
attachment layers. Other subpasses in the same render pass may clear
as well and we want to honor those clears too. However, we need to
ensure that we only clear a layer once, on the first subpass that uses
a particular layer (view) of a given attachment.

This means that when we check if a subpass attachment needs to be cleared
we need to check if all the layers used by that subpass (as indicated by
its view_mask) have already been cleared in previous subpasses or not, in
which case, we must clear any pending layers used by the subpass, and only
those pending.
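
The bookkeeping can be sketched with a per-attachment bitmask. This is
an illustration under assumed names (attachment_state,
att_init_pending, att_take_clear_views), not the anv implementation.

```c
#include <stdint.h>

struct attachment_state {
   uint32_t pending_clear_views;   /* bit i set: view i not cleared yet */
};

/* Only set bits for layers that exist. */
static inline void
att_init_pending(struct attachment_state *att, unsigned layer_count)
{
   att->pending_clear_views =
      layer_count >= 32 ? ~0u : (1u << layer_count) - 1;
}

/* Returns the views this subpass must clear and drops them from the
 * pending set in one go. */
static inline uint32_t
att_take_clear_views(struct attachment_state *att, uint32_t view_mask)
{
   uint32_t to_clear = att->pending_clear_views & view_mask;
   att->pending_clear_views &= ~view_mask;
   return to_clear;
}

static inline uint32_t
demo_second_subpass_clear(void)
{
   struct attachment_state att;
   att_init_pending(&att, 4);               /* views 0..3 pending */
   (void)att_take_clear_views(&att, 0x3);   /* subpass 0 uses views 0,1 */
   return att_take_clear_views(&att, 0x6);  /* subpass 1 uses views 1,2 */
}
```

In the demo the second subpass only clears view 2, since view 1 was
already cleared by the first.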

v2:
  - track pending clear views in the attachment state (Jason)
  - rebased on top of fast-clear rework.

v3:
  - rebased on top of subpass rework.

v4: rebased.

v5 (Caio):
 - Rebased.
 - Initialize pending clear views to only have bits set for layers
   that exist.
 - Reset pending clear views in one go rather than one bit at a time.
 - Put "last subpass for this attachment" condition in a separate
   function to simplify the conditional that resets pending_clear_aspects.

Fixes:
dEQP-VK.multiview.readback_implicit_clear.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-02 09:53:15 +02:00
Timothy Arceri
c88e7fe29e radeonsi/nir: fix explicit component packing for geom/tess doubles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
dd3d3cc877 radeonsi/nir: gather buffers declared more accurately and use const fast path
For now we skip SI && HAVE_LLVM < 0x0600 for simplicity. We also skip
setting the more accurate masks for builtin uniforms for now as it
causes some piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
56017d8100 radeonsi: create load_const_buffer_desc_fast_path() helper
This will be shared by the TGSI and NIR backends. For simplicity
we leave the SI LLVM 5.0 and lower work around only in the TGSI
backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
7aad5e15f6 radeonsi/nir: set TGSI_PROPERTY_NEXT_SHADER
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
2ca5d9548f st/glsl_to_nir: gather next_stage in shader_info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Rob Clark
2f175bfe5d freedreno/a5xx: don't align height for PIPE_BUFFER
Buffers can be large, so we probably don't want to make them all 32x
bigger.  But they can't be rendered to (at least in GL) so we don't
need this workaround to prevent page faults on mem<->gmem.
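
Roughly, the layout decision looks like the sketch below. layout_height
is a hypothetical helper and the 32-row alignment is taken from the
"32x" figure above; the real slice setup code has more to it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Height used for layout: buffers keep their height, everything else is
 * rounded up to a multiple of 32 rows. */
static inline uint32_t
layout_height(uint32_t height, bool is_buffer)
{
   if (is_buffer)
      return height;
   return (height + 31u) & ~31u;
}
```

A buffer has height 1, so aligning it to 32 would multiply its size by
32 for no benefit.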

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-01 11:26:01 -04:00
Rob Clark
1866f76f7b freedreno/a5xx: fix page faults on last level
We could alternatively fall back to using "old style" draws for
mem<->gmem (ie. what <= a4xx do) when height is not aligned to 32,
but that is somewhat more work (and not really something that could
be applied to stable)

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-01 10:50:11 -04:00
Rob Clark
afde9294b5 freedreno/ir3: fix issue w/ glamor composite shaders
Fixes an issue that became possible when we started lowering phi webs to
regs (a7ea2b4e) (although it was not really seen until we also switched
to using the peephole select pass (ec8bc54a) instead of lowering *all*
if/else to select).

If texture coord (or anything else that uses create_collect() to collect
scalar values in a sequence of scalar registers) was consuming a value
produced on either side of an if/else (ie. a phi lowered to nir reg,
which in ir3 is an "array" of length 1) then register allocation would
happen incorrectly and we'd end up sampling from garbage coordinates.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 16:25:13 -04:00
Rob Clark
2191a18e75 freedreno/ir3: more half-precision fixes
Some instructions require src/dst to be in full or half precision
register depending on src/dst type.  So do a better job of propagating
register type.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:16:16 -04:00
Rob Clark
e04e068f75 freedreno/ir3: add helper to create immed of specified size
We'll also need to be able to create a half-precision immediate.  So
re-work create_immed().  Prep work for following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:13:11 -04:00
Rob Clark
1f45320e51 freedreno/ir3: pass ctx instead of block to create_collect()
Prep work for following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:12:33 -04:00
Rob Clark
bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Previously false-dependencies would get flagged as used, even if the
only "use" was a false dep to (for example) prevent a load from being
scheduled after a store.

In addition to being pointless instructions, in some cases they can
cause problems.  For example, ldg (and similar instructions) depend on
an immed arg getting CP'd into the instruction, but this doesn't happen
if an instruction is otherwise unused.  Which can result in undefined
results (overwriting unintended registers).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:11:46 -04:00
Rob Clark
4f78383809 freedreno/ir3: add local_group_size
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:10:56 -04:00
Rob Clark
96e7927fb2 freedreno/ir3: clear SSA flag when assigning "ARRAY" regs too
Avoids a misleading "INVALID FLAGS" warning in debug builds.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:10:16 -04:00
Rob Clark
6514b4e3fd freedreno/ir3: print array live ranges
This is also useful to see if optmsgs are enabled.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:09:42 -04:00
Wladimir J. van der Laan
e8e3aa68d6 freedreno: a2xx: Implement DP2 instruction
Use DOT2ADDv instruction with 0.0f constant add.
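
In arithmetic terms (a model of the math only, not of the a2xx
instruction encoding; vec2, dot2add and dp2 are hypothetical names):

```c
struct vec2 {
   float x, y;
};

/* Two-component dot product plus a scalar addend. */
static inline float
dot2add(struct vec2 a, struct vec2 b, float c)
{
   return a.x * b.x + a.y * b.y + c;
}

/* DP2 is the same operation with the addend fixed to 0.0f. */
static inline float
dp2(struct vec2 a, struct vec2 b)
{
   return dot2add(a, b, 0.0f);
}
```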

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
79d6b194f2 freedreno: a2xx: implement SEQ/SNE instructions
Extend translate_sge_slt to emit these, in analogous fashion
but using CNDEv.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
837fabaaa3 freedreno: a2xx: Compressed textures support
Add support for:

- PIPE_FORMAT_ETC1_RGB8
- PIPE_FORMAT_DXT1_RGB
- PIPE_FORMAT_DXT1_RGBA
- PIPE_FORMAT_DXT3_RGBA
- PIPE_FORMAT_DXT5_RGBA

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
92d529e7e4 freedreno: a2xx: Support TEXTURE_RECT
Denormalized texture coordinates are required for text rendering in
GALLIUM_HUD.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
6be017fdc4 freedreno: a2xx: Prevent crash in emit_texture if view is not set
Textures will sometimes be updated if texture view state was
un-set, without this change that causes an assertion crash or
segfault.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
fb41372761 freedreno: a2xx: Fix fd2_tex_swiz
Compose swizzles using util_format_compose_swizzles instead
of the custom code (which somehow had a bug).

This makes the GL_ALPHA internal format work.
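
For reference, composing two swizzles works roughly as sketched here.
This is a standalone illustration with its own enum and a hypothetical
compose_swizzles helper, not the util_format code; the (0, 0, 0, X)
swizzle for an alpha-only format is an assumption made for the demo.

```c
enum swz {
   SWZ_X, SWZ_Y, SWZ_Z, SWZ_W,   /* select a source channel */
   SWZ_0, SWZ_1                  /* constants */
};

/* Apply "outer" on top of "inner": channel selects are looked up in
 * inner, constants pass through unchanged. */
static inline void
compose_swizzles(const unsigned char inner[4],
                 const unsigned char outer[4],
                 unsigned char dst[4])
{
   for (unsigned i = 0; i < 4; i++)
      dst[i] = outer[i] <= SWZ_W ? inner[outer[i]] : outer[i];
}

/* Alpha-only format viewed through an identity view swizzle. */
static inline unsigned
demo_alpha_component(unsigned i)
{
   const unsigned char format_swz[4] = { SWZ_0, SWZ_0, SWZ_0, SWZ_X };
   const unsigned char view_swz[4] = { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W };
   unsigned char dst[4];

   compose_swizzles(format_swz, view_swz, dst);
   return dst[i];
}
```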

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
faed84a615 freedreno: a2xx: Change use of BLEND_ to BLEND2_
Change use of BLEND_ to BLEND2_,

    BLEND_* a3xx_rb_blend_opcode
    BLEND2_* is a2xx_rb_blend_opcode

This makes no effective difference as the used enumerant has the same
value (0), but the other enumerants do not match 1-to-1 so this will
avoid future problems.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
cb6dd7070f freedreno: a2xx: Update rnndb header for formats enumeration
The format enumeration comes from the yamato
register headers that are part of the amd-gpu kernel driver.
(see freedreno envytools commit b8fb7978e7ae106d0d11d0b238ab2ba2d4dd9d43)

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Mathias Fröhlich
1da345e569 vbo: Use alloca for _vbo_draw_indirect.
Avoid using malloc in the draw path of mesa.
Since the draw_count is a user api input, fall back to malloc if
the amount of consumed stack space may get too high.
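
The pattern, sketched outside of Mesa. MAX_STACK_BYTES, struct prim and
sum_prim_counts are hypothetical, the threshold is arbitrary, and
alloca.h is assumed to be available (it is not standard C).

```c
#include <alloca.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_STACK_BYTES 4096u   /* hypothetical threshold */

struct prim {
   uint32_t start, count;
};

/* Build a temporary prim array for draw_count draws and return the sum
 * of their counts, or -1 on allocation failure. */
static inline int64_t
sum_prim_counts(size_t draw_count)
{
   if (draw_count > SIZE_MAX / sizeof(struct prim))
      return -1;

   size_t bytes = draw_count * sizeof(struct prim);
   int on_heap = bytes > MAX_STACK_BYTES;
   struct prim *prims;

   if (on_heap) {
      /* draw_count comes from the application, so bound the stack use. */
      prims = malloc(bytes);
      if (!prims)
         return -1;
   } else {
      prims = alloca(bytes);
   }

   int64_t total = 0;
   for (size_t i = 0; i < draw_count; i++) {
      prims[i].start = 0;
      prims[i].count = (uint32_t)(i + 1);
      total += prims[i].count;
   }

   if (on_heap)
      free(prims);
   return total;
}
```

The alloca result must be used in the function that made the call,
which is why both branches live in one function here.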

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:15 +02:00
Mathias Fröhlich
3f1cd957d3 vbo: Remove unused includes to vbo_private.h
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
6e9f00e3fc vbo: Move vbo_split into the tnl module.
Move the files, adapt to the naming scheme in tnl, update callers
and build system.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
245f9a3977 vbo: Readd the arrays argument to the legacy draw methods.
The legacy draw paths from back before 2012 contained a gl_vertex_array
array for the inputs to be used for draw. So all draw methods from legacy
drivers and everything that goes through tnl are originally written
for this calling convention. The same goes for tools like t_rebase or
vbo_split*, that even partly still have the original calling convention
with a currently unused such pointer.
Back in 2012 patch 50f7e75

mesa: move gl_client_array*[] from vbo_draw_func into gl_context

introduced Array._DrawArrays, which was IMO aiming in a similar
direction to Array._DrawVAO introduced recently.
Now several tools like t_rebase and vbo_split*, which are mostly used by
tnl based drivers, would need to be converted to use the internal
Array._DrawVAO instead of Array._DrawArrays. The same goes for the driver
backends that use any of these tools.
Alternatively we can reintroduce the gl_vertex_array array in its call
argument list and put these tools finally into the tnl directory.
So this change reintroduces this gl_vertex_array array for the legacy
draw paths that are still required for the tools t_rebase and vbo_split*.
A followup will move vbo_split also into tnl.

Note that none of the affected drivers use the DriverFlags.NewArray
driver bit. So it should be safe to remove this also for the legacy
draw path.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
461698af26 vbo: Remove the now unused vbo draw path.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
784fdef4e7 tnl: Push down the gl_vertex_array inputs into tnl drivers.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
7f8db5ca47 vbo: Remove vbo_indirect_draw_func.
Remove the vbo_indirect_draw_func vbo callback and make the default
implementation use the drivers main draw callback function directly.
This will be needed by the next changes, when drivers without their own
DrawIndirect implementation get moved to the main driver's
Draw method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
4db9d83a2d i965: Push down the gl_vertex_array inputs into i965.
Let the i965 backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.
Note that brw_draw_indirect_prims calls brw_draw_prims internally
and so gets its update to Array._DrawArray that way.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:12 +02:00
Mathias Fröhlich
fca1550550 gallium: Push down the gl_vertex_array inputs into gallium.
Let the gallium backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:12 +02:00
Jason Ekstrand
9978f55cd1 nir/validator: Validate that all used variables exist
We were validating this for locals but nothing else.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Jason Ekstrand
2b977989f3 intel/vec4: Set channel_sizes for MOV_INDIRECT sources
Otherwise, any indirect push constant access results in an assertion
failure when we start digging through the channel_sizes array.  This
fixes dEQP-VK.pipeline.push_constant.graphics_pipeline.dynamic_index_vert
on Haswell.  It should be a harmless no-op for GL since indirect push
constants aren't used there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e69e5c7006 "i965/vec4: load dvec3/4 uniforms first in the..."
2018-03-30 17:20:27 -07:00
Jason Ekstrand
6018f5b079 nir/lower_indirect_derefs: Support interp_var_at intrinsics
This fixes the fs-interpolateAtCentroid-block-array piglit test on i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-03-30 17:20:27 -07:00
Jason Ekstrand
0517d65f96 nir/vars_to_ssa: Remove copies from the correct set
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-03-30 17:20:27 -07:00
Jason Ekstrand
a1452a94fc nir: Return a cursor from nir_instr_remove
Because nir_instr_remove is an inline wrapper around nir_instr_remove_v,
the compiler should be able to tell that the return value is unused and
not emit the extra code in most cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Jason Ekstrand
956f17395b nir: Add src/dest num_components helpers
We already have these for bit_size

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Brian Paul
bebf758c49 docs: document WGL_SWAP_INTERVAL env var
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-30 14:45:05 -06:00
Brian Paul
c8906b8459 st/wgl: check if WGL_SWAP_INTERVAL is defined in wglSwapIntervalEXT()
This allows the WGL_SWAP_INTERVAL env var to override any application
calls to wglSwapIntervalEXT().  Useful for debugging, or to set the
interval to zero to effectively disable the swap interval.
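
The override logic amounts to something like the following sketch.
resolve_swap_interval and effective_swap_interval are hypothetical
helpers, not the st/wgl functions, and the parsing is deliberately
simple.

```c
#include <stdlib.h>

/* The env value, when set and non-empty, wins over the app's request. */
static inline int
resolve_swap_interval(const char *env_value, int requested)
{
   if (env_value && *env_value)
      return atoi(env_value);
   return requested;
}

static inline int
effective_swap_interval(int requested)
{
   return resolve_swap_interval(getenv("WGL_SWAP_INTERVAL"), requested);
}
```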

Note: we also rename the previous instance of SVGA_SWAP_INTERVAL to
WGL_SWAP_INTERVAL since this is a WGL feature and not related to the
svga driver.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-30 14:44:50 -06:00
Brian Paul
1bf201ddce glapi: define GL_API to be KEYWORD1 in glapi_dispatch.c (v2)
This fixes a Windows build warning where the prototypes for the ES
function in the header file don't match the prototypes in this file
because the GL_API and GLAPI macros are defined differently.

v2: defined GL_API to KEYWORD1 instead of GLAPI, per Mathias.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-30 14:33:33 -06:00
Brian Paul
26bc983c83 spirv: s/uint/unsigned/ to fix MSVC build
Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
f3164c2ed9 nir/spirv: s/uint32_t/SpvOp/ in various functions
The MSVC compiler warns when the function parameter types don't
exactly match with respect to enum vs. uint32_t.  Use SpvOp everywhere.

Alternately, uint32_t could be used everywhere.  There doesn't seem
to be an advantage to one over the other.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
cb619a3c9a nir/spirv: fix MSVC syntax error in vtn_handle_texture()
Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
c58c9f712d nir/spirv: move NORETURN annotation on _vtn_fail() prototype
This needs to be before the function, not after, to compile with MSVC.
This works with gcc too.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
84be45fc20 nir/spirv: fix MSVC warning in vtn_align_u32()
Fixes warning that "negation of an unsigned value results in an
unsigned value".
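
One way to write a power-of-two alignment helper without negating an
unsigned value, as a sketch. align_u32 is a hypothetical name, and this
assumes the warning came from a negation-based mask or check; it is not
necessarily the exact form used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of a, where a is a non-zero power of two. */
static inline uint32_t
align_u32(uint32_t v, uint32_t a)
{
   /* Power-of-two check and mask built from (a - 1) rather than -a. */
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}
```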

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Neil Roberts
31d91f019b spirv: Fix building with SCons
The SCons build broke with commit ba975140d3 because a SPIR-V
function is called from Mesa main. This adds a convenience library for
SPIR-V and adds it to everything that was including nir. It also adds
both nir and spirv to drivers/x11/SConscript.

Also add nir/spirv modules to osmesa and libgl-gdi targets. (Brian Paul)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105817
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2018-03-30 14:33:03 -06:00
Brian Paul
cdc34e2cea mesa: fix MSVC bitshift overflow warnings
In the BITFIELD_MASK() macro, if b==32 the expression evaluates to
~0u, but the compiler still sees the expression (1 << 32) in the
unused part and issues a warning about integer bitshift overflow.

Fix that by using (b) % 32 to ensure the max shift is 31 bits.
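
A sketch of the resulting macro shape, following the description above;
the exact definitions in the tree may differ in detail.

```c
#include <stdint.h>

#define BITFIELD_BIT(b)  (1u << (b))

/* Mask of the low b bits, for 0 <= b <= 32. The "% 32" keeps the shift
 * count in the unselected branch at 31 or below, so the compiler has
 * nothing to warn about when b is the constant 32. */
#define BITFIELD_MASK(b) \
   ((b) == 32 ? ~0u : BITFIELD_BIT((b) % 32) - 1)
```

For b == 32 the conditional still picks ~0u; the modulo only changes the
branch that is never evaluated in that case.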

This issue has been present for a while, but shows up much more
often because of the recent VBO changes.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-03-30 11:04:32 -06:00
Brian Paul
fa18a427e9 st/mesa: add missing GLSL_TYPE_[U]INT8 cases in st_glsl_type_dword_size()
Silences a compiler warning about unhandled enum switch cases.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-30 11:04:32 -06:00
Jakob Bornecrantz
e16b92ad7e vbo: MaxVertexAttribStride is not always set
This assert is hit on hardware which does not expose GL 4.4 or GLES 3.1.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-03-30 17:23:08 +01:00
Daniel Stone
696762eef5 x11: Only report supported DRI3/Present versions
The version passed to QueryVersion requests is the version that the
client supports. We were just passing in whatever version of XCB was
present on the system, which may not be a version that Mesa actually
explicitly supports, e.g. it might bring unwanted semantics.

Set specific protocol versions which we support, and only pass those.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-30 16:53:51 +01:00
Samuel Pitoiset
2a329f4ada radv: set SAMPLE_RATE to the number of samples of the current fb
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-30 17:32:15 +02:00
Brian Paul
fc1d1dbe81 nir: s/uint/unsigned/ to fix MSVC/MinGW build
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-30 08:37:59 -06:00
Eduardo Lima Mitev
e7fc18097e i965: Don't call process_glsl_ir() for SPIR-V shaders
v2: Use 'spirv_data' from gl_linked_shader instead, to check if shader
   is SPIR-V. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
e7d97aa75d i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders
This is the main fork of the shader compilation code-path, where a NIR
shader is obtained by calling spirv_to_nir() or glsl_to_nir(),
depending on its nature.

v2: Use 'spirv_data' member from gl_linked_shader to know which method
   to call. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
abb6d0797c mesa/glspirv: Add a _mesa_spirv_to_nir() function
This is basically a wrapper around spirv_to_nir() that includes
arguments setup and post-conversion validation.

v2: * Rebase update (SpirVCapabilities not a pointer anymore,
    spirv_to_nir_options added, and others).
    * Code-style improvements and remove debug hunk. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
16f6634e7f mesa/program: Link SPIR-V shaders using the SPIR-V code-path
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
9c36e9f862 mesa/glspirv: Add _mesa_spirv_link_shaders() function
This is the equivalent to link_shaders() from
src/compiler/glsl/linker.cpp, but for SPIR-V programs. It just
creates the program and its gl_linked_shader objects, giving drivers
the opportunity to implement any linking of SPIR-V shaders they choose,
at a later stage.

v2: Bail out if we see more than one shader for the same stage, and
    add a corresponding comment. (Timothy Arceri)

v3:
  * Also add a linker error log to the condition above, with a
    reference to the specification issue. (Timothy Arceri)
  * Squash with the patch adding the function boilerplate (Timothy
    Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
22b6b3d0a7 mesa: Add a reference to gl_shader_spirv_data to gl_linked_shader
This is a reference to the spirv_data object stored in gl_shader, which
stores shader SPIR-V data that is needed during linking too.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Nicolai Hähnle
ba975140d3 mesa: Implement glSpecializeShaderARB
v2:
  * Use gl_spirv_validation instead of spirv_to_nir.  This method just
    validates the shader. The conversion to NIR will happen later,
    during linking. (Alejandro Piñeiro)
  * Use gl_shader_spirv_data struct to store the SPIR-V data.
    (Eduardo Lima)
  * Use the 'spirv_data' member to tell if the gl_shader is a SPIR-V
    shader, instead of a dedicated flag. (Timothy Arceri)

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
9063bf7ad8 nir/spirv: add gl_spirv_validation method
ARB_gl_spirv adds the ability to use SPIR-V binaries, and a new
method, glSpecializeShader. Here we add a new function to do the
validation for this function:

From OpenGL 4.6 spec, section 7.2.1 "Shader Specialization", error
table:

   "INVALID_VALUE is generated if <pEntryPoint> does not name a valid
    entry point for <shader>.

    INVALID_VALUE is generated if any element of <pConstantIndex>
    refers to a specialization constant that does not exist in the
    shader module contained in <shader>."
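
The two error conditions quoted above can be sketched in C as below.
This is only an illustration and not the gl_spirv_validation code:
struct module_info, its fields and validate_specialization() are
hypothetical names, and the sketch assumes the entry point names and
specialization constant ids have already been collected from the
module.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* GL error codes, with the values used by the OpenGL headers. */
#define GL_NO_ERROR      0
#define GL_INVALID_VALUE 0x0501

/* Hypothetical summary of what a validation pass found in a module. */
struct module_info {
   const char *const *entry_points; /* entry point names in the module */
   size_t num_entry_points;
   const unsigned *spec_ids;        /* specialization constant ids */
   size_t num_spec_ids;
};

static bool
has_entry_point(const struct module_info *m, const char *name)
{
   for (size_t i = 0; i < m->num_entry_points; i++) {
      if (strcmp(m->entry_points[i], name) == 0)
         return true;
   }
   return false;
}

static bool
has_spec_id(const struct module_info *m, unsigned id)
{
   for (size_t i = 0; i < m->num_spec_ids; i++) {
      if (m->spec_ids[i] == id)
         return true;
   }
   return false;
}

/* Returns GL_INVALID_VALUE for either condition of the error table. */
static unsigned
validate_specialization(const struct module_info *m,
                        const char *entry_point,
                        const unsigned *constant_index,
                        size_t num_constants)
{
   if (!has_entry_point(m, entry_point))
      return GL_INVALID_VALUE;

   for (size_t i = 0; i < num_constants; i++) {
      if (!has_spec_id(m, constant_index[i]))
         return GL_INVALID_VALUE;
   }
   return GL_NO_ERROR;
}
```

Nothing is converted to NIR here; the sketch only answers whether the
arguments of the specialization call refer to things that exist in the
module.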

v2: rebase update (spirv_to_nir options added, changes on the warning
    logging, and others)

v3: include passing options on common initialization, doesn't call
    setjmp on common_initialization

v4: (after Jason comments):
  * Rename common_initialization to vtn_builder_create
  * Move the validation method and its helpers to their own source file.
  * Create own handle_constant_decoration_cb instead of reusing the existing one

v5: put vtn_build_create refactoring to their own patch (Jason)

v6: update after vtn_builder_create method renamed, add explanatory
    comment, tweak existing comment and commit message (Timothy)
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
bebe3d626e spirv: add vtn_create_builder
Refactored from spirv_to_nir, in order to be reused later.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

v2: renamed method (from vtn_builder_create), add explanatory comment
    (Timothy)
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
3761e675e2 i965: initialize SPIR-V capabilities
Needed for ARB_gl_spirv. Those are not the same as the Intel Vulkan
driver's. From the ARB_spirv_extensions spec:

   "3. If a new GL extension is added that includes SPIR-V support via
   a new SPIR-V extension does it's SPIR-V extension also get
   enumerated by the SPIR_V_EXTENSIONS_ARB query?.

   RESOLVED. Yes. It's good to include it for consistency. Any SPIR-V
   functionality supported beyond the SPIR-V version that is required
   for the GL API version should be enumerated."

So in addition to the core SPIR-V support, specific GL extensions may
enable specific SPIR-V extensions (and therefore capabilities). That
means OpenGL and Vulkan may not support the same capabilities, even for
the same driver. For this reason it is better to keep them separate.

As an example: at the time of writing this patch, the Intel Vulkan
driver supports multiview, but no OpenGL multiview extension is
supported.

Note: we initialize SPIR-V capabilities at brwCreateContext instead of
the usual brw_initialize_context_constants because we want to do that
only if the extension is enabled.

v2:
   * Rebase update (SpirVCapabilities not a pointer anymore)
   * Fill spirv capabilities for OpenGL >= 3.3 (Ian Romanick)

v3:
   * Drop multiview support, as i965 doesn't support any multiview GL
     extension (Jason)
   * Fill spirv capabilities only if the extension is enabled (Jason)

v4: Capabilities are supported only on gen7+. Added comment and assert
    (Jason)
2018-03-30 09:14:56 +02:00
Nicolai Hähnle
ca5cc78206 mesa: add gl_constants::SpirVCapabilities
For drivers to declare which SPIR-V features they support.

v2: Don't use a pointer (Ian Romanick)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Ian Romanick
19e0dd1ad3 i965: Don't request GLSL IR lowering of gl_VertexID
Let the lowering in NIR handle it instead.

This hurts one shader that occurs twice in shader-db (SynMark GSCloth)
on IVB and HSW.  No other shaders or platforms were affected.

total cycles in shared programs: 253438422 -> 253438426 (0.00%)
cycles in affected programs: 412 -> 416 (0.97%)
helped: 0
HURT: 2

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
2018-03-29 14:16:07 -07:00
Ian Romanick
2765633116 i965: Silence unused parameter warning
src/mesa/drivers/dri/i965/brw_draw_upload.c: In function ‘double_types’:
src/mesa/drivers/dri/i965/brw_draw_upload.c:225:34: warning: unused parameter ‘brw’ [-Wunused-parameter]
 double_types(struct brw_context *brw,
                                  ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:16:04 -07:00
Ian Romanick
042ee4bea2 spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.build
Future changes will add generated files used only from
src/compiler/glsl.  These can't be built from Makefile.nir.am, and we
can't move all the rules from Makefile.nir.am to Makefile.spirv.am (and
it would be silly anyway).

v2: Do it for meson too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (the meson bits)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (the automake bits)
2018-03-29 14:16:01 -07:00
Ian Romanick
2c9621ee5c compiler: All leaf Makefile.am should use +=
This slightly simplifies later changes that add more Makefile.*.am
files.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:41 -07:00
Ian Romanick
4925347ec5 util: Include bitscan.h directly
Previously bitset.h would include u_math.h to get bitscan.h.  u_math.h
lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h
live in src/util.  Having the one file directly include another file
that lives in the same directory makes much more sense.

As a side-effect, several files need to directly include standard header
files that were previously indirectly included.

v2: Fix build break in src/amd/common/ac_nir_to_llvm.c.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:30 -07:00
Ian Romanick
ef7a4c9015 util: Optimize util_is_power_of_two_nonzero
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:29 -07:00
Ian Romanick
cd18aa1e50 util: Use util_is_power_of_two_nonzero in u_vector
Previously size=0, element_size=0 would have been allowed.  That
combination can only lead to despair.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:09:28 -07:00
Ian Romanick
22fbb5c594 util: Add and use util_is_power_of_two_nonzero
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:28 -07:00
Ian Romanick
d76c204d05 util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero
The new name makes the zero-input behavior more obvious.  The next
patch adds a new function with different zero-input behavior.
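
As a rough illustration of the difference in zero-input behavior, a
minimal sketch of the two predicates follows. The sketch_ prefixed
names are hypothetical and this is not claimed to be the code in
bitscan.h; it is only the classic bit trick with and without the
explicit zero check.

```c
#include <stdbool.h>

/* Zero has no bits set, so (0 & (0 - 1)) == 0 and zero is accepted. */
static bool
sketch_is_power_of_two_or_zero(unsigned v)
{
   return (v & (v - 1)) == 0;
}

/* Same test, but zero is rejected explicitly. */
static bool
sketch_is_power_of_two_nonzero(unsigned v)
{
   return v != 0 && (v & (v - 1)) == 0;
}
```

Callers that divide by, or take the log2 of, the tested value would
want the nonzero variant.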

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:09:23 -07:00
Dylan Baker
a3a16d4aa7 meson: use dep_libdrm version for pkg-config
This corrects pkg-config to use the libdrm version (as computed by the
previous patch) instead of using a hardcoded value that may or may not
(probably not) be right.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:20:52 -07:00
Dylan Baker
c445b1d56f meson: Use the same version for all libdrm checks
Currently each driver specifies its own version, and core libdrm
specifies a version. In the most common case this is fine, since there
will be exactly one libdrm installed on a system, but if there is more
than one it's possible that mesa will be linked against different
versions of libdrm. There is also the possibility that the current
approach makes the pkg-config files we generate incorrect, since there
could be #defines that use newer features if they're available.

This patch corrects all of that. All of the versions are still set by
driver (along with a default core version). Then all of the drivers that
are enabled have their versions compared and the highest version is
selected, then all libdrm checks are made with that version.
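
The selection logic described above, sketched in C for illustration
only. The real change lives in meson.build; struct drm_requirement,
version_cmp() and pick_libdrm_version() are hypothetical names, and the
sketch assumes plain dotted numeric version strings.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-driver requirement: name, minimum version, enabled. */
struct drm_requirement {
   const char *name;
   const char *min_version;
   int enabled;
};

/* Compare dotted numeric versions; returns <0, 0 or >0 like strcmp. */
static int
version_cmp(const char *a, const char *b)
{
   while (*a || *b) {
      char *end_a, *end_b;
      long na = strtol(a, &end_a, 10);
      long nb = strtol(b, &end_b, 10);

      if (na != nb)
         return na < nb ? -1 : 1;
      if (end_a == a && end_b == b)
         break; /* no digits left to consume on either side */

      a = (*end_a == '.') ? end_a + 1 : end_a;
      b = (*end_b == '.') ? end_b + 1 : end_b;
   }
   return 0;
}

/* Start from the core version and keep the highest enabled requirement. */
static const char *
pick_libdrm_version(const char *core_version,
                    const struct drm_requirement *reqs, size_t count)
{
   const char *best = core_version;

   for (size_t i = 0; i < count; i++) {
      if (reqs[i].enabled && version_cmp(reqs[i].min_version, best) > 0)
         best = reqs[i].min_version;
   }
   return best;
}
```

Every libdrm check would then be made against the single returned
version, so disabled drivers never raise the requirement.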

v2: - Reorder the list to have the name first and whether the dependency
      is needed second (Eric)

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:20:52 -07:00
Dylan Baker
acadf06f56 meson: group libdrm dependencies
The reason libdrm is after libdrm_* will be made clear in later patches.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:18:47 -07:00
Brian Paul
e520ca562a gl.h: remove stale comment, trailing whitespace
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-29 08:46:55 -06:00
Brian Paul
4ff6a7b0de glapi: add glBlendBarrier(), glPrimitiveBoundingBox() prototypes
in glapi_dispatch.c, as we have for many other GLES functions.
Fixes a cross-compile issue (missing prototype) when GLES support
is disabled.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-03-29 08:45:10 -06:00
Brian Paul
5cd5878a1f st/mesa: silence unhandled switch case warning
And improve the unreachable() error message.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-29 08:45:10 -06:00
Henri Verbeet
0b73c86b80 mesa: Inherit texture view multi-sample information from the original texture images.
Found running "The Witness" in Wine. Without this patch, texture views created
on multi-sample textures would have a GL_TEXTURE_SAMPLES of 0. All things
considered such views actually work surprisingly well, but when combined with
(plain) multi-sample textures in a framebuffer object, the resulting FBO is
incomplete because the sample counts don't match.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-29 14:38:25 +04:30
Samuel Pitoiset
e45fe0ed66 radv: fix scanning output_usage_mask with structs
To fix a regression in:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.struct

And the following regressions (Polaris only):
dEQP-VK.glsl.indexing.varying_array.*

Fixes: f3275ca01c ("ac/nir: only enable used channels when exporting parameters")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-29 10:22:10 +02:00
Karol Herbst
6179a87c1e nvc0/ir: fix emitting NOTs with predicates
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-03-29 03:06:36 +02:00
Aaron Watry
1dae92f150 broadcom/vc4: Fix out-of-tree build with automake.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-28 17:48:41 -07:00
Eric Anholt
81f82ecc56 broadcom/vc5: Start using nir_opt_move_load_ubo().
In the absence of a general NIR or VIR-level scheduler, this at least
avoids spilling in
GTF-GLES3.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_storage_layouts
2018-03-28 17:48:41 -07:00
Eric Anholt
1fe4c748f7 broadcom/vc5: Fix setup of integer surface clear values.
I'm disappointed that the compiler didn't warn me about use of
uninitialized uc in these paths.  Just use the incoming clear color
instead of the packing temporary if we're doing our own packing.

Fixes GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_*
2018-03-28 17:48:41 -07:00
Eric Anholt
123ee37627 broadcom/vc5: Stop trying to swizzle around RGBA4 clear color.
We always want A in the A slot in the tile buffer, and any other swapping
should happen elsewhere.

Fixes RGBA4-using cases in fbo-clear-formats and
GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_fixed.
2018-03-28 17:48:41 -07:00
Eric Anholt
2f4c4e10c2 broadcom/vc5: Work around scissor w/h==0 bug same as rasterizer discard.
The 7268 HW apparently lets some rendering through in this case.  Fixes
GTF-GLES2.gtf.GL2FixedTests.scissor.scissor
2018-03-28 17:48:41 -07:00
Eric Anholt
0349c79bdc st: Don't try to finalize the texture in st_render_texture().
We can't necessarily finalize the texture at this point if we're rendering
to a texture image whose format is different from the baselevel's format.
This was introduced as a fix for fbo-incomplete-texture-03 in
de414f4915, but the later fix for vmware on
that testcase in 95d5c48f68 made it
unnecessary.

Fixes assertion failures in util_resource_copy_region() in
KHR-GLES3.copy_tex_image_conversions.forbidden.* when trying to finalize
an R8 texture image to the RG8 texture object's pt.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-28 17:48:41 -07:00
Marek Olšák
e159d46fc7 drirc: whitelist glthread for Medieval II: TW, Carnivores: DHR, Far Cry 2
2018-03-28 20:00:48 -04:00
Daniel Schürmann
b91cd5dba4 radv: enable VK_AMD_shader_trinary_minmax extension
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:39 +02:00
Daniel Schürmann
d00fb7ce54 ac: add support for trinary_minmax instructions
v2: Add missing break (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:35 +02:00
Dave Airlie
fe5d5d19b0 spirv: add support for SPV_AMD_shader_trinary_minmax
Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:29 +02:00
Dave Airlie
3e830a1af2 nir: add support for min/max/median of 3 srcs
These are needed for SPV_AMD_shader_trinary_minmax,
the AMD HW supports these.
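
For reference, a minimal sketch of what min/max/median of three sources
compute, using plain integer helpers. The helper names are chosen for
this sketch only and the code is not the NIR opcode definition.

```c
static int imin2(int a, int b) { return a < b ? a : b; }
static int imax2(int a, int b) { return a > b ? a : b; }

/* Smallest of three values. */
static int
imin3(int a, int b, int c)
{
   return imin2(imin2(a, b), c);
}

/* Largest of three values. */
static int
imax3(int a, int b, int c)
{
   return imax2(imax2(a, b), c);
}

/* Median of three: the larger of min(a, b) and min(max(a, b), c). */
static int
imed3(int a, int b, int c)
{
   return imax2(imin2(a, b), imin2(imax2(a, b), c));
}
```

Without dedicated instructions each of these costs several two-source
min/max operations, which is why exposing them as single opcodes helps
hardware that supports them.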

Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:28:58 +02:00
Marek Olšák
025105453a radeonsi: simplify DCC format categories
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Marek Olšák
3fea237c85 radeonsi: don't use the SPI barrier management bug workaround
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Marek Olšák
3045c5f274 radeonsi: use maximum OFFCHIP_BUFFERING on Vega12
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Bas Nieuwenhuizen
4503ff760c ac/nir: Add workaround for GFX9 buffer views.
On GFX9 whether the buffer size is interpreted as elements or bytes
depends on whether IDXEN is enabled in the instruction. If the index
is a constant zero, LLVM optimizes IDXEN to 0.

Now the size in elements is interpreted in bytes which of course
results in out of bounds accesses.

The correct fix is most likely to disable the LLVM optimization,
but we need something to work with LLVM <= 6.0.

radeonsi does the max between stride and element count on the CPU
but that results in the size intrinsics returning the wrong size
for the buffer. This would cause CTS errors for radv.

v2: Also include the store changes.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 00:03:03 +02:00
Marek Olšák
4f96747530 ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738

Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 17:23:41 -04:00
Samuel Pitoiset
1c4fdcf444 radv: enable VK_EXT_sampler_filter_minmax
Only enable for CIK+ because it's buggy on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 22:55:48 +02:00
Samuel Pitoiset
413d77e7f9 radv: add support for VK_EXT_sampler_filter_minmax
The driver only supports the required formats for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 22:55:48 +02:00
Samuel Pitoiset
99b52aa1da radv: rename VEGA10 device name
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 20:15:17 +02:00
Samuel Pitoiset
4d2c46dda3 radv: add support for Vega12
Based on RadeonSI. Untested.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 20:15:14 +02:00
Matt Turner
3e6326deb9 build: Fix up nir_intrinsics.Plo
nir_intrinsics.c existed as a static file until commit 76dfed8ae2 began
generating it as part of the build process. autotools is incapable of
coping, and so a build-tree from before this commit would then fail with
it:

[4]: *** No rule to make target '../../../mesa/src/compiler/nir/nir_intrinsics.c', needed by 'nir/nir_intrinsics.lo'.  Stop.

Add a few lines to configure.ac to update the broken build files.

Fixes: 76dfed8ae2 ("nir: mako all the intrinsics")
2018-03-28 11:09:23 -07:00
Dylan Baker
2cfc68d984 autotools: Include intel/dev/meson.build in tarball
Fixes: 272bef0601
       ("intel: Split gen_device_info out into libintel_dev")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-28 10:19:05 -07:00
Dylan Baker
bc2fdb9759 autotools: include meson_get_version
Otherwise meson won't read the VERSION file and won't set a version.
That means that pkg-config files will have version unset as well.

Fixes: 3e9533d9b8
       ("meson: Add script to use VERSION file for getting version")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-28 10:13:23 -07:00
Eric Engestrom
d77844a529 docs: fix 18.0 release note version
Fixes: 839fb3a696 "docs: Update 18.0.0 release notes"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-28 16:52:56 +01:00
Marek Olšák
20eb44ad65 radeonsi: add support for Vega12
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-28 11:37:43 -04:00
Marek Olšák
5425d32fcf amd/addrlib: update to the latest version for Vega12
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-28 11:37:43 -04:00
Eric Engestrom
431a1d12cc gbm: remove never-implemented function
I assume this was implemented in a previous version of that commit, but
was removed in the version that actually landed.

Fixes: 8430af5ebe "Add support for swrast to the DRM EGL platform"
Cc: Giovanni Campagna <gcampagna@src.gnome.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-28 16:25:52 +01:00
Stefan Schake
77ade10c86 android: Use new nir intrinsics python scripts
Fixes: 76dfed8ae2 ("nir: mako all the intrinsics")
Signed-off-by: Stefan Schake <stschake@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-28 14:48:47 +03:00
Eric Anholt
a691fa4a1b broadcom/vc5: Fix padding of NPOT miplevels >= 2.
The power-of-two padded size that gets minified is based on level 1's
dimensions, not level 0's, which starts to differ at a width of 9.

Fixes all failures on texelFetch fs sampler2D 1x1x1-64x64x1
2018-03-27 21:16:23 -07:00
Timothy Arceri
92fa89a08d ac/radeonsi: pass bindless bool to load_sampler_desc()
We also fix the base_index for bindless by using the driver
location.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:16 +11:00
Timothy Arceri
5411b98d52 st/glsl_to_nir: set driver location for bindless images and samplers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:15 +11:00
Timothy Arceri
f94b6b79be radeonsi/nir: set uses_bindless_samplers for samplers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:15 +11:00
Timothy Arceri
5c810a2c05 nir: add bindless to nir data
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 12:56:15 +11:00
Kenneth Graunke
fb18d0dbe4 i965: Drop unnecessary bo->align field.
bo->align is always 0; there's no need to waste 8 bytes storing it.
Thanks to C99 initializers zeroing fields, we can completely drop the
only read of the field altogether.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
037d738a23 i965: Drop unused alignment parameter from brw_bo_alloc().
brw_bo_alloc no longer uses this parameter, so there's no point.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
07ec3a2e0f i965: Drop alignment parameter from bo_alloc_internal().
Buffers are always page aligned on 965+ hardware; I believe this extra
parameter is a vestige from the Gen2-3 era.

All callers pass 0, and in fact we assert that the alignment is 0 unless
BO_ALLOC_BUSY is set (for some reason).  We can just drop the parameter
and set the value to 0 explicitly.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
b9a54b18f6 i965: Drop BO_ALLOC_BUSY in intel_miptree_create_for_bo().
intel_miptree_create_for_bo does not actually allocate a BO, so
specifying allocation flags accomplishes nothing and is confusing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
2c01215c1b i965: Drop PIPE_CONTROL_NO_WRITE from various calls.
This is just zero - passing nothing already gives us a post-sync
operation of "nothing".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-27 18:41:44 -07:00
Jason Ekstrand
5f21a7afe0 nir/intrinsics: Don't report negative dest_components
I have no idea why but having dest_components == -1 was causing a memory
leak somewhere.  Without this, you can't get through a full shader-db
run without running out of memory.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-27 18:18:26 -07:00
Jason Ekstrand
7e38f49a8f intel/fs: Don't emit a dest copy for image ops with has_dest == false
This was causing us to walk dest_components times over a thing with no
destination.  This happened to work because all of the image intrinsics
without a destination also happened to have dest_components == 0.  We
shouldn't be reading dest_components if has_dest == false.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-27 18:18:21 -07:00
Ilia Mirkin
776e6af879 nvc0/ir: fix INTERP_* with indirect inputs
There were two problems, both of which are fixed now:
 - The indirect address was not being shifted by 4
 - The indirect address was being placed as an argument in the offset case

This fixes some of the new interpolateAt* piglits which now test for
these situations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-03-27 20:41:11 -04:00
Timothy Arceri
629ee690ad nir: fix crash in loop unroll corner case
When an if nested inside another if is optimised away we can
end up with a loop terminator and following block that looks like
this:

        if ssa_596 {
                block block_5:
                /* preds: block_4 */
                vec1 32 ssa_601 = load_const (0xffffffff /* -nan */)
                break
                /* succs: block_8 */
        } else {
                block block_6:
                /* preds: block_4 */
                /* succs: block_7 */
        }
        block block_7:
        /* preds: block_6 */
        vec1 32 ssa_602 = phi block_6: ssa_552
        vec1 32 ssa_603 = phi block_6: ssa_553
        vec1 32 ssa_604 = iadd ssa_551, ssa_66

The problem is the phis. Loop unrolling expects the last block in
the loop to be empty once we splice the instructions in the last
block into the continue branch. The problem is we can't move phis
so here we lower the phis to regs when preparing the loop for
unrolling. As it could be possible to have multiple additional
blocks/ifs following the terminator we just convert all phis at
the top level of the loop body for simplicity.

We also add some comments to loop_prepare_for_unroll() while we
are here.

Fixes: 51daccb289 "nir: add a loop unrolling pass"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
2018-03-28 09:59:38 +11:00
Timothy Arceri
48f6014903 st/glsl_to_nir: correctly handle arrays packed across multiple vars
Fixes piglit test:
tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:38 +11:00
Timothy Arceri
b260efbd5e radeonsi/nir: fix input processing for packed varyings
The location was only being incremented the first time we processed a
location. This meant we would incorrectly skip some elements of
an array if the first element was packed and processed previously
but other elements were not.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:38 +11:00
Timothy Arceri
51f175028d ac/nir_to_llvm: fix component packing for double outputs
We need to wait until after the writemask is widened before we
adjust it for component packing.

Together with the previous patch this fixes a number of
arb_enhanced_layouts component layout piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Timothy Arceri
fc51fdbcde st/glsl_to_nir: fix driver location for dual-slot packed doubles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Timothy Arceri
47eee04556 radeonsi/nir: fix scanning of multi-slot output varyings
This fixes tcs/tes varying arrays where we don't lower indirects and
therefore don't split arrays. Here we also fix usagemask for dual
slot doubles.

Fixes a number of arb_tessellation_shader piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Eric Anholt
9f1b4f6204 broadcom/vc5: Fix RG16I/UI texture sampling.
How many times did I look at this table without noticing the missing 'G'
in the texture column?

Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.
2018-03-27 15:49:58 -07:00
Rob Clark
16581904b0 nir: fix generated nir_intrinsics.c for MSVC
Apparently it is not happy about things like: .foo = {}

So skip over initializers for empty lists.
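
A small sketch of the portable form, assuming a hypothetical table
layout (struct intrinsic_info, MAX_INDICES and the index_map values are
made up for illustration and do not mirror the generated file). Empty
braces are only a GNU extension before C23, while omitting the member
zero-initializes it just the same.

```c
#define MAX_INDICES 3

/* Hypothetical stand-in for one generated table entry. */
struct intrinsic_info {
   const char *name;
   unsigned num_indices;
   unsigned index_map[MAX_INDICES];
};

/* Not portable to MSVC:
 *    { .name = "barrier", .num_indices = 0, .index_map = {} }
 * The first entry below simply leaves index_map out instead.
 */
static const struct intrinsic_info infos[] = {
   { .name = "barrier", .num_indices = 0 },
   { .name = "load_input", .num_indices = 2, .index_map = { 1, 2 } },
};
```

The generator only has to emit the designator when the list has at
least one element.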

Fixes: 76dfed8ae2
Reported-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-27 15:01:11 -04:00
Emil Velikov
eda2f58d15 docs: update calendar 18.0.0 is out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-27 19:11:45 +01:00
Emil Velikov
02f89b62fe docs: add news item and link release notes for 18.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-27 19:08:48 +01:00
Emil Velikov
62eb721ed8 docs: add sha256 checksums for 18.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit fb64913d19)
2018-03-27 19:06:27 +01:00
Emil Velikov
839fb3a696 docs: Update 18.0.0 release notes
Note: the file was originally 17.4.0, yet git struggles to detect the
move :-\

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit dceb1ce807)
2018-03-27 19:06:19 +01:00
Rob Clark
76dfed8ae2 nir: mako all the intrinsics
I threatened to do this a long time ago.. I probably *should* have done
it a long time ago when there were many fewer intrinsics.  But the
system of macro/#include magic for dealing with intrinsics is a bit
annoying, and python has the nice property of optional fxn params,
making it possible to define new intrinsics while ignoring parameters
that are not applicable (and naming optional params).  And not having to
specify various array lengths explicitly is nice too.

I think the end result makes it easier to add new intrinsics.

v2: couple small fixes found with a test program to compare the old and
    new tables
v3: misc comments, don't rely on capture=true for meson.build, get rid
    of system_values table to avoid return value of intrinsic() and
    *mostly* remove side-effects, add autotools build support
v4: scons build

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 08:36:37 -04:00
Rob Clark
cc3a88e81d nir: fix per_vertex_output intrinsic
This is supposed to have both BASE and COMPONENT but num_indices was
inadvertently set to 1.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 08:20:40 -04:00
Rob Clark
1e0a06000b glsl_types: fix build break with intel/msvc compiler
The VECN() macro was taking advantage of a GCC specific feature that is
not available on lesser compilers, mostly for the purposes of avoiding a
macro that encoded a return statement.

But as suggested by Ian, we could just have the macro produce the entire
method body and avoid the need for this.  So let's do that instead.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740
Fixes: f407edf340
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Ian Romanick <idr@freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-27 08:17:11 -04:00
Lin Johnson
41cf30b8bc mesa: add GL_HALF_FLOAT as supported type to readpixels
EXT_color_buffer_float spec states:

  "An INVALID_OPERATION error is generated ... if the color buffer is
   a floating-point format and type is not FLOAT, HALF FLOAT, or
   UNSIGNED_INT_10F_11F_11F_REV."

This means that GL_HALF_FLOAT type should be supported when color
buffer has floating-point format.
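
The rule quoted above amounts to a check like the following sketch.
check_float_read_type() is a hypothetical helper and not the function
touched by this patch; the enum values are the ones from the GL
headers, and the rules for non-float color buffers are deliberately not
modeled.

```c
#include <stdbool.h>

/* Enum values as defined by the OpenGL / OpenGL ES headers. */
#define GL_NO_ERROR                     0
#define GL_INVALID_OPERATION            0x0502
#define GL_UNSIGNED_BYTE                0x1401
#define GL_FLOAT                        0x1406
#define GL_HALF_FLOAT                   0x140B
#define GL_UNSIGNED_INT_10F_11F_11F_REV 0x8C3B

/* Accept only the three types the spec text allows for float buffers. */
static unsigned
check_float_read_type(bool color_buffer_is_float, unsigned type)
{
   if (!color_buffer_is_float)
      return GL_NO_ERROR; /* other rules apply; not modeled here */

   switch (type) {
   case GL_FLOAT:
   case GL_HALF_FLOAT:
   case GL_UNSIGNED_INT_10F_11F_11F_REV:
      return GL_NO_ERROR;
   default:
      return GL_INVALID_OPERATION;
   }
}
```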

Fixes Android CTS test android.view.cts.PixelCopyTest.

v2: remove comments of EXT_color_buffer_half_float as
    EXT_color_buffer_float can use type GL_HALF_FLOAT

Signed-off-by: Lin Johnson <johnson.lin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-27 09:04:52 +03:00
Eric Anholt
0024b77e87 broadcom/vc5: Fix swizzling of RGB10_A2UI render targets.
This is the actual hardware layout, and we were only swizzling R/B back
around in texturing.  Fixes part of
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in
simulation.
2018-03-26 17:46:23 -07:00
Eric Anholt
c2b13627d9 broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes.
Just like TLB without a config uniform, we don't have a register index.
2018-03-26 17:46:23 -07:00
Eric Anholt
494da6c2dd broadcom/vc5: Implement workaround for GFXH-1431.
This should fix some blending errors, but doesn't impact any testcases in
the CTS.
2018-03-26 17:46:19 -07:00
Eric Anholt
1bf466270d broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well.
Once we've disabled EZ for some draws, we need to not use EZ on future
draws.  Implementing that made implementing the GT/GE direction trivial.

Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.
2018-03-26 17:46:19 -07:00
Eric Anholt
262208eb3c broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled.
On 3.x, we just don't flag the primitive as needing TF, but those
primitive bits are now allocated to the new primitive types.  Now we need
to actually update the enable flag at draw time.
2018-03-26 17:46:19 -07:00
Eric Anholt
ef2cf9cc3c broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job.
The next job from this client will turn it back on unless TF gets
disabled, but we don't want the state to leak from this client to another
(which causes GPU hangs).
2018-03-26 17:46:19 -07:00
Eric Anholt
1fa820cef8 broadcom/vc5: Move the BCL epilogue code to a per-version compile.
I need to do some new packets for transform feedback on 4.1.
2018-03-26 17:46:19 -07:00
Eric Anholt
3387864130 broadcom/vc5: Fix transform feedback in the presence of point size.
I had this note to myself, and it turns out that a lot of CTS tests use
XFB with points to get data out without using a fragment shader.  Keep
track of two sets of precomputed TF specs (point size in VPM prologue or
not), and switch between them when we enable/disable point size.
2018-03-26 17:46:19 -07:00
Eric Anholt
09ac5ade8f broadcom/vc5: Split transform feedback specs update from buffers.
The specs update will be changing based on additional state flags in the
next commit, and this unindents the buffer update code.
2018-03-26 17:46:18 -07:00
Eric Anholt
9e62aec9cd broadcom/vc5: Limit each transform feedback data spec to 16 dwords.
The length-1 field only has 4 bits, so we need to generate separate specs
when there's too much TF output per buffer.

Fixes
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type
and transform_feedback_max_interleaved.
2018-03-26 17:33:37 -07:00
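The 16-dword limit in the commit above follows from the 4-bit "length - 1" field, which can only encode lengths 1 through 16. A minimal sketch of the splitting, assuming a hypothetical helper `split_tf_specs` and a plain byte array in place of the real packet structures (neither is taken from the vc5 driver):

```c
#include <assert.h>
#include <stdint.h>

/* A 4-bit "length - 1" field holds 0..15, i.e. 1..16 dwords per spec. */
#define MAX_DWORDS_PER_SPEC 16

/* Hypothetical helper, not driver code: split total_dwords of output into
 * chunks that each fit one spec, storing the encoded (length - 1) value of
 * each chunk in fields[]. Returns the number of specs written, or -1 if
 * fields[] has fewer than the required number of slots. */
static int split_tf_specs(uint32_t total_dwords, uint8_t *fields, int max_specs)
{
    int n = 0;

    while (total_dwords > 0) {
        uint32_t len = total_dwords < MAX_DWORDS_PER_SPEC
                           ? total_dwords
                           : MAX_DWORDS_PER_SPEC;

        if (n == max_specs)
            return -1;

        fields[n] = (uint8_t)(len - 1);
        assert(fields[n] <= 0xF); /* must fit in 4 bits */
        n++;
        total_dwords -= len;
    }
    return n;
}
```

Under these assumptions, 40 dwords of output for one buffer become three specs of 16, 16 and 8 dwords.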
Eric Anholt
0356db022d gallium/u_vbuf: Protect against overflow with large instance divisors.
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1
as a divisor, so we would overflow to count=0 and upload no data,
triggering the assert below.  We want to upload 1 element in this case,
fixing the test on VC5.

v2: Use some more obvious logic, and explain why we don't use the normal
    round_up().

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 17:33:37 -07:00
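The overflow described in the commit above can be illustrated with a short sketch. It assumes 32-bit unsigned arithmetic; the function names `instances_naive` and `instances_safe` are hypothetical and this is not the u_vbuf code itself:

```c
#include <assert.h>
#include <stdint.h>

/* The usual round-up division, (n + d - 1) / d. With d == UINT32_MAX the
 * addition wraps around, so the result can collapse to 0. */
static uint32_t instances_naive(uint32_t num_instances, uint32_t divisor)
{
    assert(divisor != 0);
    return (num_instances + divisor - 1) / divisor;
}

/* Overflow-safe variant: divide first, then add one if there was a
 * remainder. count * divisor never exceeds num_instances, so it cannot
 * wrap. */
static uint32_t instances_safe(uint32_t num_instances, uint32_t divisor)
{
    assert(divisor != 0);
    uint32_t count = num_instances / divisor;
    if (count * divisor != num_instances)
        count++;
    return count;
}
```

With a divisor of -1 (all bits set) and 4 instances, the naive form yields 0 elements while the safe form yields the 1 element the commit message asks for.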
Eric Anholt
d491ad1d36 st: Allow accelerated CopyTexImage from RGBA to RGB.
There's nothing to worry about here -- the A channel just gets dropped by
the blit.  This avoids a segfault in the fallback path when copying from a
RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an
RGBA16_SINT texture (the fallback path tries to get/fetch to float
buffers, but the float pack/unpack functions are NULL for SINT/UINT).

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5.

v2: Extract the logic to a helper function and explain what's going on
    better.
v3: const-qualify args

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 17:33:37 -07:00
Marek Olšák
7d2079908d winsys/amdgpu: always allow GTT placements on APUs
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-03-26 19:23:30 -04:00
Marek Olšák
769603564e radeonsi: don't reallocate on DMABUF export if local BOs are disabled
2018-03-26 19:22:12 -04:00
Timothy Arceri
56b867395d glsl: fix infinite loop caused by bug in loop unrolling pass
Just checking for 2 jumps is not enough to be sure we can do a
complex loop unroll. We need to make sure we have also found
2 loop terminators.

Without this we were attempting to unroll a loop where the second
jump was nested inside multiple ifs which loop analysis is unable
to detect as a terminator. We ended up splicing out the first
terminator but failed to actually unroll the loop; this resulted
in the creation of a possible infinite loop.

Fixes: 646621c66d "glsl: make loop unrolling more like the nir unrolling path"

Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
2018-03-27 09:15:02 +11:00
Vinson Lee
dc94a0506f gallium: Do not add -Wframe-address option for gcc <= 4.4.
This patch fixes these build errors with GCC 4.4.

  Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions

Fixes: 370e356eba ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-26 11:23:51 -07:00
Alyssa Rosenzweig
029f1a2d61 gallium: Correct minor typo in header comments
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-26 10:15:04 -07:00
Rafael Antognolli
27581d18bc intel/aubinator_error_decode: Decode more registers.
Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
70d7c70e8d intel/genxml: Add SAMPLER_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
227edf05f3 intel/genxml: Add ROW_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
4c0ae36143 intel/genxml: Add SC_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Ian Romanick
91225cb33f i965/vec4: Fix null destination register in 3-source instructions
A recent commit (see below) triggered some cases where conditional
modifier propagation and dead code elimination would cause a MAD
instruction like the following to be generated:

    mad.l.f0  null, ...

Matt pointed out that fs_visitor::fixup_3src_null_dest() fixes cases
like this in the scalar backend.  This commit basically ports that code
to the vec4 backend.

NOTE: I have sent a couple tests to the piglit list that reproduce this
bug *without* the commit mentioned below.  This commit fixes those
tests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Fixes: ee63933a7 ("nir: Distribute binary operations with constants into bcsel")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105704
2018-03-26 08:50:44 -07:00
Ian Romanick
2c643fd978 nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditional
Now that i965 recognizes that a-b generates the same conditions as 'a <
b', there is no reason to condition this transformation on 'is not used
by conditional.'

Since this was the only user of the is_not_used_by_conditional function,
delete it.

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14400775 -> 14400595 (<.01%)
instructions in affected programs: 36712 -> 36532 (-0.49%)
helped: 182
HURT: 26
helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1
helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90%
95% mean confidence interval for instructions value: -0.97 -0.76
95% mean confidence interval for instructions %-change: -0.59% -0.43%
Instructions are helped.

total cycles in shared programs: 532929592 -> 532926345 (<.01%)
cycles in affected programs: 478660 -> 475413 (-0.68%)
helped: 187
HURT: 22
helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18
helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03%
HURT stats (abs)   min: 1 max: 214 x̄: 30.86 x̃: 11
HURT stats (rel)   min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86%
95% mean confidence interval for cycles value: -19.50 -11.57
95% mean confidence interval for cycles %-change: -1.42% -0.58%
Cycles are helped.

GM45 and Iron Lake had similar results. (Iron Lake shown)
total cycles in shared programs: 177851578 -> 177851810 (<.01%)
cycles in affected programs: 24408 -> 24640 (0.95%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44%
HURT stats (abs)   min: 24 max: 108 x̄: 60.00 x̃: 54
HURT stats (rel)   min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02%
95% mean confidence interval for cycles value: -7.75 85.08
95% mean confidence interval for cycles %-change: -0.39% 1.49%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
cd635d149b i965/vec4: Propagate conditional modifiers from compares to adds
No changes on Broadwell or later as those platforms do not use the vec4
backend.

Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11682119 -> 11681056 (<.01%)
instructions in affected programs: 150403 -> 149340 (-0.71%)
helped: 950
HURT: 0
helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1
helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71%
95% mean confidence interval for instructions value: -1.19 -1.04
95% mean confidence interval for instructions %-change: -0.84% -0.79%
Instructions are helped.

total cycles in shared programs: 257495842 -> 257495238 (<.01%)
cycles in affected programs: 270302 -> 269698 (-0.22%)
helped: 271
HURT: 13
helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2
helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28%
HURT stats (abs)   min: 2 max: 12 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -2.41 -1.84
95% mean confidence interval for cycles %-change: -0.31% -0.26%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10430493 -> 10429727 (<.01%)
instructions in affected programs: 120860 -> 120094 (-0.63%)
helped: 766
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 146138718 -> 146138446 (<.01%)
cycles in affected programs: 244114 -> 243842 (-0.11%)
helped: 132
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2
helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19%
95% mean confidence interval for cycles value: -2.12 -2.00
95% mean confidence interval for cycles %-change: -0.18% -0.15%
Cycles are helped.

GM45 and Iron Lake had identical results. (Iron Lake shown)
total instructions in shared programs: 7780251 -> 7780248 (<.01%)
instructions in affected programs: 175 -> 172 (-1.71%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49%

total cycles in shared programs: 177851584 -> 177851578 (<.01%)
cycles in affected programs: 9796 -> 9790 (-0.06%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05%

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
780f307ba8 i965/vec4: Allow cmod propagation when src0 is a uniform or shader input
No shader-db changes.  This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input.  However, this
change allows the next commit to help more shaders.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
020b0055e7 i965/fs: Propagate conditional modifiers from compares to adds
The math inside the add and the cmp in this instruction sequence is the
same.  We can utilize this to eliminate the compare.

add(8)          g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
cmp.z.f0(8)     null<1>F        g2<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };

This is reduced to:

add.z.f0(8)     g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };

This optimization pass could do even better.  The nature of converting
vectorized code from the GLSL front end to scalar code in NIR results in
sequences like:

add(8)          g7<1>F          g4<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
add(8)          g6<1>F          g3<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
add(8)          g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
cmp.z.f0(8)     null<1>F        g2<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };
cmp.z.f0(8)     null<1>F        g3<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g10<1>F         (abs)g6<8,8,1>F 3e-37F          { align1 1Q };
cmp.z.f0(8)     null<1>F        g4<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g12<1>F         (abs)g7<8,8,1>F 3e-37F          { align1 1Q };

In this sequence, only the first cmp.z is removed.  With different
scheduling, all 3 could get removed.

Skylake
total instructions in shared programs: 14407009 -> 14400173 (-0.05%)
instructions in affected programs: 1307274 -> 1300438 (-0.52%)
helped: 4880
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.45 -1.35
95% mean confidence interval for instructions %-change: -0.72% -0.69%
Instructions are helped.

total cycles in shared programs: 532943169 -> 532923528 (<.01%)
cycles in affected programs: 14065798 -> 14046157 (-0.14%)
helped: 2703
HURT: 339
helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2
helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21%
HURT stats (abs)   min: 1 max: 739 x̄: 39.86 x̃: 12
HURT stats (rel)   min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41%
95% mean confidence interval for cycles value: -8.66 -4.26
95% mean confidence interval for cycles %-change: -0.24% -0.14%
Cycles are helped.

LOST:   0
GAINED: 1

Broadwell
total instructions in shared programs: 14719636 -> 14712949 (-0.05%)
instructions in affected programs: 1288188 -> 1281501 (-0.52%)
helped: 4845
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.43 -1.33
95% mean confidence interval for instructions %-change: -0.72% -0.68%
Instructions are helped.

total cycles in shared programs: 559599253 -> 559581699 (<.01%)
cycles in affected programs: 13315565 -> 13298011 (-0.13%)
helped: 2600
HURT: 269
helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2
helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20%
HURT stats (abs)   min: 1 max: 790 x̄: 53.07 x̃: 20
HURT stats (rel)   min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75%
95% mean confidence interval for cycles value: -8.47 -3.77
95% mean confidence interval for cycles %-change: -0.27% -0.18%
Cycles are helped.

LOST:   0
GAINED: 8

Haswell
total instructions in shared programs: 12978609 -> 12973483 (-0.04%)
instructions in affected programs: 932921 -> 927795 (-0.55%)
helped: 3480
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58%
95% mean confidence interval for instructions value: -1.53 -1.42
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 410270788 -> 410250531 (<.01%)
cycles in affected programs: 10986161 -> 10965904 (-0.18%)
helped: 2087
HURT: 254
helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4
helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21%
HURT stats (abs)   min: 1 max: 519 x̄: 40.49 x̃: 16
HURT stats (rel)   min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47%
95% mean confidence interval for cycles value: -12.82 -4.49
95% mean confidence interval for cycles %-change: -0.31% -0.18%
Cycles are helped.

LOST:   0
GAINED: 5

Ivy Bridge
total instructions in shared programs: 11686082 -> 11681548 (-0.04%)
instructions in affected programs: 937696 -> 933162 (-0.48%)
helped: 3150
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49%
95% mean confidence interval for instructions value: -1.49 -1.38
95% mean confidence interval for instructions %-change: -0.71% -0.67%
Instructions are helped.

total cycles in shared programs: 257514962 -> 257492471 (<.01%)
cycles in affected programs: 11524149 -> 11501658 (-0.20%)
helped: 1970
HURT: 239
helped stats (abs) min: 1 max: 3525 x̄: 17.48 x̃: 3
helped stats (rel) min: <.01% max: 49.60% x̄: 0.46% x̃: 0.17%
HURT stats (abs)   min: 1 max: 1358 x̄: 50.00 x̃: 15
HURT stats (rel)   min: 0.02% max: 59.88% x̄: 1.84% x̃: 0.65%
95% mean confidence interval for cycles value: -17.01 -3.35
95% mean confidence interval for cycles %-change: -0.33% -0.08%
Cycles are helped.

LOST:   9
GAINED: 1

Sandy Bridge
total instructions in shared programs: 10432841 -> 10429893 (-0.03%)
instructions in affected programs: 685071 -> 682123 (-0.43%)
helped: 2453
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 1.20 x̃: 1
helped stats (rel) min: 0.02% max: 7.55% x̄: 0.64% x̃: 0.46%
95% mean confidence interval for instructions value: -1.23 -1.17
95% mean confidence interval for instructions %-change: -0.67% -0.62%
Instructions are helped.

total cycles in shared programs: 146133660 -> 146134195 (<.01%)
cycles in affected programs: 3991634 -> 3992169 (0.01%)
helped: 1237
HURT: 153
helped stats (abs) min: 1 max: 2853 x̄: 6.93 x̃: 2
helped stats (rel) min: <.01% max: 29.00% x̄: 0.24% x̃: 0.14%
HURT stats (abs)   min: 1 max: 1740 x̄: 59.56 x̃: 12
HURT stats (rel)   min: 0.03% max: 78.98% x̄: 1.96% x̃: 0.42%
95% mean confidence interval for cycles value: -5.13 5.90
95% mean confidence interval for cycles %-change: -0.17% 0.16%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 1

GM45 and Iron Lake had similar results (GM45 shown):
total instructions in shared programs: 4800332 -> 4798380 (-0.04%)
instructions in affected programs: 565995 -> 564043 (-0.34%)
helped: 1451
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 1.35 x̃: 1
helped stats (rel) min: 0.05% max: 5.26% x̄: 0.47% x̃: 0.31%
95% mean confidence interval for instructions value: -1.40 -1.29
95% mean confidence interval for instructions %-change: -0.50% -0.45%
Instructions are helped.

total cycles in shared programs: 122032318 -> 122027798 (<.01%)
cycles in affected programs: 8334868 -> 8330348 (-0.05%)
helped: 1029
HURT: 1
helped stats (abs) min: 2 max: 40 x̄: 4.43 x̃: 2
helped stats (rel) min: <.01% max: 1.83% x̄: 0.09% x̃: 0.04%
HURT stats (abs)   min: 38 max: 38 x̄: 38.00 x̃: 38
HURT stats (rel)   min: 0.25% max: 0.25% x̄: 0.25% x̃: 0.25%
95% mean confidence interval for cycles value: -4.70 -4.08
95% mean confidence interval for cycles %-change: -0.09% -0.08%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
5bbb3d60d3 i965/fs: Allow cmod propagation when src0 is a uniform or shader input
No shader-db changes.  This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input.  However, this
change allows the next commit to help about 900 more shaders.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
8f83eea71e i965: Add negative_equals methods
This method is similar to the existing ::equals methods.  Instead of
testing that two src_regs are equal to each other, it tests that one is
the negation of the other.

v2: Simplify various checks based on suggestions from Matt.  Use
src_reg::type instead of fixed_hw_reg.type in a check.  Also suggested
by Matt.

v3: Rebase on 3 years.  Fix some problems with negative_equals with VF
constants.  Add fs_reg::negative_equals.

v4: Replace the existing default case with BRW_REGISTER_TYPE_UB,
BRW_REGISTER_TYPE_B, and BRW_REGISTER_TYPE_NF.  Suggested by Matt.
Expand the FINISHME comment to better explain why it isn't already
finished.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
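The idea behind negative_equals can be sketched on a deliberately simplified register description. The `struct reg` below and both helper names are hypothetical; the real src_reg and fs_reg classes carry many more fields, and this sketch ignores immediates such as the VF constants mentioned in v3:

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified source register: a register number plus
 * the negate and absolute-value source modifiers. */
struct reg {
    unsigned nr;
    bool negate;
    bool abs;
};

/* Two sources are equal when every field matches. */
static bool reg_equals(const struct reg *a, const struct reg *b)
{
    return a->nr == b->nr && a->negate == b->negate && a->abs == b->abs;
}

/* One source is the negation of the other when flipping the negate
 * modifier on a makes it equal to b. */
static bool reg_negative_equals(const struct reg *a, const struct reg *b)
{
    struct reg flipped = *a;
    flipped.negate = !flipped.negate;
    return reg_equals(&flipped, b);
}
```

In this simplified model, g5 and -g5 compare as negative-equal, while g5 compared against itself or against -g6 does not.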
Gert Wollny
a21da49e5c mesa/st/tests: Use tgsi opcode enum also in the test classes
Fixes: ec478cf9c3 ("st/mesa,tgsi: use enum tgsi_opcode")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105737
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 09:04:53 -06:00
Eric Engestrom
1e36fe5dc4 meson: fix header check message
before: Checking if "endian.h works" compiles: YES
after:  Checking if "endian.h" compiles: YES

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-03-26 09:59:32 +01:00
Rob Clark
2f181c8c18 glsl_types: vec8/vec16 support
Not used in GL but 8 and 16 component vectors exist in OpenCL.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-25 10:42:54 -04:00
Rob Clark
f407edf340 glsl_types: refactor/prep for vec8/vec16
Refactor things so there isn't so much typing involved to add new
things.

Also drops a pointless conditional (out of bounds rows or columns
already returns error_type in all paths.. might as well drop it
rather than make the check more convoluted in the next patch by
adding the vec8/vec16 case).

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-25 10:42:54 -04:00
Jordan Justen
d60eaf7b1f anv: Set genX_table for gen11
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-23 17:23:59 -07:00
Jordan Justen
af8535d02f anv: Add gen11 to anv_genX_call
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-23 17:23:59 -07:00
Mathias Fröhlich
4a8ef1f5d4 vbo: Make sure the internal VAO's stay within limits.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:59:02 +01:00
Mathias Fröhlich
1a131aaf4b mesa: Flag early if we modify a SharedAndImmutable VAO.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:58:59 +01:00
Mathias Fröhlich
19526a57f5 mesa: When copying a VAO also copy the vertex attribute mode.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:58:54 +01:00
Emil Velikov
5a75019ad0 configure: use AC_CHECK_HEADERS to check for endian.h
Currently we use the singular CHECK_HEADER combined with an explicit
append to the DEFINES variable. That is a legacy misnomer, since it
requires us to add $DEFINES to every piece that we build.

Using the plural version of the helper sets the HAVE_ macro for us, plus
ensures it's passed to the compiler - if config.h is available in there
(not in the case of mesa) otherwise on the command line.

In hindsight, we should replace all the AC_CHECK_{FUNC,HEADER} instances
with the plural version (or even the _ONCE suffixed version) and drop
the DEFINES hacks.

Fixes: cbee1bfb34 ("meson/configure: detect endian.h instead of trying
to guess when it's available")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105717
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-03-23 18:12:52 +00:00
Kenneth Graunke
90f556f0b1 android: Use local i915_drm.h rather than the system one.
Fixes: 2d26c99933 (intel: devinfo: meson: include drm uapi)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-03-23 10:05:02 -07:00
Brian Paul
e31d5bd2f9 st/mesa: s/unsigned/enum pipe_shader_type/ for st_bind_ubos()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
6a93deedf5 st/mesa: whitespace/formatting fixes in st_atom_constbuf.c
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
aad23f91ee st/mesa: s/unsigned/enum pipe_shader_type/
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
93581c2ca0 svga: simplify uses_flat_interp expression in emit_input_declarations()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
c99f46c2ac svga: replace unsigned with proper enum names
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
7181a9fa0e tgsi,softpipe: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
ec478cf9c3 st/mesa,tgsi: use enum tgsi_opcode
Need to update the tgsi code and st_glsl_to_tgsi code at the same time
to prevent compile break since C++ is much pickier about implicit
enum/unsigned casting.

Bump size of glsl_to_tgsi_instruction::op to 10 bits to be sure to
avoid MSVC signed enum overflow issue.  No change in class size.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
ccecb2bbd3 tgsi/nir: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
22a3190c85 tgsi: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
9413d1c0fe gallivm: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
7df96826f8 svga: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
4e0f967f6d tgsi: convert opcode macros to enums
Enums are nicer in gdb.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Lionel Landwerlin
412fae46c0 compiler: glsl: silence valgrind warning on write cache
I don't think it actually fixes anything, but it's nice not to have valgrind warnings.
It manifests itself when running the piglit test glsl-fs-raytrace-bug27060:

==2058== Uninitialised byte(s) found during client check request
==2058==    at 0xC5BB040: blob_write_bytes (blob.c:152)
==2058==    by 0xC595359: write_variable (nir_serialize.c:144)
==2058==    by 0xC59560C: write_var_list (nir_serialize.c:192)
==2058==    by 0xC5982E4: nir_serialize (nir_serialize.c:1124)
==2058==    by 0xC0B729D: brw_program_serialize_nir (brw_program.c:835)
==2058==    by 0xC0AB2D6: brw_link_shader (brw_link.cpp:358)
==2058==    by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169)
==2058==    by 0xC36C7ED: create_new_program(gl_context*, state_key*) (ff_fragment_shader.cpp:1127)
==2058==    by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157)
==2058==    by 0xC1B50AF: update_program (state.c:134)
==2058==    by 0xC1B56DF: _mesa_update_state_locked (state.c:352)
==2058==    by 0xC1B579A: _mesa_update_state (state.c:386)
==2058==  Address 0xf1eab8a is 58 bytes inside a block of size 96 alloc'd
==2058==    at 0x4C2CB8F: malloc (vg_replace_malloc.c:299)
==2058==    by 0xC0FD306: ralloc_size (ralloc.c:121)
==2058==    by 0xC0FD5B1: ralloc_array_size (ralloc.c:208)
==2058==    by 0xC452B3B: (anonymous namespace)::nir_visitor::visit(ir_variable*) (glsl_to_nir.cpp:448)
==2058==    by 0xC45CE8B: ir_variable::accept(ir_visitor*) (ir.h:428)
==2058==    by 0xC46D0B5: visit_exec_list(exec_list*, ir_visitor*) (ir.cpp:1898)
==2058==    by 0xC451D2F: glsl_to_nir (glsl_to_nir.cpp:162)
==2058==    by 0xC0B5223: brw_create_nir (brw_program.c:79)
==2058==    by 0xC0AAB67: brw_link_shader (brw_link.cpp:257)
==2058==    by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169)
==2058==    by 0xC36C7ED: create_new_program(gl_context*, state_key*) (ff_fragment_shader.cpp:1127)
==2058==    by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-23 13:05:12 +00:00
Eric Engestrom
cbee1bfb34 meson/configure: detect endian.h instead of trying to guess when it's available
Cc: Maxin B. John <maxin.john@gmail.com>
Cc: Khem Raj <raj.khem@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Suggested-by: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: <mesa-stable@lists.freedesktop.org>
2018-03-23 11:44:21 +00:00
Juan A. Suarez Romero
ee2b943fa8 wayland-drm: do not distribute generated sources
Instead, we will regenerate them at build time.

v2: get rid of BUILT_SOURCES (Daniel, Emil)
v3: keep BUILT_SOURCES for egl/Makefile.am (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-23 11:27:12 +01:00
Samuel Pitoiset
ccc64f3133 radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8
The hardware only supports 32-bit depth surfaces, but we can
enable TC-compat HTILE for 16-bit depth surfaces if no Z planes
are compressed.

The main benefit is to reduce the number of depth decompression
passes. Also, we don't need to implement DB->CB copies which is
fine.

This improves Serious Sam 2017 by +4%. Talos and F12017 are also
affected but I don't see a performance difference.

This also improves the shadowmapping Vulkan demo by 10-15%
(FPS is now similar to AMDVLK).

No CTS regressions on Polaris10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:57 +01:00
Samuel Pitoiset
5ae9772245 radv: add radv_calc_decompress_on_z_planes() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:55 +01:00
Samuel Pitoiset
9b8e75bee3 radv: add radv_image_is_tc_compat_htile() helper
Instead of that huge conditional that's going to be crazy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:54 +01:00
Jason Ekstrand
884d27bcf6 nir: Rename image intrinsics to image_var
Generated with

git grep -l nir_intrinsic_image | xargs \
sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g'

and some manual fixing in nir_intrinsics.h

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-23 13:48:11 +11:00
Dave Airlie
fa683385de virgl: add ARB_cull_distance support.
This just allows the properties through to the host if we have
cull dist support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-23 10:21:10 +10:00
Eric Anholt
d7a015cbc6 broadcom/vc5: Account for InstanceID/VertexID in VPM segment size.
Fixes failure in
GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size
2018-03-22 15:12:21 -07:00
Eric Anholt
b8387dbc49 broadcom/vc5: Allow FBOs with mixed color formats.
This is required by GLES3, fixing
GTF-GLES3.gtf.GL3Tests.framebuffer_srgb.framebuffer_srgb_draw
2018-03-22 15:12:21 -07:00
Eric Anholt
4f62679be5 broadcom/vc5: Add missing support for 2101010_REV vertex attributes.
Fixes
GTF-GLES3.gtf.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_invalid2,
where we hadn't thrown a GL error as needed in the extension-disabled
case.  We want to be exposing the extension anyway.
2018-03-22 15:12:21 -07:00
Eric Anholt
ba29b89dc7 broadcom/vc5: Set up a vertex position if the shader doesn't.
Our backend needs some sort of vertex position value to emit the scaled
viewport values and such.  Fixes potential segfaults in
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx
2018-03-22 15:12:21 -07:00
Lionel Landwerlin
903e9952fb i965: add performance query support on CNL
v2: Add brw_oa_cnl.xml to EXTRA_DIST (Emil)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
e7f6d1e5f8 i965: perf: add support for new equation operators
Some equations of the CNL metrics started to use operators we haven't
defined yet, just add those.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
57a11550bc i965: perf: query topology
With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam to tell userspace what subslices
are available.

We introduce a new uAPI in the kernel driver to report exactly what
parts of the GPU are fused, and require this to be available on Gen10+.

Prior generations can continue to rely on GETPARAM on older kernels.

This patch is quite a lot of code because we have to support lots of
different kernel versions, ranging from not providing any information
(for Haswell on 4.13 through 4.17), to being able to query through
GETPARAM (for gen8/9 on 4.13 through 4.17), to finally requiring 4.17
for Gen10+.

This change stores topology information in a unified way on
brw_context.topology from the various kernel APIs. And then generates
the appropriate values for the equations from that unified topology.

v2: Move slice/subslice masks fields to gen_device_info (Rafael)

v3: Add a gen_device_info_subslice_available() helper (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
c1900f5b0f intel: devinfo: add helper functions to fill fusing masks values
There are a couple of ways we can get the fusing information from the
kernel:

  - Through DRM_I915_GETPARAM with the SLICE_MASK/SUBSLICE_MASK
    parameters

  - Through the new DRM_IOCTL_I915_QUERY by requesting the
    DRM_I915_QUERY_TOPOLOGY_INFO

The second method is more accurate and also gives us the EUs fusing
masks. It's also a requirement for CNL as this platform has asymmetric
subslices and the first method SUBSLICE_MASK value is assumed uniform
across slices.
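
As a rough illustration of why the first method falls short, here is a
small Python sketch (not the gen_device_info code; both function names are
hypothetical). It assumes the GETPARAM path yields one slice mask plus one
subslice mask that is applied to every enabled slice.

```python
def per_slice_subslice_masks(slice_mask: int, subslice_mask: int) -> dict:
    """Expand SLICE_MASK/SUBSLICE_MASK style values into per-slice masks.

    The single subslice mask is assumed uniform across all enabled slices.
    """
    return {
        s: subslice_mask
        for s in range(slice_mask.bit_length())
        if slice_mask & (1 << s)
    }


def representable_with_single_mask(topology: dict) -> bool:
    """True if every slice in the topology has the same subslice mask."""
    return len(set(topology.values())) <= 1
```

A topology such as {0: 0b111, 1: 0b011} has different subslices per slice,
so it cannot be described by a single uniform mask and needs a per-slice
description instead.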

v2: Change gen_device_info_update_from_masks() to generate topology
    and call into gen_device_info_update_from_topology (Lionel/Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
2d26c99933 intel: devinfo: meson: include drm uapi
Already available with the autotools build.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
5d3e74a5a5 drm-uapi: bump headers
Required updates from drm-next for changes in i965.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
c471716574 intel: devinfo: store slice/subslice/eu masks
We want to store values coming from the kernel but as a first step, we
can generate mask values out of the numbers already stored in the
gen_device_info masks.

v2: Add a helper to set EU masks (Lionel/Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
7e2c6147da intel: devinfo: store number of EUs per subslice
This will be reused to store values reported by the kernel. The main
use case will be for use as the input values of the metric sets
equations for the INTEL_performance_queries extension. By storing this
information in the gen_device_info we make this non GL specific so
this can be reused by Vulkan if we ever have an equivalent extension.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Dylan Baker
8e5988eb35 Revert "meson: merge C and C++ compiler arguments check"
This reverts commit cb2ddcefa5.

This causes clang to error out building C++ code. The plan is to fix the
build to work with clang, but in the meantime we'll just revert this.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2018-03-22 11:35:08 -07:00
Lionel Landwerlin
1603ce1921 i965/perf: fix config registration when uploading to kernel
When registering configurations to the kernel for the first time, we
run into an issue where the id number is not properly set (we're using
the wrong variable). As a result when trying to use that id later on,
we get an error.

This issue manifests itself the first time you use frameretrace after
reboot, subsequent runs are fine.

Fixes: 27ee83eaf7 ("i965: perf: add support for userspace configurations")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 18:21:57 +00:00
Lepton Wu
a8b846bccd gallium/winsys/kms: Add support for multi-planes
Add a new struct kms_sw_plane which represents a plane, and use it
in place of sw_displaytarget. Multiple planes share the same underlying
kms_sw_displaytarget.

v2:
 - add more check for plane size (Tomasz)
v3:
 - split from larger patch (Emil)
v4:
 - no change from v3
v5:
 - remove mapped field (Tomasz)
v6:
 - remove change-id in commit message (Tomasz)
v7:
 - add revision history in commit message (Emil)

Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2018-03-22 18:10:44 +00:00
Lepton Wu
d891f28df9 gallium/winsys/kms: Fix possible leak in map/unmap.
If the user calls map twice for a kms_sw_displaytarget, the first mapped
buffer could get leaked. Instead of calling mmap every time, just
reuse the previous mapping. Since the user could map the same
displaytarget with different flags, we have to keep two different
pointers, one for the rw mapping and one for the ro mapping. Also
introduce a reference count for the mapped buffer so we can unmap it at
the right time.
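
The caching and reference counting described above can be sketched in
Python as follows. This is a model of the idea only, not the winsys code:
the class name, the single shared counter, and the do_map/do_unmap
callables (stand-ins for mmap/munmap) are all assumptions made for the
example.

```python
class DisplayTarget:
    """Caches one rw and one ro mapping and counts outstanding map calls."""

    def __init__(self, do_map, do_unmap):
        self._do_map = do_map      # callable(read_only) -> mapping handle
        self._do_unmap = do_unmap  # callable(handle) -> None
        self._mappings = {}        # read_only flag -> cached handle
        self._map_count = 0

    def map(self, read_only=False):
        # Reuse the previous mapping for these flags instead of mapping again.
        if read_only not in self._mappings:
            self._mappings[read_only] = self._do_map(read_only)
        self._map_count += 1
        return self._mappings[read_only]

    def unmap(self):
        if self._map_count == 0:
            return  # unbalanced unmap: nothing to release
        self._map_count -= 1
        if self._map_count == 0:
            # Last user is gone: release both cached mappings.
            for handle in self._mappings.values():
                self._do_unmap(handle)
            self._mappings.clear()
```

Mapping twice with the same flags returns the same handle, and nothing is
released until every map call has been balanced by an unmap.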

v2:
 - avoid duplicated mapping and leaked mapping (Tomasz)
v3:
 - split from larger patch (Emil)
v4:
 - remove munmap from dt_destroy (Emil)
v5:
 - introduce reference count for mapping (Tomasz)
 - add back munmap in dt_destroy
v6:
 - remove change-id in commit message (Tomasz)
v7:
 - remove munmap from dt_destroy again (Emil)
 - add revision history in commit message (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2018-03-22 18:10:42 +00:00
Juan A. Suarez Romero
4db269f30c broadcom/vc4: add path to nir_builder.h
As the other VC4 files do. Otherwise, it won't find nir_builder.h

v2: add path in source code rather than changing autotools (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
d39e828c82 autotools: add tegra header files
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
40ecee89b7 swr/rast: autotools: add events_private.proto in dist tarball.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
0bf1274883 radv: autotools: add radv_extensions.h in the generated VULKAN list
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
13459c637a anv/radv: autotools: include vulkan_*.h headers
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
f8b749b7c0 nir: autotools, meson: add GLSL.ext.AMD.h in the files list
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Matt Turner
724586a266 intel/compiler: Readd ICL to test_eu_validate.cpp
Now that the PCI IDs are upstream, this can be readded.
2018-03-22 09:56:09 -07:00
Matt Turner
65b060d9cb intel/compiler: Skip 64-bit type tests when types not available
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
ad7ed86bf7 intel: Add Ice Lake PCI IDs
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-22 09:56:09 -07:00
Anuj Phogat
1065acfb69 intel: Disable fast color clear on icl
Disabling fast color clear makes the fbo-clearmipmap test render the
correct texture in the base miplevel. Fast color clear is disabled for
non-base miplevels anyway.

Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Jason Ekstrand
d2eecf0b0b intel/compiler/icl: Clear "null render target" bit in extended message descriptor
Otherwise all our render target writes go nowhere.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
1484876ef7 intel/compiler/icl: Update the assert in brw_stage_has_packed_dispatch()
Rafael ran piglit with the test code enabled and saw no additional GPU
hangs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
f05e0d9c2a intel/common/icl: Disable hiz surface sampling
On gen11+ AUX_HIZ is not a supported value for surfaces being
sampled by the 3D sampler.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
370af9dcc0 intel/common/icl: Add L3 config
ICL uses the same L3 configs as CNL, just leaving the SLM configs out.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Matt Turner
f56693af4b intel/tools/aubinator: Drop platform list from print_help()
We all know the platform names, and I don't want to update this list
continually.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Derek Foreman
aa18a63512 egl/wayland: Make swrast display_sync the correct queue
commit 03dd9a88b0 introduced per surface
queues, but the display_sync for swrast_commit_backbuffer remained on
the old queue.  This is likely to break when dispatching the correct
queue at the top of function (which can't dispatch the sync callback
we're waiting for).

The easiest known reproduction case is running weston-subsurfaces under
weston --use-pixman

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-22 15:27:35 +00:00
Samuel Pitoiset
52fba3f45d radv: remove unused radv_pipeline::needs_data_cache variable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-22 14:30:37 +01:00
Eric Engestrom
cb2ddcefa5 meson: merge C and C++ compiler arguments check
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 11:59:12 +00:00
Mathias Fröhlich
880c1718b6 omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}
We're trying to be -Wundef clean so that we can turn it on (and
eventually make it an error).

Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead
of #ifdef; I could've changed these, but the point of -Wundef is to
catch typos, so we might as well make the change the right way.

Fixes: 83d4a5d5ae "st/omx/tizonia: Add H.264 decoder"
Fixes: b2f2236dc5 "st/omx/tizonia: Add H.264 encoder"
Fixes: c62cf1f165 "st/omx/tizonia/h264d: Add EGLImage support"
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 11:39:28 +00:00
Mathias Fröhlich
795b465c50 meson: simplify omx logic
and let's make sure `with_gallium_omx` is never 'auto' and can only be
one of [bellagio, tizonia, disabled].

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 10:08:10 +00:00
Mathias Fröhlich
862c872c48 vbo: Remove now duplicate _DrawVAO notification.
The DriverFlags.NewArray bit is already set to NewDriverState in
_mesa_set_draw_vao since we have actually just above changed the VAOs
content. So this can be removed.
The _vbo_update_inputs is called by the vbo...recalculate_inputs being
set through the same mechanism as described above.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:53 +01:00
Mathias Fröhlich
006b5e798a vbo: Remove now duplicate _vbo_update_inputs from dlist draw.
At the current state, _vbo_update_inputs is called from
the draw callback if vbo...recalculate_inputs is set.
But that is now set whenever the _DrawVAO, its content, or the
vertex program mode is changed.
So remove _vbo_update_inputs from the direct dlist draw path.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:53 +01:00
Mathias Fröhlich
2887c98140 vbo: Remove redundant set of DriverFlags.NewArray in vbo_bind_arrays.
Now that setting vbo...recalculate_inputs also sets the
DriverFlags.NewArray bits in NewDriverState, setting them from
vbo_bind_arrays is redundant.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
9f5b6ef2ef vbo: Remove vbo...recalculate_inputs from vbo_exec_invalidate_state.
This flag is now set when the actual Array._DrawVAO changes.
So setting this flag is redundant here.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
bf328359a7 mesa: A change of gl_vertex_processing_mode needs an array update.
Since arrays also handle the mapping of current values into the
disabled array slots, we need to tell the array update code that
this mapping has changed. Also mark only dirty if it has changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
5b91786225 mesa: Set DriverFlags.NewArray together with vbo...recalculate_inputs.
Both mean something very similar and are set at the same time now.
For that vbo module to be set from core mesa, implement a public vbo
module method to set that flag. In the longer term the flag should
vanish in favor of a driver flag of the appropriate driver.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
d3c604e12e mesa: Update VAO internal state when setting the _DrawVAO.
Update the VAO internal state on Array._DrawVAO instead of
Array.VAO. Also the VAO internal state update gets triggered now
by a change of Array._DrawVAO instead of the _NEW_ARRAY state flag.
Also no driver looks at any VAO's NewArrays value from within
the Driver.UpdateState callback. So it should be safe to move
this update into the _mesa_set_draw_vao method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
c4c56ff303 vbo: Move vbo_bind_arrays into a dd_driver_functions draw callback.
Factor out that common call into the almost single place.
Remove the _mesa_set_drawing_arrays call from vbo_{exec,save}_draw code
paths as the function is now called through vbo_bind_arrays.
Prepare updating the list of struct gl_vertex_array entries via
calling _vbo_update_inputs for being pushed into those drivers that
finally work on that long list of gl_vertex_array pointers.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
6307d1be0a mesa: Move vbo draw functions into dd_function_table.
Move vbo draw functions into struct dd_function_table.
For now just wrap the underlying vbo functions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Aaron Watry
23100acc8f clover/llvm: Fix build against LLVM/Clang 4.0
The opencl 1.0 langstandard was renamed in 5.0+

v2: Move preprocessor check into compat.hpp

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-21 21:03:23 -05:00
Timothy Arceri
c135316555 ac/nir_to_llvm: add frexp support
Fixes CTS tests:
KHR-GL40.gpu_shader_fp64.builtin.frexp_double
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4

And piglit test:
tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-22 12:42:34 +11:00
Timothy Arceri
cca2141745 nir: add frexp_exp and frexp_sig opcodes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-22 12:42:34 +11:00
Caio Marcelo de Oliveira Filho
12c22b897a anv/pipeline: don't pass constant view index in multiview
If view mask has only one bit set, view index is effectively a
constant, so doesn't need to be passed to the next stages, just always
set it.

Part of this was in the original patch that added
anv_nir_lower_multiview.c but disabled.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 14:49:50 -07:00
Caio Marcelo de Oliveira Filho
5e7c1d05d4 anv/pipeline: use less instructions for multiview
The view_index is encoded in the remainder of dividing instance id by
the number of views in the view mask (n). In the general case (handled
by the else clause), there is a need to map from 0..n-1 into the
number of the view being masked. For that a map is encoded.

In the case only the first n bits in the mask are set, the mapping is
trivial, 0..n-1 already represent what view is being referred to.

That case was in the original patch that added
anv_nir_lower_multiview.c but disabled.
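
To make the mapping concrete, here is a minimal Python sketch of the
computation described above. It is not the NIR lowering itself; the
function name is hypothetical, and it assumes the view index is the
instance id modulo the number of set bits in the view mask.

```python
def view_index_for_instance(instance_id: int, view_mask: int) -> int:
    """Return the view that a given instance renders to (illustrative)."""
    if view_mask <= 0:
        raise ValueError("view_mask must have at least one bit set")

    # Views enabled in the mask, lowest bit first.
    views = [b for b in range(view_mask.bit_length()) if view_mask & (1 << b)]
    n = len(views)

    compacted = instance_id % n
    if view_mask == (1 << n) - 1:
        # Only the first n bits are set: 0..n-1 already names the view.
        return compacted

    # General case: map 0..n-1 onto the views actually present in the mask.
    return views[compacted]
```

With a mask of 0b0111 the remainder is used directly, while a sparse mask
such as 0b1010 needs the lookup to turn remainders 0 and 1 into views 1
and 3.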

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 14:49:50 -07:00
Eric Anholt
baeb6a4b4a broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI.
Unfortunately TGSI doesn't record the type of the FS output like GLSL
does, but VC5's TLB writes depend on the output's base type.  Just record
the type in the key at variant compile time when we've got a TGSI input
and then fix it up.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a
GPU hang that breaks most tests that come after it.
2018-03-21 14:02:34 -07:00
Neil Roberts
61603f0e42 spirv: Add a 64-bit implementation of Frexp
The implementation is inspired by
lower_instructions_visitor::dfrexp_sig_to_arith.

This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4
test using the ARB_gl_spirv branch.
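
For readers unfamiliar with the technique, the general bit-manipulation
idea behind a 64-bit frexp can be sketched in Python as below. This is an
illustration under stated assumptions, not the SPIR-V/NIR code: the
function name is hypothetical, and the subnormal handling is an addition
for the example that the original implementation may not share.

```python
import struct


def frexp64(x: float):
    """Split x into (significand in [0.5, 1), exponent) via its IEEE bits."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    exponent_field = (bits >> 52) & 0x7FF

    if x == 0.0 or exponent_field == 0x7FF:
        # Zero, infinity and NaN are returned unchanged with exponent 0.
        return x, 0

    if exponent_field == 0:
        # Subnormal: scale into the normal range first, then correct.
        sig, exp = frexp64(x * 2.0 ** 64)
        return sig, exp - 64

    # Keep sign and mantissa, force the biased exponent to 1022 so the
    # magnitude lands in [0.5, 1).
    sig_bits = (bits & ~(0x7FF << 52)) | (1022 << 52)
    sig = struct.unpack("<d", struct.pack("<Q", sig_bits))[0]
    return sig, exponent_field - 1022
```

The results can be checked against math.frexp from the Python standard
library, which follows the same convention for the significand range.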

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 20:18:44 +01:00
Rafael Antognolli
5297a17571 aubinator_error_decode: Compare only the class_name of the ring.
ring_name is "<class_name> + <instance_id>" (e.g. rcs0). So we need to
first compare the class name only, then get the instance id.
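
A small Python sketch of that two-step parse follows. It is illustrative
only and not the aubinator_error_decode code; the function name and the
list of class names are assumptions for the example.

```python
# Engine class names assumed for this sketch.
RING_CLASSES = ("rcs", "vcs", "vecs", "bcs")


def parse_ring_name(ring_name: str):
    """Split e.g. "rcs0" into ("rcs", 0); return None if it does not match."""
    for class_name in RING_CLASSES:
        suffix = ring_name[len(class_name):]
        # Compare only the class name prefix, then read the instance id.
        if ring_name.startswith(class_name) and suffix.isdecimal():
            return class_name, int(suffix)
    return None
```

Comparing the whole string against "rcs" would never match "rcs0", which
is the kind of mismatch the message describes.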

Without this, INSTDONE is not being decoded.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-21 11:35:15 -07:00
Thomas Helland
8d5cd91ca0 nir: Migrate nir_dce to instr worklist
Shader-db runtime change average of five runs:
   Before 125,77 seconds (+/- 0,09%)
   After  124,48 seconds (+/- 0,07%)

Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric at anholt.net>
2018-03-21 19:26:40 +01:00
Thomas Helland
edb18564c7 nir: Initial implementation of a nir_instr_worklist
Make a simple worklist by basically just wrapping u_vector.
This is intended for use in nir_opt_dce to reduce the number of calls
to ralloc, as we are currently spamming ralloc quite badly. It should
also give better cache locality and much lower memory usage.
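
As a generic illustration of worklist-driven dead code elimination (not
nir_opt_dce or nir_instr_worklist; the function name and the instruction
representation are assumptions), the liveness pass can be sketched in
Python like this:

```python
from collections import deque


def live_instructions(instrs: dict) -> set:
    """Return the names of live instructions.

    instrs maps an instruction name to (sources, has_side_effects), where
    sources is a list of instruction names it reads from.
    """
    live = set()
    worklist = deque()

    # Seed the worklist with instructions that must be kept.
    for name, (_sources, has_side_effects) in instrs.items():
        if has_side_effects:
            live.add(name)
            worklist.append(name)

    # Anything a live instruction reads is live too.
    while worklist:
        name = worklist.popleft()
        for src in instrs[name][0]:
            if src not in live:
                live.add(src)
                worklist.append(src)

    return live
```

Each instruction is pushed at most once, so the queue can live in one
contiguous growable buffer rather than one allocation per entry.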

Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric at anholt.net>
2018-03-21 19:26:27 +01:00
Scott D Phillips
cab8df1e3e intel/tools: aubinator: Catch gen11 "enhanced execlist" submission
Different registers are used for execlist submission in gen11, so
also watch those. This code only watches element zero of the
submit queue, which is all aubdump currently writes.

Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-21 11:07:15 -07:00
Marek Olšák
a8d55374dc radeonsi: fix a snprintf warning on gcc 7.3.0
2018-03-21 13:43:09 -04:00
Marek Olšák
cf0a95afac radeonsi/gfx9: print the swizzle mode for testdma
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-21 13:40:06 -04:00
Marek Olšák
f7ffa504a0 ac/surface: compute tile swizzle for GFX9
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-21 13:40:06 -04:00
Eric Anholt
9f0c9c6d18 broadcom/vc5: Don't skip job submit just because everything is scissored.
The coordinate shaders may now have side effects in the form of transform
feedback.

Part of fixing
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_misc
2018-03-21 10:04:21 -07:00
Eric Anholt
024e814dee broadcom/vc5: Handle sparsely populated SO target array.
Fixes
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_state_variables
2018-03-21 10:04:21 -07:00
Eric Anholt
f735ac6b1c broadcom/vc5: Fix 3D miplevel limit to match other texture targets.
Fixes segfault in
GTF-GLES3.gtf.GL3Tests.texture_storage.texture_storage_texture_levels on
level 13.
2018-03-21 10:04:21 -07:00
Eric Anholt
ba87d85b04 broadcom/vc5: Clamp the instance divisor to 16 bits.
Fixes debug assert on
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor

Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-21 10:04:21 -07:00
Lionel Landwerlin
3dd92184d5 i965: fix android build
This is the equivalent of commit 5770e1d89e for
android.

v2: fix xml files path and file given to --header

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Fixes: 2d2b15fbca ("i965: fix autotools/android build")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105634
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 18:56:47 +02:00
Juan A. Suarez Romero
e5cd376c2f docs: fix typo in 17.3.6 release notes
Title is about 17.3.5, when it must be about 17.3.6.

CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-21 16:37:49 +00:00
Caio Marcelo de Oliveira Filho
8571c577aa nir/dead_cf: also remove useless ifs
Generalize the code for removing dead loops to also remove dead if
nodes. The conditions are the same in both cases: the node (and
its children) must not have side-effects, and the nodes after it must
not use the values produced by the node.

The only difference is when evaluating side effects: loops consider
only return jumps as a side-effect -- they can stop execution of nodes
after it; 'if' nodes outside loops should consider all kinds of
jumps (return, break, continue) since all of them can cause execution
of nodes after it to be skipped.

After this patch, empty ifs (those whose then and else blocks are both
empty) will be removed by nir_opt_dead_cf.

It caused no change to shader-db, in part because the removal of empty
ifs is currently covered by nir_opt_peephole_select.

v2: Improve the identification of cases where break/continue can cause
    side-effects. (Jason)

v3: Move code comment changes to a different patch. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 09:36:09 -07:00
Caio Marcelo de Oliveira Filho
470056d37b nir/dead_cf: rephrase definition of a dead loop node
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 09:35:57 -07:00
Juan A. Suarez Romero
e1f8c23e18 docs: update calendar, add news and link release notes to 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-03-21 16:02:37 +00:00
Juan A. Suarez Romero
543e7c8382 docs: add sha256 checksums for 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 13dd6016d7)
2018-03-21 15:58:55 +00:00
Juan A. Suarez Romero
09448940ed docs: add release notes for 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 8a51f3857c)
2018-03-21 15:58:52 +00:00
Leo Liu
c4de2f0880 radeon/vce: move feedback command inside of destroy function
On the CI family, the firmware requires the destroy command to be the
last command in the IB. Moving the feedback command after destroy causes
issues on CI cards, so we have to restore the previous logic that keeps
destroy as the last command.

However, to keep the original issue fixed on newer families like Vega10,
the feedback command has to be included inside the task info command
along with the destroy command.

Fixes: 6d74cb25("radeon/vce: move destroy command before feedback command")

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-03-21 11:24:35 -04:00
Eric Engestrom
1346a36162 egl: pull update from Khronos and drop local define
Added in Khronos in 2b6bb4ee45cc46c89d4a "EGL_MESA_drm_image: add
EGL_DRM_BUFFER_USE_CURSOR_MESA to egl.xml" [1] as part of PR #36 [2].

[1] 2b6bb4ee45
[2] https://github.com/KhronosGroup/EGL-Registry/pull/36

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Eric Engestrom
f744c6c1e2 egl: align the formatting of Haiku section of eglplatform.h with Khronos'
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Eric Engestrom
ac698ae4a0 egl: add Ozone section to eglplatform.h
This pulls in commit a93f559e9c11fa53fb5f1cc255b8f75433f85d2a "Add Ozone
section to eglplatform.h" from Khronos [1] added by Brian Anderson [2]
a few months ago.

[1] a93f559e9c
[2] https://github.com/KhronosGroup/EGL-Registry/pull/26

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Aaron Watry
c95d953b18 clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version
Use get_language_version to calculate default cl standard based on
device capabilities and -cl-std specified in build options.

v5: move dev_clc_version declaration from an earlier patch
v4: Squash the __OPENCL_VERSION__ and CLC language version patches
v3: (Jan) Allow device_version up to 2.2 while device_clc_version
    only goes to 2.0
    Use get_cl_version to calculate version instead
v2: Split out from the previous patch (Pierre)

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
CC: Jan Vesely <jan.vesely@rutgers.edu>
2018-03-21 06:59:46 -05:00
Aaron Watry
29b4090d18 clover/llvm: Add get_[cl|language]_version, validation and some helpers
Used to calculate the default CLC language version based on the -cl-std in build args
and the device capabilities.

According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
 1) If you have -cl-std=CL1.1+ use the version specified
 2) If not, use the highest 1.x version that the device supports

Curiously, there is no valid value for -cl-std=CL1.0

Validates requested cl-std against device_clc_version
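
The selection rule above can be sketched in Python as follows. This is an
illustration only, not the clover code: the function name, the tuple
representation of versions, and the list of known versions are
assumptions made for the example.

```python
import re

# CL C versions as (major, minor); assumed list for this sketch.
KNOWN_CLC_VERSIONS = [(1, 0), (1, 1), (1, 2), (2, 0)]


def pick_clc_version(build_options: str, device_clc_version: tuple) -> tuple:
    """Choose the CL C version from build options and device capability."""
    match = re.search(r"-cl-std=CL(\d+)\.(\d+)", build_options)
    if match:
        requested = (int(match.group(1)), int(match.group(2)))
        # CL1.0 is not a valid -cl-std value.
        if requested not in KNOWN_CLC_VERSIONS or requested == (1, 0):
            raise ValueError("unknown -cl-std value")
        # Validate the request against what the device supports.
        if requested > device_clc_version:
            raise ValueError("-cl-std exceeds the device CL C version")
        return requested

    # No -cl-std given: use the highest 1.x version the device supports.
    candidates = [
        v for v in KNOWN_CLC_VERSIONS if v[0] == 1 and v <= device_clc_version
    ]
    if not candidates:
        raise ValueError("device supports no known 1.x CL C version")
    return max(candidates)
```

Note that a device reporting CL C 2.0 still defaults to 1.2 when no
-cl-std option is passed, since only 1.x versions are considered then.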

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>

v7: (Pierre) Split cl/clc versions into separate lists and
    make more references const.

v6: (Pierre) Add more const and fix some whitespace

v5: (Aaron) Use a collection of cl versions instead of switch cases
    Consolidates the string, numeric version, and clc langstandard::kind

v4: (Pierre) Split get_language_version addition and use into separate patches
    Squash patches that add the helpers and validate the language standard

v3: Change device_version to device_clc_version

v2: (Pierre) Move create_compiler_instance changes to correct patch
    to prevent temporary build breakage.
    Convert version_str into unsigned and use it to find language version
    Add build_error for unknown language version string
    Whitespace fixes
2018-03-21 06:59:37 -05:00
Juan A. Suarez Romero
14fffefc60 docs: add 17.3.{8,9} in the release calendar
Mesa 18.0 series has not been released yet, so let's extend 17.3 lifetime.

v2: add 17.3.9 in the calendar (Andres Gomez)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 11:57:44 +01:00
Eric Anholt
4d8b476fa9 intel/blorp: Fix compiler warning about num_layers.
The compiler doesn't notice that the condition for num_layers to be
undefined already defined it above (as our assert checked in a debug
build).

v2: Move the pair of assignments to one outside of the block.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 14:06:46 -07:00
Samuel Pitoiset
f0211155f1 radv: add support for VK_EXT_depth_range_unrestricted
This extension removes the restrictions on minDepth/maxDepth,
minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth.

The following CTS tests now pass:

dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth
dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth
dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth
dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted
dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 21:55:41 +01:00
Samuel Pitoiset
4e9b0b39b5 radv: only enable one channel when exporting prim id
It's a 32-bit integer like the layer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 21:54:48 +01:00
Lionel Landwerlin
5770e1d89e i965: fix out of tree autotools build
Fixes: 2d2b15fbca ("i965: fix autotools/android build")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-20 19:48:56 +00:00
Stéphane Marchesin
1117edc60d virgl: Implement seamless cube maps
This was previously ignored.

Along with the virglrenderer patch, this fixes ~100 dEQP tests:
dEQP-GLES3.functional.texture.filtering.cube.*

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-21 05:44:52 +10:00
Emil Velikov
c43715d30b i965: annotate brw_oa.py's --header and --code as required
As of earlier commit, the --header was made a hard requirement when
using --code.

Hence - annotate both as required and drop a few no longer needed
checks.

Fixes: 035cc7a12d ("i965: perf: reduce i965 binary size")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 17:21:49 +00:00
Lionel Landwerlin
d3e5d3955c i965: pipecontrol: add LRI write immediate flag
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 16:58:30 +00:00
Lionel Landwerlin
7f977d51b3 intel: genxml: add INSTPM/CS_DEBUG_MODE2 registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 16:58:30 +00:00
Lionel Landwerlin
2d2b15fbca i965: fix autotools/android build
Autotools/android builds generate the header & code files in 2 steps,
but the code generation requires the name of the header file to
include it.

This change generates both files in one command.

Fixes: 035cc7a12d ("i965: perf: reduce i965 binary size")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-20 16:58:29 +00:00
Daniel Stone
9f3509665d dri3: Fix typo in version check
The have-new-DRI3 codepaths would never actually properly trigger, since
there was a typo in configure.ac which broke the version check. This
went unnoticed but for an error in config.log if you looked closely
enough.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Lukas F. Hartmann <lukas@mntmn.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
Cc: Dave Airlie <airlied@redhat.com>
2018-03-20 16:38:08 +00:00
Daniel Stone
bc5e59119e meson: Don't build svga by default on ARM/AArch64
VMware has no (published) support for Arm-architecture guests.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reported-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-20 16:18:37 +00:00
Daniel Stone
d7603cb518 meson: Add default DRI drivers for ARM/AArch64
On all Arm architectures (ARMv7 and below as 'arm', ARMv8 and above as
'aarch64'), only build swrast for DRI drivers. The only classic drivers
which could be used are r200 and NV20 cards, which seems unlikely enough
that it shouldn't be the default.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Javier Jardón <jjardon@gnome.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-20 16:18:37 +00:00
Emil Velikov
28780c5028 st/mesa: add compiler/nir/ prefix for nir includes
Stay consistent with the rest of the codebase, effectively fixing the
autotools build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105621
Fixes: ffa4bbe466 ("st/nir/radeonsi: move nir_lower_uniforms_to_ubo()
to the state tracker")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-20 16:11:19 +00:00
Scott D Phillips
d849d36c6c anv: off-by-one in GetDescriptorSetLayoutSupport
Loop was accessing one more than bindingCount elements from
pBindings, accessing uninitialized memory.

Fixes: ddc4069122 ("anv: Implement VK_KHR_maintenance3")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 07:58:10 -07:00
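
The off-by-one described in d849d36c6c can be illustrated with a minimal sketch. The struct and function below are hypothetical stand-ins, not the anv code; only the loop bound matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a descriptor binding; not the Vulkan struct. */
struct binding {
    uint32_t descriptor_count;
};

/* Sums descriptor counts over exactly `binding_count` elements.
 * The buggy form of this loop would use `i <= binding_count`, reading
 * one element past the end of the array. */
static uint32_t
total_descriptors(const struct binding *bindings, uint32_t binding_count)
{
    assert(bindings != NULL || binding_count == 0);

    uint32_t total = 0;
    for (uint32_t i = 0; i < binding_count; i++)
        total += bindings[i].descriptor_count;
    return total;
}
```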
Lionel Landwerlin
035cc7a12d i965: perf: reduce i965 binary size
Performance metric numbers are calculated the following way :

   - out of the 256 bytes long OA reports, we accumulate the deltas
     into an array of uint64_t

   - the equations' generated code reads the accumulated uint64_t
     deltas and normalizes them for a particular platform

Our hardware is such that a number of counters in the OA reports
always return the same values (i.e. they're not programmable), and
they return the same values even across generations, and as a result a
number of equations are identical in different metric sets across
different generations.

Up to now we've kept the generated code of the equations separated in
different files (per generation/GT), and didn't apply any
factorization of the common equations. We could have made some
improvement by reusing equations within a given metrics file, but we
can go even further and reuse across generations (i.e. all files).

This change changes the code generation to emit a single file in which
we reuse equations emitted code based on the hash of equations'
strings.

Here are the savings in a meson build :

Before(.old)/after :
   $ du -h ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old
   43M	./build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   47M	./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old

   $ size build/src/mesa/drivers/dri/libmesa_dri_drivers.so build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old
       text   data          bss	     dec            hex filename
   13054002 409424	 671856	14135282	 d7aff2	build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   14550386 409552	 671856	15631794	 ee85b2	build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old

As a side comment here is the size of the drivers if we remove all of
the metrics from the build :

   $ du -sh build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   40M	build/src/mesa/drivers/dri/libmesa_dri_drivers.so

v2: Fix an issue with hashing of counter equations (Lionel)
    Build system rework (Emil)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (build system part)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 13:56:07 +00:00
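
The reuse of emitted equation code keyed on a hash of the equation string (035cc7a12d) might look roughly like the sketch below. It is written in C for illustration only; the real generator is the brw_oa.py script, and the hash function (FNV-1a here), the table layout and the equation strings used in any example are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_UNIQUE 64

/* FNV-1a, used here only as an example hash; the generator's actual
 * hash function is not stated in the commit message. */
static uint32_t
hash_string(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (uint8_t)*s;
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical table of equations for which code was already emitted. */
struct equation_table {
    const char *text[MAX_UNIQUE];
    uint32_t hash[MAX_UNIQUE];
    size_t count;
};

/* Returns the index of the function emitted for `eq`. A new entry is
 * added only when no identical equation string has been seen before,
 * so identical equations across metric sets share one function. */
static size_t
intern_equation(struct equation_table *t, const char *eq)
{
    uint32_t h = hash_string(eq);

    for (size_t i = 0; i < t->count; i++) {
        if (t->hash[i] == h && strcmp(t->text[i], eq) == 0)
            return i;
    }

    assert(t->count < MAX_UNIQUE);
    t->text[t->count] = eq;
    t->hash[t->count] = h;
    return t->count++;
}
```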
Lionel Landwerlin
e9a9e85948 i965: perf: fix a counter return type on hsw
The equation code computes a float (percentage) yet the return type
was an uint64_t.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 11:36:13 +00:00
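
Why the return type matters in e9a9e85948 can be shown with a small sketch. The equation below is a made-up percentage, not one of the generated hsw counters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical percentage equation: busy cycles as a share of total. */
static float
busy_percent(uint64_t busy, uint64_t total)
{
    assert(total != 0);
    return 100.0f * (float)busy / (float)total;
}

/* The same equation with an integer return type: the conversion
 * silently drops the fractional part of the percentage. */
static uint64_t
busy_percent_truncated(uint64_t busy, uint64_t total)
{
    return (uint64_t)busy_percent(busy, total);
}
```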
Tapani Pälli
604cac9f73 mesa: fix leaking ParameterValueOffset
==15115== 48 bytes in 1 blocks are definitely lost in loss record 16 of 66
==15115==    at 0x4C2EC15: realloc (vg_replace_malloc.c:785)
==15115==    by 0x8602C3E: _mesa_reserve_parameter_storage (prog_parameter.c:212)
==15115==    by 0x8602D1E: _mesa_add_parameter (prog_parameter.c:252)
==15115==    by 0x86032C4: _mesa_add_sized_state_reference (prog_parameter.c:384)
==15115==    by 0x8603324: _mesa_add_state_reference (prog_parameter.c:409)

Fixes: edded12376 "mesa: rework ParameterList to allow packing"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-20 13:25:07 +02:00
Daniel Stone
478fc2d2a1 dri3: Don't fail on version mismatch
The previous commit to make DRI3 modifier support optional breaks with
an updated server and old client.

Make sure we never set multibuffers_available unless we also support it
locally. Make sure we don't call stubs of new-DRI3 functions (or empty
branches) which will never succeed.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
2018-03-20 08:52:59 +00:00
Timothy Arceri
9a243eccae radv: don't lower indirects until after opts have run
Noticed while passing by. Not sure if it impacts anything, but
likely to impact GFX9 more than anything else since we lower
inputs, outputs and locals there.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 15:01:44 +11:00
Timothy Arceri
dfe2f19855 st/nir: fix atomic lowering for gallium drivers
i965 and gallium handle the atomic buffer index differently. It was
just by luck that the single piglit test for this was passing.

For gallium we use the atomic binding so that we match the handling
in st_bind_atomics().

On radeonsi this fixes the CTS test:
KHR-GL43.shader_storage_buffer_object.advanced-write-fragment

It also fixes tressfx hair rendering in Tomb Raider.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:29:53 +11:00
Timothy Arceri
632d5e97ef st/radeonsi: enable uniform packing in NIR backend
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:19:35 +11:00
Timothy Arceri
231333a20d st: add uniform packing support to lower_uniforms_to_ubo()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
9c51a7ea29 gallium: add packed uniform CAP
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
ffa4bbe466 st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker
This will only ever be used by gallium drivers so it probably doesn't
belong in the nir toolkit. Also we want to pass it some non NIR
things in the following patch.

To avoid regressions we wrap the lowering calls that have been moved
to st_glsl_to_nir with a quick hack so that they are only called for
radeonsi, we will replace the hack with a check for uniform packing
in a following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
a80cf442d9 st: add st_glsl_type_dword_size() helper
This will be used to support uniform packing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
5488166730 st/glsl_to_nir: add support for packed builtin uniforms
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
57ebab64c0 mesa: add _mesa_add_sized_state_reference() helper
This will be used for adding packed builtin uniforms.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
2377754329 mesa: add support propagate uniform support for packed uniforms
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
40711a7a60 mesa: allow for uniform packing when adding uniforms to param list
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
a2198d4fdb mesa: add packing support for setting uniform handles
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
6cfa15b803 mesa: add packing support for setting uniforms
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
4a7c5c079b mesa: create copy uniform to storage helpers
These will be used in the following patch to allow copying directly
to the param list when packing is enabled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
edded12376 mesa: rework ParameterList to allow packing
Currently everything is padded to 4 components. Making the list
more flexible will allow us to do uniform packing.

V2 (suggestions from Nicolai):
- always pass existing calls to _mesa_add_parameter() true for padd_and_align
- fix bindless param value offsets
- remove left over wip logic from pad and align code
- zero out param value padding
- whitespace fix

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
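
The difference between padding everything to 4 components and packing (edded12376) can be sketched as an offset computation. The function and flag names here are hypothetical and do not mirror the ParameterList code.

```c
#include <assert.h>
#include <stdbool.h>

/* Rounds a component count up to the next multiple of 4. */
static unsigned
align4(unsigned n)
{
    return (n + 3u) & ~3u;
}

/* Computes the start offset, in components, of each parameter and
 * returns the total storage used. With `pad_to_vec4` every parameter
 * starts on a 4-component boundary and occupies whole vec4 slots;
 * without it, parameters are laid out back to back. */
static unsigned
layout_params(const unsigned *sizes, unsigned count, bool pad_to_vec4,
              unsigned *offsets)
{
    unsigned next = 0;

    for (unsigned i = 0; i < count; i++) {
        assert(sizes[i] > 0);
        if (pad_to_vec4)
            next = align4(next);
        offsets[i] = next;
        next += pad_to_vec4 ? align4(sizes[i]) : sizes[i];
    }
    return next;
}
```

For sizes of 1, 3 and 2 components this gives 12 components of storage when padded and 6 when packed.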
Timothy Arceri
b13b9eb432 mesa: add PackedDriverUniformStorage const
Will be used to determine whether to take packing code paths or not.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Eric Anholt
00910e3057 broadcom/vc5: Don't annotate dumps with stale live intervals.
As you're debugging register allocation, you may have changed the
intervals and not recomputed yet.  Just skip the dump in that case.
2018-03-19 16:44:20 -07:00
Eric Anholt
facc3c6f58 broadcom/vc5: Add support for register spilling.
Our register spilling support is nice to have since vc4 couldn't at all,
but we're still very restricted due to needing to not spill during a TMU
operation, or during the last segment of the program (which would be nice
to spill a value of, when there's a long-lived value being passed through
with little modification from the start to the end).

We could do better by emitting unspills for the last-segment values just
before the last thrsw, since the last segment is probably not the maximum
interference area.

Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3
others.
2018-03-19 16:44:06 -07:00
Eric Anholt
271fc58ba1 broadcom/vc5: Remove redundant last_inst lookup.
The point was to get the MOV, which the MOV_dest already returned.
2018-03-19 16:42:59 -07:00
Eric Anholt
34dc64f627 broadcom/vc5: On QPU pack error, dump the instruction and return cleanly.
This is nice for debugging when you've made a bad instruction.
2018-03-19 16:42:59 -07:00
Eric Anholt
d721348dcd broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's.
This will let me do lowering late in compilation using the same
instruction builder as we use in nir_to_vir.
2018-03-19 16:42:59 -07:00
Eric Anholt
c81d681742 broadcom/vc5: Move the umul macro to a header.
Anywhere we want to multiply, we probably want this.
2018-03-19 16:42:59 -07:00
Eric Anholt
9e28c18cd1 broadcom/vc5: Correct the arg count of TIDX/EIDX. 2018-03-19 16:42:59 -07:00
Eric Anholt
55bf298333 broadcom/vc5: Re-do live variables after removing thrsws.
Otherwise our start/ends ips won't line up with the actual instructions.
2018-03-19 16:42:59 -07:00
Eric Anholt
c3a504f470 broadcom/vc5: Add a QPU helper for instructions using the TLB.
This will be used for detecting last thread segment in register spilling.
2018-03-19 16:42:59 -07:00
Eric Anholt
09c4dd1971 broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm().
These helpers will be used in register spilling to determine where to add
a last thrsw if needed, and might help refactor QPU scheduling.
2018-03-19 16:42:59 -07:00
Eric Anholt
407f21ef1b broadcom/vc5: The ldvpm signal is also a case of using the VPM.
The QPU scheduling code calling this function already separately checked
this signal.
2018-03-19 16:42:59 -07:00
Eric Anholt
4760040c09 broadcom/vc5: Extract v3d_qpu_writes_tmu() helper.
This will be reused in register spilling.
2018-03-19 16:42:59 -07:00
Dave Airlie
32791a0502 radv: don't export NULL layer.
We have some cases where in subpass we want the layer but having
it be 0 and loaded in the frag shader without the vertex shader
exporting it is fine.

So don't export the layer if we don't have a value to put in it.

Fixes: d4c74aed7a (radv/multiview: mark layer_input if we have input attachments.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 21:36:48 +00:00
Marek Olšák
f674b50d0e mesa: adjust incorrect comment in texture_buffer_range 2018-03-19 16:56:17 -04:00
Ian Romanick
6aeaa7d363 nir: Don't compare b2f or b2i with zero
All of the shaders that had loops changed were in Tomb Raider.  The one
shader that lost SIMD16 is one of those.

Skylake
total instructions in shared programs: 14391653 -> 14390468 (<.01%)
instructions in affected programs: 111891 -> 110706 (-1.06%)
helped: 501
HURT: 0
helped stats (abs) min: 1 max: 155 x̄: 2.37 x̃: 1
helped stats (rel) min: 0.05% max: 21.54% x̄: 1.61% x̃: 1.01%
95% mean confidence interval for instructions value: -3.23 -1.50
95% mean confidence interval for instructions %-change: -1.77% -1.45%
Instructions are helped.

total cycles in shared programs: 532793024 -> 532776598 (<.01%)
cycles in affected programs: 987682 -> 971256 (-1.66%)
helped: 348
HURT: 41
helped stats (abs) min: 1 max: 3074 x̄: 54.91 x̃: 18
helped stats (rel) min: 0.05% max: 32.24% x̄: 3.36% x̃: 1.68%
HURT stats (abs)   min: 1 max: 422 x̄: 65.39 x̃: 24
HURT stats (rel)   min: 0.09% max: 39.29% x̄: 9.50% x̃: 2.02%
95% mean confidence interval for cycles value: -64.08 -20.38
95% mean confidence interval for cycles %-change: -2.78% -1.23%
Cycles are helped.

total loops in shared programs: 4854 -> 4829 (-0.52%)
loops in affected programs: 27 -> 2 (-92.59%)
helped: 18
HURT: 0

LOST:   1
GAINED: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 13:52:35 -07:00
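
The title of 6aeaa7d363 suggests that a comparison of a b2f or b2i result with zero can be replaced by the boolean itself or its negation. A sketch of the underlying identity, assuming b2f yields 1.0 or 0.0 and b2i yields 1 or 0 (the helpers below are plain C stand-ins, not NIR opcodes):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the boolean-to-float and boolean-to-int conversions. */
static float b2f(bool a) { return a ? 1.0f : 0.0f; }
static int b2i(bool a) { return a ? 1 : 0; }

/* Checks that comparing a converted boolean with zero gives the same
 * answer as the boolean itself (for !=) or its negation (for ==). */
static void
check_zero_compare_identities(bool a)
{
    assert((b2f(a) != 0.0f) == a);
    assert((b2f(a) == 0.0f) == !a);
    assert((b2i(a) != 0) == a);
    assert((b2i(a) == 0) == !a);
}
```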
Dave Airlie
e8d9b7ab02 radv: lower constant initializers on output variables earlier
If a shader only writes to an output via a constant initializer we
need to lower it before we call nir_remove_dead_variables so that
this pass sees the stores from the initializer and doesn't kill the
output.

Fixes test failures in new work-in-progress CTS tests:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float

This is ported from anv:
99b57daf4a anv/pipeline: lower constant initializers on output variables earlier
from Iago Toral Quiroga <itoral@igalia.com>

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:40 +00:00
Dave Airlie
032014ac01 radv/query: handle multiview timestamp queries.
For each view bit we need to emit a timestamp query.

Fixes: dEQP-VK.multiview.queries*

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:14 +00:00
Dave Airlie
32b4f3c38d radv/query: handle multiview queries properly. (v3)
For multiview we need to emit a number of sequential queries
depending on the view mask.

This avoids dEQP-VK.multiview.queries.15 waiting forever
on the CPU for query results that are never coming.

We only really want to emit one query,
and the rest should be blank (amdvlk does the same),
so we emit begin/end pairs for all the others except
the first query.

v2: fix tests
v3: split out patch.

Fixes: dEQP-VK.multiview.queries*
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:09 +00:00
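
The scheme described in 32b4f3c38d, one real query followed by blank begin/end pairs for the remaining views, can be sketched as follows. The enum and function are hypothetical; the sketch only plans which kind of query each set view bit gets and does not emit any hardware commands.

```c
#include <assert.h>
#include <stdint.h>

enum query_kind { QUERY_REAL, QUERY_BLANK };

/* Fills `out` with one entry per set bit in `view_mask` and returns
 * the number of sequential queries. Only the first carries results;
 * the others are blank so that every query slot still becomes
 * available and nothing waits forever on it. */
static unsigned
plan_multiview_queries(uint32_t view_mask, enum query_kind out[32])
{
    /* This sketch covers only the multiview case. */
    assert(view_mask != 0);

    unsigned n = 0;
    /* `bits &= bits - 1` clears the lowest set bit on each step. */
    for (uint32_t bits = view_mask; bits != 0; bits &= bits - 1) {
        out[n] = (n == 0) ? QUERY_REAL : QUERY_BLANK;
        n++;
    }
    return n;
}
```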
Dave Airlie
4034dc5c72 radv/query: split out begin/end query emission
This just splits out the begin/end query hw emissions,
it makes it easier to add multiview support for queries.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:05 +00:00
Dave Airlie
d4c74aed7a radv/multiview: mark layer_input if we have input attachments.
This fixes:
dEQP-VK.multiview.input_attachments*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:26:39 +00:00
Caio Marcelo de Oliveira Filho
f6338c3b85 anv/pipeline: set active_stages early
Since the intermediate states of active_stages are not used,
i.e. active_stages is read only after all stages were set into it,
just set its value before compiling the shaders.

This will allow us to conditionally run certain passes based on what
other shaders are being used, e.g. a certain pass might only be
applicable to the vertex shader if there's no geometry or tessellation
shader being used.

v2: Use vk_to_mesa_shader_stage. (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-19 18:00:49 +00:00
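
The idea in f6338c3b85 of computing the stage mask up front, so that a pass can ask which other stages are present, might be sketched like this. The enum, both functions and the example predicate are hypothetical and are not the anv pipeline code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage enumeration; not the Mesa gl_shader_stage values. */
enum stage {
    STAGE_VERTEX,
    STAGE_TESS_CTRL,
    STAGE_TESS_EVAL,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT,
    STAGE_COUNT
};

/* Builds the complete stage bitmask before any shader is compiled. */
static uint32_t
compute_active_stages(const enum stage *stages, unsigned count)
{
    uint32_t active = 0;

    for (unsigned i = 0; i < count; i++) {
        assert(stages[i] < STAGE_COUNT);
        active |= 1u << stages[i];
    }
    return active;
}

/* Example of a decision that needs the full mask: true when a vertex
 * shader is present and no tessellation or geometry stage follows it. */
static bool
vs_is_last_geometry_stage(uint32_t active)
{
    const uint32_t later = (1u << STAGE_TESS_CTRL) |
                           (1u << STAGE_TESS_EVAL) |
                           (1u << STAGE_GEOMETRY);

    return (active & (1u << STAGE_VERTEX)) != 0 && (active & later) == 0;
}
```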
Caio Marcelo de Oliveira Filho
318073ce66 anv/pipeline: fail if TCS/TES compile fail
v2: Add Fixes tag. (Lionel)

Fixes: e50d4807a3 ("anv: Compile TCS/TES shaders.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-19 18:00:49 +00:00
Jordan Justen
2ed288363f main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED
This change allows the disk shader cache to work with programs loaded
with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set,
then they try to use the shader cache.

Since the program loaded by ProgramBinary is similar to loading the
shader from the disk cache, this is probably more appropriate.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
d2b74ca2b5 i965: Allow disk shader cache usage with LINKING_SUCCESS status
Currently, we only look in the disk shader cache if we see that the
shader program is in the cache during the link step.

If the shader cache entry isn't found during the program link, there
are still some (fairly unlikely) scenarios where later it might be
useful to search the cache for gen binary programs.

1. If the cache evicts the serialized glsl cache, there might still be
   valid gen program entries in the disk cache.

2. If two applications are running in parallel, then it is possible
   that one may write out the cached gen program item which the other
   application can then make use of.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
b5baaee0d6 glsl/serialize: Save shader program metadata sha1
When the shader cache is used, this can be generated. In fact, the
shader cache uses this sha1 to look up the serialized GL shader
program.

If a GL shader program is restored with ProgramBinary, the shaders are
not available, and therefore the correct sha1 cannot be generated. If
this is restored, then we can use the shader cache to restore the
binary programs to the program that was loaded with ProgramBinary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
9b473f9e3c glsl: Remove api_enabled tracking for transform feedback
We used this to prevent usage of the disk shader cache when transform
feedback was enabled via the GL API. This is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
fc4a7aaa82 i965: Allow disk shader cache usage with transform feedback
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
6d830940f7 glsl/shader_cache: Allow shader cache usage with transform feedback
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Suggested-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jose Fonseca
e10dc12f6f scons: need to split CC or things might fail
We've seen this fail internally.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-03-19 16:41:57 +01:00
Jordan Justen
d07a49fb18 i965: Add INTEL_DEBUG stages support for disk shader cache
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-19 00:07:29 -07:00
Dave Airlie
8f052a3e25 radv: handle exporting view index to fragment shader. (v1.1)
The fragment shader was trying to read this, but nothing
was exporting it from the vertex shader. This handles
it like the prim id export.

Fixes:
dEQP-VK.multiview.secondary_cmd_buffer.*
dEQP-VK.multiview.index.fragment_shader.*

v1.1: updated to use 0x1 (Samuel)

Fixes: e3265c10c8 (radv: Implement multiview draws.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-19 01:20:00 +00:00
Axel Davy
dbc24835d7 st/nine: Fix non-invertible matrix check
There was a missing absolute value when
checking if the determinant was big enough.

Fixes: https://github.com/iXit/Mesa-3D/issues/292

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:46 +01:00
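
The missing absolute value fixed in dbc24835d7 can be illustrated with a small sketch. The threshold and the 2x2 determinant are assumptions chosen for brevity; st/nine works on larger matrices and its actual threshold is not given here.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical threshold; the value used by st/nine is not given here. */
#define DET_EPSILON 1e-6f

/* Determinant of the 2x2 matrix [[a, b], [c, d]]. */
static float
det2(float a, float b, float c, float d)
{
    return a * d - b * c;
}

/* A matrix is treated as invertible only if |det| is large enough.
 * Without fabsf(), every matrix with a negative determinant would be
 * rejected even though it is perfectly invertible. */
static bool
is_invertible(float det)
{
    assert(det == det); /* a NaN determinant is a caller bug */
    return fabsf(det) >= DET_EPSILON;
}
```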
Axel Davy
f61e9a958b st/nine: Fixes warning about implicit conversion
Makes the conversion explicit.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102542

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:42 +01:00
Axel Davy
71eae7940e st/nine: Fix bad tracking of vs textures for NINESBT_ALL
Stateblocks with NINESBT_ALL should track all textures.
For better performance they have a faster path which
copies all the required.

This path was only tracking ps textures.

Fixes: https://github.com/iXit/Mesa-3D/issues/303

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:36 +01:00
Axel Davy
76fa1f730b st/nine: Fix bad tracking of bound vs textures
An incorrect formula was used to compute bound_samplers_mask_vs.
Since s is always above 8 for vs and the variable is encoded on 8 bits,
it was always 0.
This resulted in committing the samplers every call when
there was at least one texture read in the vs shader.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-18 22:53:32 +01:00
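
Why the mask in 76fa1f730b was always 0 can be seen in a minimal sketch. The base index of 16 and both helper functions are assumptions for illustration; they are not the st/nine macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: vertex-shader samplers start at global index 16. */
#define VS_SAMPLER_BASE 16u

/* Buggy form: shifting by the global index pushes every VS bit past
 * bit 7, so the truncation to uint8_t always yields 0. */
static uint8_t
vs_mask_bit_buggy(unsigned s)
{
    assert(s >= VS_SAMPLER_BASE && s < VS_SAMPLER_BASE + 8u);
    return (uint8_t)(1u << s);
}

/* Fixed form: rebase the index so the bit lands inside the 8-bit mask. */
static uint8_t
vs_mask_bit_fixed(unsigned s)
{
    assert(s >= VS_SAMPLER_BASE && s < VS_SAMPLER_BASE + 8u);
    return (uint8_t)(1u << (s - VS_SAMPLER_BASE));
}
```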
Grazvydas Ignotas
e1b2e5667c radv: make vk_format_description structures static
No need to bother the linker about them.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-17 18:53:21 +02:00
Grazvydas Ignotas
331141e87e radv: fix stale comment in generated vk_format_table.c
It seems to be a leftover from u_format_table.py.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-17 18:53:21 +02:00
Eric Anholt
7db1c09d12 anv: Silence warning about heap_size.
We only get VK_SUCCESS if it was initialized, but apparently my compiler
doesn't track that far.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-16 15:10:05 -07:00
Eric Anholt
d25640c3a3 i965: Silence compiler warning about promoted_constants.
We only have a cfg != NULL if we went through one of the paths that set
it, but my compiler doesn't figure that out.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6411defdcd ("intel/cs: Re-run final NIR optimizations for each SIMD size")
2018-03-16 15:09:55 -07:00
Eric Anholt
9f89452ea3 anv: Silence compiler warnings about uninitialized bind_offset.
This is a legitimate warning: if anv's blorp_alloc_binding_table() throws
an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently
continue to use this undefined value.  The rest of this code doesn't seem
very allocation-error-proof, though, either.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-16 15:09:47 -07:00
Matt Turner
f3833f1ca7 intel/compiler: Use gen_get_device_info() in test_eu_validate
Previously the unit test filled out a minimal devinfo struct. A previous
patch caused the test to begin assert failing because the devinfo was
not complete. Avoid this by using the real mechanism to create devinfo.

Note that we have to drop icl from the table, since we now rely on the
name -> PCI ID translation done by gen_device_name_to_pci_device_id(),
and ICL's PCI IDs are not upstream yet.

Fixes: f89e735719 ("intel/compiler: Check for unsupported register sizes.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-16 13:20:21 -07:00
Matt Turner
54db78b196 intel: Add cfl to gen_device_name_to_pci_device_id()
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-16 13:20:21 -07:00
Rob Clark
bc5001325b meson+dri3: allow building against older xcb (v3)
Similar to previous patch, make xcb 1.13 optional.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-16 16:18:42 -04:00
Dave Airlie
7aeef2d4ef dri3: allow building against older xcb (v3)
I'm not sure everyone wants to be updating their dri3 in a forced
march setting; this allows a nicer approach, especially when you want
to build on distros that aren't brand new.

I'm sure there are plenty of ways this patch could be cleaner,
and I've also not built it against an updated dri3.

For meson I've just left it alone, since if you are using meson
you probably don't mind xcb updates, and if you are using meson
you can fix this better than me.

v3: just don't put a version in for dri3/present without
modifiers, should allow building with 1.11 as well

(feel free to supply meson followups)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:19:45 -04:00
Marek Olšák
f099c3aef1 r600: consolidate PIPE_BIND_SHARED/SCANOUT handling
(Ported from radeonsi commit f70f6baaa3)

Allows cached BOs to be reused in more cases.

Bugzilla: https://bugs.freedesktop.org/105171
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2018-03-16 17:31:28 +01:00
Rafael Antognolli
f89e735719 intel/compiler: Check for unsupported register sizes.
Make sure we don't emit 64 bit types if the hardware doesn't support
them.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-16 09:27:16 -07:00
Jason Ekstrand
315ee5faec loader: Include include/drm-uapi in the autotools build
We're already including it in the meson build.  This fixes build issues
on systems which have a drm_fourcc.h that doesn't have modifiers.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-16 08:50:07 -07:00
Wu, Zhongmin
5fc21c6044 egl/android: Implement eglSwapInterval for Android.
Implement eglSwapInterval for the Android platform to
enable the async mode for some GFX benchmarks such as
Daimler C217, CityBench.

Results of the dEQP-EGL.*swap_interval tests

'dEQP-EGL.functional.query_config.get_config_attrib.max_swap_interval'..
'dEQP-EGL.functional.query_config.get_config_attrib.min_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_only.max_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_only.min_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_and_sort.max_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_and_sort.min_swap_interval'..
'dEQP-EGL.functional.negative_api.swap_interval'..

 Test run totals:
   Passed:        7/7 (100.0%)
   Failed:        0/7 (0.0%)
   Not supported: 0/7 (0.0%)
   Warnings:      0/7 (0.0%)

Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
[Emil Velikov: polish inline comment, add dEQP stats, s/dpy/disp/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-16 13:58:56 +00:00
Emil Velikov
3a9fb4f7ad st/mesa: simplify st_init_limits() via tgsi_processor_to_shader_stage
Reuse the tgsi helper and remove a bunch of duplicated code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:49:16 +00:00
Emil Velikov
f7f95310f0 tgsi: move tgsi_processor_to_shader_stage() to a header
This way we can utilise it with later patches.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:48:46 +00:00
Emil Velikov
9fa1d822bf egl/dri2: move wayland header inclusion where applicable
Instead of indirectly pulling the wayland headers everywhere, use
forward declarations and #include only as needed.

Should effectively fix build errors like the following:

make[5]: Entering directory
'/.../src/gallium/state_trackers/omx/tizonia'
   CC       h264dprc.lo
In file included from h264dprc.c:45:0:
.../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error:
wayland/wayland-egl/wayland-egl-backend.h: No such file or directory
  #include "wayland/wayland-egl/wayland-egl-backend.h"

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
2018-03-16 13:47:59 +00:00
Emil Velikov
d091c9c4cf vulkan/wsi/x11: correct DRI3 version in comment
During development the version was bumped, yet the comment did not get
an update.

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-16 13:47:52 +00:00
Emil Velikov
19ec817756 vulkan/wsi/x11: use ARRAY_SIZE where applicable
Use the handy macro instead of hard coded numbers.
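
As a minimal sketch of the idea (not the code touched by this commit), an ARRAY_SIZE-style macro derives the element count from the array itself, so the loop bound cannot drift from the definition. The macro is defined locally here for illustration, and the table name and values are hypothetical placeholders.

```c
#include <stddef.h>

/* Element count of a true array (not a pointer). Defined locally for
 * illustration only. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table; the values are placeholders. */
static const int sketch_versions[] = { 1, 0, 1, 2 };

/* Sum all entries without hard-coding the array length. */
static int sketch_sum_versions(void)
{
   int sum = 0;
   for (size_t i = 0; i < ARRAY_SIZE(sketch_versions); i++)
      sum += sketch_versions[i];
   return sum;
}
```

Adding or removing an entry in the table needs no change to the loop.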

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-16 13:45:47 +00:00
Juan A. Suarez Romero
705a6446b4 mesa: RGB9_E5 invalid for CopyTexSubImage* in GLES
According to OpenGL ES 3.2, section 8.6, CopyTexSubImage* should return
an INVALID_OPERATION if the internalformat of the texture is RGB9_E5.

This fixes
dEQP-GLES31.functional.debug.negative_coverage.*.copytexsubimage2d_texture_internalformat.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-16 12:49:16 +00:00
Christian Gmeiner
5e51f72374 etnaviv: remove superfluous \n from DBG(..) callers
The DBG(..) macro appends a \n already so there is no
need to do it twice.
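
To illustrate why the extra \n is redundant, here is a minimal sketch of a debug helper that appends the newline itself. The function sketch_dbg() is hypothetical and writes into a caller-supplied buffer so the result can be inspected; it is not the etnaviv DBG(..) macro.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a DBG-style helper: formats the message and
 * appends exactly one '\n', so callers should not add their own. */
static int sketch_dbg(char *out, size_t size, const char *fmt, ...)
{
   va_list ap;
   va_start(ap, fmt);
   int n = vsnprintf(out, size, fmt, ap);
   va_end(ap);

   /* Append the newline only if the message and terminator still fit. */
   if (n >= 0 && (size_t)n + 1 < size) {
      out[n] = '\n';
      out[n + 1] = '\0';
      n++;
   }
   return n;
}
```

With such a helper, a caller passing "x=%d\n" ends up with two newlines, which is the pattern this commit removes.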

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-03-16 11:41:27 +01:00
Samuel Pitoiset
e96a1d27dc radv: run nir_opt_move_load_ubo
Polaris10:
SGPRS: 108560 -> 107856 (-0.65 %)
VGPRS: 74576 -> 74520 (-0.08 %)
Spilled SGPRs: 7375 -> 7113 (-3.55 %)
Code Size: 4273464 -> 4274364 (0.02 %) bytes
Max Waves: 9434 -> 9446 (0.13 %)

Vega10:
Totals from affected shaders:
SGPRS: 108264 -> 107576 (-0.64 %)
VGPRS: 69068 -> 69000 (-0.10 %)
Spilled SGPRs: 7221 -> 6959 (-3.63 %)
Code Size: 3800796 -> 3801496 (0.02 %) bytes
Max Waves: 10687 -> 10709 (0.21 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-16 09:58:19 +01:00
Samuel Pitoiset
af355aaa07 nir: add nir_opt_move_load_ubo() optimization pass
This pass moves load UBO operations just before their first use,
loosely based on nir_opt_move_comparisons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-16 09:50:31 +01:00
Dave Airlie
9d0d806332 radv: drop geometry stride user sgpr.
This removes the other geometry specific user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:21 +00:00
Dave Airlie
6f051549c3 radv: get rid of geometry user sgpr for num entries.
This drops one of the geometry specific user sgprs,
we can work this out at compile time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:17 +00:00
Dave Airlie
9188bd78d7 radv: migrate lds size calculations to shader gen.
This moves the lds_size calcs into the shader so we have all
the size stuff in one file.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:12 +00:00
Dave Airlie
384aced65e radv: drop scanning the tess shader in the nir code.
This drops the now unneeded scanning and results in favour
of the ones in the info.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:08 +00:00
Dave Airlie
f50d520acf radv: use num_patches output from tcs shader.
Instead of recalculating the value, use the shader calculated value.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:05 +00:00
Dave Airlie
bf9a0ea853 radv/tess: remove last chunk of tess sgprs
This removes the last TES-specific user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:01 +00:00
Dave Airlie
6db44d6a8c radv: pass num_patches to tes from tcs
TES needs num_patches to do some of the calculations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:58 +00:00
Dave Airlie
010d055aae radv: drop tess offchip layout for tcs.
This removes the last TCS specific user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:54 +00:00
Dave Airlie
ee31cff856 radv: drop tcs_out_offsets
Move all calculations to shader generation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:47 +00:00
Dave Airlie
b0460bbf1c radv: drop tcs_out_layout
Move all calculations to shader generation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:43 +00:00
Dave Airlie
6adf99165c radv/tess: drop tcs_in_layout setting completely.
Inline all calcs at shader creation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:37 +00:00
Dave Airlie
f343d11ae7 radv: drop ls_out_layout const.
We can precalculate input_vertex_size at compile time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:32 +00:00
Dave Airlie
d89b16b7b9 radv/shader_info: start gathering tess output info (v2)
This gathers the ls outputs written by the vertex shader,
and the tcs outputs, these are needed to calculate certain
tcs parameters.

These have to be separate for combined gfx9 shaders.

This is a bit pessimistic compared to the nir pass,
as we don't work out the individual slots for tcs outputs,
but I actually think it should be fine to just mark the whole
thing used here.

v2: move to radv, handle clip dist (Samuel),
    handle compacts and patches properly.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:23 +00:00
Dave Airlie
2012dae19a radv: migrate unique index info shader info (v2)
This just moves this function to an inline so the shader_info
pass can use it.

v2: use inline (Samuel)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:19 +00:00
Samuel Pitoiset
f02f1ad13f Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"
This reverts commit f314a532fd.

This appears to introduce some blinking textures in UT2004. Not
sure exactly what's the root cause because we don't have much
information about the issue.

Anyway, this was just a micro optimization that actually breaks,
at least, one app almost one year later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105436
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-15 21:32:52 +01:00
Lionel Landwerlin
51783f3e7d anv: silence unused variable warning
Fixes: 59b0ea0c74 ("anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:56:26 +00:00
Lionel Landwerlin
b5b56f91f5 i965: silence unused function warning
[123/227] Compiling C object 'src/mesa/drivers/dri/i965/libi965_gen110@sta/genX_blorp_exec.c.o'.
../src/mesa/drivers/dri/i965/genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:56:23 +00:00
Lionel Landwerlin
0f544a3c51 anv: silence unused function warning on gen11
[84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'.
../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:55:42 +00:00
Dylan Baker
2a7027f79a meson: fix pipe-loaders after omx changes
with_gallium_omx used to be a boolean, but now it's a string. That means
it needs to be compared to 'disabled' instead of false.

CC: Rob Clark <robdclark@gmail.com>
Fixes: 34e852d5b5
       ("meson: Re-add auto option for omx")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-15 10:02:32 -07:00
Dylan Baker
9bd7a6f6f0 meson: require amdgpu >= 2.4.91
the meson equivalent of f8773edb0a

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-15 10:00:02 -07:00
Marek Olšák
f8773edb0a configure.ac: require libdrm_amdgpu 2.4.91
Since 2.4.90 is problematic, just ask for the next version.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-15 12:44:40 -04:00
Marek Olšák
5d0acff39e configure.ac: blacklist libdrm 2.4.90
Cc: 18.0 17.3 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-15 12:44:37 -04:00
Samuel Pitoiset
16ecf037f9 radv: dump LLVM IR when a hang is detected
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:07 +01:00
Samuel Pitoiset
81818662a5 radv: record LLVM IR when debugging shaders
If AMD_shader_info or RADV_TRACE_FILE is used we might need to
keep track of the LLVM IR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:03 +01:00
Samuel Pitoiset
d07edf5fdf radv: add dump_shader to the NIR compiler options
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:00 +01:00
Samuel Pitoiset
50fcca328c radv: pass the NIR compiler options to ac_compile_llvm_module()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:58 +01:00
Samuel Pitoiset
14c27c2511 radv: print some information when RADV_TRACE_FILE is set
Just to be sure all options are enabled when trying to generate
a hang report.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:54 +01:00
Samuel Pitoiset
5be2757c35 radv: only display options that are enabled
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:52 +01:00
Eric Engestrom
6332893594 mailmap: Use Eric Engestrom's personal email address
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-15 12:03:41 +00:00
Alejandro Piñeiro
50767214a7 spirv/radv: add AMD_gcn_shader capability, remove current extensions
So now spirv_to_nir uses the capability instead of the extension.
Note that what we are really doing here is treating
SPV_AMD_gcn_shader like the other supported extensions; it is not
the first SPV extension supported. For example, the draw_parameters
capability indicates whether the SPV_KHR_shader_draw_parameters
extension is supported.

This could be seen as counter-intuitive, and it might seem easier
to define which extensions are supported and base our checks on
that, but we need to take into account that some capabilities are
optional in core, and others come from new extensions.

Also this commit would make the implementation of ARB_spirv_extensions
easier.

v2: AMD_gcn_shader capability renamed to gcn_shader (Daniel Schürmann)

Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 12:08:25 +01:00
Samuel Iglesias Gonsálvez
adf58e59d3 spirv: update arguments for vtn_nir_alu_op_for_spirv_opcode()
We no longer need the source and destination data types, just
their bit sizes.

v2:
- Use glsl_get_bit_size () instead (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-15 08:56:15 +01:00
Samuel Iglesias Gonsálvez
ce2fd87056 spirv: fix the translation of SPIR-V conversion opcodes to NIR
Some SPIR-V opcodes (like UConvert and SConvert) have expectations
of the output that don't depend on the operands' data type.
Generalize the solution for all of them.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-15 08:51:01 +01:00
Mathias Fröhlich
98f35ad63c vbo: Correctly handle source arrays in vbo_split_copy.
The original approach optimized away a few too many fields.
Reestablish the pointer into the original array and correctly feed that
one.

Reviewed-by: Brian Paul <brianp@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105471
Fixes: 64d2a20480
    mesa: Make gl_vertex_array contain pointers to first order VAO members.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-15 06:11:57 +01:00
Apple SWE
361f79c97f sched.h needs to be imported on Darwin/OSX targets.
sched_yield is used but the include reference on Darwin is missing. This patch
conditionally guards on Darwin/OSX to import sched.h first.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-14 22:08:34 -07:00
Apple SWE
67f27b1e18 Add processor topology calculation implementation for Darwin/OSX targets.
The implementation for bootstrapping SWR on Darwin targets is based on the Linux version.
Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the
physical identifiers, processor identifiers, core counts and thread-processor affinities.

With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and
llvmpipe.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-14 22:08:34 -07:00
Dave Airlie
4b15b5e803 virgl: resize resource bo allocation if we need to.
This fixes an illegal command buffer on the host seen with
piglit arb_internalformat_query2-max-dimensions

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-15 12:26:39 +10:00
Mario Kleiner
c1e47a3c1f nv50,nvc0: Support BGRX1010102 and RGBX1010102 for sampling.
Add them as usable for textures, so they can be used by
Wayland drm in 10 bpc mode and for X11 compositing under
GLX and EGL. We need these formats to be supported at
least for sampling, otherwise GLX_texture_from_pixmap
and the equivalent EGL image extension won't work with
X11 drawables of depth 30 and just display an all black
window.

Do not expose these formats as renderable, and thereby
not as a fbconfig/EGLConfig/Visual, as NVidia hw does
not support 10 bpc unorm formats without alpha channel.

Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing,
and under Wayland+Weston drm backend with a Tesla and
Pascal gpu.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-03-14 21:41:27 -04:00
Thomas Helland
03e37ec6d7 util: Use set_foreach instead of rolling our own
This follows the same pattern as in the hash_table.

Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>
2018-03-14 20:03:57 +01:00
Thomas Helland
5f129c05e6 glsl: Use hash table cloning in copy propagation
Walking the whole hash table, inserting entries by hashing them first
is just a really bad idea. We can simply memcpy the whole thing.
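
The idea can be sketched as follows, assuming a simple open-addressing table whose slots live in one contiguous array. All names here (sketch_entry, sketch_table, sketch_table_clone) are hypothetical and do not reproduce Mesa's hash_table layout; the point is only that a clone can copy the slot array wholesale instead of rehashing every key.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical open-addressing table: all slots sit in one array. */
struct sketch_entry {
   uint32_t hash;
   const void *key;
   void *data;
};

struct sketch_table {
   uint32_t size;     /* number of slots, assumed > 0 */
   uint32_t entries;  /* number of occupied slots */
   struct sketch_entry *table;
};

/* Clone by copying the slot array; no key is rehashed or reinserted. */
static struct sketch_table *sketch_table_clone(const struct sketch_table *src)
{
   struct sketch_table *dst = malloc(sizeof(*dst));
   if (!dst)
      return NULL;

   *dst = *src;
   dst->table = malloc(src->size * sizeof(*dst->table));
   if (!dst->table) {
      free(dst);
      return NULL;
   }

   memcpy(dst->table, src->table, src->size * sizeof(*dst->table));
   return dst;
}
```

Because slot positions depend only on the hash and the table size, the copied array is a valid table as-is; keys and data pointers are shared with the source, not duplicated.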

V2: Remove leftover creation of acp in two places

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 19:52:02 +01:00
Thomas Helland
6baaf4291b util: Implement a hash table cloning function
V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 19:52:01 +01:00
Guillaume Charifi
388ed47081 st/mesa: Factorize duplicate code in st_BlitFramebuffer()
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-14 14:46:51 -04:00
Dylan Baker
7dd261ac50 autotools: add -I/src/egl to tizonia
This fixes the following build breakage:

make[5]: Entering directory
'/mnt/sdc1/Gits/mesa/src/gallium/state_trackers/omx/tizonia'
   CC       h264dprc.lo
In file included from h264dprc.c:45:0:
../../../../../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error:
wayland/wayland-egl/wayland-egl-backend.h: No such file or directory
  #include "wayland/wayland-egl/wayland-egl-backend.h"
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

meson got the same fix in 7598dedfde.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 11:23:19 -07:00
Dylan Baker
848f2b6e31 Revert "Add processor topology calculation implementation for Darwin/OSX targets."
This reverts commit de0d10db93.

This breaks the build on at least Linux, probably other non-apple
platforms.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-14 09:30:17 -07:00
Dylan Baker
0f30c80932 Revert "sched.h needs to be imported on Darwin/OSX targets."
This reverts commit 9dc5063262.

This breaks the build on at least Linux, probably other non-apple
platforms.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-14 09:28:58 -07:00
Karol Herbst
b617bfcccf compiler: int8/uint8 support
OpenCL kernels also have int8/uint8.

v2: remove changes in nir_search as Jason posted a patch for that

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-03-14 10:08:42 -04:00
Alex Smith
fcf267ba08 radv: Fix CmdCopyImage between uncompressed and compressed images
From the spec:

    "When copying between compressed and uncompressed formats the
     extent members represent the texel dimensions of the source
     image and not the destination."

However, as per 7b890a36, we must still use the destination image type
when clamping the extent so that we copy the correct number of layers
for 2D to 3D copies.

Fixes: 7b890a36 "radv: Fix vkCmdCopyImage for 2d slices into 3d Images"
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-14 09:59:21 +00:00
Samuel Pitoiset
38f34117dd radv: fix vkGetDeviceQueue2() when create flags don't match
This fixes CTS:
dEQP-VK.api.device_init.create_device_queue2_unmatched_flags

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@gmail.com>
2018-03-14 09:53:42 +01:00
Neil Roberts
25a966a23d spirv: Handle doubles when multiplying a mat by a scalar
The code to handle mat multiplication by a scalar tries to pick either
imul or fmul depending on whether the matrix is float or integer.
However it was doing this by checking whether the base type is float.
This was making it choose the int path for doubles (and presumably
float16s).
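
A minimal sketch of the pitfall, using hypothetical enums rather than the real glsl/SPIR-V type helpers: comparing against a single float base type sends doubles and float16s down the integer path, whereas checking the whole floating-point family does not.

```c
/* Hypothetical base types and multiply opcodes, for illustration only. */
enum sketch_base_type {
   SKETCH_TYPE_INT,
   SKETCH_TYPE_UINT,
   SKETCH_TYPE_FLOAT16,
   SKETCH_TYPE_FLOAT,
   SKETCH_TYPE_DOUBLE,
};

enum sketch_mul_op { SKETCH_IMUL, SKETCH_FMUL };

/* Mirrors the described bug: only 32-bit float picks fmul. */
static enum sketch_mul_op sketch_pick_mul_buggy(enum sketch_base_type t)
{
   return t == SKETCH_TYPE_FLOAT ? SKETCH_FMUL : SKETCH_IMUL;
}

/* Treats every floating-point base type as float-like. */
static enum sketch_mul_op sketch_pick_mul_fixed(enum sketch_base_type t)
{
   switch (t) {
   case SKETCH_TYPE_FLOAT16:
   case SKETCH_TYPE_FLOAT:
   case SKETCH_TYPE_DOUBLE:
      return SKETCH_FMUL;
   default:
      return SKETCH_IMUL;
   }
}
```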

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:43:33 +01:00
Iago Toral Quiroga
1a0aba7216 anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands
af5f2322d0 addressed this for extension commands, but the spec mandates
this behavior also for core API commands. From the Vulkan spec,
Table 2. vkGetDeviceProcAddr behavior:

device     pname                            return
----------------------------------------------------------
(..)
device     core device-level command        fp
(...)

See that it specifically states "device-level".

Since the vk.xml file doesn't state if core commands are instance or
device level, we identify device level commands as the ones that take a
VkDevice, VkQueue or VkCommandBuffer as their first parameter.
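
The classification rule above can be sketched as a small predicate. The helper name is hypothetical, and the real check lives in the entrypoint generator rather than in C; this only illustrates the rule on the type name of a command's first parameter.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: a command counts as device-level when its first
 * parameter is a VkDevice, VkQueue or VkCommandBuffer. */
static bool sketch_is_device_level(const char *first_param_type)
{
   return strcmp(first_param_type, "VkDevice") == 0 ||
          strcmp(first_param_type, "VkQueue") == 0 ||
          strcmp(first_param_type, "VkCommandBuffer") == 0;
}
```

Under this rule a command taking a VkInstance or VkPhysicalDevice first is treated as instance-level, so vkGetDeviceProcAddr would return NULL for it.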

Fixes test failures in new work-in-progress CTS tests.

Also see the public issue:
https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323

v2:
  - Include reference to github issue (Emil)
  - Rebased on top of Vulkan 1.1 changes.

v3:
  - Remove the not in the condition and switch the then/else cases (Jason)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:09:15 +01:00
Iago Toral Quiroga
a631575ff4 anv/entrypoints: dispatches to VkQueue are device-level
v2:
  - Add trampoline functions (Jason)
  - Add an assertion for unhandled trampoline cases

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:09:15 +01:00
Dave Airlie
3b0f2081b5 radv: drop assert on bindingDescriptorCount > 0
The spec is pretty clear that this can be 0, and that it operates
as a reserved binding.

Fixes:
dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 16:54:52 +10:00
Apple SWE
9dc5063262 sched.h needs to be imported on Darwin/OSX targets.
sched_yield is used but the include reference on Darwin is missing. This patch
conditionally guards on Darwin/OSX to import sched.h first.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-13 22:50:56 -07:00
Apple SWE
de0d10db93 Add processor topology calculation implementation for Darwin/OSX targets.
The implementation for bootstrapping SWR on Darwin targets is based on the Linux version.
Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the
physical identifiers, processor identifiers, core counts and thread-processor affinities.

With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and
llvmpipe.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-13 22:50:27 -07:00
Roland Scheidegger
274f8bf05e r600: fix abs for op3 sources
If a src was referencing the same temp as the dst, the per-component
copy code didn't work.
e.g.
  cndge r0.xy, r0.xx, |r2|, r3
got expanded into
  mov  r12.x, |r2|
  cndge r0.x, r0.x, r12, r3
  mov  r12.y, |r2|
  cndge r0.y, r0.x, r12, r3
hence for the second cndge r0.x was mistakenly the previous cndge result.
Fix this by doing all the movs first, so there's no bogus alu.last in between.
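
The hazard can be modelled in plain C as a rough sketch. It assumes cndge(a, b, c) yields b when a >= 0 and c otherwise, and it models "instructions in one group read the old register values" by snapshotting r0.x; all names are hypothetical and this is not the r600 emitter code.

```c
/* Hypothetical two-component register, enough for the .xy example. */
struct sketch_vec2 { float x, y; };

/* Assumed semantics: pick b when a >= 0, otherwise c. */
static float sketch_cndge(float a, float b, float c)
{
   return a >= 0.0f ? b : c;
}

static float sketch_abs(float v)
{
   return v < 0.0f ? -v : v;
}

/* Interleaved mov/cndge: the second cndge sees the NEW r0.x. */
static struct sketch_vec2
sketch_expand_interleaved(struct sketch_vec2 r0, struct sketch_vec2 r2,
                          struct sketch_vec2 r3)
{
   float r12 = sketch_abs(r2.x);
   r0.x = sketch_cndge(r0.x, r12, r3.x);
   r12 = sketch_abs(r2.y);
   r0.y = sketch_cndge(r0.x, r12, r3.y); /* r0.x already overwritten */
   return r0;
}

/* All movs first: both cndge ops read the ORIGINAL r0.x. */
static struct sketch_vec2
sketch_expand_movs_first(struct sketch_vec2 r0, struct sketch_vec2 r2,
                         struct sketch_vec2 r3)
{
   float r12x = sketch_abs(r2.x);
   float r12y = sketch_abs(r2.y);
   float old_x = r0.x; /* stands in for same-group read semantics */
   r0.x = sketch_cndge(old_x, r12x, r3.x);
   r0.y = sketch_cndge(old_x, r12y, r3.y);
   return r0;
}
```

With r0.x = -1, r2 = (-2, -3) and r3 = (10, 20), the interleaved version yields r0.y = 3 while the movs-first version yields the intended 20.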

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905

Tested-by: <iive@yahoo.com>
Reviewed-by: Dave Airlie <airlied@gmail.com>
2018-03-14 04:54:45 +01:00
Dave Airlie
27a5e5366e radv: mark all tess output for an indirect access.
If a shader does a tcs store with an indirect access, we
were only marking the first spot as used. For indirect access
we always now mark all slots used by the variable.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Dave Airlie
4f0c89d66c ac/nir: pass the nir variable through tcs loading.
I was going to have to add another parameter to this monster,
so we should just pass the nir_variable in, I can't find any
reason this would be a bad idea.

This is needed for the next fix.

Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Dave Airlie
f9de2d409b radv: get correct offset into LDS for indexed vars.
This seems more correct to me, since if we have an array
of floats they'll be vec4 aligned, and if we do af[2],
we want the const index to increase by 2 slots in the non
compact case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Rob Clark
4e4428482e nir: lower_load_const_to_scalar fix for 8/16b types
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-13 20:17:04 -04:00
Dylan Baker
2aad12b2af Update the documentation for meson
Meson is pretty well tested and works in most configurations now, so we
can remove the warning about it being unsuited for actual use.

It's also worth documenting that meson 0.42.0 or greater is required.

v2: - Minor rewording of supported platforms as suggested by Emil
    - Add two missing tags as reported by xmllint --html

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2018-03-13 14:54:47 -07:00
Jason Ekstrand
85000b812d ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 13:25:27 -07:00
Jason Ekstrand
3d1d7e8561 nir/subgroups: Add lowering for vote_ieq/vote_feq to a ballot
This is based heavily on 97f10934ed, "ac/nir: Add vote_ieq/vote_feq
lowering pass." from Bas Nieuwenhuizen.  This version is a bit more
general since it's in common code.  It also properly handles NaN due to
not flipping the comparison for floats.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 13:25:15 -07:00
Dylan Baker
8247a30838 meson: don't use compiler.has_header
Meson's compiler.has_header is completely useless, it only checks that a
header exists, not whether it's usable. This creates problems if a
header contains a conditional #error declaration, like so:

> #if __x86_64__
> # error "Doesn't work with x86_64!"
> #endif

Compiler.has_header will return true in this case, even when compiling
for x86_64. This is useless.

Instead, we'll do a compile check so that any #error declarations will
be treated as errors, and compilation will work.

Fixes compilation on x32 architecture.

Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746
meson bug: https://github.com/mesonbuild/meson/issues/2246
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-13 11:41:10 -07:00
Jason Ekstrand
8379bff6c4 i965: Emit texture cache invalidates around blorp_copy
This is a terrible hack but it fixes CTS regressions.  It's still
incredibly unclear exactly what is going wrong in the hardware to cause
this to be an issue so this isn't a good fix by any means.  However, it
does fix tests so there is that.

Fixes: fb0e9b5197 "i965: Track the depth and render caches separately"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103746
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-13 11:24:40 -07:00
Eric Anholt
a326eedc75 broadcom/vc4: Fix simulator since the perfmon change.
It would be nice to support perfmon with simulator, and might be a useful
tool for regression testing performance (since the simulator would be
deterministic).
2018-03-13 10:32:58 -07:00
Eric Anholt
191bc7ce61 spirv: Silence compiler warning about undefined srcs[0]
v2: Use assume() at the srcs[] definition instead.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-13 10:32:55 -07:00
Samuel Pitoiset
7c83430672 ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:28 +01:00
Samuel Pitoiset
b128fd773f ac/nir: remove some unnecessary includes and declarations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:27 +01:00
Samuel Pitoiset
cd4e823341 ac/nir: drop radv prefix from radv_lower_gather4_integer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:25 +01:00
Samuel Pitoiset
fbe694562b ac/nir: move ac_nir_compiler_options and friends to radv folder
Also replace ac_ by radv_.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:23 +01:00
Samuel Pitoiset
237229430f ac: move ac_shader_info to radv folder
This is RADV specific code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:21 +01:00
Samuel Pitoiset
2cfba40eea ac/nir: move ac_shader_variant_info and friends to radv folder
Also replace ac_ by radv_.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:16 +01:00
Samuel Pitoiset
b2653007b9 ac/nir: move all RADV related code to radv_nir_to_llvm.c
Now the "ac/nir" prefix will really be the shared code between
RadeonSI and RADV, that might avoid confusions in the future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
8e15824b9d ac/nir: make emit_barrier() non-static
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
4e3117b718 ac/nir: move radeon_llvm_reg_index_soa() to ac_nir_to_llvm.h
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
3a30b89353 ac/nir: make handle_shader_output_decl() non-static
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
3fe47b1290 ac/nir: change prototype of handle_shader_output_decl()
This allows removing the ac_nir_context dependency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
61a91ca3f5 ac/nir: move unpack_param() to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
28bb6873ec ac/nir: move trim_vector to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
895632baef ac/nir: move cast_ptr() to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
bf6368297b ac/nir: move ac_build_alloca() to ac_llvm_build.c
As well as si_build_alloca_undef() and drop the si prefix.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Timothy Arceri
370e356eba gallium: silence __builtin_frame_address nonzero argument is unsafe warning
Calling __builtin_frame_address with a nonzero argument is unsafe
but is sometimes done for debugging purposes. Since this code is
part of some debug util code I'm assuming that is the case here
and using GCC pragma to silence the warning.
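
A minimal sketch of the kind of pragma described above, assuming the
warning in question is GCC's -Wframe-address. The function names are
hypothetical and are not taken from the gallium debug code:

```c
#include <assert.h>
#include <stddef.h>

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wframe-address"
/* Debug-only helper (hypothetical name): asks for the caller's frame,
 * i.e. level 1, which is the nonzero argument the compiler warns about. */
void *example_caller_frame(void)
{
   return __builtin_frame_address(1);
}
#pragma GCC diagnostic pop

/* Level 0 (the current frame) does not trigger the warning. */
void *example_current_frame(void)
{
   void *frame = __builtin_frame_address(0);
   assert(frame != NULL);
   return frame;
}
```

The push/pop pair keeps the suppression local to the one helper, so the
warning stays active for the rest of the file.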

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-13 09:38:10 +11:00
Dylan Baker
b7c6870f87 meson: Add moduledir to d3d.pc
This is required to build wine with the nine patchset

Fixes: 6b4c7047d5
       ("meson: build gallium nine state_tracker")
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-12 13:52:38 -07:00
Mathias Fröhlich
a2f08dd574 gallium: Use struct gl_array_attributes* as st_pipe_vertex_format argument.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-12 18:24:31 +01:00
Ian Romanick
def0030e64 mesa: Don't write to user buffer in glGetTexParameterIuiv on error
With some sets of optimization flags, GCC will generate warnings like
this:

src/mesa/main/texparam.c:2327:27: warning: ‘*((void *)&ip+12)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             params[3] = ip[3];
                         ~~^~~
src/mesa/main/texparam.c:2320:16: note: ‘*((void *)&ip+12)’ was declared here
          GLint ip[4];
                ^~

ip is not initialized in cases where a GL error is generated.  In these
cases, we should *not* write to the user's buffer, so this is actually a
bug.  I wrote a new piglit test gl-3.0-texparameteri to show this bug.
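
The pattern can be sketched as follows. This is an illustration only:
query_params() is a hypothetical stand-in for the internal query, and the
real code in texparam.c is structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical internal query: on an invalid name it reports failure and
 * leaves `out` uninitialized, mimicking the GL-error path. */
bool query_params(int pname, int out[4])
{
   if (pname < 0)
      return false;
   for (int i = 0; i < 4; i++)
      out[i] = pname + i;
   return true;
}

/* Copy into the caller's buffer only when the query succeeded. */
void get_params_unsigned(int pname, unsigned params[4])
{
   int ip[4];

   assert(params != NULL);
   if (!query_params(pname, ip))
      return;   /* error: the user's buffer must stay untouched */

   for (int i = 0; i < 4; i++)
      params[i] = (unsigned)ip[i];
}
```

Returning before the copy loop means ip[] is never read on the error
path, which is also what removes the maybe-uninitialized warning.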

I suspect that Coverity also detected this, but the scan site is
currently down.

Fixes: c2c507786 "main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv."
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-12 10:13:30 -07:00
Roman Gilg
f94597f554 gallium: work around libtool relink issue for libdrm
This is similar to commit 90633079. libtool links first to system directories
instead of custom locations of libdrm on relinking. Since a more recent libdrm
version than the one provided by the system is often needed when compiling
mesa, make sure this works by putting libdrm in front.

See also: https://bugs.freedesktop.org/show_bug.cgi?id=100259

Signed-off-by: Roman Gilg <subdiff@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-12 14:49:07 +00:00
Emil Velikov
678ba53240 vulkan: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
8151f5cad9 wayland-drm: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
1178e0cf49 egl: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
08189731a4 docs: document removal of GLX_SGIX_swap_{barrier,group} stubs
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
5ef608fab7 glx: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
2c765b0d9a gallium/x11: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
afab516f5f x11: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
742b8e3301 glx: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
447731348e gallium/x11: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:51 +00:00
Emil Velikov
1d2d519d78 x11: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:51 +00:00
Emil Velikov
f197f02e50 configure: remove unused AM_CONDITIONAL
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-12 14:48:51 +00:00
Bas Nieuwenhuizen
997306c031 radv: Increase the number of dynamic uniform buffers.
The Vulkan API is not ideal as it does not allow us to have a
shared limit.

Feral needs 15+6 for one of their games, and I'm not a fan
of overcommitting the limits, so increase the number of
dynamic uniform buffers to 16.

CC: <mesa-stable@lists.freedesktop.org>
CC: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-12 09:46:22 +01:00
Dave Airlie
e76cf1ff12 u_vbuf/translate: pass max_index into the set_buffer.
This fixes a memory trashing crash (not the test) seen with
dEQP-GLES3.stress.draw.unaligned_data.random.203
on virgl.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-12 11:57:13 +10:00
Dave Airlie
5d4fbc2b54 r600: implement callstack workaround for evergreen.
This is ported from the sb backend. There are some issues with
evergreen stacks on the boundary between entries and ALU_PUSH_BEFORE
instructions.

Whenever we are going to use a push before, we check the stack
usage and if we have to use the workaround, then we switch to
a separate push.

I noticed this problem dealing with some of the soft fp64 shaders,
in nosb mode, they are quite stack happy.

This fixes all the glitches and inconsistencies I've seen with them.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Elie Tournier <elie.tournier@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-12 11:11:44 +10:00
Marek Olšák
163a29099a gallium/util: add helper util_wait_for_idle
This is an old patch that I had.
2018-03-11 13:14:27 -04:00
Roland Scheidegger
0f0a6fa21d u_blit: (trivial) u_blit.h needs to include p_defines.h
(For the pipe_tex_filter enum)

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-10 20:09:04 +01:00
Christian Gmeiner
c9b153fea7 travis: bump libxcb version to 1.13
Fixes following dependency problem:
  Native dependency xcb-dri3 found: NO found '1.11' but need: '>= 1.13'

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
2018-03-10 16:55:36 +01:00
Mathias Fröhlich
64d2a20480 mesa: Make gl_vertex_array contain pointers to first order VAO members.
Instead of keeping a copy of the vertex array content in
struct gl_vertex_array only keep pointers to the first order
information originally in the VAO.
For that represent the current values by struct gl_array_attributes
and struct gl_vertex_buffer_binding.

v2: Change comments.
    Remove gl... prefix from variables except in the i965 directory where
    it was like that before. Reindent because of that.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-10 07:33:51 +01:00
Roland Scheidegger
d62f0df354 draw: fix alpha value for very short aa lines
The logic would not work correctly for line lengths smaller than 1.0,
even a degenerate line with length 0 would still produce a fragment
with anywhere between alpha 0.0 and 0.5.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-10 02:11:50 +01:00
Jordan Justen
24b415270f intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there were actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-09 16:15:58 -08:00
Jordan Justen
06e3bd02c0 i965: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there were actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
2018-03-09 16:15:34 -08:00
Marek Olšák
db495b8962 st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER
Tested by our OpenCL team.

Fixes: 9c499e6759 "st/mesa: don't invoke st_finalize_texture & st_convert_sampler for TBOs"

Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-09 16:33:31 -05:00
Marek Olšák
2bdb54bce7 radeonsi: add a workaround for GFX9 hang with init_config alignment
Fixes: 75c5d25f0f "radeonsi: align command buffer starting address to fix some Raven hangs"
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-03-09 16:28:29 -05:00
Marek Olšák
e99212e970 ac/gpu_info: print ib_start_alignment, add assertion
2018-03-09 16:28:29 -05:00
Greg V
e30a165be2 meson: Use system_has_kms_drm in default driver selection
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-09 10:02:44 -08:00
Eric Anholt
c57d5ea3bb broadcom/vc4: Add an accelerated path to turn raster R8/RG88 into tiled.
Drawing a 1080p YV12 video stream generated by MMAL goes from 10.5 FPS to
36.
2018-03-09 09:59:54 -08:00
Eric Anholt
cf170616da gallium: Add a util_blitter path for using a custom VS and FS.
Like the r600 paths to use other custom states, we pass in a couple of
parameters to customize the innards of the blitter.  It's up to the caller
to wrap other state necessary for its shaders (for example, constant
buffers for the uniforms the shader uses).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-09 09:59:54 -08:00
Eric Anholt
46a32e3d2e broadcom/vc4: Allow binding non-zero constant buffers.
We're going to use UBO loads for implementing YUV linear-to-T-format
blits.
2018-03-09 09:59:54 -08:00
Eric Anholt
2725ab2b12 broadcom: Remove our defines of DRM_FORMAT_MOD_INVALID.
The imported drm_fourcc.h handles it now.
2018-03-09 09:59:54 -08:00
Eric Anholt
a3a4c23dec broadcom: Suppress compiler warnings about enum pipe_tex_filter.
2018-03-09 09:59:54 -08:00
Louis-Francis Ratté-Boulianne
3160cb86aa egl/x11: Re-allocate buffers if format is suboptimal
If PresentCompleteNotify event says the pixmap was presented
with mode PresentCompleteModeSuboptimalCopy, it means the pixmap
could possibly have been flipped instead if allocated with a
different format/modifier.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne
069fdd5f9f egl/x11: Support DRI3 v1.1
Add support for DRI3 v1.1, which allows pixmaps to be backed by
multi-planar buffers, or those with format modifiers. This is both
for allocating render buffers, as well as EGLImage imports from a
native pixmap (EGL_NATIVE_PIXMAP_KHR).

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne
61309c2a72 vulkan/wsi/x11: Return VK_SUBOPTIMAL_KHR for X11
Return VK_SUBOPTIMAL_KHR when it is detected that a window could have
been flipped but was copied because of a suboptimal format/modifier.
The Vulkan client should then re-create the swapchain.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:13 +00:00
Daniel Stone
c80c08e226 vulkan/wsi/x11: Add support for DRI3 v1.2
Adds support for multiple planes and buffer modifiers.

v4: Rename "has_dri3_v1_1" to "has_dri3_modifiers"
v12: Multi-planar/modifier support is now DRI3 v1.2; also update release
     versions
2018-03-09 17:47:13 +00:00
Dylan Baker
7258be91c5 autotools: include all meson.build files
Otherwise SWR cannot be built with meson from an autotools generated
tarball, such as the 18.0.0-rc4 tarball.

Fixes: 16bf813830 ("meson/swr: re-shuffle generated files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-09 08:15:04 -08:00
Michel Dänzer
2a4596a2f0 st/mesa: gl_program::info.system_values_read is a 64-bit-field
We were dropping the upper 32 bits, which caused assertion failures in
some compute shader piglit tests with radeonsi since the commit below.
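
A small sketch of the failure mode, assuming a system value whose bit
index is above 31. EXAMPLE_SYSVAL_BIT and both function names are
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit index above 31, standing in for a newer system value. */
enum { EXAMPLE_SYSVAL_BIT = 40 };

/* Buggy variant: the 64-bit field passes through a 32-bit variable,
 * so bits 32..63 are dropped before the test. */
bool sysval_read_truncated(uint64_t system_values_read, unsigned bit)
{
   assert(bit < 64);
   uint32_t narrowed = (uint32_t)system_values_read;
   return (((uint64_t)narrowed >> bit) & 1) != 0;
}

/* Fixed variant: keep the full 64-bit width throughout. */
bool sysval_read_full(uint64_t system_values_read, unsigned bit)
{
   assert(bit < 64);
   return ((system_values_read >> bit) & 1) != 0;
}
```

With bit 40 set, the truncated variant reports the value as unread while
the full-width variant sees it; low bits behave the same in both.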

Fixes: 752e969703 ("compiler: Add two new system values for subgroups")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-09 16:52:11 +01:00
George Kyriazis
379e00dc27 swr/rast: Refactor memory gather operations
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:42 -06:00
George Kyriazis
3f7ce10b3e swr/rast: Add KNOB_DISABLE_SPLIT_DRAW
This is useful for archrast data collection. This greatly speeds up the
post-processing script since significantly fewer events are generated.

Finally, this is a simpler option to communicate to users than having
them directly adjust MAX_PRIMS_PER_DRAW and MAX_TESS_PRIMS_PER_DRAW.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:30 -06:00
George Kyriazis
e0a4a25829 swr/rast: Add VPOPCNT
Supports popcnt on vector masks (e.g. <8 x i1>)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:23 -06:00
George Kyriazis
b56afe1a4f swr/rast: Add tracking for stream out topology
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:14 -06:00
George Kyriazis
2f6ae8cfcd swr/rast: Add split draw and other state information to DrawInfoEvent.
Removed specific split draw events.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:07 -06:00
George Kyriazis
714093203e swr/rast: Refactor api and worker event handlers.
In the API event handler we want to share information between the core
layer and the API. Specifically, around associating various ids with
different kinds of events. For example, associate render pass id with
draw ids, or command buffer ids with draw ids.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:59 -06:00
George Kyriazis
cfdd35beaf swr/rast: Add support for generalized late and early z/stencil stats
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:52 -06:00
George Kyriazis
9e25f298eb swr/rast: Rasterized Subspans stats support
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:47 -06:00
George Kyriazis
d78b28fc33 swr/rast: Added comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:34:55 -06:00
Eric Engestrom
e903a7b0bb vulkan/wsi: clean up cleanup path
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-09 13:25:44 +00:00
Bas Nieuwenhuizen
a793e7899f radv: Fix the autotools build take 2.
Forgot to remove a word....

Fixes: 04ffabf17a "radv: Fix autotools build."
2018-03-09 14:10:24 +01:00
Lucas Stach
1f55d06783 etnaviv: allow mixing different bit depths for color and depth surfaces
Vivante hardware supports this just fine. There is no reason why this shouldn't
be advertised as a valid combination.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-03-09 12:06:07 +01:00
Thierry Reding
6d4d46bca9 autotools: Add tegra to AM_DISTCHECK_CONFIGURE_FLAGS
This allows the driver to be built on a make distcheck and makes sure
that it properly builds when a distribution tarball is made.

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:22 +01:00
Thierry Reding
1755f608f5 tegra: Initial support
Tegra K1 and later use a GPU that can be driven by the Nouveau driver.
But the GPU is a pure render node and has no display engine, hence the
scanout needs to happen on the Tegra display hardware. The GPU and the
display engine each have a separate DRM device node exposed by the
kernel.

To make the setup appear as a single device, this driver instantiates
a Nouveau screen with each instance of a Tegra screen and forwards GPU
requests to the Nouveau screen. For purposes of scanout it will import
buffers created on the GPU into the display driver. Handles that
userspace requests are those of the display driver so that they can be
used to create framebuffers.

This has been tested with some GBM test programs, as well as kmscube and
weston. All of those run without modifications, but I'm sure there is a
lot that can be improved.

Some fixes contributed by Hector Martin <marcan@marcan.st>.

Changes in v2:
- duplicate file descriptor in winsys to avoid potential issues
- require nouveau when building the tegra driver
- check for nouveau driver name on render node
- remove unneeded dependency on libdrm_tegra
- remove zombie references to libudev
- add missing headers to C_SOURCES variable
- drop unneeded tegra/ prefix for includes
- open device files with O_CLOEXEC
- update copyrights

Changes in v3:
- properly unwrap resources in ->resource_copy_region()
- support vertex buffers passed by user pointer
- allocate custom stream and const uploader
- silence error message on pre-Tegra124
- support X without explicit PRIME

Changes in v4:
- ship Meson build files in distribution tarball
- drop duplicate driver_tegra dependency

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:22 +01:00
Thierry Reding
2052dbdae3 nouveau: Add framebuffer modifier support
This adds support for framebuffer modifiers to Nouveau. This will be
used by the Tegra driver to share metadata about the format of buffers
(such as the tiling mode or compression).

Changes in v2:
- remove unused parameters to nouveau_buffer_create()
- move format modifier query code to nvc0 backend
- restrict format modifiers to 2D textures
- implement ->query_dmabuf_modifiers()

Changes in v4:
- add UAPI include path on meson builds

Changes in v5:
- remove unnecessary includes

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:08 +01:00
Thierry Reding
b964cab80a nouveau/nvc0: Extract common tile mode macro
Add a new macro that can be used to extract the tiling mode from a
tile_mode value. This will be used to determine the number of GOBs
used in block linear mode.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:47:54 +01:00
Thierry Reding
75bf489628 drm/tegra: Sanitize format modifiers
The existing format modifier definitions were merged prematurely, and
recent work has unveiled that the definitions are suboptimal in several
ways:

  - The format specifiers, except for one, are not Tegra specific, but
    the names don't reflect that.
  - The number space is split into two, reserving 32 bits for some
    "parameter" which most of the modifiers are not going to have.
  - Symbolic names for the modifiers are not using the standard
    DRM_FORMAT_MOD_* prefix, which makes them awkward to use.
  - The vendor prefix NV is somewhat ambiguous.

Fortunately, nobody's started using these modifiers, so we can still fix
the above issues. Do so by using the standard prefix. Also, remove TEGRA
from the name of those modifiers that exist on NVIDIA GPUs as well. In
case of the block linear modifiers, make the "parameter" smaller (4
bits, though only 6 values are valid) and don't let that leak into any
of the other modifiers.

Finally, also use the more canonical NVIDIA instead of the ambiguous NV
prefix.

This is based on commit 268892cb63a822315921a8dab48ac3e4abf7dd03 from
Linux v4.16-rc1.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:44:35 +01:00
Thierry Reding
ffc85cfac0 drm/fourcc: Fix fourcc_mod_code() definition
Avoid a compiler warning when the val parameter is an expression.
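
The general issue can be sketched with two hypothetical macros (the
names, vendor id and mask layout below are illustrative and not copied
from drm_fourcc.h):

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_VENDOR UINT64_C(0x03)   /* hypothetical vendor id */
#define EXAMPLE_MASK   UINT64_C(0x00ffffffffffffff)

/* Unparenthesized argument: MOD_CODE_UNSAFE(a | b) expands to
 * `a | b & EXAMPLE_MASK`, which parses as `a | (b & EXAMPLE_MASK)`.
 * Compilers typically warn about this, and the mask misses `a`. */
#define MOD_CODE_UNSAFE(val) ((EXAMPLE_VENDOR << 56) | (val & EXAMPLE_MASK))

/* Parenthesized argument: the whole expression is masked. */
#define MOD_CODE_SAFE(val) ((EXAMPLE_VENDOR << 56) | ((val) & EXAMPLE_MASK))

uint64_t example_mod_code(uint64_t val)
{
   uint64_t code = MOD_CODE_SAFE(val);
   assert((code >> 56) == EXAMPLE_VENDOR);   /* vendor byte stays intact */
   return code;
}
```

For a plain identifier both forms give the same result; they only differ
once the argument contains an operator that binds more loosely than `&`.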

This is based on commit 5843f4e02fbe86a59981e35adc6cabebee46fdc0 from
Linux v4.16-rc1.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:44:35 +01:00
Bas Nieuwenhuizen
04ffabf17a radv: Fix autotools build.
Forgot it again ....

Fixes: b6347807a9 "radv: Generate icd files."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-09 09:36:19 +01:00
Samuel Pitoiset
365850fd68 ac/nir: set number of channels for packed mrt exports
Bit 0 enables VSRC0 (R in low bits, G high) and bit 2 enables
VSRC1 (B in low bits, A high).
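
A sketch of how such an enable mask could be derived. The bit meanings
come from the description above; the macro names and the mapping from a
channel count are assumptions for illustration, not the ac/nir code:

```c
#include <assert.h>

/* Enable bits for a packed (two 16-bit values per register) export. */
#define EXAMPLE_EN_VSRC0 (1u << 0)   /* R in low bits, G in high bits */
#define EXAMPLE_EN_VSRC1 (1u << 2)   /* B in low bits, A in high bits */

/* Hypothetical helper: one register is needed per pair of channels. */
unsigned packed_export_enable_mask(unsigned num_channels)
{
   unsigned mask = 0;

   assert(num_channels <= 4);
   if (num_channels >= 1)
      mask |= EXAMPLE_EN_VSRC0;
   if (num_channels >= 3)
      mask |= EXAMPLE_EN_VSRC1;
   return mask;
}
```

Under these assumptions a full RGBA export yields 0x5 rather than 0xf,
since bits 1 and 3 have no packed register to enable.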

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-09 09:28:20 +01:00
Bas Nieuwenhuizen
68201ab2da radv: Update version to 1.1.70.
Turns out they did not reset the patch number on release.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-09 07:53:39 +01:00
Bas Nieuwenhuizen
b6347807a9 radv: Generate icd files.
If the api version is too low, the loader clamps the application
requested version to the advertised version, which messes with
which extensions are enabled.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-09 07:53:39 +01:00
Ian Romanick
6878c9aabc nir: Don't i2b a value that is already Boolean
A bunch of shaders have sequences like:

    i2b(u2i(floatBitsToUint(intBitsToFloat(x == y ? -1 : 0))))

Other optimizations (and NIR's typeless nature) reduce this to

    i2b(x == y)

which is silly.
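
Why the outer i2b is redundant can be sketched in plain C, assuming
Booleans are represented as 32-bit 0 / all-ones values (consistent with
the `x == y ? -1 : 0` pattern above). All names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define EX_TRUE  UINT32_C(0xffffffff)
#define EX_FALSE UINT32_C(0)

/* Integer comparison that already yields a Boolean value. */
uint32_t ex_ieq(int32_t x, int32_t y)
{
   return x == y ? EX_TRUE : EX_FALSE;
}

/* Integer-to-Boolean conversion: any nonzero value becomes true. */
uint32_t ex_i2b(uint32_t v)
{
   return v != 0 ? EX_TRUE : EX_FALSE;
}

/* Simplified form: for an input that is already Boolean, i2b is the
 * identity, so the conversion can be dropped. */
uint32_t ex_i2b_simplified(uint32_t b)
{
   assert(b == EX_TRUE || b == EX_FALSE);
   return b;
}
```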

Skylake
total instructions in shared programs: 14498698 -> 14497948 (<.01%)
instructions in affected programs: 74480 -> 73730 (-1.01%)
helped: 277
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.71 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.45% x̃: 0.68%
95% mean confidence interval for instructions value: -3.35 -2.06
95% mean confidence interval for instructions %-change: -1.74% -1.16%
Instructions are helped.

total cycles in shared programs: 532015500 -> 531999238 (<.01%)
cycles in affected programs: 5943878 -> 5927616 (-0.27%)
helped: 251
HURT: 74
helped stats (abs) min: 1 max: 13149 x̄: 127.89 x̃: 14
helped stats (rel) min: 0.01% max: 17.31% x̄: 1.55% x̃: 0.53%
HURT stats (abs)   min: 1 max: 4550 x̄: 214.04 x̃: 15
HURT stats (rel)   min: <.01% max: 44.43% x̄: 2.81% x̃: 0.33%
95% mean confidence interval for cycles value: -158.51 58.43
95% mean confidence interval for cycles %-change: -1.07% -0.04%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4753 -> 4735 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

Haswell and Broadwell had similar results. (Broadwell shown)
total instructions in shared programs: 14791877 -> 14791127 (<.01%)
instructions in affected programs: 77326 -> 76576 (-0.97%)
helped: 278
HURT: 1
helped stats (abs) min: 1 max: 32 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.42% x̃: 0.68%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49%
95% mean confidence interval for instructions value: -3.33 -2.05
95% mean confidence interval for instructions %-change: -1.70% -1.13%
Instructions are helped.

total cycles in shared programs: 558250067 -> 558252872 (<.01%)
cycles in affected programs: 5806328 -> 5809133 (0.05%)
helped: 235
HURT: 83
helped stats (abs) min: 1 max: 10630 x̄: 81.73 x̃: 16
helped stats (rel) min: 0.03% max: 18.58% x̄: 1.60% x̃: 0.51%
HURT stats (abs)   min: 1 max: 10590 x̄: 265.19 x̃: 20
HURT stats (rel)   min: <.01% max: 15.28% x̄: 1.89% x̃: 0.54%
95% mean confidence interval for cycles value: -89.87 107.51
95% mean confidence interval for cycles %-change: -1.06% -0.32%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4735 -> 4717 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

total fills in shared programs: 83111 -> 83110 (<.01%)
fills in affected programs: 28 -> 27 (-3.57%)
helped: 1
HURT: 0

Ivy Bridge
total instructions in shared programs: 11774173 -> 11773436 (<.01%)
instructions in affected programs: 70819 -> 70082 (-1.04%)
helped: 267
HURT: 0
helped stats (abs) min: 1 max: 48 x̄: 2.76 x̃: 2
helped stats (rel) min: 0.21% max: 19.51% x̄: 1.57% x̃: 0.63%
95% mean confidence interval for instructions value: -3.51 -2.01
95% mean confidence interval for instructions %-change: -1.94% -1.21%
Instructions are helped.

total cycles in shared programs: 257153833 -> 257148932 (<.01%)
cycles in affected programs: 585341 -> 580440 (-0.84%)
helped: 167
HURT: 100
helped stats (abs) min: 1 max: 1327 x̄: 44.89 x̃: 16
helped stats (rel) min: 0.04% max: 26.54% x̄: 2.41% x̃: 0.88%
HURT stats (abs)   min: 1 max: 200 x̄: 25.95 x̃: 16
HURT stats (rel)   min: 0.04% max: 9.81% x̄: 1.34% x̃: 0.65%
95% mean confidence interval for cycles value: -33.25 -3.46
95% mean confidence interval for cycles %-change: -1.47% -0.54%
Cycles are helped.

total loops in shared programs: 3416 -> 3398 (-0.53%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

LOST:   2
GAINED: 0

Sandy Bridge
total instructions in shared programs: 10499306 -> 10499094 (<.01%)
instructions in affected programs: 6051 -> 5839 (-3.50%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 4.93 x̃: 2
helped stats (rel) min: 0.39% max: 12.90% x̄: 4.29% x̃: 2.45%
95% mean confidence interval for instructions value: -7.66 -2.20
95% mean confidence interval for instructions %-change: -5.47% -3.12%
Instructions are helped.

total cycles in shared programs: 145862568 -> 145861370 (<.01%)
cycles in affected programs: 61733 -> 60535 (-1.94%)
helped: 36
HURT: 2
helped stats (abs) min: 16 max: 66 x̄: 36.61 x̃: 35
helped stats (rel) min: 0.45% max: 17.31% x̄: 4.92% x̃: 2.81%
HURT stats (abs)   min: 18 max: 102 x̄: 60.00 x̃: 60
HURT stats (rel)   min: 1.10% max: 1.85% x̄: 1.48% x̃: 1.48%
95% mean confidence interval for cycles value: -41.28 -21.77
95% mean confidence interval for cycles %-change: -6.16% -3.00%
Cycles are helped.

total loops in shared programs: 1803 -> 1785 (-1.00%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

LOST:   4
GAINED: 0

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-08 15:26:26 -08:00
Ian Romanick
1583f49eaa i965/vec4: Allow CSE on subset VF constant loads
v2: Rewrite the code that generates the VF mask.  Suggested by Ken.

No changes on other platforms.

Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 13059891 -> 13059884 (<.01%)
instructions in affected programs: 431 -> 424 (-1.62%)
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -3.39% -0.71%
Instructions are helped.

total cycles in shared programs: 409260032 -> 409260018 (<.01%)
cycles in affected programs: 4228 -> 4214 (-0.33%)
helped: 7
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28%
95% mean confidence interval for cycles value: -2.00 -2.00
95% mean confidence interval for cycles %-change: -1.15% 0.07%

Inconclusive result (%-change mean confidence interval includes 0).
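
As a rough illustration of how the figures above can be read, the sketch below recomputes a relative change and checks whether a confidence interval straddles zero. The helper names are hypothetical and are not part of shader-db.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def interval_includes_zero(low, high):
    """A result is inconclusive when the interval contains 0."""
    return low <= 0.0 <= high


# Cycles in affected programs: 4228 -> 4214
print(round(percent_change(4228, 4214), 2))
# %-change interval for cycles: -1.15% .. 0.07%
print(interval_includes_zero(-1.15, 0.07))
```

The instruction interval (-3.39% to -0.71%) lies entirely below zero, which is why instructions are reported as helped while the cycle result is reported as inconclusive.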

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-08 15:26:26 -08:00
Ian Romanick
360899d457 i965/vec4: Relax writemask condition in CSE
If the previously seen instruction generates more fields than the new
instruction, still allow CSE to happen.  This doesn't do much, but it
also enables a couple more shaders in the next patch.  It helped quite a
bit in another change series that I have (at least for now) abandoned.

v2: Add some extra commentary about the parameters to instructions_match.
Suggested by Ken.

No changes on Skylake, Broadwell, Iron Lake or GM45.

Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11780295 -> 11780294 (<.01%)
instructions in affected programs: 302 -> 301 (-0.33%)
helped: 1
HURT: 0

total cycles in shared programs: 257308315 -> 257308313 (<.01%)
cycles in affected programs: 2074 -> 2072 (-0.10%)
helped: 1
HURT: 0

Sandy Bridge
total instructions in shared programs: 10506687 -> 10506686 (<.01%)
instructions in affected programs: 335 -> 334 (-0.30%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-08 15:26:26 -08:00
Ian Romanick
52c7df1643 i965/fs: Merge CMP and SEL into CSEL on Gen8+
v2: Fix several problems handling inverted predicates.  Add a much
bigger comment around the BRW_CONDITIONAL_NZ case.

v3: Allow uniforms and shader inputs as sources for the original SEL and
CMP instructions.  This enables a LOT more shaders to receive CSEL
merging (5816 vs 8564 on SKL).

v4: Report progress.

Broadwell and Skylake had similar results. (Broadwell shown)
helped: 8527
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70%
95% mean confidence interval for instructions value: -2.51 -2.36
95% mean confidence interval for instructions %-change: -1.15% -1.10%
Instructions are helped.

total cycles in shared programs: 559442317 -> 558288357 (-0.21%)
cycles in affected programs: 372699860 -> 371545900 (-0.31%)
helped: 6748
HURT: 1450
helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12
helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70%
HURT stats (abs)   min: 1 max: 2538 x̄: 53.08 x̃: 14
HURT stats (rel)   min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90%
95% mean confidence interval for cycles value: -179.01 -102.51
95% mean confidence interval for cycles %-change: -2.37% -2.08%
Cycles are helped.

LOST:   0
GAINED: 6

No changes on earlier platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-08 15:26:26 -08:00
Kenneth Graunke
70de61594d i965/fs: Add infrastructure for generating CSEL instructions.
v2 (idr): Don't allow CSEL with a non-float src2.

v3 (idr): Add CSEL to fs_inst::flags_written.  Suggested by Matt.

v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt).  Don't
reset the access mode afterwards (suggested by Samuel and Matt).  Add
support for CSEL not modifying the flags to more places (requested by
Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-08 15:26:26 -08:00
Ian Romanick
54e8d2268d nir: Narrow some dot product operations
On vector platforms, this helps elide some constant loads.

v2: Reorder the transformations.

No changes on Broadwell or Skylake.

Haswell
total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
instructions in affected programs: 1277532 -> 1243902 (-2.63%)
helped: 13216
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.57 -2.49
95% mean confidence interval for instructions %-change: -3.65% -3.54%
Instructions are helped.

total cycles in shared programs: 409580819 -> 409268463 (-0.08%)
cycles in affected programs: 71730652 -> 71418296 (-0.44%)
helped: 9898
HURT: 2352
helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16
helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50%
HURT stats (abs)   min: 2 max: 276 x̄: 23.25 x̃: 6
HURT stats (rel)   min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97%
95% mean confidence interval for cycles value: -33.19 -17.80
95% mean confidence interval for cycles %-change: -4.50% -4.26%
Cycles are helped.

total fills in shared programs: 82059 -> 82052 (<.01%)
fills in affected programs: 21 -> 14 (-33.33%)
helped: 7
HURT: 0

Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown)
total instructions in shared programs: 11811851 -> 11780605 (-0.26%)
instructions in affected programs: 1155007 -> 1123761 (-2.71%)
helped: 12304
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.56 -2.48
95% mean confidence interval for instructions %-change: -3.71% -3.59%
Instructions are helped.

total cycles in shared programs: 257618409 -> 257316805 (-0.12%)
cycles in affected programs: 71999580 -> 71697976 (-0.42%)
helped: 9155
HURT: 2380
helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16
helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62%
HURT stats (abs)   min: 2 max: 290 x̄: 21.14 x̃: 4
HURT stats (rel)   min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33%
95% mean confidence interval for cycles value: -34.32 -17.97
95% mean confidence interval for cycles %-change: -4.55% -4.29%
Cycles are helped.

GM45 and Iron Lake had nearly identical results (Iron Lake shown)
total instructions in shared programs: 7886750 -> 7879944 (-0.09%)
instructions in affected programs: 373781 -> 366975 (-1.82%)
helped: 3715
HURT: 47
helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1
helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06%
HURT stats (abs)   min: 1 max: 6 x̄: 2.55 x̃: 2
HURT stats (rel)   min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35%
95% mean confidence interval for instructions value: -1.85 -1.77
95% mean confidence interval for instructions %-change: -2.91% -2.73%
Instructions are helped.

total cycles in shared programs: 178114636 -> 178095452 (-0.01%)
cycles in affected programs: 7227666 -> 7208482 (-0.27%)
helped: 3349
HURT: 301
helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4
helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63%
HURT stats (abs)   min: 2 max: 42 x̄: 9.13 x̃: 10
HURT stats (rel)   min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50%
95% mean confidence interval for cycles value: -5.52 -4.99
95% mean confidence interval for cycles %-change: -0.81% -0.73%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-03-08 15:26:26 -08:00
Lionel Landwerlin
d10a39ebe0 i965: perf: consolidate unmapping oa perf bo outside accumulation
Do this in one place outside the only caller of the accumulation
function.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:29 +00:00
Lionel Landwerlin
fb921a2870 i965: perf: count number of accumulated reports
This will be reused later.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:26 +00:00
Lionel Landwerlin
e4387faafb i965: perf: reuse timescale base function from query
We already have the same function in brw_queryobj.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:23 +00:00
Lionel Landwerlin
b71da26496 i965: perf: store sysfs device entry into context
We want to reuse it later on.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:21 +00:00
Lionel Landwerlin
5742b17da1 i965: perf: store the hw_id of the context in the query
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:18 +00:00
Lionel Landwerlin
80cd669a32 i965: perf: default case for unknown query types
Just some extra safety before further changes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:00 +00:00
Marek Olšák
9b7db12815 radeonsi: remove chip_class parameter from si_lower_nir
We can get it from si_screen.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
78ef16e2f9 winsys/amdgpu: query GDS info
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
a4a113b5bc winsys/amdgpu: pad compute IBs
v2: pad with PKT2 NOPs on SI

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
35cd86d4e9 radeonsi: expand constbuf 0 address correctly to fix Vega10 hangs
This is only required with the latest libdrm.

This fixes 32-bit support with high addresses.
(and possibly 64-bit support too because the high bits need to be masked out)

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
75c5d25f0f radeonsi: align command buffer starting address to fix some Raven hangs
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Christian Gmeiner
5b68a7297d etnaviv: add get_driver_query_group_info(..)
This enables AMD_performance_monitor extension.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-03-08 20:44:04 +01:00
Christian Gmeiner
3d912bd742 etnaviv: add query_group_info for sw counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-03-08 20:43:55 +01:00
Dylan Baker
1e9d779331 meson: Fix building gallium media libs without egl
v2: - rebase on omx fix

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2018-03-08 10:14:02 -08:00
Dylan Baker
f74cf04d3e meson: Allow building dri based EGL without GLX
It should be possible to build EGL without GLX, but the meson build
currently doesn't allow that because it too tightly couples glx and dri.
This patch eases dri and glx apart, so that EGL without GLX can be
built.

CC: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-03-08 09:12:24 -08:00
Thierry Reding
d41ee9ba5d glx/apple: Ship meson build file in tarball
The meson build file for Apple GLX is not listed in the EXTRA_DIST make
variable and therefore isn't shipped as part of the release tarball, so
meson builds from the tarball will fail.

Add the file to EXTRA_DIST to ensure it is included in the tarball.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-08 12:11:32 +01:00
Samuel Pitoiset
4e3c1ace65 ac/nir: do not emit unnecessary null exports in fragment shaders
Null exports should only be needed when no other exports are
emitted. This removes a bunch of 'exp null off, off, off, off done vm'.

Affected games are Dota 2 and Wolfenstein 2. I'm not sure this really
helps, but code size decreases there.

Polaris10:
Totals from affected shaders:
SGPRS: 8216 -> 8216 (0.00 %)
VGPRS: 7072 -> 7072 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 454968 -> 453896 (-0.24 %) bytes
Max Waves: 772 -> 772 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-08 11:56:05 +01:00
Eric Engestrom
19dd7f007e drirc: whitespace fix
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-08 09:53:34 +00:00
Thomas Hellstrom
93e58d5e17 drirc: Disable the GLX_SGI_video_sync extension for gnome-shell on vmware
With this extension enabled and a server GLX implementation that actually
honors it, window movement lags considerably on gnome-shell/vmware, so
disable it by default.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Thomas Hellstrom
4ca9ad2bb2 gallium/st_dri: Honor the glx_disable_sgi_video_sync config option
This option is disabled by default. Primarily intended for drivers on
virtual hardware.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Thomas Hellstrom
f4070956d4 glx/dri: Add a driconf option to disable GLX_SGI_video_sync
Drivers on virtual hardware don't want to expose this extension to
GLX compositors, similarly to GLX_OML_sync_control, since that significantly
increases latency.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Timothy Arceri
0c90264da4 ac/radeonsi: add emit_kill to the abi
This should fix a regression with Rocket League grass rendering
on the NIR backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717
2018-03-08 11:28:37 +11:00
Timothy Arceri
50cc97d98a radeonsi: add si_llvm_emit_kill() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 11:28:37 +11:00
Timothy Arceri
f4b877631e spirv: fix autotools builds
Fixes: 68a6a3b51a "spirv: handle AMD_gcn_shader extended instructions"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-08 10:45:56 +11:00
Timothy Arceri
99cdc019bf ac: make use of if/loop build helpers
These helpers insert the basic block in the same order as they
appear in NIR making it easier to follow LLVM IR dumps. The helpers
also insert more useful labels onto the blocks.

TGSI uses the line number of the corresponding opcode in the TGSI
dump as the label id, here we use the corresponding block index
from NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Timothy Arceri
6e1a142863 radeonsi: make use of if/loop build helpers in ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Timothy Arceri
42627dabb4 ac: add if/loop build helpers
These have been ported over from radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Daniel Schürmann
ffbf75cde4 radv: enable AMD_gcn_shader extension
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
18c7f1e041 ac: implement AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
68a6a3b51a spirv: handle AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
a1a2a8dfda nir: add AMD_gcn_shader extended instructions
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
39437025de spirv: import AMD extensions header from glslang
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Dylan Baker
cba104ebe3 meson: Fix indent in omx meson.build
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:54 -08:00
Dylan Baker
6f628951af meson: Use include directory variables instead of traversing
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Dylan Baker
34e852d5b5 meson: Re-add auto option for omx
This re-adds the auto option for omx. Without it we default to tizonia
and the build fails almost immediately, which is especially obnoxious
for those building a driver that doesn't support the OMX state tracker
to begin with.

v2: - Only define OMX_FOO for auto cases if the dependencies are found.
      This fixes building tizonia with auto (Julien, Eric)

CC: Gurkirpal Singh <gurkirpal204@gmail.com>
Fixes: bb5e27fab6
       ("st/omx/bellagio: Rename st and target directories")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com> (v1)
2018-03-07 13:30:53 -08:00
Dylan Baker
7598dedfde meson: fix tizonia compilation
It needs to have src/egl in its includes as well.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Dylan Baker
2d3004ef1c meson: combine state trackers and target if blocks
This is needed later since tizonia requires dri

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Marek Olšák
55376cb31e st/mesa: expose 0 shader binary formats for compat profiles for Qt
Bugzilla: https://bugreports.qt.io/browse/QTBUG-66420
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105065
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2018-03-07 15:36:31 -05:00
Roland Scheidegger
8ba3750d3d draw: fix line stippling with aa lines
In contrast to non-aa, where stippling is based on either dx or dy
(depending on if it's a x or y major line), stippling is based on
actual distance with smooth lines, so adjust for this.
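
The difference between the two cases can be sketched as follows. This only illustrates which length drives the stipple pattern; the function name is hypothetical and this is not the draw module's code.

```python
import math


def stipple_length(x0, y0, x1, y1, smooth):
    """Length along which the stipple pattern advances for one line segment."""
    dx, dy = x1 - x0, y1 - y0
    if smooth:
        # Antialiased lines: the actual (Euclidean) length of the line.
        return math.hypot(dx, dy)
    # Non-antialiased lines: the extent along the major axis only.
    return max(abs(dx), abs(dy))


print(stipple_length(0, 0, 3, 4, smooth=False))
print(stipple_length(0, 0, 3, 4, smooth=True))
```

For a diagonal line the two lengths differ, so a stipple pattern driven by the major-axis extent repeats at a different rate than one driven by distance.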

(There appear to be some minor artifacts with the mesa demos
line-sample and stippling: the line endpoints don't look quite
right with aa + stippling, maybe due to the integer math in the
stipple stage, but I can't quite pinpoint it.)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-07 21:29:00 +01:00
Roland Scheidegger
dbb2cf388b draw: simplify (and correct) aaline fallback (v2)
The motivation actually was to get rid of the additional tex
instruction, since that requires the draw fallback code to intercept
all sampler / view calls (even if the fallback is never hit).
Basically, the idea is to use coverage of the pixel to calculate
the alpha value, and coverage is simply based on the distance
to the center of the line (in both line direction, which is useful
for wide lines, as well as perpendicular to the line).
This is much closer to what hw supporting this natively actually does.
It also fixes an issue with line width not quite being correct, as
well as endpoints getting stretched too far (in line direction) with
wide lines, which is apparent with mesa demo line-sample.
(For llvmpipe, it would probably make sense to do something like this
directly when drawing lines, since rendering two tris is twice as
expensive as a line, but it would need some changes with state
management.)
Since we're no longer relying on mipmapping to get the alpha value,
we also don't need to draw 3 rects (6 tris), one is sufficient.
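
A rough sketch of the coverage idea, under the assumption that coverage falls off linearly over one pixel at each edge of the line rectangle. The formula and all names here are illustrative guesses, not the math used by the fallback shader.

```python
def clamp01(v):
    """Clamp a value to the range [0, 1]."""
    return max(0.0, min(1.0, v))


def aaline_alpha(dist_across, dist_along, half_width, half_length):
    """Approximate pixel coverage from the distance to the line center.

    dist_across: distance perpendicular to the line direction.
    dist_along:  distance along the line direction (matters for wide lines).
    Assumption: coverage drops linearly from 1 to 0 across one pixel.
    """
    cov_across = clamp01(half_width + 0.5 - abs(dist_across))
    cov_along = clamp01(half_length + 0.5 - abs(dist_along))
    return cov_across * cov_along


print(aaline_alpha(0.0, 0.0, 1.0, 5.0))  # pixel at the line center
print(aaline_alpha(1.0, 0.0, 1.0, 5.0))  # pixel straddling the edge
print(aaline_alpha(2.0, 0.0, 1.0, 5.0))  # pixel outside the line
```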

There are still issues (as before):
- quite sure it's not correct without half_pixel_center, but can't test
this with GL.
- aaline + line stipple is incorrect (evident with line-sample demo).
Looking at the spec the stipple pattern should actually be based on
distance (not just dx or dy for x/y major lines as without aa).
- outputs (other than pos + the one used for line aa) should be
reinterpolated since we actually increase line length by half a pixel
(but there's no tests which would care).

v2: simplify the math (should be equivalent), don't need immediate
v3: use float versions of atan2,cos,sin, minor cleanups

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-07 21:28:31 +01:00
Bas Nieuwenhuizen
034cce96b4 radv: Don't emit a warning on VI-GFX9.
We are conformant:

https://www.khronos.org/conformance/adopters/conformant-products#submission_308

v2: Actually not emit it on gfx9.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
04d65d2b76 radv: Enable vulkan 1.1.0 for configurations that can support it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
0168eaaa42 radv: Disable sampler ycbcr conversion.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
cce62f4065 radv: Expose that we don't support any VK_KHR_16_bit_storage parts.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
b99b9cc864 radv: Implement vkEnumerateInstanceVersion.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
5240fddb9d radv: Add trivial device group implementation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
84e877aa77 radv: Implement vkCmdDispatchBase.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
de5e25898c radv: Implement vkGetDeviceQueue2.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
b137e25277 radv: Support VkPhysicalDeviceProtectedMemoryFeatures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
4bcf4d1678 radv: Support VkPhysicalDeviceShaderDrawParameterFeatures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
41d958d073 radv: Implement VK_KHR_maintenance3.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
8f9af587a2 radv: Add minimal subgroup support.
Deliberately not implementing workgroup scopes as that is not needed
for core vulkan.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
89651fba9b radv: Change client version check.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen
5b3979704d radv: Update MAX_API_VERSION to 1.1.0
v2: Don't bump supported version.
v3: Update json files.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen
97f10934ed ac/nir: Add vote_ieq/vote_feq lowering pass.
The old vote_eq implementation supported only booleans, but now
we have to support arbitrary values, so use the read_first_invocation
intrinsic + ballot.
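
The lowering can be illustrated with a small lane-by-lane simulation, assuming a subgroup is modeled as a list of per-invocation values plus a bitmask of active invocations. The Python helpers below are hypothetical stand-ins for the intrinsics, not the NIR pass itself.

```python
def ballot(predicates, active_mask):
    """Bitmask with a bit set for each active lane whose predicate is true."""
    mask = 0
    for lane, pred in enumerate(predicates):
        if (active_mask >> lane) & 1 and pred:
            mask |= 1 << lane
    return mask


def read_first_invocation(values, active_mask):
    """Value held by the lowest-numbered active lane (active_mask must be non-zero)."""
    first_lane = (active_mask & -active_mask).bit_length() - 1
    return values[first_lane]


def vote_ieq(values, active_mask):
    """All active lanes agree iff every one of them matches the first active lane."""
    first = read_first_invocation(values, active_mask)
    return ballot([v == first for v in values], active_mask) == active_mask


print(vote_ieq([7, 7, 7, 7], 0b1111))
print(vote_ieq([7, 7, 3, 7], 0b1111))
print(vote_ieq([7, 7, 3, 7], 0b1011))  # the differing lane is inactive
```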

I took this as an opportunity to figure out how easy it was to do this
in nir instead of in the nir_to_llvm pass, and it actually turned out
pretty okay IMO. Only creating the pass is some extra code.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:32 +01:00
Jason Ekstrand
c217607b65 anv: Support version overrides
While always sketchy to do, this is useful for debugging.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
a1ee51309e vulkan/util: Add a helper to get a version override
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d6b65222df anv: Enable Vulkan 1.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
03c07ac548 anv: Add support for SPIR-V 1.3 subgroup operations
This requires us to bump the subgroup size to 32 for all shader stages
because Vulkan requires that to be a physical device query.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8b4a5e641b intel/fs: Add support for subgroup quad operations
NIR has code to lower these away for us but we can do significantly
better in many cases with register regioning and SIMD4x2.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
2292b20b29 intel/fs: Implement reduce and scan operations
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
4150920b95 intel/fs: Add a helper for emitting scan operations
This commit adds a helper to the builder for emitting "scan" operations.
Given a binary operation #, a scan takes the vector [a0, a1, ..., aN]
and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each
channel contains the combination of all previous channels.  The sequence
of instructions to perform the scan is fairly optimal; a 16-wide scan on
a 32-bit type is only 6 instructions.  The subgroup scan and reduction
operations will be implemented in terms of this.
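
The scan semantics described above can be sketched with a standard doubling-step formulation. This is an assumption chosen for illustration; the instruction sequence emitted by the helper may be organized differently, and the function name is hypothetical.

```python
from operator import add


def inclusive_scan(values, op):
    """Return [a0, a0 # a1, ..., a0 # a1 # ... # aN] for an associative op."""
    out = list(values)
    step = 1
    while step < len(out):
        prev = list(out)  # every channel reads the previous round's results
        for i in range(step, len(out)):
            out[i] = op(prev[i - step], prev[i])
        step *= 2
    return out


print(inclusive_scan([1, 2, 3, 4], add))
print(inclusive_scan([3, 1, 4, 1, 5], max))
```

A reduction over the same channels is then the last element of the scan result.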

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
b0858c1cc6 intel/fs: Add a couple of simple helper opcodes
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
57bff0a546 spirv: Add support for subgroup arithmetic
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
789221dcfa nir: Add a helper for getting binop identities
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
82d493a939 nir: Add subgroup arithmetic reduction intrinsics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
b3a5b0f3fc spirv: Add subgroup quad support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
493a165544 nir: Add quad operations and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
90c9f29518 i965/fs: Add support for nir_intrinsic_shuffle
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8256ee3fa3 spirv: Add subgroup shuffle support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
149b92ccf2 nir: Add subgroup shuffle intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
7cfece820d i965/fs: Support nir_intrinsic_vote_feq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
0e893356fe nir/lower_subgroups: Add scalarizing for vote_eq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d792f3d4cd spirv: Add subgroup vote support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
44681e4795 nir: Generalize nir_intrinsic_vote_eq
The SPIR-V extension wants us to be able to do an AllEqual on any vector
or scalar type.  This has two implications:

 1) We need to be able to handle vectors so we switch the vote_eq
    intrinsics to be vectorized intrinsics.

 2) We need to handle floats which have different behavior with respect
    to +-0, NaN, etc. than the integer variant so we need two variants.
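
The second point can be seen in a small example that compares 32-bit floats once by value and once by bit pattern. The bit-pattern comparison stands in for the integer variant; the helper names are hypothetical.

```python
import math
import struct


def float_bits(f):
    """Bit pattern of f as a 32-bit IEEE 754 float, returned as an integer."""
    return struct.unpack("<I", struct.pack("<f", f))[0]


def all_equal_float(values):
    """Float comparison: +0 equals -0, and NaN never equals anything."""
    return all(v == values[0] for v in values)


def all_equal_bits(values):
    """Integer-style comparison of the raw bit patterns."""
    return all(float_bits(v) == float_bits(values[0]) for v in values)


print(all_equal_float([0.0, -0.0]), all_equal_bits([0.0, -0.0]))
print(all_equal_float([math.nan, math.nan]), all_equal_bits([math.nan, math.nan]))
```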

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
9812fce60b spirv: Add subgroup ballot support
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
974daec495 i965/fs: Implement basic SPIR-V subgroup intrinsics
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
adc077797a spirv: Add initial subgroup support
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
5162a1d884 nir: Add new SPIR-V ballot intrinsics and lowering
Someone can make the lowering optional later if they want something
different for their hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
752e969703 compiler: Add two new system values for subgroups
This will be required for SPIR-V subgroup support

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
34c60ea02b nir: Add new SPIR-V ballot ALU intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
cc587ee9a7 spirv: Handle the new OpModuleProcessed instruction
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
59b0ea0c74 anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER
From the Vulkan 1.1 spec:

    "Vulkan 1.0 implementations were required to return
    VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0.
    Implementations that support Vulkan 1.1 or later must not return
    VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion."

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
cbab2d1da5 anv: Implement vkEnumerateInstanceVersion
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Iago Toral Quiroga
605fd7c0da anv/device: fail to initialize device if we have queues with unsupported flags
This is not strictly necessary since users should not be requesting any
flags that are not valid for the list of enabled features requested and
we already fail if they attempt to use an unsupported feature, however
it is an easy-to-implement sanity check that would help developers realize
that they are doing things wrong, so we might as well do it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-07 12:13:47 -08:00
Iago Toral Quiroga
b262f17b15 anv/device: GetDeviceQueue2 should only return queues with matching flags
From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure:

   "The queue returned by vkGetDeviceQueue2 must have the same flags value
    from this structure as that used at device creation time in a
    VkDeviceQueueCreateInfo instance. If no matching flags were specified
    at device creation time then pQueue will return VK_NULL_HANDLE."

For us this means no flags at all since we don't support any.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
9c8b40001d anv: Support querying for protected memory
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
773a51e772 anv: Implement GetDeviceQueue2
This belongs to the protected memory feature but there's nothing about
it that's specific to protected memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
68df93ecbc anv: Trivially implement VK_KHR_device_group
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
dfe18be09e anv: Implement vkCmdDispatchBase
This is part of the device groups extension/feature but it's a decent
chunk of work in its own right so it's worth breaking into its own
patch.  The mechanism we use is fairly straightforward: we just push the
base work group id into the shader and add it to the work group id we
get from dispatch.
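
The mechanism can be modeled in a few lines. This is only a sketch in
Python of the arithmetic described above, not driver code; the function
names `global_workgroup_id` and `dispatch_base` are made up for the
example:

```python
def global_workgroup_id(base_group, dispatch_group):
    """Add the pushed base work group id to the id produced by the dispatch."""
    return tuple(b + g for b, g in zip(base_group, dispatch_group))


def dispatch_base(base_group, group_count):
    """Yield the work group ids a shader would observe for a based dispatch."""
    count_x, count_y, count_z = group_count
    for z in range(count_z):
        for y in range(count_y):
            for x in range(count_x):
                # (x, y, z) is the id the hardware dispatch hands out,
                # always starting from zero.
                yield global_workgroup_id(base_group, (x, y, z))


# A 2x1x1 dispatch whose base is (4, 0, 0).
ids = list(dispatch_base((4, 0, 0), (2, 1, 1)))
print(ids)
```

With a base of (0, 0, 0) the ids reduce to those of a plain dispatch.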

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
ff9db1a4cc nir/spirv: Add support for device groups
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
ddc4069122 anv: Implement VK_KHR_maintenance3
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
1deb7967c8 anv: Support VkPhysicalDeviceShaderDrawParameterFeatures
This advertises the VK_KHR_shader_draw_parameters functionality as a
"core optional feature" in Vulkan 1.1.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
06719f9d4b anv/entrypoints: Drop support for protect attributes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
bd1279bd9f Get rid of a bunch of KHR suffixes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
af461986db anv: Add version 1.1.0 but leave it disabled
This requires us to rename any Vulkan API entrypoints which became core
in 1.1 to no longer have the KHR suffix.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
0128187335 spirv: Update the SPIR-V headers and json to 1.3.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
205c271562 vulkan: Update the XML and headers to 1.1.70
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
7fb86fb511 vulkan/enum_to_str: Add support for aliases and new Vulkan versions
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
539a0aec45 vulkan/enum_to_str: Add an add_value_from_xml helper to VkEnum
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
eb23ca069f anv/entrypoints: Generate #ifdef guards from platform attributes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
05fc377f2e anv/extensions: Add support for multiple API versions
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8efa173ed2 anv/entrypoints_gen: Add support for aliases in the XML
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
39d9fcea13 anv/entrypoints: Allow an entrypoint to require multiple extensions
In this case, we say an entrypoint is supported if ANY of the extensions
is supported.  This is because, in the XML, entrypoints don't require
extensions so much as extensions require entrypoints.
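
A minimal sketch of the ANY rule, in Python. The function name, the
extension names, and the choice to treat an entrypoint with no listed
extensions as always supported are assumptions for illustration, not the
generator's actual code:

```python
def entrypoint_supported(required_extensions, enabled_extensions):
    """Return True if any extension that lists this entrypoint is enabled."""
    if not required_extensions:
        # Assumption: no listed extensions means the entrypoint is core.
        return True
    return any(ext in enabled_extensions for ext in required_extensions)


# Hypothetical extension names; two extensions list the same entrypoint.
listed_by = ["VK_EXAMPLE_a", "VK_EXAMPLE_b"]
print(entrypoint_supported(listed_by, {"VK_EXAMPLE_b"}))
print(entrypoint_supported(listed_by, {"VK_EXAMPLE_c"}))
```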

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8e8f167c72 anv/entrypoints: Add an is_device_entrypoint helper
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
54b3493fc0 anv/entrypoints_gen: Allow the string map to grow
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d91da06df5 anv/entrypoints_gen: A bit of refactoring
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
a4ca4c99ba anv/entrypoints: Generalize the string map a bit
The original string map assumed that the mapping from strings to
entrypoints was a bijection.  This will not be true the moment we
add entrypoint aliasing.  This reworks things to be an arbitrary map
from strings to non-negative signed integers.  The old one also had a
potential bug if we ever had a hash collision because it didn't do the
strcmp inside the lookup loop.  While we're at it, we break things out
into a helpful class.
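
The idea can be sketched as a small open-addressing table in Python.
This is not the generator's class: the name `StringIntMap`, the hash
function, and the use of -1 for "not found" are assumptions chosen for
the example. It shows the two properties named above: several strings
may map to the same integer, and the lookup loop compares the string
itself rather than trusting the hash.

```python
class StringIntMap:
    """Open-addressing map from strings to non-negative integers."""

    def __init__(self, size=8):
        # The size must be a power of two so masking acts as a modulo.
        assert size > 0 and size & (size - 1) == 0
        self.size = size
        self.slots = [None] * size

    @staticmethod
    def _hash(name):
        # Simple multiplicative string hash; the constant is arbitrary.
        h = 0
        for byte in name.encode("ascii"):
            h = (h * 31 + byte) & 0xFFFFFFFF
        return h

    def add(self, name, value):
        assert value >= 0
        i = self._hash(name) & (self.size - 1)
        for _ in range(self.size):
            if self.slots[i] is None or self.slots[i][0] == name:
                self.slots[i] = (name, value)
                return
            i = (i + 1) & (self.size - 1)  # linear probing
        raise RuntimeError("map is full")

    def lookup(self, name):
        i = self._hash(name) & (self.size - 1)
        for _ in range(self.size):
            slot = self.slots[i]
            if slot is None:
                return -1
            # Compare the string inside the loop so that a hash
            # collision cannot return another entry's value.
            if slot[0] == name:
                return slot[1]
            i = (i + 1) & (self.size - 1)
        return -1


m = StringIntMap(4)
m.add("vkExample", 0)
m.add("vkExampleKHR", 0)  # an alias maps to the same integer
m.add("vkOther", 1)
print(m.lookup("vkExampleKHR"), m.lookup("vkMissing"))
```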

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
3960d0e332 vulkan: Rename multiview from KHX to KHR
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
68af9f04a4 spirv: Rework barriers
Our previous handling of barriers always used the big hammer and didn't
correctly emit memory barriers when specified along with a control
barrier.  This commit completely reworks the way we emit barriers to
make things both more precise and more correct.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
de518f38e5 spirv: Add a vtn_constant_value helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Marek Olšák
9779f34326 radeonsi: remove si_llvm_add_attribute
2018-03-07 13:55:49 -05:00
Marek Olšák
2c3f3651c4 radeonsi: fix passing address32_hi to LLVM for high values
The old function treats high values as negative, which LLVM interprets as 0.
2018-03-07 13:55:49 -05:00
Marek Olšák
b3b6b00ac8 radeonsi: assume has_virtual_memory == true
2018-03-07 13:55:48 -05:00
Marek Olšák
53db2790c0 radeonsi: add/update assertions for 32-bit address space
2018-03-07 13:55:47 -05:00
Marek Olšák
16856a1ee8 radeonsi: prevent a negative buffer offset in si_upload_descriptors
2018-03-07 13:55:42 -05:00
Marek Olšák
9b55498059 radeonsi: properly extract a buffer address from a descriptor
2018-03-07 13:55:40 -05:00
Marek Olšák
2a47660754 radeonsi: fix vertex buffer address computation with full 64-bit addresses
2018-03-07 13:55:38 -05:00
Marek Olšák
2e30268877 radeonsi: mask out high VM address bits in registers where needed
2018-03-07 13:55:35 -05:00
Bas Nieuwenhuizen
94c9096c83 radv: Add entrypoints generation with the new vk.xml
A lot of it is based on intel again.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 15:50:19 +01:00
Simon Hausmann
fb5825e7ce glsl: Fix memory leak with known glsl_type instances
When looking up known glsl_type instances in the various hash tables, we
end up leaking the key instances used for the lookup, as the glsl_type
constructor allocates memory on the global mem_ctx. This patch changes
glsl_type to manage its own memory, which fixes the leak and also allows
getting rid of the global mem_ctx and its mutex.

v2: remove lambda usage (Tapani)
    (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Simon Hausmann <simon.hausmann@qt.io>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-07 14:33:34 +02:00
Caio Marcelo de Oliveira Filho
c17808562e spirv: Add SpvCapabilityShaderViewportIndexLayerEXT
This capability allows gl_ViewportIndex and gl_Layer to also be used
as outputs in Vertex and Tessellation shaders.

v2: Make conditional on the capability, add gl_Layer, add tessellation
    shaders. (Iago)

v3: Don't export to tessellation control shader.

v4: Add Reviewed-by tag.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 07:04:20 +01:00
Mauro Rossi
487f8d48c9 android: anv: add libmesa_intel_dev static dependency
Fixes the following building errors:

external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override'
external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name'
external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 272bef0601 "intel: Split gen_device_info out into libintel_dev"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-07 07:55:34 +02:00
Timothy Arceri
1fdb21541e Revert "nir: bump loop unroll limit to 96."
This reverts commit 2d36efdb7f.

This raised limit turns out to be harmful for more complex shaders:
it causes excessive spilling in some Bioshock Infinite shaders.

The fps for the ssao demo on radv remains unchanged when reverting
this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-07 15:10:05 +11:00
Dave Airlie
fb077b0728 ac/nir: don't put lod into args if it's zero.
If the lod is zero but we still put it in args, we end up consuming a
register for it.

This fixes some spilling in the NIR paths in Dirt Rally that
isn't seen with TGSI.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-07 03:34:59 +00:00
Christian Gmeiner
38e91e2b81 freedreno: bump required libdrm version
Fixes: 26a9321d0a "freedreno: add global_bindings state"

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-06 21:52:59 +01:00
Ian Romanick
e3ea166a2c nir: Simplify some comparisons like a+b < a
All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14514555 -> 14514547 (<.01%)
instructions in affected programs: 1972 -> 1964 (-0.41%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.41% -0.40%
Instructions are helped.

total cycles in shared programs: 533141444 -> 533136780 (<.01%)
cycles in affected programs: 164728 -> 160064 (-2.83%)
helped: 181
HURT: 3
helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30
helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80%
HURT stats (abs)   min: 4 max: 54 x̄: 24.00 x̃: 14
HURT stats (rel)   min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68%
95% mean confidence interval for cycles value: -27.12 -23.58
95% mean confidence interval for cycles %-change: -3.54% -3.16%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533667 -> 10533539 (<.01%)
instructions in affected programs: 10148 -> 10020 (-1.26%)
helped: 124
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1
helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04%
95% mean confidence interval for instructions value: -1.06 -1.00
95% mean confidence interval for instructions %-change: -2.46% -1.95%
Instructions are helped.

total cycles in shared programs: 146136887 -> 146132122 (<.01%)
cycles in affected programs: 206382 -> 201617 (-2.31%)
helped: 171
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30
helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67%
95% mean confidence interval for cycles value: -29.19 -26.54
95% mean confidence interval for cycles %-change: -3.20% -2.76%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886515 -> 7886507 (<.01%)
instructions in affected programs: 3016 -> 3008 (-0.27%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.27% -0.26%
Instructions are helped.

total cycles in shared programs: 178100396 -> 178100388 (<.01%)
cycles in affected programs: 156128 -> 156120 (<.01%)
helped: 4
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -3.68 1.68
95% mean confidence interval for cycles %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857872 -> 4857868 (<.01%)
instructions in affected programs: 1544 -> 1540 (-0.26%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.28% -0.24%
Instructions are helped.

total cycles in shared programs: 122167654 -> 122167662 (<.01%)
cycles in affected programs: 96248 -> 96256 (<.01%)
helped: 0
HURT: 4
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: <.01% 0.02%
Cycles are HURT.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:30 -08:00
Ian Romanick
d1ed4ffe0b nir: Use De Morgan's Law on logic compounded comparisons
The replacement of the comparison operators must happen during this
step.  If it does not, the next pass of nir_opt_algebraic will reapply
De Morgan's Law in the "opposite direction" before performing dead code
elimination.  The resulting infinite loop will eventually get OOM
killed.
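
A small Python sketch of the rewrite, checked exhaustively over a few
integers. The function names are invented for the example and the exact
patterns in nir_opt_algebraic may differ. Integers are used on purpose:
with floats, `not (a < b)` and `a >= b` disagree when an operand is NaN.

```python
import itertools


def before(a, b, c, d):
    """Negation of a compound comparison."""
    return not (a < b or c < d)


def after(a, b, c, d):
    """De Morgan's Law with the comparison operators inverted in the same step."""
    return a >= b and c >= d


values = range(-2, 3)
mismatches = [
    v for v in itertools.product(values, repeat=4) if before(*v) != after(*v)
]
print(len(mismatches))
```

Because `after` contains no remaining negation, a later pass has nothing
to apply De Morgan's Law to in the opposite direction.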

Haswell, Broadwell, and Skylake had similar results. (Broadwell shown)
total instructions in shared programs: 14808185 -> 14808036 (<.01%)
instructions in affected programs: 13758 -> 13609 (-1.08%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3
helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01%
95% mean confidence interval for instructions value: -4.67 -2.97
95% mean confidence interval for instructions %-change: -1.09% -0.88%
Instructions are helped.

total cycles in shared programs: 559438333 -> 559435832 (<.01%)
cycles in affected programs: 199160 -> 196659 (-1.26%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51
helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40%
HURT stats (abs)   min: 2 max: 40 x̄: 27.33 x̃: 40
HURT stats (rel)   min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74%
95% mean confidence interval for cycles value: -71.47 -39.69
95% mean confidence interval for cycles %-change: -1.64% -0.93%
Cycles are helped.

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11811776 -> 11811553 (<.01%)
instructions in affected programs: 15201 -> 14978 (-1.47%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6
helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26%
95% mean confidence interval for instructions value: -7.21 -4.23
95% mean confidence interval for instructions %-change: -1.48% -1.12%
Instructions are helped.

total cycles in shared programs: 257617270 -> 257614589 (<.01%)
cycles in affected programs: 212107 -> 209426 (-1.26%)
helped: 45
HURT: 0
helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54
helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32%
95% mean confidence interval for cycles value: -74.02 -45.14
95% mean confidence interval for cycles %-change: -1.59% -1.01%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886648 -> 7886515 (<.01%)
instructions in affected programs: 14106 -> 13973 (-0.94%)
helped: 29
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4
helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81%
95% mean confidence interval for instructions value: -5.65 -3.52
95% mean confidence interval for instructions %-change: -1.03% -0.76%
Instructions are helped.

total cycles in shared programs: 178100812 -> 178100396 (<.01%)
cycles in affected programs: 67970 -> 67554 (-0.61%)
helped: 29
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54%
95% mean confidence interval for cycles value: -18.30 -10.39
95% mean confidence interval for cycles %-change: -0.71% -0.45%
Cycles are helped.

GM45
total instructions in shared programs: 4857939 -> 4857872 (<.01%)
instructions in affected programs: 7426 -> 7359 (-0.90%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4
helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77%
95% mean confidence interval for instructions value: -6.06 -2.87
95% mean confidence interval for instructions %-change: -1.06% -0.67%
Instructions are helped.

total cycles in shared programs: 122167930 -> 122167654 (<.01%)
cycles in affected programs: 43118 -> 42842 (-0.64%)
helped: 15
HURT: 0
helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54%
95% mean confidence interval for cycles value: -25.03 -11.77
95% mean confidence interval for cycles %-change: -0.82% -0.41%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
52607658ff nir: Replace fmin(b2f(a), b) with a bcsel
All of the affected shaders are HDR mappers from Serious Sam 3.

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516285 -> 14516273 (<.01%)
instructions in affected programs: 348 -> 336 (-3.45%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -5.55% -3.06%
Instructions are helped.

total cycles in shared programs: 533163876 -> 533163808 (<.01%)
cycles in affected programs: 1144 -> 1076 (-5.94%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94%
95% mean confidence interval for cycles value: -18.84 -15.16
95% mean confidence interval for cycles %-change: -6.20% -5.68%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533321 -> 10533309 (<.01%)
instructions in affected programs: 372 -> 360 (-3.23%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -4.96% -2.86%
Instructions are helped.

total cycles in shared programs: 146136632 -> 146136428 (<.01%)
cycles in affected programs: 11668 -> 11464 (-1.75%)
helped: 12
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29%
95% mean confidence interval for cycles value: -17.66 -16.34
95% mean confidence interval for cycles %-change: -2.82% -1.58%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886301 -> 7886277 (<.01%)
instructions in affected programs: 576 -> 552 (-4.17%)
helped: 12
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.30% -3.72%
Instructions are helped.

total cycles in shared programs: 178113176 -> 178113176 (0.00%)
cycles in affected programs: 2116 -> 2116 (0.00%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for cycles value: -3.25 3.25
95% mean confidence interval for cycles %-change: -0.93% 0.94%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857756 -> 4857744 (<.01%)
instructions in affected programs: 294 -> 282 (-4.08%)
helped: 6
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.71% -3.09%
Instructions are helped.

total cycles in shared programs: 122178730 -> 122178722 (<.01%)
cycles in affected programs: 700 -> 692 (-1.14%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
b974dfee11 nir: Pull b2f out of bcsel
All platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516592 -> 14516586 (<.01%)
instructions in affected programs: 500 -> 494 (-1.20%)
helped: 2
HURT: 0

total cycles in shared programs: 533167044 -> 533166998 (<.01%)
cycles in affected programs: 6988 -> 6942 (-0.66%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
f50400cc80 nir: Replace an odd comparison involving fmin of -b2f
I noticed the fge version while looking at a shader for an unrelated
reason.  The feq version prevents a regression in a later change that
performs strength reduction of some compares.

Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514808 -> 14514796 (<.01%)
instructions in affected programs: 750 -> 738 (-1.60%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.43% -0.36%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533144939 -> 533144853 (<.01%)
cycles in affected programs: 8911 -> 8825 (-0.97%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19
helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31%
95% mean confidence interval for cycles value: -32.94 -10.06
95% mean confidence interval for cycles %-change: -2.30% -0.26%
Cycles are helped.

Haswell
total instructions in shared programs: 13093785 -> 13093775 (<.01%)
instructions in affected programs: 924 -> 914 (-1.08%)
helped: 4
HURT: 2
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.53 1.20
95% mean confidence interval for instructions %-change: -2.02% 0.97%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 409580553 -> 409580118 (<.01%)
cycles in affected programs: 10909 -> 10474 (-3.99%)
helped: 5
HURT: 1
helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18
helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78%
HURT stats (abs)   min: 13 max: 13 x̄: 13.00 x̃: 13
HURT stats (rel)   min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39%
95% mean confidence interval for cycles value: -180.68 35.68
95% mean confidence interval for cycles %-change: -19.55% 3.79%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 11811851 -> 11811840 (<.01%)
instructions in affected programs: 1032 -> 1021 (-1.07%)
helped: 5
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1
helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.17 0.51
95% mean confidence interval for instructions %-change: -1.86% 0.36%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 257618403 -> 257618168 (<.01%)
cycles in affected programs: 10784 -> 10549 (-2.18%)
helped: 4
HURT: 2
helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17
helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72%
HURT stats (abs)   min: 9 max: 14 x̄: 11.50 x̃: 11
HURT stats (rel)   min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33%
95% mean confidence interval for cycles value: -133.11 54.78
95% mean confidence interval for cycles %-change: -14.79% 5.59%
Inconclusive result (value mean confidence interval includes 0).

GM45, Iron Lake, and Sandy Bridge had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10533871 -> 10533859 (<.01%)
instructions in affected programs: 865 -> 853 (-1.39%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.16% -0.29%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 146139904 -> 146139852 (<.01%)
cycles in affected programs: 15213 -> 15161 (-0.34%)
helped: 4
HURT: 0
helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15
helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29%
95% mean confidence interval for cycles value: -23.79 -2.21
95% mean confidence interval for cycles %-change: -0.88% 0.09%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
380136e998 nir: Mark bcsel-to-fmin (or fmax) transformations as inexact
These transformations are inexact because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.
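
To make the difference concrete, here is a hedged Python sketch.
`select_min` models the open-coded select and `fmin_ignoring_nan` models
one behavior the quoted text permits for fmin; both names are
hypothetical and neither is Mesa code.

```python
import math


def select_min(a, b):
    """Open-coded minimum: a select on the result of a comparison."""
    return a if a < b else b


def fmin_ignoring_nan(a, b):
    """One permitted fmin behavior: prefer the operand that is not NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


nan = float("nan")
# 1.0 < NaN is false, so the select picks its second operand, NaN.
print(select_min(1.0, nan))
# The fmin modeled here drops the NaN and returns 1.0 instead.
print(fmin_ignoring_nan(1.0, nan))
```

For operands that are not NaN, the two functions return the same value.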

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 11:17:14 -08:00
Ian Romanick
4addd34b04 nir: Recognize some more open-coded fmin / fmax
This transformation is inexact because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.

v2: Reorder operands and mark as inexact.  The latter suggested by
Jason.

shader-db results:

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514817 -> 14514808 (<.01%)
instructions in affected programs: 229 -> 220 (-3.93%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%

total cycles in shared programs: 533145211 -> 533144939 (<.01%)
cycles in affected programs: 37268 -> 36996 (-0.73%)
helped: 8
HURT: 0
helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total cycles in shared programs: 257618409 -> 257618403 (<.01%)
cycles in affected programs: 12582 -> 12576 (-0.05%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 11:17:14 -08:00
Gurkirpal Singh
c62cf1f165 st/omx/tizonia/h264d: Add EGLImage support
Example GStreamer pipeline:
MESA_ENABLE_OMX_EGLIMAGE=1 GST_GL_API=gles2 GST_GL_PLATFORM=egl gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! glimagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 17:21:11 +00:00
Gurkirpal Singh
b2f2236dc5 st/omx/tizonia: Add H.264 encoder
v2: Refactor out screen functions to st/omx

Example GStreamer pipeline:
gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videoconvert ! omxh264enc ! h264parse ! avdec_h264 ! videoconvert ! ximagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 17:20:08 +00:00
Gurkirpal Singh
83d4a5d5ae st/omx/tizonia: Add H.264 decoder
v2: Refactor out screen functions to st/omx

Example GStreamer pipeline:
gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! videoconvert ! ximagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
430ccdbcb9 st/omx/tizonia: Add entrypoint
Adds base files for adding components

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
e2afa154e9 st/omx/tizonia: Add --enable-omx-tizonia flag and build files
Allow only bellagio or tizonia to be used at the same time.
Detect tizonia package config file
Generate libomx_mesa.so and install it to libtizcore.pc::pluginsdir
Only compile empty source (target.c) for now.

GSoC Project link: https://summerofcode.withgoogle.com/projects/#4737166321123328

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
bb5e27fab6 st/omx/bellagio: Rename st and target directories
v2: Refactor out screen functions to st/omx

Allows keeping all the code under st/omx (st/omx/tizonia and
st/omx/bellagio).
Reverts targets/omx_bellagio to omx, as additions to existing files
are enough to compile for both bellagio and tizonia.

* autotools changes:
  --enable-omx -> --enable-omx-bellagio

* meson changes:
  -Dgallium-omx=false -> -Dgallium-omx=disabled
  -Dgallium-omx=true  -> -Dgallium-omx=bellagio

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 13:07:03 +00:00
Samuel Pitoiset
e96e6f60f7 radv: report the scratch private memory size with shader stats
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:42 +01:00
Samuel Pitoiset
7f6b91c9c3 ac/nir: count the scratch private memory size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:40 +01:00
Samuel Pitoiset
3b8e7459f2 ac: add ac_count_scratch_private_memory()
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:38 +01:00
Samuel Pitoiset
f3275ca01c ac/nir: only enable used channels when exporting parameters
This allows us to generate, for example,
"exp param0 v0, off, off, off" if only the first channel is needed.

Not sure if this improves performance but it's worth trying.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:35 +01:00
Samuel Pitoiset
675dde13b2 ac: update enabled channels mask when optimizing PARAM exports
When the mask is not 0xf we need to update the number of
enabled channels, otherwise the hardware won't emit the
components that are combined.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:52 +01:00
Samuel Pitoiset
c24abae9dc ac/nir: pass the number of enabled channels to si_llvm_init_export_args()
Currently, it's always 0xf but an upcoming patch will reduce the
number of channels for parameters export.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:50 +01:00
Samuel Pitoiset
5cd34f03c0 ac/shader: scan output usage mask for VS and TES
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:47 +01:00
Clayton Craft
d1fa30e0f8 intel: Add missing includes for building on Android
This adds a missing library to the i965/Android.mk file, and updates
intel/Android.mk to include the new library. Without this, mesa does not
build on Android.

Fixes: 272bef0601 "intel: Split gen_device_info out into libintel_dev"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-06 00:14:22 -08:00
Tapani Pälli
237c9caa78 vulkan: do not expose surface/swapchain extensions on Android
On Android surface/swapchain extensions are implemented by the loader. Patch
modifies both anv and radv extension scripts disabling currently exposed
ones. See also earlier commit 9f763c1f9b.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 08:02:59 +02:00
Tapani Pälli
85518657a9 anv: Don't expose VK_KHX_multiview on android.
Just like commit 2ffe395 does for radv.

Fixes following dEQP test on i965:
   dEQP-VK.api.info.android.no_unknown_extensions

v2: make it !ANDROID since this extension is not about
    surfaces/swapchain

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-06 08:01:20 +02:00
Roland Scheidegger
cf4a92fda2 gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128
Some state trackers require 128.
(There are no plans to increase PIPE_MAX_SAMPLERS as well: with the GL
state tracker it's unlikely that more than 32 will be needed, and if you
need more, use bindless.)
2018-03-06 05:18:17 +01:00
Roland Scheidegger
06e724c7b4 tgsi/scan: use wrap-around shift behavior explicitly for file_mask
The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
no one uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
It feels a bit hacky, but in any case explicit behavior there
is better than undefined behavior.
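
A minimal sketch of an explicit wrap-around shift. In C, shifting a 32-bit
value by 32 or more is undefined, while x86 shift instructions only look at
the low 5 bits of the count. Reducing the index modulo 32 in the source gives
the same wrap-around result on every target. The helper name file_mask_bit()
is hypothetical and is not taken from tgsi_scan.c:

```c
#include <stdint.h>

/* Bit for register `index` in a 32-bit mask, wrapping explicitly so that
 * indices >= 32 alias onto the low 32 bits instead of invoking undefined
 * behavior.
 */
static uint32_t file_mask_bit(unsigned index)
{
   return UINT32_C(1) << (index % 32);
}
```

With this, index 33 sets the same bit as index 1.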

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-06 05:18:17 +01:00
Aaron Watry
95ae6c0355 clover: Allow overriding platform/device version numbers
Useful for testing API, builtin library, and device completeness of
not-yet-supported versions.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>

v4: Remove redundant std::string wrapper around debug_get_option calls
v3: mark CL version overrides as static and const
v2: Make version_string in platform const in case
2018-03-05 20:09:46 -06:00
Aaron Watry
106020712f clover/llvm: Pass device down to compile
We'll need to be able to detect device version to define the appropriate
__OPENCL_VERSION__ header.

v2: Rebase after removing the previous patch (Pierre)
  - Removed "clover: Add device_clc_version to llvm::create_compiler_instance"

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-05 20:09:46 -06:00
Aaron Watry
fc629e3594 clover: Pass device to llvm::create_compiler_instance
We'll be using dev.device_clc_version to select the default language version
soon along with the existing ir_target field.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

v4: Pass the device down instead of device_clc_version as a separate field
v3: Revise to acknowledge that we now have the device in compile/link_program
    instead of the string values.
v2: (Pierre) Move changes to create_compiler_instance invocation to correct
    patch to prevent temporary build breakage.
    (Jan) Use device_clc_version instead of device_version for compile/link
2018-03-05 20:09:46 -06:00
Aaron Watry
dd81ca3883 clover/llvm: Use device in llvm compilation instead of copying fields
Copying the individual fields from the device when compiling/linking
will lead to an unnecessarily large number of fields getting passed
around.

v3: Rebase on current master
v2: Use device in function args before making additional changes in
    following patches

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-05 20:09:46 -06:00
Timothy Arceri
71b3d681d8 radeonsi/nir: fix handling of doubles for gs inputs
Fixes piglit test:
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Timothy Arceri
20bd0f6a2b ac: pass the unmodified number of components to load gs inputs
Currently both users of this would overflow an array when the
input was a dual slot double as they expected the number of
components to be a max of 4.

Since we pass the type we can just let the functions handle
doubles in a way they choose.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Timothy Arceri
2a68c6c6c8 radeonsi: move si_nir_load_input_gs() to si_shader.c
All the tess shader and tgsi equivalents are here and it allows
us to use llvm_type_is_64bit() in the following patch without
exposing it externally.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Boris Brezillon
9ea90ffb98 broadcom/vc4: Add support for HW perfmon
The V3D engine provides several perf counters.
Implement ->get_driver_query_[group_]info() so that these counters are
exposed through the GL_AMD_performance_monitor extension.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-05 15:54:04 -08:00
Boris Brezillon
5924379a58 drm-uapi: Update vc4 header with perfmon related definitions
v2: Update to the final version with the documentation.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-05 15:53:48 -08:00
Roland Scheidegger
434523cf2a r600: fix color export mask
The r600 code (not the eg one) forgot to copy the ps_color_export_mask
in commit 5b14e06d8b when updating the
pixel state, leading to misrenderings (probably with MRT).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262

Tested-by: LoneVVolf <lonewolf@xs4all.nl>
Tested-by: Pavel Vinogradov <public@sourcemage.org>
2018-03-05 20:15:05 +01:00
Andres Gomez
72552012c7 travis: keep meson version below 0.45.0
Recently Meson upgraded to 0.45.0 and it needs python 3.5+, which is
not available in Trusty.

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-05 21:12:37 +02:00
Kenneth Graunke
0472aa3efe intel: Drop SURFACE_FORMAT enum from genxml.
We want people to be using ISL_FORMAT_*, rather than the genxml format
enumerations. This patch drops 10 separate copies, and drops a bunch
of ugly casting.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Minor changes for rebase]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:08 -08:00
Jordan Justen
755e7e6c20 intel/common: Use isl for decoder surface formats
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:04 -08:00
Jordan Justen
bd3392423d intel/isl: Add isl_format_is_valid
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:01 -08:00
Jordan Justen
272bef0601 intel: Split gen_device_info out into libintel_dev
Split out the device info so isl doesn't depend on intel/common. Now
it will depend on the new intel/dev device info lib.

This will allow the decoder in intel/common to use isl, allowing us to
apply Ken's patch that removes the genxml duplication of surface
formats.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:47:37 -08:00
Gert Wollny
9a0d7bb48c gallium/aux/hud: Avoid possible buffer overflow
Limit the length of acceptable cpu names for use in hud_get_num_cpufreq
in order to avoid a buffer overflow later in add_object when this name
is copied into cpufreq_info::name.
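
A minimal sketch of the kind of length check described above, assuming a
fixed-size destination buffer. NAME_LEN and name_fits() are hypothetical;
the real size of cpufreq_info::name and the actual check in
hud_get_num_cpufreq are not shown in this log:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical size of the destination buffer, terminating NUL included. */
#define NAME_LEN 16

/* Accept a name only if it fits in the buffer together with its NUL
 * terminator, so that a later copy cannot write past the end.
 */
static bool name_fits(const char *name)
{
   return strlen(name) < NAME_LEN;
}
```

Names that fail the check would be skipped before any copy happens.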

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-05 11:38:28 -05:00
Eric Engestrom
b98c905a46 gbm: give a name to rgba fields
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-05 15:14:36 +00:00
Andres Gomez
40abffb295 egl: remove duplicated initialization
Found by inspection.

The line removed is a duplicate of the line literally just above the
3 lines of context usually printed in a commit log.

v2: enhance the commit log (Emil).

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-05 15:55:53 +02:00
Rob Clark
5a5a43078c freedreno/ir3: start dealing with half-precision
Some instructions assume src and/or dst is half-precision based on a
type field (ie. f32/s32/u32 are full precision but others are half
precision).  So add some code to sanity check the src/dst registers to
catch mixups.

Also propagate half-precision flag for SSA sources.  The instruction
consuming a SSA value needs to be of the same type as the one producing
it.

This is probably not complete half-precision support, but a useful first
step.  We do still need to add support for nir alu instructions for
converting between half/full precision.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
175d1b4372 freedreno/ir3: fix fixing-up register footprint
It isn't just vertex shaders that need to fix up reg footprint for inputs
populated before shader starts.

This problem showed up with compute shaders.  If you have (for example)
a localregid sysval, but only the .x component is used, the hw still
writes the .yz components, which could overflow into other threads
causing corruption.  Showed up in cl cts 'basic/test_basic intmath_int'.
But in theory the same problem could crop up elsewhere.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9a62536108 freedreno: surfaces can be PIPE_BUFFER
At least for clover.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
d7af35a7f3 freedreno/a5xx: handle compute resources
Not *entirely* sure why this is a different BIND bit, but it is.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
82c71b09d5 freedreno/ir3: ignore return jump
I think this should also always only occur at the end of a BB (by
definition), and the BB successor should be the end block.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
c9b1cc33df freedreno: add some more compute caps
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9630f4df3b freedreno/a5xx: don't expose 64b pointers yet
Temporary hack, but since we can't do 64b math yet in ir3, pretend that
we don't support 64b pointers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
54988f1e6b freedreno: steal handy macro for compute caps from nouveau
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
26a9321d0a freedreno: add global_bindings state
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
8c42f63151 freedreno/ir3: small cleanup
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
76687b0c0a freedreno: add pctx->memory_barrier()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9e4f5966e8 freedreno/ir3: cmdline compiler updates for spv shaders
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Samuel Pitoiset
322a51b549 ac: add ac_build_fsign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:36 +01:00
Samuel Pitoiset
e8bdde2289 ac: add ac_build_isign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:32 +01:00
Samuel Pitoiset
459e33900f ac: add ac_build_fract()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:30 +01:00
gurchetansingh@chromium.org
fe0647df5a virgl: add offset alignment values to v2 caps struct
glBindBufferRange(..) in vrend_draw_bind_ubo is failing with
more than one uniform block. This is due to improper alignment
of the start of the second block. Let's query the proper
alignment from the driver and pass it back to Mesa.
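
A minimal sketch of the alignment requirement, assuming the start of each
uniform block must be a multiple of the value the driver reports for
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT. The helper align_up() is hypothetical
and is not the code added by this patch:

```c
#include <stdint.h>

/* Round `offset` up to the next multiple of `alignment` (alignment > 0). */
static uint32_t align_up(uint32_t offset, uint32_t alignment)
{
   return (offset + alignment - 1) / alignment * alignment;
}
```

With an alignment of 256, a second block that would otherwise start at
offset 100 has to start at 256 instead.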

Let's query for the texture alignment too, even though the Virgl
renderer doesn't call glTexBufferRange yet.

The default values are the widest workable range possible (for example,
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256).

Fixes:
	dEQP-GLES3.functional.ubo.* on Nvidia

Example test:
	dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex

Note: This is based on "virgl: reduce some default capset limits.",
which hasn't landed in Mesa yet but should relatively soon.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:39 +10:00
Dave Airlie
9283cf2ad1 virgl: reduce some default capset limits.
Since v2 might take a while to rollout, we should reduce
these inside some gathered minimums and then v2 can increase
them using host values.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:38 +10:00
Dave Airlie
cd32258ec1 virgl: handle getting new capsets.
This checks the kernel api is new enough and asks for the
larger caps size since the kernel won't mess it up now.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:38 +10:00
Timothy Arceri
70190a6567 radeonsi/nir: call ac_lower_indirect_derefs()
Fixes piglit tests:
tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test
tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Timothy Arceri
561503e3bd radeonsi: add chip class to compiler_ctx_state
This will be used in the following patch.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Timothy Arceri
0f2c7341e8 ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c
Until llvm handles indirects better we will need to use these
workarounds in the radeonsi backend also.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Bas Nieuwenhuizen
eea20d59ab radv: Fix copying from 3D images starting at non-zero depth.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-05 01:04:54 +01:00
Vinson Lee
bb742b6ebf swr/rast: Fix macOS macro.
Fixes: a25093de71 ("swr/rast: Implement JIT shader caching to disk")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-03-04 13:23:57 -08:00
Mathias Fröhlich
411aa8c322 vbo: Try to reuse the same VAO more often for successive dlists.
The change tries to catch more opportunities to reuse the same set
of VAOs when building up display lists. Instead of checking the
offset with respect to the beginning of the vertex buffer object
the change tries to apply this same optimization with respect to the
previous display list node.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-03 05:56:35 +01:00
Ian Romanick
a9eb455e29 mesa: Silence unused parameter warnings from TEXSTORE_PARAMS
Reduces my build from 1717 warnings to 1547 warnings by silencing 170
instances of things like

In file included from ../../SOURCE/master/src/mesa/main/texcompress_bptc.h:30:0,
                 from ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:31:
../../SOURCE/master/src/mesa/main/texcompress_bptc.c: In function ‘_mesa_texstore_bptc_rgba_unorm’:
../../SOURCE/master/src/mesa/main/texstore.h:60:14: warning: unused parameter ‘dstFormat’ [-Wunused-parameter]
  mesa_format dstFormat, \
              ^
../../SOURCE/master/src/mesa/main/texcompress_bptc.c:1276:32: note: in expansion of macro ‘TEXSTORE_PARAMS’
 _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS)
                                ^~~~~~~~~~~~~~~
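
A minimal sketch of one common way to silence -Wunused-parameter, assuming a
GCC-compatible compiler. How the patch itself does it is not shown in this
log. SKETCH_UNUSED and store_texel() are hypothetical names used only for
illustration:

```c
/* Mark a parameter as intentionally unused on GCC-compatible compilers;
 * expand to nothing elsewhere.
 */
#if defined(__GNUC__)
#define SKETCH_UNUSED __attribute__((unused))
#else
#define SKETCH_UNUSED
#endif

/* dst_format is part of the shared signature but this variant ignores it. */
static int store_texel(SKETCH_UNUSED int dst_format, int value)
{
   return value;
}
```

The annotation keeps the shared function signature intact while telling the
compiler that ignoring the parameter is deliberate.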

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
1049b57bf2 i965: Silence unused parameter warnings in genX_state_upload
Reduces my build from 1772 warnings to 1717 warnings by silencing 55
instances of things like

../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_vertex_buffer_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:313:41: warning: unused parameter ‘end_offset’ [-Wunused-parameter]
                                unsigned end_offset,
                                         ^~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_sampler_state_pointers_xs’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4689:58: warning: unused parameter ‘brw’ [-Wunused-parameter]
 genX(emit_sampler_state_pointers_xs)(struct brw_context *brw,
                                                          ^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4690:62: warning: unused parameter ‘stage_state’ [-Wunused-parameter]
                                      struct brw_stage_state *stage_state)
                                                              ^~~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_upload_default_color’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4730:40: warning: unused parameter ‘format’ [-Wunused-parameter]
                            mesa_format format, GLenum base_format,
                                        ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘translate_wrap_mode’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4906:41: warning: unused parameter ‘brw’ [-Wunused-parameter]
 translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest)
                                         ^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_update_sampler_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4972:37: warning: unused parameter ‘batch_offset_for_sampler_state’ [-Wunused-parameter]
                            uint32_t batch_offset_for_sampler_state)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
50bf186829 isl: Silence unused parameter warnings in __gen_combine_address implementations
Reduces my build from 1808 warnings to 1772 warnings by silencing 36
instances of things like

../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’:
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter]
 __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
                             ^~~~
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter]
 __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
                                         ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
492a472b28 genxml: Silence unused parameter warnings in generated pack code
Reduces my build from 1960 warnings to 1808 warnings by silencing 152
instances of things like

In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0,
                 from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36:
src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’:
src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_uint(uint64_t v, uint32_t start, uint32_t end)
                                                 ^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’:
src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter]
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
                                   ^~~~~
src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
                                                   ^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’:
src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits)
                                                ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
f726695cce i965: Silence unused parameter warnings in blorp
Reduces my build from 2023 warnings to 1960 warnings by silencing 63
instances of things like

In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter]
                        const struct blorp_params *params)
                                                   ^~~~~~
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter]
                          const struct blorp_params *params)
                                                     ^~~~~~
In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter]
                     const struct blorp_params *params)
                                                ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                       ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                                    ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                                                  ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
3a944316c4 nir: Silence unused parameter warnings in generated nir_constant_expressions code
Reduces my build from 2075 warnings to 2023 warnings by silencing 52
instances of things like

src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’:
src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter]
 evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size,
                                                             ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
ab8f2e30b8 i965: Silence unused parameter warnings in generated OA code
Reduces my build from 6301 warnings to 2075 warnings by silencing 4226
instances of things like

src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c: In function ‘hsw__render_basic__gpu_core_clocks__read’:
src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c:41:62: warning: unused parameter ‘brw’ [-Wunused-parameter]
 hsw__render_basic__gpu_core_clocks__read(struct brw_context *brw,
                                                              ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
a55dae6ea2 i965: Silence warnings about mixing enum and non-enum in conditional
Reduces my build from 6451 warnings to 6301 warnings by silencing 150
instances of

../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info*, const brw_inst*)’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
    unsigned file = __builtin_strcmp("dst", #reg) == 0 ?                       \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
                    BRW_GENERAL_REGISTER_FILE :                                \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                    brw_inst_##reg##_reg_file(devinfo, inst);                  \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’
 REG_TYPE(src1)
 ^~~~~~~~
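
A minimal sketch of one way to avoid mixing an enumerator with an integer in
a conditional expression: cast the enumerator so both arms have the same
type. This is an illustration only; the change actually made to brw_inst.h
is not shown in this log, and all names below are hypothetical:

```c
enum sketch_reg_file {
   SKETCH_GENERAL_REGISTER_FILE = 1,
};

/* Stand-in for a field accessor that returns a plain unsigned value. */
static unsigned sketch_read_reg_file(void)
{
   return 3;
}

static unsigned sketch_select_file(int is_dst)
{
   /* Both arms are unsigned, so no enum/non-enum mix remains. */
   return is_dst ? (unsigned) SKETCH_GENERAL_REGISTER_FILE
                 : sketch_read_reg_file();
}
```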

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
feefb7810e intel/compiler: Silence unused parameter warnings in release builds
Reduces my build from 7005 warnings to 6451 warnings by silencing 554
instances of

In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0:
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo,
                                                         ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo,
                                                         ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_set_3src_a1_src2_imm(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn)
                                               ^~~~~~~
In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0,
                 from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29:
../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’:
../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’:
../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter]
         unsigned _reg_file,
                  ^~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
c8a03ab453 i965: Silence unused parameter warnings
Reduces my build from 7119 warnings to 7005 warnings by silencing 114
instances of

In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_context.h:46:0,
                 from ../../SOURCE/master/src/mesa/drivers/dri/i965/intel_pixel_read.c:38:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h: In function ‘brw_bo_unmap’:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h:258:47: warning: unused parameter ‘bo’ [-Wunused-parameter]
 static inline int brw_bo_unmap(struct brw_bo *bo) { return 0; }
                                               ^~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Kenneth Graunke
9fa95359df intel: Drop program size pointer from vec4/fs assembly getters.
These days, we're just passing a pointer to a prog_data field, which
we already have access to.  We can just use it directly.

(In the past, it was a pointer to a separate value.)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-02 14:20:22 -08:00
Kenneth Graunke
b04cf529f2 i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT.
This should have no practical impact.  For the default uploader, we
don't really care, but for others, we may want to append more data
as the GPU is reading existing data, which means we need async and
persistent flags.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-02 14:19:33 -08:00
Kenneth Graunke
eb99bf8abe i965: Generalize intel_upload.c to support multiple uploaders.
I'd like to reuse the upload logic for a new program cache, but the
buffers will need to have a different lifetime than the default
uploader, and also some address space restrictions.  So, we can't
use a single uploader for both situations - we'll need two of them.

This creates a public 'uploader' structure, and adjusts the interface
to take an uploader rather than always using brw->upload.  It should
have no functional change at the moment.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-02 14:19:33 -08:00
Anuj Phogat
56dc9f9f49 intel/compiler: Memory fence commit must always be enabled for gen10+
Commit bit in the message descriptor (Bit 13) must be always set
to true in CNL+ for memory fence messages. It also fixes a piglit
GPU hang on cnl+ in simulation environment.
Piglit test: arb_shader_image_load_store-shader-mem-barrier
See HSD ES # 1404612949
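
A minimal sketch of the rule described above, not the actual compiler code.
The helper name and its parameters are hypothetical; only the bit position
(13) and the gen10+ condition come from the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 13 of the message descriptor, per the commit text above. */
#define FENCE_COMMIT_ENABLE_BIT (UINT32_C(1) << 13)

/* Hypothetical helper: force the commit bit on gen10+ and leave it to the
 * caller's choice on older hardware. */
uint32_t memory_fence_descriptor(uint32_t desc, int gen, bool commit_enable)
{
    assert(gen > 0);
    if (gen >= 10 || commit_enable)
        desc |= FENCE_COMMIT_ENABLE_BIT;
    return desc;
}
```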

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-02 11:45:21 -08:00
Francisco Jerez
4b4838b1ae Revert "i965/fs: Predicate byte scattered writes if needed"
This reverts commit a4031bdfa9.  It's
redundant with the sample mask predication done at this point by the
common logical send lowering infrastructure, and rather buggy because
it wasn't applying the correct sample mask in shaders using discard,
since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS
doesn't reflect samples discarded by the shader, so it could have led
to data corruption in fragment shader invocations that execute discard
based on a non-dynamically uniform condition.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
c063e88909 intel/fs: Handle surface opcode sample masks via predication.
The main motivation is to enable HDC surface opcodes on ICL which no
longer allows the sample mask to be provided in a message header, but
this is enabled all the way back to IVB when possible because it
decreases the instruction count of some shaders using HDC messages
significantly, e.g. one of the SynMark2 CSDof compute shaders
decreases instruction count by about 40% due to the removal of header
setup boilerplate which in turn makes a number of send message
payloads more easily CSE-able.  Shader-db results on SKL:

 total instructions in shared programs: 15325319 -> 15314384 (-0.07%)
 instructions in affected programs: 311532 -> 300597 (-3.51%)
 helped: 491
 HURT: 1

Shader-db results on BDW where the optimization needs to be disabled
in some cases due to hardware restrictions:

 total instructions in shared programs: 15604794 -> 15598028 (-0.04%)
 instructions in affected programs: 220863 -> 214097 (-3.06%)
 helped: 351
 HURT: 0

The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL
laptop with this change.  According to Eero this improves performance
of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2
desktop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
2018-03-02 11:28:56 -08:00
Francisco Jerez
e7c9adca57 intel/eu: Plumb header present bit to codegen helpers for HDC messages.
This makes sure that the header-present bit of the message descriptor
is in sync with the IR instruction fields, which gives the optimizer
more control to avoid the overhead of setting up a message header when
it's possible to do so.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
6edb332b44 intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.
This shouldn't cause any functional change at this point, it changes
SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at
the IR level instead of the hard-coded f1.0, now that it can be
represented in backend_instruction::flag_subreg.  This will be
necessary for scheduling to behave correctly once more things start
making use of f1.0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
cc0fc8b8ac intel/ir: Allow representing additional flag subregisters in the IR.
This allows representing conditional mods and predicates on f1.0-f1.1
at the IR level by adding an extra bit to the flag_subreg
backend_instruction field.
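
One way to picture the extra bit, as a sketch only: the encoding below
(value = 2 * register + subregister) is an assumption made for illustration,
and the type and function names are hypothetical rather than taken from the
backend.

```c
#include <assert.h>

/* A flag register/subregister pair such as f1.0. */
struct flag_reg {
    unsigned reg;    /* 0 or 1: f0 or f1 */
    unsigned subreg; /* 0 or 1: .0 or .1 */
};

/* Assumed encoding: two bits, value = 2 * reg + subreg. */
unsigned encode_flag_subreg(unsigned reg, unsigned subreg)
{
    assert(reg < 2 && subreg < 2);
    return reg * 2 + subreg;
}

struct flag_reg decode_flag_subreg(unsigned flag_subreg)
{
    assert(flag_subreg < 4);
    struct flag_reg f = { flag_subreg / 2, flag_subreg % 2 };
    return f;
}
```

With a single bit only f0.0 and f0.1 fit; the second bit makes room for
f1.0 and f1.1.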

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
9ec3362e0b intel/l3: Don't allocate SLM partition on ICL+.
SLM has a chunk of special-purpose memory separate from L3 on ICL+, we
shouldn't allocate a partition for it on L3 anymore.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Charmaine Lee
af8877af3b svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs
Since geometry shader also consumes prescale constants, the
geometry shader constant buffer will need to be updated when prescale
factor is changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
dc79b88402 svga: fix blending regression
The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend
state when it's disabled for a colorbuffer") subtly changed the
details of gallium's per-RT blend state.

In particular, when pipe_rt_blend_state[i].blend_enabled is true,
we have to get the src/dst blend terms from pipe_rt_blend_state[i],
not [0] as before.

We now have to scan the blend targets to find the first one that's
enabled (if any).  We have to use the index of that target for getting
the src/dst blend terms.  And note that we have to set identical blend
terms for all targets.
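
The scan described above can be sketched as follows. The struct is an
illustrative stand-in for gallium's per-RT blend state, and the function
name is hypothetical; this is not the svga driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the per-render-target blend state. */
struct rt_blend {
    bool enabled;
    int src_factor;
    int dst_factor;
};

/* Returns the index of the first target with blending enabled,
 * or -1 if blending is disabled everywhere. */
int first_enabled_target(const struct rt_blend *rt, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (rt[i].enabled)
            return i;
    }
    return -1;
}
```

The caller would then read the src/dst terms from the returned index and
apply them to every target.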

This fixes the Piglit fbo-drawbuffers2-blend test.  VMware bug 2063493.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
b871a77316 svga: check svga_have_vgpu10() in svga_delete_blend_state()
We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not
enabled (bs->id==0 by default), resulting in lots of device errors.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
72df3a7a39 svga: if svga_update_state() fails, skip the draw call
If svga_update_state() fails, we flush the command buffer and retry.
If it fails again, it likely means we were unable to translate a shader
for some reason (uses too many resources, for example).  In that case,
let's just skip the draw call.  The alternative, just disabling the
shader stage in question, would certainly lead to bad rendering anyway,
and probably device errors.

Fixes failed assertion running Piglit glsl-1.50/execution/
variable-indexing/gs-output-array-vec4-index-wr.shader_test since it
uses too many GS output registers (though the test still fails).
VMware bug 2063492.

v2: also call pipe_debug_message() so apps or apitrace can be notified
when this issue occurs.
v3: use svga_update_state_retry().

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
0a7deaa0d6 svga: let svga_update_state_retry() return a bool
This will allow minor simplifications elsewhere.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
35c5cf8959 svga: s/unsigned/boolean/ for a few local vars
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-02 12:23:50 -07:00
Dylan Baker
e23192022a meson: install vulkan_intel.h header
Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-02 11:11:20 -08:00
Boyuan Zhang
1ad89fa138 st/omx_bellagio: add picture profile and entry point
Profile and entry point were missing in the picture structure.
Therefore, add them back.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-03-02 12:04:36 -05:00
Boyuan Zhang
6a62e455f2 radeonsi: fix radeon create encoder return
Previous patch missed a "return" when trying to modify the create encoder
function, which made the whole logic fail. Therefore, add the return back.

Fixes: b38b208ff8 "radeonsi:create uvd hevc enc entry"

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-02 12:04:36 -05:00
Thierry Reding
f9bc48d41d loader: Add support for platform and host1x busses
ARM SoCs usually have their DRM/KMS devices on the platform bus, so add
support for this bus in order to allow use of the DRI_PRIME environment
variable with those devices.

While at it, also support the host1x bus, which is effectively the same
but uses an additional layer in the bus hierarchy.

Note that it isn't enough to support the bus that has the rendering GPU
because the loader code will also try to construct an ID path tag for a
scanout-only device if it is the default that is being opened.

The ID path tag for a device can be obtained by running udevadm info on
the device node, as shown in this example on NVIDIA Tegra:

	$ udevadm info /dev/dri/card0 | grep ID_PATH_TAG
	E: ID_PATH_TAG=platform-50000000_host1x

The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is
constructed, can be found in the sysfs "uevent" attribute for the card0
device's parent:

	$ grep OF_FULLNAME /sys/devices/platform/50000000.host1x/drm/uevent
	OF_FULLNAME=/host1x@50000000

Similarly, /dev/dri/card1 corresponds to the GPU:

	$ udevadm info /dev/dri/card1 | grep ID_PATH_TAG
	E: ID_PATH_TAG=platform-57000000_gpu

and:

	$ grep OF_FULLNAME /sys/devices/platform/57000000.gpu/uevent
	OF_FULLNAME=/gpu@57000000
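
The mapping shown in these two examples can be sketched in C as below. This
only mirrors the examples in this message; the function name is hypothetical
and the real loader code may build the tag differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds "platform-<address>_<name>" from the last "<name>@<address>"
 * component of an OF_FULLNAME string. Returns 0 on success, -1 if the
 * input has no usable component or the output buffer is too small. */
int platform_id_path_tag(const char *of_fullname, char *out, size_t out_size)
{
    assert(of_fullname && out);

    const char *name = strrchr(of_fullname, '/');
    name = name ? name + 1 : of_fullname;

    const char *at = strchr(name, '@');
    if (!at || at == name || at[1] == '\0')
        return -1;

    int len = snprintf(out, out_size, "platform-%s_%.*s",
                       at + 1, (int)(at - name), name);
    return (len < 0 || (size_t)len >= out_size) ? -1 : 0;
}
```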

Changes in v2:
- avoid confusing pre-increment in strdup()
- add examples of tags to commit message

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-02 14:40:29 +01:00
Thierry Reding
498faea103 disk cache: Link with -latomic if necessary
The disk cache implementation uses 64-bit atomic operations. For some
architectures, such as 32-bit ARM, GCC will not be able to translate
these operations into atomic, lock-free instructions and will instead
rely on the external atomics library to provide these operations.

Check at configuration time whether or not linking against libatomic
is necessary and if so, create a dependency that can be used while
linking the mesautil library.

This is the meson equivalent of 2ef7f23820 ("configure: check if
-latomic is needed for __atomic_*").
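
For illustration, this is the kind of operation that can require libatomic.
It is a sketch rather than the disk cache code, and the function name is
hypothetical; __atomic_fetch_add is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Atomically adds n to a 64-bit counter and returns the previous value.
 * On targets without native 64-bit atomics the compiler may emit a call
 * into libatomic here instead of inline instructions, in which case the
 * final link needs -latomic. */
uint64_t counter_add(uint64_t *counter, uint64_t n)
{
    assert(counter);
    return __atomic_fetch_add(counter, n, __ATOMIC_SEQ_CST);
}
```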

For some background information on this, see:

	https://gcc.gnu.org/wiki/Atomic/GCCMM

Changes in v2:
- clarify meaning of lock-free in commit message
- fix build if -latomic is not necessary

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-02 11:31:59 +01:00
Samuel Pitoiset
c133a3411b radv: do not set pending_reset_query in BeginCommandBuffer()
This is just useless for two reasons:
1) flush_bits is not set accordingly, so nothing will be flushed
   in BeginQuery().
2) we always flush caches in EndCommandBuffer(), so if a reset
   is done in a previous command buffer we are safe.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-02 09:44:12 +01:00
Dave Airlie
bf2af063c3 r600/cayman: fix fragcoord loading recip generation.
This fixes some hangs seen where the recip_ieee opcodes would
end up split across the wrong slots.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-02 00:33:18 +00:00
Kenneth Graunke
cee9f38903 i965: Allow 48-bit addressing on Gen8+.
This allows most GPU objects to use the full 48-bit address space
offered by Gen8+ platforms, rather than being stuck with 32-bit.
This expands the available GPU memory from 4G to 256TB or so.

A few objects - instruction, scratch, and vertex buffers - need to
remain pinned in the low 4GB of the address space for various reasons.
We default everything to 48-bit but disable it in those cases.

Thanks to Jason Ekstrand for blazing this trail in anv first and
finding the nasty undocumented hardware issues.  This patch simply
rips off all of his findings.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-01 15:46:11 -08:00
Kenneth Graunke
6712611735 i965: Shorten the name of the workaround BO.
This makes the name shorter in debug printouts.  If "workaround_bo"
is good enough for the code, it's probably good enough for debugging.
2018-03-01 15:46:11 -08:00
Kenneth Graunke
b04c5cece7 i965: Add debugging code to dump the validation list.
When anything goes wrong with this code, dumping the validation list
is a useful way to figure out what's happening.
2018-03-01 15:46:11 -08:00
Jason Ekstrand
ff4726077d intel/fs: Set up sampler message headers in the visitor on gen7+
This gives the scheduler visibility into the headers which should
improve scheduling.  More importantly, however, it lets the scheduler
know that the header gets written.  As-is, the scheduler thinks that a
texture instruction only reads its payload and is unaware that it may
write to the first register so it may reorder it with respect to a read
from that register.  This is causing issues in a couple of Dota 2 vertex
shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-01 15:11:01 -08:00
Timothy Arceri
f5305c1b44 ac: fix nir_intrinsic_shared_atomic_comp_swap handling
Following on from 49879f3778 this makes sure we use the correct
src index.

Fixes cts test:
KHR-GL46.compute_shader.atomic-case3

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-02 09:11:20 +11:00
Timothy Arceri
13cdf4e590 st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs
We only need to check for previously processed location on user
defined varyings as they are the only ones that support component
packing. Therefore a single instance of processed_locs can be
shared by regular varyings and patches.

For simplicity we make processed_locs an array in order to handle
dual source blending.

Fixes the following piglit test on radeonsi:
tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-02 09:11:20 +11:00
Jason Ekstrand
89f78cf333 anv: Enable MSAA fast-clears
This speeds up the Sascha Willems multisampling demo by around 25% when
using 8x or 16x MSAA.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
00da139477 anv/cmd_buffer: Add support for MCS fast-clears and resolves
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
1805c483b1 anv/cmd_buffer: Add helpers for computing resolve predicates
We'll want to re-use the complex resolve predicate computations for MCS
resolves so it's nice to have them as helper functions.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
a0a319f16e anv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage
This doesn't actually do anything because att_state->fast_clear is
determined based on the return value of anv_layout_to_fast_clear_type
which currently returns NONE for multisampled images.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
d0f701d2f1 anv/blorp: Pass the clear address to blorp for subpass MSAA resolves
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
f4f95496cb anv/blorp: Allow indirect clear colors on blorp sources on gen7
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
d85f05bd6f anv/blorp: Add partial clear support to anv_image_mcs_op
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
c34feaea52 intel/blorp: Add indirect clear color support to mcs_partial_resolve
This is a bit complicated because we have to get the indirect clear
color in there somehow.  In order to not do any more work in the shader
than needed, we set it up as its own vertex binding which points
directly at the clear color address specified by the client.

Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
ca7ab1a6a5 intel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE
There are enough #ifs in there that it's kind-of pointless to duplicate
it for each buffer.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Andriy Khulap
7859701920 i965: Fix RELOC_WRITE typo in brw_store_data_imm64()
Fixes: 6c530ad116
("i965: Reduce passing 2x32b of reloc_domains to 2 bits")

Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-01 11:20:04 -08:00
Jonathan Gray
034bbaa6c0 gallium/util: use sockets on PIPE_OS_UNIX in u_network
Instead of listing all the UNIX PIPE_OS platforms just use
PIPE_OS_UNIX.  Makes BSD sockets available on PIPE_OS_BSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-01 18:44:39 +00:00
Jonathan Gray
7bea40e566 util: use clock_gettime() on PIPE_OS_BSD
OpenBSD, FreeBSD, NetBSD and DragonFlyBSD all have clock_gettime()
so use it when PIPE_OS_BSD is defined.
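
A minimal sketch of a clock_gettime()-based timer, assuming a POSIX system
with CLOCK_MONOTONIC. The function name is hypothetical and this is not the
code in src/util.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Returns a monotonic timestamp in nanoseconds, or -1 on failure. */
int64_t monotonic_nanoseconds(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return (int64_t)ts.tv_sec * INT64_C(1000000000) + ts.tv_nsec;
}
```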

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-01 18:44:38 +00:00
Jose Maria Casanova Crespo
4420d8866c nir/search: Include 8 and 16-bit support in construct_value
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-01 09:16:03 -08:00
Jason Ekstrand
99ee40fb54 nir/search: Support 8 and 16-bit constants in match_value
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-03-01 09:15:01 -08:00
Andres Gomez
b5b912dfee travis: make Meson find the proper llvm-config
Travis CI has moved to LLVM 5.0, and meson is detecting automatically
the available version in /usr/local/bin based on the PATH env variable
order preference.

As of 0.44.x, Meson cannot receive the path to the llvm-config binary
as a configuration parameter. See
https://github.com/mesonbuild/meson/issues/2887 and
7c8b6ee3fa

We want to use the custom (APT) installed version. Therefore, let's
make Meson find our wanted version sooner than the one at
/usr/local/bin

Once this is corrected, we would still need a patch similar to:
https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html

v2: Create the link only to the specifically wanted LLVM version (Gert).

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Gert Wollny <gw.fossdev@gmail.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-01 12:21:30 +02:00
Andres Gomez
98f7650add meson: fix LLVM version detection when <= 3.4
LLVM only started using three-component version numbers with 3.4.1.

Hence, even if you can perfectly build with an old LLVM (< 3.4.1) in
the system while not needing LLVM at all (auto), when passing through
the LLVM version detection code, meson will fail when accessing
"_llvm_version[2]" due to:

"Index 2 out of bounds of array of size 2."

v2: Properly compare LLVM version and set patch version to 0
    if < 3.4.1 (Eric).

v3: Improve the commit log explanation (Eric).
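
The idea behind the fix, expressed in C purely for illustration (the actual
change lives in the Meson build files, and the function name here is
hypothetical): treat a missing third component as patch version 0 instead of
indexing past the end.

```c
#include <assert.h>
#include <stdio.h>

/* Parses "major.minor" or "major.minor.patch" into v[0..2].
 * A missing patch component is reported as 0.
 * Returns the number of components found, or -1 if fewer than two. */
int parse_llvm_version(const char *text, int v[3])
{
    assert(text && v);
    v[0] = v[1] = v[2] = 0;
    int n = sscanf(text, "%d.%d.%d", &v[0], &v[1], &v[2]);
    return n >= 2 ? n : -1;
}
```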

Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-01 12:16:23 +02:00
Iago Toral Quiroga
bc73016703 i965/sbe: fix number of inputs for active components
In 16631ca30e we fixed gen9 active components to account for padded
inputs in the URB, which we can have with SSO programs. To do that,
instead of going through the bitfield of inputs (which doesn't include
padding information), we compute the number of inputs from the size
of the URB entry.

Unfortunately, there are some special inputs that are not stored in
the URB and that we also need to account for. These special inputs
are identified and handled during calculate_attr_overrides().

Instead of keeping track of the exact number of inputs, we just
program active components for all possible inputs like we do in
anvil.

This fixes a regression in a WebGL program that uses Point Sprite
functionality (specifically, VARYING_SLOT_PNTC).

v2:
 - Add 'Fixes' tag (Mark Janes)
 - make no_vue_inputs int instead of uint32_t, and add const qualifier
   to num_inputs variable (Ian)

v3:
 - Do not try to count inputs correctly, just program all input
   slots like we do in anvil (Ken)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105224
Fixes: 16631ca30e (i965/sbe: fix active components for SSO programs with over 16 inputs)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-01 10:55:12 +01:00
Samuel Pitoiset
c27f5419f6 radv: only emit cache flushes when the pool size is large enough
This is an optimization which reduces the number of flushes for
small pool buffers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:53:40 +01:00
Samuel Pitoiset
2fe07933bd radv: keep track of the query pool size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:53:39 +01:00
Samuel Pitoiset
c956d0f406 radv: make sure to emit cache flushes before starting a query
This is needed when the query pool has previously been reset using the
compute shader path.

Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:14:49 +01:00
Alejandro Piñeiro
e72fb4e611 nir/serialize: handle var->name being NULL
var->name could be NULL under ARB_gl_spirv for example. And in any
case, the code is already handling var name being NULL when reading a
variable, so it is consistent to do it writing a variable too.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-01 08:23:33 +01:00
Jose Maria Casanova Crespo
ba642ee3ee anv: Enable VK_KHR_16bit_storage for PushConstant
Enables storagePushConstant16 features of VK_KHR_16bit_storage for Gen8+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
02266f9ba1 spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned
The introduction of 16-bit types with VK_KHR_16bit_storage implies that
push constant offsets can be multiples of 2 bytes. Some assertions are
updated so that offsets only need to be a multiple of the size of the base
type, but we cannot assume even that everywhere, because doubles aren't
always aligned to 8 bytes.

For 16-bit types, the push constant offset takes into account the
internal offset in the 32-bit uniform bucket, adding 2 bytes when we access
elements that are not 32-bit aligned. In all 32-bit aligned cases it is 0.
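
As a sketch of that split (names are hypothetical, not the backend's): a
16-bit push constant at a given byte offset lives in 32-bit bucket
offset / 4, at internal byte offset offset % 4.

```c
#include <assert.h>

struct push_offset {
    unsigned dword_index;   /* which 32-bit uniform bucket */
    unsigned byte_in_dword; /* 0 or 2 for 16-bit elements */
};

/* Splits the byte offset of a 16-bit push constant into the 32-bit
 * bucket index and the internal offset within that bucket. */
struct push_offset split_16bit_push_offset(unsigned byte_offset)
{
    assert(byte_offset % 2 == 0);
    struct push_offset p = { byte_offset / 4, byte_offset % 4 };
    return p;
}
```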

v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
23ffb7c2d1 spirv: Calculate properly 16-bit vector sizes
Range in 16-bit push constants load was being calculated
wrongly using 4-bytes per element instead of 2-bytes as it
should be.
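
The corrected arithmetic, as a sketch with a hypothetical helper name: the
byte size of a vector follows from the bit size of its components rather
than a hard-coded 4 bytes per element.

```c
#include <assert.h>

/* Size in bytes of a vector with the given component count and
 * per-component bit size (16, 32 or 64 in practice). */
unsigned vector_size_bytes(unsigned num_components, unsigned bit_size)
{
    assert(bit_size % 8 == 0);
    return num_components * (bit_size / 8);
}
```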

v2: Use glsl_get_bit_size instead of if statement
    (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
994d210429 anv: Enable VK_KHR_16bit_storage for SSBO and UBO
Enables storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccess
features of VK_KHR_16bit_storage for Gen8+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
69be3a82ca i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout
Restrict the use of untyped_surface_write with 16-bit pairs in
SSBOs to the cases where we can guarantee that the offset is a multiple
of 4.

Taking into account that VK_KHR_relaxed_block_layout is available
in ANV, we can only guarantee that when we have a constant offset
that is a multiple of 4. For non-constant offsets we will always use
byte_scattered_write.

v2: (Jason Ekstrand)
    - Assert offset_reg to be multiple of 4 if it is immediate.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
8dd8be0323 i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layout
16-bit load_ubo/ssbo operations that call do_untyped_read_vector don't
guarantee that offsets are multiples of 4 bytes, as required by the
untyped_read message. This happens, for example, in the case of f16mat3x3
when VK_KHR_relaxed_block_layout is enabled.

Vector reads with non-constant offsets are implemented with multiple
byte_scattered_read messages, which do not require 32-bit aligned offsets.

Now for all constant offsets we can use the untyped_read_surface message.
In the case of constant offsets not aligned to 32 bits, we calculate a
32-bit aligned start offset and use the
shuffle_32bit_load_result_to_16bit_data function and the first_component
parameter to skip the copy of the unneeded component.
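
The constant-offset case can be sketched as below. Names and units (byte
offsets, dword counts) are assumptions made for illustration; this is not
the code in brw_fs_nir.

```c
#include <assert.h>

struct read_window {
    unsigned start;           /* 32-bit aligned byte offset to read from */
    unsigned first_component; /* 16-bit components to skip at the start */
    unsigned num_dwords;      /* number of 32-bit components to read */
};

/* Computes the aligned read covering num_components 16-bit values that
 * begin at byte_offset. */
struct read_window aligned_read_window(unsigned byte_offset,
                                       unsigned num_components)
{
    assert(byte_offset % 2 == 0 && num_components > 0);
    unsigned start = byte_offset & ~3u;
    unsigned end = (byte_offset + 2 * num_components + 3) & ~3u;
    struct read_window w = { start, (byte_offset - start) / 2,
                             (end - start) / 4 };
    return w;
}
```

For three components at byte offset 6, this reads two dwords starting at
byte 4 and skips one leading 16-bit component.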

v2: (Jason Ekstrand)
    Use untyped_read_surface messages whenever we have constant offsets.

v3: (Jason Ekstrand)
    Simplify loop for reads with non constant offsets.
    Use end - start to calculate the number of 32-bit components to read with
    constant offsets.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
2dd94f462b i965/fs: shuffle_32bit_load_result_to_16bit_data now skips components
This helper, which loads 16-bit components from 32-bit reads, now allows
skipping components via the new first_component parameter. It skips
components until it reaches first_component, and then reads the number of
components passed to the function.

All previous uses of the helper are updated to use 0 as first_component.
This allows reading 16-bit components when the first one is not 32-bit
aligned, enabling more uses of untyped reads with 16-bit types.
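
The semantics can be illustrated on the CPU as follows. This is a sketch
under the assumption that the low half of each dword holds the earlier
component; the function name is hypothetical and the real helper emits
shader IR rather than copying memory.

```c
#include <assert.h>
#include <stdint.h>

/* Extracts num_components 16-bit values from packed 32-bit data,
 * skipping the first first_component 16-bit slots. */
void unpack_16bit_components(const uint32_t *dwords,
                             unsigned first_component,
                             unsigned num_components,
                             uint16_t *out)
{
    assert(dwords && out);
    for (unsigned i = 0; i < num_components; i++) {
        unsigned c = first_component + i;
        uint32_t dw = dwords[c / 2];
        out[i] = (uint16_t)((c % 2) ? (dw >> 16) : (dw & 0xffffu));
    }
}
```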

v2: (Jason Ekstrand)
    Change parameters order to first_component, num_components

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
67d7dd594e isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit
The surfaces that back the GPU buffers have a bounds check that treats
accesses to partial dwords as out-of-bounds.
For example, buffers with 1 or 3 16-bit elements have size 2 or 6, and the
last two bytes would always be read as 0 or writes to them ignored.

The introduction of 16-bit types implies that we need to align the size
to 4-byte multiples so that partial dwords can be read/written.
Unconditionally adding 2 to the size of buffers that are not a multiple
of 4 solves this issue for the general cases of UBO or SSBO.

But, when unsized arrays of 16-bit elements are used it is not possible
to know if the size was padded or not. To solve this issue the
implementation calculates the needed size of the buffer surfaces,
as suggested by Jason:

surface_size = isl_align(buffer_size, 4) +
               (isl_align(buffer_size, 4) - buffer_size)

So when we recover the buffer_size in the backend, we
update the resinfo return value with:

buffer_size = (surface_size & ~3) - (surface_size & 3)
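
The two expressions above round-trip for every buffer size, because the padding amount (0 to 3) ends up in the two low bits of surface_size. A minimal sketch, with hypothetical function names and a local align_u32 standing in for isl_align:

```c
#include <stdint.h>

/* Round v up to a multiple of a (a must be a power of two).
 * Local stand-in for isl_align in this sketch. */
static uint32_t
align_u32(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Encode: pad to a dword multiple and add the pad amount once more, so
 * the low two bits of the result record how much padding was applied. */
static uint32_t
surface_size_from_buffer_size(uint32_t buffer_size)
{
   uint32_t aligned = align_u32(buffer_size, 4);
   return aligned + (aligned - buffer_size);
}

/* Decode: recover the original size from the padded surface size. */
static uint32_t
buffer_size_from_surface_size(uint32_t surface_size)
{
   return (surface_size & ~3u) - (surface_size & 3u);
}
```

For example, a 6-byte buffer gets a surface size of 10, and decoding 10 gives (8) - (2) = 6 again.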

These buffer requirements are also exposed when robust buffer access
is enabled, so these buffer sizes are recommended to be multiples of 4.

v2: (Jason Ekstrand)
    Move padding logic from anv to isl_surface_state.
    Move calculation of original size from spirv to driver backend.
v3: (Jason Ekstrand)
    Rename some variables and use a similar expression when calculating
    padding as when obtaining the original buffer size.
    Avoid use of unnecessary component call at brw_fs_nir.
v4: (Jason Ekstrand)
    Complete comment with buffer size calculation explanation in brw_fs_nir.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Mathias Fröhlich
4c232dc721 vbo: Remove vbo_save_vertex_list::vertex_size.
Like before use local variables from compile_vertex_list instead.
Remove vertex_size from struct vbo_save_vertex_list.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
478a9bc7bb vbo: Remove vbo_save_vertex_list::buffer_offset.
The buffer_offset is used in aligned_vertex_buffer_offset.
But now that most of these decisions are done in compile_vertex_list
we can work on local variables instead of struct members in the
display list code. Clean that up and remove buffer_offset.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
bfa8d8e5bf vbo: Remove vbo_save_vertex_list::start_vertex.
Replace last use on replay with _vbo_save_get_{min,max}_index. Apart from
that it is not used anymore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
6dd3e98c21 vbo: Remove vbo_save_vertex_list::attrsz.
It is no longer used on replay; move the last use in display list
compilation to the original array in the display list compiler.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
95b4be4f29 vbo: Remove vbo_save_vertex_list::attrtype.
It is no longer used on replay; move the last use in display list
compilation to the original array in the display list compiler.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
77df52cc4f vbo: Remove vbo_save_vertex_list::enabled.
It is no longer used on replay.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
19a0f27a49 vbo: Remove reference to the vertex_store from the dlist node.
Since we now store a set of VAOs in the display list, use these objects
to get the reference to the VBO in several places.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
6e410270ee vbo: Implement current values update in terms of the VAO.
Use the information already present in the VAO to update the current values
after display list replay. Set GL_OUT_OF_MEMORY on allocation failure
for the current value update storage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
08aa0d9bf4 vbo: Implement vbo_loopback_vertex_list in terms of the VAO.
Use the information already present in the VAO to replay a display list
node using immediate mode draw commands. Use a handful of helper methods
that will also be useful for the next patches.

v2: Insert asserts, constify local variables.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
f7178d677c vbo: Use a local variable for the dlist offsets.
The master value is now stored inside the VAO already present in
struct vbo_save_vertex_list. Remove the unneeded copy from dlist storage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
1cc3516a11 vbo: Remove unused vbo_save_context::wrap_count.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
07915020f0 vbo: Remove unused vbo_save_vertex_list::dangling_attr_ref.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Jason Ekstrand
6d3edbea16 anv: Always set has_context_priority
We don't zalloc the physical device so we need to unconditionally set
everything.  Crucible helpfully initializes all allocations to 139 so it
was getting true regardless of whether or not the kernel actually
supports context priorities.

Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 17:31:20 -08:00
Mark Janes
0fc009b8c7 Revert "i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+"
This reverts commit a2c1e48f15.

On BDWGT3e and KBLGT3e systems, this commit regressed the following
tests:

  piglit.spec.ext_framebuffer_multisample.accuracy 2 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 4 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 6 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 8 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy all_samples stencil_resolve small depthstencil
2018-02-28 17:26:08 -08:00
Dave Airlie
6c1b5a40fd radeonsi/nir: increase values to 8 for gs fetch.
This stops a crash when running (still fails):
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:35:09 +10:00
Bas Nieuwenhuizen
f9898b211e radv: Use the syncobj wait ioctl to wait on fences if possible.
Handles the !waitAll and signal-after-the-start-of-the-wait cases correctly.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
34bd5e2e2e radv: Implement more efficient !waitAll fence waiting.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
6968d782d3 radv: Implement waiting on non-submitted fences.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
2a404c6f92 radv: Implement WaitForFences with !waitAll.
Nothing to do except using a busy wait loop. At least for old kernels.

A better implementation for newer kernels to come later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Dave Airlie
49879f3778 ac/nir: fix shared atomic operations.
The nir->llvm conversion was using the wrong srcs.

Fixes:
tests/spec/arb_compute_shader/execution/shared-atomics.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:06:06 +10:00
Dave Airlie
69495b30a3 ac/nir: don't apply slice rounding on txf_ms
This matches the tgsi code.

Fixes arb_texture_multisample texelFetch piglit tests.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:04:34 +10:00
Timothy Arceri
f383fec903 radeonsi: set some context vars for nir path
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-01 10:51:56 +11:00
Timothy Arceri
7e46214f87 gallium: remove llvm from ir struct
This was added in 425dc4c4b3 but never used. Also since
100796c15c native has superseded llvm.

Acked-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:51:56 +11:00
Kenneth Graunke
e51b0664e0 i965: Don't emit MOVs with undefined registers for Gen4 point clipping.
Gen4 point clipping calls brw_clip_tri_alloc_regs with nr_verts == 0,
which means that c->reg.vertex[] isn't initialized.  It then emits MOVs
to stomp components of those uninitialized registers to 0.

This started causing assertions after Matt's recent series, when those
uninitialized registers started getting BRW_REGISTER_TYPE_NF, which
definitely doesn't exist on Gen4-5.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-28 15:03:51 -08:00
Eric Anholt
e4e79a02da broadcom/vc5: Fix regression in the page-cache slice size alignment.
We need to align the size of the slice, not the offset of the next slice.
Fixes KHR-GLES3.texture_repeat_mode.rgba32ui_11x131_2_clamp_to_edge.

Fixes: b4b4ada761 ("broadcom/vc5: Fix layout of 3D textures.")
2018-02-28 13:59:50 -08:00
Jason Ekstrand
a2c1e48f15 i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 13:31:42 -08:00
Jason Ekstrand
67da59e320 i965: Be more clever about setting up our viewport clip
Before, we were trusting in the hardware to take the intersection
of the viewport clip with the drawing rectangle.  Unfortunately,
3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly
does a full pipeline stall.  If we're a bit more careful with our
viewport clipping, we can just re-emit it once at context creation
time.
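
The idea of doing the intersection in software can be sketched as below. This is only an illustration under assumptions: the rectangle type and function name are hypothetical, rectangles are half-open, and the driver's actual viewport extent computation is not shown in this message.

```c
/* Half-open rectangle [x0, x1) x [y0, y1); hypothetical type. */
struct rect {
   int x0, y0, x1, y1;
};

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Intersect the viewport with the framebuffer bounds so the hardware
 * never needs the drawing rectangle to do it.  An empty intersection is
 * normalized to a zero-sized rectangle. */
static struct rect
clip_viewport_to_fb(struct rect vp, int fb_width, int fb_height)
{
   struct rect r;
   r.x0 = imax(vp.x0, 0);
   r.y0 = imax(vp.y0, 0);
   r.x1 = imin(vp.x1, fb_width);
   r.y1 = imin(vp.y1, fb_height);
   if (r.x1 < r.x0)
      r.x1 = r.x0;
   if (r.y1 < r.y0)
      r.y1 = r.y0;
   return r;
}
```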

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 13:31:42 -08:00
Matt Turner
debaa822ef intel/compiler: Re-add .vs_inputs_dual_locations = true
Looks like a rebase mistake.

Fixes: 89fe5190a2 ("intel/compiler: Lower flrp32 on Gen11+")
2018-02-28 13:25:21 -08:00
Dave Airlie
7cb9353de3 r600/shader: when using images always load thread id gpr at start (v2)
The delayed loading code would fail if we had control flow.

This fixes:
tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test

v2: don't use temp_reg before setting temp_reg up.

Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 20:16:19 +00:00
Dave Airlie
8369fdee8b r600: fix whitespace in recent 1d texture commit.
trivial fix.
2018-02-28 20:16:19 +00:00
Matt Turner
6f00bf519d intel/compiler: Add ICL to test_eu_validate.cpp
With the Align16 tests now disabled, we can run the rest of the tests in
ICL mode (and see them pass!)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
ff4b41dd1d intel/compiler: Disable Align16 tests on Gen11+
Align16 is no more.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
c31d77ac22 intel/compiler: Add instruction compaction support on Gen11
Gen11 only differs from SKL+ in that it uses a new datatype index table.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
d5bf093cf9 intel/compiler: Mark line, pln, and lrp as removed on Gen11+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
89fe5190a2 intel/compiler: Lower flrp32 on Gen11+
The LRP instruction is no more.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
2134ea3800 intel/compiler/fs: Implement ddy without using align16 for Gen11+
Align16 is no more. We previously generated an align16 ADD instruction
to calculate DDY:

   add(16) g25<1>F  -g23<4>.xyxyF   g23<4>.zwzwF   { align16 1H };

Without align16, we now implement it as:

   add(4) g25<1>F   -g23<0,2,1>F    g23.2<0,2,1>F  { align1 1N };
   add(4) g25.4<1>F -g23.4<0,2,1>F  g23.6<0,2,1>F  { align1 1N };
   add(4) g26<1>F   -g24<0,2,1>F    g24.2<0,2,1>F  { align1 1N };
   add(4) g26.4<1>F -g24.4<0,2,1>F  g24.6<0,2,1>F  { align1 1N };

where only the first two instructions are needed in SIMD8 mode.

Note: an earlier version of the patch implemented this in two
instructions in SIMD16:

   add(8) g25<2>F   -g23<4,2,0>F    g23.2<4,2,0>F  { align1 1N };
   add(8) g25.1<2>F -g23.1<4,2,0>F  g23.3<4,2,0>F  { align1 1N };

but I realized that the channel enable bits will not be correct. If we
knew we were under uniform control flow, we could emit only those two
instructions however.
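
As a reading aid, the four add(4) instructions can be modelled on the CPU as below. This is a sketch under assumptions, not generator code: the function name is hypothetical, and each group of four consecutive channels is treated as one subspan, which is how the <0,2,1> regions in the assembly above read their sources.

```c
/* Scalar model of the align1 sequence (hypothetical name).  For each group
 * of four channels, the <0,2,1> region makes channels 0..3 read source
 * elements 0,1,0,1 (first operand) and 2,3,2,3 (second operand). */
static void
ddy_model(float *dst, const float *src, unsigned exec_size)
{
   for (unsigned base = 0; base < exec_size; base += 4) {
      for (unsigned i = 0; i < 4; i++) {
         unsigned col = i & 1;   /* element selected within the pair */
         dst[base + i] = src[base + 2 + col] - src[base + col];
      }
   }
}
```

With exec_size 8 the loop covers two groups, matching the note that only the first two instructions are needed in SIMD8 mode.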

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
62cfd4c656 intel/compiler/fs: Simplify ddx/ddy code generation
The brw_reg() constructor just obfuscates things here, in my opinion.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
bed0267ff6 intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcode
In a future patch, generate_ddy will want to inspect inst->exec_size.
Change generate_ddx as well for consistency.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
3a584a15c0 intel/compiler/fs: Don't generate integer DWord multiply on Gen11
Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer
multiplies.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
432674ce93 intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+
The PLN instruction is no more. Its functionality is now implemented
using two MAD instructions with the new native-float type. Instead of

   pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F

we now have

   mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F
   mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F

... and in the case of SIMD8 only the first pair of MAD instructions is
used.
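
Per channel, each MAD pair evaluates a plane equation in two steps. The sketch below models that arithmetic on the CPU; the names a, b, c, x and y are illustrative labels for r10.4, r10.5, r10.7 and the r4/r5 operands, and using double for the accumulator is only a stand-in for the extra precision of :NF, not a statement about its actual width.

```c
/* Scalar model of one MAD pair (hypothetical name):
 *   acc = c + x * a      (first MAD, accumulator destination)
 *   dst = acc + y * b    (second MAD, float destination)
 */
static float
linterp_model(float a, float b, float c, float x, float y)
{
   /* double stands in for the higher-precision accumulator (assumption). */
   double acc = (double)c + (double)x * (double)a;
   return (float)(acc + (double)y * (double)b);
}
```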

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
b5d8781e19 intel/compiler/fs: Return multiple_instructions_emitted from generate_linterp
If multiple instructions are emitted, special handling of things like
conditional mod and NoDDClr/NoDDChk needs to be performed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
b1afdf9fc1 intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pair
This isn't technically broken, but the next patch will make this
function report whether it generated multiple instructions, and that
information will be used to disable the application of conditional mod
by the generic code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
2cff324210 intel/compiler: Add Gen11+ native float type
This new type exposes the additional precision offered by the
accumulator register and will be used in the next patch to implement the
functionality of the PLN instruction using a pair of MAD instructions.

One weird thing to note: align1 ternary instructions may only have an
accumulator in the dst or src1 normally, but when src0's type is :NF
the accumulator is read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
58611ff913 intel/compiler: Add Gen11 register types
The hardware register types' encodings have changed on Gen11. Good thing
we have that superfluous looking brw_reg_type abstraction lying around!

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
bb428454a9 intel: Disable 64-bit extensions on platforms without 64-bit types
Gen11 does not support DF, Q, UQ types in hardware. As a result, we have
to disable some GL extensions until they can be reimplemented.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-02-28 11:15:47 -08:00
Anuj Phogat
5e42103f3b intel: Add icl pci id for INTEL_DEVID_OVERRIDE
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-02-28 11:15:47 -08:00
Matt Turner
35bfe20995 i965: Warn about preliminary support for Gen11
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:14:03 -08:00
Anuj Phogat
5ac804bd9a intel: Add a preliminary device for Ice Lake
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@intel.com>
2018-02-28 11:14:03 -08:00
Tapani Pälli
0c983b9094 anv: remove anv_gem_set_context_priority helper
anv_gem_set_context_param is to be used directly instead!

Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 19:50:54 +02:00
George Kyriazis
a01d5e3712 swr/rast: revert clip distance precision
Fixes piglit tests that broke with 8a64593bde

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:50 -06:00
George Kyriazis
7e813f6214 swr/rast: Faster frustum prim culling
Fix clipper validMask setting. We don't need to run frustum rejected
primitives through the clipper.  Perform frustum culling with only
frustum clip codes. Guardband clip codes cannot be used because they
overlap frustum codes.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:46 -06:00
George Kyriazis
1c73f42e6e swr/rast: Consolidate TRANSLATE_ADDRESS
Translate is now part of an overloaded LOAD call. This required changing
the code gen to skip the load functions so that they can be handled
manually and made virtual.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:41 -06:00
George Kyriazis
e2a4fd0761 swr/rast: Code generation cleanup
Generate more compact code from gen_llvm.hpp.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:37 -06:00
George Kyriazis
190ead3d79 swr/rast: Remove draw type from event definitions
- Have the draw type sent to DrawInfoEvent in handlers created in
  archrast.cpp.  The draw type no longer needs to be sent during the
  AR_API_EVENT() call in api.cpp.

- Remove draw type from event definitions in events_private.proto; it is
  no longer needed.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:32 -06:00
George Kyriazis
90e3e23f63 swr/rast: whitespace change
Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:28 -06:00
George Kyriazis
539de78633 swr/rast: Fix index buffer overfetch issue for non-indexed draws
Populate pLastIndex, even for the non-indexed case.  A zero pLastIndex
can cause the index offsets inside the fetcher to have nonsensical values
that can be either very large positive or very large negative numbers.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:19 -06:00
Roland Scheidegger
26103487b5 softpipe: don't iterate through PIPE_MAX_SHADER_SAMPLER_VIEWS
We were setting view to NULL if the iteration was larger than i.
But in fact if the view is NULL the code did nothing anyway...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-28 18:22:28 +01:00
Roland Scheidegger
b923f21eaa cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroy
There's no point, we know the highest non-null one.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-28 18:22:28 +01:00
Roland Scheidegger
89ae5def8c draw: don't needlessly iterate through all sampler view slots
We already stored the highest (potentially) used number.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-28 18:22:28 +01:00
Tapani Pälli
6d8ab53303 anv: implement VK_EXT_global_priority extension
v2: add ANV_CONTEXT_REALTIME_PRIORITY (Chris)
    use unreachable with unknown priority (Samuel)

v3: add stubs in gem_stubs.c (Emil)
    use priority defines from gen_defines.h

v4: cleanup, add anv_gem_set_context_param (Jason)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (v2)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v3)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 14:36:57 +02:00
Tapani Pälli
5960023cf4 i965: use context priority definitions from gen_defines.h
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-28 14:36:57 +02:00
Tapani Pälli
4449a1f80d intel: add new common header gen_defines.h
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-28 14:36:57 +02:00
Christian König
33633690aa winsys/amdgpu: request high addresses
We now have hopefully fixed all bugs regarding high addresses on Vega10 and
Raven. Start to use the high range to make room for SVM in the low
range.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 13:30:32 +01:00
Samuel Pitoiset
639c4f2b54 ac/shader: move scanning some info about input PS declarations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-28 10:14:26 +01:00
Samuel Iglesias Gonsálvez
e207b2e2c8 glsl/linker: fix bug when checking precision qualifier
According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders"
section, the precision qualifier should match for uniform variables.
This also applies to previous GLSL ES 3.x specs.

This 'if' checks the condition for uniform variables, while for UBOs
it is checked in link_interface_blocks.cpp.

Fixes: b50b82b8a5
("glsl/es31: precision qualifier doesn't need to match in shader interface block members")

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-28 07:04:13 +01:00
Samuel Iglesias Gonsálvez
c757c9dc03 anv: set maxResourceSize to the respective value for each generation
v2:
- Add the proper values to gen9+ (Jason)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 06:54:48 +01:00
Dave Airlie
a5853a3333 r600: partly revert disabling tiling for 1d texture.
Previously we had a check for 1d or narrow 2D textures; however,
narrow 2d textures caused gpu hangs, while it was correct for 1d
textures.

This fixes a bunch of 1D image piglits for me.

Fixes: 7b8e1c089d (r600/texture: drop lowering 1d/2d images to linear.)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 04:59:37 +00:00
Timothy Arceri
0c1f37cc2d nir: fix integer divide by zero crash during constant folding
From the GLSL 4.60 spec Section 5.9 (Expressions):

   "Dividing by zero does not cause an exception but does result in
    an unspecified value."
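
Since the result is unspecified, a constant folder is free to pick any value as long as the compiler itself does not trap. A minimal sketch of such a guard follows; the function name is hypothetical, returning 0 is just one permissible choice, and the INT32_MIN / -1 check is an extra precaution added for this example rather than something stated in this commit.

```c
#include <stdint.h>

/* Hypothetical folding helper for 32-bit signed division. */
static int32_t
fold_idiv32(int32_t a, int32_t b)
{
   if (b == 0)
      return 0;            /* unspecified by the spec; 0 chosen here */
   if (a == INT32_MIN && b == -1)
      return INT32_MIN;    /* extra guard (assumption): avoids overflow */
   return a / b;
}
```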

Fixes: 89285e4d47 "nir: add new constant folding infrastructure"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271
2018-02-28 15:55:39 +11:00
Ilia Mirkin
086c88551d st/mesa: ensure that images don't try to reference non-existent levels
Ideally the st_finalize_texture call would take care of that, but it
doesn't seem to with KHR-GL45.shader_image_size.advanced-nonMS-*. This
assertion makes sure that no such values are passed to the driver.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-27 22:38:33 -05:00
Dave Airlie
c7b25005a1 ac/radv: move load base vertex abi setup to vertex shader.
This was segfaulting:
dEQP-VK.memory.pipeline_barrier.host_write_index_buffer.1024

Fixes: 8de6f79707 (ac/radeonsi: add load_base_vertex() to the abi)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:58:12 +10:00
Dave Airlie
3401b028df ac/shader: fix vertex input with components.
This fixes:
dEQP-VK.glsl.440.linkage.varying.component.*

Fixes: 1c57a6da5e (ac/shader: scan vertex inputs usage mask)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:04:46 +10:00
Dave Airlie
6bafd4f4dd radv: remove device pointer from buffer.
This is never used.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:03:26 +10:00
Timothy Arceri
a050ea60ee nir: add lower_ldexp to nir compiler options
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
08fa84bb9a ac: implement nir_op_ldexp
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
9790921ff5 ac: fix nir_op_fdd{x,y} handling
radeonsi, i965 and anv all treat fdd{x,y} opcodes the same as
fdd{x,y}_coarse by default. The SPIR-V spec lets the implementation
decide how it should be handled and radv was previously going
for the higher quality option. Here we change the shared amd
code to match how nir_op_fdd{x,y} is expected to be handled
by the other NIR drivers.

Fixes piglit test:
./bin/arb_shader_texture_lod-texgrad -auto

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
8de6f79707 ac/radeonsi: add load_base_vertex() to the abi
Fixes the following piglit tests:

./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo
./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
7f91473414 radeonsi: create get_base_vertex() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
ae47af50d6 radeonsi/nir: disable vertex_id_zero_based lowering
The lowering is incompatible with how the radeonsi backend works.

Fixes piglit test:
./bin/arb_shader_draw_parameters-basevertex vertexid-zerobased -auto

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
5504bebfc4 ac: add support for handling nir_intrinsic_load_vertex_id
This will be used by radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
3a0b4187dd ac: fix f2b and i2b for doubles
Without this llvm was asserting in debug builds.

V2: use LLVMConstNull()

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-28 09:23:49 +11:00
Francisco Jerez
cb309d27c5 intel/ir: Fix invalid type aliasing with undefined behavior in test_eu_compact.
test_fuzz_compact_instruction() was attempting to modify the uint64_t
data array of a brw_inst through a pointer to uint32_t, which has
undefined behavior.  This was causing the test_eu_compact unit test to
fail mysteriously for me on GCC 7 with some additional
harmless-looking changes I had applied to my tree, which happened to
affect the order instructions are emitted by GCC causing the bit
twiddling to be done after the clear_pad_bits() call which is supposed
to overwrite the same data through a pointer of different type,
leading to data corruption.  A similar failure has been reported by
Vinson Lee on the master branch built with GCC 8.
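
The general way to avoid this class of bug is to modify the data through its declared type, or to use memcpy when another view is needed. The sketch below shows both on a toy structure; toy_inst and the function names are hypothetical stand-ins, and this is not necessarily how the test was changed.

```c
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an instruction stored as two uint64_t words. */
struct toy_inst {
   uint64_t data[2];
};

/* Flip one bit (0..127) by operating on the uint64_t words directly, so
 * no object is accessed through an incompatible pointer type. */
static void
toy_inst_flip_bit(struct toy_inst *inst, unsigned bit)
{
   inst->data[bit / 64] ^= UINT64_C(1) << (bit % 64);
}

/* If a 32-bit view is really needed, memcpy is the well-defined route. */
static uint32_t
toy_inst_read_dword(const struct toy_inst *inst, unsigned dword)
{
   uint32_t v;
   memcpy(&v, (const unsigned char *)inst->data + 4 * dword, sizeof(v));
   return v;
}
```

Because both functions access the storage in ways the C aliasing rules allow, the compiler cannot legally reorder them against other uint64_t accesses to the same words.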

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105052
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-27 11:42:39 -08:00
Francisco Jerez
69b4a9d21d util/bitset: Make C++ wrapper trivially constructible.
In order to fix a build failure on compilers not implementing
unrestricted unions, which is a C++11 feature.

v2: Provide signed integer comparison and assignment operators instead
    of BITSET_WORD ones to avoid spurious ambiguity warnings on
    comparisons with a signed integer literal.

Fixes: ba79a90fb5 "glsl: Switch ast_type_qualifier to a 128-bit bitset."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105238
Tested-by: Roland Scheidegger <sroland@vmware.com>
Tested-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-27 11:38:18 -08:00
Jordan Justen
9f223d860b intel/tools: Use gen_device_name_to_pci_device_id in aubinator
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
8ff89250ff intel/common: Add gen_device_name_to_pci_device_id
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
c2134f94c8 intel/vulkan: Support INTEL_DEVID_OVERRIDE environment variable
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
843f6d187a i965: Use gen_get_pci_device_id_override
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
e560bb9dc2 intel/common: Add gen_get_pci_device_id_override
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
6b274d5cc6 intel/vulkan: Support INTEL_NO_HW environment variable
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Harish Krupo
b9af043716 android: fix source files path for libmesa_anv_gen11
Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-27 14:16:08 +02:00
Eric Engestrom
248c593132 meson: avoid changing types for the dri3 option
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-27 11:21:20 +00:00
Eric Engestrom
76e8d61999 meson: simplify the gbm option code, and avoid changing types
v2: drop gallium comment (Dylan)

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-27 11:21:20 +00:00
Samuel Pitoiset
a549da877b ac/nir: clean up a hack about rounding 2nd coord component
It's basically just the opposite, and it only makes sense to
round the layer for 2D texture arrays.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-27 10:09:27 +01:00
Ilia Mirkin
e683a797c6 nvc0: collapse output slots to have adjacent registers
The hardware skips over unallocated slots, so we have to make sure those
registers are packed together.

Fixes KHR-GL45.enhanced_layouts.fragment_data_location_api

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-27 00:10:39 -05:00
Dave Airlie
250468f6b7 radv: expose async compute on SI
It looks like we had all the pieces in place for this,
just never tested it and turned it on.

I don't see any CTS regressions and the computeshader
demo runs.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-27 00:54:59 +00:00
Dave Airlie
1fc19a0f27 radv: merge tess rings into a single bo
Inspired by a passing commit to radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-27 00:54:59 +00:00
Emil Velikov
784d81e97e docs: update calendar, add news and link release notes to 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-27 00:32:14 +00:00
Emil Velikov
d9391014de docs: add sha256 checksums for 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b00880973e)
2018-02-27 00:29:44 +00:00
Emil Velikov
676c58fbdb docs: add release notes for 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b3e5a3f35b)
2018-02-27 00:29:43 +00:00
Dylan Baker
b9636fe38a meson: fix building without GL
libgl will be undefined without glx, so move that check inside the
`if with_glx != 'disabled'` block.

v2: - Simplify commit message (Eric, Emil)

Fixes: 5c460337fd ("meson: Fix GL and EGL pkg-config files with glvnd")
Reported-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
CC: Daniel Stone <daniels@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Untested-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-26 09:32:14 -08:00
Lionel Landwerlin
fca9f5b585 intel: aubinator_error_decode: fix segfault on missing register
Some registers might be missing in our genxmls. Don't try to decode
them.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-26 16:54:48 +00:00
Eric Engestrom
11d45304fd *-symbol-check: use correct nm path when cross-compiling
Inspired-by: a similar patch for libdrm by Heiko Becker
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-26 13:50:59 +00:00
Karol Herbst
ef308d4007 nvir/gm107: consider FILE_FLAGS dependencies in SchedDataCalculatorGM107
Currently, while inserting barriers, writes and reads to FILE_FLAGS aren't
considered. This can lead to WaR hazards in some situations.

With the previous commit, this fixes shaders with instructions like this:
  mad u32 $r2 $r4 $r11 $r2
  mad u32 { $r5 $c0 } $r4 $r10 $r6
  mad (SUBOP:1) u32 $r3 $r4 $r10 $r2 $c0

Affects OpenCL CTS tests on Maxwell+:
basic/test_basic intmath_long
basic/test_basic intmath_long2
basic/test_basic intmath_long4

v2: only put barriers on instructions which actually read flags

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-02-26 14:41:58 +01:00
Karol Herbst
2f07f823c9 nvir/gm107: iterate over all defs in SchedDataCalculatorGM107::findFirstUse
In the sched data calculator we have to track first use of defs by iterating
over all defs of an instruction, not just the first one.

v2: fix minGRP and maxGRP values

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-02-26 14:41:58 +01:00
Samuel Pitoiset
e05507a427 ac/nir: use ordered float comparisons except for not equal
Original patch from Timothy Arceri, I have just fixed the
not equal case locally.

This fixes one important rendering issue in Wolfenstein 2
(the cutscene transition issue).

RadeonSI uses the same ordered comparisons, so I guess that's
what we should do as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-02-26 13:59:04 +01:00
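The ordered/unordered distinction in the ac/nir commit above can be illustrated with plain C++ comparisons, which follow the same IEEE 754 rules: an ordered comparison is false whenever an operand is NaN, while "not equal" is the unordered exception and is true with a NaN operand. This is only a minimal sketch; the helper names are hypothetical and are not taken from the ac/nir code.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helpers, for illustration only (not Mesa functions).

// Ordered "less than": false if either operand is NaN.
inline bool ordered_less(float a, float b) { return a < b; }

// Ordered "equal": false if either operand is NaN.
inline bool ordered_equal(float a, float b) { return a == b; }

// "Not equal" is the unordered case: true if either operand is NaN.
inline bool unordered_not_equal(float a, float b) { return a != b; }
```

With a NaN input, both ordered helpers return false and only the not-equal helper returns true, which is why that one case is treated separately.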
Mauro Rossi
6451b0703f android: vulkan/util: add dependency on libnativewindow for O and later
Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building error:

In file included from out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util/vk_enum_to_str.c:26:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-02-26 14:50:24 +02:00
Mauro Rossi
d448954228 android: anv: add dependency on libnativewindow for O and later
Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building errors:

In file included from external/mesa/src/intel/vulkan/gen7_cmd_buffer.c:30:
In file included from external/mesa/src/intel/vulkan/anv_private.h:72:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal
error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.
...
In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
In file included from external/mesa/src/intel/vulkan/anv_private.h:72:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal
error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-02-26 14:49:06 +02:00
Mauro Rossi
9a508b719b android: anv/extensions: fix generated sources build
Building rules are aligned to automake ones

The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py
Generation rules for anv_extensions.c requires --out-c option
Generation rules for anv_extensions.h were missing
Necessary include paths are added to avoid following build errors:

cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c':
No such file or directory

In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30:
external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: dd088d4bec ("anv/extensions: Generate a header file with extension tables")
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-26 14:37:33 +02:00
Marek Olšák
8799eaed99 radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointers
The effect of the last 13 commits on user SGPR counts:

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:19 +01:00
Marek Olšák
3fa7a59d69 radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input
so that it can be removed and replaced with inline VBO descriptors,
and the pointer can be packed in unused bits of VBO descriptors.
This also removes the pointer from merged TES-GS where it's useless.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:08 +01:00
Marek Olšák
c78640ce31 radeonsi: set correct num_input_sgprs for VS prolog in merged shaders
We need to take num_input_sgprs from VS, not the second shader.
No apps suffered from this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:05 +01:00
Marek Olšák
f852b24ce0 radeonsi: allow fewer input SGPRs in 2nd shader of merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:03 +01:00
Marek Olšák
8d6e6b1d7c radeonsi: don't use struct si_descriptors for vertex buffer descriptors
VBO descriptor code will change a lot one day.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:00 +01:00
Daniel Stone
61d6ff3ba3 build: Move wayland-scanner check into platform
Also only check for wayland-scanner if building for the Wayland
platform.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:19 +00:00
Daniel Stone
d33cd875e8 build: Move wayland-protocols check into platform
In line with wayland-client and wayland-server, move the check for
wayland-protocols into the wayland platform branch.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:16 +00:00
Daniel Stone
d8f19d9aa0 vulkan/wsi/wayland: Move Wayland protocol from BUILT_SOURCES
autotools wants to have the BUILT_SOURCES ready as soon as it enters the
directory, even if they are not used. This meant the build failed if
wayland-protocols was not available on the system, even if it was not
enabled.

As BUILT_SOURCES cannot be used in a conditional (cf. 166852ee95), do
the same thing as EGL and manually encode the dependencies in the
Makefile.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:12 +00:00
Dave Airlie
0cc5be7741 r600: fix tgsi clock last setting
On cayman this was hitting an assert later, which probably wasn't
seen on non-cayman due to having the t slot.

Fixes: 9041730d1 (r600: add support for ARB_shader_clock.)
2018-02-26 11:05:45 +10:00
Dave Airlie
4d72a1efea r600: add time lo/hi debugging output.
This just adds these to the debug prints.
2018-02-26 11:05:26 +10:00
Timothy Arceri
22430224fe radeonsi/nir: enable lowering of fpow
Lowering fpow in NIR rather than LLVM can be beneficial.

Polaris results:

Totals from affected shaders:
SGPRS: 124928 -> 124896 (-0.03 %)
VGPRS: 68616 -> 68332 (-0.41 %)
Spilled SGPRs: 394 -> 413 (4.82 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3668912 -> 3658368 (-0.29 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 18575 -> 18593 (0.10 %)
Wait states: 0 -> 0 (0.00 %)

Fixes: d6b7539206 "ac/nir: remove emission of nir_op_fpow"

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
9873bd9dcd ac: make use of ac_get_llvm_num_components() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
1a757c9c97 gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info
Seems to have not been used since 16be87c904

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
9f7c940840 radeonsi/nir: fix loading of doubles for tess varyings
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
81f9d03807 radeonsi/nir: fix lds store in tcs outputs handling
We were ignoring the channel offset.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Gert Wollny
c7cadcbda4 r600: Take ALU_EXTENDED into account when evaluating jump offsets
ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU
clause within a IF-JUMP or ELSE branch is ALU_EXTENDED the target jump
offset needs to be adjusted accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-26 10:29:48 +10:00
Francisco Jerez
51562ea7a0 mesa: Expose EXT_shader_framebuffer_fetch(_non_coherent) on desktop and embedded GL.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
c6c64d4d6a glsl: Silence warnings when reading from a framebuffer fetch output.
Framebuffer fetch outputs are implicitly initialized upon entry to the
fragment shader.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
537bb1da98 glsl: Specify framebuffer fetch coherency mode in lower_blend_equation_advanced().
This requires passing an extra argument to the lowering pass because
the KHR_blend_equation_advanced specification doesn't seem to define
any mechanism for the implementation to determine at compile-time
whether coherent blending can ever be used (not even an "#extension
KHR_blend_equation_advanced_coherent" directive seems to be required
in the shader source AFAICT).

In the long run we'll probably want to do state-dependent recompiles
based on the value of ctx->Color.BlendCoherent, but right now there
would be no benefit from that because the only driver that supports
coherent framebuffer fetch is i965 on SKL+ hardware, which are unable
to support the non-coherent path for the moment because of texture
layout issues, so framebuffer fetch coherency is always enabled for
them.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
ef9e3f63ca glsl: Add support for the framebuffer fetch layout(noncoherent) qualifier.
This allows the application to request framebuffer fetch coherency
with per-fragment output granularity.  Coherent framebuffer fetch
outputs (which is the default if no qualifier is present for
compatibility with older versions of the EXT_shader_framebuffer_fetch
extension) will have ir_variable_data::memory_coherent set to true.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
0aeec504b4 glsl: Allow layout token for EXT_shader_framebuffer_fetch_non_coherent.
EXT_shader_framebuffer_fetch_non_coherent requires layout qualifiers
even on GL(ES) 2.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
1bc01db95f glsl: Initialize ir_variable_data::fb_fetch_output earlier for GL(ES) 2.
At the same point where it is initialized on GL(ES) 3.0+ so we can
implement some common layout qualifier handling in a future commit.
Until now the fb_fetch_output flag would be inherited from the
original implicit gl_LastFragData declaration at a later point in the
AST to GLSL IR translation.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
6ebefb0fd5 glsl: Replace MESA_shader_framebuffer_fetch extension flags with EXT ones.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
ba79a90fb5 glsl: Switch ast_type_qualifier to a 128-bit bitset.
This should end the drought of bits in the ast_type_qualifier object.
The bitset_t type works pretty much as a drop-in replacement for the
current uint64_t bitset.

The only catch is that the bitset_t type as defined in the previous
commit doesn't have a trivial constructor (because it has a
user-defined constructor), so it cannot be used as union member
without providing a user-defined constructor for the union (which
causes it in turn to be non-trivially constructible).  This annoyance
could be easily addressed in C++11 by declaring the default
constructor of bitset_t to be the implicitly defined one -- IMO one
more reason to drop support for GCC 4.2-4.3.

The other minor change was required because glsl_parser_extras.cpp was
hard-coding the type of bitset temporaries as uint64_t, which (unlike
would have been the case if the uint64_t had been replaced with
e.g. an __int128) would otherwise have caused a build failure, because
the boolean conversion operator of bitset_t is marked explicit (if
C++11 is available), so the bitset won't be silently truncated down to
1 bit in order to use it to initialize the uint64_t temporaries
(yikes).

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
bdbc2ffa42 util/bitset: Add C++ wrapper for static-size bitsets.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
8d1f1ce412 util: Add EXPLICIT_CONVERSION macro.
This can be used to specify that a C++ conversion operator is not
meant to be used for implicit conversions, which can lead to
unintended loss of information in some cases.  Implemented as a macro
in order to keep old GCC versions happy.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
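A rough sketch of how a macro like the EXPLICIT_CONVERSION one described above might work. The macro body shown here is an assumption (the log only says it is a macro that keeps old GCC versions happy), and `flags128` is a hypothetical stand-in for a wide bitset, not Mesa's bitset_t.

```cpp
#include <cassert>
#include <type_traits>

// Assumed definition: expand to `explicit` only where C++11 is available.
#if __cplusplus >= 201103L
#define EXPLICIT_CONVERSION explicit
#else
#define EXPLICIT_CONVERSION
#endif

// Hypothetical 128-bit flag set stored as two 64-bit words.
struct flags128 {
   unsigned long long lo;
   unsigned long long hi;

   // True if any bit is set. Marked explicit so the object cannot be
   // silently converted to bool and then widened to an integer.
   EXPLICIT_CONVERSION operator bool() const { return (lo | hi) != 0; }
};

// Testing the flags still works through an explicit cast.
inline bool any_set(const flags128 &f) { return static_cast<bool>(f); }
```

With the explicit operator, initializing a `uint64_t` temporary from such an object fails to compile instead of collapsing 128 bits down to a single 0 or 1.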
Francisco Jerez
378e918e28 mesa: Implement glFramebufferFetchBarrierEXT entry point.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
e4124f9bc1 glapi: Update XML for last revision of EXT_shader_framebuffer_fetch.
Desktop GL is now supported, and there is an additional entry-point
for EXT_shader_framebuffer_fetch_non_coherent.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
6a8ec78c2a mesa: Rename MESA_shader_framebuffer_fetch gl_extensions bits to EXT.
The changes I had originally planned for the MESA_shader_framebuffer_fetch
extension have been merged into the EXT spec, there's no point in keeping
MESA_shader_framebuffer_fetch extension enables.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
d0bef79f12 mesa: Rename dd_function_table::BlendBarrier to match latest EXT spec.
This GL entry point was renamed to glFramebufferFetchBarrier() in the
EXT extension on request from Khronos members.  Update the Mesa
codebase to match the latest spec.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
27c829da28 i965: Fix KHR_blend_equation_advanced with some render targets.
This reverts two bogus and seemingly useless changes from the commits
referenced below, which broke KHR_blend_equation_advanced (and
EXT_shader_framebuffer_fetch_non_coherent which wasn't exposed yet)
for any kind of render target surface that would cause the
get_isl_surf() call in brw_emit_surface_state() to do anything useful
(notice how the result of get_isl_surf() is completely ignored by the
caller right now), as was the case while using those extensions with
1D array or 3D framebuffers in particular.

Fixes: f5859b45b1 "i965/miptree: Switch remaining surfaces to isl"
Fixes: bf24c3539e "i965/miptree: Clean-up unused"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Marek Olšák
fb410ae392 radeonsi: remove si_descriptors parameter from emit_shader_pointer functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
63ea0a00a3 radeonsi: preload the tess offchip ring in TES
so that it's not done multiple times in branches

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
2d03c4cac8 radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs
TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address
aligned to 512KB. Hey, it's a 13-bit pointer!

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
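The arithmetic behind the "13-bit pointer" remark above: a 32-bit address aligned to 512KB has its low 19 bits equal to zero, so only the top 32 - 19 = 13 bits carry information. A minimal sketch of that packing follows; the function names are hypothetical, and where the field actually sits inside TCS_OUT_LAYOUT is not stated in the log and is not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// 512 KB == 1 << 19, so an aligned address has 19 zero low bits.
constexpr unsigned kAlignShift = 19;
// Mask for the 13 bits that remain of a 32-bit address.
constexpr uint32_t kFieldMask = (1u << 13) - 1;

// Drop the always-zero low bits, leaving a 13-bit value.
constexpr uint32_t pack_ring_addr(uint32_t addr)
{
   return addr >> kAlignShift;
}

// Rebuild the full 32-bit address from the 13-bit field.
constexpr uint32_t unpack_ring_addr(uint32_t field)
{
   return (field & kFieldMask) << kAlignShift;
}
```

Packing is lossless only for addresses that really are 512KB-aligned; any low bits would be discarded by the shift.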
Marek Olšák
190e064e63 radeonsi: move 2nd-shader descriptor pointers into s[0:1]
If 32-bit pointers are supported, both pointers can be moved into s[0:1]
and then ESGS has exactly the same user data SGPR declarations as VS.

If 32-bit pointers are not supported, only one pointer can be moved into
s[0:1]. In that case, the 2nd pointer is moved before TCS constants,
so that the location is the same in HS and GS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
1d1df76d2b radeonsi: change si_descriptors::shader_userdata_offset type to short
We will want to use SH registers outside of user data SGPRs, like the GFX9
special SGPRs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
fca7dee9c6 radeonsi: put both tessellation rings into 1 buffer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
d2963d8b5f radeonsi: move tessellation ring info into si_screen
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
41895c26d3 radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits
For a later patch.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Karol Herbst
f0b39779a0 nvir: don't optimize mad with subops to shladd
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-24 18:48:13 +01:00
James Legg
afd8fd0656 radv: Really use correct HTILE expanded words.
When transitioning to an htile compressed depth format, set the full
depth range, so later rasterization can pass HiZ. Previously, for depth
only formats, the depth range was set to 0 to 0. This caused unwanted
HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer
(VK_FORMAT_D32_SFLOAT was not affected somehow).

These values are derived from PAL [0], since I can't find the
specification describing the htile values.

[0] 5cba4ecbda/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp (L1500)

CC: Dave Airlie <airlied@redhat.com>
CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Fixes: 5158603182 "radv: Use correct HTILE expanded words."
2018-02-24 02:16:22 +01:00
Mauro Rossi
8eed942136 radv/extensions: fix c_vk_version for patch == None
Similar to cb0d1ba156 ("anv/extensions: Fix VkVersion::c_vk_version for patch == None")
fixes the following building errors:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48:
error: use of undeclared identifier 'None'; did you mean 'long'?
      return instance && VK_MAKE_VERSION(1, 0, None) <= core_version;
                                               ^~~~
                                               long
external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION'
    (((major) << 22) | ((minor) << 12) | (patch))
                                          ^
...
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

Fixes: e72ad05c1d ("radv: Return NULL for entrypoints when not supported.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-24 00:31:31 +01:00
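The build error in the radv/extensions commit above comes from a missing patch number being pasted into VK_MAKE_VERSION as the literal text `None`. The sketch below reproduces the packing from the macro quoted in that error output and shows one way to handle an absent patch level. Treating a missing patch as 0 is an assumption made for illustration, and the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Same bit layout as the VK_MAKE_VERSION macro quoted in the error output.
constexpr uint32_t make_version(uint32_t ver_major, uint32_t ver_minor,
                                uint32_t ver_patch)
{
   return (ver_major << 22) | (ver_minor << 12) | ver_patch;
}

// Hypothetical version record whose patch level may be absent.
struct VersionSketch {
   uint32_t ver_major;
   uint32_t ver_minor;
   bool has_patch;
   uint32_t ver_patch;
};

// Assumption: an absent patch level is emitted as 0, never as "None".
constexpr uint32_t c_vk_version(VersionSketch v)
{
   return make_version(v.ver_major, v.ver_minor,
                       v.has_patch ? v.ver_patch : 0u);
}
```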
Eric Anholt
b4b4ada761 broadcom/vc5: Fix layout of 3D textures.
Cube maps are entire miptrees repeated, while 3D textures store all of
the layers of each level next to each other.  Fixes tex3d and
tex-miplevel-selection GL2:texture() 3D.
2018-02-23 15:07:26 -08:00
Eric Anholt
97dc077303 broadcom/vc5: Ignore unused usage flags in is_format_supported.
Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to
match.  Just drop the whole retval == usage thing and return early when we
hit a known unsupported case.

Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
2018-02-23 15:07:18 -08:00
Eric Anholt
880573e737 gbm: Fix the alpha masks in the GBM format table.
Once GBM started looking at the values of the alpha masks, ARGB/ABGR
wouldn't match any more because we had both A and R in the low bits.

Fixes: 2ed344645d ("gbm/dri: Add RGBA masks to GBM format table")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-23 15:03:36 -08:00
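The kind of mistake fixed in the GBM format table commit above (two channels sharing the same bits) can be caught with a simple overlap check. This is a hedged sketch only: `RgbaMasks` and `masks_disjoint` are hypothetical names, and the ARGB8888 masks assume the usual layout with alpha in bits 31:24 and blue in bits 7:0; it does not reproduce the actual GBM table.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-format channel masks.
struct RgbaMasks {
   uint32_t red;
   uint32_t green;
   uint32_t blue;
   uint32_t alpha;
};

// True if no two channels claim the same bit.
constexpr bool masks_disjoint(RgbaMasks m)
{
   return (m.red & m.green) == 0 && (m.red & m.blue) == 0 &&
          (m.red & m.alpha) == 0 && (m.green & m.blue) == 0 &&
          (m.green & m.alpha) == 0 && (m.blue & m.alpha) == 0;
}

// Assumed ARGB8888 layout: A in 31:24, R in 23:16, G in 15:8, B in 7:0.
constexpr RgbaMasks kArgb8888 = {0x00ff0000u, 0x0000ff00u, 0x000000ffu,
                                 0xff000000u};
```

An entry with both alpha and another channel in the low byte fails the check, which is the situation the commit message describes.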
Mathias Fröhlich
b54bf0e3e3 mesa: Update vertex processing mode on _mesa_UseProgram.
The change is a bug fix for 92d76a169:
  mesa: Provide an alternative to get_vp_mode()
that actually got exposed through 4562a7b0:
  vbo: Make use of _DrawVAO from the dlist code.

Fixes: KHR-GLES31.core.shader_image_load_store.advanced-sso-simple
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105229
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 21:08:35 +01:00
Marek Olšák
d169438d8e mesa: rename has_core_gs -> has_gs in get_programiv
This is also true for GLES.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:23 +01:00
Marek Olšák
1881f41b6c mesa: replace some API_OPENGL_CORE checks with _mesa_is_desktop_gl
This is more accurate with respect to the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:22 +01:00
Marek Olšák
1defc973db mesa: add some of the missing compatibility support for ARB_bindless_texture
The extension is exposed in the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:20 +01:00
Marek Olšák
b8e2e9e1a1 mesa: expose ARB_enhanced_layouts in the compatibility profile
GLSL 1.40 is required.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:19 +01:00
Marek Olšák
a0c8b49284 mesa: enable OpenGL 3.1 with ARB_compatibility
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:17 +01:00
Marek Olšák
605a7f6db5 mesa: implement ARB_compatibility
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:15 +01:00
Emil Velikov
14a2c87c41 swr: remove dead LLVM code paths
LLVM requirement was bumped to 4.0.0 with earlier commit.
Hence any code tailored for older versions is now unreachable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-02-23 19:17:31 +00:00
Eric Anholt
5980a41c0f broadcom/vc4: Remove the retval==usage check in is_format_supported().
This got us into trouble recently, so just remove it entirely.
2018-02-23 08:42:13 -08:00
Eric Anholt
bc3d16e633 broadcom/vc4: Add support for YUV textures using unaccelerated blits.
Previously we would fail an assertion about having no hardware format.  This
is enough to get kmscube -M nv12-2img working.
2018-02-23 08:42:13 -08:00
Eric Anholt
c824a045ea broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows.
When we set up the shadow resource we were copying the original resource
as the template, including its prsc->next field.  When we shadowed the
first YUV plane's resource for linear-to-tiled conversion, we would end up
unbalancing the refcount on the shadow resource's destruction.
2018-02-23 08:42:13 -08:00
Eric Anholt
6deb158ec1 broadcom/vc4: Add pipe_reference debugging for vc4_bos.
Trying to track down the YUV EGLImage use-after-free, it helps to see what
the mystery objects are that are being refcounted.
2018-02-23 08:42:13 -08:00
Eric Anholt
34ea1aca92 broadcom/vc4: Remove dead vc4_bo_set_reference().
It would be broken if NULL was passed to it anyway, since it wouldn't
participate in screen->bo_handles management.
2018-02-23 08:42:13 -08:00
Eric Anholt
a49738290c broadcom/vc4: Use pipe_resource_reference in sampler views.
Improves u_debug_refcount output.
2018-02-23 08:42:13 -08:00
Eric Anholt
0c1dd9dee0 broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride.
This is part of supporting YUV textures -- MMAL will be handing us a
single GEM BO with the planes at offsets within it, and MMAL-decided
stride.
2018-02-23 08:42:13 -08:00
Eric Anholt
978b884afc broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported().
We were failing the retval == usage check at the end.

Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
2018-02-23 08:42:13 -08:00
Lucas Stach
8df11f3fad etnaviv: fix in-place resolve tile count
TS tiles map to a fixed amount of bytes in the color/depth surface,
so the blocksize of the format needs to be taken into account when
calculating the number of tiles to fill.

The simplest fix is to just use the layer stride, which is the surface
size in bytes.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 15:34:39 +01:00
Lucas Stach
add23b59c9 etnaviv: switch magic single buffer state to "3"
Some of the 16bit formats misrender with missing tiles with the current
"2" state. As all the previously working formats also work with the "3"
state, just always use that one.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 15:34:39 +01:00
Lucas Stach
8befc11186 etnaviv: add debug switch to disable single buffer feature
This feature has caused some trouble already. Add a debug switch to
allow users to quickly check if a specific issue is caused by this
feature.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-02-23 15:34:31 +01:00
Dylan Baker
5c460337fd meson: Fix GL and EGL pkg-config files with glvnd
Currently meson will generate a pkg-config that links to EGL_mesa (or
GLX_mesa), but this isn't correct, it should always link to EGL or GL.
Probably the "right" solution is to have glvnd itself provide the pkg
config files for GL and EGL, but that also means that glvnd needs to
provide many of the header files, which makes it a more involved job.

Fixes: a47c525f32 ("meson: build glx")
Fixes: 035ec7a2bb ("meson: Add support for EGL glvnd")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-23 13:30:28 +00:00
Frank Binns
6160bf97db egl/dri2: fix segfault when display initialisation fails
dri2_display_destroy() is called when platform specific display
initialisation fails. However, this would typically lead to a
segfault due to the dri2_egl_display vtbl not having been set up.

Fixes: 2db9548296 ("loader_dri3/glx/egl: Optionally use a blit
context for blitting operations")
Signed-off-by: Frank Binns <francisbinns@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-23 11:13:22 +00:00
Juan A. Suarez Romero
e1623b303c mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
RGB9_E5 should be accepted by RenderbufferStorage if the
EXT_texture_shared_exponent is exposed. It is left to the
implementations to return GL_FRAMEBUFFER_UNSUPPORTED_EXT
when checking the framebuffer completeness if they do not
support rendering in this format.

Discussed in:
https://github.com/KhronosGroup/OpenGL-API/issues/32

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5

v2: Added more info to the commit message (Antia)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
2018-02-23 10:12:06 +01:00
Christian Gmeiner
e72062b66d etnaviv: npot_tex_any_wrap needs one bit only
Reduces size of struct etna_specs from 100 to 94 bytes.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 09:38:16 +01:00
Mathias Fröhlich
4562a7b0e8 vbo: Make use of _DrawVAO from the dlist code.
Finally use an internal VAO to execute display list draws. Avoid
duplicate state validation for display list draws. Remove client arrays
previously used exclusively for display lists.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:14 +01:00
Mathias Fröhlich
2f35140846 mesa: Use atomics for shared VAO reference counts.
VAOs will be used in the next change as immutable objects across multiple
contexts. Only reference counting may write concurrently on the VAO. So,
make the reference count thread safe for those and only those VAO objects.

v3: Use bool/true/false for gl_vertex_array_object::SharedAndImmutable.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:11 +01:00
Mathias Fröhlich
8a3a4b6fae vbo: Make use of _DrawVAO from immediate mode draw
Finally use an internal VAO to execute immediate mode draws. Avoid
duplicate state validation for immediate mode draws. Remove client arrays
previously used exclusively for immediate mode draws.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:07 +01:00
Mathias Fröhlich
c757e416ce vbo: Implement tool functions for vbo specific VAO setup.
Correct VBO_MATERIAL_SHIFT value.
The functions will be used next in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:04 +01:00
Mathias Fröhlich
ef8028017d mesa: Add flush_vertices to _mesa_bind_vertex_buffer.
We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:01 +01:00
Mathias Fröhlich
354b76ad20 mesa: Make _mesa_vertex_attrib_binding public.
Change vertex_attrib_binding() to _mesa_vertex_attrib_binding(), add a
flush_vertices argument, and make it publicly available.
The function will be needed later in the series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:58 +01:00
Mathias Fröhlich
4331969ac4 mesa: Add flush_vertices to _mesa_{enable,disable}_vertex_array_attrib.
We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:55 +01:00
Mathias Fröhlich
195bb990ed vbo: Use _DrawVAO for array type draw commands.
Switch over to use the _DrawVAO for all the array type draws.
The _DrawVAO needs to be set before we enter _mesa_update_state, so move
setting the draw method in front of the first call to _mesa_update_state
which is in turn called from the *validate*Draw* calls. Using the
gl_vertex_array_object::_Enabled bitmask, gl_vertex_program_state::_VPMode
and gl_vertex_array_object::_AttributeMapMode we can already set
varying_vp_inputs before we call _mesa_update_state the first time.
Thus remove duplicate state validation.

v2: Update comments.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:50 +01:00
Mathias Fröhlich
6002ab564b vbo: Implement method to track the inputs array.
Provided the _DrawVAO and the derived state that is maintained if we have
the _DrawVAO set, implement a method to incrementally update the array of
gl_vertex_array input pointers.

v2: Add some more comments.
    Rename _vbo_array_init to _vbo_init_inputs.
    Rename vbo_context::arrays to vbo_context::draw_arrays.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:46 +01:00
Mathias Fröhlich
08c7474189 mesa: Introduce a yet unused _DrawVAO.
During the patch series this VAO gets populated with either the currently
bound VAO or an internal VAO that will be used for immediate mode and
dlist rendering.

v2: More comments about the _DrawVAO, filter and enabled mask.
    Rename _DrawVAOEnabled to _DrawVAOEnabledAttribs.
v3: Fix and move comment.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:43 +01:00
Mathias Fröhlich
ce3d2421a0 vbo: Remove get_vp_mode() and enum vp_mode.
Is now unused.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:40 +01:00
Mathias Fröhlich
60c3ca1b23 vbo: Use _VPMode instead of get_vp_mode().
At those places where we used get_vp_mode() use
gl_vertex_program_state::_VPMode instead.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:36 +01:00
Mathias Fröhlich
92d76a1691 mesa: Provide an alternative to get_vp_mode()
To get information equivalent to get_vp_mode(), track the vertex
processing mode in a per-context variable at
gl_vertex_program_state::_VPMode.
This aims to replace get_vp_mode() as seen in the vbo module.
Unlike the get_vp_mode() implementation, which only gives correct
answers after _mesa_update_state() has been called, this context variable
is updated immediately when the vertex processing state is modified. The
correctness of this value is asserted on state validation.

With this in place we should be able to untangle the dependency with
varying_vp_inputs and state invalidation.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:30 +01:00
Ilia Mirkin
d73f1f2ad8 nv50,nvc0: fix integer MS resolves using 2d engine
We don't want filtering for integer textures, same as depth/stencil.

Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Ilia Mirkin
33ce3569c5 nvc0: fix writing query results into buffer
We need to mark the range as valid, and validate the resource using a
helper to ensure that the buffer status is marked properly.

Fixes some CTS pipeline stats query tests, and
KHR-GL45.direct_state_access.queries_functional

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Ilia Mirkin
f6e4f95668 nv50,nvc0: fix clear buffer acceleration
Two things were off:
 - valid range was not updated, which could affect waiting for future
   maps
 - fencing was done manually instead of using the *_resource_validate
   helper, which resulted in a missed dirty buffer flag being set

Fixes: KHR-GL45.direct_state_access.buffers_clear
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Lionel Landwerlin
bd9672695b i965: perf: ensure reading config IDs from sysfs isn't interrupted
Fixes: 458468c136 "i965: Expose OA counters via INTEL_performance_query"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-23 01:44:07 +00:00
Bas Nieuwenhuizen
032870beda radv: Fix autotools build.
Somewhere along the way the Makefile changes got lost ...

Fixes: 4db78f3a6b "radv: Put supported extensions in a struct."
Acked-by: Dave Airlie <airlied@redhat.com>
2018-02-23 01:54:12 +01:00
Bas Nieuwenhuizen
e72ad05c1d radv: Return NULL for entrypoints when not supported.
This implements strict checking for the entrypoint ProcAddr
functions.

 - InstanceProcAddr with instance = NULL, only returns the 3 allowed
   entrypoints.
 - DeviceProcAddr does not return any instance entrypoints.
 - InstanceProcAddr does not return non-supported or disabled
   instance entrypoints.
 - DeviceProcAddr does not return non-supported or disabled device
   entrypoints.
 - InstanceProcAddr still returns non-supported device entrypoints.
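
The rules above can be modelled roughly as follows. This is only a
sketch and not radv code: the table contents, the "enabled" flags and
the function names are assumptions made for illustration, and the
three entrypoints allowed with a NULL instance are taken from the
Vulkan spec of that time.

```python
# Hypothetical table: name -> (level, supported and enabled).
ENTRYPOINTS = {
    "vkCreateInstance": ("instance", True),
    "vkEnumerateInstanceExtensionProperties": ("instance", True),
    "vkEnumerateInstanceLayerProperties": ("instance", True),
    "vkEnumeratePhysicalDevices": ("instance", True),
    "vkCreateDebugReportCallbackEXT": ("instance", False),
    "vkCmdDraw": ("device", True),
    "vkCmdDrawIndirectCountAMD": ("device", False),
}

# The only entrypoints that may be queried with instance = NULL.
GLOBAL_ENTRYPOINTS = {
    "vkCreateInstance",
    "vkEnumerateInstanceExtensionProperties",
    "vkEnumerateInstanceLayerProperties",
}


def instance_proc_addr(instance, name):
    """Return the name as a stand-in for a function pointer, or None."""
    if name not in ENTRYPOINTS:
        return None
    if instance is None:
        return name if name in GLOBAL_ENTRYPOINTS else None
    level, enabled = ENTRYPOINTS[name]
    # Unsupported or disabled instance entrypoints are hidden ...
    if level == "instance" and not enabled:
        return None
    # ... but device entrypoints are still returned, enabled or not.
    return name


def device_proc_addr(name):
    """Return only supported and enabled device-level entrypoints."""
    level, enabled = ENTRYPOINTS.get(name, (None, False))
    if level != "device" or not enabled:
        return None
    return name
```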

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
414f5e0e14 radv: Reword radv_entrypoints_gen.py
With a big inspiration from anv as always ...

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
076f7cfc6b radv: Track enabled extensions.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
4db78f3a6b radv: Put supported extensions in a struct.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Jose Fonseca
1f5618e81c appveyor: Build with MSVC 2015.
The MSVC version we (at VMware) primarily care about from now on is
2015.

See https://ci.appveyor.com/project/jrfonseca/mesa/build/46

We can drop support for building with 2013 in a future commit.  I'm not
aware of significant changes in C99/C11 support from MSVC 2013 to 2015,
but there's no point in continuing supporting old MSVC versions when
nobody cares.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-22 21:10:20 +00:00
Samuel Pitoiset
d6b7539206 ac/nir: remove emission of nir_op_fpow
fpow is now lowered at NIR level.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:44:46 +01:00
Samuel Pitoiset
7aa008d1d7 radv: enable lowering of fpow to fexp2 and flog2
There is no fpow in hardware, so it's always lowered somewhere,
but it appears that lowering at NIR level is better. Figured out while
comparing compute shaders between RadeonSI and RADV.
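
For reference, the identity behind the lowering is
pow(a, b) = exp2(b * log2(a)) for a > 0. A small numerical sketch
(the function name is made up here, this is not the NIR pass itself):

```python
import math


def lowered_fpow(a, b):
    """Stand-in for the lowered form fexp2(fmul(flog2(a), b))."""
    return 2.0 ** (math.log2(a) * b)


# Only meaningful for a > 0, where log2 is defined.
for a, b in [(2.0, 10.0), (3.0, 0.5), (10.0, -2.0)]:
    assert math.isclose(lowered_fpow(a, b), math.pow(a, b), rel_tol=1e-12)
```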

Polaris10:
Totals from affected shaders:
SGPRS: 18936 -> 18904 (-0.17 %)
VGPRS: 12240 -> 12220 (-0.16 %)
Spilled SGPRs: 2809 -> 2809 (0.00 %)
Code Size: 718116 -> 719848 (0.24 %) bytes
Max Waves: 1409 -> 1410 (0.07 %)

Vega10:
Totals from affected shaders:
SGPRS: 18392 -> 18392 (0.00 %)
VGPRS: 12008 -> 11920 (-0.73 %)
Spilled SGPRs: 3001 -> 2981 (-0.67 %)
Code Size: 777444 -> 778788 (0.17 %) bytes
Max Waves: 1503 -> 1504 (0.07 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:47 +01:00
Samuel Pitoiset
63fb30c674 nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a)
Similar for the 4 case.
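
A quick numerical check of the identity, as a sketch only. The helper
names are hypothetical, and the form shown for the 4 case is an
assumption about how it could be written with two levels of
multiplies:

```python
import math


def before(a, n):
    # Original pattern: fexp2(fmul(flog2(a), n))
    return 2.0 ** (math.log2(a) * n)


def after_2(a):
    # Replacement for n == 2: fmul(a, a)
    return a * a


def after_4(a):
    # Assumed replacement for n == 4: square the squared value
    aa = a * a
    return aa * aa


for a in (0.5, 1.5, 3.0, 7.25):
    assert math.isclose(before(a, 2.0), after_2(a), rel_tol=1e-12)
    assert math.isclose(before(a, 4.0), after_4(a), rel_tol=1e-12)
```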

Suggested by Bas.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:45 +01:00
Samuel Pitoiset
b18997876f nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b))
Otherwise the code size increases because the original fexp2()
instructions can't be deleted.
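
To illustrate with a toy model (not NIR; expressions are plain tuples
and count_ops is a made-up helper): when both fexp2() results have
other users, the rewrite adds an instruction instead of removing one.

```python
def count_ops(roots):
    """Count distinct operation nodes reachable from the live roots."""
    seen = set()

    def walk(expr):
        if isinstance(expr, tuple) and expr not in seen:
            seen.add(expr)
            for src in expr[1:]:
                walk(src)

    for root in roots:
        walk(root)
    return len(seen)


exp_a = ("fexp2", "a")
exp_b = ("fexp2", "b")
mul = ("fmul", exp_a, exp_b)
fused = ("fexp2", ("fadd", "a", "b"))

# fexp2 results used only by the fmul: 3 ops become 2.
assert count_ops([mul]) == 3
assert count_ops([fused]) == 2

# Both fexp2 results also used elsewhere: 3 ops become 4.
assert count_ops([mul, exp_a, exp_b]) == 3
assert count_ops([fused, exp_a, exp_b]) == 4
```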

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:43 +01:00
Samuel Pitoiset
a01e9996b5 ac/nir: set GLC=1 for load/store of coherent/volatile images
This disables persistence across wavefronts.

F1 2017 and Wolfenstein 2 appear to use some coherent images
but this patch doesn't seem to change anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:39:55 +01:00
Samuel Pitoiset
3c40be126f spirv: apply memory qualifiers to images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:39:53 +01:00
Chuck Atkins
540e49e105 glx: Properly handle cases where screen creation fails
This fixes a segfault exposed by a29d63ecf7 which occurs when swr is
used on an unsupported architecture.

v2: re-work to place logic in xmesa_init_display

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-22 10:20:32 -05:00
Iago Toral Quiroga
7668b594e6 anv/blorp: multisample resolve all attachment layers
We were only resolving the first.

v2:
  - Do not require that the number of layers on dst and src are an
    exact match, it is okay if the dst has more layers so long as
    it has at least as many as we are going to resolve.
  - Do not always resolve array_len layers, we should resolve
    only from base_array_layer to array_len.

v3:
  - v2 was assuming that array_len represented the total number of
    layers in the image, but it represents the number of layers
    starting at the base array layer.

v4:
 - The number of layers to resolve should be taken from the
   framebuffer (Nanley).

Fixes new CTS tests for multisampled layered rendering:
dEQP-VK.renderpass.multisample_resolve.layers_*

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-22 08:23:39 +01:00
Jason Ekstrand
2dce4ac6ac intel/isl: Improve the documentation on get_default_aux_state
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
24952160fd i965: Use finish_external instead of make_shareable in setTexBuffer2
The setTexBuffer2 hook from GLX is used to implement glXBindTexImageEXT
which has tighter restrictions than just "it's shared".  In particular,
it says that any rendering to the image while it is bound causes the
contents to become undefined.

The GLX_EXT_texture_from_pixmap extension provides us with an acquire
and release in the form of glXBindTexImageEXT and glXReleaseTexImageEXT.
The extension spec says,

    "Rendering to the drawable while it is bound to a texture will leave
    the contents of the texture in an undefined state.  However, no
    synchronization between rendering and texturing is done by GLX.  It
    is the application's responsibility to implement any synchronization
    required."

From the EGL 1.4 spec for eglBindTexImage:

    "After eglBindTexImage is called, the specified surface is no longer
    available for reading or writing.  Any read operation, such as
    glReadPixels or eglCopyBuffers, which reads values from any of the
    surface’s color buffers or ancillary buffers will produce
    indeterminate results.  In addition, draw operations that are done
    to the surface before its color buffer is released from the texture
    produce indeterminate results."

In other words, between the bind and release calls, we effectively own
those pixels and can assume, so long as we don't crash, that no one else
is reading from/writing to the surface.  The GLX and EGL implementations
call the setTexBuffer2 and releaseTexBuffer function pointers that the
driver can hook.

In theory, this means that, between BindTexImage and ReleaseTexImage, we
own the pixels and it should be safe to track aux usage so we
can avoid redundant resolves so long as we start off with the right
assumption at the start of the bind/release pair.

In practice, however, X11 has slightly different expectations.  It's
expected that the server may be drawing to the image at the same time as
the compositor is texturing from it.  In that case, the worst expected
outcome should be tearing or partial rendering and not random corruption
like we see when rendering races with scanout with CCS.  Fortunately,
the GEM rules about texture/render dependencies save us here.  If X11
submits work to write to a pixmap after the compositor has submitted
work to texture from it, GEM inserts a dependency between the compositor
and X11.  If X11 is using a high-priority context, this will cause the
compositor to get a temporarily boosted priority while the batch from
X11 is waiting on it.  This means that we will never have an actual race
between X11 and the compositor so no corruption can happen.

Unfortunately, however, this means that X11 will likely be rendering to it
between the compositor's BindTexImage and ReleaseTexImage calls.  If we
want to avoid strange issues, we need to be a bit careful about
resolves because we can't really transition it away from the "default"
aux usage.  The only case where this would practically be a problem is
with image_load_store where we have to do a full resolve in order to use
the image via the data port.  Even there it would only be a problem if
batches were split such that X11's rendering happens between the resolve
and the use of it as a storage image.  However, the chances of this
happening are very slim so we just emit a warning and hope for the best.

This commit adds a new helper intel_miptree_finish_external which resets
all aux state to whatever ISL says is the right worst-case "default" for
the given modifier.  It feels a little awkward to call it "finish"
because it's actually an acquire from the perspective of the driver, but
it matches the semantics of the other prepare/finish functions.  This
new helper gets called in intelSetTexBuffer2 instead of make_shareable.
We also add an intelReleaseTexBuffer (we passed NULL to releaseTexBuffer
before) and call intel_miptree_prepare_external in it.  This probably
does nothing most of the time but it means that the prepare/finish calls
are properly matched.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
00926a2730 i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2
The old code made a new miptree that referenced the same BO as the
renderbuffer and just trusted in the memory aliasing to work.  There are
only two ways in which the new miptree is liable to differ from the one
in the renderbuffer and neither of them matter:

 1) It may have a different target.  The only targets that we can ever
    see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
    and the difference between the two doesn't matter as far as the
    miptree is concerned; genX(update_sampler_state) only looks at the
    gl_texture_object and not the miptree when determining whether or
    not to use normalized coordinates.

 2) It may have a very slightly different format.  Again, this doesn't
    matter because we've supported texture views for quite some time so
    we always look at the gl_texture_object format instead of the
    miptree format for hardware setup anyway.

On the other hand, because we were recreating the miptree, we were using
intel_miptree_create_for_bo which doesn't understand modifiers.  We
really want this function to work without doing a resolve so long as you
have modifiers so we need to fix that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
41d45eb21e i965/tex_image: Pull the tex format from the renderbuffer in intelSetTexBuffer2
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
344b57b10b i965/miptree: Loosen the format check in miptree_match_image
This function is used to determine when we need to re-allocate a
miptree.  Since we do nothing different in miptree allocation for
sRGB vs. linear, loosening this should be safe and may lead to less
copying and reallocating in some odd cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
5b1b710e6f i965/state: Ignore intel_obj->_Format for depth/stencil and ETC2
We're about to start letting the intel_obj->_Format be the "real"
texture format.  For depth/stencil textures, this may be a combined
depth stencil format.  For ETC2 on gen7 and earlier, this will be the
actual ETC2 format.  This makes a bit more GL sense but means we have to
be careful in state upload.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Kenneth Graunke
183ce5e629 glsl: Parse 'layout' as a token with advanced blending or bindless
Both KHR_blend_equation_advanced and ARB_bindless_texture provide
layout qualifiers, and are exposed in compatibility contexts.  We
need to parse the layout qualifier as a token in order for those
to work, but forgot to extend this check.

ARB_shader_image_load_store would need a similar treatment, but we
don't expose that in legacy OpenGL contexts.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-02-21 17:50:57 -08:00
Daniel Stone
c7e22483fe vulkan/wsi/x11: Consistently update and return swapchain status
Use a helper function for updating the swapchain status. This will be
used later to handle VK_SUBOPTIMAL_KHR, where we need to make a
non-error status stick to the swapchain until recreation.  Instead of
direct comparisons to VK_SUCCESS to check for error, test for negative
numbers meaning an error status, and positive numbers indicating
non-error statuses.
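
A simplified model of the sign convention and the sticky error
status, as a sketch only. The class and function below are
hypothetical and do not reproduce the actual helper; the VkResult
values are the ones from the Vulkan headers.

```python
VK_SUCCESS = 0
VK_TIMEOUT = 2
VK_SUBOPTIMAL_KHR = 1000001003
VK_ERROR_OUT_OF_DATE_KHR = -1000001004


class Swapchain:
    """Minimal stand-in holding only the sticky status."""

    def __init__(self):
        self.status = VK_SUCCESS


def swapchain_result(chain, result):
    """Record errors on the chain and return the status to report."""
    # A previously recorded error wins over anything new.
    if chain.status < 0:
        return chain.status
    # Negative values are errors: make them stick to the chain.
    if result < 0:
        chain.status = result
    # Zero and positive values are non-error statuses: pass through.
    return result


chain = Swapchain()
assert swapchain_result(chain, VK_TIMEOUT) == VK_TIMEOUT
assert swapchain_result(chain, VK_ERROR_OUT_OF_DATE_KHR) < 0
# Once an error is recorded, later successes no longer hide it.
assert swapchain_result(chain, VK_SUCCESS) == VK_ERROR_OUT_OF_DATE_KHR
```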

v2 (Jason Ekstrand):
 - Use a pattern of "return x11_swapchain_result(chain, VK_WHATEVER)"
 - Handle wsi_queue_pull returning VK_TIMEOUT
 - Call x11_swapchain_result in x11_present_to_x11

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
6937c61324 vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails
This most likely means we lost our connection to the X server so
OUT_OF_DATE is reasonable.  This was also the one case where we pushed a
UINT32_MAX into the queue without setting an error condition.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Daniel Stone
bfa22266cd vulkan/wsi/wayland: Add support for zwp_dmabuf
zwp_linux_dmabuf_v1 lets us use multi-planar images and buffer
modifiers.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
c757fd2852 anv/image: Add support for modifiers for WSI
This adds support for the modifiers portion of the WSI "extension".

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
adca1e4a92 anv/image: Separate modifiers from legacy scanout
For a bit there, we had a bug in i965 where it ignored the tiling of the
modifier and used the one from the BO instead.  At one point, we thought
this was best fixed by setting a tiling from Vulkan.  However, we've
decided that i965 was just doing the wrong thing and have fixed it as of
5048572352.

The old assumptions also affected the solution we used for legacy
scanout in Vulkan.  Instead of treating it specially, we just treated it
like a modifier like we do in GL.  This commit goes back to making it
its own thing so that it's clear in the driver when we're using
modifiers and when we're using legacy paths.

v2 (Jason Ekstrand):
 - Rename legacy_scanout to needs_set_tiling

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
f5433e4d6c vulkan/wsi: Add modifiers support to wsi_create_native_image
This involves extending our fake extension a bit to allow for additional
querying and passing of modifier information.  The added bits are
intended to look a lot like the draft of VK_EXT_image_drm_format_modifier.
Once the extension gets finalized, we'll simply transition all of the
structs used in wsi_common to the real extension structs.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Daniel Stone
55b27e1e5f vulkan/wsi: Add drm_modifier member to wsi_image
Not yet used anywhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Daniel Stone
61c3feb38d vulkan/wsi: Add multiple planes to wsi_image
Not currently used.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Timothy Arceri
cdeac00267 nir: remove old assert
This was originally intended to make sure the remap location
was not -1. However, the code has changed a lot since then:
the location is now never set to -1 and we also handle
components, meaning this old assert has been doing comparisons
with the pointer to the array of component data.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
2018-02-22 09:31:00 +11:00
Timothy Arceri
86098696fc radeonsi/nir: collect more accurate output_usagemask
Fixes assert in the glsl-1.50-gs-max-output-components piglit test.

Note that the double handling will only work for doubles that
don't take up multiple slots i.e. double and dvec2. However
dual slot double handling is an existing bug which is made no
worse by this patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
79dc94828a radeonsi/nir: disable GLSL IR loop unrolling
Delaying unrolling and allowing NIR to do it instead has been shown
to result in better code in drivers such as i965. shader-db results
appear to show the same is true for radeonsi.

The other advantage is that using NIR unrolling improves compile
times significantly.

Totals from affected shaders:
SGPRS: 9624 -> 10016 (4.07 %)
VGPRS: 6800 -> 6464 (-4.94 %)
Spilled SGPRs: 0 -> 2 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 359176 -> 332264 (-7.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1355 -> 1432 (5.68 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
e6269ffc2e radeonsi/nir: fix tess varying loads for doubles
Fixes the following piglit tests:

tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test
tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
6d338d757f ac/radeonsi: pass type to load_tess_varyings()
We need this to be able to load 64bit varyings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Daniel Stone
eef890b7b1 x11/dri3: Store raw present completion mode
The DRI3 drawable info struct currently stores a boolean for whether the
last completed operation was a flip or not. As we need to track the full
completion mode for handling suboptimal returns, change the 'flipping'
field to the raw present completion mode from the server.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-21 21:57:38 +00:00
Daniel Stone
a6f1952814 x11/dri3: Don't open-code ARRAY_SIZE
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-21 21:57:38 +00:00
Jason Ekstrand
52056206e1 anv: Don't assert that stencil HiZ clears are single-slice
It's true for depth HiZ clears because we only have HiZ on single-slice
images right now.  However, for stencil-only clears there is no such
restriction.

Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-21 13:54:11 -08:00
Jason Ekstrand
7dd0f73fe1 anv: Only copy clear dwords if we're rendering to the first slice
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-02-21 12:47:17 -08:00
Marek Olšák
b494ed168c radeonsi: don't flush when si_eliminate_fast_color_clear is no-op
2018-02-21 20:03:11 +01:00
Marek Olšák
5f55f4c59f radeonsi: make texture_discard_cmask/eliminate functions non-static
2018-02-21 20:03:11 +01:00
James Zhu
81dd4a7637 radeonsi: enable uvd encode for HEVC main
Enable UVD encode for HEVC main profile

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
b38b208ff8 radeonsi:create uvd hevc enc entry
Add UVD hevc encode pipe video codec creation entry

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
e7d51e27ed radeon/uvd:add uvd hevc enc functions
Implement UVD hevc encode functions

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
2b86f5fa0b radeon/uvd:add uvd hevc enc hw ib implementation
Implement required IBs for UVD HEVC encode.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
461508c15c radeon/uvd:add uvd hevc enc hw interface header
Add hevc encode hardware interface for UVD

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
c6acae22c8 winsys/amdgpu:add uvd hevc enc support in amdgpu cs
Support UVD HEVC encode in amdgpu cs

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
f0ad908e79 amd/common:add uvd hevc enc support check in hw query
Use amdgpu hardware query information to check whether UVD HEVC encode is supported

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-21 13:53:38 -05:00
Karol Herbst
7319311a50 nvir/nvc0: fix legalizing of ld unlock c0[0x10000]
We have to increase the file index for 0x10000 as well, not just for values
greater than 0x10000.
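
The boundary case can be sketched like this. It is a hypothetical
model, not the nvc0 legalization code: the window size and both
function names are assumptions used only to show the difference
between ">" and ">=".

```python
WINDOW = 0x10000  # assumed addressable range per file index


def legalize_buggy(file_index, offset):
    # Only offsets strictly greater than 0x10000 are adjusted.
    if offset > WINDOW:
        file_index += offset // WINDOW
        offset %= WINDOW
    return file_index, offset


def legalize_fixed(file_index, offset):
    # 0x10000 itself is out of range too and must be adjusted.
    if offset >= WINDOW:
        file_index += offset // WINDOW
        offset %= WINDOW
    return file_index, offset


# c0[0x10000] is left untouched by the buggy comparison ...
assert legalize_buggy(0, 0x10000) == (0, 0x10000)
# ... while the fixed one moves it to the next file index.
assert legalize_fixed(0, 0x10000) == (1, 0)
```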

Fixes: 37b67db6ae
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-21 11:12:45 +01:00
Samuel Pitoiset
a6accad68f ac/nir: add glsl_is_array_image() helper
For consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-21 09:41:51 +01:00
Samuel Pitoiset
ff83dfb364 ac/nir: set the DA field when performing atomics on 3D images
This doesn't fix anything known but it should definitely be set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-21 09:41:49 +01:00
Eric Anholt
afa7b2f199 i965: Fix compiler warning about write being undefined.
This looks like it should be protected by the assume() about
nr_color_regions, but my compiler warns anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
4636ce362d glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.
Fixes: d32956935e ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
7075c084fc loader: Fix compiler warnings about truncating the PCI ID path.
My build was producing:

../src/loader/loader.c:121:67: warning: ‘%1u’ directive output may be truncated writing between 1 and 3 bytes into a region of size 2 [-Wformat-truncation=]

and we can avoid this careful calculation by just using asprintf (as we do
elsewhere in the file).

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
1b313eedb5 glsl: Silence warnings in the uniform initializer test about 16-bit types
They should probably get unit tests implemented, but this cleans up a
bunch of warnings in my build for now.

Fixes: 59f458cd87 ("glsl: Add 16-bit types")
Cc: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Jordan Justen
96fe36f7ac i965: Enable disk shader cache by default
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-20 18:49:43 -08:00
Dave Airlie
baa0feb73d radv: don't send num_tcs_input_cp to sgprs.
We never use it in the shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:36 +00:00
Dave Airlie
952222ddd4 radv/tess: don't need to look in constant for vertices_per_patch
This just avoids passing this value via user sgprs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:28 +00:00
Dave Airlie
77fd1b9187 ac/radv: cleanup some tcs output values access
Just consolidates some code to make it easier to change.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:23 +00:00
Dave Airlie
0e6f0d400b ac/radv: remove total_vertices variable
This just removes an unneeded variable.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:19 +00:00
Dave Airlie
e9b9fb3616 ac/radv: don't mark tess inner as used if we don't use it.
This just avoids marking it as a used output if we don't
actually use it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:15 +00:00
Dave Airlie
d5b2d7ed67 ac/nir: to integer the args to bcsel.
dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw
was hitting an llvm assert due to one value being an int and the
other a float.

This just casts both values to integer and fixes the test.

Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-20 23:15:18 +00:00
Jason Ekstrand
c66fb12117 anv/blorp: Use layout_to_aux_usage when a layout is provided
Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me
something reasonable" we now use anv_layout_to_aux_usage whenever a
layout is available.  If a layout is available, we ignore the aux_usage
parameter.  For the cases where we have an explicit aux usage such as
clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:17 -08:00
Jason Ekstrand
0fa040e6f5 anv/cmd_buffer: Delete some assert-only variables
Checking the sample count is almost as good as aux usage in this case.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:16 -08:00
Jason Ekstrand
e10a62662b anv/cmd_buffer: Use layout_to_* helpers in compute_aux_usage
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:14 -08:00
Jason Ekstrand
7ea8131aa0 anv/cmd_buffer: Simplify transition_depth_buffer
If we don't have HiZ, then anv_layout_to_aux_usage will return NONE for
both layouts.  If the two layouts are the same, they will get the same
aux usage.  In either case, the code below will give us ISL_AUX_OP_NONE
and we'll return without doing anything.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:09 -08:00
Jason Ekstrand
87e86ee2e6 anv/cmd_buffer: Do subpass image transitions in begin/end_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
7d5f6b6088 anv/cmd_buffer: Mark depth/stencil surfaces written in begin_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
8a3f086a42 anv/cmd_buffer: Sync clear values in begin_subpass
This is quite a bit cleaner because we now sync the clear values at the
same time as we do the fast clear.  For loading the clear values into
the surface state, we now do it once when we handle the LOAD_OP_LOAD
instead of every subpass.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
a4136b8c1a anv/pass: Store usage in each subpass attachment
This requires us to ditch the VkAttachmentReference struct in favor of
an anv-specific struct.  However, we can now easily identify from just
the subpass attachment what kind of an attachment it is.  This will make
iteration over anv_subpass::attachments a little easier in some cases.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
bd356e1bcf anv/cmd_buffer: Add a concept of pending load aspects
These are the same as pending clear aspects only for the "load"
operation.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
e526d49edd anv/cmd_buffer: Iterate all subpass attachments when clearing
This unifies things a bit because we now handle depth and stencil at the
same time.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
2cc3445eb2 anv/cmd_buffer: Decide whether or not to HiZ clear up-front
This moves the decision out of begin_subpass and into BeginRenderPass
like the decision for color clears.  We use a similar name for the
function for depth/stencil as for color even though no aux usage is
really getting computed.

v2 (Jason Ekstrand):
 - Don't always disable HiZ clears by accident
 - Use the initial layout to decide whether to do fast clears

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
6fc8555610 anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
7991838973 intel/blorp: Add a blorp_hiz_clear_depth_stencil helper
This is similar to blorp_gen8_hiz_clear_attachments except that it takes
actual images instead of trusting in the already set depth state.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
1900dd76d0 anv/cmd_buffer: Move the color portion of clear_subpass into begin_subpass
This doesn't really change much now but it will give us more/better
control over clears in the future.  The one interesting functional
change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and
friends for each clear.  However, this only happens at begin_subpass
time so it shouldn't be substantially more expensive.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
6fb9d6c6f5 anv/cmd_buffer: Pass a subpass id into begin_subpass
This is a bit less awkward than passing in the subpass because it means
we don't have to extract the subpass id from the subpass.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
01223b8199 anv/cmd_buffer: Add begin/end_subpass helpers
Having begin/end_subpass is a bit nicer than the begin/next/end hooks
that Vulkan gives us.
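
A minimal sketch of how the three Vulkan hooks can be expressed with two
helpers. The struct, its fields, and the cmd_* function names are
hypothetical and far simpler than the real anv code; passing a subpass id to
begin_subpass() is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical, heavily simplified command-buffer state. */
struct cmd_state {
   int subpass_id;   /* currently active subpass, -1 when none */
   int begin_calls;
   int end_calls;
};

static void
begin_subpass(struct cmd_state *state, int subpass_id)
{
   assert(state->subpass_id == -1);
   state->subpass_id = subpass_id;
   state->begin_calls++;   /* clears and transitions would go here */
}

static void
end_subpass(struct cmd_state *state)
{
   assert(state->subpass_id >= 0);
   state->subpass_id = -1;
   state->end_calls++;     /* resolves would go here */
}

/* The three API-level hooks reduce to combinations of the two helpers. */
void
cmd_begin_render_pass(struct cmd_state *state)
{
   begin_subpass(state, 0);
}

void
cmd_next_subpass(struct cmd_state *state)
{
   int next = state->subpass_id + 1;

   end_subpass(state);
   begin_subpass(state, next);
}

void
cmd_end_render_pass(struct cmd_state *state)
{
   end_subpass(state);
}
```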

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
b5bd3fb4e4 anv/cmd_buffer: Apply subpass flushes before set_subpass
This seems slightly more correct because it means that the flushes
happen before any clears or resolves implied by the subpass transition.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
869448a8ab anv: Use framebuffer layers for implicit subpass transitions
Fixes: de3be61801 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
85d0bec961 anv: Be more careful about fast-clear colors
Previously, we just used all the channels regardless of the format.
This is less than ideal because some channels may have undefined values
and this should be ok from the client's perspective.  Even though the
driver should do the correct thing regardless of what is in the
undefined value, it makes things less deterministic.  In particular, the
driver may choose to fast-clear or not based on undefined values.  This
level of nondeterminism is bad.
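
The idea can be sketched as follows. This is not the anv implementation:
the struct, the channel mask, and the "fast-clear only to zero" rule are
assumptions chosen to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clear color: four raw 32-bit channel values (R, G, B, A). */
struct clear_color {
   uint32_t u32[4];
};

/* Zero every channel the format does not have, so undefined values
 * supplied by the client cannot influence later decisions.  Bit i of
 * channel_mask is set when channel i exists in the format. */
struct clear_color
sanitize_clear_color(struct clear_color color, unsigned channel_mask)
{
   for (unsigned i = 0; i < 4; i++) {
      if (!(channel_mask & (1u << i)))
         color.u32[i] = 0;
   }
   return color;
}

/* Assumed policy for the sketch: fast-clear only when the sanitized
 * color is all zero. */
bool
can_fast_clear_to_zero(struct clear_color color, unsigned channel_mask)
{
   color = sanitize_clear_color(color, channel_mask);
   for (unsigned i = 0; i < 4; i++) {
      if (color.u32[i] != 0)
         return false;
   }
   return true;
}
```

For a red-only format (mask 0x1), garbage in the G, B and A slots no longer
changes the outcome.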

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
4796025ba5 intel/isl: Add an isl_color_value_is_zero helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
116e818ef1 anv/gpu_memcpy: CS Stall before a MI memcpy on gen7
This fixes a pile of hangs caused by the recent shuffling of resolves
and transitions.  The particularly problematic case is when you have at
least three attachments with load ops of CLEAR, LOAD, CLEAR.  In this
case, we execute the first CLEAR followed by a MI memcpy to copy the
clear values over for the LOAD followed by a second CLEAR.  The MI
commands cause the first CLEAR to hang which causes us to get stuck on
the 3DSTATE_MULTISAMPLE in the second CLEAR.

We also add guards for BLORP to fix the same issue.  These shouldn't
actually do anything right now because the only use of indirect clears
in BLORP today is for resolves which are already guarded by a render
cache flush and CS stall.  However, this will guard us against potential
issues in the future.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:19 -08:00
Guillaume Charifi
a572ec2efe st/mesa: Factorize duplicate code for atomic buffer binding
Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 20:54:49 +01:00
Guillaume Charifi
56bfcd50f7 st/mesa: Factorize duplicate code in st_update_framebuffer_state()
Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 20:54:49 +01:00
Rob Clark
4c4e6232ee freedreno/ir3: fix use_count refcnt'ing issue
Was hitting an assert with vs-varying-array-mat4-index-col-row-wr.shader_test

When eliminating a copy, we were dropping the use_count of the mov that
is skipped, but not increasing the use_count of its src instruction.
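
A simplified sketch of the balanced bookkeeping. The struct below is
hypothetical and is not the real ir3 instruction type; it only has a single
source and a use counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction node. */
struct instr {
   int use_count;       /* number of instructions reading this result */
   struct instr *src;   /* single source, or NULL */
};

/* Make 'user' read the mov's source directly, keeping both use counts
 * balanced: the mov loses a reader, its source gains one. */
void
eliminate_copy(struct instr *user)
{
   struct instr *mov = user->src;

   assert(mov != NULL && mov->src != NULL);

   user->src = mov->src;
   mov->use_count--;        /* user no longer reads the mov */
   mov->src->use_count++;   /* user now reads the mov's source */
}
```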

Fixes: 76440fcca9 freedreno/ir3: clean up dangling false-dep's
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-20 13:43:42 -05:00
Eric Engestrom
ac731531a1 docs: fix patent url
Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 15:14:34 +00:00
Brian Paul
e7d1a93723 svga: replaced 'unsigned' with proper enum types in shader code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-20 08:11:06 -07:00
Jonathan Gray
9401d90a53 configure.ac: pthread-stubs not present on OpenBSD
pthread-stubs is no longer required on OpenBSD and has been removed.
libpthread parts involved moved to libc.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 15:08:47 +00:00
Andres Gomez
36ac485bd1 swr: bump minimum supported LLVM version to 4.0
Since radv and radeonsi removed support for LLVM 3.9 the distcheck
target got broken because SWR distribution needed 3.9.x.

After checking with George Kyriazis, SWR is OK with moving to LLVM 4.0
and above, which will solve this problem.

Fixes: 3bf1e036e8 ("amd: remove support for LLVM 3.9")
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2018-02-20 17:03:06 +02:00
Andres Gomez
b39f6d5fc7 travis: radeonsi and radv need LLVM 4.0
Fixes: 3bf1e036e8 ("amd: remove support for LLVM 3.9")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 16:58:30 +02:00
Samuel Pitoiset
1ac741d690 ac/nir: move ac_declare_lds_as_pointer() outside of the switch
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-20 10:44:59 +01:00
Samuel Pitoiset
b5d111ae76 radv: allow to force family using RADV_FORCE_FAMILY
Useful for pipeline-db.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-20 10:44:47 +01:00
Thomas Hellstrom
f386776ea5 loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback
Removing this callback caused rendering corruption in some multi-screen cases,
so it is reinstated but without the drawable argument which was never used
by implementations and was confusing since the drawable could have been
created with another screen.

Cc: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 5198e48a0d (loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105013
Reported-by: Daniel van Vugt <daniel.van.vugt@canonical.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-20 10:36:53 +01:00
Thomas Hellstrom
80c31f7837 svga: Fix a leftover debug hack
Fix what appears to be a leftover debug hack.
The hack would force the driver to take a different blit path; possibly,
although unverified, reverting to software blits.

Tested using piglit tests/quick. No related regressions.

Cc: "17.2 17.3 18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 9d81ab7376 (svga: Relax the format checks for copy_region_vgpu10 somewhat)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104625
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-20 10:12:19 +01:00
Iago Toral Quiroga
af5f2322d0 anv/entrypoints: make vkGetDeviceProcAddr return NULL for instance commands
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-20 08:12:32 +01:00
Ilia Mirkin
e1a70aed10 nv50,nvc0: mark ABGR format as displayable instead of ARGB format
This matches the hardware's capabilities.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-19 22:33:58 -05:00
Ilia Mirkin
f7604d8af5 st/dri: only expose config formats that are display targets
In the case of NVIDIA hardware, ABGR is displayable but ARGB is not.
Only advertise the one set in the visuals list.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-02-19 22:33:58 -05:00
Ilia Mirkin
ebdc4c31e2 mesa: add xbgr support adjacent to xrgb
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-02-19 22:33:58 -05:00
Timothy Arceri
d88a2906f8 st/shader_cache: copy nir pointer to gl_program after deserializing
This fixes a crash when running the arb_get_program_binary-api-errors
piglit test twice.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Timothy Arceri
691c320de0 radeonsi: add nir shader cache support
In the future we might want to try to avoid calling nir_serialize() but
this works for now.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Timothy Arceri
2b431808ab radeonsi: rename variables tgsi_binary -> ir_binary
This better represents that the ir could be either tgsi or nir.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Emil Velikov
1270990438 docs: update calendar, add news and link release notes to 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-19 22:10:18 +00:00
Emil Velikov
be5a996039 docs: add sha256 checksums for 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 164a993112)
2018-02-19 22:08:14 +00:00
Emil Velikov
ca614d40cd docs: add release notes for 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2529d77179)
2018-02-19 22:08:12 +00:00
Marek Olšák
f78fe98fff radeonsi: fix regression from 32-bit pointers on CI
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-02-19 17:56:23 +01:00
Samuel Pitoiset
549c7f3724 radv: compact varyings after removing unused ones
It makes no sense to compact before, and the description of
nir_compact_varyings() confirms that.

Polaris10:
Totals from affected shaders:
SGPRS: 108528 -> 108128 (-0.37 %)
VGPRS: 74548 -> 74500 (-0.06 %)
Spilled SGPRs: 844 -> 814 (-3.55 %)
Code Size: 3007328 -> 2992932 (-0.48 %) bytes
Max Waves: 16019 -> 16009 (-0.06 %)

Vega10:
Totals from affected shaders:
SGPRS: 106088 -> 106232 (0.14 %)
VGPRS: 74652 -> 74700 (0.06 %)
Spilled SGPRs: 692 -> 658 (-4.91 %)
Code Size: 2967708 -> 2953028 (-0.49 %) bytes
Max Waves: 18178 -> 18162 (-0.09 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-19 12:19:17 +01:00
Timothy Arceri
51e745cf77 radeonsi/nir: fix gl_FragCoord for pixel_center_integer
Fixes piglit test glsl-arb-fragment-coord-conventions

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-19 08:47:48 +11:00
Timothy Arceri
347038baa9 glsl/nir: add pixel_center_integer to shader info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-19 08:47:48 +11:00
Ilia Mirkin
fe76fc11b1 gm107/ir: avoid using kepler instruction capabilities
Split up the op properties table into generation-specific bits, and only
use the kepler ones on kepler. Fixes some CTS images tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-02-17 23:41:21 -05:00
Ilia Mirkin
f08fd676bf nvc0: add support for bindless on maxwell+
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 23:41:21 -05:00
Ilia Mirkin
0255550eb1 gm107/ir: change how SUQ works in preparation for bindless
All this information can be retrieved from the TIC directly. Avoid
having to dip into the constbuf information about the image.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 23:41:21 -05:00
Kenneth Graunke
fa8a764b62 i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address.  This makes it unusable for pushing UBOs.

There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer.  Setting that gives us full
flexibility.  This lets us push up to 4 UBO ranges.

We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance.  We also need a brand new kernel that supports
context isolation - on older kernels, newly created contexts inherit
register state from whatever happened to be running.  So, setting this
would have catastrophic impact on other drivers such as libva, Beignet,
or older Mesa.

See commit 8ec5a4e4a4 where we did this
once before, but had to revert it in commit 013d331220.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-17 11:26:31 -08:00
Kenneth Graunke
a63c74be85 i965: Stop restoring the default L3 configuration on Kernel 4.16+.
Kernel 4.16 has proper context isolation, which means we can change
the L3 configuration without worrying about that leaking to other
newly created contexts, breaking the assumptions of other userspace.

So, disable our workaround to reprogram it back to the default.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-17 11:26:18 -08:00
Mikko Perttunen
5a1606c51f nvc0: Use GP100_COMPUTE_CLASS on GP10B
GP10B requires the use of GP100_COMPUTE_CLASS instead of
GP104_COMPUTE_CLASS as is used for other non-GP100 chips.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 14:16:10 -05:00
Daniel Stone
9d21dbeb88 i965: Fix aux-surface size check
The previous commit reworked the checks in intel_from_planar() to check the
right individual cases for regular/planar/aux buffers, and do size
checks in all cases.

Unfortunately, the aux size check was broken, and required the aux
surface to be allocated with the correct aux stride, but full image
height (!).

As the ISL aux surface is not recorded in the DRIimage, we cannot easily
access it to check. Instead, store the aux size from when we do have the
ISL surface to hand, and check against that later when we go to access
the aux surface.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: c2c4e5bae3 ("i965: Fix bugs in intel_from_planar")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-17 10:22:35 +00:00
Marek Olšák
931ec80eeb radeonsi: implement 32-bit pointers in user data SGPRs (v2)
User SGPRs changes:
    VS:     14 ->  9
    TCS:    14 -> 10
    TES:    10 ->  6
    GS:      8 ->  4
    GSCOPY:  2 ->  1
    PS:      9 ->  5
    Merged VS-TCS: 24 -> 16
    Merged VS-GS:  18 -> 11
    Merged TES-GS: 18 -> 11

SGPRS: 2170102 -> 2158430 (-0.54 %)
VGPRS: 1645656 -> 1641516 (-0.25 %)
Spilled SGPRs: 9078 -> 8810 (-2.95 %)
Spilled VGPRs: 130 -> 114 (-12.31 %)
Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
Code Size: 52094872 -> 52692540 (1.15 %) bytes
Max Waves: 371848 -> 372723 (0.24 %)

v2: - the shader cache needs to take address32_hi into account
    - set amdgpu-32bit-address-high-bits

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2018-02-17 04:52:17 +01:00
Marek Olšák
5722cd4084 radeonsi: disallow constant buffers with a 64-bit address in slot 0
State trackers must use a user buffer or const_uploader,
or set pipe_resource::flags same as const_uploader->flags.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
d790b6cece radeonsi: move const_uploader allocations to 32-bit address space
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
50581549b7 winsys/radeon: implement and enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
1104d1e9d3 winsys/radeon: add struct radeon_vm_heap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
48ecacfefa winsys/amdgpu: enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
c2da45be86 gallium/radeon: add 32-bit address space heaps
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
0977b7f7b3 ac: query high bits of 32-bit address space
2018-02-17 04:51:58 +01:00
Marek Olšák
16be55da94 gallium: use PIPE_CAP_CONSTBUF0_FLAGS
2018-02-17 04:20:55 +01:00
Marek Olšák
8e7222f4e5 gallium: allow drivers to impose BO flags restrictions on constant buffer 0
Required by radeonsi for optimal behavior.
2018-02-17 04:20:55 +01:00
Alexander von Gluck IV
834d221512 meson: Add Haiku platform support v4
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-16 16:56:34 -06:00
Anuj Phogat
7b283544dc anv/icl: Add render target flush after uploading binding table
The PIPE_CONTROL command description says:

"Whenever a Binding Table Index (BTI) used by a Render Target Message
points to a different RENDER_SURFACE_STATE, SW must issue a Render
Target Cache Flush by enabling this bit. When render target flush
is set due to new association of BTI, PS Scoreboard Stall bit must
be set in this packet."

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
136f583a24 anv/icl: Enable float blend optimization
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
cd7102972f anv/icl: Use gen11 functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
9673c21d4f anv/icl: Build anv libs for gen11
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
1f108b436b anv/icl: Generate gen11 entry point functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
a86c0a08df anv/icl: Don't use DISPATCH_MODE_SIMD4X2
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
cd5fc634a8 anv/icl: Don't use SingleVertexDispatch
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
6e3940b3cf anv/icl: Don't set ResetGatewayTimer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
41a4c2c8e8 anv/icl: Add #define genX
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Anuj Phogat
413d475b44 anv/icl: Add gen11 mocs defines
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Kenneth Graunke
1d6cf433d2 i965: Implement GenerateMipmap directly, rather than using Meta.
Meta is awful and we'd like to stop using it.  Implementing this using
BLORP allows us to stop trashing a bunch of GL state every time.

This follows the structure of st_generate_mipmap().
compute_num_levels is lifted directly from there.

Improves performance in Gl41HdrBloom by about 11.794% +/- 1.01919% (n=3)
on Kabylake GT2 at 1280x720 (the difference seems much smaller at higher
resolutions).

v2 (idr): Don't try depth or depth-stencil blorp blits on Gen4 or Gen5
because it's not implemented yet.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-02-16 10:48:10 -08:00
Kenneth Graunke
9bcd31ea90 mesa: Move compute_num_levels from st_gen_mipmap.c to mipmap.c.
I want to use compute_num_levels inside i965.  Rather than duplicating
it, move it from mesa/st to core Mesa, and make it non-static.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-16 10:48:10 -08:00
Dylan Baker
03ab40b1f7 meson: freedreno depends on nir
This fixes a race condition in building targets that link in freedreno.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120
Fixes: 0bbecc5a85 ("meson: define driver dependencies")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Mark Janes <mark.a.janes@intel.com>
2018-02-16 10:10:18 -08:00
George Kyriazis
f1fbeb1a53 swr/rast: blend_epi32() should return Integer, not Float
fix gcc8 compiler error for KNL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105029
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
7dd793d10c swr/rast: Normalize path for debug metadata
in template gen_llvm.hpp

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
f979d0bc2f swr/rast: Consolidate archrast Draw events
Consolidate archrast draw events into a single draw event with an attribute
that represents the type of draw.

- Add handlers for new private proto versions of DrawInstancedEvent,
  DrawIndexedInstancedEvent, DrawInstancedSplitEvent, and
  DrawIndexedInstancedSplitEvent
- Convert the draw events to generic DrawInfoEvents
- parse_proto_event_fields() replaces 'AR_DRAW_TYPE' as a field type with
  'uint32_t'. This draw type is actually an enum, but can be represented
  as an unsigned integer.
- is_draw_or_dispatch() recognizes DrawInfoEvent as a draw event

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
45df1a6520 swr/rast: Add semantics for translating address
Added support for another full translation path in fetch jitter.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
c09483cf0a swr/rast: Convert C Sampler intrinsics
Convert portions of the C sampler to the rasty SIMD lib.

Also fix SRL call with a non-immediate.  Don't count on the compiler
automagically converting an srli call to srl if the shift count isn't
an immediate.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
37ebf86add swr/rast: Make SIMDLib templated types easier to use
"typename SIMD_T::TypeName" --> "TypeName<SIMD_T>"

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
74e8bb4a22 swr/rast: Be more explicit when fetching next component
Use a new function to denote that we want to get offset to next component
and hide the fact that GEP is used underneath.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
da77eb55d5 swr/rast: Fix bug related to passing AR handle
We were passing a garbage handle. Let's not do that.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
48d62409f8 swr/rast: Fix primitive replication issue in tesselation PA.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
e12db47a7d swr/rast: Use llvm intrinsic masked gather
Use llvm intrinsic masked.gather instead of manual unroll for the cases
where we have vector of pointers.  Improves llvm IR debug experience by
reducing a ton of IR to a single intrinsic call. Also seems to reduce
overall stack use considerably.
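
For readers unfamiliar with the intrinsic, the scalar loop below sketches
what a masked gather computes per lane: load through the lane's pointer when
its mask bit is set, otherwise keep a pass-through value. The function name,
the 8-lane width, and the bitmask encoding are assumptions of this sketch,
not SWR or LLVM code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIMD_WIDTH 8

/* Scalar reference for a masked gather over a vector of pointers.
 * Inactive lanes are never dereferenced, so their pointers may be
 * invalid. */
void
masked_gather_f32(float dst[SIMD_WIDTH],
                  const float *ptrs[SIMD_WIDTH],
                  uint8_t mask,
                  const float passthru[SIMD_WIDTH])
{
   for (int lane = 0; lane < SIMD_WIDTH; lane++) {
      if (mask & (1u << lane)) {
         assert(ptrs[lane] != NULL);
         dst[lane] = *ptrs[lane];
      } else {
         dst[lane] = passthru[lane];
      }
   }
}
```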

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
9cc9688e49 swr/rast: Misc cleanup
Together with correct detection of clipDistance NaNs when no cullDistance is set

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
036c8b6247 swr/rast: Renamed variable in vertexbufferstate
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
b25efa36e6 swr/rast: Fix GATHERPS to avoid assertions.
With the pBase type change, LLVM was asserting because of wrong types.
Cast appropriately.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
8a64593bde swr/rast: More precise user clip distance interpolation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
3e560b7c85 swr/rast: Cull prims when all verts have negative clip distances
Performance optimization, and fixes some clipping issues.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
cb4b604ebd swr/rast: whitespace and comment cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
5df4d98780 swr/rast: Fix invalid number of attributes
Fix invalid number of attributes passed into tessellation PA.
Needs to take into account any offsets from the shader.
Innocuous issue, but removes an assert firing in debug.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
2053472723 swr/rast: Add clipper stats.
Clipper event is now:

event ClipperEvent
{
    uint32_t drawId;
    uint32_t trivialRejectCount;
    uint32_t trivialAcceptCount;
    uint32_t mustClipCount;
};

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
0420b2be89 swr/rast: Separate event types to public and private
Split into two proto files and modify appropriate build rules for
configure / scons / meson builds.

There are private internal events (proxy) that communicate information
from rasterizer to ArchRast. ArchRast can use these events to calculate
a final answer and then emit other public events which will be saved to
file. Users will use the public proto file and not the private one.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
e48dd2489c swr/rast: Clean up event types and remove BE events
Begin/End events not needed anymore.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
7070027d7b swr/rast: Removed unused variable
Gets rid of zillions of unused variable warnings, made worse by templates.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
e3f92bb7af swr/rast: Separate RDTSC code from archrast
Renamed rdtsc defines more appropriately

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
8bce71622e swr/rast: Cleanup of mpPrivateContext in Builder
Provide access functions for mpPrivateContext in Builder.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
5697dc3e23 swr/rast: Remove some JIT debug code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
2407b8c9b4 swr/rast: Don't include private context in gather args
Move mpPrivateContext to compensate

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
a4c23fc25b swr/rast: Cleanup knob definitions
Rename some of the categories and move some options around.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:42 -06:00
George Kyriazis
ec34ed73d6 swr/rast: Add missing parameter to a few gather functions
We now pass pDrawContext as a default parameter

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:39:42 -06:00
Philipp Zabel
bfe4e24a42 etnaviv: add useful information to BO import errors
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-02-16 17:05:43 +01:00
Daniel Stone
ff5432dc50 egl/wayland: Always use in-tree wayland-egl-backend.h
A recent patchset to Wayland[0] migrated Mesa's libwayland-egl backend
into Wayland itself, so implementations could provide backends. Mesa
still uses its own, and the two have already diverged[1].

The include from egl_dri2.h could pick up either the installed Wayland
wayland-egl-backend.h (with a 'driver_private' member), or the Mesa
internal wayland-egl-backend.h (with a 'private' member), failing the
build in the first instance.

Add an explicit directory prefix to the include, so we always get our
in-tree version.

[0]: https://patchwork.freedesktop.org/series/31663/
[1]: https://cgit.freedesktop.org/wayland/wayland/commit/?id=9fa60983b579

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105103
Fixes: 198af27c67 ("wayland-egl: rename wayland-egl-{priv,backend}.h")
2018-02-16 14:04:19 +00:00
Daniel Stone
f766e1afa5 meson: Move Wayland dmabuf to wayland-drm
As the comment notes: linux-dmabuf has nothing to do with wayland-drm,
but we need a single place to build these files we can use from both EGL
and Vulkan, which is guaranteed to be included before both EGL and
Vulkan WSI.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-02-16 14:04:19 +00:00
Eric Engestrom
65dda6c9ec egl/wayland: check for invalid format index
v2: just tell the compiler to assume the format will always be found, as
it comes from the table itself to begin with. (DanielS)

CID: 1429516
Fixes: d32b23f383 "egl/wayland: Add bpp to visual map"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-16 13:14:29 +00:00
Eric Engestrom
a176b053b6 glsl: fix sizeof(pointer) bug
Doesn't really change anything to the test though ¯\_(ツ)_/¯

CID: 1429511
Fixes: e8495646af "glsl/tests: changes to test_disk_cache_create test"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-16 12:04:29 +00:00
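
The message of a176b053b6 does not show the offending code, so the following is only a generic sketch of the sizeof(pointer) bug class it names. Both function names are hypothetical.

```c
#include <stddef.h>

/* Bug pattern: `buf` is a pointer, so sizeof(buf) is the size of the
 * pointer itself (commonly 4 or 8 bytes), not the size of the buffer. */
static size_t
size_seen_through_pointer(const char *buf)
{
   (void)buf;
   return sizeof(buf);
}

/* At the point of declaration the array type is still known, so sizeof
 * yields the full buffer size in bytes. */
static size_t
size_at_declaration(void)
{
   char path[256];
   return sizeof(path);
}
```

The usual fix is to pass the buffer length alongside the pointer, or to take sizeof of the array where it is declared.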
Timothy Arceri
2f5d3df9fc radeonsi/nir: set TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL correctly
We set this for post_depth_coverage in addition to early_fragment_tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-16 15:53:13 +11:00
Dave Airlie
60c14a0db2 virgl: remap query types to hw support.
The gallium query types changed, so we need to remap from the
gallium ones to the virgl ones.

Fixes:
dEQP-GLES3.functional.transform_feedback.basic_types*

"This also fixes:

dEQP-GLES3.functional.transform_feedback.array.separate*
dEQP-GLES3.functional.transform_feedback.array_element*
dEQP-GLES3.functional.transform_feedback.interpolation.*

Gallium's p_defines.h and virglrenderer's p_defines.h have diverged
quite a bit, so not including
PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE there makes sense for now."
 - Gurchetan Singh

Fixes: 3f6b3d9db (gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-16 12:42:06 +10:00
Anuj Phogat
8a05b06146 i965/icl: Add render target flush after uploading binding table
From PIPE_CONTROL command description in gfxspecs:

"Whenever a Binding Table Index (BTI) used by a Render Target Message
 points to a different RENDER_SURFACE_STATE, SW must issue a Render
 Target Cache Flush by enabling this bit. When render target flush
 is set due to new association of BTI, PS Scoreboard Stall bit must
 be set in this packet."

V2: Move the PIPE_CONTROL to update_renderbuffer_surfaces() in
    brw_wm_surface_state.c (Ken).

Fixes a fulsim error and a GPU hang described in the JIRA below.
JIRA: MD5-322
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
3f8289164f i965/icl: Enable float blend optimization and Wa3DStateMode
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
ba3cbee6c5 intel/common/icl: Add has_sample_with_hiz flag in gen_device_info
Sampling from hiz is enabled in i965 for GEN9+ but this feature has
been removed from gen11. So, this new flag will be useful to turn
the feature on/off for different gen h/w. It will be used later
in a patch adding device info for gen11.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
9c144dc81e i965/icl: Add assertions to check dispatch mode is SIMD8
SIMD4x2 dispatch mode has been removed in GEN11. Mesa doesn't use
it anyway. Add a few asserts to make this explicit.

Use GEN_GEN macro in place of devinfo->gen (Ken)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
02e91b6d62 i965/icl: Update switch statements
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
27d0034938 i965/icl: Update the assert in brw_memory_barrier()
Nothing is changed here from gen10 to gen11. So, just update
the assert.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
d6b26649a6 i965/icl: Define and use icl mocs settings
Gen11 MOCS settings are a duplicate of the Gen10 MOCS settings.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
e9ad5c9a5d i965/icl: Update the comment for maximum number of threads per PSD
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
93f601d7ed i965/icl: Build and use gen11 functions for genxml state-upload and blorp
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:56 -08:00
Anuj Phogat
85f319155f i965/icl: Don't set ResetGatewayTimer
This field is removed in gen11+

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
772a75be46 intel/icl: Do StateCacheInvalidation for indirect clear color
StateCacheInvalidation is required on all gen7+ platforms. We
don't need to update this check for every new gen h/w unless
this requirement is changed. So, dropping the check for latest
gen h/w.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
bff24e2173 intel/isl/icl: Build and use gen11 surface state emit functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:55 -08:00
Anuj Phogat
0427bd4954 intel/isl/icl: Add the maximum surface size limit
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
c68ede0be7 intel/genxml/icl: Update genx_bits header
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
165a68b05a intel/genxml/icl: Generate packing headers
Move build system changes in to one patch (Ken, Emil)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:55 -08:00
Anuj Phogat
7ed27d8cbf intel/genxml/icl: Add gen11.xml
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 16:14:55 -08:00
Kenneth Graunke
4dee8f0548 i965: Drop EXEC_OBJECT_CAPTURE defines.
These only existed to avoid making people update libdrm for new uABI
headers.  A while ago we imported those headers into the Mesa repo,
so the dependency is gone and these are no longer useful.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-15 15:35:52 -08:00
Jan Vesely
78673b614b clover: Fix build after llvm r325155 and r325160
r325155 ("Pass a reference to a module to the bitcode writer.")
and
r325160 ("Pass module reference to CloneModule")

change function interface from pointer to reference.

v2: Fix indentation (tab instead of spaces)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-15 18:18:53 -05:00
Bas Nieuwenhuizen
05d84ed68a radv: Always lower indirect derefs after nir_lower_global_vars_to_local.
Otherwise new local variables can cause hangs on vega.

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105098
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-15 23:45:59 +01:00
Dylan Baker
2ab1ce30c4 meson: fix xvmc target linkage
This needs to link the state tracker with --whole-archive to expose the
right symbols.

v4: - Always add libswdri and libswkmsdri to the link_with list

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:38:43 -08:00
Dylan Baker
0b73c329bc meson: Fix xa target linkage
This needs to use --whole-archive (link_whole in meson) to properly
expose symbols.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 0ba909f0f1 ("meson: build gallium xa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:31 -08:00
Dylan Baker
91a59b6287 meson: Fix omx-bellagio target linkage
This needs to use --whole-archive (link_whole in meson) to properly
expose symbols.

v4: - Always add libswdri and libswkmsdri to link_with

Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:26 -08:00
Dylan Baker
2e4be28fb2 meson: fix va target linkage
The state tracker needs to be linked with whole-archive (like
autotools). As a result there are symbols from libswdri and libswkmsdri
that are needed, so link those as well.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:16 -08:00
Dylan Baker
90d361753c meson: fix vdpau target linkage
The VDPAU state tracker needs to be linked with whole-archive (autotools
does this). Because we are linking the whole archive we also need to
link with libswdri and libswkmsdri if those have been enabled.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 68076b8747 ("meson: build gallium vdpau state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:09 -08:00
Dylan Baker
3403055768 meson: Actually link xvmc target with libxvmc
Unlike vdpau this is required.

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:04 -08:00
Dylan Baker
7708103857 meson: actually link with libomxil-bellagio
This state tracker actually needs to link, unlike vdpau.

Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:57 -08:00
Dylan Baker
7023b373ec meson: link dri3 xcb libs into vlwinsys instead of into each target
This makes the dependencies easier to manage, since each media target
doesn't need to worry about linking to half a dozen libraries.

Fixes: b1b65397d0 ("meson: Build gallium auxiliary")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:51 -08:00
Dylan Baker
424e654cb0 meson: use va-api version reported by pkg-config
Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:47 -08:00
Dylan Baker
8eb608df61 meson: add libswdri and libswkmsdri to dri link_with
Fixes: b154b44ae3 ("meson: build radeonsi gallium driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:42 -08:00
Dylan Baker
be879f9f29 meson: add libswdri and libswkmsdri to d3dadaptor link_with
v5: - Fix libswdi -> libswdri typo

Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:36 -08:00
Dylan Baker
d672084ba2 meson: define empty variables for libswdri and libswkmsdri
This allows these variables to be unconditionally included in `link_with`
lists, even if they're not used. This allows deleting duplicated logic
in nearly every gallium target implemented in meson today. This also
removes the now useless `build_by_default` flag from swdri and swkmsdri.

v4: - add this patch

Fixes: 66c94b9313
       ("meson: build gallium winsys for dri, null, and wrapper")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:23 -08:00
Dylan Baker
7d0e342af2 meson: add convenience variable for anv_extensions.py dependency
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:46:07 -08:00
Dylan Baker
0e617c04f1 meson: use depend_files for adding extra file dependencies
cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: dd088d4bec ("anv/extensions: Generate a header file with extension tables")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:46:04 -08:00
Dylan Baker
b03969a5ad meson: use depend_files to track extra file dependencies
cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: f939940809 ("anv: Split anv_extensions.py into two files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:45:56 -08:00
Dylan Baker
384bff13e0 Revert "anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py"
This reverts commit 10d1b0be8e.

This is unnecessary, the depend_files argument is for adding
dependencies on files that are not part of the input, which is already
done.

cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 10d1b0be8e
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:45:40 -08:00
Brian Paul
64a1223a80 svga: replace gotos with else clauses
Simple clean-up.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:49:06 -07:00
Brian Paul
fa901768a4 svga: s/unsigned/enum pipe_shader_type/
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-15 09:05:09 -07:00
Brian Paul
8b54299c34 svga: move duplicated code for setting fillmode/flatshade state
Move the calls to svga_hwtnl_set_fillmode() and svga_hwtnl_set_flatshade()
out of the two retry_draw_*() functions to the svga_draw_vbo() function.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:09 -07:00
Brian Paul
072df89a79 svga: move svga_update_state() call in draw code
This fixes a few Piglit transform feedback regressions caused by
commit 7a1401938b.

In that change I moved the svga_update_state() call into the loops,
after the calls to svga_hwtnl_set_flatshade().  But
svga_hwtnl_set_flatshade() actually depends on some derived shader
state.  This patch moves the svga_update_state() call into
svga_draw_vbo() so it's not duplicated in two places.

Fixes: 7a1401938b ("svga: clean up retry_draw_range_elements(),
retry_draw_arrays()")

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:08 -07:00
Brian Paul
6f0aec5671 svga: call tgsi_scan_shader() for dummy shaders
If we fail to compile the normal VS or FS we fall back to a simple/
dummy shader.  We need to rescan the shader to update the shader
info.  Otherwise, this can lead to further translation failures
because the shader info doesn't match the actual shader.

Found by adding some extra debug assertions in the state-update code
while debugging something else.

v2: also update shader generic_inputs/outputs, etc. per Charmaine

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:01 -07:00
Samuel Pitoiset
579b33c1fd ac/nir: do not reserve user SGPRs for unused descriptor sets
In theory this might lead to corruption if we bind a descriptor
set which is unused, because LLVM is smart and it can re-use
unused user SGPRs. In practice, this doesn't seem to fix
anything.

As a side effect, this will reduce the number of emitted
SH_REG packets.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
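
The idea in 579b33c1fd, reserving user SGPRs only for descriptor sets that a shader actually uses, can be sketched as a walk over a used-set bitmask. `MAX_SETS`, `sgprs_per_set`, and the function name are assumptions for illustration and are not taken from the ac/nir code.

```c
#include <stdint.h>

#define MAX_SETS 32 /* assumed upper bound on descriptor sets */

/* Counts the user SGPRs needed when only the descriptor sets flagged in
 * desc_set_used_mask get a slot; unused sets reserve nothing. */
static unsigned
user_sgprs_for_used_sets(uint32_t desc_set_used_mask, unsigned sgprs_per_set)
{
   unsigned count = 0;

   for (unsigned set = 0; set < MAX_SETS; set++) {
      if (desc_set_used_mask & (UINT32_C(1) << set))
         count += sgprs_per_set;
   }
   return count;
}
```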
Samuel Pitoiset
309854148c ac/shader: fix gathering of desc_set_used_mask
This was quite wrong.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
Samuel Pitoiset
61a4fc3ecc ac/shader: be a little smarter when scanning vertex buffers
Meta shaders don't use any vertex buffers, so there is no
behaviour change, but I think it's better to do this. It does
save two user SGPRs, which can be used for push constants
inlining or something else.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
Louis-Francis Ratté-Boulianne
a34715ad9c dri: fromPlanar() can return NULL as a valid result
It was assumed that fromPlanar() could return NULL to mean
that the planar image is the same as the parent DRI image.
That assumption wasn't made everywhere though.

Let's fix things and make sure that all callers understand
a NULL result

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-15 11:58:17 +00:00
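
The caller-side convention described in a34715ad9c can be sketched as follows. `struct image` and `resolve_planar_image` are hypothetical stand-ins, not the DRI interface types.

```c
#include <stddef.h>

struct image {
   int plane; /* placeholder field; the real DRI image type is opaque */
};

/* A NULL result from a fromPlanar()-style call is not an error: it means
 * the planar image is the same as the parent, so fall back to the parent. */
static struct image *
resolve_planar_image(struct image *parent, struct image *from_planar_result)
{
   return from_planar_result ? from_planar_result : parent;
}
```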
Emil Velikov
f0654dfa65 docs: correct link to the 17.3.3 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 11:33:27 +00:00
Emil Velikov
dd4734d5c1 docs: update calendar, add news and link release notes to 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 11:33:04 +00:00
Emil Velikov
eadde35f83 docs: add sha256 checksums for 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 26c84b8af9)
2018-02-15 11:28:19 +00:00
Emil Velikov
6f4a6e2310 docs: add release notes for 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2f9820c553)
2018-02-15 11:28:18 +00:00
Karol Herbst
7bc15090fc nvc0: disable MS Images for sample_count == 1 on Maxwell
fixes KHR-GL45.multi_bind.dispatch_bind_textures on Maxwell

Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-15 11:14:46 +01:00
Gurchetan Singh
c6694793e1 mesa: don't clamp just based on ARB_viewport_array extension
The ARB_viewport_array spec says:

"Dependencies
    OpenGL 1.0 is required.

    OpenGL 3.2 or the EXT_geometry_shader4 or ARB_geometry_shader4 extensions
    are required.

    This extension is written against the OpenGL 3.2 (Compatibility)
    Specification."

As such, we should ignore it for GLES2 contexts.

Fixes:
dEQP-GLES2.functional.state_query.integers.viewport_getinteger
dEQP-GLES2.functional.state_query.integers.viewport_getfloat

on llvmpipe and virgl.

v2: Use _mesa_has_* (Ilia)

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-15 01:58:50 +01:00
Dylan Baker
5317211fa0 meson: use a custom target instead of a generator for i965 oa
Generators really are never the thing you want. The problem in this case
is that a generator must create a file that contains any file that the
generated target depends on. Since brw_oa.py doesn't generate such a
file the generated sources are not regenerated even if the xml files
they should depend on change.

While we could change brw_oa.py to write such a file, that's silly, it
depends on itself and the xml file. So we'll just use a custom target
instead, which will have the correct dependency behavior and doesn't
really add that much code.

Fixes: 3218056e0e ("meson: Build i965 and dri stack")
CC: Ian Romanick <idr@freedesktop.org>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-14 16:45:40 -08:00
Anuj Phogat
0cd37f9178 isl: Don't use surface format R32_FLOAT for typed atomic integer operations
From Skylake PRM Surface Formats section:

   "The surface format for the typed atomic integer operations must
    be R32_UINT or R32_SINT."

Fixes an error and a piglit GPU hang in simulation environment.
Piglit test: gl45-imageAtomicExchange-float.shader_test

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.co
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
2018-02-14 16:30:05 -08:00
Timothy Arceri
7be5f30bb1 radeonsi/nir: fix si_nir_load_tcs_varyings() for outputs
We were incorrectly using the input info for outputs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
9740c8a8aa ac: implement nir_intrinsic_image_samples
Fixes cts test:
KHR-GL45.shader_texture_image_samples_tests.image_functional_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
c6b70a0eae st: add NIR GL_ARB_get_program_binary support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
928be4e97e st/shader_cache: add st_{de}serialise_nir_program() helpers
These will be used for NIR GL_ARB_get_program_binary support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
3ad52501dc ac/nir_to_llvm: fix image size for arrays of arrays
Fixes cts test:
KHR-GL44.shader_image_size.advanced-changeSize

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
6acab18828 radeonsi/nir: fix shader ballot return value bitsize
Fixes cts test:
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Jason Ekstrand
8534af44e4 intel/aubinator: Correctly decode INTERFACE_DESCRIPTOR_DATA
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 13:17:26 -08:00
Jason Ekstrand
5c9d47d9c6 i965: Add gl_state_index casts for PATCH_VERTICES_IN
This fixes the build in clang

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105088
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 13:16:47 -08:00
Scott D Phillips
3b4f432d9b i965/miptree: Initialize mcs with a linear map
When initializing mcs, map with MAP_RAW and fill in the linear
map. Removes a place where gtt mapping is used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 12:38:34 -08:00
Scott D Phillips
d13ab69a78 i965/tiled_memcpy: change linear pointer from (0, 0) to (xt1, yt1)
In all current uses, the linear surface is only allocated starting
at (xt1, yt1) anyway, so this improves the calling ergonomics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 12:38:34 -08:00
Scott D Phillips
ecaad89525 i965/tiled_memcpy: linear_to_ytiled a cache line at a time
TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0
Thus a cache line in the tiled surface is composed of a 2d area of
16x4 bytes of the linear surface.

Add a special case where the area being copied is 4-line aligned
and a multiple of 4-lines so that entire cache lines will be
written at a time.

On Apollolake, this increases tiling throughput to wc maps by
84.0103% +/- 0.862818%

v2: Split [y0, y1) and [y2, y3) loops apart for clarity (Jason Ekstrand)
v3: Don't reset src var (Jason), Ensure y0 <= y1 <= y2 <= y3

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-14 12:38:34 -08:00
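
The bit layout quoted in ecaad89525 (v1 v0 u3 u2 u1 u0) can be illustrated with a small sketch. Both function names are hypothetical, and this is not the tiled_memcpy implementation; it only shows why a 16x4-byte block of the linear surface fills exactly one 64-byte cache line.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of byte (u, v) within its 64-byte cache line, where u is the byte
 * column and v is the row: the low six address bits are v1 v0 u3 u2 u1 u0. */
static uint32_t
ytile_cacheline_offset(uint32_t u, uint32_t v)
{
   return ((v & 0x3u) << 4) | (u & 0xfu);
}

/* Copies a 16x4-byte block of a linear surface into one cache line.
 * src points at the block's top-left byte and src_pitch is the linear row
 * stride in bytes. */
static void
linear_block_to_cacheline(uint8_t dst[64], const uint8_t *src,
                          size_t src_pitch)
{
   for (size_t v = 0; v < 4; v++)
      memcpy(dst + 16 * v, src + v * src_pitch, 16);
}
```

When the copied region is 4-line aligned and a multiple of 4 lines tall, every destination cache line can be written whole in this way.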
Rafael Antognolli
eb2e17e2d1 docs: Add Cannonlake support to 18.0 release notes.
17.4 is actually 18.0.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:11:05 -08:00
Rafael Antognolli
fcae3d1a9a anv/gen10: Remove warning message.
Gen10 seems pretty stable so far, remove "alpha support" message.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:11:01 -08:00
Rafael Antognolli
bf1577fe09 i965/gen10: Remove warning message.
Gen10 seems pretty stable so far, so there's no reason to keep this
message.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:09:41 -08:00
Louis-Francis Ratté-Boulianne
aad14cf15a egl/x11: Fix leak in dri3_create_image_khr_pixmap
bp_reply wasn't properly free'd

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-14 11:52:06 +00:00
Iago Toral Quiroga
cb9dbd6dec i965/compiler: clean up nir_intrinsic_load_input for vertex shaders
This code to re-set the type of the source and destination is not
necessary since we never manipulate the types. It looks like a
leftover from a time when we had to retype to float temporarily
to handle 64-bit inputs.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-14 12:00:14 +01:00
Iago Toral Quiroga
4917d38321 intel/compiler: fix first_component for 64-bit types on vertex inputs
Divide it by two as we do for other stages. This is because the
component layout qualifier is always in 32-bit units.

Fixes issues in a new CTS test (still WIP):
KHR-GL45.enhanced_layouts.varying_double_components

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-14 12:00:14 +01:00
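
The adjustment described in 4917d38321 can be sketched as below. The function name is hypothetical and this is not the compiler's actual code; it only shows the unit conversion implied by the component qualifier being expressed in 32-bit units.

```c
#include <assert.h>

/* Converts a component index given in 32-bit units into an index in units
 * of the variable's own type: a 64-bit component spans two 32-bit slots,
 * so the index is halved. */
static unsigned
first_component_in_type_units(unsigned component_32bit, unsigned bit_size)
{
   assert(bit_size == 32 || bit_size == 64);
   return bit_size == 64 ? component_32bit / 2 : component_32bit;
}
```

For example, a double declared with `component = 2` starts at the second 64-bit slot, which is index 1.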
Samuel Pitoiset
ad4b58ea70 ac/nir: rename nir_to_llvm_context to radv_shader_context
There is still more to do in that area, but it's a good start.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:16 +01:00
Samuel Pitoiset
141db61509 ac: remove nir_to_llvm_context from ac_nir_translate()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:14 +01:00
Samuel Pitoiset
a541117ff4 ac/nir: remove nir_to_llvm_context::nir link
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:12 +01:00
Samuel Pitoiset
e9f0205ca2 ac: move the outputs array to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:10 +01:00
Samuel Pitoiset
07e4268f36 ac/shader: scan force_persample
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:08 +01:00
Dave Airlie
b9d2ff05a6 r600: fix regression in gl_FragColor drawing
This fixes a regression in the broadcast color to all color bufs case.

Fixes: 6c691081a (r600: fixup sparse color exports.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 14:02:41 +10:00
Dave Airlie
9c9a9bee44 r600: fix array spill if temp[0] is before all arrays
I found a shader with
DCL TEMP[0], LOCAL
DCL TEMP[1..256], ARRAY(1), LOCAL
DCL TEMP[257..512], ARRAY(2), LOCAL
DCL TEMP[513..768], ARRAY(3), LOCAL
DCL TEMP[769], LOCAL

This would remap badly, as it would add up all the spilled sizes
and subtract it from the temp for 0. If the current temp is less
than the array start, break out.

Fixes: 1d871aa6 (r600g: Implement spilling of temp arrays (v2))
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:37:59 +10:00
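A rough Python model of the remapping fixed in 9c9a9bee44, under stated assumptions: all three arrays from the example shader are spilled, the ranges are sorted by start, and the temp being remapped is not itself inside an array. The function name and data layout are hypothetical and only illustrate the early break.

```python
def remap_temp(index, spilled_arrays):
    """Return the new index of a non-array temp after the spilled
    arrays located before it have been removed from the register file.

    spilled_arrays: list of (first, last) temp ranges, sorted by first.
    """
    offset = 0
    for first, last in spilled_arrays:
        if index < first:
            # Every remaining array starts after this temp, so it must
            # not contribute to the offset. This is the added break.
            break
        offset += last - first + 1
    return index - offset


ARRAYS = [(1, 256), (257, 512), (513, 768)]
```

Here `remap_temp(0, ARRAYS)` stays at 0 and `remap_temp(769, ARRAYS)` becomes 1. Without the break, temp 0 would have all 768 spilled slots subtracted from it.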
Dave Airlie
8f2656c75b virgl: add ARB_sample_shading support.
This enables ARB_sample_shading if the renderer supports it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:06:07 +10:00
Dave Airlie
9b95b70719 virgl: add ARB_draw_indirect support.
This relies on the renderer code landing first.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:06:07 +10:00
Roland Scheidegger
f6718baabc tgsi: Recognize RET in main for tgsi_transform
Shaders coming from dx10 state trackers have a RET before the END.
And the epilog needs to be placed before the RET (otherwise it will
get ignored).
Hence figure out if a RET is in main, in this case we'll place
the epilog there rather than before the END.
(At a closer look, there actually seem to be problems with control
flow in general with output redirection, that would need another
look. It's enough however to fix draw's aa line emulation in some
internal bug - lines tend to be drawn with trivial shaders, moving
either a constant color or a vertex color directly to the output).

v2: add assert so buggy handling of RET in main is detected

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-14 02:06:54 +01:00
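The placement rule from f6718baabc can be sketched in Python as follows. This is a simplified model, not the tgsi_transform code: it treats the shader as a flat list of opcode strings that contains only main (no subroutines), and the function name is hypothetical.

```python
def insert_epilog(instructions, epilog):
    """Return a new instruction list with `epilog` placed before the
    first RET if main has one, otherwise before END."""
    for opcode in ("RET", "END"):
        if opcode in instructions:
            at = instructions.index(opcode)
            return instructions[:at] + epilog + instructions[at:]
    raise ValueError("shader has no END instruction")
```

For a dx10-style shader `["MOV", "RET", "END"]` the epilog lands before the RET, so it is still executed; for `["MOV", "END"]` it lands before the END as before.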
Bas Nieuwenhuizen
7461bd5b8f ac: Use the renumbered const address space for LLVM 7.
The LLVM AMDGPU backend decided to renumber the constant address
space ....

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-14 01:05:03 +01:00
Dave Airlie
9ddacd9af4 gallium: drop all the guard band float caps.
Nobody queries these and nobody sets them to anything useful,
the docs say TODO.

Drop them until a use appears.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 08:50:08 +10:00
Vadym Shovkoplias
a553c54abf mesa: add glsl version query (v4)
Add support for GL_NUM_SHADING_LANGUAGE_VERSIONS
and glGetStringi for GL_SHADING_LANGUAGE_VERSION

v2:
  - Combine similar functionality into
    _mesa_get_shading_language_version() function.
  - Change GLSL version return mechanism.
v3:
  - Add return of empty string for GLSL ver 1.10.
  - Move _mesa_get_shading_language_version() function
    to src/mesa/main/version.c.
v4:
  - Add OpenGL version check.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104915
Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 13:24:31 -07:00
Brian Paul
b08d718703 mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()
The EXTRA_VERSION_40 predicate is tested as part of
extra_gl40_ARB_sample_shading but there was no switch case for it.

Fixes: 77b440e42d ("mesa: Add new functions and enums required
by GL_ARB_sample_shading")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-13 10:35:55 -07:00
Mark Janes
e5809788d6 mesa: fix compile failure
Missing header triggered a failure in i965 CI buildtest project.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
Fixes: e149a0253c
2018-02-13 00:22:05 -08:00
Mark Janes
d9de7aaca3 Partially revert "mesa: use GLenum16 in a few more places"
This reverts part of commit ca721b3d89.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
2018-02-13 00:22:05 -08:00
Mark Janes
3e5758a70a Revert "mesa: reduce the size of gl_texture_image"
This reverts commit f4ea2b2a9e.

Several members reduced in size by the offending commit are not large
enough to store the data needed by the i965 driver.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
2018-02-13 00:22:05 -08:00
Dave Airlie
db5f422169 i965: fix tessellation regressions with gl_state_index16
Looks like one conversion was missed.

Fixes: e149a0253 (mesa,glsl,nir: reduce gl_state_index size to 2 bytes)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-02-12 23:05:16 -08:00
Stéphane Marchesin
5e4a2b394e virgl: Support v2 caps struct (v2)
This struct allows us to report:
- accurate max point size/line width.
- accurate texel and texture gather offsets
- vertex/geometry limits.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-13 14:23:54 +10:00
Timothy Arceri
10457712ed ac/nir: add nir_intrinsic_{load,store}_shared support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
c787cbfa33 ac/nir_to_llvm: add support for nir_intrinsic_shared_atomic_*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
b6cf898ec2 radeonsi: make si_declare_compute_memory() more generic and call for nir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
94fa090fad st/glsl: set req_local_mem earlier for compute shaders
Without this change it will never be set for backends using nir.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-13 14:43:05 +11:00
Marek Olšák
6b1e26e181 mesa: move STATE_LENGTH to shader_enums.h and use it everywhere
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
f4ea2b2a9e mesa: reduce the size of gl_texture_image
80 -> 40 bytes.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
4794fbc86e mesa: reduce the size of gl_program_parameter
40 -> 24 bytes, which includes the gl_state_index16 change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
e149a0253c mesa,glsl,nir: reduce gl_state_index size to 2 bytes
Let's use the new gl_state_index16 type everywhere and remove
the typecasts.

This helps reduce the size of gl_program_parameter.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
a7882013d3 mesa: reduce the size of gl_viewport_attrib
All drivers convert these to float, so there is no reason to use double.
The piglit test that expects double precision from glGet will be adjusted
not to require it (there is a piglit patch).

gl_context::ViewportArray: 512 -> 384 bytes

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
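The 512 -> 384 byte figure in a7882013d3 can be reproduced with a short calculation. The layout used here is an assumption that happens to match the numbers: 16 viewports, each with four float fields plus a near/far pair that changes from double to float. The field and constant names are illustrative only.

```python
import struct

MAX_VIEWPORTS = 16  # assumed array length of gl_context::ViewportArray

# "=" selects standard sizes without padding: f is 4 bytes, d is 8 bytes.
old_entry = struct.calcsize("=4f2d")  # x, y, width, height + double near/far
new_entry = struct.calcsize("=6f")    # same fields, near/far stored as float

old_total = MAX_VIEWPORTS * old_entry
new_total = MAX_VIEWPORTS * new_entry
```

Under these assumptions each entry shrinks from 32 to 24 bytes, which gives the 512 and 384 byte totals quoted in the message.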
Marek Olšák
d7550d783a mesa: reduce the size of gl_texture_object
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
65ed98839b mesa: reduce the size of gl_program
gl_program: 1456 -> 976 bytes

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
78f1decc95 mesa: reduce the size of gl_image_unit (v2)
gl_context::ImageUnits: 6144 -> 4608 bytes

v2: use ASSERT_BITFIELD_SIZE

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
ca5c5d96d8 mesa: further reduce the size of ctx->Texture
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
78043a75f6 mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8
GL allows doing glTexEnv on 192 texture units, while in reality,
only MaxTextureCoordUnits units are used by fixed-func shaders.

There is a piglit patch that adjusts piglits/texunits to check only
MaxTextureCoordUnits units.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
07c10cc59c mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
79aca14f5f mesa: inline init_texture_unit
because this is going to be changed

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
ca721b3d89 mesa: use GLenum16 in a few more places
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Jason Ekstrand
4c77e21c81 anv: Move setting current_pipeline to cmd_state_init
We were setting current_pipeline to UINT32_MAX and then calling
anv_cmd_state_reset which memsets the entire state struct to 0 which
implicitly resets current_pipeline to 3D.  I have no idea how this
hasn't caused everything to explode.

Fixes: cd3feea745 "anv/cmd_buffer: Rework anv_cmd_state_reset"
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-12 15:18:23 -08:00
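The ordering problem fixed in 4c77e21c81 can be modelled with a dictionary standing in for the state struct. Everything here is a hypothetical sketch: the helper names are invented, and the only facts taken from the message are that the reset zeroes the whole state and that zero means the 3D pipeline.

```python
UINT32_MAX = 0xFFFFFFFF
PIPELINE_3D = 0  # per the message, a zeroed field reads as "3D"


def zero_state(state):
    """Stand-in for the memset of the whole state struct."""
    for key in state:
        state[key] = 0


def reset_buggy(state):
    state["current_pipeline"] = UINT32_MAX
    zero_state(state)  # wipes the value written on the previous line


def reset_fixed(state):
    zero_state(state)
    state["current_pipeline"] = UINT32_MAX  # set after zeroing, so it sticks
```

After `reset_buggy` the field reads as 3D even though no pipeline has been selected; after `reset_fixed` it holds the "unknown" marker.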
Jason Ekstrand
f37bd726c7 anv: Don't resolve or ambiguate non-existent layers
The previous code was trying to avoid non-existent layers by taking a
MAX with anv_image_aux_layers.  Unfortunately, it wasn't taking into
account that layer_count starts at base_layer which may not be zero.
Instead, we need to subtract base_layer from anv_image_aux_layers with
a guard against roll-over.

Fixes: de3be61801 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-12 15:14:57 -08:00
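A minimal Python sketch of the clamp described in f37bd726c7. The function name is hypothetical, and modelling the clamp as a minimum is an assumption of this sketch about the intent; the explicit comparison plays the role of the guard against unsigned roll-over.

```python
def clamp_layer_count(base_layer, layer_count, aux_layers):
    """Limit a (base_layer, layer_count) range to layers that have aux data.

    aux_layers is the total number of layers with an aux surface,
    counted from layer 0, while layer_count is counted from base_layer.
    """
    if base_layer >= aux_layers:
        # In unsigned C arithmetic aux_layers - base_layer would wrap here.
        return 0
    return min(layer_count, aux_layers - base_layer)
```

For example, with 3 aux layers a request for 4 layers starting at layer 2 is reduced to 1 layer, and a request starting at layer 5 touches nothing.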
Daniel Stone
c2c4e5bae3 i965: Fix bugs in intel_from_planar
This commit fixes two bugs in intel_from_planar.  First, if the planar
format was non-NULL but only had a single plane, we were falling through
to the planar case.  If we had a CCS modifier and plane == 1, we would
return NULL instead of the CCS plane.  Second, if we did end up in the
planar_format == NULL case and the modifier was DRM_FORMAT_MOD_INVALID,
we would end up segfaulting in isl_drm_modifier_has_aux.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 8f6e54c929
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-12 15:14:45 -08:00
Eric Anholt
1aed66dc1e radv: Fix compiler warning about uninitialized 'set'
The compiler doesn't figure out that we only get result == VK_SUCCESS if
set got initialized.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 20:48:47 +00:00
Eric Anholt
21670f8208 glsl/tests: Fix strict aliasing warning about int64/double.
Fixes: 4bf9862747 ("glsl/tests: Add UINT64 and INT64 types")
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2018-02-12 20:48:43 +00:00
Eric Anholt
091bff8317 ac/nir: Fix compiler warning about uninitialized dw_addr.
Even switching the def's condition to be the same chip revision check as
the use, the compiler doesn't figure it out.  Just NULL-init it.

Fixes: ec53e52742 ("ac/nir: Add ES output to LDS for GFX9.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 20:48:29 +00:00
Eric Anholt
7a83be4b28 gallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax.
My gcc doesn't figure out that dims >= 1 (seems reasonable), and doesn't
notice that ddmax is used from the same no_rho_opt as its initialization.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-12 20:48:18 +00:00
Kenneth Graunke
bd87bd178c anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf.
The kernel used to have execbuf parameters to program the INSTPM bit
for whether 3DSTATE_CONSTANT_* should be relative to dynamic state
base address or an absolute address.  However, they never worked in
the presence of hardware contexts, so I deleted them a while back.

It doesn't make sense to set this flag, as it doesn't exist anymore.
It also never did anything anyway - the flag is zero, so |'ing it in
did nothing.  The default is relative anyway.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-12 07:00:41 -08:00
Eric Engestrom
111d4bf1d0 r200: remove left over dead code
0aaa27f291 removed the references to this array without
removing the array itself

Cc: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 0aaa27f291 "mesa: Pass the translated color logic op dd_function_table::LogicOpcode"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-02-12 11:19:44 +00:00
Samuel Pitoiset
f4e85ba93f ac/nir: remove backlink to nir_to_llvm_context
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:39 +01:00
Samuel Pitoiset
be5f6eb13e ac/nir: remove nir_to_llvm_context::module
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:36 +01:00
Samuel Pitoiset
90a815ddeb ac/nir: remove nir_to_llvm_context::builder
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:34 +01:00
Samuel Pitoiset
759acfa180 ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:31 +01:00
Samuel Pitoiset
e7373a6498 ac/nir: drop nir_to_llvm_context from visit_var_atomic()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:29 +01:00
Samuel Pitoiset
485346b05a ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:27 +01:00
Samuel Pitoiset
cd6dfacda9 ac/nir: drop nir_to_llvm_context from visit_load_push_constant()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:25 +01:00
Samuel Pitoiset
5c9e398c83 ac/nir: drop nir_to_llvm_context from cast_ptr()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:23 +01:00
Samuel Pitoiset
5ef5944848 ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:21 +01:00
Samuel Pitoiset
da8b0b8264 ac/nir: drop nir_to_llvm_context from emit_f2f16()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:19 +01:00
Samuel Pitoiset
e32f374944 ac: remove unused parameters in abi::load_tess_coord()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:17 +01:00
Samuel Pitoiset
1e69db003d ac/nir: remove useless bitcast in load_tess_coord()
nir_intrinsic_load_tess_coord always returns a v3i32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:15 +01:00
Samuel Pitoiset
ed179fbdf3 ac: add load_resource() to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:13 +01:00
Samuel Pitoiset
ecf229706f ac: add load_sample_mask_in() to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:11 +01:00
Samuel Pitoiset
0f48eeea05 ac: move view_index to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:09 +01:00
Samuel Pitoiset
0efbede949 ac: move push_constants to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:07 +01:00
Samuel Pitoiset
460d3ce726 ac: move tg_size to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:04 +01:00
Samuel Pitoiset
054c92190c ac/nir: remove unused nir_to_llvm_context:{defs,phis}
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:02 +01:00
Eric Anholt
0b97eb02b0 egl/gbm: Fix compiler warning about visual matching.
The compiler doesn't know that num_visuals > 0.

Fixes: 37a8d907cc ("egl/gbm: Ensure EGLConfigs match GBM surface format")
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-12 09:16:44 +00:00
Rob Clark
831fb29252 freedreno: small fix for flushing dependent batches
Flush a resource's previous write_batch synchronously.  Because a
resource's associated batches are not updated until after the flush
thread submits rendering to the kernel, this was causing a bit of
confusion in the following loop.  This fixes a bug that appeared with
recent stk.

Perhaps we need to re-work things a bit to clear out dependent batches
in the ctx's thread and use a fence to deal with the period between
when a flush is queued and when it is submitted to the kernel.  But
this will do until time permits a larger refactor.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
c57ed8e01c freedreno/ir3: intra-block scheduling
Because of loops, we can't schedule all of a block's predecessors first.
Instead just assume that the result consumed in a block was written far
enough away in all paths into a block.  And do an intra-block scheduling
pass to figure out if there are any cases where we need to insert extra
nop's.  This works out better than always assuming the worst case (ie.
that a value live into a block was written in the last instruction in
the predecessor block).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
2a2099a875 freedreno/ir3: "boost" the depth of if/else condition
Account for the move to predicate register, to try to avoid needing to
insert extra NOPs later.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ffb00f6841 freedreno/ir3: account for arrays in delayslot calc
Normally false-deps are not something to consider, since they mostly
exist for delay-slot related reasons:

 * barriers
 * ordering writes after read
 * SSBO/image access ordering

The exception is a false-dependency on an array store.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
f54d2b4f10 freedreno/ir3: more clever legalize algorithm
Previously we didn't handle flow control in legalize, and instead just
set (ss)(sy) on the first instruction in every block.  Which isn't very
clever.

Instead, consider output state of all predecessor blocks, so we only
set a sync bit if needed for any possible path leading into a block.
Because of loops, we can't require that all successor blocks are
legalized before a given block, so instead run in a loop until results
converge.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
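The "run in a loop until results converge" idea from f54d2b4f10 is a standard fixed-point data-flow iteration. The Python sketch below is a generic model of it, not the ir3 legalize pass: the block dictionary, the `gen`/`kill` sets and the sync names are all hypothetical.

```python
def pending_on_entry(blocks):
    """Compute, per block, which sync kinds may still be pending on entry.

    blocks: dict name -> {"preds": [names], "gen": set, "kill": set}
      gen  = sync kinds left pending when the block ends
      kill = sync kinds the block itself already waits on
    """
    in_state = {name: set() for name in blocks}
    out_state = {name: set() for name in blocks}
    changed = True
    while changed:  # loops in the CFG need more than one pass
        changed = False
        for name, blk in blocks.items():
            new_in = set()
            for pred in blk["preds"]:
                new_in |= out_state[pred]  # any path into the block counts
            new_out = (new_in - blk["kill"]) | blk["gen"]
            if new_in != in_state[name] or new_out != out_state[name]:
                in_state[name], out_state[name] = new_in, new_out
                changed = True
    return in_state


CFG = {
    "entry": {"preds": [], "gen": {"sy"}, "kill": set()},
    "loop": {"preds": ["entry", "loop"], "gen": {"ss"}, "kill": {"sy"}},
    "exit": {"preds": ["loop"], "gen": set(), "kill": set()},
}
```

In this example the loop block can be entered with both kinds pending (from the entry block and from its own back edge), while the exit block only ever sees "ss", so only that sync bit would be needed there.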
Rob Clark
015afb6a38 freedreno/ir3: track block predecessors
Useful in the following patches.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
76440fcca9 freedreno/ir3: clean up dangling false-dep's
Maybe there is a better way to do this.  Where it becomes useful is "array"
loads, which end up as a false-dep for a later array store.

If all the uses of an array load are CP'd into their consumer, it still
leaves the dangling array load, leading to funny things like:

  mov.u32u32 r5.y, r0.y
  mov.u32u32 r5.y, r0.z

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
aea223741f freedreno/ir3: handle IMMED for mad 2nd src special case
Consider also immediates for swapping the first two srcs, because they
can be lowered to constant.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
242a8a1957 freedreno/ir3: remove ir3 phi instruction
Now that we convert phi webs to ssa, we can drop all this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a7b569d60c freedreno/ir3: remove lower_if_else pass
Now that it is unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
268ab05484 freedreno/ir3: add experimental GCM pass
Generally seems to do worse on instruction count and register usage,
according to shader-db.  But shader-db also doesn't do a very good job
of weighting loop bodies, so that might not be totally valid.

So add an env variable to enable GCM pass for easier experimentation.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
4c15c53d91 freedreno/ir3: change opt passes
There are more useful nir passes added since initial conversion to nir.
But ir3 was never updated to use them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ec8bc54ad2 freedreno/ir3: use peephole select pass
Aggressively lowering all if/else to selects in some extreme cases
results in much higher register pressure.  Using peephole select instead
with a modest threshold speeds up alu2 4x!

16 seems like a good limit, low enough to help alu2 but not so low that
it penalizes everything else.  With a bit better scheduling of the
instruction that moves a value into a predicate register, we might be
able to lower this limit a bit more in the future, but since we need 6
cycles from the move to predicate register to predicated branch, that
puts some sort of lower bound on how far we can lower this threshold.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a7ea2b4eba freedreno/ir3: lower phi webs to regs
nir's from_ssa pass is much better at avoiding inserting extra moves
than our logic is.  And lowering phi webs to regs just treats anything
involved in a phi web as an array of length=1.  Which with previous
array related fixes in RA/etc ends up working out quite well.  This cuts
down on extra instructions and also helps with register pressure.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
0a6ddf964f freedreno/ir3: separate arrays from groups
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
55f14a1ac4 freedreno/ir3: make block/instruction serialno per-shader
Makes it easier to compare values seen in-game (where there are many
shaders) to cmdline standalone compiler.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
5a7de94392 freedreno/ir3: add spirv support to cmdline compiler
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
942341bcd0 freedreno/ir3: don't lower fsat
Instead, if possible fold (sat) flag into src, otherwise use:

  (sat)max.f rD, rS, rS

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
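The fallback form in 942341bcd0 works because the maximum of a value with itself is the value, so only the saturate modifier has an effect. A small Python model, with hypothetical helper names and ignoring NaN handling:

```python
def sat(x):
    """Clamp to [0.0, 1.0], as the (sat) modifier does."""
    return min(max(x, 0.0), 1.0)


def sat_max_f(a, b):
    """Model of `(sat)max.f rD, a, b`: take the max, then saturate."""
    return sat(max(a, b))
```

Calling `sat_max_f(x, x)` therefore gives the same result as `sat(x)` for ordinary finite inputs, which is what fsat computes.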
Rob Clark
b2fc94f074 freedreno/ir3: add encoding/decoding for (sat) bit
Seems to be there since a3xx, but we always lowered fsat.  But we can
shave some instructions, especially in shaders that use lots of
clamp(foo, 0.0, 1.0) by not lowering fsat.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
1b658533e1 freedreno/ir3: extend liverange of arrays
Use livein state of other blocks to extend liverange of arrays when they
are still needed by successor blocks.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ac459a6f7f freedreno/ir3: avoid extra mov's for "arrays"
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
2bc3fb6992 freedreno/ir3: a couple more array fixes
(Plus a couple TODOs)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
8ea1ef4191 freedreno/ir3: keep array stores
Since these are not in SSA form, add to block's keeps so it doesn't
appear unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
c60f150d56 freedreno/ir3: propagate barrier information
When eliminating movs, the instruction that is now directly using the
src of the mov has the same scheduling order constraints as the original
mov instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
98702c1010 freedreno/ir3: remove pointless statement
Function ends after this if/else ladder, so it was pointless.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
930ca0e038 freedreno/ir3: some more debug prints
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a84e324847 freedreno/ir3: fix printing of relative branch offsets
The number of bits depends on generation.  Printing with the a5xx
encoding (largest size) when compiling for a3xx or a4xx would result
in negative values being printed as large positive values.

I guess in practice huge negative branch offsets aren't likely (and if
that is the case, the shader is probably too big to grok by reading the
assembly).  So just print using smallest bitfield size.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
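The misprint described in a84e324847 is a sign-extension issue, which the following Python sketch reproduces. The 16-bit and 20-bit widths are illustrative assumptions, not the actual per-generation field sizes, and the helper name is hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as two's complement."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


encoded = 0xFFFE  # -2 stored in a 16-bit immediate field
```

Decoding `encoded` with the width it was written with gives -2. Decoding the same bits as a wider field leaves the sign bit clear, so the offset prints as 65534.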
Rob Clark
a5c28fe07b freedreno/ir3: be more clever with if/else jumps
Try to clean up things like:

  br !p0.x #2
  br p0.x #something

to eliminate the first branch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
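The cleanup in a5c28fe07b can be pictured as a peephole over the instruction stream. The Python sketch below is a hypothetical model only: instructions are `(opcode, condition, target)` tuples, the first branch's target is a relative offset (assuming `#2` means "skip the next instruction"), and re-resolving other branch offsets after the removal is left out.

```python
def drop_redundant_branch(instrs):
    """Remove `br !cond #2` when it is directly followed by `br cond ...`.

    The first branch only jumps over the second one when cond is false,
    which is exactly what falling through the second branch does anyway.
    """
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (nxt is not None
                and cur[0] == "br" and nxt[0] == "br"
                and cur[1] == "!" + nxt[1]  # opposite conditions
                and cur[2] == 2):           # only skips the next branch
            i += 1                          # drop cur, keep nxt
            continue
        out.append(cur)
        i += 1
    return out
```

Applied to the pair from the message, only the `br p0.x` instruction remains.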
Rob Clark
44dd7dcd2f freedreno/ir3: avoid some spurious sync bits
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
069c0ac625 freedreno/ir3: print # of sync bits for shaderdb
When trying to optimize to reduce stalls, it is nice to see this info.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
7d45e2e39f freedreno: add debug trace for flush
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Grazvydas Ignotas
9b9a89cd79 intel/compiler: fix 64bit value prints on 32bit
Fix the following:
warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but
argument 3 has type ‘uint64_t {aka long long unsigned int}’.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-10 17:59:02 +02:00
Timothy Arceri
ff0e3fa1fe st/glsl_to_nir: remove unused options variable
2018-02-10 11:06:55 +11:00
Timothy Arceri
8f378c116e st/radeonsi: enable disk cache for nir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
bc9d9f9b86 st: add nir shader disk cache support
v2: include compute shader support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
97efdc0d57 st/glsl_to_tgsi: move nir detection earlier
We move the nir check before the shader cache call so that we can
call a nir based caching function in a following patch.

Also with this change we simply check if vertex shaders support
NIR rather than looping over the stages as mixing of shader types
is not supported anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
b5e23887fe radeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR
Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead.

This change indirectly enables NIR support for compute shaders
on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
73f1d6f0c1 r600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR
We now use PIPE_SHADER_CAP_SUPPORTED_IRS to check for native support
in clover.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
51f484bb44 clover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR
PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR
for compute shaders, so we let clover pick the one it wants to use.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
3af4f34e61 r600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
ce836487b8 radeonsi/nir: add depth layout to scan pass
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Timothy Arceri
6a8efbe652 radeonsi/nir: add FRAG_RESULT_COLOR to scan pass
Fixes a number of draw buffers piglit tests.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Timothy Arceri
ef8082baf8 ac: convert nir_op_f2f32 src to a float
Fixes the following piglit test:

./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo

Where we would end up with the nir such as:

	vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10
	vec1 32 ssa_12 = f2f32 ssa_2

And our pack_64_2x32_split nir to llvm code always produces
a 64bit integer as output.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
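Why ef8082baf8 needs the source converted to a float first can be shown with Python's struct module. This is a model of the bit-level behaviour only; the function names mirror the NIR opcodes but the code is a hypothetical sketch, not the ac/nir translation.

```python
import struct


def pack_64_2x32_split(lo, hi):
    """Combine two 32-bit halves into one 64-bit integer."""
    return (hi << 32) | lo


def f2f32(bits64):
    """Reinterpret a 64-bit integer as a double, then narrow to float32."""
    (as_double,) = struct.unpack("<d", struct.pack("<Q", bits64))
    (as_float,) = struct.unpack("<f", struct.pack("<f", as_double))
    return as_float


bits = pack_64_2x32_split(0x00000000, 0x3FF80000)  # bit pattern of 1.5
```

The packed value is an integer whose numeric value is about 4.6e18; only after the bits are reinterpreted as a double does the conversion yield 1.5.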
Timothy Arceri
1b1e5f8edf ac: fix some 64bit unpack asserts
Previously the asserts did not take swizzles into account.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Mark Janes
9a05c66feb Revert "i965: prevent potentially null pointer access"
This reverts commit 712332ed54, which
caused over 90k failures in Mesa i965 CI.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-09 09:46:07 -08:00
Daniel Stone
37a8d907cc egl/gbm: Ensure EGLConfigs match GBM surface format
When we create an EGL window surface on a GBM surface, ensure that the
EGLConfig is compatible with the GBM format, notwithstanding XRGB/ARGB
interchange.

For example, rendering with an XRGB8888 EGLConfig on to an ARGB8888
gbm_surface (and vice versa) is acceptable, but rendering with an
XRGB2101010 EGLConfig on to an XRGB8888 gbm_surface will now be
rejected.

This was previously allowed through; when 10bpc formats were enabled,
clients which picked a completely random EGL config and hoped/assumed
they were XRGB8888 would break.

If you have bisected a failure to start a GBM/KMS client to this commit,
please look at its EGLConfig selection (e.g. through eglChooseConfig),
and add an EGL_NATIVE_VISUAL_ID == gbm_surface format match to the
attribs for config selection.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
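The compatibility rule in 37a8d907cc ("notwithstanding XRGB/ARGB interchange") can be modelled by comparing colour channel masks while ignoring alpha. The table and function below are a hypothetical Python sketch, not the egl/gbm code; the masks are the usual bit layouts for these three formats.

```python
# (red, green, blue, alpha) channel masks per format
FORMATS = {
    "XRGB8888": (0x00FF0000, 0x0000FF00, 0x000000FF, 0x00000000),
    "ARGB8888": (0x00FF0000, 0x0000FF00, 0x000000FF, 0xFF000000),
    "XRGB2101010": (0x3FF00000, 0x000FFC00, 0x000003FF, 0x00000000),
}


def compatible(config_format, surface_format):
    """True if the colour channels line up; X and A variants interchange."""
    config_masks = FORMATS[config_format]
    surface_masks = FORMATS[surface_format]
    return config_masks[:3] == surface_masks[:3]
```

Under this model XRGB8888 and ARGB8888 are accepted in either direction, while an XRGB2101010 config on an XRGB8888 surface is rejected, matching the examples in the message.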
Daniel Stone
8174e5b49e egl/gbm: Remove duplicate format table
Now that we have mask/channel information in gbm_dri's format conversion
table, we can remove the copy in EGL.

As this table contains more formats (notably including R8 and RG8, which
can be used for BO but not surface allocation), we now compare the masks
of all channels when trying to find a suitable config. Without doing
this, an XRGB8888 EGLConfig would match on an R8 format.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
314714ac53 gbm/dri: Expose visuals table through gbm_dri_device
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
2ed344645d gbm/dri: Add RGBA masks to GBM format table
Eventually, we can replace the visuals list inside GBM EGL driver with
this one.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
4732094cff egl/wayland: Use an array for modifiers
Each Wayland EGLDisplay currently contains a struct with one vector of
modifiers per format, hardcoded in the header. To allow easier support
for more formats, turn this into an array of u_vectors which is opaque
outside of platform_wayland.c.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
5bc49d4cbf egl/wayland: Remove has_format enum
Instead of the has_format enum, use an index into the visual array. This
makes adding new formats less typing.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
d32b23f383 egl/wayland: Add bpp to visual map
Both the DRI2 GetBuffersWithFormat interface, and SHM buffer allocation,
had their own format -> bpp lookup tables. Replace these with a lookup
into the visual map.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
4de98a9c07 egl/wayland: Use visual map for DRIImage<->FourCC map
When trying to translate between DRIImage format enums and FourCC codes,
use our visual map rather than an open-coded subset.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
68a80c11bd egl/wayland: Use visual map for format advertisement
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
3323ce72ff egl/wayland: Use visual map for buffer_from_image
When creating a wl_buffer on an upstream Wayland display from an
existing EGLImage, use the dri2_wl_visual map rather than another
hardcoded list of formats.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
a9cc4edb60 egl/wayland: Use visual map for config->format lookup
Having hoisted the format -> config map into common code, we now use it
for config -> format lookups.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
1dc013f1ee egl/wayland: Add format enums to visual map
Extend the visual map from only containing names and bitmasks, to also
carrying the three format enums we need. These are the DRIImage format
tokens for internal allocation, FourCC codes for wl_drm and dmabuf
protocol, and wl_shm codes for swrast drivers.

We will later use these formats to eliminate a bunch of open-coded
conversions.
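
As a rough illustration of the idea (not the actual table in
platform_wayland.c), a visual map entry could carry all three format
tokens next to the name, bpp and channel masks, so that one lookup
replaces several hand-written switch statements. All struct, enum and
function names below are hypothetical, and the format token values are
placeholders rather than the real DRIImage, FourCC or wl_shm codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder tokens; the real values live in the DRIImage,
 * wl_drm/dmabuf and wl_shm headers. */
enum { IMG_FMT_XRGB8888 = 1, IMG_FMT_ARGB8888 = 2 };
enum { DRM_FMT_XRGB8888 = 10, DRM_FMT_ARGB8888 = 11 };
enum { SHM_FMT_XRGB8888 = 20, SHM_FMT_ARGB8888 = 21 };

struct visual {
   const char *name;
   int dri_image_format;     /* internal allocation */
   uint32_t drm_format;      /* wl_drm and dmabuf protocol */
   uint32_t shm_format;      /* swrast drivers via wl_shm */
   int bpp;
   uint32_t rgba_masks[4];   /* red, green, blue, alpha */
};

static const struct visual visuals[] = {
   { "XRGB8888", IMG_FMT_XRGB8888, DRM_FMT_XRGB8888, SHM_FMT_XRGB8888,
     32, { 0x00ff0000, 0x0000ff00, 0x000000ff, 0x00000000 } },
   { "ARGB8888", IMG_FMT_ARGB8888, DRM_FMT_ARGB8888, SHM_FMT_ARGB8888,
     32, { 0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000 } },
};

/* Look up a visual by its protocol format; NULL if it is not in the map. */
static const struct visual *
visual_from_drm_format(uint32_t drm_format)
{
   for (size_t i = 0; i < sizeof(visuals) / sizeof(visuals[0]); i++) {
      if (visuals[i].drm_format == drm_format)
         return &visuals[i];
   }
   return NULL;
}
```

With a table of this shape, conversions such as FourCC to DRIImage
format or format to bpp become field reads on the returned entry.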

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
66912641df egl/wayland: Use proper enum type in visual definition
No semantic change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
845c2f6156 egl/wayland: Widen channel masks to bpp
Widen the channel masks given in the visual table to the full width of
the pixel format, i.e. as many leading zeros as required.

No functional change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
19cbca38e4 egl/wayland: Hoist format <-> EGLConfig definition up
Pull the mapping between Wayland formats and EGLConfigs up to the top
level, so we can reuse it elsewhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
4fbd2d50b1 egl/wayland: Fix ARGB/XRGB transposition in config map
When 0b2b719121 moved from an if tree to a struct to map between
wl_drm formats and EGLConfigs, it transposed the mapping between XRGB
and ARGB. Luckily, everyone exposes both formats, so this is harmless.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 0b2b719121 ("egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:06 +00:00
Marek Olšák
76085f2048 st/mesa: generate blend state according to the number of enabled color buffers
Non-MRT cases always translate blend state for 1 color buffer only.
MRT cases only check and translate blend state for enabled color buffers.

This also avoids an assertion failure in translate_blend for:
  dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Marek Olšák
c446dd7927 st/mesa: don't translate blend state when color writes are disabled
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Marek Olšák
3d06c8afb5 st/mesa: don't translate blend state when it's disabled for a colorbuffer
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Lionel Landwerlin
712332ed54 i965: prevent potentially null pointer access
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
CID: 1418110
2018-02-09 14:02:59 +00:00
Mark Thompson
5db29d62ce st/va: Make the vendor string more descriptive
Include the Mesa version and detail about the platform.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-02-09 13:37:43 +01:00
Mark Thompson
768f1487b0 st/va: Enable vaExportSurfaceHandle()
It is present from libva 2.1 (VAAPI 1.1.0 or higher).

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-02-09 13:37:36 +01:00
Tapani Pälli
41c5bf3836 disk cache: move path creation back to constructor
This patch moves disk cache path and index creation back to the
constructor which matches previous behavior. We still allow create
to succeed without path so that cache can be used with callback
functionality.

Fixes: c95d3ed091 "disk cache: create cache even if path creation fails"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 11:33:25 +02:00
Samuel Pitoiset
3a2bb4db23 ac/nir: compute correct number of user SGPRs on GFX9
For merged shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 10:16:04 +01:00
Michel Dänzer
171076f082 st/mesa: Initialize tex_target in compile_tgsi_instruction
Initialize to TGSI_TEXTURE_BUFFER (== 0), same as was done before the
variable type was changed to enum tgsi_texture_type.

Fixes a bunch of piglit failures with radeonsi, e.g.:

gles-3.0-transform-feedback-uniform-buffer-object: ../../../../src/gallium/auxiliary/tgsi/tgsi_util.c:502: tgsi_util_get_texture_coord_dim: Assertion `!"unknown texture target"' failed.

Corresponding compiler warning:

  CXX      state_tracker/st_glsl_to_tgsi.lo
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp: In function ‘pipe_error st_translate_program(gl_context*, uint, ureg_program*, glsl_to_tgsi_visitor*, const gl_program*, GLuint, const ubyte*, const ubyte*, const ubyte*, const ubyte*, const ubyte*, GLuint, const ubyte*, const ubyte*, const ubyte*)’:
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5992:23: warning: ‘tex_target’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       ureg_memory_insn(ureg, inst->op, dst, num_dst, src, num_src,
       ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                        inst->buffer_access,
                        ~~~~~~~~~~~~~~~~~~~~
                        tex_target, inst->image_format);
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5866:27: note: ‘tex_target’ was declared here
    enum tgsi_texture_type tex_target;
                           ^~~~~~~~~~

Fixes: 9f9ce1625f ("st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:26:40 +01:00
Alejandro Piñeiro
f32b01ca43 glsl/linker: remove ubo explicit binding handling
This is already handled at link_uniform_blocks, specifically at
process_block_array_leaf.

Additionally, this code did not handle arrays of arrays correctly.
When creating the name of the block to set the binding, it only took
the first level into account, so any attempt to set an explicit
binding on an array-of-arrays UBO would trigger an assertion.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 08:32:42 +01:00
Mathias Fröhlich
77cb2fc0bd mesa: Only update enabled VAO gl_vertex_array entries.
Instead of updating all modified gl_vertex_array_object::_VertexArray
entries just update those that are modified and enabled.
Also release buffer objects from the _VertexArray entries that belong
to disabled attributes.

v2: Also set Ptr and Size to zero.
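
A simplified sketch of the update rule, using hypothetical types
rather than the real gl_vertex_array structures. It assumes that
disabled entries are cleared only when they are also marked modified;
the commit message does not spell that detail out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ATTRIBS 32

struct array_entry {
   const void *ptr;
   int size;
   int buffer;   /* 0 stands for "no buffer object" in this sketch */
};

/*
 * Refresh entries that are both modified and enabled. Entries that are
 * modified but disabled drop their buffer and get ptr and size zeroed.
 * Returns the number of entries that were refreshed.
 */
static unsigned
update_arrays(struct array_entry *entries, const struct array_entry *source,
              uint32_t modified, uint32_t enabled)
{
   unsigned refreshed = 0;

   assert(entries != NULL && source != NULL);

   for (unsigned i = 0; i < MAX_ATTRIBS; i++) {
      const uint32_t bit = UINT32_C(1) << i;

      if (!(modified & bit))
         continue;

      if (enabled & bit) {
         entries[i] = source[i];
         refreshed++;
      } else {
         entries[i].ptr = NULL;
         entries[i].size = 0;
         entries[i].buffer = 0;
      }
   }
   return refreshed;
}
```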

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-09 04:26:23 +01:00
Mathias Fröhlich
437cae411e gallium: Mute arrays for several meta like callbacks.
Set the _DrawArrays pointer to NULL when calling into the driver's
Bitmap/CopyPixels/DrawAtlasBitmaps/DrawPixels/DrawTex hooks.
This fixes an assert that gets uncovered when the following
patch gets applied.

v2: Mute from within the state tracker instead of generic mesa.
v3: Avoid evaluating _DrawArrays from within st_validate_state.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 04:26:13 +01:00
Mathias Fröhlich
2f9eb0aad5 mesa: Fix VAO buffer object tracking.
When changing the attribute binding in the VAO we also need to
account for getting rid of non vbo bits from VertexAttribBufferMask.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-09 04:21:36 +01:00
Timothy Arceri
d8bca3809d radeonsi/nir: gather some missing fs info
Fixes some early-z arb_shader_image_load_store piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 12:51:27 +11:00
Timothy Arceri
c77078c942 ac: pass struct ac_llvm_context to emit_membar()
Fixes segfault in piglit test:

./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 12:51:27 +11:00
Marek Olšák
12fd567c78 radeonsi: copy the NIR enablement debug bit to the shader cache flags
When NIR is enabled, TGSI must not be used. When NIR is disabled, TGSI

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 02:01:45 +01:00
Jason Ekstrand
8f20cf166e intel/blorp: Use isl_aux_op instead of blorp_hiz_op
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1e941a0528 intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1810f965c8 anv: Allow fast-clearing the first slice of a multi-slice image
Now that we're tracking aux properly per-slice, we can enable this for
applications which actually care.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
de3be61801 anv/cmd_buffer: Rework aux tracking
This commit completely reworks aux tracking.  This includes a number of
somewhat distinct changes:

 1) Since we are no longer fast-clearing multiple slices, we only need
    to track one fast clear color and one fast clear type.

 2) We store two bits for fast clear instead of one to let us
    distinguish between zero and non-zero fast clear colors.  This is
    needed so that we can do full resolves when transitioning to
    PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear
    values in all sorts of places we wouldn't normally.

 3) We now track compression state as a boolean separate from fast clear
    type and this is tracked on a per-slice granularity.

The previous scheme had some issues when it came to individual slices of
a multi-LOD image.  In particular, we only tracked "needs resolve"
per-LOD but you could do a vkCmdPipelineBarrier that would only resolve
a portion of the image and would set "needs resolve" to false anyway.
Also, any transition from an undefined layout would reset the clear
color for the entire LOD regardless of whether or not there was some
clear color on some other slice.

As far as full/partial resolves go, the assumptions of the previous
scheme held because the one case where we do need a full resolve when
CCS_E is enabled is for window-system images.  Since we only ever
allowed X-tiled window-system images, CCS was entirely disabled on gen9+
and we never got CCS_E.  With the advent of Y-tiled window-system
buffers, we now need to properly support doing a full resolve of images
marked CCS_E.
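
The three changes listed above can be pictured with a small CPU-side
model of a single miplevel. This is only a sketch under assumptions:
the struct, the function names and the enum values are hypothetical,
and the real driver keeps this state where the GPU can read it for
predicated resolves. The sketch also treats every resolve as a full
resolve.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLICES 8

/* Two bits are enough to tell these three states apart. */
enum fast_clear_type {
   FAST_CLEAR_NONE = 0,
   FAST_CLEAR_DEFAULT_VALUE = 1,   /* cleared to the zero color */
   FAST_CLEAR_ANY = 2,             /* cleared to some other color */
};

struct aux_state {
   /* Only the first slice is ever fast-cleared, so one value suffices. */
   enum fast_clear_type fast_clear;
   /* Compression is tracked separately, one flag per slice. */
   bool compressed[MAX_SLICES];
};

static void
mark_fast_cleared(struct aux_state *s, bool zero_color)
{
   s->fast_clear = zero_color ? FAST_CLEAR_DEFAULT_VALUE : FAST_CLEAR_ANY;
   s->compressed[0] = true;   /* v3: set the flag whenever we fast-clear */
}

static void
mark_written(struct aux_state *s, unsigned slice)
{
   assert(slice < MAX_SLICES);
   s->compressed[slice] = true;
}

static void
full_resolve_slice(struct aux_state *s, unsigned slice)
{
   assert(slice < MAX_SLICES);
   s->compressed[slice] = false;
   if (slice == 0)
      s->fast_clear = FAST_CLEAR_NONE;
}

static bool
slice_needs_resolve(const struct aux_state *s, unsigned slice)
{
   assert(slice < MAX_SLICES);
   return s->compressed[slice] ||
          (slice == 0 && s->fast_clear != FAST_CLEAR_NONE);
}
```

Because the flag is per slice, resolving one slice no longer hides the
fact that a neighbouring slice in the same LOD is still compressed.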

v2 (Jason Ekstrand):
 - Fix a bug in the compressed flag offset calculation
 - Treat 3D images as multi-slice for the purposes of resolve tracking

v3 (Jason Ekstrand):
 - Set the compressed flag whenever we fast-clear
 - Simplify the resolve predicate computation logic

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
2cbfcb205e anv/cmd_buffer: Move the mi_alu helper higher up
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
2e69045c4d anv/image: Simplify some verbose comments
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
f0523f70ef anv: Use blorp_ccs_ambiguate instead of fast-clears
Even though the blorp pass looks a bit on the sketchy side, the end
result in the Vulkan driver is very nice.  Instead of having this weird
case where you do a fast clear and then maybe have to resolve, we just
do the ambiguate and are done with it.  The ambiguate does exactly what
we want of setting all the CCS values to 0 which puts it into the
pass-through state.

This should also improve performance a bit in certain cases.  For
instance, if we did a transition from UNDEFINED to GENERAL for a surface
that doesn't have CCS enabled all the time, we would end up doing a
fast-clear and then a full resolve which ends up touching every byte in
the main surface as well as the CCS.  With the ambiguate pass, that
transition only touches the CCS.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
84fd2ebfbc anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
3ef8c4b2f5 anv/cmd_buffer: Pull the undefined layout condition into the if
Now that this isn't a multi-case if and it's just the one case, it's a
bit clearer if the condition is just part of the if instead of being
pulled out into a boolean variable.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
857b5b5a7f intel/blorp: Add a CCS ambiguation pass
This pass performs an "ambiguate" operation on a CCS-compressed surface
by manually writing zeros into the CCS.  On gen8+, ISL gives us a fairly
detailed notion of how the CCS is laid out so this is fairly simple to
do.  On gen7, the CCS tiling is quite crazy but that isn't an issue
because we can only do CCS on single-slice images so we can just blast
over the entire CCS buffer if we want to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
13b621d6fd anv: Only fast clear single-slice images
The current strategy we use for managing resolves has an issue where we
track clear colors and the need for resolves per-LOD but we still allow
resolves of only a subset of the slices in any given LOD and doing so
sets the "needs resolve" flag for that LOD to false while leaving the
remaining layers unresolved.  This patch is only the first step and does
not, by itself, fix anything.  However, it's fairly self-contained and
splitting it out means any performance regressions should bisect to this
nice obvious commit rather than to the giant "rework aux tracking"
commit.

Nanley and I did some testing and none of the applications we tested
even tried to fast-clear anything other than the first slice of an
image.  The test was done by adding a printf right before we call
blorp_fast_clear if we were ever going to touch any slice other than
the first with a fast-clear.  Due to the way the original code was
structured, this would not have included applications which only cleared
a subset of layers.  The applications tested were:

 * All Sascha Willems demos
 * Aztec Ruins
 * Dota 2
 * The Talos Principle
 * Mad Max
 * Warhammer 40,000: Dawn of War III
 * Serious Sam Fusion 2017: BFE

While not the full list of shipping applications, it's a pretty good
spread and covers most of the engines we've seen running on our driver.
If this is ever shown to be a performance problem in the future, we can
reconsider our strategy.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
571ed588ac anv/cmd_buffer: Add a mark_image_written helper
Currently, this helper does nothing but we call it every place where an
image is written through the render pipeline.  This will allow us to
properly mark the aux state so that we can handle resolves correctly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
9876d6f0ef anv/blorp: Add src/dst_level helper variables in CmdCopyImage
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
c180c2c868 anv/cmd_buffer: Add an anv_genX_call macro
This is copied and pasted from the similar macro we added to ISL.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
ab7543b13d anv/cmd_buffer: Generalize transition_color_buffer
This moves it to being based on layout_to_aux_usage instead of being
hard-coded based on bits of a priori knowledge of how transitions
interact with layouts.  This conceptually simplifies things because
we're now using layout_to_aux_usage and layout_supports_fast_clear to
make resolve decisions so changes to those functions will do what one
expects.

There is a potential bug with window system integration on gen9+ where
we wouldn't do a resolve when transitioning to the PRESENT_SRC layout
because we just assume that everything that handles CCS_E can handle it
all the time.  When handing a CCS_E image off to the window system, we
may need to do a full resolve if the window system does not support the
CCS_E modifier.  The only reason why this hasn't been a problem yet is
because we don't support modifiers in Vulkan WSI and so we always get X
tiling which implies no CCS on gen9+.  This patch doesn't actually fix
that bug yet but it takes us the first step in that direction by making
us actually pick the correct resolve op.  In order to handle all of the
cases, we need more detailed aux tracking.

v2 (Jason Ekstrand):
 - Make a few more things const
 - Use the anv_fast_clear_support enum

v3 (Jason Ekstrand):
 - Move an assert and add a better comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
151771b390 anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
bea7373c92 anv/image: Support color aspects in layout_to_aux_usage
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
b09464db42 anv/image: Add a helper for determining when fast clears are supported
v2 (Jason Ekstrand):
 - Return an enum instead of a boolean

v3 (Jason Ekstrand):
 - Return ANV_FAST_CLEAR_NONE instead of false (Topi)
 - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE
 - Add documentation for the enum values

v4 (Jason Ekstrand):
 - Remove a dead comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1f7eee6bc1 anv/image: Update a comment
This got lost in all of the aspect vs. plane rebasing of YCBCR.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
5c38ab8f07 anv/blorp: Rework HiZ ops to look like MCS and CCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1d473e26f2 anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image
If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old
behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what
we want for blits/copies.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
42f1668a54 anv/blorp: Rework image clear/resolve helpers
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op whatever that may be on CCS or MCS.  This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
482c24783e intel/isl: Codify AUX operations in an enum
Right now, we have different entrypoints and enums in blorp for these
different operations.  This provides us a central enum which we can
begin to transition to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Gert Wollny
c36172e387 r600/sb: Check whether optimizations would result in reladdr conflict
v2: * Check whether the node src and dst registers are NULL before using
      them.
    * fix a typo in the commit message.

Two cases are handled with this patch:

1. If copy propagation tries to eliminate a move from a relative
   array access then it could optimize

     MOV R1, ARRAY[RELADDR_1]
     MOV R2, ARRAY[RELADDR_2]
     OP2 R3, R1, R2

   into

     OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2]

   which is forbidden, because there is only one address register available.

2. When MULADD(x,a,MUL(x,c)) is handled

      MUL TMP, R1, ARRAY[RELADDR_1]
      MULADD R3, R1, ARRAY[RELADDR_2], TMP

   by folding this into

      ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1]
      MUL R3, R1, TMP

   which is also forbidden.

Test for these cases and reject the optimization if a forbidden combination
of relative access would be created.
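
A minimal sketch of such a test, using a hypothetical operand
description rather than the sb value and node classes. It assumes that
several accesses through the same address value can share the single
address register, so only differing address values count as a
conflict.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct operand {
   bool relative;   /* true for an ARRAY[RELADDR_n] style access */
   int rel_id;      /* which address value the access depends on */
};

/*
 * Returns true if the operands of the would-be instruction need more
 * than one distinct address value, in which case the optimization has
 * to be rejected.
 */
static bool
has_reladdr_conflict(const struct operand *ops, size_t count)
{
   bool have_rel = false;
   int first_rel = 0;

   assert(ops != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (!ops[i].relative)
         continue;
      if (have_rel && ops[i].rel_id != first_rel)
         return true;
      have_rel = true;
      first_rel = ops[i].rel_id;
   }
   return false;
}
```

In case 1 above, the folded OP2 would carry operands using RELADDR_1
and RELADDR_2, so the check reports a conflict and the two MOVs stay.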

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 10:00:38 +10:00
Glenn Kennard
1d871aa626 r600g: Implement spilling of temp arrays (v2)
Pessimistically spills arrays if GPR limit is exceeded.

v2: fix r600 support [airlied]

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:26 +10:00
Dave Airlie
22fc5eff80 r600/sb: handle scratch mem reads on r600
On r600 we use the scratch mem with read/read_ind, in that case
sb should track the rw_gpr as a dst instead of a src.

This stops the whole shader being optimised out.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:21 +10:00
Glenn Kennard
cd34deb585 r600g/sb: Add dependency tracking for scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:19 +10:00
Glenn Kennard
a100d906b2 r600g/sb: Support scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:16 +10:00
Glenn Kennard
6b4303f358 r600g: Implement scratch buffer state management (v2)
v2: add Glenn's fixes

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:12 +10:00
Glenn Kennard
9d31596d7a r600g: Add pending output function
Spills have to happen after the VLIW bundle currently
processed, so defer emitting the spill op.
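
The deferral can be pictured as a small pending queue that is flushed
when the current bundle closes. The sketch below is a generic
illustration with made-up names and integer "ops"; it does not mirror
the r600 bytecode emitter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OPS 16

struct emitter {
   int out[MAX_OPS];       /* ops in final program order */
   size_t out_count;
   int pending[MAX_OPS];   /* spill ops waiting for the bundle to close */
   size_t pending_count;
   bool in_bundle;
};

static void
emit_op(struct emitter *e, int op)
{
   assert(e->out_count < MAX_OPS);
   e->out[e->out_count++] = op;
}

/* Queue the spill while a bundle is open; emit directly otherwise. */
static void
emit_spill(struct emitter *e, int op)
{
   if (!e->in_bundle) {
      emit_op(e, op);
      return;
   }
   assert(e->pending_count < MAX_OPS);
   e->pending[e->pending_count++] = op;
}

static void
begin_bundle(struct emitter *e)
{
   e->in_bundle = true;
}

/* Close the bundle, then emit everything that was deferred. */
static void
end_bundle(struct emitter *e)
{
   e->in_bundle = false;
   for (size_t i = 0; i < e->pending_count; i++)
      emit_op(e, e->pending[i]);
   e->pending_count = 0;
}
```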

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:08 +10:00
Glenn Kennard
9c48a139b0 r600g: Support emitting scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:52:48 +10:00
Dave Airlie
2a891ed190 r600: fix texture gather swizzling.
This fixes:
KHR-GL45.texture_gather.swizzle
on cayman and redwood.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:32:20 +10:00
Timothy Arceri
12a2350e6d ac: add 64bit support to ac_find_lsb()
v2: use LLVMBuildTrunc()
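
For reference, the semantics being extended can be written as a plain
CPU function. This is only a sketch of the expected results (GLSL
findLSB returns -1 when no bit is set), not the LLVM IR that
ac_find_lsb() builds; the function name is made up. The result is a
32-bit integer even for a 64-bit input, which is presumably why the
wider intermediate result gets truncated.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit, or -1 when the input is zero. */
static int32_t
find_lsb64(uint64_t value)
{
   int32_t index = 0;

   if (value == 0)
      return -1;

   while ((value & 1) == 0) {
      value >>= 1;
      index++;
   }
   return index;
}
```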

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Timothy Arceri
a9f6b392c7 ac: move get_elem_bits() to ac_llvm_build.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Timothy Arceri
19f9839f0b ac: add 64bit bitCount support
v2: use LLVMBuildTrunc()
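
A CPU-side reference for the 64-bit case, again with a made-up name
and not the LLVM code itself. It clears the lowest set bit on each
iteration and returns the count as a 32-bit value, matching the 32-bit
result type of bitCount.

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits in a 64-bit value, returned as a 32-bit count. */
static int32_t
bit_count64(uint64_t value)
{
   int32_t count = 0;

   while (value != 0) {
      value &= value - 1;   /* clear the lowest set bit */
      count++;
   }
   return count;
}
```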

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Samuel Pitoiset
bb750d265c ac/nir: clean up handle_fs_outputs_post()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:33 +01:00
Samuel Pitoiset
528bc14fa5 ac/nir: add radv_load_output() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:30 +01:00
Samuel Pitoiset
834d9845ca ac/shader: scan info about output PS declarations
NIR->LLVM should only be a translation pass, and all scan stuff
should be done before.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:27 +01:00
Samuel Pitoiset
a8e04e91de ac/nir: add radv_export_param() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:26 +01:00
Samuel Pitoiset
e3cfd6b805 ac/nir: remove set but unused export_mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:24 +01:00
Samuel Pitoiset
724136d590 ac/nir: remove dead code in handle_vs_outputs_post()
The memcpy can't be reached because the condition is always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:22 +01:00
Samuel Pitoiset
c63d8d0284 ac/nir: remove useless check in si_llvm_init_export_args()
values can't be NULL because we use ac_build_export_null() now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:20 +01:00
Samuel Pitoiset
26ab5a4269 ac/nir: use ac_build_export_null()
The number of enabled channels should be 0 when exporting null.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:11:44 +01:00
Samuel Pitoiset
bd9f7b7635 ac: add ac_build_export_null() helper
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 22:11:42 +01:00
Scott D Phillips
1f4d2433e7 meson: Add build option for tools
Add a build option to control building some of the misc tools we
have. Also set the executables to install; presumably you want
that if you're asking for the build.

v2: set 'install:' to the with_tools value, not true (Jordan)
    handle 'all' in the comma list (Dylan)
    Add freedreno's tools (Dylan)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-08 11:24:42 -08:00
Anuj Phogat
464d057c86 intel: Add Coffee Lake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-02-08 10:26:34 -08:00
Brian Paul
11e92889aa gallium/util: silence clang warning in blitter code
Silence "warning: comparison of constant 4294967295 with expression
of type 'ubyte'".
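
The pattern behind this kind of warning, shown with hypothetical names
(this is not the blitter code and not necessarily how it was fixed): a
ubyte compared against ~0u is promoted to a value in 0..255 and can
never equal 4294967295. Comparing against a marker of the same width
gives the intended result.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned char ubyte;

/* Hypothetical "unset" marker that fits in a ubyte field. */
#define UBYTE_UNSET ((ubyte)0xff)

/*
 * Writing "value == ~0u" here would always be false and is what
 * triggers the warning; the same-width marker avoids both problems.
 */
static bool
ubyte_is_unset(ubyte value)
{
   return value == UBYTE_UNSET;
}
```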

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-08 10:27:31 -07:00
Brian Paul
4b0a45da25 tgsi: s/unsigned/enum tgsi_semantic/ in ureg_DECL_output()
So the function matches the prototype.  Found with clang.
v2: fix copy&paste error

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-08 10:27:19 -07:00
Brian Paul
d95c2d86cc tgsi: use TGSI_INTERPOLATE_x arguments instead of zeros in ureg code
TGSI_INTERPOLATE_CONSTANT and TGSI_INTERPOLATE_LOC_CENTER have the
value zero so there's no change in behavior.  It seems funny to
declare these fs input registers with constant interpolation.  But
it looks like ureg_DECL_input_layout() is not called anywhere and
ureg_DECL_input() is only called from
util_make_geometry_passthrough_shader().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
26948ba761 gallium/util: s/uint/enum tgsi_semantic/ in simple shader code
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
0f40f4ffda tgsi: s/unsigned/enum pipe_shader_type/ in ureg code
And add a default switch case to silence a compiler warning.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
c0dc337ecd gallium/util: s/uint/enum tgsi_semantic/ in u_blitter.c
And put static qualifier on const arrays.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
e55de6e20c st/mesa: s/unsigned/enum tgsi_semantic/ st_cb_drawpixels.c
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
b9ff185e41 vbo: add a comment on vbo_draw_transform_feedback()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
93b3d38176 gallium/util: trivial whitespace/formatting fixes in u_blit.c
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
5396f8546a vbo: improve comments on vbo_draw_func()
And rename a parameter name.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
b03ade55b9 cso: add a couple sanity check assertions in cso_draw_vbo()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
5cf342704d st/mesa: rename some vars related to indirect draw count
'indirect_params' was a bit vague.  Use the names that we use in
gallium's pipe_draw_indirect_info.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Marek Olšák
d9e6e0bbe3 st/mesa: remove out_num_textures from update_textures
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-08 16:14:11 +01:00
Marek Olšák
08496c5d52 st/mesa: don't store non-fragment sampler states and views in st_context
those are unused.

st_context: 10120 -> 3704 bytes

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-08 16:14:11 +01:00
Lionel Landwerlin
e843667733 i965: perf: cleanup detection of kernel support for loadable configs
The initial revision of the patch adding loadable configs was testing
the feature's availability by adding a new config successfully and
then removing it.

A second version tested the availability just by exercising the
removal. But some unused code remained.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-08 10:52:14 +00:00
Lionel Landwerlin
bd6c0cab60 i965: perf: use drmIoctl() instead of ioctl()
ioctl() might be interrupted, use drmIoctl() instead as it'll retry
automatically.

Fixes: 27ee83eaf7 "i965: perf: add support for userspace configurations"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-02-08 10:51:40 +00:00
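The retry behaviour that bd6c0cab60 relies on can be sketched as follows. This is a minimal illustration, not the i965 code: retry_ioctl(), ioctl_fn and fake_ioctl() are hypothetical names, and the callback stands in for the real ioctl() so the loop can be exercised without a DRM device. The loop condition (retry on EINTR or EAGAIN) mirrors what libdrm's drmIoctl() does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type standing in for ioctl(fd, request, arg). */
typedef int (*ioctl_fn)(int fd, unsigned long request, void *arg);

/* Repeat the call while it fails with EINTR or EAGAIN, so callers never
 * see a spurious failure caused by a signal arriving mid-call. */
static int
retry_ioctl(ioctl_fn do_ioctl, int fd, unsigned long request, void *arg)
{
   int ret;

   assert(do_ioctl != NULL);
   do {
      ret = do_ioctl(fd, request, arg);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   return ret;
}

/* Fake ioctl for demonstration: interrupted twice, then succeeds. */
static int fake_calls;

static int
fake_ioctl(int fd, unsigned long request, void *arg)
{
   (void)fd;
   (void)request;
   (void)arg;

   if (++fake_calls <= 2) {
      errno = EINTR;
      return -1;
   }
   return 0;
}
```

With a plain ioctl() the first EINTR would be reported to the caller as a failure; with the wrapper the caller only sees the final result.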
Lionel Landwerlin
0f952b778f i965: perf: add debug messages for loaded configs
This helps figure out potential problems when metrics don't show up
on frameretrace for example.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-08 10:51:01 +00:00
Dave Airlie
3f7a7bd897 r600: implement tg4 integer workaround. (v2)
This ports the texture gather integer workaround from radeonsi.

This fixes:
KHR-GL45.texture_gather.plain-gather-uint/int*

v2: add rect support, fix 2d array shadow
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (on irc)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 16:21:40 +10:00
Glenn Kennard
77b1b33724 r600: clean up initial shader register setup
This is taken from Glenn Kennard's scratch series, but separated
out as a cleanup by me.

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 16:21:35 +10:00
Roland Scheidegger
b936f4d1ca r600: partly fix sampleMaskIn value
The hw gives us coverage for pixel, not for individual fragment shader
invocations, in case execution isn't per pixel (eg, unlike cm, actually
cannot do "real" minSampleShading, it's either per-pixel or per-fragment,
but it doesn't really make a difference here).
Also, with msaa disabled, the hw still gives us a mask corresponding to
the number of samples, where GL requires this to be 1.
Fix this up by masking the sampleMaskIn bits with the bit corresponding to
the sampleID, if we know this shader is always executed at per-sample
granularity. (In case of a per-sample frequency shader and msaa disabled,
the sampleID will always be 0, so this works just fine there.)
Fixing this for the minSampleShading case will need a shader key (radeonsi
uses the prolog part for) (for eg, could get away with a single bit, cm
would need more bits depending on sample/invocation ratio, or read the
bits from a uniform), unless we'd want to always use a sample mask uniform
(which is probably not a good idea, as it would make the ordinary common
msaa case slower for no good reason).
This fixes some parts of piglit arb_sample_shading-samplemask (with fixed
test), in particular those which use a sampleID, still failing others
as expected.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-08 04:07:52 +01:00
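The masking step described in b936f4d1ca can be sketched in plain C. This is an illustration of the arithmetic only, under the assumption that the shader runs once per sample; per_sample_mask_in() is a hypothetical helper, not a function in the r600 driver.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the coverage bit that belongs to the sample this per-sample
 * invocation is shading. hw_coverage is the per-pixel mask the hardware
 * hands to the shader. */
static uint32_t
per_sample_mask_in(uint32_t hw_coverage, unsigned sample_id)
{
   assert(sample_id < 32);
   return hw_coverage & (UINT32_C(1) << sample_id);
}
```

With msaa disabled the sampleID is always 0, so a hardware mask such as 0xF collapses to 0x1, which is the value GL requires in that case.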
Roland Scheidegger
07d724326a r600: clean up fragment shader input scan code
For some reason, we were iterating through the code twice (first just for
instructions needing barycentrics, then for instructions and input dcls).
Move things around slightly so this is no longer necessary.
There also was an unneeded enabling of the fixed_pt_position_gpr - this is only
needed if the per-sample interpolation comes from an input, not from an
instruction (just move the assert where it belongs) (since the sample id to
sample from comes from a tgsi src in this case, and isn't sampleID).
Otherwise there should be no functional change.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
6fd3c39590 mesa: (trivial) remove unused ignore_sample_qualifier_parameter
This parameter for _mesa_get_min_invocations_per_fragment() was once used
by the intel driver, but it's long gone.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@vmware.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
becc7faae2 r600/cm: (trivial) code cleanup for emitting msaa state
No functional change (compile tested only).

Reviewed-by: Dave Airlie <airlied@redhate.com>
2018-02-08 04:07:52 +01:00
Brian Paul
b99cb13002 tgsi: use tgsi_semantic enum type in ureg code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:43:01 -07:00
Brian Paul
174f3a4ab7 st/mesa: use tgsi_semantic enum type
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:43:01 -07:00
Brian Paul
0f7be4fc16 tgsi: use TGSI enum types in ureg code
v2: fix enum tgsi_interpolate_mode/loc typo.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:42:39 -07:00
Brian Paul
9f9ce1625f st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Brian Paul
6321b1bd40 gallium/util: replace uint with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Brian Paul
15874338ff gallium/util: replace unsigned with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Fredrik Höglund
5a38d8f103 radv: implement VK_EXT_external_memory_host
Ported from the radeonsi GL_AMD_pinned_memory implementation.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 00:46:07 +01:00
Dave Airlie
5dd385f378 r600: fix rendering regression on r6/7 gpus
Fixes: 2d5b5d267e (r600: work out target mask at framebuffer bind.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104989

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 09:37:09 +10:00
Grazvydas Ignotas
f91aa68ac6 radeonsi: avoid int-to-pointer-cast warnings on 32bit
I hope the actual dropping of MSB is ok, but that's what's already
happened before this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 01:13:58 +02:00
Grazvydas Ignotas
13ada91740 gallium/hud: update some query functions
It seems these were missed when struct pipe_context * argument was
added to hud_graph::query_new_value.

Fixes: 3132afdf4c "gallium/hud: pass pipe_context explicitly to most functions"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 01:12:07 +02:00
Roland Scheidegger
09f49b9e50 Revert "gallium: build ddebug, noop, rbug, trace as part of auxiliary"
This reverts commit 6f82b8d8d0.

This broke scons build, and reportedly clover with autotools/meson too.
2018-02-07 23:47:39 +01:00
Marek Olšák
6f82b8d8d0 gallium: build ddebug, noop, rbug, trace as part of auxiliary
Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU.
(gallium build time is reduced by 15% when building only radeonsi)

Non-recursive makefiles are great!
2018-02-07 22:08:34 +01:00
Roland Scheidegger
def09f8db0 u_blit: (trivial) fix bogus argument order for set_fragment_shader
Amazingly this still worked sometimes, albeit I'm not even sure why...
This fixes d7bec6f7a6.
2018-02-07 22:03:18 +01:00
Andres Rodriguez
83990dd529 mesa: fix incorrect type when allocating arrays
The array members have type 'struct gl_buffer_object *'

Found by coverity.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-07 14:50:21 -05:00
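One common way to avoid the class of mistake fixed in 83990dd529 is to size the allocation from the pointer being assigned rather than from a spelled-out type. The sketch below is an assumption about the general idiom, not the code from the commit; struct buffer_object and alloc_buffer_array() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

struct buffer_object;   /* opaque stand-in for struct gl_buffer_object */

/* Allocate an array of 'count' pointers. sizeof(*bufs) is the size of one
 * element (a pointer), so the expression stays correct even if the element
 * type of the array changes later. */
static struct buffer_object **
alloc_buffer_array(size_t count)
{
   struct buffer_object **bufs;

   assert(count > 0);
   bufs = calloc(count, sizeof(*bufs));
   return bufs;
}
```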
Roland Scheidegger
d7bec6f7a6 u_blit,u_simple_shaders: add shader to convert from xrbias format
We need this to handle some oddball dx10 format
(DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM). What you can do with this
format is very limited, hence we don't want to add it as a gallium
format (we could not express the properties of this format as
ordinary format properties either, so like all special formats
it would need specific code for handling it in any case).
While here, also nuke the array for different shaders for different
writemasks, as it was not actually used (always full masks are
passed in for generating shaders).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-07 17:09:37 +01:00
Roland Scheidegger
afd1e9be17 u_simple_shaders: fix mask handling in util_make_fragment_tex_shader_writemask
The writemask handling was busted, since writing defaults to output
meant they got overwritten by the tex sampling anyway. Albeit the
affected components were undefined, so maybe with some luck it
still would have worked with some drivers - if not could as well
kill it... (This would have affected u_blitter but not u_blit since
the latter always used xyzw mask.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-07 17:08:24 +01:00
Bas Nieuwenhuizen
5d754872b5 autotools: Only build libmesa-st-tests-common.a for tests.
We don't need the library if we don't build tests, and building
it adds a dependency on gtest which adds a dependency on cxxabi.h.

Fixes: 6569b33b6e "mesa/st/tests: unify MockCodeLine* classes"
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-02-07 14:04:04 +01:00
Tapani Pälli
9d322fde97 i965: add __DRI2_BLOB support and set cache functions
v2: adjust to change that moved cache from ctx to screen

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
ae00ef2702 disk cache: add callback functionality
v2: add disk_cache_has_key, disk_cache_put_key support
    using blob cache (Nicolai, Jordan)

v3: rename set_cb as put_cb to match existing naming (Timothy)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
6a651b6b77 disk cache: initialize cache path and index only when used
This patch makes disk_cache initialize path and index lazily so
that we can utilize disk_cache without a path using callback
functionality introduced by next patch.

v2: unmap mmap and destroy queue only if index_mmap exists

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
e8495646af glsl/tests: changes to test_disk_cache_create test
The next patch will allow a disk_cache instance to be created without
a path set for it; modify some test cases that assume disk_cache
creation fails with an invalid path. Creation should succeed but
a simple put/get test should fail.

v2: leave tests as is but check that both cache struct exists
    and try simple put/get that should fail with invalid path set
    (Emil)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
83c81b6cce glsl/tests: move utility functions in cache_test
Patch moves functions higher so that we can utilize them from
test_disk_cache_create which is modified by next patch.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
6f5b57093b egl: add support for EGL_ANDROID_blob_cache
v2: cleanup, move callbacks to _egl_display struct (Emil Velikov)
    adapt to earlier ctx->screen changes

v3: remove useless checking, add _eglSetFuncName (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
cf4569da6b dri: add interface for EGL_ANDROID_blob_cache extension
v2: move from __DRIcontext to __DRIscreen (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Samuel Pitoiset
757d36ee70 ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Ported from RadeonSI.

Only one F1 2017 shader is affected, code size decreased
from 532 to 488 on both Polaris10 and Vega10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:13 +01:00
Samuel Pitoiset
2f54d7382d ac/nir: avoid loading unused VS input components
Polaris10:
Totals from affected shaders:
SGPRS: 122840 -> 120984 (-1.51 %)
VGPRS: 78812 -> 78440 (-0.47 %)
Spilled SGPRs: 177 -> 129 (-27.12 %)
Code Size: 2950028 -> 2941276 (-0.30 %) bytes
Max Waves: 17899 -> 17976 (0.43 %)

Vega10:
Totals from affected shaders:
SGPRS: 117144 -> 115776 (-1.17 %)
VGPRS: 77580 -> 77532 (-0.06 %)
Spilled SGPRs: 0 -> 152 (0.00 %)
Code Size: 3352656 -> 3347860 (-0.14 %) bytes
Max Waves: 19756 -> 19866 (0.56 %)

This increases SGPRs spilling a bit with Talos, but I have
some other ideas that might reduce it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:09 +01:00
Samuel Pitoiset
1c57a6da5e ac/shader: scan vertex inputs usage mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:07 +01:00
Iago Toral Quiroga
f474b19875 i965: allocate a SGVS element when VertexID or InstanceID are read
Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS
to put these beyond the last vertex element it seems that we still
need to allocate the SGVS element, otherwise we have observed cases
where we end up reading garbage. Specifically, the CTS test mentioned
below was flaky with a fail rate of ~1% on some gen9+ platforms caused
by reading garbage for the gl_InstanceID value. The flakiness goes
away as soon as we start allocating the SGVS element.

v2:
  - Do this for gen8+, not just gen9+, and pull the boolean
    outside the #if block (Jason)

Fixes flaky test:
KHR-GL45.vertex_attrib_64bit.limits_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-07 11:11:16 +01:00
Dylan Baker
c74719cf4a glapi: fix check_table test for non-shared glapi with meson
v2: - Add glapitable_h generated source to requirements

Fixes: 3218056e0e ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-02-06 15:00:17 -08:00
Dylan Baker
002fbde71e glapi: Don't search through subdirs from glapitable.h
Because meson won't put it in that folder.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
aac3d01178 state_tracker: Don't build st-renumerate-test without shared glapi
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
0316aa432d glapi: remove APPLE extensions from test
Fixes: 7009955281 ("mesa: Remove GL_APPLE_vertex_array_object stubs")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
a4f1fc5dd1 glapi/check_table: Remove 'extern "C"' block
Using 'extern "C"' around includes is always incorrect, as the header may
contain C++ symbols (as it does in this case), which means it cannot use
C linkage. In this case the header has a template in it, which obviously
cannot be linked with C linkage rules.

Fixes: a29ad2b421 ("mesa/tests: Add tests for the generated dispatch table")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
105178db8f meson: fix test source name for static glapi
Fixes: 43a6e84927 ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
9be7487f30 glapi: don't walk backwards for includes
Instead just set the proper -I flags and include it from a more standard
path. In this case we'll add -Isrc/mesa (which is common), and #include
main/foo.h.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Brian Paul
e7a4536e64 mesa: rename gl_vertex_array_object::_VertexAttrib -> _VertexArray
Since the type is gl_vertex_array.  Update comment to explain that
these arrays are only used by the VBO module.

Also rename some local variables in _mesa_update_vao_derived_arrays().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:36:47 -07:00
Brian Paul
d9ab39ea65 mesa: minor whitespace fixes, line wrapping in texcompress.c
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:23:26 -07:00
Brian Paul
b38196b452 mesa: simplify _mesa_get_compressed_formats()
Instead of testing for formats==NULL everywhere, just point formats at
a dummy array which will be discarded.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:23:26 -07:00
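The dummy-array pattern described in b38196b452 can be sketched as below. This is a simplified illustration rather than the Mesa function: get_compressed_formats(), MAX_FORMATS and the have_s3tc parameter are hypothetical, and only three S3TC enums are listed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FORMATS 4   /* hypothetical upper bound for this sketch */

/* Return the number of supported formats. If 'formats' is NULL the caller
 * only wants the count, so the writes go to a local array that is thrown
 * away, and no per-write NULL check is needed. */
static unsigned
get_compressed_formats(int *formats, int have_s3tc)
{
   int discard[MAX_FORMATS];
   unsigned n = 0;

   if (formats == NULL)
      formats = discard;

   if (have_s3tc) {
      formats[n++] = 0x83F0; /* GL_COMPRESSED_RGB_S3TC_DXT1_EXT */
      formats[n++] = 0x83F2; /* GL_COMPRESSED_RGBA_S3TC_DXT3_EXT */
      formats[n++] = 0x83F3; /* GL_COMPRESSED_RGBA_S3TC_DXT5_EXT */
   }

   assert(n <= MAX_FORMATS);
   return n;
}
```

The trade-off is a few wasted stores in the count-only case in exchange for a single NULL test at the top of the function.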
Vlad Golovkin
d919ff0f27 util: remove redundant check for the __clang__ macro
Clang defines __GNUC__ macro, so one doesn't need to check __clang__
macro in this particular case.

v2: added comment as per Brian Paul's suggestion

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 15:23:26 -07:00
Brian Paul
77bc74e674 st/mesa: use st_access_flags_to_transfer_flags() helper in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Brian Paul
1852a2e1a2 st/mesa: refactor st_bufferobj_map_range()
Use a new helper function, st_access_flags_to_transfer_flags(), to
convert the GL_MAP_x flags to PIPE_TRANSFER_x flags.

We'll be able to use this function in a couple other places.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Brian Paul
8a32dd2ec9 st/mesa: refactor bufferobj_data()
Split out some of the code into three new helper functions:
buffer_target_to_bind_flags(), storage_flags_to_buffer_flags(),
buffer_usage() to make the code more manageable.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Samuel Pitoiset
3488a3f033 radv: run nir_opt_shrink_load
LLVM can't shrink loads.

Polaris10:
Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Code Size: 1355504 -> 1355172 (-0.02 %) bytes
Max Waves: 11710 -> 11670 (-0.34 %)

Vega10:
Totals from affected shaders:
SGPRS: 51448 -> 50371 (-2.09 %)
VGPRS: 39140 -> 39048 (-0.24 %)
Spilled SGPRs: 16 -> 16 (0.00 %)
Code Size: 1307188 -> 1304296 (-0.22 %) bytes
Max Waves: 11312 -> 11292 (-0.18 %)

This reduces SGPRs spilling in MadMax, and it also reduces
number of SGPRs in DOW3 and F1 2017. The number of waves slightly
decreases in F1 but I don't see any performance changes after
benchmarking it. Talos and Serious Sam are not affected because
they don't use any push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:44 +01:00
Samuel Pitoiset
e68562b94b nir: add nir_opt_shrink_load pass
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:39 +01:00
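The shrinking rule described in e68562b94b (vec4 to vec3, vec3 to vec2 and so on) only drops trailing components, so the new width is determined by the highest component that is actually read. A minimal sketch of that calculation, assuming a bitmask with one bit per component; shrunk_num_components() is a hypothetical helper and not part of the NIR pass:

```c
#include <assert.h>

/* Given the original vector width (1..4) and a mask of the components that
 * are read, return the smallest width that still covers every read
 * component. Returns 0 if nothing is read. */
static unsigned
shrunk_num_components(unsigned num_components, unsigned read_mask)
{
   unsigned needed = 0;

   assert(num_components >= 1 && num_components <= 4);

   /* Ignore any bits beyond the original width. */
   read_mask &= (1u << num_components) - 1;

   for (unsigned i = 0; i < num_components; i++) {
      if (read_mask & (1u << i))
         needed = i + 1;
   }
   return needed;
}
```

Note that a load whose only used component is the last one (mask 0x8 on a vec4) cannot be shrunk under this rule, since only trailing components are removed.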
Timothy Arceri
e2ea9e1191 radeonsi/nir: add nir support for compiling compute shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
9c52902c76 ac/radeonsi: add num_work_groups to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
f12e2f9c12 ac: implement nir_intrinsic_shader_clock
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
b7b89bbddb ac/radeonsi: create ac_build_shader_clock() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
d116af383f ac/radeonsi: add load_local_group_size() to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
f6932d1ef3 radeonsi: add get_block_size() helper
This will be reused by the nir backend in a later patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
e3ebffdbb0 ac: don't call emit_outputs() for compute
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
c8066cdfa7 ac/radeonsi: add local_invocation_ids to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
fa5239c153 ac/radeonsi: add workgroup_ids to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
64c10c9737 radeonsi/nir: gather some compute info in si_nir_scan_shader()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
1142b1d3e1 radeonsi/nir: always set input_usage_mask as using all components
This fixes a regression for now, in the future we should gather
the used components properly.

V2: just set for VS and correctly handle doubles

Fixes: be973ed21f "radeonsi: load the right number of components for VS inputs and TBOs"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:38:52 +11:00
Timothy Arceri
ffeebcfa7e i965: remove unused brw_nir_lower_cs_shared()
This has been unused since 8761a04d0d.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-02-07 08:38:01 +11:00
Bas Nieuwenhuizen
a3e42e7a69 vulkan/wsi: Fix OOM behavior with prime images.
Fixes: d50937f137 "vulkan/wsi: Implement prime in a completely generic way"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-06 21:52:39 +01:00
Bas Nieuwenhuizen
c7d640fbbf ac/nir: fix GS load input type.
Fixes: df1d5174fc "ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-06 21:52:38 +01:00
Mathias Fröhlich
e8a9473d32 mesa: Factor out _mesa_disable_vertex_array_attrib.
And use it in the enable code path.
Move _mesa_update_attribute_map_mode into its only remaining file.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
Mathias Fröhlich
236657842b vbo: Move vbo_rebase into its only caller module tnl.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
Mathias Fröhlich
2313c33e95 mesa: Use atomics for buffer objects reference counts.
The mutex is currently used for reference counting and updating
the minmax index cache.
The change uses atomics directly for reference counting and
the mutex for the minmax cache.
This is safe since the reference count is not modified anywhere except
in _mesa_reference_buffer_object, which now uses atomics.
While using the minmax cache, the calling code holds a reference
to the buffer object. Thus unreferencing or even referencing the
buffer object does not need to be serialized with accessing
the minmax cache.
The change reduces the time _mesa_reference_buffer_object takes
by about a factor of two when looking at perf results for some
of my favorite use cases.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
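The idea behind 2313c33e95, reference counting with atomics instead of under a mutex, can be sketched with C11 <stdatomic.h>. This is an illustration under assumptions: Mesa has its own atomic helpers, which are not used here, and struct buffer_object, buffer_ref() and buffer_unref() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for gl_buffer_object: only the counter matters. */
struct buffer_object {
   atomic_int ref_count;
};

/* Take a reference without holding any lock. */
static void
buffer_ref(struct buffer_object *obj)
{
   atomic_fetch_add(&obj->ref_count, 1);
}

/* Drop a reference. Returns true if this call released the last one, in
 * which case the caller is the only owner left and may free the object. */
static bool
buffer_unref(struct buffer_object *obj)
{
   int old = atomic_fetch_sub(&obj->ref_count, 1);

   assert(old > 0);
   return old == 1;
}
```

Because the decrement returns the previous value atomically, exactly one thread observes the transition to zero, which is what makes the mutex unnecessary for the count itself.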
Dave Airlie
6c691081a1 r600: fixup sparse color exports.
If we have gaps in the shader mask we have to have 0x1 in them
according to a comment in radeonsi, and this is required to fix
the test at least on cayman.

We also need to record the highest one written to write to the
ps exports reg.

This fixes:
KHR-GL45.enhanced_layouts.fragment_data_location_api

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:59 +10:00
Dave Airlie
2d5b5d267e r600: work out target mask at framebuffer bind.
If we only get 1,2,3,6 framebuffers we want a sparse target mask.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:55 +10:00
Dave Airlie
5b14e06d8b r600: work out shader export mask at shader build time (v1.1)
Since enhanced layouts allows setting specific MRT outputs, we
can get sparse outputs, so we have to calculate the shader
mask earlier.

v1.1: update checks for state update (Roland)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:27 +10:00
Dave Airlie
f292eceae1 r600: fix xfb stream check.
This fixes:
KHR-GL45.enhanced_layouts.xfb_vertex_streams

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
680cb9898a r600/compute: add render cond support.
Set render cond and emit atom.

Fixes:
KHR-GL45.compute_shader.conditional-dispatching

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
5fd7b282b3 r600: fix not-very indirect compute
We need to get the grid sizes earlier to fill in to the const
buffer.

Fixes:
KHR-GL45.compute_shader.built-in-variables
and
KHR-GL45.compute_shader.dispatch-indirect

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
00a112641b r600: overhaul buffer resource query.
This cleans up and fixes the previous fix even more.

Buffers from textures start at max const,
buffers from buffers/images come in from the 168 offset.

This fixes a bunch of:
KHR-GL45.shader_storage_buffer_object*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
736b150768 r600/eg: fix buffer sizing.
For buffers we want the size in bytes,
For images we want it in elements.

This fixes:
KHR-GL45.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
c9c4f0b722 r600/images: set offset for compute shaders with number of declared samplers
For frag shaders we get a value in the key; I expect I need
to make compute work better.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
ab5cee4c24 r600/compute: only mark buffer/image state dirty for fragment shaders
The compute emission path always emits this currently, and emitting
it on the fragment path breaks the blitter.

This fixes gpu hangs in KHR-GL45.compute_shader.resource-texture

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
4e3b43f180 r600/atomic: fix ATOMCAS instruction.
This has 4 srcs.

This fixes:
KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
8bdad9fa1f r600/sb/cayman: fix indirect ubo access on cayman
With sb enabled on cayman, this was overwriting the proper
cf index value with random ones if the dst gpr was 2 or 3,
only save the value for a MOVA instruction.

Fixes:
KHR-GL45.gpu_shader5.uniform_blocks_array_indexing
(on cayman with sb)

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
012100b809 r600/eg: use texture target to pick array size not view target (v2)
This fixes a few CTS cases in:
KHR-GL45.texture_view.view_sampling

some multisample cases are still broken, but not sure this is
the same problem.

v2: fix more cases

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
e7e81f362d radv: don't support tc-compat on multisample d32s8 at all.
RX550 fails
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2

So increase the range of the workaround.

Fixes: f4c534ef6 (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1))

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-06 19:56:00 +00:00
Michal Navratil
4081e08896 winsys/amdgpu: allow non page-aligned size bo creation from pointer
Fix INVALID_OPERATION caused by BufferData with target
EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD when the buffer size is
not page aligned.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-06 18:51:12 +01:00
Jon Turney
9440599c8e meson: ensure xmlpool/options.h is generated for libgallium
In file included from ../src/gallium/targets/dri/target.c:1:
In file included from ../src/gallium/auxiliary/target-helpers/drm_helper.h:8:
../src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found

See also 26bde1e3.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:56:12 +00:00
Andres Gomez
1ec88755c2 vbo: provide 64bits support to print_draw_arrays
Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:29 +02:00
Andres Gomez
0057ae4038 vbo: take into account the size when printing VAO elements
When using print_draw_arrays for debugging, we were printing "n"
vertices, but depending on the stride used this did not cover the
whole size of those "n" vertices.

Now we print the whole size of the "n" vertices.

Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:23 +02:00
Andres Gomez
c9325b4fa9 vbo: print first element of the VAO when the binding stride is 0
Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:12 +02:00
Iago Toral Quiroga
a5053ba27e anv/device: initialize the list of enabled extensions properly
The loop goes through the list of enabled extensions marking them as
enabled in the list, but this relies on every other extension being
initialized to false by default.

This bug would make us, for example, advertise certain device extension
entry points as available even when the corresponding extensions had
not been enabled.
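
The pattern described above can be sketched in C as follows. This is only an
illustration under stated assumptions: the `sketch_*` names, the table layout
and the three-entry extension list are hypothetical and are not taken from the
anv sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative extension table; not the list anv actually uses. */
#define SKETCH_EXT_COUNT 3
static const char *const sketch_ext_names[SKETCH_EXT_COUNT] = {
    "VK_KHR_swapchain",
    "VK_KHR_maintenance1",
    "VK_KHR_maintenance2",
};

/* Hypothetical per-device table: one flag per known extension. */
struct sketch_enabled_table {
    bool enabled[SKETCH_EXT_COUNT];
};

static inline void
sketch_init_enabled(struct sketch_enabled_table *table,
                    const char *const *requested, size_t count)
{
    /* Start with every extension disabled so stale memory cannot leak in. */
    memset(table, 0, sizeof(*table));

    /* Then mark only the extensions that were requested. */
    for (size_t i = 0; i < count; i++) {
        for (size_t e = 0; e < SKETCH_EXT_COUNT; e++) {
            if (strcmp(requested[i], sketch_ext_names[e]) == 0)
                table->enabled[e] = true;
        }
    }
}

/* Usage example: a table whose flags were all set beforehand must end up
 * with only the requested extension enabled. Returns true on success. */
static inline bool
sketch_demo_only_requested_enabled(void)
{
    struct sketch_enabled_table table = { { true, true, true } };
    const char *const requested[] = { "VK_KHR_maintenance1" };

    sketch_init_enabled(&table, requested, 1);
    return !table.enabled[0] && table.enabled[1] && !table.enabled[2];
}
```

Without the initial clear, the loop alone would leave the two unrequested
entries in whatever state they had before.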

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: abc62282b5 "anv: Add a per-device table of enabled extensions"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-02-06 07:51:00 +01:00
Iago Toral Quiroga
ef439a4fdc spirv: split constant initializers on in/out structs
The SPIR-V parser splits in/out struct variables and creates
a separate variable for each first-level member of the struct.
When the struct variable has an initializer this means that we also
need to split the initializer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-06 07:50:18 +01:00
Iago Toral Quiroga
1d20001d97 i965/nir: do int64 lowering before optimization
Otherwise loop unrolling will fail to see the actual cost of
the unrolling operations when the loop body contains 64-bit integer
instructions, especially when the divmod64 lowering applies,
since its lowering is quite expensive.

Without this change, some in-development CTS tests for int64
get stuck forever trying to register allocate a shader with
over 50K SSA values. The large number of SSA values is the result
of NIR first unrolling multiple seemingly simple loops that involve
int64 instructions, only to then lower these instructions to produce
a massive pile of code (due to the divmod64 lowering in the unrolled
instructions).

With this change, loop unrolling will see the loops with the int64
code already lowered and will realize that it is too expensive to
unroll.

v2: Run nir_algebraic first so we can hopefully get rid of some of
    the int64 instructions before we even attempt to lower them.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-06 07:49:27 +01:00
Ilia Mirkin
02a6d901ee mesa: add OES_EGL_image_external_essl3 support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-06 07:28:11 +02:00
Vinson Lee
fe32f796f2 r600/fp64: Fix build.
CC       r600_shader.lo
r600_shader.c: In function ‘egcm_int_to_double’:
r600_shader.c:4543:12: error: ‘ctx’ is a pointer; did you mean to use ‘->’?
     if (ctx.bc->chip_class == CAYMAN)
            ^
            ->

Fixes: 35b4301577 ("r600/fp64: fix integer->double conversion")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 15:32:20 -08:00
Dave Airlie
35b4301577 r600/fp64: fix integer->double conversion
Doing a straight uint/int->fp32->fp64 conversion causes
some precision issues, Roland suggested splitting the
integer into two portions and doing two separate
int->fp32->fp64 conversions then adding the results.
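
A minimal C sketch of the idea, for unsigned inputs only. The 16/16 bit split
and the `sketch_*` function names are assumptions made for illustration; the
message does not say where the driver splits the integer.

```c
#include <stdint.h>

/* Straight uint -> fp32 -> fp64: the fp32 step only keeps 24 significant
 * bits, so large values get rounded before they reach fp64. */
static inline double
sketch_uint_to_double_direct(uint32_t v)
{
    return (double)(float)v;
}

/* Split variant: each portion has at most 16 significant bits, so both
 * uint -> fp32 conversions are exact, and the fp64 add is exact as well. */
static inline double
sketch_uint_to_double_split(uint32_t v)
{
    float hi = (float)(v & 0xffff0000u); /* upper 16 bits, still scaled */
    float lo = (float)(v & 0x0000ffffu); /* lower 16 bits */

    return (double)hi + (double)lo;
}
```

On an IEEE 754 implementation with round-to-nearest, the direct path turns
16777217 (2^24 + 1) into 16777216.0, while the split path keeps 16777217.0.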

This passes the tests in CTS and piglit.

[airlied: fix cypress conversion opcodes]
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-06 08:21:48 +10:00
Samuel Pitoiset
0170ae1e23 ac/nir: remove emission of nir_op_fdiv
RadeonSI and RADV lower fdiv.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-05 23:09:34 +01:00
Jon Turney
b5af199f92 travis: add macOS meson build
v2: Simplify set of options now we have better defaults

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:42:01 +00:00
Jon Turney
80bc41b2ec meson: osx ld doesn't support --build-id
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 19:40:43 +00:00
Jon Turney
ea8730024f meson: build src/glx/apple
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:40:43 +00:00
Dylan Baker
569628dd24 meson: set apple glx defines
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:40:43 +00:00
Jon Turney
4772909447 meson: better defaults for osx, windows and cygwin
set suitable defaults for 'dri-drivers', 'gallium-drivers', 'vulkan-drivers'
and 'platforms' options for osx, windows and cygwin, adding cygwin where
appropriate.

v2: error() for unknown OS

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 19:34:37 +00:00
Matt Turner
e2b31e9acf i965: Move mistakenly placed line
Ken called this out in review, but it seems I forgot to make the change.
I noticed that the control flow annotations in the fragment shader
disassembly of tests/shaders/glsl-fs-loop-continue.shader_test were not
correct, and moving this line to the correct place fixes it.
2018-02-05 09:50:56 -08:00
Juan A. Suarez Romero
4195eed961 glsl/linker: check same name is not used in block and outside
According to the OpenGL GLSL 3.20 spec, section 4.3.9:

  "It is a link-time error if any particular shader interface
   contains:
     - two different blocks, each having no instance name, and each
       having a member of the same name, or
     - a variable outside a block, and a block with no instance name,
       where the variable has the same name as a member in the block."

This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the
same name in unnamed block and outside") that covered this case, but
did not take into account that precision qualifiers are ignored when
comparing blocks with no instance name.

With this commit, the original tests
KHR-GL*.shaders.uniform_block.common.name_matching remain fixed, and the
dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision
regression introduced by the previous commit is also fixed.

v2: use helper variables (Matteo Bruni)

Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777
CC: Mark Janes <mark.a.janes@intel.com>
CC: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-02-05 18:10:43 +01:00
Juan A. Suarez Romero
3d14e72057 mesa: enable ASTC format for CompressedTexSubImage3D
If extensions GL_KHR_texture_compression_astc_hdr or
GL_KHR_texture_compression_astc_sliced_3d are implemented then ASTC
formats are supported in CompressedTex*Image3D.

Fixes KHR-GLES2.texture_3d.* with this format.

CC: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-02-05 17:00:19 +01:00
Stephan Gerhold
02e2009b92 util/build-id: Fix address comparison for binaries with LOAD vaddr > 0
build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD
segment has a virtual address other than 0x0.

For most shared libraries, the first LOAD segment has vaddr=0x0:

    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    LOAD           0x000000 0x00000000 0x00000000 0x2d2e26 0x2d2e26 R E 0x1000
    LOAD           0x2d2e54 0x002d3e54 0x002d3e54 0x2e248 0x2f148 RW  0x1000

However, compiling the Intel Vulkan driver as a 32-bit binary on Android produces
the following ELF header with vaddr=0x8000 instead:

    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    PHDR           0x000034 0x00008034 0x00008034 0x00100 0x00100 R   0x4
    LOAD           0x000000 0x00008000 0x00008000 0x224a04 0x224a04 R E 0x1000
    LOAD           0x225710 0x0022e710 0x0022e710 0x25988 0x27364 RW  0x1000

build_id_find_nhdr_callback() compares the address of dli_fbase from dladdr()
and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a
different memory address, e.g.:

    dli_fbase=0xd8395000 (offset 0x8000)
    dlpi_addr=0xd838d000

At least on glibc and bionic (Android) dli_fbase refers to the address where
the shared object is mapped into the process space, whereas dlpi_addr is just
the base address for the vaddrs declared in the ELF header.

To compare them correctly, we need to calculate the start of the mapping
by adding the vaddr of the first LOAD segment to the base address.
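
The comparison can be sketched in C roughly as below, using the Android
numbers quoted above. `struct sketch_phdr` and the `sketch_*` names are
hypothetical simplifications of the real ELF program header type; only the
numeric values of PT_LOAD (1) and PT_PHDR (6) come from the ELF format.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PT_LOAD 1u /* numeric value of PT_LOAD */
#define SKETCH_PT_PHDR 6u /* numeric value of PT_PHDR */

/* Hypothetical, trimmed-down program header: type and virtual address. */
struct sketch_phdr {
    uint32_t type;
    uint64_t vaddr;
};

/* Program headers of the 32-bit Android build shown in this message. */
static const struct sketch_phdr sketch_android_phdrs[] = {
    { SKETCH_PT_PHDR, 0x00008034 },
    { SKETCH_PT_LOAD, 0x00008000 },
    { SKETCH_PT_LOAD, 0x0022e710 },
};

/* Returns 1 when dli_fbase (start of the mapping) matches dlpi_addr
 * (base for the vaddrs) plus the vaddr of the first LOAD segment. */
static inline int
sketch_is_same_object(uint64_t dli_fbase, uint64_t dlpi_addr,
                      const struct sketch_phdr *phdr, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (phdr[i].type == SKETCH_PT_LOAD)
            return dlpi_addr + phdr[i].vaddr == dli_fbase;
    }
    return 0; /* no LOAD segment, nothing to compare against */
}
```

With dlpi_addr=0xd838d000 and a first LOAD vaddr of 0x8000 the sum is
0xd8395000, which matches dli_fbase; comparing dlpi_addr directly does not.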

Note: musl users will need the following patch.
https://git.musl-libc.org/cgit/musl/commit/?id=b3ae7beabb9f0c219bb8a8b63567a01c6530c1ac

Cc: Chad Versace <chadversary@chromium.org>
Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642
Fixes: 5c98d38 "util: Query build-id by symbol address, not library name"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-05 14:26:33 +00:00
Boyuan Zhang
d645b0850a radeonsi: enable vcn encode for HEVC main
Enable vcn encode for HEVC main profile on Raven.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
5534a2791f st/va: implement HEVC encode functions
Implement HEVC encode functions based on VAAPI HEVC encode interface.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
9ac50a2e0c st/va: add HEVC encode functions
Add a separate file for HEVC encode functions.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
66087d8a2d st/va: enable dual instances encode only for H264
Logic related to dual-instance encode should only be applied for
H264, not other codecs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
a9c0861c6c st/va: add entrypoint check for HEVC
Add entrypoint check for HEVC to differentiate decode and encode jobs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
ecc3944344 st/va: add HEVC picture desc
Add HEVC picture desc, and add codec check when creating and destroying
context.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
9393b53c29 st/va: move H264 enc functions into separate file
Move all H264 encode related functions into a separate file. Similar to
the VAAPI decode side, there will be a separate file for each codec on the
encode side as well.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
b391d34916 radeon/vcn: add header implementations for HEVC
Implement encoding of sps, pps, vps, aud, and slice headers for HEVC
based on HEVC specs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
fdc952b320 radeon/vcn: add ib implementations for HEVC
Implement required ibs for vcn HEVC encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
5ab73edddb radeon/vcn: support picture parameters for HEVC
Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that
it can be used for different codecs. Add functions to handle picture
parameters that will be used for HEVC encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
db67d04df3 radeon/vcn: add vcn encode interface for HEVC
Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic
to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
f410936439 vl: add parameters for HEVC encode
Add HEVC encode interface

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Eric Anholt
aa2f609f70 broadcom/vc5: Ignore samplers for finding uniform offsets.
Fixes:
KHR-GLES3.shaders.struct.uniform.sampler_array_fragment
KHR-GLES3.shaders.struct.uniform.sampler_array_vertex
KHR-GLES3.shaders.struct.uniform.sampler_nested_fragment
KHR-GLES3.shaders.struct.uniform.sampler_nested_vertex
2018-02-05 13:56:02 +00:00
Eric Anholt
63a8a0f3c0 broadcom/vc5: Fix non-mipfiltered sampling.
We need to clamp the LOD to 0 if mip filtering is disabled.  This is part
of fixing KHR-GLES3.shaders.struct.uniform.sampler_array_fragment.
2018-02-05 13:53:38 +00:00
Eric Anholt
e29988c908 broadcom/vc5: Fix "hardwrae" typo in a field name in XML.
2018-02-05 13:53:38 +00:00
Samuel Pitoiset
a1d568c830 ac/nir: fix a crash in load_gs_input() on pre-GFX9 chips
Fixes: df1d5174fc ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-05 11:05:52 +01:00
Eric Anholt
8bb000f460 broadcom/vc5: Try to merge more than 2 QPU instructions together.
Obviously it would be good to have an ADD and a MUL and a signal together,
but we can even potentially have multiple signals merged, as well.

total instructions in shared programs: 100423 -> 97874 (-2.54%)
instructions in affected programs:     78812 -> 76263 (-3.23%)
2018-02-05 09:29:37 +00:00
Eric Anholt
dc78643ace broadcom/vc5: Remove no-op MOVs after register allocation.
We emit some MOVs to track lifetimes of payload registers, but we don't
need there to be actual MOV instructions for them.

total instructions in shared programs: 101045 -> 100423 (-0.62%)
instructions in affected programs:     37083 -> 36461 (-1.68%)
2018-02-05 09:29:37 +00:00
Eric Anholt
f3978a7380 broadcom/vc5: Add missing shader-db instruction counting.
I must have misplaced it in the instruction packing rework.
2018-02-05 09:29:37 +00:00
Dave Airlie
7801425028 r600: fix resq for buffer images.
If this is an image buffer, we need to calculate the correct resource
id.

Fixes:
KHR-GL45.shader_image_size.*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-05 05:15:41 +10:00
Dave Airlie
6c1432f0be r600/eg: fix cube map array buffer images.
This fixes a crash in:
KHR-GL45.texture_cube_map_array.texture_size_compute_sh.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-05 05:14:56 +10:00
Marek Olšák
af3685d149 mesa: change ctx->Color.ColorMask into a 32-bit bitmask
4 bits per draw buffer, 8 draw buffers in total --> 32 bits.

This is easier to work with.
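
As a rough illustration of the packing, the following C sketch stores a 4-bit
mask per draw buffer in one 32-bit word. The `sketch_*` helpers and the
constant name are hypothetical and are not the macros Mesa uses; which of the
4 bits maps to which channel is left unspecified here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_DRAW_BUFFERS 8u

/* Extract the 4-bit channel mask of draw buffer `buf`. */
static inline unsigned
sketch_get_colormask(uint32_t mask, unsigned buf)
{
    assert(buf < SKETCH_MAX_DRAW_BUFFERS);
    return (mask >> (4u * buf)) & 0xfu;
}

/* Return `mask` with the 4 bits of draw buffer `buf` replaced by `rgba`. */
static inline uint32_t
sketch_set_colormask(uint32_t mask, unsigned buf, unsigned rgba)
{
    assert(buf < SKETCH_MAX_DRAW_BUFFERS);
    mask &= ~((uint32_t)0xfu << (4u * buf));
    return mask | ((uint32_t)(rgba & 0xfu) << (4u * buf));
}

/* Apply the same 4-bit mask to all 8 draw buffers at once. */
static inline uint32_t
sketch_replicate_colormask(unsigned rgba)
{
    return (uint32_t)(rgba & 0xfu) * 0x11111111u;
}
```

One benefit of this layout is that "all channels of all buffers enabled" is a
single comparison against 0xffffffff.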

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-04 01:50:10 +01:00
Jordan Justen
83e60ce927 i965: Create new program cache bo when clearing the program cache
When the disk shader cache CI testing was enabled, we started noticing
occasional failures on deqp test runs. (Mainly SNB, rarely HSW)

Before this change, when we cleared the (in memory) program cache we
reused the same bo. Since the disk shader cache quickly restores
programs, it appears that this would lead to overwrites of the older
program binaries in the in memory program cache that apparently were
still executing in some cases. If these programs were still executing,
this could cause a GPU hang.

This issue is probably not disk shader cache specific, but may have
been hidden due to the compiler taking time to recompile programs
after the cache was cleared.

v2:
 * Don't add `copy` param to brw_cache_new_bo (Ken)
 * Call from brw_program_cache_check_size (Ken)

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-03 12:16:58 -08:00
Jason Ekstrand
589e9db23f aubinator: Multiply count by 4 to compute buffer sizes
The count field is in terms of dwords and not bytes.  In
7d4007d58a, I fixed one instance
of this but missed another.
2018-02-02 22:30:56 -08:00
Eric Anholt
2e746bc63d broadcom/vc5: Enable UIF XOR on textures.
This should increase performance by reducing SDRAM bank conflicts when
crossing between UIF columns (particularly on power-of-two height
textures).

The uif_xor_disable setup is dropped, since we need to allow XOR on lower
miplevels even when level 0 is XOR.  The level 0 force UIF and level 0 XOR
flags should handle setting XOR properly on imported buffers.
2018-02-02 16:50:02 -08:00
Eric Anholt
6a862b0de7 broadcom/vc5: Fix alignment of miplevel 1 with UIF.
The alignment here means that we can't get back the padded height from the
size/stride any more, so it's now a field in the slice as well.

Fixes piglit fbo-generatemipmap-formats RGBA16 NPOT.
2018-02-02 16:27:49 -08:00
Eric Anholt
5c57e0a549 broadcom/vc5: Switch our RGBA4 support to the new gallium format.
Fixes fbo-generatemipmap-formats, fbo-alphatest-formats, etc. tests for
GL_RGBA4, GL_RGB4, GL_RGBA2, etc.
2018-02-02 16:27:49 -08:00
Eric Anholt
2a97f1d3ef gallium: Add a new A4B4G4R4 pipe format for Broadcom.
The VC5 HW puts A in the low bits and R in the high bits.  We can't just
swizzle in the shaders because the blending HW can't pick what channel A
is in, so make a new format to match it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-02 16:27:49 -08:00
Eric Anholt
1429cd74c2 mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.
swapBytes operates on bytes, not 4-bit channels, so you can't just take
non-swapBytes cases and flip the REV flag.

Avoids piglit texture-packed-formats regressions when enabling the
ABGR4444 format.

Fixes: c5a5c9a7db ("mesa/formats: add new mesa formats and their pack/unpack functions.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-02 16:27:49 -08:00
George Kyriazis
bbef9474fa meson/swr: Updated copyright dates
cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-02 17:43:07 -06:00
George Kyriazis
16bf813830 meson/swr: re-shuffle generated files
Move generated files from codegen/meson.build to other directories, in order
to satisfy generated include file dependencies

Add correct file lists for architecture-specific libraries.

cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-02 17:43:00 -06:00
Marek Olšák
3bf1e036e8 amd: remove support for LLVM 3.9
Only these are supported:
- LLVM 4.0
- LLVM 5.0
- LLVM 6.0
- master (7.0)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 23:47:40 +01:00
Dylan Baker
c75a4e5b46 meson: Check for actual LLVM required versions
Currently we always check for 3.9.0, which is pretty safe since
everything except radv works with >= 3.9 and 3.9 is pretty old at this
point. However, radv actually requires 4.0, and there is a patch for
radeonsi to do the same.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 13:22:58 -08:00
Dylan Baker
d7235ef83b meson: Don't confuse the install and search paths for dri drivers
Currently there is not a separate option for setting the search path of
DRI drivers in meson, like there is in scons and autotools. This is an
oversight and needs to be fixed. This adds an extra option
`dri-search-path`, which will default to the value of
`dri-drivers-path`, like autotools does.

v2: - Split input list before joining.
v3: - use : instead of ; as the delimiter. The autotools help string
      incorrectly says ; but the code uses :
v4: - Take list in pre : delimited form (Ilia)
    - Ensure that the dri-search-path is absolute when using
      dri_drivers_path

Fixes: db9788420d ("meson: Add support for configuring dri drivers directory.")
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)
2018-02-02 11:01:42 -08:00
Marek Olšák
847d0a393d radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-02 16:46:22 +01:00
Jon Turney
b3a1d9588e travis: add osx autotools build
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
4701379d96 travis: pip -> pip2
On travis, for OSX, python2 from homebrew is pre-installed. per [1]:

 python points to the macOS system Python (with no manual PATH modification)
 python2 points to Homebrew’s Python 2.7.x (if installed)
 python3 points to Homebrew’s Python 3.x (if installed)
 pip doesn't exist
 pip2 points to Homebrew’s Python 2.7.x’s pip (if installed)
 pip3 points to Homebrew’s Python 3.x’s pip (if installed)

We will end up using 'python2' for building mesa.

Just use 'pip2' instead of 'pip', as that seems to work for all platforms on
travis.

[1] https://docs.brew.sh/Homebrew-and-Python.html

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
7d1ec6d6a9 travis: conditionalize building of prerequisites on if OS=linux
Use a '|' YAML literal block to avoid the convoluted syntax needed to put
the entire conditional on a single line.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
63041ba613 glx/test: fix building for osx
An additional stub for applegl_create_context() is needed.
Cannot test indirect API as it's not built on osx, currently.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Andres Gomez
4761a8fea6 i965: check if upload is 0 explicitly, when downsizing a format
downsize_format_if_needed takes an integer as number of uploads
parameter. Hence, let's do an integer comparison instead of a boolean
check, since that is confusing.

Since we are at it, fix a couple of wrongly tabbed indents.

Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-02-02 16:32:30 +02:00
Marek Olšák
51d36f5e02 mesa: don't flag _NEW_COLOR for KHR adv.blend if prog constant doesn't change
This only affects drivers that set DriverFlags.NewBlend.

v2: - fix typo advanded -> advanced
    - return "enum gl_advanced_blend_mode" from
      _mesa_get_advanced_blend_sh_constant
    - don't call FLUSH_VERTICES twice

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-02 15:06:47 +01:00
Samuel Pitoiset
df1d5174fc ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load
The old one generates useless instructions in there, found while
comparing geometry shaders between RadeonSI and RADV.

This improves all Vulkan demos that use geometry shaders, +4%
for deferredshadows, +9% for viewportarray, +7% for
geometryshader on Polaris10.

This seems to also improve DOW3 a little bit (+1%).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by:  Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 12:32:21 +01:00
Dave Airlie
f9c121c420 r600/eg: add crap indirect compute support.
I think the cp packets can be made to work, but I think it might
need a kernel change, so for now just do the worst thing.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-02 16:50:18 +10:00
Jason Ekstrand
2f7205be47 i965: Call prepare_external after implicit window-system MSAA resolves
This fixes some rendering corruption in a couple of Android apps that
use window-system MSAA.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-01 21:45:25 -08:00
Roland Scheidegger
c2f0e08857 r600: don't do stack workarounds for hemlock
By the looks of it it seems hemlock is treated separately to cypress, but
certainly it won't need the stack workarounds cedar/redwood (and
seemingly every other eg chip except cypress/juniper) need.
(Discovered by accident.)

Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-02 01:46:43 +01:00
Dave Airlie
8fa5aade43 r600: initial attempt at gl_HelperInvocation (v3)
This passes the CTS and piglit tests.

This also disables sb for helper invocations until it doesn't
mess up the VPM flags.

Thanks to Ilia and Glenn for advice, and Roland for working
out the working evergreen path.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-02 09:46:05 +10:00
Bas Nieuwenhuizen
2ffe395cba radv: Don't expose VK_KHX_multiview on android.
deqp does not allow any KHX extensions, and since deqp is included
in android-cts, android does not allow any khx extensions.

So disable VK_KHX_multiview on android.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-01 23:32:48 +01:00
Mathias Fröhlich
5b3d58520f vbo: Simplify input array distribution for dlist type draws.
Using the newly introduced VAO array maps, we can
simplify vbo_bind_vertex_list.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:08 +01:00
Mathias Fröhlich
fb10a7b7b0 vbo: Simplify input array distribution for imm type draws.
Using the newly introduced VAO array maps, we can
simplify vbo_exec_bind_arrays.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:08 +01:00
Mathias Fröhlich
44b1454b96 vbo: Simplify input array distribution for array type draws.
Using the newly introduced VAO state variable, we can
simplify recalculate_input_bindings.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:07 +01:00
Mathias Fröhlich
3d4fb879dd vbo: Use static const VERT_ATTRIB->VBO_ATTRIB maps.
Instead of each context having its own map instance for
this purpose, use a global static const map.

v2: s,unsigned char,GLubyte,g
    s,_VP_MODE_MAX,VP_MODE_MAX,g
    Change comment style.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:07 +01:00
Mathias Fröhlich
b4fd63015a mesa: Track position/generic0 aliasing in the VAO.
Since the first material attribute no longer aliases with
the generic0 attribute, only aliasing between generic0 and
position is left and entirely dependent on the enabled
state of the VAO. So introduce a gl_attribute_map_mode
in the VAO that is used to track how the position
and the generic 0 attribute alias.
Provide a static const array that can be used to
map from vertex program input indices to VERT_ATTRIB_*
indices. The outer dimension of the array is meant to
be indexed directly by the new VAO member variable.
Also provide methods on the VAO to convert bitmasks of
VERT_BIT's from the VAO numbering to the vertex processing
inputs numbering.

v2: s,unsigned char,GLubyte,g
    s,_ATTRIBUTE_MAP_MODE_MAX,ATTRIBUTE_MAP_MODE_MAX,g
    Change comment style, add comments.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
186f03cfb0 mesa: Put materials at the end of the generic block.
The materials are now moved to the end of the
generic attributes block to the range 4-15.

Before, the way the position and generic 0 attributes
were handled depended on the presence and kind of
the currently attached vertex program. With this
change the way the position attribute and the generic 0
attribute are treated only depends on the enabled
flag of those two arrays.
This will later help to untangle the update dependencies
between enabled arrays and shader inputs.

v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
38b41fd718 mesa: Use defines for the aliased material array attributes.
Instead of just assuming that the material attributes
overlap with the generic attributes 0-12, give
them symbolic defines so that we can more easily move them
to another range.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
f37e29ac22 vbo: Correctly handle attribute offsets in dlist draw.
When executing a display list draw, for the offset
list to be correct, the offset computation needs to
accumulate all attribute size values in order.
Specifically, if we are shuffling around the position
and generic0 attributes, we may violate the order or
if we do not walk the generic vbo attributes we may
skip some of the attributes.
Even if this is an unlikely use case, we can fix it
by precomputing the offsets on the full attribute list
and storing the full offset list in the display list node.
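
A small C sketch of the precomputation, assuming an interleaved vertex where
absent attributes have size 0. The `sketch_*` names and the four-entry
attribute list are hypothetical; the real attribute list is much longer.

```c
#include <stddef.h>

#define SKETCH_ATTRIB_MAX 4

/* Walk the full attribute list in order and record where each attribute
 * starts. Returns the accumulated size of one vertex in bytes. */
static inline size_t
sketch_compute_offsets(const size_t size[SKETCH_ATTRIB_MAX],
                       size_t offset[SKETCH_ATTRIB_MAX])
{
    size_t running = 0;

    for (size_t i = 0; i < SKETCH_ATTRIB_MAX; i++) {
        offset[i] = running; /* attribute i starts after all earlier ones */
        running += size[i];  /* absent attributes contribute 0 bytes */
    }
    return running;
}

/* Convenience lookup: offset of one attribute, computed from the full list
 * so that reordering or skipping attributes later cannot change it. */
static inline size_t
sketch_offset_of(const size_t size[SKETCH_ATTRIB_MAX], size_t attr)
{
    size_t offset[SKETCH_ATTRIB_MAX];

    sketch_compute_offsets(size, offset);
    return offset[attr];
}
```

Because every offset is derived from the complete, ordered list, a consumer
that later visits the attributes in a different order still reads the same
offsets.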

v2: Formatting fix
v3: Rebase

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:05 +01:00
Brian Paul
7a044ef68b gallivm/llvmpipe: add const qualifiers on sampler variables
Once a lp_build_sampler_soa or lp_build_sampler_aos object is created,
it should never be modified.  Found by inspection.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-01 14:19:58 -07:00
Brian Paul
1bdbeae17c vbo: change an argument in vbo_draw_indirect_prims()
In vbo_draw_indirect_prims() pass the 'indirect_data' argument to
vbo->draw_prims().  All the callers are passing ctx->DrawIndirectBuffer
so this should be no functional change.  Add a (temporary) assertion to
be sure.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
1b7ad3ae97 vbo: add comments on the VBO draw function typedefs
And rename indirect_params -> indirect_draw_count_buffer and
indirect_params_offset -> indirect_draw_count_offset to be more
specific.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
c7bf05c833 vbo: s/drawcount/drawcount_offset
This parameter (from the glMultiDrawArraysIndirectCountARB function)
is poorly named.  It's an offset into the buffer which contains the
number of primitives to draw.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
b0a2f38db9 vbo: use vbo local var for draw call in vbo_save_playback_vertex_list()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
84c3641864 svga: remove unneeded #includes in svga_pipe_draw.c
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
fa98730bf3 svga: whitespace/formatting fixes in svga_pipe_draw.c
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
7a1401938b svga: clean up retry_draw_range_elements(), retry_draw_arrays()
Get rid of a bunch of goto spaghetti.  Remove unneeded do_retry parameter.
No Piglit changes.  Also tested w/ Google Earth and other apps.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
c744289552 svga: remove unused min/max_index params to draw_vgpu10()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Eric Anholt
06858c7348 broadcom/vc5: Fix image_h setup for both loads and stores.
The image_h for the tiling algorithm needs to be the padded-to-a-uifblock
height of the level, not the unpadded height or the height of level 0.
Fixes some cases of KHR-GLES3.texture_repeat_mode.* and
depthstencil-render-miplevels.
2018-02-01 11:02:29 -08:00
Eric Anholt
5329f35ea1 broadcom/vc5: Add appropriate height padding for bank conflicts.
I thought I didn't need this because I was doing level-0-always-UIF and
that the pad there would propagate down, but it turns out that for level 1
the padding ends up being chosen by the HW.  This brings us closer to
being able to turn on UIF XOR for increased performance, as well.
2018-02-01 11:02:29 -08:00
Eric Anholt
dea902c933 broadcom/vc5: Simplify separate stencil surface setup.
If we just make another gallium surface for the separate stencil, it's a
lot easier to keep track of which set of fields we're using in RCL setup.

This also incidentally fixes a little bug in setting up the surface's
padded height for separate stencil when the UIF-ness changes at different
levels of Z versus stencil.
2018-02-01 11:02:29 -08:00
Eric Anholt
7239b3edbe broadcom/vc5: Rename the UIFCFG register in the UAPI.
This matches the naming of the other hub regs we get, and I don't know for
sure if UIFCFG will be the same register between the hub and the cores on
all versions.
2018-02-01 11:02:29 -08:00
Eric Anholt
353b42ccc7 broadcom/vc5: Fix a segfault on mix of booleans.
We don't have a src1 to look up if the compare instruction is "i2b".
2018-02-01 11:02:29 -08:00
Eric Anholt
eb765394c2 broadcom/vc5: Skip over missing color buffers for a couple of checks.
Fixes crashes in piglit alpha-to-coverage-no-draw-buffer-zero 2
2018-02-01 11:02:29 -08:00
Eric Anholt
aec066c7aa broadcom/vc5: Add the missing PIPE_CAP_FENCE_SIGNAL.
2018-02-01 11:02:29 -08:00

Baldur Karlsson
030821a873 mesa: fix query of GL_TEXTURE_COMPRESSION_HINT_ARB
Fixes: f96a69f916 ("mesa: replace GLenum with GLenum16 in common
structures (v4)")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104908
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 11:58:02 -07:00
Lucas Stach
0c71a19fe4 renderonly: fix dumb BO allocation for non 32bpp formats
Take into account the resource format, instead of applying a hardcoded
32bpp. This not only over-allocates 16bpp formats, but also results in
a wrong stride being filled into the handle.
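
As an illustration of why a hardcoded 32bpp is wrong for narrower
formats, the sketch below computes the minimal row stride of a linear
buffer from its bits per pixel. The helper name is hypothetical, and a
real dumb BO allocation may report a larger, aligned pitch; this only
shows the arithmetic.

```python
def linear_stride(width_px, bits_per_pixel):
    """Minimal bytes per row for a linear buffer, with no extra alignment."""
    # Round up so formats whose row is not a whole number of bytes still fit.
    return (width_px * bits_per_pixel + 7) // 8


width = 640
stride_16bpp = linear_stride(width, 16)      # e.g. a 16bpp RGB565-style format
stride_assumed_32 = linear_stride(width, 32)  # what a hardcoded 32bpp yields
print(stride_16bpp, stride_assumed_32)
```

For a 16bpp format the hardcoded value is twice the real stride, which
both wastes memory and misdescribes the layout to whoever imports the
handle.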

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-01 19:36:17 +01:00
Kenneth Graunke
85ec7abc3f intel/decoder: Fix control / evaluation label mixup.
Trivial.  DS is TES, HS is TCS.
2018-02-01 09:44:15 -08:00
Kenneth Graunke
c3cd2aac27 i965: Bump official kernel requirement to Linux v3.9.
In commit 3f353342a6 (present in 17.3.0)
we started unconditionally using I915_EXEC_NO_RELOC, which was
introduced in Linux v3.9.  ChromeOS kernel 3.8 has backported this,
so it should work too.

Running on older kernels would likely result in every single batch
being rejected by the kernel, which is pretty catastrophic.  Yet, it
appears that nobody noticed.  So, let's just bump the official
requirement and move forward ever so slowly.

Fixes: 3f353342a6 ("i965: Use I915_EXEC_NO_RELOC")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 07:58:58 -08:00
Marc Dietrich
4c5f0b4fd4 meson: don't install windows headers on non-windows platforms
Only dive into the windows subdir if windows platform is selected.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Fixes: 5ef75cb02b "meson: build src/glx/windows"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-01 15:33:02 +00:00
Marek Olšák
71c6f64e54 radeonsi: use ac_build_buffer_load_format for image buffer loads
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
b0a6053a99 ac/nir: use ac_build_buffer_load_format for image buffer loads
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
bac9fa9f17 ac: add glc parameter to ac_build_buffer_load_format
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
be973ed21f radeonsi: load the right number of components for VS inputs and TBOs
The supported counts are 1, 2, 4. (3=4)

The following snippet loads float, vec2, vec3, and vec4:

Before:
    buffer_load_format_x v9, v4, s[0:3], 0 idxen          ; E0002000 80000904
    buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen  ; E00C2000 80020005
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen   ; E00C2000 80010507

After:
    buffer_load_format_x v10, v4, s[0:3], 0 idxen         ; E0002000 80000A04
    buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen    ; E0042000 80020805
    buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen   ; E00C2000 80010307
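
The "(3=4)" rule above can be sketched as a small mapping from the
number of components the shader needs to the number that is actually
loaded. The function name is hypothetical and this is Python rather
than the driver's C; it only restates the rule from this message.

```python
def load_component_count(num_components):
    """Map a requested component count to a supported load width (1, 2 or 4)."""
    if not 1 <= num_components <= 4:
        raise ValueError("expected 1 to 4 components")
    # Three components are not supported directly, so a vec3 loads four.
    return 4 if num_components == 3 else num_components


# float, vec2, vec3, vec4
print([load_component_count(n) for n in (1, 2, 3, 4)])
```

This matches the "After" listing: one _x load, one _xy load and two
_xyzw loads.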

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
472361dd7e radeonsi: remove unused si_shader_context members
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Jon Turney
d3540b405b glx/apple: locate dispatch table functions to wrap by name
Avoid reaching into the dispatch table internals (and thus having to deal
with the complexities of remap etc.) by identifying functions to wrap by
name.

See:
https://lists.freedesktop.org/archives/mesa-dev/2015-June/086721.html et seq.
https://bugs.freedesktop.org/show_bug.cgi?id=90311

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:14:08 +00:00
Jon Turney
b37b7b42dc glx/apple: include util/debug.h for env_var_as_boolean prototype
mesa/src/glx/glxcmds.c:1295:21: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
mesa/src/glx/apple/apple_visual.c:85:28: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:14:02 +00:00
Jon Turney
f8ed9f24d5 osx: ld doesn't support --build-id
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:13:56 +00:00
Jon Turney
7ad7a07c88 configure: Default to gbm=no on osx
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:13:00 +00:00
Andres Rodriguez
bbd00844a2 mesa: remove usage of alloca in externalobjects.c v4
Don't want an overly large numBufferBarriers/numTextureBarriers to blow
up the stack.

v2: handle malloc errors
v3: fix patch
v4: initialize texObjs/bufObjs

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
2018-02-01 09:48:04 -05:00
Samuel Pitoiset
2ef5ce1198 radv: do not insert shaders in cache when it's disabled
When the application doesn't provide its own pipeline cache,
the driver uses an in-memory cache, but it shouldn't insert any
entries when the cache is explicitly disabled by the user.

Found while running my experimental pipeline-db tool with a
ton of shaders, the memory footprint was just huge, and sometimes
the process was even killed...

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:40:11 +01:00
Samuel Pitoiset
4922e7f25c radv: use separate bindings for graphics and compute descriptors
The Vulkan spec says:

   "pipelineBindPoint is a VkPipelineBindPoint indicating whether
    the descriptors will be used by graphics pipelines or compute
    pipelines. There is a separate set of bind points for each of
    graphics and compute, so binding one does not disturb the other."
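
A minimal sketch of what "separate bind points" means for the tracked
state, assuming one descriptor table per bind point. The class and
method names are hypothetical and do not mirror radv's structures.

```python
from enum import Enum


class BindPoint(Enum):
    GRAPHICS = 0
    COMPUTE = 1


class CommandBufferState:
    """Tracks bound descriptor sets separately for each bind point."""

    def __init__(self):
        # One independent {set index: descriptor set} table per bind point.
        self.descriptors = {bind_point: {} for bind_point in BindPoint}

    def bind_descriptor_set(self, bind_point, set_index, descriptor_set):
        self.descriptors[bind_point][set_index] = descriptor_set


state = CommandBufferState()
state.bind_descriptor_set(BindPoint.GRAPHICS, 0, "graphics_set")
state.bind_descriptor_set(BindPoint.COMPUTE, 0, "compute_set")
# Binding set 0 for compute left the graphics binding untouched.
print(state.descriptors[BindPoint.GRAPHICS][0])
```

With a single shared table, the second bind would have overwritten the
first, which is the behaviour the quoted text rules out.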

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104732
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:37:09 +01:00
Samuel Pitoiset
cf224014dd radv: store the bind point when creating descriptors with templates
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:37:07 +01:00
Dave Airlie
7ea15a36fb r600/eg: make sure we allow vpm bit on other CF ops.
The vpm bit wasn't being applied to the push/pop instructions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 13:41:32 +10:00
Timothy Arceri
4d982ae2c7 gallium/st/clover: remove unused PIPE_SHADER_IR_LLVM
This has been unused since 100796c15c.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-02-01 13:56:34 +11:00
Dave Airlie
0491d5425f r600/sb: just add some missing debug bits
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 12:06:40 +10:00
Dave Airlie
df155a73f4 r600: fix buffer resinfo opcode translation.
The vtx operations never got translated, so things worked by
0 being equal to 0, translate them so we can use the proper buffer
resinfo code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 11:59:55 +10:00
Timothy Arceri
679e4e7a46 st/glsl_to_nir: add more nir opts to st_nir_opts()
All of the current gallium nir drivers use these optimisations but
they do so in their backends. Having these called in the backend
only can cause a number of problems:

- Shader compile times are greater because the opts need to do
  significant passes over all shader variants.
- The shader cache is partially defeated due to the significant
  optimisation passes over variants.
- We might miss out on nir linking optimisation opportunities.

Adding these passes to st_nir_opts() alleviates these problems.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-01 09:42:57 +11:00
Andres Gomez
5a7aba2e0a i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the <size> passed
to the VertexAttribL* commands was <= 2.

In 61a8a55f55 ("i965/gen8: Fix vertex attrib upload for dvec3/4
shader inputs"), for gen8+ we needed to determine if the attrib was
dual slot to emit 128 or 256-bit, independently of the VAO size.

Similarly, for gen < 8 we also need to determine whether the attrib is
dual slot to force the emission of 256-bits through 2 uploads.

Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
second upload to fill these unspecified components with zeros, as we
also do for gen8+.
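
The slot decision described above can be sketched as follows, assuming
that a slot holds 128 bits and that only the shader input type (not the
<size> given to VertexAttribL*) decides how much is emitted. The
function names are hypothetical.

```python
SLOT_BITS = 128


def is_dual_slot(shader_components, bits_per_component=64):
    """True when the shader input does not fit in a single 128-bit slot."""
    return shader_components * bits_per_component > SLOT_BITS


def emitted_bits(shader_components, vao_size):
    """Bits to emit for a 64-bit attribute; vao_size is deliberately ignored."""
    return 2 * SLOT_BITS if is_dual_slot(shader_components) else SLOT_BITS


# A dvec3 input fed with <size> = 2 still needs 256 bits (two uploads).
print(emitted_bits(3, 2), emitted_bits(2, 2))
```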

Fixes the following test on Haswell:
KHR-GL46.vertex_attrib_binding.basic-inputL-case1

v2: Added more inline comments to explain why we are using
    ISL_FORMAT_R32_FLOAT and its consequences, as requested by
    Alejandro and Antía.

Fixes: 75968a668e ("i965/gen7: expose OpenGL 4.2 on Haswell when
supported")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Antia Puentes <apuentes@igalia.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-31 22:50:06 +02:00
Kenneth Graunke
ab1f2e6bc4 i965: Make texture validation code use texture objects, not units.
This requires moving the _MaxLevel handling up to the callers.  Another
user of intel_finalize_mipmap_tree will be added later that depends on
_MaxLevel not being modified.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-31 11:33:52 -08:00
Kenneth Graunke
0a2e878c69 i965: Pass tObj into intel_update_max_level instead of intel_obj.
We want both anyway, but this will simplify things a tiny bit in an
upcoming patch.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-31 11:33:52 -08:00
Kenneth Graunke
876f1537e9 i965: Delete more misleading comments.
brw_bo_wait_rendering used to take a brw_context pointer for perf_debug
messages about stalls.  Chris eliminated that in 833108ac14.
This message about passing NULL to avoid those warnings is no longer
relevant, and just adds confusion.  So, drop it.
2018-01-31 11:33:52 -08:00
Andres Rodriguez
8996610acb docs/features: mark EXT_semaphore(_fd) as DONE v2
Support for these extensions is available in radeonsi.

v2: also updated relnotes

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
2018-01-31 12:31:40 -05:00
Brian Paul
d32c22a13f st/mesa: whitespace, formatting fixes in st_glsl_to_tgsi.cpp
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
3b3d8275d8 st/mesa: s/int/GLenum/ in st_glsl_to_tgsi.cpp
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
1882ec4ff7 svga: use opcode local var to simplify some code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
338c35c427 svga: s/unsigned/VGPU10_OPCODE_TYPE/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Samuel Pitoiset
a097a6f519 radv: do not dump meta shader stats
That's quite useless and that pollutes the output.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-31 14:10:26 +01:00
Samuel Pitoiset
26cc3e74b9 ac/nir: fix emission of ffract for 64-bit
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-31 14:10:24 +01:00
Eric Engestrom
2f0db33527 meson: dedup gallium-xa logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
fa5d616bf9 meson: dedup gallium-va logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
86168ed31c meson: dedup gallium-omx logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
724916c8a8 meson: dedup gallium-xvmc logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
992af0a4b8 meson: dedup gallium-vdpau logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Antia Puentes
0da434fb47 Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format"
This reverts commit 513c2263cb.

_mesa_base_fbo_format is used to validate the internalformat
passed to RenderbufferStorage, about which the OpenGL 4.6 specification says:

"An INVALID_ENUM error is generated if internalformat is not one of the
color-renderable, depth-renderable, or stencil-renderable formats defined
in section 9.4."

RGB9_E5 format is not renderable, as stated in the same specification
(Bug 9338).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794

Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-01-31 12:06:00 +01:00
Michel Dänzer
1cf1bf32ef winsys/radeon: Compute is_displayable in surf_drm_to_winsys
It was always 0, breaking (at least) DRI3 with Xwayland.

Bugzilla: https://bugs.freedesktop.org/104306
Fixes: 5f2073be32 ("ac/surface: add ac_surface::is_displayable")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:53:58 +01:00
Matthew Nicholls
ef272b161e radv: remove predication on cache flushes
This can lead to a situation where cache flushes could get conditionally
disabled while still clearing the flush_bits, and thus flushes due to
application pipeline barriers may never get executed.

Fixes: a6c2001ace (radv: add support for cmd predication.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 13:37:18 +10:00
Brian Paul
1ea9efd2f8 mesa: fix broken glGet*(GL_POLYGON_MODE) query
This reverts part of the patch which introduced the GLenum16 change.
Fixes a conform regression found by Roland.

Fixes: f96a69f916 ("mesa: replace GLenum with GLenum16 in
common structures (v4)")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-30 20:32:37 -07:00
Dave Airlie
49c61d8b84 virgl: also remove dimension on indirect.
This fixes some dEQP tests that generated bad shaders.

Fixes: b6f6ead19 (virgl: drop const dimensions on first block.)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 12:24:11 +10:00
Marek Olšák
fdf01d0244 radeonsi: remove DBG_PRECOMPILE
it's useless and shader-db stats only report the main shader part.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
148b48646b radeonsi: print shader-db stats for main parts, not final binaries
This is needed to get shader-db stats for LS,HS,ES,GS stages on gfx9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
c02c9ee550 radeonsi: move max_simd_waves computation into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
a7311cd7ee mesa: fix glGet MAX_VERTEX_ATTRIB queries
Broken by f96a69f916

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-31 03:21:20 +01:00
Jason Ekstrand
97938dac36 anv/cmd_buffer: Re-emit the pipeline at every subpass
If we ever hit this edge-case, it can theoretically cause problems for
CNL because we could end up changing render targets without re-emitting
3DSTATE_MULTISAMPLE which is part of the pipeline.  Just get rid of the
edge case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-30 17:16:33 -08:00
Ian Romanick
ee63933a73 nir: Distribute binary operations with constants into bcsel
This was specifically designed to simplify 1+mix(0, a-1, condition) to
mix(1, a, condition) by pushing the 1+ inside.
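
A small Python check of the identity behind this change, using integer
operands so the arithmetic is exact. Here bcsel(c, x, y) selects x when
c is true, so mix(0, a-1, condition) corresponds to
bcsel(condition, a-1, 0). The helper names are illustrative only and
are not the NIR pattern syntax.

```python
def bcsel(cond, if_true, if_false):
    """Select if_true when cond holds, otherwise if_false."""
    return if_true if cond else if_false


def before(cond, a):
    # 1 + mix(0, a - 1, cond): the add sits outside the select.
    return 1 + bcsel(cond, a - 1, 0)


def after(cond, a):
    # The add is pushed into both arms; each arm then folds to a or 1.
    return bcsel(cond, 1 + (a - 1), 1 + 0)


for cond in (True, False):
    for a in range(-3, 4):
        assert before(cond, a) == after(cond, a) == bcsel(cond, a, 1)
print("identity holds for the sampled values")
```

Once the constant is inside the select, both arms simplify, which is
where the saved instruction comes from.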

Skylake, Broadwell, and Haswell had similar results.  Skylake shown.
total instructions in shared programs: 14521753 -> 14521716 (<.01%)
instructions in affected programs: 10619 -> 10582 (-0.35%)
helped: 51
HURT: 14
helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1
helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95%
HURT stats (abs)   min: 1 max: 11 x̄: 2.57 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32%
95% mean confidence interval for instructions value: -1.31 0.17
95% mean confidence interval for instructions %-change: -0.80% -0.27%
Inconclusive result (value mean confidence interval includes 0).
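
How the verdict lines in these reports relate to the two confidence
intervals can be sketched as below. The rule is inferred from the
results quoted in this log, not taken from the shader-db source; the
function names and the wording of the fall-through cases are
assumptions.

```python
def includes_zero(interval):
    low, high = interval
    return low <= 0.0 <= high


def verdict(metric, value_ci, pct_ci):
    """Summarize a result from its value and %-change 95% intervals."""
    if includes_zero(value_ci):
        return ("Inconclusive result "
                "(value mean confidence interval includes 0).")
    if includes_zero(pct_ci):
        return ("Inconclusive result "
                "(%-change mean confidence interval includes 0).")
    if value_ci[1] < 0 and pct_ci[1] < 0:
        return f"{metric} are helped."
    if value_ci[0] > 0 and pct_ci[0] > 0:
        return f"{metric} are HURT."  # assumed wording
    return "Inconclusive result."     # assumed: intervals disagree in sign


# Intervals copied from results quoted in this log.
print(verdict("Instructions", (-1.31, 0.17), (-0.80, -0.27)))
print(verdict("Instructions", (-3.17, -0.07), (-1.13, -0.52)))
print(verdict("Cycles", (-32.07, -3.53), (-0.86, 0.10)))
```

Note that a result is reported as inconclusive as soon as either
interval straddles zero, even when the other one is entirely negative.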

total cycles in shared programs: 533000205 -> 533003533 (<.01%)
cycles in affected programs: 110610 -> 113938 (3.01%)
helped: 43
HURT: 28
helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16
helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67%
HURT stats (abs)   min: 2 max: 3066 x̄: 160.50 x̃: 14
HURT stats (rel)   min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62%
95% mean confidence interval for cycles value: -43.81 137.56
95% mean confidence interval for cycles %-change: -1.47% 3.60%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 10018840 -> 10018713 (<.01%)
instructions in affected programs: 9431 -> 9304 (-1.35%)
helped: 51
HURT: 3
helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1
helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81%
HURT stats (abs)   min: 1 max: 12 x̄: 4.67 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22%
95% mean confidence interval for instructions value: -5.36 0.66
95% mean confidence interval for instructions %-change: -1.66% -0.46%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 87571944 -> 87572785 (<.01%)
cycles in affected programs: 117234 -> 118075 (0.72%)
helped: 42
HURT: 23
helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30
helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74%
HURT stats (abs)   min: 1 max: 2341 x̄: 131.35 x̃: 10
HURT stats (rel)   min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61%
95% mean confidence interval for cycles value: -61.05 86.93
95% mean confidence interval for cycles %-change: -3.47% -0.33%
Inconclusive result (value mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10542933 -> 10542844 (<.01%)
instructions in affected programs: 11487 -> 11398 (-0.77%)
helped: 52
HURT: 3
helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72%
HURT stats (abs)   min: 1 max: 11 x̄: 4.33 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22%
95% mean confidence interval for instructions value: -3.17 -0.07
95% mean confidence interval for instructions %-change: -1.13% -0.52%
Instructions are helped.

total cycles in shared programs: 146098397 -> 146097094 (<.01%)
cycles in affected programs: 128140 -> 126837 (-1.02%)
helped: 47
HURT: 8
helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18
helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95%
HURT stats (abs)   min: 1 max: 16 x̄: 8.75 x̃: 9
HURT stats (rel)   min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34%
95% mean confidence interval for cycles value: -37.49 -9.90
95% mean confidence interval for cycles %-change: -1.22% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886711 -> 7886509 (<.01%)
instructions in affected programs: 10425 -> 10223 (-1.94%)
helped: 50
HURT: 2
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89%
95% mean confidence interval for instructions value: -8.05 0.28
95% mean confidence interval for instructions %-change: -1.83% -0.26%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 178115324 -> 178114612 (<.01%)
cycles in affected programs: 765726 -> 765014 (-0.09%)
helped: 39
HURT: 1
helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8
helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -32.07 -3.53
95% mean confidence interval for cycles %-change: -0.86% 0.10%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857762 -> 4857661 (<.01%)
instructions in affected programs: 5523 -> 5422 (-1.83%)
helped: 25
HURT: 1
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86%
95% mean confidence interval for instructions value: -9.99 2.22
95% mean confidence interval for instructions %-change: -2.01% 0.08%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 122179674 -> 122179194 (<.01%)
cycles in affected programs: 530162 -> 529682 (-0.09%)
helped: 22
HURT: 1
helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7
helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -46.56 4.82
95% mean confidence interval for cycles %-change: -1.20% 0.36%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:15 -08:00
Ian Romanick
03fb13f646 nir: Rearrange logic op-compounded integer compares
Skylake and Broadwell had similar results.  Skylake shown.
total instructions in shared programs: 14521769 -> 14521753 (<.01%)
instructions in affected programs: 8782 -> 8766 (-0.18%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.23% -0.16%
Instructions are helped.

total cycles in shared programs: 533000376 -> 533000205 (<.01%)
cycles in affected programs: 447035 -> 446864 (-0.04%)
helped: 9
HURT: 9
helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40
helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09%
HURT stats (abs)   min: 1 max: 52 x̄: 16.78 x̃: 10
HURT stats (rel)   min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12%
95% mean confidence interval for cycles value: -25.07 6.07
95% mean confidence interval for cycles %-change: -0.08% 0.27%
Inconclusive result (value mean confidence interval includes 0).

No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
053be9f020 nir: Rearrange and-compounded float compares
If both comparisons are used as sources for instructions other than the
iand, this transformation is detrimental.  If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.
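
A sketch of the kind of rewrite this describes, assuming the pattern is
two less-than compares that share one operand and are combined with a
logical and. The exact set of patterns in the patch is not shown in
this log, and NaN inputs are deliberately left out of the check.

```python
import itertools


def compound(a, b, c):
    # Two compares joined by an and: (a < b) && (a < c).
    return (a < b) and (a < c)


def rearranged(a, b, c):
    # One min and one compare: a < min(b, c).
    return a < min(b, c)


def compound_gt(a, b, c):
    return (b < a) and (c < a)


def rearranged_gt(a, b, c):
    return max(b, c) < a


values = [-1.5, 0.0, 0.5, 2.0, float("inf")]
for a, b, c in itertools.product(values, repeat=3):
    assert compound(a, b, c) == rearranged(a, b, c)
    assert compound_gt(a, b, c) == rearranged_gt(a, b, c)
print("rewrites agree on all sampled non-NaN inputs")
```

When b and c are both constants, min(b, c) is itself a constant, so the
rewritten form costs a single compare.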

It is interesting to me that on Iron Lake only 81 shaders have
instruction counts changed, but 726 shaders have cycle counts changed.

shader-db results:

Skylake
total instructions in shared programs: 14525728 -> 14521017 (-0.03%)
instructions in affected programs: 1164726 -> 1160015 (-0.40%)
helped: 1692
HURT: 5
helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2
helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs)   min: 1 max: 12 x̄: 3.20 x̃: 1
HURT stats (rel)   min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86%
95% mean confidence interval for instructions value: -3.52 -2.03
95% mean confidence interval for instructions %-change: -0.86% -0.74%
Instructions are helped.

total cycles in shared programs: 533115449 -> 532991404 (-0.02%)
cycles in affected programs: 119401803 -> 119277758 (-0.10%)
helped: 1145
HURT: 467
helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18
helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42%
HURT stats (abs)   min: 1 max: 1590 x̄: 92.15 x̃: 15
HURT stats (rel)   min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39%
95% mean confidence interval for cycles value: -122.16 -31.74
95% mean confidence interval for cycles %-change: -0.94% -0.57%
Cycles are helped.

total spills in shared programs: 9597 -> 9534 (-0.66%)
spills in affected programs: 403 -> 340 (-15.63%)
helped: 1
HURT: 1

total fills in shared programs: 13904 -> 13790 (-0.82%)
fills in affected programs: 1627 -> 1513 (-7.01%)
helped: 2
HURT: 1

LOST:   0
GAINED: 2

Broadwell
total instructions in shared programs: 14816966 -> 14812590 (-0.03%)
instructions in affected programs: 1499885 -> 1495509 (-0.29%)
helped: 1672
HURT: 15
helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs)   min: 1 max: 21 x̄: 9.20 x̃: 8
HURT stats (rel)   min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53%
95% mean confidence interval for instructions value: -3.14 -2.05
95% mean confidence interval for instructions %-change: -0.85% -0.73%
Instructions are helped.

total cycles in shared programs: 559353622 -> 559345595 (<.01%)
cycles in affected programs: 139893703 -> 139885676 (<.01%)
helped: 921
HURT: 697
helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18
helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87%
HURT stats (abs)   min: 1 max: 2370 x̄: 178.03 x̃: 38
HURT stats (rel)   min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14%
95% mean confidence interval for cycles value: -59.64 49.72
95% mean confidence interval for cycles %-change: -1.02% -0.66%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 78902 -> 78861 (-0.05%)
spills in affected programs: 2418 -> 2377 (-1.70%)
helped: 1
HURT: 11

total fills in shared programs: 83782 -> 83678 (-0.12%)
fills in affected programs: 3515 -> 3411 (-2.96%)
helped: 2
HURT: 11

LOST:   0
GAINED: 5

Haswell and Ivy Bridge had similar results. Haswell shown.
total instructions in shared programs: 9033898 -> 9032010 (-0.02%)
instructions in affected programs: 308064 -> 306176 (-0.61%)
helped: 921
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1
helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for instructions value: -2.21 -1.87
95% mean confidence interval for instructions %-change: -0.88% -0.68%
Instructions are helped.

total cycles in shared programs: 84628949 -> 84620520 (<.01%)
cycles in affected programs: 2164913 -> 2156484 (-0.39%)
helped: 518
HURT: 359
helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20
helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01%
HURT stats (abs)   min: 1 max: 586 x̄: 36.43 x̃: 8
HURT stats (rel)   min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40%
95% mean confidence interval for cycles value: -15.17 -4.05
95% mean confidence interval for cycles %-change: -0.77% -0.32%
Cycles are helped.

LOST:   0
GAINED: 4

Sandy Bridge
total instructions in shared programs: 10544860 -> 10542933 (-0.02%)
instructions in affected programs: 360019 -> 358092 (-0.54%)
helped: 931
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33%
95% mean confidence interval for instructions value: -2.23 -1.89
95% mean confidence interval for instructions %-change: -0.76% -0.58%
Instructions are helped.

total cycles in shared programs: 146106820 -> 146098397 (<.01%)
cycles in affected programs: 3435047 -> 3426624 (-0.25%)
helped: 572
HURT: 329
helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15
helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33%
HURT stats (abs)   min: 1 max: 1714 x̄: 30.93 x̃: 6
HURT stats (rel)   min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19%
95% mean confidence interval for cycles value: -16.85 -1.85
95% mean confidence interval for cycles %-change: -0.39% -0.01%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 7886925 -> 7886711 (<.01%)
instructions in affected programs: 25763 -> 25549 (-0.83%)
helped: 75
HURT: 6
helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1
helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53%
HURT stats (abs)   min: 1 max: 16 x̄: 6.00 x̃: 1
HURT stats (rel)   min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86%
95% mean confidence interval for instructions value: -3.69 -1.60
95% mean confidence interval for instructions %-change: -2.54% -0.57%
Instructions are helped.

total cycles in shared programs: 178116888 -> 178115324 (<.01%)
cycles in affected programs: 5858790 -> 5857226 (-0.03%)
helped: 484
HURT: 242
helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs)   min: 2 max: 76 x̄: 4.07 x̃: 2
HURT stats (rel)   min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03%
95% mean confidence interval for cycles value: -2.76 -1.55
95% mean confidence interval for cycles %-change: -0.12% 0.01%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857870 -> 4857762 (<.01%)
instructions in affected programs: 13994 -> 13886 (-0.77%)
helped: 39
HURT: 5
helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2
helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48%
HURT stats (abs)   min: 1 max: 16 x̄: 4.00 x̃: 1
HURT stats (rel)   min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86%
95% mean confidence interval for instructions value: -3.86 -1.05
95% mean confidence interval for instructions %-change: -2.61% 0.04%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 122180744 -> 122179674 (<.01%)
cycles in affected programs: 3686646 -> 3685576 (-0.03%)
helped: 273
HURT: 141
helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs)   min: 2 max: 76 x̄: 3.66 x̃: 2
HURT stats (rel)   min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02%
95% mean confidence interval for cycles value: -3.42 -1.75
95% mean confidence interval for cycles %-change: -0.15% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
821e7a4d32 nir: Separate a weird compare with zero to two compares with zero
min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0).

No shader-db changes, but it does prevent 6 to 12 instruction
regressions in the next patch on all measured Intel platforms.
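
The identity can be spot-checked outside the compiler. This is only an
illustrative sketch over finite sample values (NaN handling is ignored);
x and y stand for the sums a+b and c+d, and the helper names are made up.

```python
from itertools import product


def before(x, y):
    """Original form: one compare against the minimum of both sums."""
    return min(x, y) >= 0.0


def after(x, y):
    """Rewritten form: two independent compares with zero."""
    return x >= 0.0 and y >= 0.0


# Finite samples on both sides of zero, including signed zeros.
samples = [-2.5, -1.0, -0.0, 0.0, 0.5, 3.0]
mismatches = [(x, y) for x, y in product(samples, repeat=2)
              if before(x, y) != after(x, y)]
print(mismatches)
```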

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
68420d8322 nir: Simplify min and max of b2f
v2: Rebase on almost 2 years.  Require that one of the arguments to fmin
or fmax be used only once.  This prevents some regressions.
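
The message does not spell out the rewrite. One reading, assumed here, is
that fmin of two b2f values becomes b2f of a logical AND and fmax becomes
b2f of a logical OR, since b2f only yields 0.0 or 1.0. A small sketch
that checks that identity exhaustively (the used-once requirement is a
profitability rule and is not modelled):

```python
from itertools import product


def b2f(flag):
    """Boolean-to-float conversion: False -> 0.0, True -> 1.0."""
    return 1.0 if flag else 0.0


checked = 0
for a, b in product([False, True], repeat=2):
    # The minimum of two 0.0/1.0 values is 1.0 only if both inputs are true.
    assert min(b2f(a), b2f(b)) == b2f(a and b)
    # The maximum of two 0.0/1.0 values is 1.0 if either input is true.
    assert max(b2f(a), b2f(b)) == b2f(a or b)
    checked += 1
print(checked)
```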

shader-db results:

Skylake and Broadwell had similar results.  Skylake shown.
total instructions in shared programs: 14526021 -> 14525913 (<.01%)
instructions in affected programs: 4613 -> 4505 (-2.34%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4
helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42%

total cycles in shared programs: 533118710 -> 533118403 (<.01%)
cycles in affected programs: 34334 -> 34027 (-0.89%)
helped: 24
HURT: 0
helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14
helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03%

No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
d8d18516b0 nir: Undo possible damage caused by rearranging or-compounded float compares
shader-db results:

Skylake and Broadwell had similar results (Skylake shown)
total instructions in shared programs: 14525898 -> 14525836 (<.01%)
instructions in affected programs: 1964 -> 1902 (-3.16%)
helped: 14
HURT: 0
helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1
helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86%
95% mean confidence interval for instructions value: -9.46 0.60
95% mean confidence interval for instructions %-change: -3.97% -0.24%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533119892 -> 533115756 (<.01%)
cycles in affected programs: 96061 -> 91925 (-4.31%)
helped: 13
HURT: 1
helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300
helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42%
HURT stats (abs)   min: 8 max: 8 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46%
95% mean confidence interval for cycles value: -379.43 -211.43
95% mean confidence interval for cycles %-change: -4.84% -3.01%
Cycles are helped.

Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown).
total instructions in shared programs: 9033948 -> 9033898 (<.01%)
instructions in affected programs: 535 -> 485 (-9.35%)
helped: 2
HURT: 0

total cycles in shared programs: 84631402 -> 84628949 (<.01%)
cycles in affected programs: 63197 -> 60744 (-3.88%)
helped: 13
HURT: 2
helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140
helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01%
HURT stats (abs)   min: 4 max: 8 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31%
95% mean confidence interval for cycles value: -253.40 -73.67
95% mean confidence interval for cycles %-change: -4.24% -2.25%
Cycles are helped.

No changes on GM45 or Iron Lake.

v2: Add a couple more tautological compares.  Suggested by Elie.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
3941cba0f7 nir: Be more conservative about rearranging or-compounded compares
If both comparisons are used as sources for instructions other than the
ior, this transformation is detrimental.  If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.
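
For illustration, assume the rearrangement has the form
(a < b) || (a < c) -> a < fmax(b, c). The sketch below checks that form
over finite samples and shows the constant case; it is not the pattern
from the patch, and NaN behaviour is ignored.

```python
from itertools import product

samples = [-2.0, -0.5, 0.0, 0.5, 2.0]


def separate(a, b, c):
    """Two compares joined by a logical OR."""
    return (a < b) or (a < c)


def combined(a, b, c):
    """One compare against the maximum of the non-shared operands."""
    return a < max(b, c)


mismatches = [t for t in product(samples, repeat=3)
              if separate(*t) != combined(*t)]


def with_constants(a):
    """Both non-shared operands are constants, so max() folds to 0.75."""
    return a < max(0.25, 0.75)


print(mismatches, with_constants(0.5), with_constants(1.0))
```

If both original compares have other users they must be kept anyway, so
the combined form only adds the max and a third compare; that is the
detrimental case described above.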

shader-db results:

Skylake
total instructions in shared programs: 14526147 -> 14525898 (<.01%)
instructions in affected programs: 70239 -> 69990 (-0.35%)
helped: 102
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20%
95% mean confidence interval for instructions value: -2.86 -2.02
95% mean confidence interval for instructions %-change: -0.46% -0.31%
Instructions are helped.

total cycles in shared programs: 533120531 -> 533119892 (<.01%)
cycles in affected programs: 994875 -> 994236 (-0.06%)
helped: 76
HURT: 26
helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13
helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18%
HURT stats (abs)   min: 1 max: 167 x̄: 54.62 x̃: 26
HURT stats (rel)   min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39%
95% mean confidence interval for cycles value: -19.44 6.91
95% mean confidence interval for cycles %-change: -0.30% 0.15%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14816005 -> 14815787 (<.01%)
instructions in affected programs: 64658 -> 64440 (-0.34%)
helped: 97
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1
helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20%
95% mean confidence interval for instructions value: -2.62 -1.87
95% mean confidence interval for instructions %-change: -0.45% -0.30%
Instructions are helped.

total cycles in shared programs: 559340386 -> 559339907 (<.01%)
cycles in affected programs: 1090491 -> 1090012 (-0.04%)
helped: 66
HURT: 28
helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16
helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27%
HURT stats (abs)   min: 2 max: 226 x̄: 39.07 x̃: 11
HURT stats (rel)   min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20%
95% mean confidence interval for cycles value: -15.94 5.75
95% mean confidence interval for cycles %-change: -0.35% 0.07%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 1

Haswell
total instructions in shared programs: 9034106 -> 9033948 (<.01%)
instructions in affected programs: 24096 -> 23938 (-0.66%)
helped: 38
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4
helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64%
95% mean confidence interval for instructions value: -4.71 -3.60
95% mean confidence interval for instructions %-change: -0.84% -0.58%
Instructions are helped.

total cycles in shared programs: 84631628 -> 84631402 (<.01%)
cycles in affected programs: 148674 -> 148448 (-0.15%)
helped: 14
HURT: 14
helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12
helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21%
HURT stats (abs)   min: 1 max: 10 x̄: 6.00 x̃: 5
HURT stats (rel)   min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11%
95% mean confidence interval for cycles value: -19.42 3.28
95% mean confidence interval for cycles %-change: -0.59% 0.05%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 10015456 -> 10015293 (<.01%)
instructions in affected programs: 27701 -> 27538 (-0.59%)
helped: 38
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4
helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52%
95% mean confidence interval for instructions value: -4.87 -3.71
95% mean confidence interval for instructions %-change: -0.82% -0.51%
Instructions are helped.

total cycles in shared programs: 87524771 -> 87524569 (<.01%)
cycles in affected programs: 112324 -> 112122 (-0.18%)
helped: 6
HURT: 12
helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20
helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26%
HURT stats (abs)   min: 1 max: 16 x̄: 5.50 x̃: 5
HURT stats (rel)   min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: -29.14 6.69
95% mean confidence interval for cycles %-change: -0.93% 0.08%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 2

Sandy Bridge
total instructions in shared programs: 10545655 -> 10545465 (<.01%)
instructions in affected programs: 37198 -> 37008 (-0.51%)
helped: 42
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4
helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49%
95% mean confidence interval for instructions value: -5.14 -3.91
95% mean confidence interval for instructions %-change: -0.68% -0.47%
Instructions are helped.

total cycles in shared programs: 146113059 -> 146112427 (<.01%)
cycles in affected programs: 423514 -> 422882 (-0.15%)
helped: 32
HURT: 10
helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12
helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11%
HURT stats (abs)   min: 12 max: 19 x̄: 14.70 x̃: 14
HURT stats (rel)   min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14%
95% mean confidence interval for cycles value: -26.03 -4.07
95% mean confidence interval for cycles %-change: -0.43% -0.05%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886959 -> 7886925 (<.01%)
instructions in affected programs: 1340 -> 1306 (-2.54%)
helped: 4
HURT: 0
helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8
helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43%
95% mean confidence interval for instructions value: -20.44 3.44
95% mean confidence interval for instructions %-change: -5.78% 0.89%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 178116996 -> 178116888 (<.01%)
cycles in affected programs: 6262 -> 6154 (-1.72%)
helped: 2
HURT: 2
helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61
helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62%
HURT stats (abs)   min: 6 max: 8 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51%
95% mean confidence interval for cycles value: -93.27 39.27
95% mean confidence interval for cycles %-change: -5.38% 2.27%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857887 -> 4857870 (<.01%)
instructions in affected programs: 674 -> 657 (-2.52%)
helped: 2
HURT: 0

total cycles in shared programs: 122180816 -> 122180744 (<.01%)
cycles in affected programs: 3764 -> 3692 (-1.91%)
helped: 1
HURT: 1
helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78
helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34%

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
cfc0d34802 nir: See through an fneg to apply existing optimizations
Doing the same for the existing feq and fne transformations didn't help
anything in shader-db.

shader-db results:

Broadwell and Skylake (Skylake shown)
total instructions in shared programs: 14529463 -> 14526147 (-0.02%)
instructions in affected programs: 402420 -> 399104 (-0.82%)
helped: 2136
HURT: 131
helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1
helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12%
HURT stats (abs)   min: 1 max: 2 x̄: 1.01 x̃: 1
HURT stats (rel)   min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57%
95% mean confidence interval for instructions value: -1.51 -1.41
95% mean confidence interval for instructions %-change: -3.06% -2.78%
Instructions are helped.

total cycles in shared programs: 533146915 -> 533120531 (<.01%)
cycles in affected programs: 10356261 -> 10329877 (-0.25%)
helped: 1933
HURT: 844
helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16
helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88%
HURT stats (abs)   min: 1 max: 423 x̄: 36.17 x̃: 12
HURT stats (rel)   min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59%
95% mean confidence interval for cycles value: -11.78 -7.22
95% mean confidence interval for cycles %-change: -1.98% -1.65%
Cycles are helped.

Haswell
total instructions in shared programs: 9037416 -> 9034106 (-0.04%)
instructions in affected programs: 389831 -> 386521 (-0.85%)
helped: 2184
HURT: 120
helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57%
95% mean confidence interval for instructions value: -1.49 -1.39
95% mean confidence interval for instructions %-change: -2.68% -2.41%
Instructions are helped.

total cycles in shared programs: 84636243 -> 84631628 (<.01%)
cycles in affected programs: 4745058 -> 4740443 (-0.10%)
helped: 1904
HURT: 960
helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18
helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38%
HURT stats (abs)   min: 1 max: 1080 x̄: 55.11 x̃: 14
HURT stats (rel)   min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81%
95% mean confidence interval for cycles value: -4.51 1.29
95% mean confidence interval for cycles %-change: -1.64% -1.25%
Inconclusive result (value mean confidence interval includes 0).

LOST:   1
GAINED: 0

Sandy Bridge and Ivy Bridge (Ivy Bridge shown)
total instructions in shared programs: 10018873 -> 10015456 (-0.03%)
instructions in affected programs: 512820 -> 509403 (-0.67%)
helped: 2268
HURT: 162
helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88%
HURT stats (abs)   min: 1 max: 4 x̄: 1.59 x̃: 1
HURT stats (rel)   min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50%
95% mean confidence interval for instructions value: -1.46 -1.35
95% mean confidence interval for instructions %-change: -2.38% -2.12%
Instructions are helped.

total cycles in shared programs: 87538223 -> 87524771 (-0.02%)
cycles in affected programs: 5435520 -> 5422068 (-0.25%)
helped: 1916
HURT: 946
helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18
helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97%
HURT stats (abs)   min: 1 max: 633 x̄: 45.41 x̃: 11
HURT stats (rel)   min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62%
95% mean confidence interval for cycles value: -7.34 -2.06
95% mean confidence interval for cycles %-change: -1.62% -1.26%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 7888446 -> 7886959 (-0.02%)
instructions in affected programs: 331581 -> 330094 (-0.45%)
helped: 1160
HURT: 97
helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1
helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25%
95% mean confidence interval for instructions value: -1.25 -1.12
95% mean confidence interval for instructions %-change: -0.91% -0.75%
Instructions are helped.

total cycles in shared programs: 178130766 -> 178116996 (<.01%)
cycles in affected programs: 12534564 -> 12520794 (-0.11%)
helped: 1856
HURT: 187
helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4
helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11%
HURT stats (abs)   min: 2 max: 26 x̄: 3.55 x̃: 2
HURT stats (rel)   min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02%
95% mean confidence interval for cycles value: -7.41 -6.07
95% mean confidence interval for cycles %-change: -0.28% -0.22%
Cycles are helped.

GM45
total instructions in shared programs: 4858912 -> 4857887 (-0.02%)
instructions in affected programs: 237565 -> 236540 (-0.43%)
helped: 867
HURT: 57
helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1
helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: -1.18 -1.04
95% mean confidence interval for instructions %-change: -0.88% -0.71%
Instructions are helped.

total cycles in shared programs: 122189118 -> 122180816 (<.01%)
cycles in affected programs: 8776418 -> 8768116 (-0.09%)
helped: 1213
HURT: 166
helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4
helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11%
HURT stats (abs)   min: 2 max: 26 x̄: 3.35 x̃: 2
HURT stats (rel)   min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02%
95% mean confidence interval for cycles value: -6.78 -5.26
95% mean confidence interval for cycles %-change: -0.24% -0.18%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Timothy Arceri
283e25102b st/glsl_to_nir: disable io lowering and array splitting of fs inputs
We need this to be able to support the interpolateAt builtins in a
sane way. It also leads to the generation of more optimal code.

The lowering and splitting is made conditional on lower_all_io_to_temps
because vc4 and freedreno both expect these passes to be enabled and
neither supports glsl 400, so they don't need to deal with the interpolateAt
builtins.

We leave the other stages for now so as to avoid regressions. Ideally we
could remove the stage checks and just set the nir options correctly
for each stage. However, all gallium drivers currently just return
the same nir compiler options for all stages, and it's probably more
trouble than it's worth to change this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
9a2e085680 nir: add lower_all_io_to_temps flag
This will be used for freedreno and vc4 which require all inputs
and outputs to be copied to temps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
3218756262 nir/st_glsl_to_nir: add param to disable splitting of inputs
We need this because we will always copy fs outputs to temps and
split the arrays, but do not want to do either of these with fs
inputs as it is unnecessary and makes handling interpolateAt
builtins difficult.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
93e213f91f st/glsl_to_nir: copy nir compiler options to context
Various nir passes may expect this to be here as does the nir
serialisation pass.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
dd6d6c63a7 radeonsi/nir: add input support for arrays that have not been copied to temps and split
We need this to be able to support the interpolateAt builtins in a
sane way. It also leads to the generation of more optimal code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
d185190222 ac/radeonsi: add lookup_interp_param and load_sample_position to the abi
This will enable the interpolateAt builtins to work on the radeonsi
nir backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
97058168a4 radeonsi/nir: add prim_mask to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
3ff012f142 radeonsi/nir: adjust load_sample_position() to be shared between backends
With this interface change it can be shared between the tgsi and
nir backends.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
3a47b138e3 radeonsi/nir: add si_nir_lookup_interp_param() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
b8808848ce ac/nir_to_llvm: move some interp defines to the header
These will be used in the following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
fea6da9aaa radeonsi/nir: move the interpolation qualifier scanning
We need to collect this when scanning over the instruction rather
than when scanning over the inputs, otherwise we might get conflicting
values for inputs that are used by the interpolateAt* builtins.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
580f1aa247 radeonsi/nir: add interpolate at intrinsics to scan_instruction()
V2: use the uses_*_opcode_interp_* flags

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Bas Nieuwenhuizen
882eff4d20 radv: Merge raster state with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:02:05 +01:00
Bas Nieuwenhuizen
69364f1c34 radv: Move gs state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:02:01 +01:00
Bas Nieuwenhuizen
e4e060d135 radv: Split out cliprect rule generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:56 +01:00
Bas Nieuwenhuizen
acbaef3005 radv: Merge VGT_GS_MODE computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:52 +01:00
Bas Nieuwenhuizen
4ae6a8b0cd radv: Split out processing the vertex input state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:41 +01:00
Bas Nieuwenhuizen
9062b1c241 radv: Move tessellation state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:38 +01:00
Bas Nieuwenhuizen
4aa1cb4e90 radv: Move blend state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:34 +01:00
Bas Nieuwenhuizen
0f72f0eacb radv: Split out generating VGT_SHADER_STAGES_EN.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:30 +01:00
Bas Nieuwenhuizen
694c34314b radv: Split out the ia_multi_vgt_param precomputation.
Also move everything into a struct and then return the struct from
the helper function, so it is clear in the caller what part of the
pipeline gets modified.
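
The shape of that refactor, sketched in Python for brevity (the driver
itself is C). All names, fields and numbers below are hypothetical and
only illustrate returning a struct from a helper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IaMultiVgtParam:
    """Hypothetical bundle of precomputed values; fields are illustrative."""
    primgroup_size: int
    partial_wave: bool


def compute_ia_multi_vgt_param(uses_tess):
    """Pure helper: reads its input and returns a value, mutating nothing."""
    primgroup_size = 64 if uses_tess else 128  # made-up numbers
    return IaMultiVgtParam(primgroup_size=primgroup_size,
                           partial_wave=uses_tess)


class Pipeline:
    def __init__(self, uses_tess):
        # The assignment makes it visible which pipeline field is written.
        self.ia_multi_vgt_param = compute_ia_multi_vgt_param(uses_tess)


pipeline = Pipeline(uses_tess=True)
print(pipeline.ia_multi_vgt_param)
```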

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:26 +01:00
Bas Nieuwenhuizen
0bea0851aa radv: Split out db_shader_control computation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:18 +01:00
Bas Nieuwenhuizen
5dce47ae6d radv: Compute shader_z_format when emitting it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:13 +01:00
Bas Nieuwenhuizen
df2e7ab0db radv: Merge depth stencil state with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:06 +01:00
Bas Nieuwenhuizen
d5a0af84ec radv: Merge ps_input_cntl computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:01 +01:00
Bas Nieuwenhuizen
e2bf18030d radv: Merge vtx_reuse_depth computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:55 +01:00
Bas Nieuwenhuizen
c80747b32c radv: Merge vs state computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:50 +01:00
Bas Nieuwenhuizen
c4191cf944 radv: Merge binning state generation with pm4 emission.
We don't need the pipeline state struct anymore.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:45 +01:00
Bas Nieuwenhuizen
6f1a3f081e radv: Constify some pipeline helpers.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:40 +01:00
Bas Nieuwenhuizen
f0c9ef410a radv: Add PM4 pregeneration for compute pipelines.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:34 +01:00
Bas Nieuwenhuizen
beeab44190 radv: Record a PM4 sequence for graphics pipeline switches.
This gives about 2% performance improvement on dota2 for me.

This is mostly a mechanical copy and replacement, but at bind time
we still do:

1) Some stuff that is only based on num_samples changes.
2) Some command buffer state setting.
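
The idea of recording once and replaying at bind time can be sketched as
a toy model. Class names, register numbers and the word layout are
invented for illustration and do not reflect radv's data structures:

```python
class CommandStream:
    """Toy command stream: a flat list of words."""

    def __init__(self):
        self.words = []

    def emit(self, word):
        self.words.append(word)

    def emit_array(self, words):
        self.words.extend(words)


class GraphicsPipeline:
    def __init__(self, state):
        # Pipeline creation: translate state into words exactly once.
        recorded = CommandStream()
        for reg, value in sorted(state.items()):
            recorded.emit(reg)
            recorded.emit(value)
        self.pregenerated = tuple(recorded.words)


def bind_pipeline(cs, pipeline):
    """Bind time: copy the recorded words instead of recomputing them."""
    cs.emit_array(pipeline.pregenerated)


pipeline = GraphicsPipeline({0x20: 2, 0x10: 1})
cs = CommandStream()
bind_pipeline(cs, pipeline)
bind_pipeline(cs, pipeline)
print(cs.words)
```

Work that depends on state outside the pipeline, such as the two items
listed above, cannot be recorded this way and stays at bind time.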

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:22 +01:00
Bas Nieuwenhuizen
7c366bc152 radv: Determine unneeded dynamic states.
Which avoids setting or emitting them.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:17 +01:00
Andres Rodriguez
0a89784bcc mesa: check for invalid index on UUID glGet queries
This fixes the piglit test:
spec/ext_semaphore/api-errors/usigned-byte-i-v-bad-value

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
566ed727a4 mesa: fix glGet for ext_external_objects parameters
This allows the client to actually query the enums specified in the
ext_external_objects spec.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
0ebd3cc863 mesa: fix error codes for importing memory/semaphore FDs
This fixes the following piglit tests:
spec/ext_semaphore_fd/api-errors/import-semaphore-fd-bad-enum
spec/ext_memory_object_fd/api-errors/import-memory-fd-bad-enum

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
50b06cbc10 radeonsi: fix fence_server_sync() holding up extra work v2
When calling si_fence_server_sync(), the wait operation is associated
with the next kernel submission. Therefore, any unflushed work
submitted prior to fence_server_sync() will also be affected by
the wait.

To avoid adding the dependency to the unflushed work, we flush before
emitting the fence dependency.
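
A toy model of the ordering problem, with invented names (this is not
the radeonsi code): waits attach to the next submission, so flushing
first keeps already queued work out of that submission.

```python
class Context:
    def __init__(self):
        self.pending = []      # work queued but not yet submitted
        self.next_waits = []   # fences the next submission must wait on
        self.submissions = []  # (work, waits) pairs handed to the kernel

    def draw(self, name):
        self.pending.append(name)

    def flush(self):
        # Submit everything queued so far together with any pending waits.
        if self.pending or self.next_waits:
            self.submissions.append(
                (tuple(self.pending), tuple(self.next_waits)))
            self.pending, self.next_waits = [], []

    def fence_server_sync(self, fence, flush_first):
        if flush_first:
            self.flush()
        self.next_waits.append(fence)


def run(flush_first):
    ctx = Context()
    ctx.draw("A")  # queued before the wait is requested
    ctx.fence_server_sync("F", flush_first)
    ctx.draw("B")  # queued after the wait is requested
    ctx.flush()
    return ctx.submissions


print(run(flush_first=False))
print(run(flush_first=True))
```

Without the flush, A and B share one submission that waits on F; with it,
A is submitted on its own and only B waits.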

v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
e0f16ee666 radeonsi: implement semaphore_server_signal v2
Syncobj based waits or signals only happen at submission boundaries. In
order to guarantee that the requested signal event will occur when the
state tracker requested it, we must issue a flush.

v2: s/fence/semaphore for pipe objects

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
5b07b06d6b radeonsi: add support for importing PIPE_FD_TYPE_SYNCOBJ semaphores
Hook up importing semaphores of type PIPE_FD_TYPE_SYNCOBJ

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
cc9762d74d winsys/amdgpu: add support for syncobj signaling v3
Add the ability to signal a syncobj when a cs completes execution.

v2: corresponding changes for gallium fence->semaphore rename
v3: s/semaphore/fence for pipe objects

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
29b9bd0539 mesa/st: add support for semaphore object signal/wait v4
Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject

v2:
  - corresponding changes for gallium fence->semaphore rename
  - flushing moved to mesa/main

v3: s/semaphore/fence for pipe objects
v4: add bitmap flushing

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
89b52891fd mesa: add support for semaphore object signal/wait v3
Memory synchronization is left for a future patch.

v2: flush vertices/bitmaps moved to mesa/main
v3: removed spaces before/after braces

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
260f7fcc46 mesa: add semaphore parameter stub v2
EXT_semaphore and EXT_semaphore_fd define no pnames. Therefore there
isn't much to do besides determining the correct error code.

v2: removed useless return

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
382067f065 mesa/st: add support for semaphore object create/import/delete v3
Add basic semaphore object operations.

v2: s/semaphore/fence for pipe objects
v3: added missing license headers

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
67d5d08682 mesa: add support for semaphore object creation/import/delete v3
Used by EXT_semaphore and EXT_semaphore_fd

v2: Removed unnecessary dummy callback initialization
v3: Fixed attempting to free the DummySemaphoreObject

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
8e635f7d65 mesa/st: introduce EXT_semaphore and EXT_semaphore_fd v2
Guarded by PIPE_CAP_SEMAPHORE_SIGNAL

v2: corresponding changes for PIPE_CAP_SEMAPHORE_SIGNAL rename

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
fde1afc495 u_threaded_context: add support for fence_server_signal v2
v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
d34c2cf3e6 gallium: add fence_server_signal() v2
Calling this function will emit a fence signal operation into the
GPU's command stream.

v2: documentation typos

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
458f89be78 gallium: introduce PIPE_FD_TYPE_SYNCOBJ
Denotes that an fd is backed by a syncobj. For example, radv shared
semaphores.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
2ab405d254 gallium: introduce PIPE_CAP_FENCE_SIGNAL v2
Protects semaphore signaling functionality required by GL_EXT_semaphore.

v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
585daa2378 gallium: add type parameter to create_fence_fd
An fd can potentially have different types of objects backing it.
Specifying the type helps us make sure we treat the FD correctly.

This is in preparation to allow importing syncobj fence FDs in addition
to native sync FDs.
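
A minimal sketch of why an explicit type helps, assuming hypothetical names
(sketch_fd_type, sketch_choose_import) that are not the gallium definitions:
the importer can select a code path from the declared type instead of
guessing what backs the fd.

```c
#include <assert.h>

/* Hypothetical names modelled on the commit text, not the gallium enums. */
enum sketch_fd_type {
   SKETCH_FD_TYPE_NATIVE_SYNC,
   SKETCH_FD_TYPE_SYNCOBJ
};

enum sketch_import_path {
   SKETCH_IMPORT_SYNC_FILE,
   SKETCH_IMPORT_SYNCOBJ,
   SKETCH_IMPORT_REJECTED
};

/* Choose how to import an fd based on the type the caller declares. */
static enum sketch_import_path
sketch_choose_import(int fd, enum sketch_fd_type type)
{
   if (fd < 0)
      return SKETCH_IMPORT_REJECTED;

   switch (type) {
   case SKETCH_FD_TYPE_NATIVE_SYNC:
      return SKETCH_IMPORT_SYNC_FILE;
   case SKETCH_FD_TYPE_SYNCOBJ:
      return SKETCH_IMPORT_SYNCOBJ;
   }
   return SKETCH_IMPORT_REJECTED;
}
```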

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Dave Airlie
16dd0eb517 ac/llvm: bump the number of results to 8.
This function can get access for a 64-bit dvec4, which means we
have to load 8 components.
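
The arithmetic behind the number 8 can be sketched as follows. The helper
name is hypothetical and this is not the ac/llvm code; it only assumes that
results are counted in 32-bit channels.

```c
#include <assert.h>

/* Number of 32-bit channels needed to hold a vector of the given bit size. */
static unsigned
sketch_channels_32bit(unsigned components, unsigned bit_size)
{
   assert(bit_size == 32 || bit_size == 64);
   return components * (bit_size / 32);
}
```

A dvec4 has 4 components of 64 bits each, so it needs 4 * 2 = 8 channels.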

This fixes:
R600_DEBUG=nir ./bin/shader_runner generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-abs-dvec4.shader_test -auto

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 05:37:16 +10:00
Dave Airlie
8d633f067b r600/sb: insert the else clause when we might depart from a loop
If there is a break inside the else clause and this means we
are breaking from a loop, the loop finalise will want to insert
the LOOP_BREAK/CONTINUE instruction. However, if we don't emit
the else there is nowhere for these to end up, so they will end
up in the wrong place.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101442
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 04:47:29 +10:00
Brian Paul
1a9aa69ae8 mesa: remove invalid assertion in _mesa_enable_vertex_array_attrib()
The meta module passes some 0-based attrib values.  Should fix Piglit
regressions reported by Mark Janes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104863
Fixes: 4ab7e03e1f ("mesa: add an assertion in
_mesa_enable_vertex_array_attrib()")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-30 11:02:43 -07:00
Brian Paul
efa0993eaf mesa: use gl_vert_attrib enum type in more places
Slightly better readability.

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 11:02:43 -07:00
Brian Paul
f892e332a8 mesa: rename some 'client' array functions
A long time ago gl_vertex_array was gl_client_array.  Update some function
names to be consistent.

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
d2d9d090e5 mesa: s/src/attribs/ in _mesa_update_client_array()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
e863541e43 mesa: check/assert array index in _mesa_bind_vertex_buffer()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
fcee2cc711 mesa: trivial comment typo fix in arrayobj.c
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
4ab7e03e1f mesa: add an assertion in _mesa_enable_vertex_array_attrib()
Some of the enable/disable vertex array functions take a zero-based
generic index, while others take a VERT_ATTRIB_GENERIC0-based value.
Add an assertion to clarify that in one place.
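
A rough sketch of the two index conventions, using hypothetical constants
(the real VERT_ATTRIB_GENERIC0 value is not shown in this log): one helper
converts a zero-based generic index to an absolute slot, the other checks
that a value already follows the absolute convention.

```c
#include <assert.h>

/* Hypothetical stand-ins; Mesa's real enum values may differ. */
enum { SKETCH_ATTRIB_GENERIC0 = 16, SKETCH_ATTRIB_MAX = 32 };

/* Convert a zero-based generic index into an absolute attribute slot. */
static unsigned
sketch_generic_to_slot(unsigned generic_index)
{
   assert(generic_index < SKETCH_ATTRIB_MAX - SKETCH_ATTRIB_GENERIC0);
   return SKETCH_ATTRIB_GENERIC0 + generic_index;
}

/* True if the value is an absolute slot in the generic range. */
static int
sketch_is_generic_slot(unsigned slot)
{
   return slot >= SKETCH_ATTRIB_GENERIC0 && slot < SKETCH_ATTRIB_MAX;
}
```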

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
7f12791cc6 mesa: rename some vars in client_state()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
06621e8a0d mesa: Care for differences in fog mode only if fog is consumed.
When creating fixed function vertex shader hash keys, only care
about producing the varying output if fog is enabled and the
varying is consumed in the fragment stage.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
6395a0ecf2 mesa: Reduce ffvertex_prog state_key to 36 bytes.
Using lower alignment restrictions for the state key fields finally
yields a smaller hashing state key.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
b4216b588e mesa: Remove unused ffvertex_prog texunit_really_enabled.
Remove set but not read field from the state key used for hashing
fixed function vertex shaders.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
1169791c18 mesa: Remove unused bit in ffvertex_prog state_key.
Remove set but not read field from the state key used for hashing
fixed function vertex shaders.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
6726d16098 mesa: texgen_enabled is only 1 bit.
For the state key for hashing fixed function vertex shaders, the
texgen_enabled field requires only a single bit.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
d6b0ad51ec mesa: Encode fog modes in a 2 bit field.
For the state key for hashing fixed function
vertex shaders, encode the different fog modes, including
if fog is generally enabled or not, into a 2 bit field.
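
One way such an encoding could look, as a sketch only: the enum values and
struct below are hypothetical and are not taken from the Mesa source. Value
0 doubles as "fog disabled", so no separate enable bit is needed.

```c
#include <assert.h>

/* Hypothetical encoding; the values Mesa actually uses are not in this log. */
enum sketch_fog_mode {
   SKETCH_FOG_NONE = 0,   /* fog disabled */
   SKETCH_FOG_LINEAR = 1,
   SKETCH_FOG_EXP = 2,
   SKETCH_FOG_EXP2 = 3
};

struct sketch_state_key {
   unsigned fog_mode:2;        /* enable flag and mode share two bits */
   unsigned texgen_enabled:1;
};

/* Fold the enable flag into the mode so both fit in the 2-bit field. */
static unsigned
sketch_encode_fog(int fog_enabled, enum sketch_fog_mode mode)
{
   return fog_enabled ? (unsigned)mode : (unsigned)SKETCH_FOG_NONE;
}
```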

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
63e845d3cc mesa: Move separate_specular into the lighting section.
For the state key for hashing fixed function
vertex shaders, the information is only evaluated
if lighting is generally switched on.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Mathias Fröhlich
11e665d434 mesa: Get the point size array state from varying_vp_inputs.
For the state key for hashing fixed function
vertex shaders, the varying_vp_inputs bitmask already
contains the point size array enabled information.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Mathias Fröhlich
bc5c54cadf mesa: Remove unused gl_fog_attrib::_Scale.
The patch removes a variable that is only written to.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Iago Toral Quiroga
99b57daf4a anv/pipeline: lower constant initializers on output variables earlier
If a shader only writes to an output via a constant initializer we
need to lower it before we call nir_remove_dead_variables so that
this pass sees the stores from the initializer and doesn't kill the
output.

Fixes test failures in new work-in-progress CTS tests:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_vert
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_frag

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-30 08:10:29 +01:00
Tapani Pälli
6316c2ecbd i965: move disk cache from brw_context to intel_screen
Now every context refers to the same disk_cache instance in the screen.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-30 08:42:51 +02:00
Elie Tournier
6f8518e068 mesa: Correctly print glTexImage dimensions
texture_format_error_check_gles() displays errors like "glTexImage%dD".
This patch just replaces the %d with the correct dimension.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-30 07:48:56 +02:00
Brian Paul
d5f42f96e1 mesa: shrink size of gl_array_attributes (v2)
Inspired by Marek's earlier patch, but even smaller.  Sort fields from
largest to smallest.  Use bitfields for more fields (sometimes with an
extra bit for MSVC).  Reduce Stride field to GLshort.

Note that some fields cannot be bitfields because they're accessed via
pointers (such as for glEnableClientState(GL_VERTEX_ARRAY) to set the
Enabled field).

Reduces size from 48 to 24 bytes.
Also reduces size of gl_vertex_array_object from 3632 to 2864 bytes.
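
The effect of sorting fields from largest to smallest can be sketched with
two hypothetical structs (these are not Mesa's gl_array_attributes). The
exact sizes depend on the ABI, so the sketch only compares them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields in arbitrary order: padding is inserted before ptr and type. */
struct sketch_unsorted {
   uint8_t  enabled;
   const void *ptr;
   uint16_t stride;
   uint32_t type;
   uint8_t  normalized;
};

/* Same fields, largest first: small members pack into the tail. */
struct sketch_sorted {
   const void *ptr;
   uint32_t type;
   uint16_t stride;
   uint8_t  enabled;      /* kept as a whole byte so its address can be taken */
   uint8_t  normalized;
};

/* Bytes saved per element by reordering on the current ABI. */
static size_t
sketch_bytes_saved(void)
{
   return sizeof(struct sketch_unsorted) - sizeof(struct sketch_sorted);
}
```

The enabled member stays a plain byte rather than a bitfield, which mirrors
the note above about fields that are accessed via pointers.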

And add some assertions in init_array().

v2: use s/GLuint/unsigned/, improve commit comments.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:16:50 -07:00
Brian Paul
79cafa0df3 mesa: shrink gl_vertex_array
Inspired by Marek's earlier patch, but goes a little further.
Sort fields from largest to smallest.  Use bitfields.

Reduced from 48 bytes to 32.  Also reduces size of gl_vertex_array_object
from 4144 to 3632

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:15:52 -07:00
Marek Olšák
f96a69f916 mesa: replace GLenum with GLenum16 in common structures (v4)
v2: - fix glGet*
    - also use GLenum16 for DrawBuffers
v3: - rebase to top of tree (BrianP) and incorporate Ian's suggestions
v4: - fix a GLenum16 bug in VBO/save code, add some STATIC_ASSERT()s

gl_context = 152432 -> 136840 bytes
vbo_context = 22096 -> 20608 bytes
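
A sketch of where such savings come from, assuming hypothetical typedefs
that stand in for GLenum and GLenum16: narrowing each stored enum from 32
to 16 bits halves the space those fields occupy, and a compile-time
assertion guards against truncation.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t sketch_enum;     /* stand-in for a 32-bit GLenum */
typedef uint16_t sketch_enum16;   /* stand-in for GLenum16 */

#define SKETCH_MAX_ENUM 0x9999u   /* hypothetical largest value ever stored */

/* Compile-time check that narrowing cannot truncate the stored values. */
_Static_assert(SKETCH_MAX_ENUM <= UINT16_MAX, "enum must fit in 16 bits");

struct sketch_wide   { sketch_enum   mode, func, op, face; };
struct sketch_narrow { sketch_enum16 mode, func, op, face; };

/* Store an enum in the narrow type, checking the range at run time too. */
static sketch_enum16
sketch_narrow_enum(sketch_enum value)
{
   assert(value <= UINT16_MAX);
   return (sketch_enum16)value;
}
```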

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-29 21:15:52 -07:00
Brian Paul
94843e6056 mesa: fix incorrect size/error test in _mesa_GetUnsignedBytevEXT()
get_value_size() returns -1 for an error.  The similar check in
_mesa_GetUnsignedBytei_vEXT() is correct.
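
The general pitfall can be sketched as below. This is not the Mesa code;
the helper names and pname values are hypothetical. When a size function
signals errors with -1, the caller has to test for the negative sentinel
rather than for some other condition.

```c
#include <assert.h>

/* Hypothetical helper: byte size for a pname, or -1 on error. */
static int
sketch_get_value_size(int pname)
{
   switch (pname) {
   case 1: return 16;
   case 2: return 8;
   default: return -1;
   }
}

/* Returns 1 when the query is valid, 0 when the size lookup failed. */
static int
sketch_query_ok(int pname)
{
   int size = sketch_get_value_size(pname);

   /* The error sentinel is negative, so this is the test that catches it. */
   return size >= 0;
}
```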

Found by chance.  There are apparently no Piglit tests which exercise
glGetUnsignedBytei_vEXT() or glGetUnsignedBytevEXT().

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:15:52 -07:00
Neha Bhende
e4ca1d6456 svga: Check rasterization state object before checking poly_stipple_enable
Sometimes the rasterization state object can be empty, which causes a
segfault on hw 8, 9, 10 for some traces.

This patch fixes enemy_territory_quake_wars_high,
enemy_territory_quake_wars_low, etqw-demo, lightsmark2008, quake1
glretrace crashes on hw 8,9,10.

Tested with mtt-glretrace and mtt-piglit.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Neha Bhende
d4a5e14fae svga: Adjust alpha for S3TC_DXT1_EXT RGB formats
According to the spec, S3TC_DXT1_EXT RGB formats are supposed to be
opaque. The corresponding svga formats do not handle this, so explicitly
set alpha to 1.0.
This fixes piglit test spec@ext_texture_compression_s3tc@s3tc-targeted
Note: This test is the test case for freedesktop bug 100925

Tested with mtt-piglit and mtt-glretrace on 8,9,10,11 and 15

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Gert Wollny
6a7d1ca2c4 mesa/st/glsl_to_tgsi: Mark first write as unconditional when appropriate
In the register lifetime estimation if the first write is unconditional or
conditional but not within a loop then this is an unconditional dominant
write in the sense of register life time estimation.
Add a test case and record the write accordingly.

Fixes: 807e2539e5 ("mesa/st/glsl_to_tgsi: Add
tracking of ifelse writes in register merging")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104803
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Roland Scheidegger
3c7aa242f5 mesa: skip validation of legality of size/type queries for format queries
The size/type query is always legal (if we made it that far).
Removing this causes a difference for GL_TEXTURE_BUFFER - the reason is that
these parameters are valid only with GetTexLevelParameter() if gl 3.1 is
supported, but not if only ARB_texture_buffer_object is supported.
However, while the spec says that these queries return "the same information
as querying GetTexLevelParameter" I believe we're not expected to return just
zeros here. By definition, these pnames are always valid (unlike for the
GetTexLevelParameter() function which would return an error without GL 3.1).
The spec is a bit inconsistent there and open to interpretation - while
mentioning the "same information as querying GetTexLevelParameter" is
returned, it also mentions that 0 is returned for size/type if the
target/format is not supported - implying correct results to be returned
if it is supported, regardless that GetTexLevelParameter would return
an error. (Also, the bit about this returning the same as
GetTexLevelParameter also includes querying stencil type, which isn't
even possible with GetTexLevelParameter.)

This breaks some piglit arb_internalformat_query2 tests (which I believe to
be wrong).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-30 01:28:47 +01:00
Roland Scheidegger
21fe02d1d3 mesa: restrict formats being supported by target type for formatquery
The code just considered all formats as being supported if they were either
a valid fbo or texture format.
This was quite awkward since then the query would return "supported" for
e.g. GL_RGB9E5 or compressed formats and target RENDERBUFFER (albeit the driver
could still refuse it in theory). However, when then querying for instance the
internalformat sizes, it would just return 0 (due to the checks being more
strict there).
It was also a problem for texture buffer targets, which have a more restricted
list of formats which are allowed (and again, it would return supported but
then querying sizes would return 0).
So only take validation of formats into account which make sense for a given
target.
Can also toss out some special checks for rgb9e5 later, since we'd never get
there if it wasn't supported in the first place.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-30 01:28:47 +01:00
Roland Scheidegger
272e7e1bd5 mesa: (trivial) add TODO comment for default results for internal queries 2018-01-30 01:28:47 +01:00
Roland Scheidegger
09dc4f9012 mesa: remove misleading gles checks for formatquery
Testing for gles there is just confusing - this is about the target being
supported; whether it was valid at all was already determined earlier
(in _legal_parameters). It didn't make sense in any case, since
it would only have said false there for gles for 2d but not 2d arrays etc.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-30 01:28:47 +01:00
Rafael Antognolli
e7ecc5e160 i965: Emit PIPE_CONTROL with ISP bit on older platforms.
Emit it on all platforms since gen7.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-29 14:52:07 -08:00
Rafael Antognolli
fa21ddf7b1 anv/cmd_buffer: Emit PIPE_CONTROL with ISP bit on older platforms.
Emit it on all platforms since gen7.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-29 14:52:07 -08:00
Timothy Arceri
2b4afaef1c st/glsl_to_nir: remove dead io after conversion to nir
This fixes an assert in nir_lower_var_copies() for some bioshock
shaders where an unused clipdistance array has no size.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:14:36 +11:00
Timothy Arceri
327c1a7fb3 radeonsi/nir: add support vs double inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
44067d6f0d radeonsi: pass input_idx to declare_nir_input_vs()
This makes it consistent with declare_nir_input_fs() and will allow
us to support doubles.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
cf75ee3ab1 radeonsi: add bitcast_inputs() helper
Will be used in a following patch to help support doubles.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
96cfd4bd7e radeonsi/nir: fix num_inputs for doubles in vs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
09cd484d61 nir: partially revert c2acf97fcc
c2acf97fcc changed the use of double_inputs_read to be
inconsistent with its previous meaning. Here we re-enable the
gather info code that was removed as the modified code from
c2acf97fcc now uses the double_inputs member rather than
double_inputs_read.

This change allows us to use double_inputs_read with gallium
drivers without impacting double_inputs which is used by i965.

We also make use of the compiler option vs_inputs_dual_locations
to allow for the difference in behaviour between drivers that handle
vs inputs as taking up two locations for doubles, versus those that
treat them as taking a single location.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
5b8de4bdff nir: add vs_inputs_dual_locations compiler option
Allows nir drivers to use either single or dual locations for
vs double inputs.

i965 uses dual locations for both OpenGL and Vulkan drivers, for
now gallium OpenGL drivers only use a single location.

The following patch will also make use of this option when
calling nir_shader_gather_info().

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
f63e05ae9e compiler: tidy up double_inputs_read uses
First we move double_inputs_read into a vs struct in the union,
double_inputs_read is only used for vs inputs so this will
save space and also allows us to add a new double_inputs field.

We add the new field because c2acf97fcc changed the behaviour
of double_inputs_read, and while it's no longer used to track
actual reads in i965 we do still want to track this for gallium
drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Dave Airlie
f6cc15dccd radv/gfx9: fix block compression texture views. (v2)
This ports a fix from amdvlk, to fix the sizing for mip levels
when block compressed images are viewed using uncompressed views.

My original fix didn't port the clamping, but it looks like
the clamping is required to stop the sizing going too large.

Fixes:
dEQP-VK.image.texel_view_compatible.graphic.extended*bc*
Doesn't crash DOW3 anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-30 07:39:13 +10:00
Bas Nieuwenhuizen
0347a83bbf radv: Signal fence correctly after sparse binding.
It did not signal syncobjs in the fence, and also signalled too early
if there was work on the queue already, as we have to wait till that
work is done.

Fixes: d27aaae4d2 "radv: Add external fence support."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-29 17:22:58 +01:00
Brian Paul
0d044f7d61 mesa/vbo: replace vbo_draw_method() with _mesa_set_drawing_arrays()
The arrays specified by ctx->Array._DrawArrays are used for all
vertex drawing via vbo_context::draw_prims().  Different arrays are
used for immediate mode, vertex arrays, display lists, etc.  Changing
from one to another requires updating derived/driver array state.

Before, we indirectly specified the arrays with the gl_draw_method values.
Now we just directly specify the arrays instead.  This is simpler and
will allow a subsequent display list optimization.

In the future, it might make sense to get rid of ctx->Array._DrawArrays
entirely and just pass the arrays as another parameter to
vbo_context::draw_prims().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
d9894ede02 vbo: s/[0]/[VERT_ATTRIB_POS]/ in recalculate_input_bindings()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
48a6ab472a vbo: add new VBO_ATTRIBS_ masks to vbo_attrib.h
These will be used in a later patch.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
41cd3ee5a2 vbo: s/VBO_ATTRIB_INDEX/VBO_ATTRIB_COLOR_INDEX/
To match the VERT_ATTRIB_COLOR_INDEX name.
Give a name to the previously anonymous enum of VBO_ATTRIB_x values.
Update the comment on the enum.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
425da3bbfc vbo: minor clean-ups in vbo_exec.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
d631ea3a23 vbo: s/_API_NOOP_H/VBO_NOOP_H/ in vbo_noop.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
094a80db4c vbo: whitespace/formatting fixes in vbo_exec.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
b080fc6199 vbo: move, rename vp_mode enums, get_program_mode() function
Instead of NONE/ARB use FF/SHADER.  Move the enum declaration to
vbo_private.h where it's used.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
35e0ff5bd5 vbo: s/cl/array/ in vbo_context.c
I think 'cl' used to mean client array.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Tapani Pälli
d0343bef66 nir: mark unused space in packed_tex_data
This change cleans up the following scary warnings in valgrind output
when disk cache is being written:

   ==6532== Uninitialised byte(s) found during client check request
   ==6532==    at 0x14423FAD: blob_write_bytes (blob.c:152)
   ==6532==    by 0x144240FB: blob_write_uint32 (blob.c:194)
   ==6532==    by 0x144001A5: write_tex (nir_serialize.c:613)

and later (loads of):

   ==6532== Use of uninitialised value of size 8
   ==6532==    at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11)
   ==6532==    by 0x13F65014: util_hash_crc32 (crc32.c:127)
   ==6532==    by 0x13F5DABA: cache_put (disk_cache.c:947)
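
A sketch of the idea, assuming a hypothetical layout (the field names and
widths below are not those of packed_tex_data): giving the leftover bits a
named member means an initializer zeroes them, so every bit of the 32-bit
word that gets written and hashed is defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout; names and widths are illustrative only. */
union sketch_packed_tex {
   uint32_t u32;
   struct {
      unsigned sampler_dim:4;
      unsigned is_array:1;
      unsigned is_shadow:1;
      unsigned unused:26;   /* names the remaining bits of the word */
   } bits;
};

/* Pack the fields; the omitted named member is zero-initialised. */
static uint32_t
sketch_pack_tex(unsigned dim, unsigned is_array, unsigned is_shadow)
{
   union sketch_packed_tex p = {
      .bits = {
         .sampler_dim = dim,
         .is_array = is_array,
         .is_shadow = is_shadow,
      },
   };
   return p.u32;
}
```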

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 08:11:22 +02:00
Tapani Pälli
b99c88037b i965: fix disk_cache leak when destroying context
==2780== 1,024 bytes in 1 blocks are possibly lost in loss record 180 of 205
   ==2780==    at 0x4C31A1E: calloc (vg_replace_malloc.c:711)
   ==2780==    by 0x13F6467E: util_queue_init (u_queue.c:309)
   ==2780==    by 0x13F5C9F6: disk_cache_create (disk_cache.c:369)
   ==2780==    by 0x13F05406: brw_disk_cache_init (brw_disk_cache.c:428)
   ==2780==    by 0x13F01E78: brwCreateContext (brw_context.c:1068)

Fixes: 1a61a8b9a7 ("i965: Initialize disk shader cache if MESA_GLSL_CACHE_DISABLE is false")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-29 08:11:14 +02:00
Tapani Pälli
28db950b51 i965: fix prog_data leak in brw_disk_cache
==25481== 576 bytes in 1 blocks are definitely lost in loss record 179 of 208
   ==25481==    at 0x4C2FB6B: malloc (vg_replace_malloc.c:299)
   ==25481==    by 0x1404E2CC: ralloc_size (ralloc.c:121)
   ==25481==    by 0x14119F82: read_and_upload (brw_disk_cache.c:176)
   ==25481==    by 0x1411A5C9: brw_disk_cache_upload_program (brw_disk_cache.c:271)
   ==25481==    by 0x1412FCA4: brw_upload_wm_prog (brw_wm.c:597)

Fixes: 516d50db31 ("i965: add initial implementation of on disk shader cache")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 08:11:03 +02:00
Timothy Arceri
9afc38c799 ac: fix indentation
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Timothy Arceri
03086f86ae ac: remove unused nir2llvmtype()
The last use of this was removed in the previous patch.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Timothy Arceri
fa29a9625e ac: fix gs load inputs type
This fixes the scenario where the input is a struct. With this
the Unreal Engine Elemental demo now works on radeonsi.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Kai Wasserbäch
0aba967328 ac/nir: call glsl_get_sampler_dim() only once where possible
Changes since v1:
  * Rebased on top of e68150de26 and
    82adf53308.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-29 10:47:31 +11:00
Dave Airlie
2af66ba7e7 docs/features: add r600 ARB_query_buffer_object support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:34 +10:00
Dave Airlie
1c9ea24a19 r600: add ARB_query_buffer_object support
This uses a different shader than radeonsi, as we can't address non-256
aligned ssbos, which the radeonsi code does. This passes some extra
offsets into the shader.

It also contains a set of u64 instruction implementations that may
or may not be complete (at least the u64div is definitely not something
that works outside this use-case). If r600 grows 64-bit integers,
it will use the GLSL lowering for divmod.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:28 +10:00
Dave Airlie
a7ec366e50 r600/shader: refactor mul hi/lo instruction emission
This just makes it a bit simpler for cayman vs eg

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:17 +10:00
Dave Airlie
e0e23ea69c r600/eg: construct proper rat mask for image/buffers.
If the image/buffer bindings had a gap, this produced the wrong values.
This should fix that and generate the correct rat mask for mixes of
images/buffers/cbs.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:41:58 +10:00
Jon Turney
4a0bab1d7f meson: libdrm shouldn't appear in Requires.private: if it wasn't found
Otherwise, using pkg-config to retrieve flags will fail, e.g.

$ pkg-config gl --cflags
Package libdrm was not found in the pkg-config search path.
Perhaps you should add the directory containing `libdrm.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libdrm', required by 'gl', not found

Fixes: 3218056e0e ("meson: Build i965 and dri stack")

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
2018-01-27 18:13:18 +00:00
Eric Anholt
e5a81ac704 broadcom/vc5: Don't forget to get the BO offset when opening a dmabuf.
Fixes black display in DRI due to storing to 0x00000000.
2018-01-27 19:40:14 +11:00
Eric Anholt
314e9ee6c4 broadcom/vc5: Enable the driver on V3D 4.2.
The changes in 4.2 haven't impacted any of our CL or state struct entries
that I can see, so I haven't enabled custom compile for doing 4.2 instead
of 4.1.
2018-01-27 19:39:56 +11:00
Eric Anholt
71c7e9bea1 broadcom/vc5: Enable CLIF dumping of V3D 4.2. 2018-01-27 19:04:21 +11:00
Eric Anholt
91f899cbc1 broadcom/vc5: Update the compiler for V3D 4.2. 2018-01-27 19:04:21 +11:00
Eric Anholt
f2e41daac5 broadcom/vc5: Update QPU instruction pack/unpack for v4.2.
After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because
it's used for other barriers in compute.
2018-01-27 19:03:55 +11:00
Eric Anholt
96d3e8f134 broadcom/vc5: Add XML for V3D 4.2. 2018-01-27 18:57:58 +11:00
Eric Anholt
b026063b16 broadcom/vc5: Fix a race between XML codegen build and CLIF build. 2018-01-27 18:57:58 +11:00
Eric Anholt
de60ea4432 Android: Attempt to fix broadcom build after vc5 changes. 2018-01-27 18:03:58 +11:00
Marek Olšák
b633999a4e ac: rename and move si_const_array into common code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
e17eb8800f ac: move address space definitions to common code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
0d62370bbb ac: don't use byval LLVM qualifier in shaders
shader-db doesn't show any regression and 32-bit pointers with byval
are declared as VGPRs for some reason.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
0e40c6a7b7 gallium/radeon: set number of pb_cache buckets = number of heaps
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
175549e0e9 pb_cache: let drivers choose the number of buckets
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
ecfd521502 pb_cache: call os_time_get outside of the loop
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
e553cb5a68 gallium/radeon: simplify radeon_flags_from_heap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Timothy Arceri
041b18cf23 st/shader_cache: restore num_tgsi_tokens when loading from cache
Without this we will fail to correctly serialise programs when
using glGetProgramBinary() if the program was retrieved from
the disk cache rather than freshly compiled.

Fixes: c69b0dd681 "st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program"

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104762
2018-01-27 10:06:16 +11:00
Marek Olšák
17423c993d winsys/amdgpu: fix assertion failure with UVD and VCE rings
Cc: 18.0 <mesa-stable@lists.freedesktop.org>
2018-01-26 23:12:11 +01:00
Brian Paul
ac0e9e343c mesa: remove MESA_FUNCTION
Just use __func__ in the two macros where it was used.
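
A minimal sketch of using the standard __func__ identifier inside a macro.
The macro and function names here are hypothetical and are not the two
Mesa macros mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macro: prefixes a message with the enclosing function name.
 * buf must be an array (not a pointer) so that sizeof gives its capacity. */
#define SKETCH_FORMAT_WARNING(buf, msg) \
   snprintf((buf), sizeof(buf), "%s: %s", __func__, (msg))

/* __func__ expands where the macro is used, so it names this function. */
static void
sketch_do_work(char (*out)[64])
{
   SKETCH_FORMAT_WARNING(*out, "something odd");
}
```

Because __func__ is part of C99, no compiler-specific wrapper is needed.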

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
bacf72a18d mesa: change gl_link_status enums to uppercase
To follow the convention of other enums.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
aff5d9c256 mesa: change gl_compile_status enums to uppercase
To follow the convention of other enums.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
d9832f1fc4 mesa: minor comment reformatting, whitespace fixes in mtypes.h
Trivial.
2018-01-26 13:52:42 -07:00
Rafael Antognolli
131e871385 i965/gen10: Use CS Stall instead of WriteImmediate.
Fixes: ca19ee33d7
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 12:02:34 -08:00
Rafael Antognolli
20578f81a6 anv/gen10: Emit CS stall and mark push constants dirty.
I got reviews and fixed the patches locally, but ended up merging the
ones that I sent originally to the list. This patch fixes those
mistakes.

Fixes: 78c125af39
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 11:59:17 -08:00
Rafael Antognolli
bcfd78e448 i965/gen10: Re-enable push constants.
The GPU hang caused by push constants is apparently fixed, so let's
enable them again.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:44 -08:00
Rafael Antognolli
78c125af39 anv/gen10: Ignore push constant packets during context restore.
Similar to the GL driver, ignore 3DSTATE_CONSTANT_* packets when doing a
context restore.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:40 -08:00
Rafael Antognolli
ca19ee33d7 i965/gen10: Ignore push constant packets during context restore.
These packets were causing GPU hangs when the context was restored,
possibly because they were pointing to BO's that were already
unreferenced. So we tell the hardware to ignore such packets after the
batch buffer ends, since we know those BO's are not around anymore.

This change fixes GPU hangs on CNL. The (partial) solution to this
problem so far was to entirely disable push constants on this platform.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:35 -08:00
Brian Paul
acaec6cdd9 mesa: silence MinGW 'may be used uninitialized' warning in get.c
The warning happens on line 2114 for the memcpy(data, p, size) call.
I'm not sure why that generates the warning but not the earlier use
of p in the code.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 10:44:05 -07:00
Eleni Maria Stea
8096b558a7 mesa: Fix function pointers initialization in state tracker
We assigned the function that gets the device uuid to the GetDriverUuid
function pointer and the function that gets the driver uuid to the
GetDeviceUuid function pointer inside the state tracker. Exchanged the
pointers.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-26 08:17:55 -07:00
Iago Toral Quiroga
d3ce493b34 anv/pipeline: remove the pipeline layout field from anv_pipeline
It no longer has any users.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:47 +01:00
Iago Toral Quiroga
75a4802060 anv/cmd_buffer: add the pipeline layout to the pipeline state
We need to access the pipeline layout to compute correct dynamic
offsets for dynamic UBO/SSBO descriptors when we emit draw commands.
Instead of taking it from the pipeline object, store the layout
in the command buffer pipeline state.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:47 +01:00
Iago Toral Quiroga
e1a49f974b anv/pipeline: don't take the layout from the pipeline to compile shaders
The Vulkan spec states that VkPipelineLayout objects must not be
destroyed while any command buffer that uses them is in the recording
state, but it permits them to be destroyed otherwise. This means that
applications are allowed to free pipeline layouts after command recording
is finished even if there are pipeline objects that still exist and were
created with these layouts.

There are two solutions to this: one is to use reference counting on
pipeline layout objects. The other is to avoid holding references to
pipeline layouts where they are not really needed.

This patch takes a step towards the second option by making the
pipeline shader compile code take pipeline layout from the
VkGraphicsPipelineCreateInfo provided rather than the pipeline
object.

A follow-up patch will remove any remaining uses of the layout field
so we can remove it from the pipeline object and avoid the need
for reference counting.

v2: Use ANV_FROM_HANDLE, remove unnecessary braces (Jason)

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:46 +01:00
Iago Toral Quiroga
14f6275c92 anv/descriptor_set: add reference counting for descriptor set layouts
The spec states that descriptor set layouts can be destroyed almost
at any time:

   "VkDescriptorSetLayout objects may be accessed by commands that
    operate on descriptor sets allocated using that layout, and those
    descriptor sets must not be updated with vkUpdateDescriptorSets
    after the descriptor set layout has been destroyed. Otherwise,
    descriptor set layouts can be destroyed any time they are not in
    use by an API command."
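
A rough sketch of what reference counting for such a layout could look like. All names here are hypothetical stand-ins rather than the anv structures, and the counter is a plain integer for brevity; a real driver would presumably need atomic operations and the device allocator mentioned in v2.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a descriptor set layout; not the anv struct. */
struct ex_set_layout {
   uint32_t ref_cnt;        /* owners: the API handle plus each allocated set */
   uint32_t binding_count;
};

/* Creates a layout holding one reference on behalf of the API handle. */
static struct ex_set_layout *ex_set_layout_create(uint32_t binding_count)
{
   struct ex_set_layout *layout = calloc(1, sizeof(*layout));
   if (!layout)
      return NULL;
   layout->ref_cnt = 1;
   layout->binding_count = binding_count;
   return layout;
}

/* Called when a descriptor set is allocated from the layout. */
static void ex_set_layout_ref(struct ex_set_layout *layout)
{
   layout->ref_cnt++;
}

/* Called on handle destruction and when a set is freed.
 * Returns 1 if this call dropped the last reference and freed the layout. */
static int ex_set_layout_unref(struct ex_set_layout *layout)
{
   if (--layout->ref_cnt == 0) {
      free(layout);
      return 1;
   }
   return 0;
}
```

With this shape, destroying the layout handle while a descriptor set still exists only drops one reference; the memory goes away when the last set releases its own.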

v2: allocate off the device allocator with DEVICE scope (Jason)

Fixes the following work-in-progress CTS tests:
dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.graphics
dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.compute

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:46 +01:00
Samuel Pitoiset
e28233a527 ac/nir: set amdgpu.uniform and invariant.load for SSBOs
For descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
49b0a140a7 ac/nir: set amdgpu.uniform and invariant.load for UBOs
UBOs are constant buffers.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 41c36c45 ("amd/common: use ac_build_buffer_load() for emitting UBO loads")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
b453f38a47 ac/nir: set the noalias attribute on input pointers
This attribute is similar to the definition of restrict in
C99 and it might help LLVM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
310d17fcf1 ac: only load used channels when sampling buffer views
This reduces the number of dwords that are loaded
with buffer_load_format_xyzw. For example, when the only used
channel is 1, the driver will emit buffer_load_format_x instead.
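
As an illustration only: assuming the used channels are tracked as a 4-bit mask (bit 0 = x, ..., bit 3 = w) and that a load always covers a contiguous prefix of channels, the number of channels to request could be derived as below. The helper name is hypothetical and is not taken from the patch.

```c
/* Returns how many leading channels (x, y, z, w) must be loaded so that
 * every channel set in the 4-bit used_mask is available.
 * A mask of 0x1 needs only x; a mask of 0x8 still needs all four. */
static unsigned ex_channels_to_load(unsigned used_mask)
{
   unsigned count = 0;
   for (unsigned i = 0; i < 4; i++) {
      if (used_mask & (1u << i))
         count = i + 1;   /* remember the highest used channel so far */
   }
   return count;
}
```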

Shader stats for DOW3 (with some local hacky scripts for SPIRV):

143 shaders in 143 tests
Totals:
SGPRS: 5344 -> 5352 (0.15 %)
VGPRS: 3476 -> 3452 (-0.69 %)
Spilled SGPRs: 30 -> 29 (-3.33 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 269860 -> 269808 (-0.02 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1267 -> 1272 (0.39 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
51e14bc3c0 ac: pass the number of channels to ac_build_buffer_load_format()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
d7c93b558a ac: add ac_build_buffer_load_common() helper
For both versions of llvm.amdgcn.buffer.load.{format}.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
6d07e443ba radv: fix RADV_DEBUG=syncshaders on GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
5391de1262 radv: fix a GPU hang with RADV_DEBUG=syncshaders
The GPU hangs when the driver forces a PS_PARTIAL_FLUSH after
a dispatch call (and vice versa for graphics). Something has
changed in the kernel driver because it used to work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
b358e0e67f ac/shader: scan if fragment shaders write memory
It's better to do that in ac_shader_info.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
b9e2f78d6e ac/nir: only canonicalize 32-bit float min/max outputs on pre-GFX9
According to LLVM, only pre-GFX9 targets do not flush denorms
for fmin/fmax.

All dEQP-VK.glsl.builtin.precision.* still pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Jason Ekstrand
c8949e2498 anv/pipeline: Don't look at blend state unless we have an attachment
Without this, we may end up dereferencing blend before we check for
binding->index != UINT32_MAX.  However, Vulkan allows the blend state to
be NULL so long as you don't have any color attachments.  This fixes a
segfault when running The Talos Principle.
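
The ordering issue can be sketched as follows. The types and the helper are hypothetical simplifications, not the anv code; the only point is that the "no attachment" check has to come before any dereference of a blend state pointer that is allowed to be NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the blend state structures. */
struct ex_blend_attachment {
   int blend_enable;
};

struct ex_blend_state {
   uint32_t attachment_count;
   const struct ex_blend_attachment *attachments;
};

/* Returns whether blending is enabled for a binding. UINT32_MAX marks a
 * binding with no color attachment; in that case blend may be NULL and
 * must not be touched. */
static int ex_attachment_blend_enabled(const struct ex_blend_state *blend,
                                       uint32_t binding_index)
{
   if (binding_index == UINT32_MAX)
      return 0;                        /* no attachment: skip blend entirely */
   if (blend == NULL || binding_index >= blend->attachment_count)
      return 0;                        /* defensive check for this sketch */
   return blend->attachments[binding_index].blend_enable;
}
```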

Fixes: 12f4e00b69
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-26 01:44:45 -08:00
Maxin B. John
8116b9170b anv_icd.py: improve reproducible builds
Sort the output to ensure build reproducibility

Signed-off-by: Maxin B. John <maxin.john@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 0ab04ba979 ("anv: Use python to generate ICD json files")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 01:37:45 -08:00
Ian Romanick
c7deeb71a8 nouveau: Remove no-op nvgl_logicop_func function
The values that this function returned were always the values passed
in.  The only thing that happened was either an assertion or undefined
results when an unknown value was passed in.  This doesn't seem that
useful.  Most of nouveau_gldefs.h could be removed in this manner.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-01-26 11:21:46 +08:00
Ian Romanick
f5b9c2a6e3 i915: Silence unused parameter warnings
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_alloc_window_storage’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:290:48: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 intel_alloc_window_storage(struct gl_context * ctx, struct gl_renderbuffer *rb,
                                                ^~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_nop_alloc_storage’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:303:74: warning: unused parameter ‘rb’ [-Wunused-parameter]
 intel_nop_alloc_storage(struct gl_context * ctx, struct gl_renderbuffer *rb,
                                                                          ^~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:32: warning: unused parameter ‘internalFormat’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                ^~~~~~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:55: warning: unused parameter ‘width’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                                       ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:69: warning: unused parameter ‘height’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                                                     ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_bind_framebuffer’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:47: warning: unused parameter ‘fb’ [-Wunused-parameter]
                        struct gl_framebuffer *fb, struct gl_framebuffer *fbread)
                                               ^~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:74: warning: unused parameter ‘fbread’ [-Wunused-parameter]
                        struct gl_framebuffer *fb, struct gl_framebuffer *fbread)
                                                                          ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_renderbuffer_update_wrapper’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:422:57: warning: unused parameter ‘intel’ [-Wunused-parameter]
 intel_renderbuffer_update_wrapper(struct intel_context *intel,
                                                         ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_blit_framebuffer_with_blitter’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:644:61: warning: unused parameter ‘filter’ [-Wunused-parameter]
                                     GLbitfield mask, GLenum filter)
                                                             ^~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
39f875a6b7 i915: Make intelEmitCopyBlit static
And rename to emit_copy_blit.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
9eed6bea6b i965: Make intelEmitCopyBlit static
And rename to emit_copy_blit.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
4e9e964de6 i915: Use enum color_logic_ops for blits
v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
21be331401 i965: Use enum color_logic_ops for blits
v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
0aaa27f291 mesa: Pass the translated color logic op to dd_function_table::LogicOpcode
And delete the resulting dead code.  This has only been compile-tested.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
cf0b26ec12 st/mesa: Use the translated color logic op from the context
And delete the resulting dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
0c69db895f i965: Use the translated color logic op from the context
And delete the resulting dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
9c1f010f34 mesa: Also track a remapped version of the color logic op
With the exception of NVIDIA hardware, these are the values that all
hardware and Gallium want.  The remapping is currently implemented in at
least 6 places.  This starts the process of consolidating to a single
place.
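
A sketch of what such a remapping might look like. The GL logic-op enums are the 16 consecutive values 0x1500 (GL_CLEAR) through 0x150F (GL_SET). The target encoding below is an assumption for illustration: a 4-bit truth table in which bit ((src << 1) | dst) gives the result bit. The table and function names are hypothetical and are not copied from the patch.

```c
#include <stdint.h>

/* Indexed by (GLenum & 0xf), i.e. GL_CLEAR, GL_AND, GL_AND_REVERSE, GL_COPY,
 * GL_AND_INVERTED, GL_NOOP, GL_XOR, GL_OR, GL_NOR, GL_EQUIV, GL_INVERT,
 * GL_OR_REVERSE, GL_COPY_INVERTED, GL_OR_INVERTED, GL_NAND, GL_SET.
 * Each entry is the assumed 4-bit truth-table encoding of that operation. */
static const uint8_t ex_logicop_remap[16] = {
   0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15
};

/* Translates a GL logic-op enum into the truth-table encoding. */
static unsigned ex_remap_logicop(unsigned gl_enum)
{
   return ex_logicop_remap[gl_enum & 0xf];
}

/* Applies a truth-table-encoded op to every bit of src and dst, which shows
 * why the encoding is convenient: each op bit selects one (src, dst) case. */
static uint32_t ex_apply_logicop(unsigned op, uint32_t src, uint32_t dst)
{
   uint32_t result = 0;
   if (op & 1) result |= ~src & ~dst;   /* src = 0, dst = 0 */
   if (op & 2) result |= ~src &  dst;   /* src = 0, dst = 1 */
   if (op & 4) result |=  src & ~dst;   /* src = 1, dst = 0 */
   if (op & 8) result |=  src &  dst;   /* src = 1, dst = 1 */
   return result;
}
```

Doing the translation once, when the GL state is set, would let each driver consume the remapped value directly instead of carrying its own switch statement.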

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.  Added some comments about the
selection of bit patterns for gl_logicop_mode and the GLenums.
Suggested by Nicolai.  Folded the GLenum_to_color_logicop macro into its
only users.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Bas Nieuwenhuizen
5a3404d443 radeonsi: Export signalled sync file instead of -1.
-1 is considered an error for EGL_ANDROID_native_fence_sync, so
we need to actually create a sync file.

Fixes: f536f45250 "radeonsi: implement sync_file import/export"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-26 01:26:53 +01:00
Jason Ekstrand
db682b8f0e i965/fs: Reset the register file to VGRF in lower_integer_multiplication
18fde36ced changed the way temporary
registers were allocated in lower_integer_multiplication so that we
allocate regs_written(inst) space and keep the stride of the original
destination register.  This was to ensure that any MUL which originally
followed the CHV/BXT integer multiply regioning restrictions would
continue to follow those restrictions even after lowering.  This works
fine except that I forgot to reset the register file to VGRF so, even
though they were assigned a number from alloc.allocate(), they had the
wrong register file.  This caused some GLES 3.0 CTS tests to start
failing on Sandy Bridge due to attempted reads from the MRF:

    ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64

This commit remedies this problem by, instead of copying inst->dst and
overwriting nr, just make a new register and set the region to match
inst->dst.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626
Fixes: 18fde36ced
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-01-25 13:58:55 -08:00
Jason Ekstrand
af9d4ce480 vulkan: Update the XML and headers to 1.0.68
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2018-01-25 13:30:05 -08:00
Dave Airlie
f4c534ef68 radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)
This seems to be broken, at least the cts tests fail.

This fixes:
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_4
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_8

2 samples seem to pass fine; amdvlk doesn't appear to enable TC here,
possibly for some other reasons.

This is most likely a hack.

v1.1: add a bit of explanation text. (Samuel)
Fixes: ad3d98da9 (radv: enable tc compatible htile for d32s8 also.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-26 06:55:09 +10:00
Chuck Atkins
6ac5e851f1 configure.ac: add missing llvm dependencies to .pc files
v2: Only add as dependencies for gallium-osmesa and gallium-xlib

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 14:54:08 -05:00
George Kyriazis
5d8f270d10 swr/rast: Optimize DumpToFile output size
Modify DumpToFile to only dump the function, not the entire module.
Reduces file sizes and speeds up the dumping.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
dfe4dd48ec swr/rast: Updated copyright dates
on knob-related files.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
36dbbf11a0 swr/rast: Move memory-related JIT functions
Move them to their own file (builder_mem.{h|cpp}).  Add builder_mem.cpp
to the build system.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
94922dbe4b swr/rast: Add extra (optional) parameter in GATHERPS
Now also takes in an additional parameter (draw context) for future
expansion.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
0b46c7b3b0 swr/rast: Better ExecCmd (i.e. system()) implementation
Hides console window creation during JIT linker execution in apps that
don't have a console.  Remove hooking of CreateProcessInternalA - the
MSFT implementation just turns around and calls CreateProcessInternalW,
which we do hook.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
2d16b61bff swr/rast: Support USE_SIMD16_FRONTEND=0 for EarlyRast
Early Rasterization did not initially work with USE_SIMD16_FRONTEND=0.
Fix it so it works there, too.  Please note that the default setting
is USE_SIMD16_FRONTEND=1.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
Brian Paul
123798eb44 mesa: whitespace fixes in attrib.c
Trivial.
2018-01-25 12:17:26 -07:00
Brian Paul
0e7aaaf5a5 mesa: whitespace fixes in varray.h
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
ba01589c0c mesa: include mtypes.h in varray.h
We actually use some of the types from mtypes.h so include it directly
instead of relying on indirectly including it via bufferobj.h

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
e4504be6fc mesa: s/gl_vertex_attrib_array/gl_array_attributes/ in comments
The structure type was renamed some time ago, but some comments
were not updated.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
6c724fb7c1 mesa: simplify _mesa_delete_list() a bit, add some assertions
All but two cases of the switch did the same n += InstSize[n[0].opcode]
instruction.  Just move it after the switch.
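
The shape of that refactoring, sketched on a made-up instruction stream. The opcodes, the size table, and the walker are hypothetical and much simpler than the real display-list code; they only show the common advance being hoisted out of the switch while the two special cases keep their own control flow.

```c
#include <assert.h>
#include <stddef.h>

enum ex_opcode { EX_OP_POINT = 0, EX_OP_LINE, EX_OP_JUMP, EX_OP_END, EX_OP_COUNT };

/* Size, in nodes, of each instruction (the opcode node included). */
static const size_t ex_inst_size[EX_OP_COUNT] = { 2, 3, 2, 1 };

/* Walks the stream and returns the number of instructions visited. */
static size_t ex_count_instructions(const int *stream, size_t length)
{
   size_t n = 0, visited = 0;

   while (n < length) {
      int opcode = stream[n];
      assert(opcode >= 0 && opcode < EX_OP_COUNT);   /* sanity check */
      visited++;

      switch (opcode) {
      case EX_OP_JUMP:                 /* special case: continue elsewhere */
         assert(n + 1 < length);
         n = (size_t)stream[n + 1];
         continue;                     /* restarts the while loop */
      case EX_OP_END:                  /* special case: stop walking */
         return visited;
      default:
         break;                        /* per-opcode cleanup would go here */
      }

      /* Common advance, done once instead of in every case. */
      n += ex_inst_size[opcode];
   }
   return visited;
}
```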

Add some sanity check assertions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
c860171c63 st/mesa: expand glDrawPixels cache to handle multiple images
The newest version of WSI Fusion makes several glDrawPixels calls
per frame.  By caching more than one image, we get better performance
when panning/zooming the map.

v2: move pixel unpack param checking out of cache search loop, per Roland
v3: also move unpack->BufferObj check out of loop, per Roland.
2018-01-25 12:17:26 -07:00
Brian Paul
5092610f29 st/mesa: add some debug code in st_choose_format()
To aid in debugging gallium surface format selection issues.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
94610758a3 svga: s/Bool/SVGA3dBool/ in SVGA3dDevCapResult
And fix whitespace.  To sync up with in-house code.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 11:56:33 -07:00
Emil Velikov
6aeef54644 configure.ac: correct driglx-direct help text
The default was toggled a while back, but the text wasn't updated.

Fixes: bd526ec9e1 ("configure: Always default to
--enable-driglx-direct")
Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-01-25 17:44:35 +00:00
Emil Velikov
7b744a494d swrast: remove non-applicable GLX_SWAP_COPY_OML comment
Noticed while skimming for GLX_ instances in the dri codebase.
Comment is completely off and was in such a state since day 1.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:57 +00:00
Emil Velikov
3e3956d6ae mapi: remove duplicate GL typedefs
Remove the instances already available in gl.h or glext.h.
Sadly GLclampx is only available in GLES(1) so we need to keep that one.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:50 +00:00
Emil Velikov
647f40298a mapi: remove non applicable HAVE_DIX_CONFIG_H hunk
Seemingly an artefact from when the xserver build was diving directly into
mesa's tree.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:48 +00:00
Emil Velikov
48e7bc6833 mapi: autotools: remove unused MAPI_FILES file list
The sole user was OpenVG, which was removed a couple of years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:46 +00:00
Emil Velikov
785d9a4ed8 automake: st/mesa/tests: add st_tests_common.h to the tarball
Fixes: 6569b33b6e ("mesa/st/tests: unify MockCodeLine* classes")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
0beaf7ad3e automake: mesa: include vbo_private.h in the tarball
Fixes: a7cfec3be0 ("vbo: move VBO-private types, prototypes, etc. into
new vbo_private.h header")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
ac4437b20b automake: small cleanup after the meson.build inclusion
Namely extend the EXTRA_DIST list, instead of re-assigning it and bring
back a file dropped by mistake.

Fixes: 436ed65d38 ("autotools: include meson build files in tarball")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
50265cd9ee automake: anv: ship anv_extensions_gen.py in the tarball
Fixes: dd088d4bec ("anv/extensions: Generate a header file with
extension tables")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
265d36c890 automake: vc5: remove non-applicable v3dx_simulator.h
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:28 +00:00
Roland Scheidegger
4fe662c58f gallivm: fix crash with seamless cube filtering with different min/mag filter
We are not allowed to modify the incoming coords values, or things may
crash (as we may be inside a llvm conditional and the values may be used
in another branch).
I recently broke this when fixing an issue with NaNs and seamless cube
map filtering, and it causes crashes when doing cubemap filtering
if the min and mag filters are different.
Add const to the pointers passed in to prevent this mishap in the future.

Fixes: a485ad0bcd ("gallivm: fix an issue with NaNs with seamless cube filtering")

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-01-25 18:03:38 +01:00
Eric Engestrom
57223fb07a egl: keep extension list sorted, per comment at the top
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-01-25 16:38:11 +00:00
George Kyriazis
0e879aad2f swr/rast: support llvm 3.9 type declarations
LLVM 3.9 was not taken into account in initial check-in.

Fixes: 01ab218bbc ("swr/rast: Initial work for debugging support.")
cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104749
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 08:22:52 -06:00
Samuel Pitoiset
e1331c9d61 ac/nir: add break statements in needs_view_index_sgpr()
Previous code is correct but as the first case statement uses
a break, keep it consistent.

CID: 1428579
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-25 13:59:52 +01:00
Eric Engestrom
0663ae0aa1 loader: let compiler figure out the length of the string
Basically, turn comment into code
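
For illustration, assuming the pattern in question is a hard-coded length next to a comment naming the string it measures: sizeof on a string literal lets the compiler compute the length. The prefix string and names below are hypothetical, not the ones in the loader.

```c
#include <string.h>

/* Hypothetical prefix; the real string in the loader may differ. */
#define EX_PREFIX "pci-"

/* sizeof on a string literal counts the terminating NUL, hence the -1.
 * This replaces a magic number such as "4" with a "strlen("pci-")" comment. */
#define EX_PREFIX_LEN (sizeof(EX_PREFIX) - 1)

/* Returns nonzero if s starts with EX_PREFIX. */
static int ex_has_prefix(const char *s)
{
   return strncmp(s, EX_PREFIX, EX_PREFIX_LEN) == 0;
}
```

If the prefix is ever edited, the length follows automatically and the comment cannot go stale.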

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 11:40:25 +00:00
Eric Engestrom
57b0ccd178 meson: simplify dri3 logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-25 10:10:04 +00:00
Juan A. Suarez Romero
513c2263cb mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-25 09:54:31 +01:00
Jason Ekstrand
df13588d21 i965: Stop disabling aux during texture preparation
Previously, we were handling self-dependencies by marking the render
buffer and then passing disable_aux=true to prepare_texture so that it
would do a resolve.  This works but ends up doing too much resolving
in some cases.  Specifically, if we're doing something such as mipmap
generation, this would cause us to resolve all levels of the texture if
even one of them is overlapping.

Instead, this commit makes us wait until we process the framebuffer to
do these resolves and we only resolve the slices needed for rendering.
Doing this resolve puts them into the pass-through state so, even if we
do texture using CCS_E, the CCS data will effectively be ignored and the
real surface contents read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
20f70ae385 i965/draw: Set NEW_AUX_STATE when draw aux changes
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
e52a9f18d6 i965: Replace draw_aux_buffer_disabled with draw_aux_usage
Instead of keeping an array of booleans, we now hang onto an array of
isl_aux_usage enums.  This means that the thing we are passing from
brw_draw.c to surface state setup is the thing that surface state setup
actually needs instead of an input to compute what it needs.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
468ea3cc45 i965/surface_state: Drop brw_aux_surface_disabled
The only purpose of this function is to disable aux on texture surfaces
when the corresponding renderbuffer has aux disabled.  However, the act
of disabling aux on the renderbuffer will cause it to be resolved and
intel_miptree_texture_aux_usage will already check the resolved status
of a texture and return ISL_AUX_USAGE_NONE for it.  Even if we used CCS
for it, that wouldn't really be a problem because the CCS will be in the
pass-through state and so it would effectively be ignored.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
d38ec24f53 i965/miptree: Add an aux_disabled parameter to render_aux_usage
Only one of the callers of intel_miptree_render_aux_usage actually took
brw->draw_aux_buffer_disabled into account.  This was causing us to
ignore draw_aux_buffer_disabled for the intel_miptree_prepare_render.
This isn't a problem because the draw_aux_buffer_disabled entry was set
during texture preparation and we already did the resolve at that time.
However, this also meant that the aux_usage we were passing to
brw_cache_flush_for_render and brw_render_cache_add_bo was wrong so our
automatic cache flushing around aux_usage changes wasn't happening.
This was causing GPU hangs in Oxenfree.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104711
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
dfe0217905 i965/miptree: Take an aux_usage in prepare/finish_render
Both callers of intel_miptree_prepare/finish_render have to call
intel_miptree_render_aux_usage anyway for other reasons.  They may as
well pass the result in instead of us calling it again.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
7d4007d58a aubinator: Multiply count by 4 to compute buffer sizes
The count field is in terms of dwords and not bytes.
2018-01-24 19:05:36 -08:00
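The aubinator fix above comes down to a unit conversion. A minimal sketch, assuming a 4-byte dword; the function and constant names are hypothetical and not taken from aubinator:

```python
DWORD_SIZE_BYTES = 4  # assumption: one dword is 32 bits


def buffer_size_bytes(dword_count: int) -> int:
    """Convert a length expressed in dwords into a size in bytes."""
    if dword_count < 0:
        raise ValueError("dword count must be non-negative")
    return dword_count * DWORD_SIZE_BYTES
```

Using the raw count as a byte size would describe only a quarter of the buffer, which is the kind of mistake the commit message describes.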
Timothy Arceri
e776791432 st/glsl_to_nir: remove reallocation of sampler/image location
As far as I can tell this always just reassigns the same value.

Also, as we don't currently store UniformHash in the shader cache,
removing this will help with adding a shader cache to gallium
nir drivers.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-01-25 13:27:22 +11:00
Jordan Justen
62b68d05e7 docs: add 18.1.0-devel release notes template
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-24 17:10:58 -08:00
Jordan Justen
65c18b02fc mesa: bump version to 18.1.0-devel
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-24 17:10:58 -08:00
Greg V
8fae5eddd9 meson: handle LLVM 'x.x.xgit-revision' versions
When LLVM is built inside of a git repo (even way below, e.g. /usr/ports/.git
exists, and LLVM is built in /usr/ports/devel/llvm50/work), its version
becomes something like 5.0.0git-f8ab206b2176.

New meson versions already handle this, but we support older versions too.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-24 15:25:54 -08:00
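One way to cope with a version string such as 5.0.0git-f8ab206b2176 is to read only the leading numeric components and ignore the suffix. The sketch below illustrates that idea only; it is not the code added to meson.build, and the helper name is hypothetical:

```python
import re


def parse_llvm_version(version: str) -> tuple:
    """Return (major, minor, patch) from the start of an LLVM version string."""
    # Match three dot-separated numbers at the start; anything after them
    # (for example a "git-<revision>" suffix) is ignored.
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized LLVM version: {version!r}")
    return tuple(int(part) for part in match.groups())
```

Both a plain release string and the git-suffixed form from the commit message yield the same kind of numeric triple.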
Greg V
53f9131205 meson: fix getting cflags from pkg-config
get_pkgconfig_variable('cflags') always returns an empty list, it's a
function for getting *custom* variables.

Meson does not yet support asking for cflags, so explicitly invoke
pkg-config for now.

Fixes: 68076b8747 ("meson: build gallium vdpau state tracker")
Fixes: a817af8a89eb ("meson: build gallium xvmc state tracker")
Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-24 15:25:54 -08:00
Greg V
c38c60a63c meson: fix BSD build
CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-24 15:25:54 -08:00
Greg V
7c8cfe2d59 meson: fix missing dependencies
Fixes: 66f97f6640 ("meson: build radeonsi")
Reviewed-by: Emil Velikov <emil.velikov@colalbora.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-24 15:25:54 -08:00
Grazvydas Ignotas
0cc7370733 anv: correct a duplicate check in an assert
Looks like checking both sources was intended, instead of the first one
twice. Found with Coccinelle, coccinellery/xand/xand.cocci semantic patch.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-25 01:10:45 +02:00
Marc Dietrich
a2a1b0e75e meson: fix HAVE_LLVM version define in meson build
LLVM patch level is not included in HAVE_LLVM.

Fixes: e6418ab156 ("meson: build "radv" vulkan driver for radeon hardware")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
2018-01-24 14:04:20 -08:00
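As an illustration of the HAVE_LLVM fix above, the sketch below builds the define from the major and minor components only and drops the patch level. The `0x0<major>0<minor>` layout and the helper name are assumptions made for illustration (the commit message only says that the patch level is excluded), and the sketch handles single-digit components only:

```python
def have_llvm_define(version: str) -> str:
    """Build a HAVE_LLVM define from 'major.minor[.patch...]', ignoring the patch level."""
    major_text, minor_text = version.split(".")[:2]
    major, minor = int(major_text), int(minor_text)
    if not (0 <= major <= 9 and 0 <= minor <= 9):
        raise ValueError("this sketch only handles single-digit components")
    # Assumed layout: 0x0<major>0<minor>, so 5.0.x becomes 0x0500.
    return f"-DHAVE_LLVM=0x0{major}0{minor}"
```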
Dylan Baker
5781c3d1db meson: correctly set SYSCONFDIR for loading drirc
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-24 13:10:32 -08:00
Dave Airlie
d2414e64e4 radv: add multisample Z optimisation from amdvlk
This was just found while reading for other stuff,
src/core/hw/gfxip/gfx6/gfx6DepthStencilView.cpp.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-25 06:48:11 +10:00
Dave Airlie
298554541d radv: move spi_baryc_cntl to pipeline
We need to enable the pos float location 2 mode anytime we have
persample not just when forced by the frag shader.

This fixes:
dEQP-VK.pipeline.multisample.min_sample_shading*

Fixes: 58c97a079 (radv: enable location at sample when persample is forced.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-25 06:47:28 +10:00
Marek Olšák
125c0529f3 gallium/u_tests: add texture_barrier and FBFETCH tests
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-24 21:08:45 +01:00
Marek Olšák
022c5b22fe radeonsi: don't ignore pitch for imported textures
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-24 21:08:45 +01:00
Scott D Phillips
0b8d38bd48 meson: Fix define for USE_SSE41
Before we were adding -DHAVE_SSE41 which isn't what the code is
looking for, so some uses of the sse4.1 code were always being
skipped.

v2: Don't add any compile check for the quite old -msse4.1 option (Dylan)

Fixes: 84486f6462 ("meson: Enable SSE4.1 optimizations")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-24 11:32:34 -08:00
Gert Wollny
8172b9ff48 mesa/st/glsl_to_tgsi: remove now unneeded assert.
With the implementation of the tracking of the registers used in reladdr
asserting that a driver calling merge_register() uses the address register
is no longer needed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:34:05 -07:00
Gert Wollny
f2040fbe48 mesa/st/tests: Add tests for lifetime tracking with indirect addressing
Add a code line type that accepts one layer of indirect addressing and
add tests to check that temporary register access used for indirect
addressing is accounted for in the lifetime estimation.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:34:00 -07:00
Gert Wollny
51c0cee267 mesa/st/glsl_to_tgsi: Add tracking of indirect addressing registers
So far, indirect addressing was not tracked when estimating temporary
lifetimes, and it was not needed, because code to load the address
registers was always emitted, eliminating the reladdr* handles in the
past glsl_to_tgsi stages. Now, with Marek's patch allowing any 1D register
to be used for addressing on some hardware, this has changed and
the tracking becomes necessary.

Because the registers have no direct indication on whether the reladdr* was
already loaded into an address register, the temporaries in reladdr* are
always tracked as reads. This may result in a slight over-estimation of the
lifetime in the cases when the load to the address register was emitted.

v2: no changes
v3: Use debug_log variable instead of directly writing to std::cerr in debugging
    output.
v6: fix indentation and typos

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
517e34c62f mesa/st/tests: Add tests for improved tracking of temporaries
Additional tests are added that check the tracking of access to temporaries
in if-else branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
807e2539e5 mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register merging
Improve the life-time evaluation of temporary registers by also tracking
writes in both if and else branches and in up to 32 nested scopes.
As a result the estimated required register life-times can be further
reduced enabling more registers to be merged.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
8dda01ef5a mesa/st/tests: cleanup whitespace usage and correct some comments
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
6569b33b6e mesa/st/tests: unify MockCodeLine* classes
* Merge the classes MockCodeLine and MockCodelineWithSwizzle into
  one, and refactor tests accordingly.
* Change memory allocations to use ralloc* interface.

v2:
* move the test classes into a convenience library
* rename the Mock* classes to Fake* since they are not really
  Mocks
* Base assertion of correct number of src and dst registers in tests
  on what the operand actually expects
* Fix number of destinations in one test

v6:
* fix local includes using "..." instead of <...>

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
ad1990629e mesa/st/tests: Fix zero-byte allocation leaks
Don't allocate a zero-sized array, when no texture offsets are given.

v5: correct spaces and empty lines

Reviewed-by: Brian Paul <brianp@vmware.com> (v4)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
ee48e3acb8 mesa/st/glsl_to_tgsi: Add some operators for glsl_to_tgsi related classes
Add the equal operator and the "<<" stream write operator for the
st_*_reg classes and the "<<" operator to the instruction class, and
make use of these operators in the debugging output.

v5: Fix empty lines

Reviewed-by: Brian Paul <brianp@vmware.com> (v4)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
6a3421078a mesa/program: Add missing file types to printout
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Brian Paul
365a48abdd vbo: fix incorrect min/max_index values in display list draw call
This fixes another regression from commit 8e4efdc895 ("vbo: optimize
some display list drawing").  The problem was the min_index, max_index
values passed to the vbo drawing function were not computed to compensate
for the biased prim::start values.

https://bugs.freedesktop.org/show_bug.cgi?id=104746
https://bugs.freedesktop.org/show_bug.cgi?id=104742
https://bugs.freedesktop.org/show_bug.cgi?id=104690
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Fixes: 8e4efdc895 ("vbo: optimize some display list drawing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-01-24 10:12:49 -07:00
Brian Paul
2123bd2805 vbo: whitespace/formatting fixes in vbo_split_inplace.c
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
6b0109cf39 vbo: whitespace/formatting fixes in vbo.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
b9280031a8 vbo/i965: move vbo_all_varyings_in_vbos() to brw_draw.c
It's only used in brw_draw_prims().

s/GLboolean/bool/, etc.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a83f7e119c vbo: remove unused vbo_any_varyings_in_vbos() function
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
718f4251c5 vbo: remove unneeded #includes
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
f4376a0c2b vbo: remove vbo_context.h and change includes to use vbo.h instead
Now vbo.h is the public interface to the VBO module.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
aafb56a148 vbo: move remaining items from vbo_context.h to vbo.h
Non-VBO sources files sometimes included vbo.h while others included
vbo_context.h.  We're moving all public types, functions to the former.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a7cfec3be0 vbo: move VBO-private types, prototypes, etc. into new vbo_private.h header
Things which should not be used outside the VBO module.
More public/private clean-ups coming.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
d40fa42292 mesa: use new _vbo_install_exec_vtxfmt() function
Instead of reaching into the vbo_context object in vtxfmt.c

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
04a17ec327 nouveau: remove vbo_context() call
_vbo_DestroyContext() can be safely called even if there's no VBO
module.  Removes a dependency on the vbo_context() function.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
7b0ae96711 i965: use vbo_set_[indirect]_draw_func()
Instead of poking into the vbo_context object.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
3bbf8d9042 vbo: move vbo_sizeof_ib_type() into vbo_exec_array.c
It's only used in this one file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a152cb7492 mesa: move vbo_count_tessellated_primitives() to api_validate.c
It's only used in this file and has nothing VBO-specific about it.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
5d3e10fd27 mesa: update comment on gl_display_list
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
cffa82327d mesa: whitespace clean-ups in mtypes.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
b3a1aa94d9 mesa: remove unused MAT_INDEX_AMBIENT/DIFFUSE/SPECULAR constants
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
67dc551ba9 vbo: move DLIST_DANGLING_REFS from mtypes.h to vbo_save_api.c
It's only used in this file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
cb7ef0df00 vbo: replace assert(0) with unreachable()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
8b3cb7c651 vbo: fix, add comment in vbo_save.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
67ebde19d4 vbo: whitespace, formatting fixes in vbo_split.[ch]
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Topi Pohjolainen
ec4bb693a0 i965: Don't try to disable render aux buffers for compute
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104546
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-24 10:54:08 +02:00
Jason Ekstrand
4064fe59e7 anv/cmd_buffer: Move gen7 index buffer state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:46 -08:00
Jason Ekstrand
38ec78049f anv/cmd_buffer: Move num_workgroups to compute state
While we're here, make it an anv_address.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:44 -08:00
Jason Ekstrand
95ff232294 anv/cmd_buffer: Move dynamic state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:43 -08:00
Jason Ekstrand
24caee8975 anv/cmd_buffer: Use a temporary variable for dynamic state
We were already doing this for some packets to keep the lines shorter.
We may as well just do it for all of them.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:40 -08:00
Jason Ekstrand
8bd5ec5b86 anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state
Vertex buffers are entirely a graphics pipeline thing.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:39 -08:00
Jason Ekstrand
e85aaec148 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:36 -08:00
Jason Ekstrand
97f96610c8 anv: Separate compute and graphics descriptor sets
The Vulkan spec says:

    "pipelineBindPoint is a VkPipelineBindPoint indicating whether the
    descriptors will be used by graphics pipelines or compute pipelines.
    There is a separate set of bind points for each of graphics and
    compute, so binding one does not disturb the other."

Up until now, we've been ignoring the pipeline bind point and had just
one bind point for everything.  This commit separates things out into
separate bind points.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102897
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:33 -08:00
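The behaviour quoted from the Vulkan spec above can be sketched as one table of bound descriptor sets per bind point. This is a conceptual illustration with hypothetical names, not the anv data structures:

```python
from enum import Enum


class BindPoint(Enum):
    GRAPHICS = "graphics"
    COMPUTE = "compute"


class CommandState:
    """Tracks bound descriptor sets separately for each pipeline bind point."""

    def __init__(self):
        # One independent table per bind point, so binding a set for
        # compute never overwrites what is bound for graphics.
        self._sets = {bind_point: {} for bind_point in BindPoint}

    def bind_descriptor_set(self, bind_point, index, descriptor_set):
        self._sets[bind_point][index] = descriptor_set

    def bound_set(self, bind_point, index):
        return self._sets[bind_point].get(index)
```

With a single shared table, the second bind in a graphics-then-compute sequence at the same set index would replace the first, which is the situation the commit describes as ignoring the pipeline bind point.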
Jason Ekstrand
31b2144c83 anv/cmd_buffer: Use anv_descriptor_for_binding for samplers
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:31 -08:00
Jason Ekstrand
b9e1ca16f8 anv/cmd_buffer: Add a helper for binding descriptor sets
This lets us unify some code between push descriptors and regular
descriptors.  It doesn't do much for us yet but it will.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:30 -08:00
Jason Ekstrand
90cceaa9dd anv/cmd_buffer: Refactor ensure_push_descriptor_set
It's now a function which returns the push descriptor set.  Since we set
the error on the command buffer, returning the error is a little
redundant.  Returning the descriptor set (or NULL on error) is more
convenient.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:28 -08:00
Jason Ekstrand
d5592e2fda anv: Remove semicolons from vk_error[f] definitions
With the semicolons, they can't be used in a function argument without
throwing syntax errors.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:27 -08:00
Jason Ekstrand
9af5379228 anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute
Initially, these just contain the pipeline in a base struct.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:25 -08:00
Jason Ekstrand
ddc2d28548 anv/cmd_buffer: Use some pre-existing pipeline temporaries
There are several places where we'd already saved the pipeline off to a
temporary variable but, due to an artifact of history, weren't actually
using that temporary everywhere.  No functional change.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:24 -08:00
Jason Ekstrand
cd3feea745 anv/cmd_buffer: Rework anv_cmd_state_reset
This splits anv_cmd_state_reset into separate init and finish functions.
This lets us share init code with cmd_buffer_create.  This potentially
fixes subtle bugs where we may have missed some bit of state that needs
to get initialized on command buffer creation.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:22 -08:00
Jason Ekstrand
d6c9a89d13 anv/cmd_buffer: Get rid of the meta query workaround
Meta has been gone for a long time.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:20 -08:00
Jason Ekstrand
bc0a21e348 anv/cmd_state: Drop the scratch_size field
This is a legacy left-over from the mechanism we used to use to handle
scratch.  The new (and better) mechanism doesn't use this.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:19 -08:00
Jason Ekstrand
4b69ba3817 anv/pipeline: Don't assert on more than 32 samplers
This prevents an assert when running one unreleased Vulkan game.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:08 -08:00
Dave Airlie
766589d89a radv: fix sample_mask_in loading. (v3.1)
This is ported from radeonsi and fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_*

v2: don't call this path for radeonsi, it does it in the epilog.
use the radeonsi code path.
v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel)
v3.1: set ps_iter_samples default to 1 (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: bdcbe7c76 (radv: add sample mask input support)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 14:25:11 +10:00
Dave Airlie
c727ea9370 radv: don't use hw resolves for r16g16 norm formats.
radeonsi has a workaround for this, but it uses a R16A16 format,
which Vulkan doesn't have. We could probably come up with a
workaround, but for now just avoid hw resolves.

Fixes:
dEQP-VK.renderpass.suballocation.multisample.r16g16_*norm*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 09:01:12 +10:00
Dave Airlie
4df414bbd2 radv: don't use hw resolve for integer image formats
From reading AMDVLK it currently never uses hw resolve paths.

This patch takes from radeonsi which doesn't use hw resolve
for integer formats, and does the same for radv.

This fixes:
dEQP-VK.renderpass.suballocation.multisample*uint tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 08:53:18 +10:00
Dave Airlie
316d762186 radv: add fs_key meta format support to resolve passes.
Some of the hw resolve passes need the SPI color format setup
correctly.

This fixes lots of 16-bit and 32-bit format tests in
dEQP-VK.renderpass.suballocation.multisample*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 08:50:51 +10:00
Grazvydas Ignotas
224fd17e1e winsys/svga: check correct member after create
.mob_fenced was already checked, probably a copy-paste bug.
Found by Coccinelle.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-23 11:04:07 -07:00
Grazvydas Ignotas
08085df313 svga: fix context alloc error handling
'cleanup' path is dereferencing 'svga' a lot, 'done' is a better choice.
Found by Coccinelle.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-23 11:04:07 -07:00
Christoph Haag
4b4d929c27 meson: remove lib prefix from libd3dadapter9.so
Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-23 09:30:30 -08:00
Emil Velikov
3b6d232a5c docs: update calendar 18.0.0-rc1 is out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 17:02:17 +00:00
Eric Engestrom
eee8dd7c33 radeon: remove left over dead code
Fixes: 4e0d99a635 "r100: Use shared debug code"
Cc: Pauli Nieminen <suokkos@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-01-23 15:39:57 +00:00
Eric Engestrom
10f5e0dce2 docs: ask for backport nominations to cc: the author
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-23 15:39:57 +00:00
Marc Dietrich
911ca587f8 meson: fix some misspelled defines in meson.build
Defines
- HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
- HAVE_FUNC_ATTRIBUTE_VISIBILITY
were misspelled.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-23 15:39:57 +00:00
3143 changed files with 426112 additions and 154151 deletions


@@ -11,6 +11,7 @@ tab_width = 8
 [*.{c,h,cpp,hpp,cc,hh}]
 indent_style = space
 indent_size = 3
+max_line_length = 78
 [{Makefile*,*.mk}]
 indent_style = tab

499
.gitlab-ci.yml Normal file

@@ -0,0 +1,499 @@
# This is the tag of the docker image used for the build jobs. If the
# image doesn't exist yet, the containers-build stage generates it.
#
# In order to generate a new image, one should generally change the tag.
# While removing the image from the registry would also work, that's not
# recommended except for ephemeral images during development: Replacing
# an image after a significant amount of time might pull in newer
# versions of gcc/clang or other packages, which might break the build
# with older commits using the same tag.
#
# After merging a change resulting in generating a new image to the
# main repository, it's recommended to remove the image from the source
# repository's container registry, so that the image from the main
# repository's registry will be used there as well.
#
# The format of the tag is "%Y-%m-%d-${counter}" where ${counter} stays
# at "01" unless you have multiple updates on the same day :)
variables:
  UBUNTU_TAG: 2019-02-12-01
  UBUNTU_IMAGE: "$CI_REGISTRY_IMAGE/ubuntu:$UBUNTU_TAG"
  UBUNTU_IMAGE_MAIN: "registry.freedesktop.org/mesa/mesa/ubuntu:$UBUNTU_TAG"
cache:
  paths:
    - ccache
stages:
  - containers-build
  - build+test
# When to automatically run the CI
.ci-run-policy:
  only:
    - master
    - merge_requests
    - /^ci([-/].*)?$/
# CONTAINERS
containers:ubuntu:
  extends: .ci-run-policy
  stage: containers-build
  image: docker:stable
  services:
    - docker:dind
  variables:
    DOCKER_HOST: tcp://docker:2375
    DOCKER_DRIVER: overlay2
  script:
    # Enable experimental features such as `docker manifest inspect`
    - mkdir -p ~/.docker
    - "echo '{\"experimental\": \"enabled\"}' > ~/.docker/config.json"
    - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
    # Check if the image (with the specific tag) already exists
    - docker manifest inspect $UBUNTU_IMAGE && exit || true
    # Try to re-use the image from the main repository's registry
    - docker image pull $UBUNTU_IMAGE_MAIN &&
      docker image tag $UBUNTU_IMAGE_MAIN $UBUNTU_IMAGE &&
      docker image push $UBUNTU_IMAGE && exit || true
    - docker build -t $UBUNTU_IMAGE -f .gitlab-ci/Dockerfile.ubuntu .
    - docker push $UBUNTU_IMAGE
# BUILD
.build:
  extends: .ci-run-policy
  image: $UBUNTU_IMAGE
  stage: build+test
  artifacts:
    when: on_failure
    untracked: true
  # Use ccache transparently, and print stats before/after
  before_script:
    - export PATH="/usr/lib/ccache:$PATH"
    - export CCACHE_BASEDIR="$PWD"
    - export CCACHE_DIR="$PWD/ccache"
    - export CCACHE_COMPILERCHECK=content
    - ccache --zero-stats || true
    - ccache --show-stats || true
  after_script:
    - export CCACHE_DIR="$PWD/ccache"
    - ccache --show-stats
.meson-build:
  extends: .build
  script:
    # We need to control the version of llvm-config we're using, so we'll
    # generate a native file to do so. This requires meson >=0.49
    - if test -n "$LLVM_VERSION"; then
        LLVM_CONFIG="llvm-config-${LLVM_VERSION}";
        echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file;
        $LLVM_CONFIG --version;
      else
        touch native.file;
      fi
    - meson --version
    - meson _build
            --native-file=native.file
            -D build-tests=true
            -D libunwind=${UNWIND}
            ${DRI_LOADERS}
            -D dri-drivers=${DRI_DRIVERS:-[]}
            ${GALLIUM_ST}
            -D gallium-drivers=${GALLIUM_DRIVERS:-[]}
            -D vulkan-drivers=${VULKAN_DRIVERS:-[]}
    - cd _build
    - meson configure
    - ninja -j4
    - ninja test
.make-build:
  extends: .build
  variables:
    MAKEFLAGS: "-j4"
  script:
    - if test -n "$LLVM_VERSION"; then
        export LLVM_CONFIG="llvm-config-${LLVM_VERSION}";
      fi
    - mkdir build
    - cd build
    - ../autogen.sh
        --enable-autotools
        --enable-debug
        $LIBUNWIND_FLAGS
        $DRI_LOADERS
        --with-dri-drivers=$DRI_DRIVERS
        $GALLIUM_ST
        --with-gallium-drivers=$GALLIUM_DRIVERS
        --with-vulkan-drivers=$VULKAN_DRIVERS
        --disable-llvm-shared-libs
    - make
    - eval $MAKE_CHECK_COMMAND
.scons-build:
  extends: .build
  variables:
    SCONSFLAGS: "-j4"
  script:
    - if test -n "$LLVM_VERSION"; then
        export LLVM_CONFIG="llvm-config-${LLVM_VERSION}";
      fi
    - scons $SCONS_TARGET
    - eval $SCONS_CHECK_COMMAND
build:meson-vulkan:
  extends: .meson-build
  variables:
    UNWIND: "false"
    DRI_LOADERS: >
      -D glx=disabled
      -D gbm=false
      -D egl=false
      -D platforms=x11,wayland,drm
      -D osmesa=none
    GALLIUM_ST: >
      -D dri3=true
      -D gallium-vdpau=false
      -D gallium-xvmc=false
      -D gallium-omx=disabled
      -D gallium-va=false
      -D gallium-xa=false
      -D gallium-nine=false
      -D gallium-opencl=disabled
    VULKAN_DRIVERS: intel,amd
    LLVM_VERSION: "7"
build:meson-loader-classic-dri:
  extends: .meson-build
  variables:
    UNWIND: "false"
    DRI_LOADERS: >
      -D glx=dri
      -D gbm=true
      -D egl=true
      -D platforms=x11,wayland,drm,surfaceless
      -D osmesa=classic
    DRI_DRIVERS: "i915,i965,r100,r200,swrast,nouveau"
    GALLIUM_ST: >
      -D dri3=true
      -D gallium-vdpau=false
      -D gallium-xvmc=false
      -D gallium-omx=disabled
      -D gallium-va=false
      -D gallium-xa=false
      -D gallium-nine=false
      -D gallium-opencl=disabled
build:meson-glvnd:
  extends: .meson-build
  variables:
    UNWIND: "true"
    DRI_LOADERS: >
      -D glvnd=true
      -D egl=true
      -D gbm=true
      -D glx=dri
    DRI_DRIVERS: "i965"
    GALLIUM_ST: >
      -D gallium-vdpau=false
      -D gallium-xvmc=false
      -D gallium-omx=disabled
      -D gallium-va=false
      -D gallium-xa=false
      -D gallium-nine=false
      -D gallium-opencl=disabled
# NOTE: Building SWR is 2x (yes two) times slower than all the other
# gallium drivers combined.
# Start this early so that it doesn't hinder the run time.
build:meson-gallium-swr:
  extends: .meson-build
  variables:
    UNWIND: "true"
    DRI_LOADERS: >
      -D glx=disabled
      -D egl=false
      -D gbm=false
    GALLIUM_ST: >
      -D dri3=false
      -D gallium-vdpau=false
      -D gallium-xvmc=false
      -D gallium-omx=disabled
      -D gallium-va=false
      -D gallium-xa=false
      -D gallium-nine=false
      -D gallium-opencl=disabled
    GALLIUM_DRIVERS: "swr"
    LLVM_VERSION: "6.0"
build:meson-gallium-radeonsi:
  extends: .meson-build
  variables:
    UNWIND: "true"
    DRI_LOADERS: >
      -D glx=disabled
      -D egl=false
      -D gbm=false
    GALLIUM_ST: >
      -D dri3=false
      -D gallium-vdpau=false
      -D gallium-xvmc=false
      -D gallium-omx=disabled
      -D gallium-va=false
      -D gallium-xa=false
      -D gallium-nine=false
      -D gallium-opencl=disabled
    GALLIUM_DRIVERS: "radeonsi"
    LLVM_VERSION: "7"
build:meson-gallium-drivers-other:
  extends: .meson-build
  variables:
    UNWIND: "true"
    DRI_LOADERS: >
      -D glx=disabled
      -D egl=false
      -D gbm=false
    GALLIUM_ST: >
      -D dri3=false
      -D gallium-vdpau=false
      -D gallium-xvmc=false
      -D gallium-omx=disabled
      -D gallium-va=false
      -D gallium-xa=false
      -D gallium-nine=false
      -D gallium-opencl=disabled
    GALLIUM_DRIVERS: "i915,iris,nouveau,kmsro,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv"
    LLVM_VERSION: "5.0"
build:meson-gallium-clover-llvm5:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glx=disabled
-D egl=false
-D gbm=false
GALLIUM_ST: >
-D dri3=false
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=icd
GALLIUM_DRIVERS: "r600"
LLVM_VERSION: "5.0"
build:meson-gallium-clover-llvm6:
extends: build:meson-gallium-clover-llvm5
variables:
LLVM_VERSION: "6.0"
build:meson-gallium-clover-llvm7:
extends: build:meson-gallium-clover-llvm5
variables:
GALLIUM_DRIVERS: "r600,radeonsi"
LLVM_VERSION: "7"
build:meson-gallium-st-other:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glx=disabled
-D egl=false
-D gbm=false
GALLIUM_ST: >
-D dri3=true
-D gallium-vdpau=true
-D gallium-xvmc=true
-D gallium-omx=bellagio
-D gallium-va=true
-D gallium-xa=true
-D gallium-nine=true
-D gallium-opencl=disabled
-D osmesa=gallium
GALLIUM_DRIVERS: "nouveau,swrast"
LLVM_VERSION: "5.0"
build:make-vulkan:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "make -C src/gtest check && make -C src/intel check"
LLVM_VERSION: "7"
DRI_LOADERS: >
--disable-glx
--disable-gbm
--disable-egl
--with-platforms=x11,wayland,drm
DRI_DRIVERS: ""
GALLIUM_ST: >
--enable-dri
--enable-dri3
--disable-opencl
--disable-xa
--disable-nine
--disable-xvmc
--disable-vdpau
--disable-va
--disable-omx-bellagio
--disable-gallium-osmesa
VULKAN_DRIVERS: intel,radeon
LIBUNWIND_FLAGS: --disable-libunwind
build:make-loader-classic-dri:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "make check"
DRI_LOADERS: >
--enable-glx
--enable-gbm
--enable-egl
--with-platforms=x11,wayland,drm,surfaceless
--enable-osmesa
DRI_DRIVERS: "i915,i965,radeon,r200,swrast,nouveau"
GALLIUM_ST: >
--enable-dri
--disable-opencl
--disable-xa
--disable-nine
--disable-xvmc
--disable-vdpau
--disable-va
--disable-omx-bellagio
--disable-gallium-osmesa
LIBUNWIND_FLAGS: --disable-libunwind
# NOTE: Building SWR is 2x (yes two) times slower than all the other
# gallium drivers combined.
# Start this early so that it doesn't hinder the run time.
build:make-gallium-drivers-swr:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "true"
LLVM_VERSION: "6.0"
DRI_LOADERS: >
--disable-glx
--disable-gbm
--disable-egl
GALLIUM_ST: >
--enable-dri
--disable-opencl
--disable-xa
--disable-nine
--disable-xvmc
--disable-vdpau
--disable-va
--disable-omx-bellagio
--disable-gallium-osmesa
GALLIUM_DRIVERS: "swr"
LIBUNWIND_FLAGS: --enable-libunwind
build:make-gallium-drivers-radeonsi:
extends: build:make-gallium-drivers-swr
variables:
LLVM_VERSION: "7"
GALLIUM_DRIVERS: "radeonsi"
build:make-gallium-drivers-other:
extends: build:make-gallium-drivers-swr
variables:
LLVM_VERSION: "3.9"
GALLIUM_DRIVERS: "i915,nouveau,kmsro,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv"
build:make-gallium-st-clover-llvm-39:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "true"
LLVM_VERSION: "3.9"
DRI_LOADERS: >
--disable-glx
--disable-gbm
--disable-egl
GALLIUM_ST: >
--disable-dri
--enable-opencl
--enable-opencl-icd
--enable-llvm
--disable-xa
--disable-nine
--disable-xvmc
--disable-vdpau
--disable-va
--disable-omx-bellagio
--disable-gallium-osmesa
GALLIUM_DRIVERS: "r600"
LIBUNWIND_FLAGS: --enable-libunwind
build:make-gallium-st-clover-llvm-4:
extends: build:make-gallium-st-clover-llvm-39
variables:
LLVM_VERSION: "4.0"
build:make-gallium-st-clover-llvm-5:
extends: build:make-gallium-st-clover-llvm-39
variables:
LLVM_VERSION: "5.0"
build:make-gallium-st-clover-llvm-6:
extends: build:make-gallium-st-clover-llvm-39
variables:
LLVM_VERSION: "6.0"
build:make-gallium-st-clover-llvm-7:
extends: build:make-gallium-st-clover-llvm-39
variables:
LLVM_VERSION: "7"
GALLIUM_DRIVERS: "r600,radeonsi"
build:make-gallium-st-other:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "true"
# We should be testing 3.3, but 3.9 is the oldest that still exists in ubuntu
LLVM_VERSION: "3.9"
DRI_LOADERS: >
--disable-glx
--disable-gbm
--disable-egl
GALLIUM_ST: >
--enable-dri
--disable-opencl
--enable-xa
--enable-nine
--enable-xvmc
--enable-vdpau
--enable-va
--enable-omx-bellagio
--enable-gallium-osmesa
# We need swrast for osmesa and nine.
# i915 most likely doesn't work with most ST.
# Regardless - we're doing a quick build test here.
GALLIUM_DRIVERS: "i915,swrast"
LIBUNWIND_FLAGS: --enable-libunwind
build:scons-nollvm:
extends: .scons-build
variables:
SCONS_TARGET: "llvm=0"
SCONS_CHECK_COMMAND: "scons llvm=0 check"
build:scons-llvm:
extends: .scons-build
variables:
SCONS_TARGET: "llvm=1"
SCONS_CHECK_COMMAND: "scons llvm=1 check"
LLVM_VERSION: "3.9"
build:scons-swr:
extends: .scons-build
variables:
SCONS_TARGET: "swr=1"
SCONS_CHECK_COMMAND: "true"
LLVM_VERSION: "6.0"
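Several meson jobs above leave `DRI_DRIVERS`, `GALLIUM_DRIVERS` or `VULKAN_DRIVERS` unset. The sketch below shows how unset driver lists could fall back to an empty meson array. It assumes the job template passes these variables to meson with the same `${VAR:-[]}` expansion that the Travis script in this change uses. The function name `meson_driver_args` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the three driver options, substituting the
# empty meson array "[]" for any variable that is unset or empty.
meson_driver_args() {
    echo "-Ddri-drivers=${DRI_DRIVERS:-[]}" \
         "-Dgallium-drivers=${GALLIUM_DRIVERS:-[]}" \
         "-Dvulkan-drivers=${VULKAN_DRIVERS:-[]}"
}

# Example with the values of the build:meson-vulkan job, which sets only
# VULKAN_DRIVERS. The subshell keeps the assignments local to the example.
(
    unset DRI_DRIVERS GALLIUM_DRIVERS
    VULKAN_DRIVERS="intel,amd"
    meson_driver_args
)
```

Using `:-` rather than `-` in the expansion means a variable set to the empty string is also replaced by `[]`.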


@@ -0,0 +1,165 @@
FROM ubuntu:bionic
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y \
curl \
wget \
gnupg \
software-properties-common
RUN curl -fsSL https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
RUN add-apt-repository "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-7 main"
RUN apt-get update
RUN apt-get install -y \
pkg-config \
libdrm-dev \
libpciaccess-dev \
libxrandr-dev \
libxdamage-dev \
libxfixes-dev \
libxshmfence-dev \
libxxf86vm-dev \
libvdpau-dev \
libva-dev \
llvm-3.9-dev \
libclang-3.9-dev \
llvm-4.0-dev \
libclang-4.0-dev \
llvm-5.0-dev \
llvm-6.0-dev \
llvm-7-dev \
clang-5.0 \
libclang-5.0-dev \
clang-6.0 \
libclang-6.0-dev \
clang-7 \
libclang-7-dev \
libclc-dev \
libxvmc-dev \
libomxil-bellagio-dev \
xz-utils \
libexpat1-dev \
libx11-xcb-dev \
x11proto-xf86vidmode-dev \
libelf-dev \
libunwind8-dev \
libglvnd-dev \
python2.7 \
python-pip \
python-setuptools \
python3.5 \
python3-pip \
python3-setuptools
RUN apt-get install -y \
libxcb-randr0
# autotools build deps
RUN apt-get install -y \
autoconf \
automake \
xutils-dev \
libtool \
bison \
flex \
gettext \
make
# dependencies where we want a specific version
ENV XORG_RELEASES https://xorg.freedesktop.org/releases/individual
ENV XCB_RELEASES https://xcb.freedesktop.org/dist
ENV WAYLAND_RELEASES https://wayland.freedesktop.org/releases
ENV XORGMACROS_VERSION util-macros-1.19.0
ENV GLPROTO_VERSION glproto-1.4.17
ENV DRI2PROTO_VERSION dri2proto-2.8
ENV LIBPCIACCESS_VERSION libpciaccess-0.13.4
ENV LIBDRM_VERSION libdrm-2.4.97
ENV XCBPROTO_VERSION xcb-proto-1.13
ENV RANDRPROTO_VERSION randrproto-1.3.0
ENV LIBXRANDR_VERSION libXrandr-1.3.0
ENV LIBXCB_VERSION libxcb-1.13
ENV LIBXSHMFENCE_VERSION libxshmfence-1.3
ENV LIBVDPAU_VERSION libvdpau-1.1
ENV LIBVA_VERSION libva-1.7.0
ENV LIBWAYLAND_VERSION wayland-1.15.0
ENV WAYLAND_PROTOCOLS_VERSION wayland-protocols-1.8
RUN wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
RUN tar -xvf $XORGMACROS_VERSION.tar.bz2 && rm $XORGMACROS_VERSION.tar.bz2
RUN (cd $XORGMACROS_VERSION && ./configure && make install) && rm -rf $XORGMACROS_VERSION
RUN wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
RUN tar -xvf $GLPROTO_VERSION.tar.bz2 && rm $GLPROTO_VERSION.tar.bz2
RUN (cd $GLPROTO_VERSION && ./configure && make install) && rm -rf $GLPROTO_VERSION
RUN wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
RUN tar -xvf $DRI2PROTO_VERSION.tar.bz2 && rm $DRI2PROTO_VERSION.tar.bz2
RUN (cd $DRI2PROTO_VERSION && ./configure && make install) && rm -rf $DRI2PROTO_VERSION
RUN wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
RUN tar -xvf $XCBPROTO_VERSION.tar.bz2 && rm $XCBPROTO_VERSION.tar.bz2
RUN (cd $XCBPROTO_VERSION && ./configure && make install) && rm -rf $XCBPROTO_VERSION
RUN wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
RUN tar -xvf $LIBXCB_VERSION.tar.bz2 && rm $LIBXCB_VERSION.tar.bz2
RUN (cd $LIBXCB_VERSION && ./configure && make install) && rm -rf $LIBXCB_VERSION
RUN wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
RUN tar -xvf $LIBPCIACCESS_VERSION.tar.bz2 && rm $LIBPCIACCESS_VERSION.tar.bz2
RUN (cd $LIBPCIACCESS_VERSION && ./configure && make install) && rm -rf $LIBPCIACCESS_VERSION
RUN wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
RUN tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2
RUN (cd $LIBDRM_VERSION && ./configure --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install) && rm -rf $LIBDRM_VERSION
RUN wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2
RUN tar -xvf $RANDRPROTO_VERSION.tar.bz2 && rm $RANDRPROTO_VERSION.tar.bz2
RUN (cd $RANDRPROTO_VERSION && ./configure && make install) && rm -rf $RANDRPROTO_VERSION
RUN wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2
RUN tar -xvf $LIBXRANDR_VERSION.tar.bz2 && rm $LIBXRANDR_VERSION.tar.bz2
RUN (cd $LIBXRANDR_VERSION && ./configure && make install) && rm -rf $LIBXRANDR_VERSION
RUN wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
RUN tar -xvf $LIBXSHMFENCE_VERSION.tar.bz2 && rm $LIBXSHMFENCE_VERSION.tar.bz2
RUN (cd $LIBXSHMFENCE_VERSION && ./configure && make install) && rm -rf $LIBXSHMFENCE_VERSION
RUN wget https://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
RUN tar -xvf $LIBVDPAU_VERSION.tar.bz2 && rm $LIBVDPAU_VERSION.tar.bz2
RUN (cd $LIBVDPAU_VERSION && ./configure && make install) && rm -rf $LIBVDPAU_VERSION
RUN wget https://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
RUN tar -xvf $LIBVA_VERSION.tar.bz2 && rm $LIBVA_VERSION.tar.bz2
RUN (cd $LIBVA_VERSION && ./configure --disable-wayland --disable-dummy-driver && make install) && rm -rf $LIBVA_VERSION
RUN wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
RUN tar -xvf $LIBWAYLAND_VERSION.tar.xz && rm $LIBWAYLAND_VERSION.tar.xz
RUN (cd $LIBWAYLAND_VERSION && ./configure --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install) && rm -rf $LIBWAYLAND_VERSION
RUN wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
RUN tar -xvf $WAYLAND_PROTOCOLS_VERSION.tar.xz && rm $WAYLAND_PROTOCOLS_VERSION.tar.xz
RUN (cd $WAYLAND_PROTOCOLS_VERSION && ./configure && make install) && rm -rf $WAYLAND_PROTOCOLS_VERSION
RUN apt-get install -y unzip
# Meson requires ninja >= 1.6, but xenial has 1.3.x
RUN wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip
RUN unzip ninja-linux.zip && rm ninja-linux.zip
RUN mv ninja /usr/bin/
RUN pip3 install 'meson>=0.49'
RUN pip2 install 'scons>=2.4'
RUN pip2 install mako
RUN pip3 install mako
# Use ccache to speed up builds
RUN apt-get install -y ccache
# Cleanup workdir
WORKDIR /
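The Dockerfile above repeats the same three steps for every pinned dependency: download the tarball, unpack it and delete the archive, then configure, install and delete the source tree. As an illustration only, the sketch below prints those three `RUN` lines for one package. The helper name `emit_install_steps` is hypothetical and is not part of the image build.

```shell
#!/bin/sh
# Hypothetical helper: print the three Dockerfile RUN lines for one tarball.
# $1: base URL, $2: package name with version (e.g. libxcb-1.13),
# $3: archive extension (bz2 or xz), $4: optional extra ./configure flags.
emit_install_steps() {
    url=$1
    pkg=$2
    ext=$3
    flags=${4:-}
    echo "RUN wget $url/$pkg.tar.$ext"
    echo "RUN tar -xvf $pkg.tar.$ext && rm $pkg.tar.$ext"
    # ${flags:+ $flags} adds a leading space only when flags are given.
    echo "RUN (cd $pkg && ./configure${flags:+ $flags} && make install) && rm -rf $pkg"
}

# Example: regenerate the lines for libxcb and for libva with its flags.
emit_install_steps "https://xcb.freedesktop.org/dist" "libxcb-1.13" "bz2"
emit_install_steps "https://www.freedesktop.org/software/vaapi/releases/libva" \
    "libva-1.7.0" "bz2" "--disable-wayland --disable-dummy-driver"
```

Generating the lines keeps each step as its own `RUN` layer, as in the original; merging the three steps into one `RUN` would be a different design choice that this sketch does not make.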


@@ -145,9 +145,16 @@ Edward O'Callaghan <funfunctor@folklore1984.net> <eocallaghan@alterapraxis.com>
Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
Emil Velikov <emil.l.velikov@gmail.com> <emil.veliko@collabora.com>
Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.co.uk>
Emil Velikov <emil.l.velikov@gmail.com> <emil.veliikov@collabora.com>
Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@gmail.com>
Emil Velikov <emil.l.velikov@gmail.com> <emmil.velikov@collabora.com>
Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
Eric Engestrom <eric@engestrom.ch> <eric.engestrom@imgtec.com>
Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
Fabian Bieler <der.fabe@gmx.net> <fabianbieler@fastmail.fm>
@@ -258,6 +265,9 @@ Kristian Høgsberg <krh@bitplanet.net> <krh@hinata.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <krh@sasori.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <krh@temari.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <kristian.h.kristensen@intel.com>
Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@chromium.org>
Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@google.com>
Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@gmail.com>
Krzesimir Nowak <qdlacz@gmail.com> <krzesimir@kinvolk.io>


@@ -1,500 +1,79 @@
language: c language: c
sudo: false dist: xenial
dist: trusty
cache: cache:
apt: true
ccache: true ccache: true
env: env:
global: global:
- XORG_RELEASES=http://xorg.freedesktop.org/releases/individual - PKG_CONFIG_PATH="$PKG_CONFIG_PATH"
- XCB_RELEASES=http://xcb.freedesktop.org/dist
- WAYLAND_RELEASES=http://wayland.freedesktop.org/releases
- XORGMACROS_VERSION=util-macros-1.19.0
- GLPROTO_VERSION=glproto-1.4.17
- DRI2PROTO_VERSION=dri2proto-2.8
- LIBPCIACCESS_VERSION=libpciaccess-0.13.4
- LIBDRM_VERSION=libdrm-2.4.74
- XCBPROTO_VERSION=xcb-proto-1.11
- LIBXCB_VERSION=libxcb-1.11
- LIBXSHMFENCE_VERSION=libxshmfence-1.2
- LIBVDPAU_VERSION=libvdpau-1.1
- LIBVA_VERSION=libva-1.6.2
- LIBWAYLAND_VERSION=wayland-1.11.1
- WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
- PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig
- LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
- PATH="$HOME/prefix/bin:$PATH"
matrix: matrix:
include: include:
- env: - env:
- LABEL="meson Vulkan" - LABEL="macOS make"
- BUILD=meson
- MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- libexpat1-dev
- libelf-dev
- python3-pip
- env:
- LABEL="meson loaders/classic DRI"
- BUILD=meson
- MESON_OPTIONS="-Dvulkan-drivers= -Dgallium-drivers="
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- python3-pip
- env:
- LABEL="make loaders/classic DRI"
- BUILD=make - BUILD=make
- MAKEFLAGS="-j4" - MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make check" - MAKE_CHECK_COMMAND="make check"
- DRI_LOADERS="--enable-glx --enable-gbm --enable-egl --with-platforms=x11,drm,surfaceless,wayland --enable-osmesa" - DRI_LOADERS="--with-platforms=x11 --disable-egl"
- DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau" os: osx
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--disable-libunwind"
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- env: - env:
# NOTE: Building SWR is 2x (yes two) times slower than all the other - LABEL="macOS meson"
# gallium drivers combined. - BUILD=meson
# Start this early so that it doesn't hinder the run time. - UNWIND="false"
- LABEL="make Gallium Drivers SWR" - DRI_LOADERS="-Dglx=dri -Dgbm=false -Degl=false -Dplatforms=x11 -Dosmesa=none"
- BUILD=make - GALLIUM_ST="-Ddri3=true -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled"
- MAKEFLAGS="-j4" os: osx
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC="gcc-4.8"
- OVERRIDE_CXX="g++-4.8"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="swr"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium Drivers Other"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
# NOTE: Analogous to SWR above, building Clover is quite slow.
- LABEL="make Gallium ST Clover LLVM-3.9"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.7
- OVERRIDE_CXX=g++-4.7
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.7
# From sources above
- llvm-3.9-dev
- clang-3.9
- libclang-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
# NOTE: Analogous to SWR above, building Clover is quite slow.
- LABEL="make Gallium ST Clover LLVM-4.0"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.8
- OVERRIDE_CXX=g++-4.8
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-4.0
packages:
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.8
# From sources above
- llvm-4.0-dev
- clang-4.0
- libclang-4.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
# NOTE: Analogous to SWR above, building Clover is quite slow.
- LABEL="make Gallium ST Clover LLVM-5.0"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=5.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.8
- OVERRIDE_CXX=g++-4.8
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-5.0
packages:
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.8
# From sources above
- llvm-5.0-dev
- clang-5.0
- libclang-5.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium ST Other"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.3
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --enable-xa --enable-nine --enable-xvmc --enable-vdpau --enable-va --enable-omx-bellagio --enable-gallium-osmesa"
# We need swrast for osmesa and nine.
# i915 most likely doesn't work with most ST.
# Regardless - we're doing a quick build test here.
- GALLIUM_DRIVERS="i915,swrast"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
packages:
# We actually want to test against llvm-3.3
- llvm-3.3-dev
# Nine requires gcc 4.6... which is the one we have right ?
- libxvmc-dev
# Build locally, for now.
#- libvdpau-dev
#- libva-dev
- libomxil-bellagio-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Vulkan"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel check"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl --with-platforms=x11,wayland"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --enable-dri3 --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS="intel,radeon"
- LIBUNWIND_FLAGS="--disable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="scons"
- BUILD=scons
- SCONSFLAGS="-j4"
# Explicitly disable.
- SCONS_TARGET="llvm=0"
# Keep it symmetrical to the make build.
- SCONS_CHECK_COMMAND="scons llvm=0 check"
addons:
apt:
packages:
- scons
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="scons LLVM"
- BUILD=scons
- SCONSFLAGS="-j4"
- SCONS_TARGET="llvm=1"
# Keep it symmetrical to the make build.
- SCONS_CHECK_COMMAND="scons llvm=1 check"
- LLVM_VERSION=3.3
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
addons:
apt:
packages:
- scons
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- llvm-3.3-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="scons SWR"
- BUILD=scons
- SCONSFLAGS="-j4"
- SCONS_TARGET="swr=1"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# Keep it symmetrical to the make build. There's no actual SWR, yet.
- SCONS_CHECK_COMMAND="true"
- OVERRIDE_CC="gcc-4.8"
- OVERRIDE_CXX="g++-4.8"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- scons
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
install: before_install:
- pip install --user mako - |
if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then
HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext
# Set PATH for homebrew pip3 installs
PATH="$HOME/Library/Python/3.6/bin:${PATH}"
# Set PKG_CONFIG_PATH for keg-only expat
PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"
# Set PATH for keg-only gettext
PATH="/usr/local/opt/gettext/bin:${PATH}"
# Install the latest meson from pip, since the version in the ubuntu repos is # Install xquartz for prereqs ...
# often quite old. XQUARTZ_VERSION="2.7.11"
- if test "x$BUILD" = xmeson; then wget -nv https://dl.bintray.com/xquartz/downloads/XQuartz-${XQUARTZ_VERSION}.dmg
pip3 install --user meson; hdiutil attach XQuartz-${XQUARTZ_VERSION}.dmg
sudo installer -pkg /Volumes/XQuartz-${XQUARTZ_VERSION}/XQuartz.pkg -target /
hdiutil detach /Volumes/XQuartz-${XQUARTZ_VERSION}
# ... and set paths
PATH="/opt/X11/bin:${PATH}"
PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"
ACLOCAL="aclocal -I /opt/X11/share/aclocal -I /usr/local/share/aclocal"
fi fi
# Since libdrm gets updated in configure.ac regularly, try to pick up the install:
# latest version from there. # Install a more modern meson from pip, since the version in the
- for line in `grep "^LIBDRM.*_REQUIRED=" configure.ac`; do # ubuntu repos is often quite old.
old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`; - if test "x$BUILD" = xmeson; then
new_ver=`echo $line | sed 's/.*REQUIRED=//'`; pip3 install --user meson;
if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then pip3 install --user mako;
export LIBDRM_VERSION="libdrm-$new_ver"; fi
fi;
done # Install autotools build dependencies
- if test "x$BUILD" = xmake; then
pip2 install --user mako;
fi
# Install dependencies where we require specific versions (or where # Install dependencies where we require specific versions (or where
# disallowed by Travis CI's package whitelisting). # disallowed by Travis CI's package whitelisting).
- wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
- tar -jxvf $XORGMACROS_VERSION.tar.bz2
- (cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
- tar -jxvf $GLPROTO_VERSION.tar.bz2
- (cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
- tar -jxvf $DRI2PROTO_VERSION.tar.bz2
- (cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
- tar -jxvf $XCBPROTO_VERSION.tar.bz2
- (cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
- tar -jxvf $LIBXCB_VERSION.tar.bz2
- (cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
- tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2
- (cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
- tar -jxvf $LIBDRM_VERSION.tar.bz2
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)
- wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
- tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
- (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
- tar -jxvf $LIBVDPAU_VERSION.tar.bz2
- (cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
- tar -jxvf $LIBVA_VERSION.tar.bz2
- (cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
- wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
- tar -axvf $LIBWAYLAND_VERSION.tar.xz
- (cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
- wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
- tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz
- (cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)
# Meson requires ninja >= 1.6, but trusty has 1.3.x
- wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip;
- unzip ninja-linux.zip
- mv ninja $HOME/prefix/bin/
# Generate the header since one is missing on the Travis instance
- mkdir -p linux
- printf "%s\n" \
"#ifndef _LINUX_MEMFD_H" \
"#define _LINUX_MEMFD_H" \
"" \
"#define __NR_memfd_create 319" \
"#define SYS_memfd_create __NR_memfd_create" \
"" \
"#define MFD_CLOEXEC 0x0001U" \
"#define MFD_ALLOW_SEALING 0x0002U" \
"" \
"#endif /* _LINUX_MEMFD_H */" > linux/memfd.h
script: script:
- if test "x$BUILD" = xmake; then - if test "x$BUILD" = xmake; then
test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
test -n "$OVERRIDE_PATH" && export PATH="$OVERRIDE_PATH:$PATH";
export CFLAGS="$CFLAGS -isystem`pwd`"; export CFLAGS="$CFLAGS -isystem`pwd`";
./autogen.sh --enable-debug mkdir build &&
cd build &&
../autogen.sh
--enable-autotools
--enable-debug
$LIBUNWIND_FLAGS $LIBUNWIND_FLAGS
$DRI_LOADERS $DRI_LOADERS
--with-dri-drivers=$DRI_DRIVERS --with-dri-drivers=$DRI_DRIVERS
@@ -506,14 +85,30 @@ script:
make && eval $MAKE_CHECK_COMMAND; make && eval $MAKE_CHECK_COMMAND;
fi fi
- if test "x$BUILD" = xscons; then - |
test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC"; if test "x$BUILD" = xmeson; then
test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX"; if test -n "$LLVM_CONFIG"; then
scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND; # We need to control the version of llvm-config we're using, so we'll
fi # generate a native file to do so. This requires meson >=0.49
#
echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file
- if test "x$BUILD" = xmeson; then $LLVM_CONFIG --version
export CFLAGS="$CFLAGS -isystem`pwd`"; else
meson _build $MESON_OPTIONS; : > native.file
ninja -C _build; fi
export CFLAGS="$CFLAGS -isystem`pwd`"
meson _build \
--native-file=native.file \
-Dbuild-tests=true \
-Dlibunwind=${UNWIND} \
${DRI_LOADERS} \
-Ddri-drivers=${DRI_DRIVERS:-[]} \
${GALLIUM_ST} \
-Dgallium-drivers=${GALLIUM_DRIVERS:-[]} \
-Dvulkan-drivers=${VULKAN_DRIVERS:-[]}
meson configure _build
ninja -C _build
ninja -C _build test
fi fi
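The meson branch of the script above pins the LLVM version by writing a native file that names the `llvm-config` binary, or an empty file when `LLVM_CONFIG` is unset. A minimal sketch of that step follows. The function name `write_native_file` is hypothetical, and `printf` and `command -v` are used here as portable stand-ins for the `echo -e` and `which` calls in the original.

```shell
#!/bin/sh
# Hypothetical helper: write a meson native file that selects llvm-config.
# $1: llvm-config binary name (may be empty), $2: output path.
write_native_file() {
    if test -n "$1"; then
        # Resolve the binary to a full path, as the original does with which.
        printf "[binaries]\nllvm-config = '%s'\n" "$(command -v "$1")" > "$2"
    else
        # No LLVM requested: an empty native file keeps the meson call uniform.
        : > "$2"
    fi
}

# Example: write to a temporary file and show the result. "sh" stands in
# for a real llvm-config-N binary so the example runs anywhere.
native_file=$(mktemp)
write_native_file "sh" "$native_file"
cat "$native_file"
rm -f "$native_file"
```

Writing an empty file in the fallback case lets the later `meson _build --native-file=native.file` invocation stay the same for every job.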


@@ -37,7 +37,6 @@ LOCAL_CFLAGS += \
-Wno-missing-field-initializers \ -Wno-missing-field-initializers \
-Wno-initializer-overrides \ -Wno-initializer-overrides \
-Wno-mismatched-tags \ -Wno-mismatched-tags \
-DVERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \ -DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" -DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"
@@ -52,6 +51,7 @@ LOCAL_CFLAGS += \
-DHAVE___BUILTIN_EXPECT \ -DHAVE___BUILTIN_EXPECT \
-DHAVE___BUILTIN_FFS \ -DHAVE___BUILTIN_FFS \
-DHAVE___BUILTIN_FFSLL \ -DHAVE___BUILTIN_FFSLL \
-DHAVE_DLFCN_H \
-DHAVE_FUNC_ATTRIBUTE_FLATTEN \ -DHAVE_FUNC_ATTRIBUTE_FLATTEN \
-DHAVE_FUNC_ATTRIBUTE_UNUSED \ -DHAVE_FUNC_ATTRIBUTE_UNUSED \
-DHAVE_FUNC_ATTRIBUTE_FORMAT \ -DHAVE_FUNC_ATTRIBUTE_FORMAT \
@@ -70,9 +70,13 @@ LOCAL_CFLAGS += \
-DHAVE_DLADDR \ -DHAVE_DLADDR \
-DHAVE_DL_ITERATE_PHDR \ -DHAVE_DL_ITERATE_PHDR \
-DHAVE_LINUX_FUTEX_H \ -DHAVE_LINUX_FUTEX_H \
-DHAVE_ENDIAN_H \
-DHAVE_ZLIB \ -DHAVE_ZLIB \
-DMAJOR_IN_SYSMACROS \ -DMAJOR_IN_SYSMACROS \
-DVK_USE_PLATFORM_ANDROID_KHR \
-fvisibility=hidden \ -fvisibility=hidden \
-fno-math-errno \
-fno-trapping-math \
-Wno-sign-compare -Wno-sign-compare
LOCAL_CPPFLAGS += \ LOCAL_CPPFLAGS += \
@@ -86,6 +90,13 @@ LOCAL_CPPFLAGS += \
LOCAL_CONLYFLAGS += \ LOCAL_CONLYFLAGS += \
-std=c99 -std=c99
# c11 timespec_get is part of bionic as well
# https://android-review.googlesource.com/c/718518
# This means releases from P and earlier won't need this
ifeq ($(filter 5 6 7 8 9, $(MESA_ANDROID_MAJOR_VERSION)),)
LOCAL_CFLAGS += -DHAVE_TIMESPEC_GET
endif
ifeq ($(strip $(MESA_ENABLE_ASM)),true) ifeq ($(strip $(MESA_ENABLE_ASM)),true)
ifeq ($(TARGET_ARCH),x86) ifeq ($(TARGET_ARCH),x86)
LOCAL_CFLAGS += \ LOCAL_CFLAGS += \

View File

@@ -24,7 +24,7 @@
 # BOARD_GPU_DRIVERS should be defined. The valid values are
 #
 # classic drivers: i915 i965
-# gallium drivers: swrast freedreno i915g nouveau pl111 r300g r600g radeonsi vc4 virgl vmwgfx etnaviv imx
+# gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv iris
 #
 # The main target is libGLES_mesa. For each classic driver enabled, a DRI
 # module will also be built. DRI modules will be loaded by libGLES_mesa.
@@ -52,7 +52,7 @@ gallium_drivers := \
 freedreno.HAVE_GALLIUM_FREEDRENO \
 i915g.HAVE_GALLIUM_I915 \
 nouveau.HAVE_GALLIUM_NOUVEAU \
-pl111.HAVE_GALLIUM_PL111 \
+kmsro.HAVE_GALLIUM_KMSRO \
 r300g.HAVE_GALLIUM_R300 \
 r600g.HAVE_GALLIUM_R600 \
 radeonsi.HAVE_GALLIUM_RADEONSI \
@@ -60,7 +60,7 @@ gallium_drivers := \
 vc4.HAVE_GALLIUM_VC4 \
 virgl.HAVE_GALLIUM_VIRGL \
 etnaviv.HAVE_GALLIUM_ETNAVIV \
-imx.HAVE_GALLIUM_IMX
+iris.HAVE_GALLIUM_IRIS
 ifeq ($(BOARD_GPU_DRIVERS),all)
 MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))

View File

@@ -10,7 +10,7 @@ $(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/STATIC_LIBRARIES/libmesa_*_interm
 $(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/i9?5_dri_intermediates)
 $(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/libglapi_intermediates)
 $(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/libGLES_mesa_intermediates)
-$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/EXECUTABLES/mesa_*_intermediates)
-$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/EXECUTABLES/glsl_compiler_intermediates)
-$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/STATIC_LIBRARIES/libmesa_*_intermediates)
+$(call add-clean-step, rm -rf $(HOST_OUT)/*/EXECUTABLES/mesa_*_intermediates)
+$(call add-clean-step, rm -rf $(HOST_OUT)/*/EXECUTABLES/glsl_compiler_intermediates)
+$(call add-clean-step, rm -rf $(HOST_OUT)/*/STATIC_LIBRARIES/libmesa_*_intermediates)
 $(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/*_dri_intermediates)

View File

@@ -22,6 +22,7 @@
 SUBDIRS = src
 AM_DISTCHECK_CONFIGURE_FLAGS = \
+--enable-autotools \
 --enable-dri \
 --enable-dri3 \
 --enable-egl \
@@ -45,7 +46,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
 --enable-libunwind \
 --with-platforms=x11,wayland,drm,surfaceless \
 --with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
---with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \
+--with-gallium-drivers=i915,nouveau,r300,kmsro,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv \
 --with-vulkan-drivers=intel,radeon
 ACLOCAL_AMFLAGS = -I m4
@@ -64,7 +65,8 @@ EXTRA_DIST = \
 meson_options.txt \
 bin/meson.build \
 include/meson.build \
-bin/install_megadrivers.py
+bin/install_megadrivers.py \
+bin/meson_get_version.py
 noinst_HEADERS = \
 include/c99_alloca.h \
@@ -75,12 +77,15 @@ noinst_HEADERS = \
 include/drm-uapi/drm_fourcc.h \
 include/drm-uapi/drm_mode.h \
 include/drm-uapi/i915_drm.h \
+include/drm-uapi/tegra_drm.h \
+include/drm-uapi/v3d_drm.h \
 include/drm-uapi/vc4_drm.h \
 include/D3D9 \
 include/GL/wglext.h \
 include/HaikuGL \
 include/no_extern_c.h \
-include/pci_ids
+include/pci_ids \
+include/vulkan
 # We list some directories in EXTRA_DIST, but don't actually want to include
 # the .gitignore files in the tarball.

60
README.rst Normal file

@@ -0,0 +1,60 @@
`Mesa <https://mesa3d.org>`_ - The 3D Graphics Library
======================================================


Source
------

This repository lives at https://gitlab.freedesktop.org/mesa/mesa.
Other repositories are likely forks, and code found there is not supported.


Build & install
---------------

You can find more information in our documentation (`docs/install.html
<https://mesa3d.org/install.html>`_), but the recommended way is to use
Meson (`docs/meson.html <https://mesa3d.org/meson.html>`_):

.. code-block:: sh

  $ mkdir build
  $ cd build
  $ meson ..
  $ sudo ninja install


Support
-------

Many Mesa devs hang out on IRC; if you're not sure which channel is
appropriate, you should ask your question on `Freenode's #dri-devel
<irc://chat.freenode.net#dri-devel>`_; someone will redirect you if
necessary.
Remember that not everyone is in the same timezone as you, so it might
take a while before someone qualified sees your question.
To figure out who you're talking to, or which nick to ping for your
question, check out `Who's Who on IRC
<https://dri.freedesktop.org/wiki/WhosWho/>`_.

The next best option is to ask your question in an email to the
mailing list: `mesa-dev\@lists.freedesktop.org
<https://lists.freedesktop.org/mailman/listinfo/mesa-dev>`_.


Bug reports
-----------

If you think something isn't working properly, please file a bug report
(`docs/bugs.html <https://mesa3d.org/bugs.html>`_).


Contributing
------------

Contributions are welcome, and step-by-step instructions can be found in our
documentation (`docs/submittingpatches.html
<https://mesa3d.org/submittingpatches.html>`_).

Note that Mesa uses email mailing lists for patch submission, review and
discussions.

View File

@@ -72,7 +72,9 @@ F: src/loader/
 EGL
 R: Eric Engestrom <eric@engestrom.ch>
+R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/egl/
+F: include/EGL/
 HAIKU
 R: Alexander von Gluck IV <kallisti5@unixzen.com>
@@ -116,6 +118,7 @@ MESON BUILD
 R: Dylan Baker <dylan@pnwbakers.com>
 R: Eric Engestrom <eric@engestrom.ch>
 F: */meson.build
+F: meson.build
 F: meson_options.txt
 ANDROID EGL SUPPORT
@@ -135,3 +138,8 @@ F: src/gallium/drivers/freedreno/
 GLX
 R: Adam Jackson <ajax@redhat.com>
 F: src/glx/
+VULKAN
+R: Eric Engestrom <eric@engestrom.ch>
+F: src/vulkan/
+F: include/vulkan/

View File

@@ -27,6 +27,13 @@ import SCons.Util
 import common
+#######################################################################
+# Minimal scons version
+EnsureSConsVersion(2, 4)
+EnsurePythonVersion(2, 7)
 #######################################################################
 # Configuration options

View File

@@ -1 +1 @@
-17.4.0-devel
+19.1.0-devel

View File

@@ -33,31 +33,41 @@ branches:
# - https://www.appveyor.com/blog/2014/06/04/shallow-clone-for-git-repositories # - https://www.appveyor.com/blog/2014/06/04/shallow-clone-for-git-repositories
clone_depth: 100 clone_depth: 100
# https://www.appveyor.com/docs/build-cache/
cache: cache:
- win_flex_bison-2.5.9.zip - '%LOCALAPPDATA%\pip\Cache -> appveyor.yml'
- llvm-3.3.1-msvc2013-mtd.7z - win_flex_bison-2.5.15.zip
- llvm-5.0.1-msvc2017-mtd.7z
os: Visual Studio 2013 os: Visual Studio 2017
init:
# Appveyor defaults core.autocrlf to input instead of the default (true), but
# that can hide problems processing CRLF text on Windows
- git config --global core.autocrlf true
environment: environment:
WINFLEXBISON_ARCHIVE: win_flex_bison-2.5.9.zip WINFLEXBISON_VERSION: 2.5.15
LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z LLVM_ARCHIVE: llvm-5.0.1-msvc2017-mtd.7z
install: install:
# Check git config
- git config core.autocrlf
# Check pip # Check pip
- python --version - python --version
- python -m pip --version - python -m pip --version
# Install Mako # Install Mako
- python -m pip install Mako==1.0.6 - python -m pip install Mako==1.0.7
# Install pywin32 extensions, needed by SCons # Install pywin32 extensions, needed by SCons
- python -m pip install pypiwin32 - python -m pip install pypiwin32
# Install python wheels, necessary to install SCons via pip # Install python wheels, necessary to install SCons via pip
- python -m pip install wheel - python -m pip install wheel
# Install SCons # Install SCons
- python -m pip install scons==2.5.1 - python -m pip install scons==3.0.1
- scons --version - scons --version
# Install flex/bison # Install flex/bison
- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://downloads.sourceforge.net/project/winflexbison/old_versions/%WINFLEXBISON_ARCHIVE%" - set WINFLEXBISON_ARCHIVE=win_flex_bison-%WINFLEXBISON_VERSION%.zip
- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://github.com/lexxmark/winflexbison/releases/download/v%WINFLEXBISON_VERSION%/%WINFLEXBISON_ARCHIVE%"
- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul - 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul
- set Path=%CD%\winflexbison;%Path% - set Path=%CD%\winflexbison;%Path%
- win_flex --version - win_flex --version
@@ -69,10 +79,10 @@ install:
- set LLVM=%CD%\llvm - set LLVM=%CD%\llvm
build_script: build_script:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 - scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1
after_build: after_build:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 check - scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1 check
# It's possible to setup notification here, as described in # It's possible to setup notification here, as described in

View File

@@ -23,7 +23,7 @@ echo "<ul>"
 echo ""
 # extract fdo urls from commit log
-git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\
+git log --pretty=medium $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\
 while read url
 do
 id=$(echo $url | cut -d'=' -f2)

View File

@@ -1,81 +0,0 @@
#!/bin/sh
# Script for generating a list of candidates [referenced by a Fixes tag] for
# cherry-picking to a stable branch
#
# Usage examples:
#
# $ bin/get-fixes-pick-list.sh
# $ bin/get-fixes-pick-list.sh > picklist
# $ bin/get-fixes-pick-list.sh | tee picklist
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
# List all the commits between day 1 and the branch point...
git log --reverse --pretty=%H $latest_branchpoint > already_landed
# ... and the ones cherry-picked.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits with Fixes tag
git log --reverse --pretty=%H -i --grep="fixes:" $latest_branchpoint..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list ...
if [ -f bin/.cherry-ignore ] ; then
if grep -q ^$sha bin/.cherry-ignore ; then
continue
fi
fi
# Skip if it has been already cherry-picked.
if grep -q ^$sha already_picked ; then
continue
fi
# Place every "fixes:" tag on its own line and join with the next word
# on its line or a later one.
fixes=`git show -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`
# For each one try to extract the tag
fixes_count=`echo "$fixes" | wc -l`
warn=`(test $fixes_count -gt 1 && echo $fixes_count) || echo 0`
while [ $fixes_count -gt 0 ] ; do
# Treat only the current line
id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`
fixes_count=$(($fixes_count-1))
# Bail out if we cannot find suitable id.
# Any specific validation the $id is valid and not some junk, is
# implied with the follow up code
if [ "x$id" = x ] ; then
continue
fi
# Check if the offending commit is in branch.
# Be that cherry-picked ...
# ... or landed before the branchpoint.
if grep -q ^$id already_picked ||
grep -q ^$id already_landed ; then
printf "Commit \"%s\" fixes %s\n" \
"`git log -n1 --pretty=oneline $sha`" \
"$id"
warn=$(($warn-1))
fi
done
if [ $warn -gt 0 ] ; then
printf "WARNING: Commit \"%s\" has more than one Fixes tag\n" \
"`git log -n1 --pretty=oneline $sha`"
fi
done
rm -f already_picked
rm -f already_landed

View File

@@ -7,21 +7,107 @@
# $ bin/get-pick-list.sh # $ bin/get-pick-list.sh
# $ bin/get-pick-list.sh > picklist # $ bin/get-pick-list.sh > picklist
# $ bin/get-pick-list.sh | tee picklist # $ bin/get-pick-list.sh | tee picklist
#
# The output is as follows:
# [nomination_type] commit_sha commit summary
is_stable_nomination()
{
git show --pretty=medium --summary "$1" | grep -q -i -o "CC:.*mesa-stable"
}
is_typod_nomination()
{
git show --pretty=medium --summary "$1" | grep -q -i -o "CC:.*mesa-dev"
}
fixes=
# Helper to handle various mistypos of the fixes tag.
# The tag string itself is passed as argument and normalised within.
#
# Resulting string in the global variable "fixes" and contains entries
# in the form "fixes:$sha"
is_sha_nomination()
{
fixes=`git show --pretty=medium -s $1 | tr -d "\n" | \
sed -e 's/'"$2"'/\nfixes:/Ig' | \
grep -Eo 'fixes:[a-f0-9]{8,40}'`
fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`
if test $fixes_count -eq 0; then
return 1
fi
# Throw a warning for each invalid sha
while test $fixes_count -gt 0; do
# Treat only the current line
id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`
fixes_count=$(($fixes_count-1))
if ! git show $id >/dev/null 2>&1; then
echo WARNING: Commit $1 lists invalid sha $id
fi
done
return 0
}
# Checks if at least one of offending commits, listed in the global
# "fixes", is in branch.
sha_in_range()
{
fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`
while test $fixes_count -gt 0; do
# Treat only the current line
id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`
fixes_count=$(($fixes_count-1))
# Be that cherry-picked ...
# ... or landed before the branchpoint.
if grep -q ^$id already_picked ||
grep -q ^$id already_landed ; then
return 0
fi
done
return 1
}
is_fixes_nomination()
{
is_sha_nomination "$1" "fixes:[[:space:]]*"
if test $? -eq 0; then
return 0
fi
is_sha_nomination "$1" "fixes[[:space:]]\+"
}
is_brokenby_nomination()
{
is_sha_nomination "$1" "broken by"
}
is_revert_nomination()
{
is_sha_nomination "$1" "This reverts commit "
}
# Use the last branchpoint as our limit for the search # Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD` latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message. # List all the commits between day 1 and the branch point...
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\ git log --reverse --pretty=%H $latest_branchpoint > already_landed
# ... and the ones cherry-picked.
git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\ grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree. # Grep for potential candidates
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable' $latest_branchpoint..origin/master |\ git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable\|^CC:.*mesa-dev\|\<fixes\>\|\<broken by\>\|This reverts commit' $latest_branchpoint..origin/master |\
while read sha while read sha
do do
# Check to see whether the patch is on the ignore list. # Check to see whether the patch is on the ignore list.
if [ -f bin/.cherry-ignore ] ; then if test -f bin/.cherry-ignore; then
if grep -q ^$sha bin/.cherry-ignore ; then if grep -q ^$sha bin/.cherry-ignore ; then
continue continue
fi fi
@@ -32,7 +118,33 @@ do
continue continue
fi fi
git log -n1 --pretty=oneline $sha | cat if is_fixes_nomination "$sha"; then
tag=fixes
elif is_brokenby_nomination "$sha"; then
tag=brokenby
elif is_revert_nomination "$sha"; then
tag=revert
elif is_stable_nomination "$sha"; then
tag=stable
elif is_typod_nomination "$sha"; then
tag=typod
else
continue
fi
case "$tag" in
fixes | brokenby | revert )
if ! sha_in_range; then
continue
fi
;;
* )
;;
esac
printf "[ %8s ] " "$tag"
git --no-pager show --no-patch --oneline $sha
done done
rm -f already_picked rm -f already_picked
rm -f already_landed
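
The tag normalisation in `is_sha_nomination` above (flatten the commit message, rewrite every spelling of the tag to `fixes:`, then keep only 8 to 40 hex digits) can be sketched in Python for illustration. This is not part of the change; `extract_fixes` and its `tag_pattern` parameter are hypothetical names, and the shell pipeline remains the actual implementation.

```python
import re


def extract_fixes(message, tag_pattern=r'fixes:\s*'):
    """Return the SHAs named by a (possibly mistyped) Fixes tag.

    Mirrors the shell pipeline: strip newlines, normalise each tag
    occurrence to '\\nfixes:', then keep runs of 8-40 hex digits.
    """
    flat = message.replace('\n', '')
    normalised = re.sub(tag_pattern, '\nfixes:', flat, flags=re.IGNORECASE)
    return re.findall(r'fixes:([a-f0-9]{8,40})', normalised)


# Commit message taken from the log at the top of this page.
message = (
    'glx: fix shared memory leak in X11\n'
    '\n'
    'Fixes: bcd80be49a "drisw/glx: use XShm if possible"\n'
)
print(extract_fixes(message))
```

Passing `tag_pattern=r'fixes\s+'` covers the second spelling that `is_fixes_nomination` tries (a tag written without the colon).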

View File

@@ -1,42 +0,0 @@
#!/bin/sh
# Script for generating a list of candidates which have typos in the nomination line
#
# Usage examples:
#
# $ bin/get-typod-pick-list.sh
# $ bin/get-typod-pick-list.sh > picklist
# $ bin/get-typod-pick-list.sh | tee picklist
# NB:
# This script intentionally _never_ checks for specific version tag
# Should we consider folding it with the original get-pick-list.sh
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' $latest_branchpoint..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
if [ -f bin/.cherry-ignore ] ; then
if grep -q ^$sha bin/.cherry-ignore ; then
continue
fi
fi
# Check to see if it has already been picked over.
if grep -q ^$sha already_picked ; then
continue
fi
git log -n1 --pretty=oneline $sha | cat
done
rm -f already_picked

29
bin/git_sha1_gen.py Executable file → Normal file

@@ -1,5 +1,3 @@
#!/usr/bin/env python
""" """
Generate the contents of the git_sha1.h file. Generate the contents of the git_sha1.h file.
The output of this script goes to stdout. The output of this script goes to stdout.
@@ -28,22 +26,25 @@ def get_git_sha1():
git_sha1 = '' git_sha1 = ''
return git_sha1 return git_sha1
def write_if_different(contents):
"""
Avoid touching the output file if it doesn't need modifications
Useful to avoid triggering rebuilds when nothing has changed.
"""
if os.path.isfile(args.output):
with open(args.output, 'r') as file:
if file.read() == contents:
return
with open(args.output, 'w') as file:
file.write(contents)
parser = argparse.ArgumentParser() parser = argparse.ArgumentParser()
parser.add_argument('--output', help='File to write the #define in', parser.add_argument('--output', help='File to write the #define in',
required=True) required=True)
args = parser.parse_args() args = parser.parse_args()
git_sha1 = os.environ.get('MESA_GIT_SHA1_OVERRIDE', get_git_sha1())[:10] git_sha1 = os.environ.get('MESA_GIT_SHA1_OVERRIDE', get_git_sha1())[:10]
if git_sha1: if git_sha1:
git_sha1_h_in_path = os.path.join(os.path.dirname(sys.argv[0]), write_if_different('#define MESA_GIT_SHA1 " (git-' + git_sha1 + ')"')
'..', 'src', 'git_sha1.h.in')
with open(git_sha1_h_in_path , 'r') as git_sha1_h_in:
new_sha1 = git_sha1_h_in.read().replace('@VCS_TAG@', git_sha1)
if os.path.isfile(args.output):
with open(args.output, 'r') as git_sha1_h:
if git_sha1_h.read() == new_sha1:
quit()
with open(args.output, 'w') as git_sha1_h:
git_sha1_h.write(new_sha1)
else: else:
open(args.output, 'w').close() write_if_different('#define MESA_GIT_SHA1 ""')
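
For illustration only, the `write_if_different` idea introduced above can be exercised in isolation. In this sketch the output path is passed as a parameter instead of being read from `args.output`, and the boolean return value is an addition made here so the behaviour is observable; neither is part of the patch.

```python
import os
import tempfile


def write_if_different(path, contents):
    """Write contents to path only if the file is missing or differs.

    Returns True when the file was (re)written and False when it was
    left untouched, so its mtime does not trigger a rebuild.
    """
    if os.path.isfile(path):
        with open(path, 'r') as f:
            if f.read() == contents:
                return False
    with open(path, 'w') as f:
        f.write(contents)
    return True


def demo():
    """Write the same define twice, then a different one."""
    with tempfile.TemporaryDirectory() as tmp:
        out = os.path.join(tmp, 'git_sha1.h')
        define = '#define MESA_GIT_SHA1 " (git-b43b55d461)"'
        first = write_if_different(out, define)
        second = write_if_different(out, define)
        third = write_if_different(out, '#define MESA_GIT_SHA1 ""')
        return first, second, third


print(demo())
```

Only the first and third calls touch the file; the second finds identical contents and returns early.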

27
bin/install_megadrivers.py Executable file → Normal file

@@ -1,6 +1,5 @@
#!/usr/bin/env python
# encoding=utf-8 # encoding=utf-8
# Copyright © 2017 Intel Corporation # Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy # Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal # of this software and associated documentation files (the "Software"), to deal
@@ -35,28 +34,34 @@ def main():
parser.add_argument('drivers', nargs='+') parser.add_argument('drivers', nargs='+')
args = parser.parse_args() args = parser.parse_args()
to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), args.libdir) if os.path.isabs(args.libdir):
to = os.path.join(os.environ.get('DESTDIR', '/'), args.libdir[1:])
else:
to = os.path.join(os.environ['MESON_INSTALL_DESTDIR_PREFIX'], args.libdir)
master = os.path.join(to, os.path.basename(args.megadriver)) master = os.path.join(to, os.path.basename(args.megadriver))
if not os.path.exists(to): if not os.path.exists(to):
if os.path.lexists(to):
os.unlink(to)
os.makedirs(to) os.makedirs(to)
shutil.copy(args.megadriver, master) shutil.copy(args.megadriver, master)
for each in args.drivers: for driver in args.drivers:
driver = os.path.join(to, each) abs_driver = os.path.join(to, driver)
if os.path.exists(driver): if os.path.lexists(abs_driver):
os.unlink(driver) os.unlink(abs_driver)
print('installing {} to {}'.format(args.megadriver, driver)) print('installing {} to {}'.format(args.megadriver, abs_driver))
os.link(master, driver) os.link(master, abs_driver)
try: try:
ret = os.getcwd() ret = os.getcwd()
os.chdir(to) os.chdir(to)
name, ext = os.path.splitext(each) name, ext = os.path.splitext(driver)
while ext != '.so': while ext != '.so':
if os.path.exists(name): if os.path.lexists(name):
os.unlink(name) os.unlink(name)
os.symlink(driver, name) os.symlink(driver, name)
name, ext = os.path.splitext(name) name, ext = os.path.splitext(name)
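
The switch from `os.path.exists` to `os.path.lexists` matters when a previous install left a dangling symlink behind: `exists` follows the link and reports `False`, so the stale link would not be removed and the subsequent `os.link` or `os.symlink` would fail. A minimal sketch of that case, assuming a POSIX system where unprivileged symlink creation works; `replace_link` and the file names are hypothetical.

```python
import os
import tempfile


def replace_link(target, link_path):
    """Point link_path at target, removing any existing entry first.

    lexists() is used so that a dangling symlink is also detected.
    """
    if os.path.lexists(link_path):
        os.unlink(link_path)
    os.symlink(target, link_path)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, 'i965_dri.so')
        # Create a symlink whose target does not exist.
        os.symlink(os.path.join(tmp, 'missing_target'), link)
        seen_by_exists = os.path.exists(link)    # follows the link
        seen_by_lexists = os.path.lexists(link)  # looks at the link itself
        replace_link('new_target', link)
        return seen_by_exists, seen_by_lexists, os.readlink(link)


print(demo())
```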

88
bin/meson-cmd-extract.py Executable file

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
# Copyright © 2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""This script reads a meson build directory and gives back the command line it
was configured with.

This only works for meson 0.49.0 and newer.
"""

import argparse
import ast
import configparser
import pathlib
import sys


def parse_args() -> argparse.Namespace:
    """Parse arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'build_dir',
        help='Path to the meson build directory')
    args = parser.parse_args()
    return args


def load_config(path: pathlib.Path) -> configparser.ConfigParser:
    """Load config file."""
    conf = configparser.ConfigParser()
    with path.open() as f:
        conf.read_file(f)
    return conf


def build_cmd(conf: configparser.ConfigParser) -> str:
    """Rebuild the command line."""
    args = []
    for k, v in conf['options'].items():
        if ' ' in v:
            args.append(f'-D{k}="{v}"')
        else:
            args.append(f'-D{k}={v}')

    cf = conf['properties'].get('cross_file')
    if cf:
        args.append('--cross-file={}'.format(cf))
    nf = conf['properties'].get('native_file')
    if nf:
        # this will be in the form "['str', 'str']", so use ast.literal_eval to
        # convert it to a list of strings.
        nf = ast.literal_eval(nf)
        args.extend(['--native-file={}'.format(f) for f in nf])
    return ' '.join(args)


def main():
    args = parse_args()
    path = pathlib.Path(args.build_dir, 'meson-private', 'cmd_line.txt')
    if not path.exists():
        print('Cannot find the necessary file to rebuild command line. '
              'Is your meson version >= 0.49.0?', file=sys.stderr)
        sys.exit(1)

    conf = load_config(path)
    cmd = build_cmd(conf)
    print(cmd)


if __name__ == '__main__':
    main()
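
As a rough illustration of what the script above does, the sketch below runs the same reconstruction on an in-memory sample instead of a real build directory. The contents of `SAMPLE` are an assumption made up for this example (the section and key names follow what the script reads; the values are invented), and `build_cmd_from_text` is a hypothetical helper, not part of the patch.

```python
import ast
import configparser

# Hypothetical stand-in for meson-private/cmd_line.txt.
SAMPLE = """\
[options]
build-tests = true
gallium-drivers = swrast
c_args = -O2 -g

[properties]
native_file = ['native.file']
"""


def build_cmd_from_text(text):
    """Rebuild meson arguments from INI text shaped like SAMPLE."""
    conf = configparser.ConfigParser()
    conf.read_string(text)
    args = []
    for k, v in conf['options'].items():
        # Quote values containing spaces so they survive a shell.
        if ' ' in v:
            args.append('-D{}="{}"'.format(k, v))
        else:
            args.append('-D{}={}'.format(k, v))
    nf = conf['properties'].get('native_file')
    if nf:
        # Stored as the repr of a Python list of strings.
        args.extend('--native-file={}'.format(f) for f in ast.literal_eval(nf))
    return ' '.join(args)


print(build_cmd_from_text(SAMPLE))
```

`ast.literal_eval` is used rather than `eval` because the stored value is a plain list literal and nothing more needs to be evaluated.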

63
bin/meson-options.py Executable file

@@ -0,0 +1,63 @@
#!/usr/bin/env python3

from os import get_terminal_size
from textwrap import wrap
from mesonbuild import coredata
from mesonbuild import optinterpreter

(COLUMNS, _) = get_terminal_size()


def describe_option(option_name: str, option_default_value: str,
                    option_type: str, option_message: str) -> None:
    print('name: ' + option_name)
    print('default: ' + option_default_value)
    print('type: ' + option_type)
    for line in wrap(option_message, width=COLUMNS - 9):
        print(' ' + line)
    print('---')


oi = optinterpreter.OptionInterpreter('')
oi.process('meson_options.txt')

for (name, value) in oi.options.items():
    if isinstance(value, coredata.UserStringOption):
        describe_option(name,
                        value.value,
                        'string',
                        "You can type what you want, but make sure it makes sense")
    elif isinstance(value, coredata.UserBooleanOption):
        describe_option(name,
                        'true' if value.value else 'false',
                        'boolean',
                        "You can set it to 'true' or 'false'")
    elif isinstance(value, coredata.UserIntegerOption):
        describe_option(name,
                        str(value.value),
                        'integer',
                        "You can set it to any integer value between '{}' and '{}'".format(value.min_value, value.max_value))
    elif isinstance(value, coredata.UserUmaskOption):
        describe_option(name,
                        str(value.value),
                        'umask',
                        "You can set it to 'preserve' or a value between '0000' and '0777'")
    elif isinstance(value, coredata.UserComboOption):
        choices = '[' + ', '.join(["'" + v + "'" for v in value.choices]) + ']'
        describe_option(name,
                        value.value,
                        'combo',
                        "You can set it to any one of those values: " + choices)
    elif isinstance(value, coredata.UserArrayOption):
        choices = '[' + ', '.join(["'" + v + "'" for v in value.choices]) + ']'
        value = '[' + ', '.join(["'" + v + "'" for v in value.value]) + ']'
        describe_option(name,
                        value,
                        'array',
                        "You can set it to one or more of those values: " + choices)
    elif isinstance(value, coredata.UserFeatureOption):
        describe_option(name,
                        value.value,
                        'feature',
                        "You can set it to 'auto', 'enabled', or 'disabled'")
    else:
        print(name + ' is an option of a type unknown to this script')
        print('---')

View File

@@ -86,7 +86,7 @@ def AddOptions(opts):
from SCons.Options.EnumOption import EnumOption from SCons.Options.EnumOption import EnumOption
opts.Add(EnumOption('build', 'build type', 'debug', opts.Add(EnumOption('build', 'build type', 'debug',
allowed_values=('debug', 'checked', 'profile', allowed_values=('debug', 'checked', 'profile',
'release', 'opt'))) 'release')))
opts.Add(BoolOption('verbose', 'verbose output', 'no')) opts.Add(BoolOption('verbose', 'verbose output', 'no'))
opts.Add(EnumOption('machine', 'use machine-specific assembly code', opts.Add(EnumOption('machine', 'use machine-specific assembly code',
default_machine, default_machine,
@@ -99,17 +99,13 @@ def AddOptions(opts):
'enable static code analysis where available', 'no')) 'enable static code analysis where available', 'no'))
opts.Add(BoolOption('asan', 'enable Address Sanitizer', 'no')) opts.Add(BoolOption('asan', 'enable Address Sanitizer', 'no'))
opts.Add('toolchain', 'compiler toolchain', default_toolchain) opts.Add('toolchain', 'compiler toolchain', default_toolchain)
opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support',
'no'))
opts.Add(BoolOption('llvm', 'use LLVM', default_llvm)) opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))
opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)', opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)',
'no')) 'no'))
opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes')) opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))
opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no')) opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no'))
opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes')) opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))
opts.Add(BoolOption('texture_float',
'enable floating-point textures and renderbuffers',
'no'))
opts.Add(BoolOption('swr', 'Build OpenSWR', 'no')) opts.Add(BoolOption('swr', 'Build OpenSWR', 'no'))
if host_platform == 'windows': if host_platform == 'windows':
opts.Add('MSVC_VERSION', 'Microsoft Visual C/C++ version') opts.Add('MSVC_VERSION', 'Microsoft Visual C/C++ version')
opts.Add('MSVC_USE_SCRIPT', 'Microsoft Visual C/C++ vcvarsall script', True)

View File

@@ -52,6 +52,19 @@ mingw*)
;; ;;
esac esac
AC_ARG_ENABLE(autotools,
[AS_HELP_STRING([--enable-autotools],
[Enable the use of this autotools based build configuration])],
[enable_autotools=$enableval], [enable_autotools=no])
if test "x$enable_autotools" != "xyes" ; then
AC_MSG_ERROR([the autotools build system has been deprecated in favour of
meson and will be removed eventually. For instructions on how to use meson
see https://www.mesa3d.org/meson.html.
If you still want to use the autotools build, then add --enable-autotools
to the configure command line.])
fi
# Support silent build rules, requires at least automake-1.11. Disable # Support silent build rules, requires at least automake-1.11. Disable
# by either passing --disable-silent-rules to configure or passing V=1 # by either passing --disable-silent-rules to configure or passing V=1
# to make # to make
@@ -74,24 +87,28 @@ AC_SUBST([OPENCL_VERSION])
# in the first entry. # in the first entry.
LIBDRM_REQUIRED=2.4.75 LIBDRM_REQUIRED=2.4.75
LIBDRM_RADEON_REQUIRED=2.4.71 LIBDRM_RADEON_REQUIRED=2.4.71
LIBDRM_AMDGPU_REQUIRED=2.4.89 LIBDRM_AMDGPU_REQUIRED=2.4.97
LIBDRM_INTEL_REQUIRED=2.4.75 LIBDRM_INTEL_REQUIRED=2.4.75
LIBDRM_NVVIEUX_REQUIRED=2.4.66 LIBDRM_NVVIEUX_REQUIRED=2.4.66
LIBDRM_NOUVEAU_REQUIRED=2.4.66 LIBDRM_NOUVEAU_REQUIRED=2.4.66
LIBDRM_FREEDRENO_REQUIRED=2.4.89 LIBDRM_ETNAVIV_REQUIRED=2.4.89
LIBDRM_ETNAVIV_REQUIRED=2.4.82 LIBDRM_VC4_REQUIRED=2.4.89
dnl Versions for external dependencies dnl Versions for external dependencies
DRI2PROTO_REQUIRED=2.8 DRI2PROTO_REQUIRED=2.8
GLPROTO_REQUIRED=1.4.14 GLPROTO_REQUIRED=1.4.14
LIBOMXIL_BELLAGIO_REQUIRED=0.0 LIBOMXIL_BELLAGIO_REQUIRED=0.0
LIBVA_REQUIRED=0.38.0 LIBOMXIL_TIZONIA_REQUIRED=0.10.0
LIBVA_REQUIRED=0.39.0
VDPAU_REQUIRED=1.1 VDPAU_REQUIRED=1.1
WAYLAND_REQUIRED=1.11 WAYLAND_REQUIRED=1.11
WAYLAND_EGL_BACKEND_REQUIRED=3
WAYLAND_PROTOCOLS_REQUIRED=1.8 WAYLAND_PROTOCOLS_REQUIRED=1.8
XCB_REQUIRED=1.9.3 XCB_REQUIRED=1.9.3
XCBDRI2_REQUIRED=1.8 XCBDRI2_REQUIRED=1.8
XCBDRI3_MODIFIERS_REQUIRED=1.13
XCBGLX_REQUIRED=1.8.1 XCBGLX_REQUIRED=1.8.1
XCBPRESENT_MODIFIERS_REQUIRED=1.13
XDAMAGE_REQUIRED=1.1 XDAMAGE_REQUIRED=1.1
XSHMFENCE_REQUIRED=1.1 XSHMFENCE_REQUIRED=1.1
XVMC_REQUIRED=1.0.6 XVMC_REQUIRED=1.0.6
@@ -103,9 +120,9 @@ dnl LLVM versions
LLVM_REQUIRED_GALLIUM=3.3.0 LLVM_REQUIRED_GALLIUM=3.3.0
LLVM_REQUIRED_OPENCL=3.9.0 LLVM_REQUIRED_OPENCL=3.9.0
LLVM_REQUIRED_R600=3.9.0 LLVM_REQUIRED_R600=3.9.0
LLVM_REQUIRED_RADEONSI=3.9.0 LLVM_REQUIRED_RADEONSI=7.0.0
LLVM_REQUIRED_RADV=3.9.0 LLVM_REQUIRED_RADV=7.0.0
LLVM_REQUIRED_SWR=3.9.0 LLVM_REQUIRED_SWR=6.0.0
dnl Check for progs dnl Check for progs
AC_PROG_CPP AC_PROG_CPP
@@ -116,9 +133,12 @@ dnl other CC/CXX flags related help
AC_ARG_VAR([CXX11_CXXFLAGS], [Compiler flag to enable C++11 support (only needed if not AC_ARG_VAR([CXX11_CXXFLAGS], [Compiler flag to enable C++11 support (only needed if not
enabled by default and different from -std=c++11)]) enabled by default and different from -std=c++11)])
AM_PROG_CC_C_O AM_PROG_CC_C_O
AC_PROG_GREP
AC_PROG_NM
AM_PROG_AS AM_PROG_AS
AX_CHECK_GNU_MAKE AX_CHECK_GNU_MAKE
AC_CHECK_PROGS([PYTHON2], [python2.7 python2 python]) AM_PATH_PYTHON([2.7],, [AM_PATH_PYTHON([3.4],, [:])])
AC_PROG_SED AC_PROG_SED
AC_PROG_MKDIR_P AC_PROG_MKDIR_P
@@ -150,7 +170,7 @@ fi
AX_CHECK_PYTHON_MAKO_MODULE($PYTHON_MAKO_REQUIRED) AX_CHECK_PYTHON_MAKO_MODULE($PYTHON_MAKO_REQUIRED)
if test -z "$PYTHON2"; then if test "$PYTHON" = ":"; then
if test ! -f "$srcdir/src/util/format_srgb.c"; then if test ! -f "$srcdir/src/util/format_srgb.c"; then
AC_MSG_ERROR([Python not found - unable to generate sources]) AC_MSG_ERROR([Python not found - unable to generate sources])
fi fi
@@ -288,6 +308,12 @@ esac
AM_CONDITIONAL(HAVE_ANDROID, test "x$android" = xyes) AM_CONDITIONAL(HAVE_ANDROID, test "x$android" = xyes)
# Toggle Werror since at some point clang started treating unknown -W
# flags as warnings, succeeding with the build, yet issuing an annoying
# warning.
save_CFLAGS="$CFLAGS"
export CFLAGS="$CFLAGS -Werror"
dnl dnl
dnl Check compiler flags dnl Check compiler flags
dnl dnl
@@ -295,10 +321,19 @@ AX_CHECK_COMPILE_FLAG([-Wall], [CFLAGS="$CFLAGS
AX_CHECK_COMPILE_FLAG([-Werror=implicit-function-declaration], [CFLAGS="$CFLAGS -Werror=implicit-function-declaration"]) AX_CHECK_COMPILE_FLAG([-Werror=implicit-function-declaration], [CFLAGS="$CFLAGS -Werror=implicit-function-declaration"])
AX_CHECK_COMPILE_FLAG([-Werror=missing-prototypes], [CFLAGS="$CFLAGS -Werror=missing-prototypes"]) AX_CHECK_COMPILE_FLAG([-Werror=missing-prototypes], [CFLAGS="$CFLAGS -Werror=missing-prototypes"])
AX_CHECK_COMPILE_FLAG([-Wmissing-prototypes], [CFLAGS="$CFLAGS -Wmissing-prototypes"]) AX_CHECK_COMPILE_FLAG([-Wmissing-prototypes], [CFLAGS="$CFLAGS -Wmissing-prototypes"])
dnl Dylan Baker: gcc and clang always accept -Wno-*, hence check for the original warning, then set the no-* flag
AX_CHECK_COMPILE_FLAG([-Wmissing-field-initializers], [CFLAGS="$CFLAGS -Wno-missing-field-initializers"])
AX_CHECK_COMPILE_FLAG([-Wformat-truncation], [CFLAGS="$CFLAGS -Wno-format-truncation"])
AX_CHECK_COMPILE_FLAG([-fno-math-errno], [CFLAGS="$CFLAGS -fno-math-errno"]) AX_CHECK_COMPILE_FLAG([-fno-math-errno], [CFLAGS="$CFLAGS -fno-math-errno"])
AX_CHECK_COMPILE_FLAG([-fno-trapping-math], [CFLAGS="$CFLAGS -fno-trapping-math"]) AX_CHECK_COMPILE_FLAG([-fno-trapping-math], [CFLAGS="$CFLAGS -fno-trapping-math"])
AX_CHECK_COMPILE_FLAG([-fvisibility=hidden], [VISIBILITY_CFLAGS="-fvisibility=hidden"]) AX_CHECK_COMPILE_FLAG([-fvisibility=hidden], [VISIBILITY_CFLAGS="-fvisibility=hidden"])
CFLAGS="$save_CFLAGS"
# Toggle Werror since at some point clang started treating unknown -W
# flags as warnings, succeeding with the build, yet issuing an annoying
# warning.
dnl dnl
dnl Check C++ compiler flags dnl Check C++ compiler flags
dnl dnl
@@ -307,6 +342,8 @@ AX_CHECK_COMPILE_FLAG([-Wall], [CXXFLAGS="$CXXFL
AX_CHECK_COMPILE_FLAG([-fno-math-errno], [CXXFLAGS="$CXXFLAGS -fno-math-errno"]) AX_CHECK_COMPILE_FLAG([-fno-math-errno], [CXXFLAGS="$CXXFLAGS -fno-math-errno"])
AX_CHECK_COMPILE_FLAG([-fno-trapping-math], [CXXFLAGS="$CXXFLAGS -fno-trapping-math"]) AX_CHECK_COMPILE_FLAG([-fno-trapping-math], [CXXFLAGS="$CXXFLAGS -fno-trapping-math"])
AX_CHECK_COMPILE_FLAG([-fvisibility=hidden], [VISIBILITY_CXXFLAGS="-fvisibility=hidden"]) AX_CHECK_COMPILE_FLAG([-fvisibility=hidden], [VISIBILITY_CXXFLAGS="-fvisibility=hidden"])
AX_CHECK_COMPILE_FLAG([-Wmissing-field-initializers], [CXXFLAGS="$CXXFLAGS -Wno-missing-field-initializers"])
AX_CHECK_COMPILE_FLAG([-Wformat-truncation], [CXXFLAGS="$CXXFLAGS -Wno-format-truncation"])
AC_LANG_POP([C++]) AC_LANG_POP([C++])
# Flags to help ensure that certain portions of the code -- and only those # Flags to help ensure that certain portions of the code -- and only those
@@ -429,28 +466,41 @@ fi
AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1]) AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS) AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
dnl Check for new-style atomic builtins dnl Check for new-style atomic builtins. We first check without linking to
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ dnl -latomic.
AC_MSG_CHECKING(whether __atomic_load_n is supported)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
int main() { int main() {
int n; struct {
return __atomic_load_n(&n, __ATOMIC_ACQUIRE); uint64_t *v;
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1) } x;
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE) &
DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS" (int)__atomic_add_fetch(x.v, (uint64_t)1, __ATOMIC_ACQ_REL);
dnl On some platforms, new-style atomics need a helper library }]])], GCC_ATOMIC_BUILTINS_SUPPORTED=yes, GCC_ATOMIC_BUILTINS_SUPPORTED=no)
AC_MSG_CHECKING(whether -latomic is needed)
AC_LINK_IFELSE([AC_LANG_SOURCE([[ dnl If that didn't work, we try linking with -latomic, which is needed on some
#include <stdint.h> dnl platforms.
uint64_t v; if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" != xyes; then
int main() { save_LDFLAGS=$LDFLAGS
return (int)__atomic_load_n(&v, __ATOMIC_ACQUIRE); LDFLAGS="$LDFLAGS -latomic"
}]])], GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=no, GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=yes) AC_LINK_IFELSE([AC_LANG_SOURCE([[
AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC) #include <stdint.h>
if test "x$GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC" = xyes; then int main() {
LIBATOMIC_LIBS="-latomic" struct {
fi uint64_t *v;
} x;
return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE) &
(int)__atomic_add_fetch(x.v, (uint64_t)1, __ATOMIC_ACQ_REL);
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=yes LIBATOMIC_LIBS="-latomic",
GCC_ATOMIC_BUILTINS_SUPPORTED=no)
LDFLAGS=$save_LDFLAGS
fi
AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_SUPPORTED)
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = xyes; then
DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
fi fi
AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
AC_SUBST([LIBATOMIC_LIBS]) AC_SUBST([LIBATOMIC_LIBS])
dnl Check if host supports 64-bit atomics dnl Check if host supports 64-bit atomics
@@ -685,6 +735,19 @@ AC_LINK_IFELSE(
LDFLAGS=$save_LDFLAGS LDFLAGS=$save_LDFLAGS
AM_CONDITIONAL(HAVE_LD_DYNAMIC_LIST, test "$have_ld_dynamic_list" = "yes") AM_CONDITIONAL(HAVE_LD_DYNAMIC_LIST, test "$have_ld_dynamic_list" = "yes")
dnl
dnl OSX linker does not support build-id
dnl
case "$host_os" in
darwin*)
LD_BUILD_ID=""
;;
*)
LD_BUILD_ID="-Wl,--build-id=sha1"
;;
esac
AC_SUBST([LD_BUILD_ID])
dnl dnl
dnl compatibility symlinks dnl compatibility symlinks
dnl dnl
@@ -730,21 +793,6 @@ esac
AC_SUBST([LIB_EXT]) AC_SUBST([LIB_EXT])
dnl
dnl potentially-infringing-but-nobody-knows-for-sure stuff
dnl
AC_ARG_ENABLE([texture-float],
[AS_HELP_STRING([--enable-texture-float],
[enable floating-point textures and renderbuffers @<:@default=disabled@:>@])],
[enable_texture_float="$enableval"],
[enable_texture_float=no]
)
if test "x$enable_texture_float" = xyes; then
AC_MSG_WARN([Floating-point textures enabled.])
AC_MSG_WARN([Please consult docs/patents.txt with your lawyer before building Mesa.])
DEFINES="$DEFINES -DTEXTURE_FLOAT_ENABLED"
fi
dnl dnl
dnl Arch/platform-specific settings dnl Arch/platform-specific settings
dnl dnl
@@ -849,6 +897,8 @@ fi
AC_HEADER_MAJOR AC_HEADER_MAJOR
AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"]) AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"]) AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
AC_CHECK_HEADERS([endian.h])
AC_CHECK_HEADER([dlfcn.h], [DEFINES="$DEFINES -DHAVE_DLFCN_H"])
AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"]) AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"]) AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"]) AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
@@ -926,10 +976,10 @@ dnl In practise that should be sufficient for all platforms, since any
dnl platforms build with GCC and Clang support the flag. dnl platforms build with GCC and Clang support the flag.
PTHREAD_LIBS="$PTHREAD_LIBS -pthread" PTHREAD_LIBS="$PTHREAD_LIBS -pthread"
dnl pthread-stubs is mandatory on BSD platforms, due to the nature of the dnl pthread-stubs is mandatory on some BSD platforms, due to the nature of the
dnl project. Even then there's a notable issue as described in the project README dnl project. Even then there's a notable issue as described in the project README
case "$host_os" in case "$host_os" in
linux* | cygwin* | darwin* | solaris* | *-gnu* | gnu*) linux* | cygwin* | darwin* | solaris* | *-gnu* | gnu* | openbsd*)
pthread_stubs_possible="no" pthread_stubs_possible="no"
;; ;;
* ) * )
@@ -941,6 +991,22 @@ if test "x$pthread_stubs_possible" = xyes; then
PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs >= 0.4) PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs >= 0.4)
fi fi
save_LIBS="$LIBS"
LIBS="$PTHREAD_LIBS"
AC_MSG_CHECKING(whether pthread_setaffinity_np is supported)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#define _GNU_SOURCE
#include <pthread.h>
int main() {
void *a = (void*) &pthread_setaffinity_np;
long b = (long) a;
return (int) b;
}]])],
[DEFINES="$DEFINES -DHAVE_PTHREAD_SETAFFINITY"];
AC_MSG_RESULT([yes]),
AC_MSG_RESULT([no]))
LIBS="$save_LIBS"
dnl Check for futex for fast inline simple_mtx_t. dnl Check for futex for fast inline simple_mtx_t.
AC_CHECK_HEADER([linux/futex.h], [DEFINES="$DEFINES -DHAVE_LINUX_FUTEX_H"]) AC_CHECK_HEADER([linux/futex.h], [DEFINES="$DEFINES -DHAVE_LINUX_FUTEX_H"])
@@ -1270,10 +1336,10 @@ AC_ARG_ENABLE([xa],
[enable_xa=no]) [enable_xa=no])
AC_ARG_ENABLE([gbm], AC_ARG_ENABLE([gbm],
[AS_HELP_STRING([--enable-gbm], [AS_HELP_STRING([--enable-gbm],
[enable gbm library @<:@default=yes except cygwin@:>@])], [enable gbm library @<:@default=yes except cygwin and macOS@:>@])],
[enable_gbm="$enableval"], [enable_gbm="$enableval"],
[case "$host_os" in [case "$host_os" in
cygwin*) cygwin* | darwin*)
enable_gbm=no enable_gbm=no
;; ;;
*) *)
@@ -1298,14 +1364,19 @@ AC_ARG_ENABLE([vdpau],
[enable_vdpau=auto]) [enable_vdpau=auto])
AC_ARG_ENABLE([omx], AC_ARG_ENABLE([omx],
[AS_HELP_STRING([--enable-omx], [AS_HELP_STRING([--enable-omx],
[DEPRECATED: Use --enable-omx-bellagio instead @<:@default=auto@:>@])], [DEPRECATED: Use --enable-omx-bellagio or --enable-omx-tizonia instead @<:@default=auto@:>@])],
[AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio instead.])], [AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio or --enable-omx-tizonia instead.])],
[]) [])
AC_ARG_ENABLE([omx-bellagio], AC_ARG_ENABLE([omx-bellagio],
[AS_HELP_STRING([--enable-omx-bellagio], [AS_HELP_STRING([--enable-omx-bellagio],
[enable OpenMAX Bellagio library @<:@default=disabled@:>@])], [enable OpenMAX Bellagio library @<:@default=disabled@:>@])],
[enable_omx_bellagio="$enableval"], [enable_omx_bellagio="$enableval"],
[enable_omx_bellagio=no]) [enable_omx_bellagio=no])
AC_ARG_ENABLE([omx-tizonia],
[AS_HELP_STRING([--enable-omx-tizonia],
[enable OpenMAX Tizonia library @<:@default=disabled@:>@])],
[enable_omx_tizonia="$enableval"],
[enable_omx_tizonia=no])
AC_ARG_ENABLE([va], AC_ARG_ENABLE([va],
[AS_HELP_STRING([--enable-va], [AS_HELP_STRING([--enable-va],
[enable va library @<:@default=auto@:>@])], [enable va library @<:@default=auto@:>@])],
@@ -1337,7 +1408,7 @@ GALLIUM_DRIVERS_DEFAULT="r300,r600,svga,swrast"
AC_ARG_WITH([gallium-drivers], AC_ARG_WITH([gallium-drivers],
[AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@], [AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@],
[comma delimited Gallium drivers list, e.g. [comma delimited Gallium drivers list, e.g.
"i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,vc4,vc5,virgl,etnaviv,imx" "i915,nouveau,r300,r600,radeonsi,freedreno,kmsro,svga,swrast,swr,tegra,v3d,vc4,virgl,etnaviv"
@<:@default=r300,r600,svga,swrast@:>@])], @<:@default=r300,r600,svga,swrast@:>@])],
[with_gallium_drivers="$withval"], [with_gallium_drivers="$withval"],
[with_gallium_drivers="$GALLIUM_DRIVERS_DEFAULT"]) [with_gallium_drivers="$GALLIUM_DRIVERS_DEFAULT"])
@@ -1357,11 +1428,17 @@ if test "x$enable_opengl" = xno -a \
"x$enable_xvmc" = xno -a \ "x$enable_xvmc" = xno -a \
"x$enable_vdpau" = xno -a \ "x$enable_vdpau" = xno -a \
"x$enable_omx_bellagio" = xno -a \ "x$enable_omx_bellagio" = xno -a \
"x$enable_omx_tizonia" = xno -a \
"x$enable_va" = xno -a \ "x$enable_va" = xno -a \
"x$enable_opencl" = xno; then "x$enable_opencl" = xno; then
AC_MSG_ERROR([at least one API should be enabled]) AC_MSG_ERROR([at least one API should be enabled])
fi fi
if test "x$enable_omx_bellagio" = xyes -a \
"x$enable_omx_tizonia" = xyes; then
AC_MSG_ERROR([Can't enable both bellagio and tizonia at same time])
fi
# Building OpenGL ES1 and/or ES2 without OpenGL is not supported on mesa 9.0.x # Building OpenGL ES1 and/or ES2 without OpenGL is not supported on mesa 9.0.x
if test "x$enable_opengl" = xno -a \ if test "x$enable_opengl" = xno -a \
"x$enable_gles1" = xyes; then "x$enable_gles1" = xyes; then
@@ -1380,6 +1457,7 @@ AM_CONDITIONAL(NEED_OPENGL_COMMON, test "x$enable_opengl" = xyes -o \
"x$enable_gles1" = xyes -o \ "x$enable_gles1" = xyes -o \
"x$enable_gles2" = xyes) "x$enable_gles2" = xyes)
AM_CONDITIONAL(NEED_KHRPLATFORM, test "x$enable_egl" = xyes -o \ AM_CONDITIONAL(NEED_KHRPLATFORM, test "x$enable_egl" = xyes -o \
"x$enable_opengl" = xyes -o \
"x$enable_gles1" = xyes -o \ "x$enable_gles1" = xyes -o \
"x$enable_gles2" = xyes) "x$enable_gles2" = xyes)
@@ -1468,15 +1546,15 @@ fi
AC_ARG_WITH([gl-lib-name], AC_ARG_WITH([gl-lib-name],
[AS_HELP_STRING([--with-gl-lib-name@<:@=NAME@:>@], [AS_HELP_STRING([--with-gl-lib-name@<:@=NAME@:>@],
[specify GL library name @<:@default=GL@:>@])], [specify GL library name @<:@default=GL@:>@])],
[GL_LIB=$withval], [AC_MSG_ERROR([--with-gl-lib-name is no longer supported. Rename the library manually if needed.])],
[GL_LIB="$DEFAULT_GL_LIB_NAME"]) [])
AC_ARG_WITH([osmesa-lib-name], AC_ARG_WITH([osmesa-lib-name],
[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@], [AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
[specify OSMesa library name @<:@default=OSMesa@:>@])], [specify OSMesa library name @<:@default=OSMesa@:>@])],
[OSMESA_LIB=$withval], [AC_MSG_ERROR([--with-osmesa-lib-name is no longer supported. Rename the library manually if needed.])],
[OSMESA_LIB=OSMesa]) [])
AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"]) GL_LIB="$DEFAULT_GL_LIB_NAME"
AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa]) OSMESA_LIB=OSMesa
dnl dnl
dnl Mangled Mesa support dnl Mangled Mesa support
@@ -1488,6 +1566,9 @@ AC_ARG_ENABLE([mangling],
[enable_mangling=no] [enable_mangling=no]
) )
if test "x${enable_mangling}" = "xyes" ; then if test "x${enable_mangling}" = "xyes" ; then
if test "x$enable_libglvnd" = xyes; then
AC_MSG_ERROR([Conflicting options --enable-mangling and --enable-libglvnd.])
fi
DEFINES="${DEFINES} -DUSE_MGL_NAMESPACE" DEFINES="${DEFINES} -DUSE_MGL_NAMESPACE"
GL_LIB="Mangled${GL_LIB}" GL_LIB="Mangled${GL_LIB}"
OSMESA_LIB="Mangled${OSMESA_LIB}" OSMESA_LIB="Mangled${OSMESA_LIB}"
@@ -1495,6 +1576,15 @@ fi
AC_SUBST([GL_LIB]) AC_SUBST([GL_LIB])
AC_SUBST([OSMESA_LIB]) AC_SUBST([OSMESA_LIB])
dnl HACK when building glx + glvnd we ship gl.pc, despite that glvnd should do it
dnl Thus we need to use GL as a DSO name.
if test "x$enable_libglvnd" = xyes -a "x$enable_glx" != xno; then
GL_PKGCONF_LIB="GL"
else
GL_PKGCONF_LIB="$GL_LIB"
fi
AC_SUBST([GL_PKGCONF_LIB])
# Check for libdrm # Check for libdrm
PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED], PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED],
[have_libdrm=yes], [have_libdrm=no]) [have_libdrm=yes], [have_libdrm=no])
@@ -1534,6 +1624,7 @@ AM_CONDITIONAL(HAVE_APPLEDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = x
AM_CONDITIONAL(HAVE_LMSENSORS, test "x$enable_lmsensors" = xyes ) AM_CONDITIONAL(HAVE_LMSENSORS, test "x$enable_lmsensors" = xyes )
AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = xyes ) AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = xyes )
AM_CONDITIONAL(HAVE_WINDOWSDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xwindows ) AM_CONDITIONAL(HAVE_WINDOWSDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xwindows )
AM_CONDITIONAL(HAVE_XLEASE, test "x$have_xlease" = xyes )
AC_ARG_ENABLE([shared-glapi], AC_ARG_ENABLE([shared-glapi],
[AS_HELP_STRING([--enable-shared-glapi], [AS_HELP_STRING([--enable-shared-glapi],
@@ -1598,7 +1689,7 @@ fi
AC_ARG_ENABLE([driglx-direct], AC_ARG_ENABLE([driglx-direct],
[AS_HELP_STRING([--disable-driglx-direct], [AS_HELP_STRING([--disable-driglx-direct],
[disable direct rendering in GLX and EGL for DRI \ [disable direct rendering in GLX and EGL for DRI \
@<:@default=auto@:>@])], @<:@default=enabled@:>@])],
[driglx_direct="$enableval"], [driglx_direct="$enableval"],
[driglx_direct="yes"]) [driglx_direct="yes"])
@@ -1622,6 +1713,8 @@ xxlib | xgallium-xlib)
xdri) xdri)
# DRI-based GLX # DRI-based GLX
require_dri_shared_libs_and_glapi "GLX"
# find the DRI deps for libGL # find the DRI deps for libGL
dri_modules="x11 xext xdamage >= $XDAMAGE_REQUIRED xfixes x11-xcb xcb xcb-glx >= $XCBGLX_REQUIRED" dri_modules="x11 xext xdamage >= $XDAMAGE_REQUIRED xfixes x11-xcb xcb xcb-glx >= $XCBGLX_REQUIRED"
@@ -1636,6 +1729,8 @@ xdri)
if test x"$enable_dri" = xyes; then if test x"$enable_dri" = xyes; then
dri_modules="$dri_modules xcb-dri2 >= $XCBDRI2_REQUIRED" dri_modules="$dri_modules xcb-dri2 >= $XCBDRI2_REQUIRED"
fi fi
dri_modules="$dri_modules xxf86vm"
fi fi
if test x"$dri_platform" = xapple ; then if test x"$dri_platform" = xapple ; then
DEFINES="$DEFINES -DGLX_USE_APPLEGL" DEFINES="$DEFINES -DGLX_USE_APPLEGL"
@@ -1645,12 +1740,6 @@ xdri)
fi fi
fi fi
# add xf86vidmode if available
PKG_CHECK_MODULES([XF86VIDMODE], [xxf86vm], HAVE_XF86VIDMODE=yes, HAVE_XF86VIDMODE=no)
if test "$HAVE_XF86VIDMODE" = yes ; then
dri_modules="$dri_modules xxf86vm"
fi
PKG_CHECK_MODULES([DRIGL], [$dri_modules]) PKG_CHECK_MODULES([DRIGL], [$dri_modules])
GL_PC_REQ_PRIV="$GL_PC_REQ_PRIV $dri_modules" GL_PC_REQ_PRIV="$GL_PC_REQ_PRIV $dri_modules"
X11_INCLUDES="$X11_INCLUDES $DRIGL_CFLAGS" X11_INCLUDES="$X11_INCLUDES $DRIGL_CFLAGS"
@@ -1662,10 +1751,6 @@ xdri)
;; ;;
esac esac
# This is outside the case (above) so that it is invoked even for non-GLX
# builds.
AM_CONDITIONAL(HAVE_XF86VIDMODE, test "x$HAVE_XF86VIDMODE" = xyes)
GLESv1_CM_LIB_DEPS="$LIBDRM_LIBS -lm $PTHREAD_LIBS $DLOPEN_LIBS" GLESv1_CM_LIB_DEPS="$LIBDRM_LIBS -lm $PTHREAD_LIBS $DLOPEN_LIBS"
GLESv1_CM_PC_LIB_PRIV="-lm $PTHREAD_LIBS $DLOPEN_LIBS" GLESv1_CM_PC_LIB_PRIV="-lm $PTHREAD_LIBS $DLOPEN_LIBS"
GLESv2_LIB_DEPS="$LIBDRM_LIBS -lm $PTHREAD_LIBS $DLOPEN_LIBS" GLESv2_LIB_DEPS="$LIBDRM_LIBS -lm $PTHREAD_LIBS $DLOPEN_LIBS"
@@ -1682,8 +1767,6 @@ AC_SUBST([GLESv1_CM_PC_LIB_PRIV])
AC_SUBST([GLESv2_LIB_DEPS]) AC_SUBST([GLESv2_LIB_DEPS])
AC_SUBST([GLESv2_PC_LIB_PRIV]) AC_SUBST([GLESv2_PC_LIB_PRIV])
AC_SUBST([HAVE_XF86VIDMODE])
dnl dnl
dnl More GLX setup dnl More GLX setup
dnl dnl
@@ -1757,19 +1840,6 @@ if test "x$with_platforms" = xauto; then
with_platforms=$with_egl_platforms with_platforms=$with_egl_platforms
fi fi
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
WAYLAND_SCANNER='')
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
fi
PKG_CHECK_EXISTS([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED], [have_wayland_protocols=yes], [have_wayland_protocols=no])
if test "x$have_wayland_protocols" = xyes; then
ac_wayland_protocols_pkgdatadir=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols`
fi
AC_SUBST(WAYLAND_PROTOCOLS_DATADIR, $ac_wayland_protocols_pkgdatadir)
# Do per platform setups and checks # Do per platform setups and checks
platforms=`IFS=', '; echo $with_platforms` platforms=`IFS=', '; echo $with_platforms`
for plat in $platforms; do for plat in $platforms; do
@@ -1778,13 +1848,26 @@ for plat in $platforms; do
PKG_CHECK_MODULES([WAYLAND_CLIENT], [wayland-client >= $WAYLAND_REQUIRED]) PKG_CHECK_MODULES([WAYLAND_CLIENT], [wayland-client >= $WAYLAND_REQUIRED])
PKG_CHECK_MODULES([WAYLAND_SERVER], [wayland-server >= $WAYLAND_REQUIRED]) PKG_CHECK_MODULES([WAYLAND_SERVER], [wayland-server >= $WAYLAND_REQUIRED])
PKG_CHECK_MODULES([WAYLAND_PROTOCOLS], [wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED])
if test "x$enable_egl" = xyes; then
PKG_CHECK_MODULES([WAYLAND_EGL], [wayland-egl-backend >= $WAYLAND_EGL_BACKEND_REQUIRED])
fi
WAYLAND_PROTOCOLS_DATADIR=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols`
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
WAYLAND_SCANNER='')
PKG_CHECK_EXISTS([wayland-scanner >= 1.15],
AC_SUBST(SCANNER_ARG, 'private-code'),
AC_SUBST(SCANNER_ARG, 'code'))
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
fi
if test "x$WAYLAND_SCANNER" = "x:"; then if test "x$WAYLAND_SCANNER" = "x:"; then
AC_MSG_ERROR([wayland-scanner is needed to compile the wayland platform]) AC_MSG_ERROR([wayland-scanner is needed to compile the wayland platform])
fi fi
if test "x$have_wayland_protocols" = xno; then
AC_MSG_ERROR([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED is needed to compile the wayland platform])
fi
DEFINES="$DEFINES -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED" DEFINES="$DEFINES -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED"
;; ;;
@@ -1794,6 +1877,7 @@ for plat in $platforms; do
;; ;;
drm) drm)
test "x$enable_egl" = "xyes" &&
test "x$enable_gbm" = "xno" && test "x$enable_gbm" = "xno" &&
AC_MSG_ERROR([EGL platform drm needs gbm]) AC_MSG_ERROR([EGL platform drm needs gbm])
DEFINES="$DEFINES -DHAVE_DRM_PLATFORM" DEFINES="$DEFINES -DHAVE_DRM_PLATFORM"
@@ -1805,6 +1889,9 @@ for plat in $platforms; do
android) android)
PKG_CHECK_MODULES([ANDROID], [cutils hardware sync]) PKG_CHECK_MODULES([ANDROID], [cutils hardware sync])
if test -n "$with_gallium_drivers"; then
PKG_CHECK_MODULES([BACKTRACE], [backtrace])
fi
DEFINES="$DEFINES -DHAVE_ANDROID_PLATFORM" DEFINES="$DEFINES -DHAVE_ANDROID_PLATFORM"
;; ;;
@@ -1819,6 +1906,7 @@ for plat in $platforms; do
;; ;;
esac esac
done done
AC_SUBST([WAYLAND_PROTOCOLS_DATADIR])
if test "x$enable_glx" != xno; then if test "x$enable_glx" != xno; then
if ! echo "$platforms" | grep -q 'x11'; then if ! echo "$platforms" | grep -q 'x11'; then
@@ -1831,6 +1919,26 @@ if test x"$enable_dri3" = xyes; then
dri3_modules="x11-xcb xcb >= $XCB_REQUIRED xcb-dri3 xcb-xfixes xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED" dri3_modules="x11-xcb xcb >= $XCB_REQUIRED xcb-dri3 xcb-xfixes xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules]) PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules])
dri3_modifier_modules="xcb-dri3 >= $XCBDRI3_MODIFIERS_REQUIRED xcb-present >= $XCBPRESENT_MODIFIERS_REQUIRED"
PKG_CHECK_MODULES([XCB_DRI3_MODIFIERS], [$dri3_modifier_modules], [have_dri3_modifiers=yes], [have_dri3_modifiers=no])
if test "x$have_dri3_modifiers" = xyes; then
DEFINES="$DEFINES -DHAVE_DRI3_MODIFIERS"
fi
fi
if echo "$platforms" | grep -q 'x11' && echo "$platforms" | grep -q 'drm'; then
have_xlease=yes
else
have_xlease=no
fi
if test x"$have_xlease" = xyes; then
randr_modules="x11-xcb xcb-randr"
PKG_CHECK_MODULES([XCB_RANDR], [$randr_modules])
xlib_randr_modules="xrandr"
PKG_CHECK_MODULES([XLIB_RANDR], [$xlib_randr_modules])
fi fi
AM_CONDITIONAL(HAVE_PLATFORM_X11, echo "$platforms" | grep -q 'x11') AM_CONDITIONAL(HAVE_PLATFORM_X11, echo "$platforms" | grep -q 'x11')
@@ -1839,6 +1947,25 @@ AM_CONDITIONAL(HAVE_PLATFORM_DRM, echo "$platforms" | grep -q 'drm')
AM_CONDITIONAL(HAVE_PLATFORM_SURFACELESS, echo "$platforms" | grep -q 'surfaceless') AM_CONDITIONAL(HAVE_PLATFORM_SURFACELESS, echo "$platforms" | grep -q 'surfaceless')
AM_CONDITIONAL(HAVE_PLATFORM_ANDROID, echo "$platforms" | grep -q 'android') AM_CONDITIONAL(HAVE_PLATFORM_ANDROID, echo "$platforms" | grep -q 'android')
AC_ARG_ENABLE(xlib-lease,
[AS_HELP_STRING([--enable-xlib-lease]
[enable VK_acquire_xlib_display using X leases])],
[enable_xlib_lease=$enableval], [enable_xlib_lease=auto])
case "x$enable_xlib_lease" in
xyes)
;;
xno)
;;
*)
if echo "$platforms" | grep -q 'x11' && echo "$platforms" | grep -q 'drm'; then
enable_xlib_lease=yes
else
enable_xlib_lease=no
fi
esac
AM_CONDITIONAL(HAVE_XLIB_LEASE, test "x$enable_xlib_lease" = xyes)
dnl dnl
dnl More DRI setup dnl More DRI setup
dnl dnl
@@ -2057,6 +2184,9 @@ if test -n "$with_vulkan_drivers"; then
PKG_CHECK_MODULES([AMDGPU], [libdrm >= $LIBDRM_AMDGPU_REQUIRED libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED]) PKG_CHECK_MODULES([AMDGPU], [libdrm >= $LIBDRM_AMDGPU_REQUIRED libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
radeon_llvm_check $LLVM_REQUIRED_RADV "radv" radeon_llvm_check $LLVM_REQUIRED_RADV "radv"
require_x11_dri3 "radv" require_x11_dri3 "radv"
if test "x$acv_mako_found" = xno; then
AC_MSG_ERROR([Python mako module v$PYTHON_MAKO_REQUIRED or higher not found])
fi
HAVE_RADEON_VULKAN=yes HAVE_RADEON_VULKAN=yes
;; ;;
*) *)
@@ -2174,13 +2304,13 @@ else
have_vdpau_platform=no have_vdpau_platform=no
fi fi
if echo $platforms | grep -q "x11\|drm"; then if echo $platforms | egrep -q "x11|drm"; then
have_omx_platform=yes have_omx_platform=yes
else else
have_omx_platform=no have_omx_platform=no
fi fi
if echo $platforms | grep -q "x11\|drm\|wayland"; then if echo $platforms | egrep -q "x11|drm|wayland"; then
have_va_platform=yes have_va_platform=yes
else else
have_va_platform=no have_va_platform=no
@@ -2202,6 +2332,10 @@ if test -n "$with_gallium_drivers" -a "x$with_gallium_drivers" != xswrast; then
PKG_CHECK_EXISTS([libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED], [enable_omx_bellagio=yes], [enable_omx_bellagio=no]) PKG_CHECK_EXISTS([libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED], [enable_omx_bellagio=yes], [enable_omx_bellagio=no])
fi fi
if test "x$enable_omx_tizonia" = xauto -a "x$have_omx_platform" = xyes; then
PKG_CHECK_EXISTS([libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED], [enable_omx_tizonia=yes], [enable_omx_tizonia=no])
fi
if test "x$enable_va" = xauto -a "x$have_va_platform" = xyes; then if test "x$enable_va" = xauto -a "x$have_va_platform" = xyes; then
PKG_CHECK_EXISTS([libva >= $LIBVA_REQUIRED], [enable_va=yes], [enable_va=no]) PKG_CHECK_EXISTS([libva >= $LIBVA_REQUIRED], [enable_va=yes], [enable_va=no])
fi fi
@@ -2211,6 +2345,7 @@ if test "x$enable_dri" = xyes -o \
"x$enable_xvmc" = xyes -o \ "x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \ "x$enable_vdpau" = xyes -o \
"x$enable_omx_bellagio" = xyes -o \ "x$enable_omx_bellagio" = xyes -o \
"x$enable_omx_tizonia" = xyes -o \
"x$enable_va" = xyes; then "x$enable_va" = xyes; then
need_gallium_vl=yes need_gallium_vl=yes
fi fi
@@ -2219,6 +2354,7 @@ AM_CONDITIONAL(NEED_GALLIUM_VL, test "x$need_gallium_vl" = xyes)
if test "x$enable_xvmc" = xyes -o \ if test "x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \ "x$enable_vdpau" = xyes -o \
"x$enable_omx_bellagio" = xyes -o \ "x$enable_omx_bellagio" = xyes -o \
"x$enable_omx_tizonia" = xyes -o \
"x$enable_va" = xyes; then "x$enable_va" = xyes; then
if echo $platforms | grep -q "x11"; then if echo $platforms | grep -q "x11"; then
PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED]) PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
@@ -2252,9 +2388,27 @@ if test "x$enable_omx_bellagio" = xyes; then
fi fi
PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED]) PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED])
gallium_st="$gallium_st omx_bellagio" gallium_st="$gallium_st omx_bellagio"
AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 1, [Use Bellagio for OMX IL])
else
AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 0)
fi fi
AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes) AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes)
if test "x$enable_omx_tizonia" = xyes; then
if test "x$have_omx_platform" != xyes; then
AC_MSG_ERROR([OMX requires at least one of the x11 or drm platforms])
fi
PKG_CHECK_MODULES([OMX_TIZONIA],
[libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED
tizilheaders >= $LIBOMXIL_TIZONIA_REQUIRED
libtizplatform >= $LIBOMXIL_TIZONIA_REQUIRED])
gallium_st="$gallium_st omx_tizonia"
AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 1, [Use Tizonia for OMX IL])
else
AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 0)
fi
AM_CONDITIONAL(HAVE_ST_OMX_TIZONIA, test "x$enable_omx_tizonia" = xyes)
if test "x$enable_va" = xyes; then if test "x$enable_va" = xyes; then
if test "x$have_va_platform" != xyes; then if test "x$have_va_platform" != xyes; then
AC_MSG_ERROR([VA requires at least one of the x11 drm or wayland platforms]) AC_MSG_ERROR([VA requires at least one of the x11 drm or wayland platforms])
@@ -2428,6 +2582,15 @@ AC_ARG_WITH([omx-bellagio-libdir],
$PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libomxil-bellagio`]) $PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libomxil-bellagio`])
AC_SUBST([OMX_BELLAGIO_LIB_INSTALL_DIR]) AC_SUBST([OMX_BELLAGIO_LIB_INSTALL_DIR])
dnl Directory for OMX_TIZONIA libs
AC_ARG_WITH([omx-tizonia-libdir],
[AS_HELP_STRING([--with-omx-tizonia-libdir=DIR],
[directory for the OMX_TIZONIA libraries])],
[OMX_TIZONIA_LIB_INSTALL_DIR="$withval"],
[OMX_TIZONIA_LIB_INSTALL_DIR=`$PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libtizcore`])
AC_SUBST([OMX_TIZONIA_LIB_INSTALL_DIR])
dnl Directory for VA libs dnl Directory for VA libs
AC_ARG_WITH([va-libdir], AC_ARG_WITH([va-libdir],
@@ -2571,7 +2734,6 @@ if test -n "$with_gallium_drivers"; then
;; ;;
xfreedreno) xfreedreno)
HAVE_GALLIUM_FREEDRENO=yes HAVE_GALLIUM_FREEDRENO=yes
PKG_CHECK_MODULES([FREEDRENO], [libdrm >= $LIBDRM_FREEDRENO_REQUIRED libdrm_freedreno >= $LIBDRM_FREEDRENO_REQUIRED])
require_libdrm "freedreno" require_libdrm "freedreno"
;; ;;
xetnaviv) xetnaviv)
@@ -2579,8 +2741,9 @@ if test -n "$with_gallium_drivers"; then
PKG_CHECK_MODULES([ETNAVIV], [libdrm >= $LIBDRM_ETNAVIV_REQUIRED libdrm_etnaviv >= $LIBDRM_ETNAVIV_REQUIRED]) PKG_CHECK_MODULES([ETNAVIV], [libdrm >= $LIBDRM_ETNAVIV_REQUIRED libdrm_etnaviv >= $LIBDRM_ETNAVIV_REQUIRED])
require_libdrm "etnaviv" require_libdrm "etnaviv"
;; ;;
ximx) xtegra)
HAVE_GALLIUM_IMX=yes HAVE_GALLIUM_TEGRA=yes
require_libdrm "tegra"
;; ;;
xswrast) xswrast)
HAVE_GALLIUM_SOFTPIPE=yes HAVE_GALLIUM_SOFTPIPE=yes
@@ -2649,23 +2812,23 @@ if test -n "$with_gallium_drivers"; then
;; ;;
xvc4) xvc4)
HAVE_GALLIUM_VC4=yes HAVE_GALLIUM_VC4=yes
require_libdrm "vc4" PKG_CHECK_MODULES([VC4], [libdrm >= $LIBDRM_VC4_REQUIRED])
PKG_CHECK_MODULES([SIMPENROSE], [simpenrose], PKG_CHECK_MODULES([SIMPENROSE], [simpenrose],
[USE_VC4_SIMULATOR=yes; [USE_VC4_SIMULATOR=yes;
DEFINES="$DEFINES -DUSE_VC4_SIMULATOR"], DEFINES="$DEFINES -DUSE_VC4_SIMULATOR"],
[USE_VC4_SIMULATOR=no]) [USE_VC4_SIMULATOR=no])
;; ;;
xvc5) xv3d)
HAVE_GALLIUM_VC5=yes HAVE_GALLIUM_V3D=yes
PKG_CHECK_MODULES([VC5_SIMULATOR], [v3dv3], PKG_CHECK_MODULES([V3D_SIMULATOR], [v3dv3],
[USE_VC5_SIMULATOR=yes; [USE_V3D_SIMULATOR=yes;
DEFINES="$DEFINES -DUSE_VC5_SIMULATOR"], DEFINES="$DEFINES -DUSE_V3D_SIMULATOR"],
[AC_MSG_ERROR([vc5 requires the simulator])]) [USE_V3D_SIMULATOR=no])
;; ;;
xpl111) xkmsro)
HAVE_GALLIUM_PL111=yes HAVE_GALLIUM_KMSRO=yes
;; ;;
xvirgl) xvirgl)
HAVE_GALLIUM_VIRGL=yes HAVE_GALLIUM_VIRGL=yes
@@ -2682,8 +2845,8 @@ if test -n "$with_gallium_drivers"; then
fi fi
# XXX: Keep in sync with LLVM_REQUIRED_SWR # XXX: Keep in sync with LLVM_REQUIRED_SWR
AM_CONDITIONAL(SWR_INVALID_LLVM_VERSION, test "x$LLVM_VERSION" != x3.9.0 -a \ AM_CONDITIONAL(SWR_INVALID_LLVM_VERSION, test "x$LLVM_VERSION" != x6.0.0 -a \
"x$LLVM_VERSION" != x3.9.1) "x$LLVM_VERSION" != x6.0.1)
if test "x$enable_llvm" = "xyes" -a "$with_gallium_drivers"; then if test "x$enable_llvm" = "xyes" -a "$with_gallium_drivers"; then
llvm_require_version $LLVM_REQUIRED_GALLIUM "gallium" llvm_require_version $LLVM_REQUIRED_GALLIUM "gallium"
@@ -2698,15 +2861,14 @@ AM_CONDITIONAL(HAVE_SWR_BUILTIN, test "x$HAVE_SWR_BUILTIN" = xyes)
dnl We need to validate some needed dependencies for renderonly drivers. dnl We need to validate some needed dependencies for renderonly drivers.
if test "x$HAVE_GALLIUM_ETNAVIV" != xyes -a "x$HAVE_GALLIUM_IMX" = xyes ; then if test "x$HAVE_GALLIUM_VC4" != xyes -a "x$HAVE_GALLIUM_KMSRO" = xyes ; then
AC_MSG_ERROR([Building with imx requires etnaviv]) AC_MSG_ERROR([Building with kmsro requires vc4])
fi fi
if test "x$HAVE_GALLIUM_VC4" != xyes -a "x$HAVE_GALLIUM_PL111" = xyes ; then if test "x$HAVE_GALLIUM_NOUVEAU" != xyes -a "x$HAVE_GALLIUM_TEGRA" = xyes; then
AC_MSG_ERROR([Building with pl111 requires vc4]) AC_MSG_ERROR([Building with tegra requires nouveau])
fi fi
detect_old_buggy_llvm() { detect_old_buggy_llvm() {
dnl llvm-config may not give the right answer when llvm is built as a dnl llvm-config may not give the right answer when llvm is built as a
dnl single shared library, so we must work the library name out for dnl single shared library, so we must work the library name out for
@@ -2748,6 +2910,7 @@ if test "x$enable_llvm" = xyes; then
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags` LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_CFLAGS=$LLVM_CPPFLAGS # CPPFLAGS seem to be sufficient LLVM_CFLAGS=$LLVM_CPPFLAGS # CPPFLAGS seem to be sufficient
LLVM_CXXFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cxxflags"` LLVM_CXXFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cxxflags"`
LLVM_CXXFLAGS="$CXX11_CXXFLAGS $LLVM_CXXFLAGS"
dnl Set LLVM_LIBS - This is done after the driver configuration so dnl Set LLVM_LIBS - This is done after the driver configuration so
dnl that drivers can add additional components to LLVM_COMPONENTS. dnl that drivers can add additional components to LLVM_COMPONENTS.
@@ -2780,27 +2943,38 @@ if test "x$enable_llvm" = xyes; then
fi fi
fi fi
fi fi
dnl The gallium-xlib GLX and gallium OSMesa targets directly embed the
dnl swr/llvmpipe driver into the final binary. Adding LLVM_LIBS results in
dnl the LLVM library propagated in the Libs.private of the respective .pc
dnl file which ensures complete dependency information when statically
dnl linking.
if test "x$enable_glx" = xgallium-xlib; then
GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
fi
if test "x$enable_gallium_osmesa" = xyes; then
OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
fi
fi fi
AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_I915, test "x$HAVE_GALLIUM_I915" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_I915, test "x$HAVE_GALLIUM_I915" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_PL111, test "x$HAVE_GALLIUM_PL111" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_KMSRO, test "x$HAVE_GALLIUM_KMSRO" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_R300, test "x$HAVE_GALLIUM_R300" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_R300, test "x$HAVE_GALLIUM_R300" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_R600, test "x$HAVE_GALLIUM_R600" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_R600, test "x$HAVE_GALLIUM_R600" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_RADEONSI, test "x$HAVE_GALLIUM_RADEONSI" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_RADEONSI, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_RADEON_COMMON, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_ETNAVIV, test "x$HAVE_GALLIUM_ETNAVIV" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_ETNAVIV, test "x$HAVE_GALLIUM_ETNAVIV" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_IMX, test "x$HAVE_GALLIUM_IMX" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_TEGRA, test "x$HAVE_GALLIUM_TEGRA" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SWR, test "x$HAVE_GALLIUM_SWR" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_SWR, test "x$HAVE_GALLIUM_SWR" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SWRAST, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes -o \ AM_CONDITIONAL(HAVE_GALLIUM_SWRAST, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes -o \
"x$HAVE_GALLIUM_LLVMPIPE" = xyes -o \ "x$HAVE_GALLIUM_LLVMPIPE" = xyes -o \
"x$HAVE_GALLIUM_SWR" = xyes) "x$HAVE_GALLIUM_SWR" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_V3D, test "x$HAVE_GALLIUM_V3D" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_VC4, test "x$HAVE_GALLIUM_VC4" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_VC4, test "x$HAVE_GALLIUM_VC4" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_VC5, test "x$HAVE_GALLIUM_VC5" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_VIRGL, test "x$HAVE_GALLIUM_VIRGL" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_VIRGL, test "x$HAVE_GALLIUM_VIRGL" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_STATIC_TARGETS, test "x$enable_shared_pipe_drivers" = xno) AM_CONDITIONAL(HAVE_GALLIUM_STATIC_TARGETS, test "x$enable_shared_pipe_drivers" = xno)
@@ -2828,8 +3002,9 @@ AM_CONDITIONAL(HAVE_AMD_DRIVERS, test "x$HAVE_GALLIUM_RADEONSI" = xyes -o \
"x$HAVE_RADEON_VULKAN" = xyes) "x$HAVE_RADEON_VULKAN" = xyes)
AM_CONDITIONAL(HAVE_BROADCOM_DRIVERS, test "x$HAVE_GALLIUM_VC4" = xyes -o \ AM_CONDITIONAL(HAVE_BROADCOM_DRIVERS, test "x$HAVE_GALLIUM_VC4" = xyes -o \
"x$HAVE_GALLIUM_VC5" = xyes) "x$HAVE_GALLIUM_V3D" = xyes)
AM_CONDITIONAL(HAVE_FREEDRENO_DRIVERS, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \ AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
"x$HAVE_I965_DRI" = xyes) "x$HAVE_I965_DRI" = xyes)
@@ -2839,8 +3014,8 @@ AM_CONDITIONAL(NEED_RADEON_DRM_WINSYS, test "x$HAVE_GALLIUM_R300" = xyes -o \
AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$enable_glx" = xgallium-xlib) AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$enable_glx" = xgallium-xlib)
AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes) AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_LLVM, test "x$enable_llvm" = xyes) AM_CONDITIONAL(HAVE_GALLIUM_LLVM, test "x$enable_llvm" = xyes)
AM_CONDITIONAL(USE_V3D_SIMULATOR, test x$USE_V3D_SIMULATOR = xyes)
AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes) AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes)
AM_CONDITIONAL(USE_VC5_SIMULATOR, test x$USE_VC5_SIMULATOR = xyes)
AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes) AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes) AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
@@ -2876,7 +3051,7 @@ AC_SUBST([XVMC_MAJOR], 1)
AC_SUBST([XVMC_MINOR], 0) AC_SUBST([XVMC_MINOR], 0)
AC_SUBST([XA_MAJOR], 2) AC_SUBST([XA_MAJOR], 2)
AC_SUBST([XA_MINOR], 3) AC_SUBST([XA_MINOR], 5)
AC_SUBST([XA_PATCH], 0) AC_SUBST([XA_PATCH], 0)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_PATCH") AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_PATCH")
@@ -2922,40 +3097,36 @@ AC_CONFIG_FILES([Makefile
src/amd/vulkan/Makefile src/amd/vulkan/Makefile
src/broadcom/Makefile src/broadcom/Makefile
src/compiler/Makefile src/compiler/Makefile
src/freedreno/Makefile
src/egl/Makefile src/egl/Makefile
src/egl/main/egl.pc src/egl/main/egl.pc
src/egl/wayland/wayland-drm/Makefile src/egl/wayland/wayland-drm/Makefile
src/egl/wayland/wayland-egl/Makefile
src/egl/wayland/wayland-egl/wayland-egl.pc
src/gallium/Makefile src/gallium/Makefile
src/gallium/auxiliary/Makefile src/gallium/auxiliary/Makefile
src/gallium/auxiliary/pipe-loader/Makefile src/gallium/auxiliary/pipe-loader/Makefile
src/gallium/drivers/freedreno/Makefile src/gallium/drivers/freedreno/Makefile
src/gallium/drivers/ddebug/Makefile
src/gallium/drivers/i915/Makefile src/gallium/drivers/i915/Makefile
src/gallium/drivers/llvmpipe/Makefile src/gallium/drivers/llvmpipe/Makefile
src/gallium/drivers/noop/Makefile
src/gallium/drivers/nouveau/Makefile src/gallium/drivers/nouveau/Makefile
src/gallium/drivers/pl111/Makefile src/gallium/drivers/kmsro/Makefile
src/gallium/drivers/r300/Makefile src/gallium/drivers/r300/Makefile
src/gallium/drivers/r600/Makefile src/gallium/drivers/r600/Makefile
src/gallium/drivers/radeon/Makefile
src/gallium/drivers/radeonsi/Makefile src/gallium/drivers/radeonsi/Makefile
src/gallium/drivers/rbug/Makefile
src/gallium/drivers/softpipe/Makefile src/gallium/drivers/softpipe/Makefile
src/gallium/drivers/svga/Makefile src/gallium/drivers/svga/Makefile
src/gallium/drivers/swr/Makefile src/gallium/drivers/swr/Makefile
src/gallium/drivers/trace/Makefile src/gallium/drivers/tegra/Makefile
src/gallium/drivers/etnaviv/Makefile src/gallium/drivers/etnaviv/Makefile
src/gallium/drivers/imx/Makefile src/gallium/drivers/v3d/Makefile
src/gallium/drivers/vc4/Makefile src/gallium/drivers/vc4/Makefile
src/gallium/drivers/vc5/Makefile
src/gallium/drivers/virgl/Makefile src/gallium/drivers/virgl/Makefile
src/gallium/state_trackers/clover/Makefile src/gallium/state_trackers/clover/Makefile
src/gallium/state_trackers/dri/Makefile src/gallium/state_trackers/dri/Makefile
src/gallium/state_trackers/glx/xlib/Makefile src/gallium/state_trackers/glx/xlib/Makefile
src/gallium/state_trackers/nine/Makefile src/gallium/state_trackers/nine/Makefile
src/gallium/state_trackers/omx_bellagio/Makefile src/gallium/state_trackers/omx/Makefile
src/gallium/state_trackers/omx/bellagio/Makefile
src/gallium/state_trackers/omx/tizonia/Makefile
src/gallium/state_trackers/osmesa/Makefile src/gallium/state_trackers/osmesa/Makefile
src/gallium/state_trackers/va/Makefile src/gallium/state_trackers/va/Makefile
src/gallium/state_trackers/vdpau/Makefile src/gallium/state_trackers/vdpau/Makefile
@@ -2966,7 +3137,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/targets/d3dadapter9/d3d.pc src/gallium/targets/d3dadapter9/d3d.pc
src/gallium/targets/dri/Makefile src/gallium/targets/dri/Makefile
src/gallium/targets/libgl-xlib/Makefile src/gallium/targets/libgl-xlib/Makefile
src/gallium/targets/omx-bellagio/Makefile src/gallium/targets/omx/Makefile
src/gallium/targets/opencl/Makefile src/gallium/targets/opencl/Makefile
src/gallium/targets/opencl/mesa.icd src/gallium/targets/opencl/mesa.icd
src/gallium/targets/osmesa/Makefile src/gallium/targets/osmesa/Makefile
@@ -2980,11 +3151,10 @@ AC_CONFIG_FILES([Makefile
src/gallium/tests/trivial/Makefile src/gallium/tests/trivial/Makefile
src/gallium/tests/unit/Makefile src/gallium/tests/unit/Makefile
src/gallium/winsys/etnaviv/drm/Makefile src/gallium/winsys/etnaviv/drm/Makefile
src/gallium/winsys/imx/drm/Makefile
src/gallium/winsys/freedreno/drm/Makefile src/gallium/winsys/freedreno/drm/Makefile
src/gallium/winsys/i915/drm/Makefile src/gallium/winsys/i915/drm/Makefile
src/gallium/winsys/nouveau/drm/Makefile src/gallium/winsys/nouveau/drm/Makefile
src/gallium/winsys/pl111/drm/Makefile src/gallium/winsys/kmsro/drm/Makefile
src/gallium/winsys/radeon/drm/Makefile src/gallium/winsys/radeon/drm/Makefile
src/gallium/winsys/amdgpu/drm/Makefile src/gallium/winsys/amdgpu/drm/Makefile
src/gallium/winsys/svga/drm/Makefile src/gallium/winsys/svga/drm/Makefile
@@ -2993,8 +3163,9 @@ AC_CONFIG_FILES([Makefile
src/gallium/winsys/sw/null/Makefile src/gallium/winsys/sw/null/Makefile
src/gallium/winsys/sw/wrapper/Makefile src/gallium/winsys/sw/wrapper/Makefile
src/gallium/winsys/sw/xlib/Makefile src/gallium/winsys/sw/xlib/Makefile
src/gallium/winsys/tegra/drm/Makefile
src/gallium/winsys/v3d/drm/Makefile
src/gallium/winsys/vc4/drm/Makefile src/gallium/winsys/vc4/drm/Makefile
src/gallium/winsys/vc5/drm/Makefile
src/gallium/winsys/virgl/drm/Makefile src/gallium/winsys/virgl/drm/Makefile
src/gallium/winsys/virgl/vtest/Makefile src/gallium/winsys/virgl/vtest/Makefile
src/gbm/Makefile src/gbm/Makefile
@@ -3028,8 +3199,11 @@ AC_CONFIG_FILES([Makefile
src/mesa/main/tests/Makefile src/mesa/main/tests/Makefile
src/mesa/state_tracker/tests/Makefile src/mesa/state_tracker/tests/Makefile
src/util/Makefile src/util/Makefile
src/util/tests/fast_idiv_by_const/Makefile
src/util/tests/hash_table/Makefile src/util/tests/hash_table/Makefile
src/util/tests/set/Makefile
src/util/tests/string_buffer/Makefile src/util/tests/string_buffer/Makefile
src/util/tests/vma/Makefile
src/util/xmlpool/Makefile src/util/xmlpool/Makefile
src/vulkan/Makefile]) src/vulkan/Makefile])
@@ -3042,6 +3216,9 @@ $SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_bl
rm -f src/compiler/spirv/spirv_info.lo rm -f src/compiler/spirv/spirv_info.lo
echo "# dummy" > src/compiler/spirv/.deps/spirv_info.Plo echo "# dummy" > src/compiler/spirv/.deps/spirv_info.Plo
rm -f src/compiler/nir/.deps/nir_intrinsics.Plo
echo "# dummy" > src/compiler/nir/.deps/nir_intrinsics.Plo
dnl dnl
dnl Output some configuration info for the user dnl Output some configuration info for the user
dnl dnl
@@ -3194,7 +3371,7 @@ if test "x$enable_llvm" = xyes; then
echo " LLVM_LDFLAGS: $LLVM_LDFLAGS" echo " LLVM_LDFLAGS: $LLVM_LDFLAGS"
echo "" echo ""
fi fi
echo " PYTHON2: $PYTHON2" echo " PYTHON: $PYTHON"
echo "" echo ""
echo " Run '${MAKE-make}' to build Mesa" echo " Run '${MAKE-make}' to build Mesa"

View File

@@ -26,6 +26,12 @@
</ul> </ul>
</ol> </ol>
<h2>ATTENTION:</h2>
<p>
The autotools build is being replaced by the <a href="meson.html">meson</a>
build system. If you haven't yet, now is a good time to try using meson and
report any issues you run into.
</p>
<h2 id="basic">1. Basic Usage</h2> <h2 id="basic">1. Basic Usage</h2>
@@ -94,6 +100,13 @@ Currently there's only one config file provided when dri drivers are
enabled - it's <code>drirc</code>.</p> enabled - it's <code>drirc</code>.</p>
</dd> </dd>
<dt><code>--datadir=DIR</code></dt>
<dd><p>This option specifies the directory where the data files will
be installed. The default is <code>${prefix}/share</code>.
Currently when dri drivers are enabled, <code>drirc.d/</code> is at
this place.</p>
</dd>
<dt><code>--enable-static, --disable-shared</code></dt> <dt><code>--enable-static, --disable-shared</code></dt>
<dd><p>By default, Mesa <dd><p>By default, Mesa
will build shared libraries. Either of these options will force static will build shared libraries. Either of these options will force static

View File

@@ -14,7 +14,7 @@
<iframe src="contents.html"></iframe> <iframe src="contents.html"></iframe>
<div class="content"> <div class="content">
<h1>Bug Database</h1> <h1>Report a bug</h1>
<p> <p>
The Mesa bug database is hosted on The Mesa bug database is hosted on

View File

@@ -83,7 +83,7 @@ We try to quote the OpenGL specification where prudent:
* "An INVALID_OPERATION error is generated for any of the following * "An INVALID_OPERATION error is generated for any of the following
* conditions: * conditions:
* *
* * <length> is zero." * * &lt;length&gt; is zero."
* *
* Additionally, page 94 of the PDF of the OpenGL 4.5 core spec * Additionally, page 94 of the PDF of the OpenGL 4.5 core spec
* (30.10.2014) also says this, so it's no longer allowed for desktop GL, * (30.10.2014) also says this, so it's no longer allowed for desktop GL,
@@ -94,7 +94,7 @@ Function comment example:
<pre> <pre>
/** /**
* Create and initialize a new buffer object. Called via the * Create and initialize a new buffer object. Called via the
* ctx->Driver.CreateObject() driver callback function. * ctx-&gt;Driver.CreateObject() driver callback function.
* \param name integer name of the object * \param name integer name of the object
* \param type one of GL_FOO, GL_BAR, etc. * \param type one of GL_FOO, GL_BAR, etc.
* \return pointer to new object or NULL if error * \return pointer to new object or NULL if error

View File

@@ -49,10 +49,10 @@
<li><a href="precompiled.html" target="_parent">Precompiled Libraries</a> <li><a href="precompiled.html" target="_parent">Precompiled Libraries</a>
</ul> </ul>
<b>Resources</b> <b>Need help?</b>
<ul> <ul>
<li><a href="lists.html" target="_parent">Mailing Lists</a> <li><a href="lists.html" target="_parent">Mailing Lists</a>
<li><a href="bugs.html" target="_parent">Bug Database</a> <li><a href="bugs.html" target="_parent">Report a bug</a>
<li><a href="webmaster.html" target="_parent">Webmaster</a> <li><a href="webmaster.html" target="_parent">Webmaster</a>
<li><a href="https://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a> <li><a href="https://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a>
</ul> </ul>

View File

@@ -102,9 +102,9 @@ In the past, GLUT, GLU and the Mesa demos were released in conjunction with
Mesa releases. But since GLUT, GLU and the demos change infrequently, they Mesa releases. But since GLUT, GLU and the demos change infrequently, they
were split off into their own git repositories: were split off into their own git repositories:
<a href="https://cgit.freedesktop.org/mesa/glut/">GLUT</a>, <a href="https://gitlab.freedesktop.org/mesa/glut">GLUT</a>,
<a href="https://cgit.freedesktop.org/mesa/glu/">GLU</a> and <a href="https://gitlab.freedesktop.org/mesa/glu">GLU</a> and
<a href="https://cgit.freedesktop.org/mesa/demos/">Demos</a>, <a href="https://gitlab.freedesktop.org/mesa/demos">Demos</a>,
</p> </p>
</div> </div>

View File

@@ -168,6 +168,7 @@ the X server directly using (XCB-)DRI2 protocol.</p>
<p>This driver can share DRI drivers with <code>libGL</code>.</p> <p>This driver can share DRI drivers with <code>libGL</code>.</p>
</dd> </dd>
</dl>
<h2>Packaging</h2> <h2>Packaging</h2>

View File

@@ -88,22 +88,40 @@ This is a work-around for that.
<li>MESA_GL_VERSION_OVERRIDE - changes the value returned by <li>MESA_GL_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) and possibly the GL API type. glGetString(GL_VERSION) and possibly the GL API type.
<ul> <ul>
<li> The format should be MAJOR.MINOR[FC] <li>The format should be MAJOR.MINOR[FC|COMPAT]
<li> FC is an optional suffix that indicates a forward compatible context. <li>FC is an optional suffix that indicates a forward compatible
This is only valid for versions &gt;= 3.0. context. This is only valid for versions &gt;= 3.0.
<li> GL versions &lt; 3.0 are set to a compatibility (non-Core) profile <li>COMPAT is an optional suffix that indicates a compatibility
<li> GL versions = 3.0, see below context or GL_ARB_compatibility support. This is only valid for
<li> GL versions &gt; 3.0 are set to a Core profile versions &gt;= 3.1.
<li> Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC <li>GL versions &lt;= 3.0 are set to a compatibility (non-Core)
<ul> profile
<li> 2.1 - select a compatibility (non-Core) profile with GL version 2.1 <li>GL versions = 3.1, depending on the driver, it may or may not
<li> 3.0 - select a compatibility (non-Core) profile with GL version 3.0 have the ARB_compatibility extension enabled.
<li> 3.0FC - select a Core+Forward Compatible profile with GL version 3.0 <li>GL versions &gt;= 3.2 are set to a Core profile
<li> 3.1 - select a Core profile with GL version 3.1 <li>Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC, 3.1COMPAT, X.Y, X.YFC,
<li> 3.1FC - select a Core+Forward Compatible profile with GL version 3.1 X.YCOMPAT.
</ul> <ul>
<li> Mesa may not really implement all the features of the given version. <li>2.1 - select a compatibility (non-Core) profile with GL
(for developers only) version 2.1.
<li>3.0 - select a compatibility (non-Core) profile with GL
version 3.0.
<li>3.0FC - select a Core+Forward Compatible profile with GL
version 3.0.
<li>3.1 - select GL version 3.1 with GL_ARB_compatibility enabled
per the driver default.
<li>3.1FC - select GL version 3.1 with forward compatibility and
GL_ARB_compatibility disabled.
<li>3.1COMPAT - select GL version 3.1 with GL_ARB_compatibility
enabled.
<li>X.Y - override GL version to X.Y without changing the profile.
<li>X.YFC - select a Core+Forward Compatible profile with GL
version X.Y.
<li>X.YCOMPAT - select a Compatibility profile with GL version
X.Y.
</ul>
<li>Mesa may not really implement all the features of the given
version. (for developers only)
</ul> </ul>
<li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by <li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) for OpenGL ES. glGetString(GL_VERSION) for OpenGL ES.
@@ -128,13 +146,23 @@ your system. For example under the default settings you may end up with a 1GB
cache for x86_64 and another 1GB cache for i386. cache for x86_64 and another 1GB cache for i386.
<li>MESA_GLSL_CACHE_DIR - if set, determines the directory to be used <li>MESA_GLSL_CACHE_DIR - if set, determines the directory to be used
for the on-disk cache of compiled GLSL programs. If this variable is for the on-disk cache of compiled GLSL programs. If this variable is
not set, then the cache will be stored in $XDG_CACHE_HOME/mesa (if not set, then the cache will be stored in $XDG_CACHE_HOME/mesa_shader_cache (if
that variable is set), or else within .cache/mesa within the user's that variable is set), or else within .cache/mesa_shader_cache within the user's
home directory. home directory.
<li>MESA_GLSL - <a href="shading.html#envvars">shading language compiler options</a> <li>MESA_GLSL - <a href="shading.html#envvars">shading language compiler options</a>
<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled. <li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.
<li>MESA_SHADER_CAPTURE_PATH - see <a href="shading.html#capture">Capturing Shaders</a></li> <li>MESA_SHADER_CAPTURE_PATH - see <a href="shading.html#capture">Capturing Shaders</a></li>
<li>MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH - see <a href="shading.html#replacement">Experimenting with Shader Replacements</a></li> <li>MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH - see <a href="shading.html#replacement">Experimenting with Shader Replacements</a></li>
<li>MESA_VK_VERSION_OVERRIDE - changes the Vulkan physical device version
as returned in VkPhysicalDeviceProperties::apiVersion.
<ul>
<li>The format should be MAJOR.MINOR[.PATCH]</li>
<li>This will not let you force a version higher than the driver's
instance version as advertised by vkEnumerateInstanceVersion</li>
<li>This can be very useful for debugging but some features may not be
implemented correctly. (For developers only)</li>
</ul>
</li>
</ul> </ul>
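The MESA_GL_VERSION_OVERRIDE format described in the list above (MAJOR.MINOR[FC|COMPAT], with FC only valid for versions >= 3.0 and COMPAT only valid for versions >= 3.1) can be checked with a few lines of Python. This is only a sketch of the documented format, not Mesa's own parser: the names `GLVersionOverride` and `parse_gl_version_override` are hypothetical, and returning `None` for a rejected value is an assumption made for illustration.

```python
import re
from typing import NamedTuple, Optional


class GLVersionOverride(NamedTuple):
    """Hypothetical container for a parsed override string."""
    major: int
    minor: int
    suffix: str  # "", "FC" or "COMPAT"


# MAJOR.MINOR followed by an optional FC or COMPAT suffix.
_OVERRIDE_RE = re.compile(r"^(\d+)\.(\d+)(FC|COMPAT)?$")


def parse_gl_version_override(value: str) -> Optional[GLVersionOverride]:
    """Return the parsed override, or None if it does not fit the documented format."""
    match = _OVERRIDE_RE.match(value)
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    suffix = match.group(3) or ""
    # Documented constraint: FC is only valid for versions >= 3.0.
    if suffix == "FC" and (major, minor) < (3, 0):
        return None
    # Documented constraint: COMPAT is only valid for versions >= 3.1.
    if suffix == "COMPAT" and (major, minor) < (3, 1):
        return None
    return GLVersionOverride(major, minor, suffix)
```

For example, `parse_gl_version_override("3.1COMPAT")` yields major 3, minor 1 and suffix "COMPAT", while "2.1FC" and "3.0COMPAT" are rejected. How Mesa itself treats a malformed value is not stated in the text above.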
@@ -241,7 +269,7 @@ Mesa EGL supports different sets of environment variables. See the
Especially useful to toggle hud at specific points of application and Especially useful to toggle hud at specific points of application and
disable for unencumbered viewing the rest of the time. For example, set disable for unencumbered viewing the rest of the time. For example, set
GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1). GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).
Use kill -10 <pid> to toggle the hud as desired. Use kill -10 &lt;pid&gt; to toggle the hud as desired.
<li>GALLIUM_HUD_DUMP_DIR - specifies a directory for writing the displayed <li>GALLIUM_HUD_DUMP_DIR - specifies a directory for writing the displayed
hud values into files. hud values into files.
<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=true for <li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=true for
@@ -313,6 +341,12 @@ such as the OpenGL program's name and command line arguments.
<li>See the driver code for other, lesser-used variables. <li>See the driver code for other, lesser-used variables.
</ul> </ul>
<h3>WGL environment variables</h3>
<ul>
<li>WGL_SWAP_INTERVAL - to set a swap interval, equivalent to calling
wglSwapIntervalEXT() in an application. If this environment variable
is set, application calls to wglSwapIntervalEXT() will have no effect.
</ul>
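Returning to the MESA_GLSL_CACHE_DIR entry earlier in this file: the lookup order it describes (the variable itself, then $XDG_CACHE_HOME/mesa_shader_cache, then .cache/mesa_shader_cache in the user's home directory) can be sketched as below. The function name `default_shader_cache_dir` is hypothetical, and the sketch follows only the wording of the documentation; the real implementation may create further subdirectories beneath the path it picks.

```python
import os


def default_shader_cache_dir(env=None):
    """Resolve the shader cache directory following the documented lookup order."""
    env = os.environ if env is None else env
    # 1. An explicit MESA_GLSL_CACHE_DIR wins.
    explicit = env.get("MESA_GLSL_CACHE_DIR")
    if explicit:
        return explicit
    # 2. Otherwise use $XDG_CACHE_HOME/mesa_shader_cache if that variable is set.
    xdg = env.get("XDG_CACHE_HOME")
    if xdg:
        return os.path.join(xdg, "mesa_shader_cache")
    # 3. Fall back to .cache/mesa_shader_cache in the user's home directory.
    home = env.get("HOME") or os.path.expanduser("~")
    return os.path.join(home, ".cache", "mesa_shader_cache")
```

Passing a plain dictionary instead of `os.environ` makes the order easy to try out without touching the real environment.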
<h3>VA-API state tracker environment variables</h3> <h3>VA-API state tracker environment variables</h3>
<ul> <ul>

View File

@@ -16,7 +16,7 @@
<center> <center>
<h1>Mesa Frequently Asked Questions</h1> <h1>Mesa Frequently Asked Questions</h1>
Last updated: 9 October 2012 Last updated: 19 September 2018
</center> </center>
<br> <br>
@@ -373,18 +373,16 @@ the archives) is a good way to get information.
<h2>4.3 Why isn't GL_EXT_texture_compression_s3tc implemented in Mesa?</h2> <h2>4.3 Why isn't GL_EXT_texture_compression_s3tc implemented in Mesa?</h2>
<p> <p>
The <a href="http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt">specification for the extension</a> Oh but it is! Prior to 2nd October 2017, the Mesa project did not include s3tc
indicates that there are intellectual property (IP) and/or patent issues support due to intellectual property (IP) and/or patent issues around the s3tc
to be dealt with. algorithm.
</p>
<p>We've been unsuccessful in getting a response from S3 (or whoever owns
the IP nowadays) to indicate whether or not an open source project can
implement the extension (specifically the compression/decompression
algorithms).
</p> </p>
<p> <p>
In the mean time, a 3rd party <a href="https://dri.freedesktop.org/wiki/S3TC"> As of Mesa 17.3.0, Mesa now officially supports s3tc, as the patent has expired.
plug-in library</a> is available. </p>
<p>
In versions prior to this, a 3rd party <a href="https://dri.freedesktop.org/wiki/S3TC">
plug-in library</a> was required.
</p> </p>
</div> </div>

BIN
docs/favicon.ico Normal file

Binary file not shown.

Size: 13 KiB

BIN
docs/favicon.png Normal file

Binary file not shown.

Size: 2.9 KiB

View File

@@ -24,16 +24,19 @@ not started
# OpenGL Core and Compatibility context support # OpenGL Core and Compatibility context support
OpenGL 3.1 and later versions are only supported with the Core profile. Some drivers do not support the Compatibility profile or the
There are no plans to support GL_ARB_compatibility. The last supported OpenGL ARB_compatibility extensions. If an application does not request a
version with all deprecated features is 3.0. Some of the later GL features specific version without the forward-compatibility flag, such drivers
are exposed in the 3.0 context as extensions. will be limited to OpenGL 3.0. If an application requests OpenGL 3.1,
it will get a context that may or may not have the ARB_compatibility
extension enabled. Some of the later GL features are exposed in the 3.0
context as extensions.
Feature Status Feature Status
------------------------------------------------------- ------------------------ ------------------------------------------------------- ------------------------
GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
glBindFragDataLocation, glGetFragDataLocation DONE glBindFragDataLocation, glGetFragDataLocation DONE
GL_NV_conditional_render (Conditional rendering) DONE () GL_NV_conditional_render (Conditional rendering) DONE ()
@@ -60,12 +63,12 @@ GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
glVertexAttribI commands DONE glVertexAttribI commands DONE
Depth format cube textures DONE () Depth format cube textures DONE ()
GLX_ARB_create_context (GLX 1.4 is required) DONE GLX_ARB_create_context (GLX 1.4 is required) DONE
Multisample anti-aliasing DONE (freedreno (*), llvmpipe (*), softpipe (*), swr (*)) Multisample anti-aliasing DONE (freedreno/a5xx, freedreno (*), llvmpipe (*), softpipe (*), swr (*))
(*) freedreno, llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support (*) freedreno (a2xx-a4xx), llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
Forward compatible context support/deprecations DONE () Forward compatible context support/deprecations DONE ()
GL_ARB_draw_instanced (Instanced drawing) DONE () GL_ARB_draw_instanced (Instanced drawing) DONE ()
@@ -78,7 +81,7 @@ GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
GL_EXT_texture_snorm (Signed normalized textures) DONE () GL_EXT_texture_snorm (Signed normalized textures) DONE ()
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
Core/compatibility profiles DONE Core/compatibility profiles DONE
Geometry shaders DONE () Geometry shaders DONE ()
@@ -87,13 +90,13 @@ GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno) GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno)
GL_ARB_provoking_vertex (Provoking vertex) DONE (freedreno) GL_ARB_provoking_vertex (Provoking vertex) DONE (freedreno)
GL_ARB_seamless_cube_map (Seamless cubemaps) DONE (freedreno) GL_ARB_seamless_cube_map (Seamless cubemaps) DONE (freedreno)
GL_ARB_texture_multisample (Multisample textures) DONE () GL_ARB_texture_multisample (Multisample textures) DONE (freedreno/a5xx)
GL_ARB_depth_clamp (Frag depth clamp) DONE (freedreno) GL_ARB_depth_clamp (Frag depth clamp) DONE (freedreno)
GL_ARB_sync (Fence objects) DONE (freedreno) GL_ARB_sync (Fence objects) DONE (freedreno)
GLX_ARB_create_context_profile DONE GLX_ARB_create_context_profile DONE
GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, virgl
GL_ARB_blend_func_extended DONE (freedreno/a3xx, swr) GL_ARB_blend_func_extended DONE (freedreno/a3xx, swr)
GL_ARB_explicit_attrib_location DONE (all drivers that support GLSL) GL_ARB_explicit_attrib_location DONE (all drivers that support GLSL)
@@ -107,18 +110,18 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GL_ARB_vertex_type_2_10_10_10_rev DONE (freedreno, swr) GL_ARB_vertex_type_2_10_10_10_rev DONE (freedreno, swr)
GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
GL_ARB_draw_buffers_blend DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr) GL_ARB_draw_buffers_blend DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr) GL_ARB_draw_indirect DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
GL_ARB_gpu_shader5 DONE (i965/gen7+) GL_ARB_gpu_shader5 DONE (i965/gen7+)
- 'precise' qualifier DONE - 'precise' qualifier DONE
- Dynamically uniform sampler array indices DONE (softpipe) - Dynamically uniform sampler array indices DONE (softpipe)
- Dynamically uniform UBO array indices DONE () - Dynamically uniform UBO array indices DONE (freedreno)
- Implicit signed -> unsigned conversions DONE - Implicit signed -> unsigned conversions DONE
- Fused multiply-add DONE () - Fused multiply-add DONE ()
- Packing/bitfield/conversion functions DONE (softpipe) - Packing/bitfield/conversion functions DONE (freedreno, softpipe)
- Enhanced textureGather DONE (softpipe) - Enhanced textureGather DONE (freedreno, softpipe)
- Geometry shader instancing DONE (llvmpipe, softpipe) - Geometry shader instancing DONE (llvmpipe, softpipe)
- Geometry shader multiple streams DONE () - Geometry shader multiple streams DONE ()
- Enhanced per-sample shading DONE () - Enhanced per-sample shading DONE ()
@@ -136,7 +139,7 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_transform_feedback3 DONE (i965/gen7+, llvmpipe, softpipe, swr) GL_ARB_transform_feedback3 DONE (i965/gen7+, llvmpipe, softpipe, swr)
GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
GL_ARB_ES2_compatibility DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr) GL_ARB_ES2_compatibility DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_get_program_binary DONE (0 or 1 binary formats) GL_ARB_get_program_binary DONE (0 or 1 binary formats)
@@ -146,7 +149,7 @@ GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe) GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe)
GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
GL_ARB_texture_compression_bptc DONE (freedreno, i965) GL_ARB_texture_compression_bptc DONE (freedreno, i965)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers) GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
@@ -162,7 +165,7 @@ GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_map_buffer_alignment DONE (all drivers) GL_ARB_map_buffer_alignment DONE (all drivers)
GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi, virgl
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30) GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30) GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
@@ -188,12 +191,12 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
GL_ARB_vertex_attrib_binding DONE (all drivers) GL_ARB_vertex_attrib_binding DONE (all drivers)
GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers) GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (freedreno, i965, nv50, r600, llvmpipe, swr) GL_ARB_buffer_storage DONE (freedreno, i965, nv50, llvmpipe, swr)
GL_ARB_clear_texture DONE (i965, nv50, r600, llvmpipe, softpipe, swr) GL_ARB_clear_texture DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_enhanced_layouts DONE (i965, nv50, r600, llvmpipe, softpipe) GL_ARB_enhanced_layouts DONE (i965, nv50, llvmpipe, softpipe, virgl)
- compile-time constant expressions DONE - compile-time constant expressions DONE
- explicit byte offsets for blocks DONE - explicit byte offsets for blocks DONE
- forced alignment within blocks DONE - forced alignment within blocks DONE
@@ -201,22 +204,22 @@ GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
- specified transform/feedback layout DONE - specified transform/feedback layout DONE
- input/output block locations DONE - input/output block locations DONE
GL_ARB_multi_bind DONE (all drivers) GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object DONE (i965/hsw+) GL_ARB_query_buffer_object DONE (i965/hsw+, virgl)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, r600, llvmpipe, softpipe, swr) GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_texture_stencil8 DONE (freedreno, i965/hsw+, nv50, r600, llvmpipe, softpipe, swr) GL_ARB_texture_stencil8 DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, r600, llvmpipe, softpipe, swr) GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
GL_ARB_ES3_1_compatibility DONE (i965/hsw+, r600) GL_ARB_ES3_1_compatibility DONE (i965/hsw+, r600, virgl)
GL_ARB_clip_control DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr) GL_ARB_clip_control DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr) GL_ARB_conditional_render_inverted DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
GL_ARB_cull_distance DONE (i965, nv50, r600, llvmpipe, softpipe, swr) GL_ARB_cull_distance DONE (i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
GL_ARB_derivative_control DONE (i965, nv50, r600) GL_ARB_derivative_control DONE (i965, nv50, r600, virgl)
GL_ARB_direct_state_access DONE (all drivers) GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers) GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, r600) GL_ARB_shader_texture_image_samples DONE (i965, nv50, r600, virgl)
GL_ARB_texture_barrier DONE (freedreno, i965, nv50, r600) GL_ARB_texture_barrier DONE (freedreno, i965, nv50, r600, virgl)
GL_KHR_context_flush_control DONE (all - but needs GLX/EGL extension to be useful) GL_KHR_context_flush_control DONE (all - but needs GLX/EGL extension to be useful)
GL_KHR_robustness DONE (i965) GL_KHR_robustness DONE (i965)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL) GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
@@ -226,19 +229,19 @@ GL 4.6, GLSL 4.60
GL_ARB_gl_spirv in progress (Nicolai Hähnle, Ian Romanick) GL_ARB_gl_spirv in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_indirect_parameters DONE (i965/gen7+, nvc0, radeonsi) GL_ARB_indirect_parameters DONE (i965/gen7+, nvc0, radeonsi)
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr) GL_ARB_pipeline_statistics_query DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_polygon_offset_clamp DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr) GL_ARB_polygon_offset_clamp DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr, virgl)
GL_ARB_shader_atomic_counter_ops DONE (freedreno/a5xx, i965/gen7+, nvc0, r600, radeonsi, softpipe) GL_ARB_shader_atomic_counter_ops DONE (freedreno/a5xx, i965/gen7+, nvc0, r600, radeonsi, softpipe, virgl)
GL_ARB_shader_draw_parameters DONE (i965, nvc0, radeonsi) GL_ARB_shader_draw_parameters DONE (i965, nvc0, radeonsi)
GL_ARB_shader_group_vote DONE (i965, nvc0, radeonsi) GL_ARB_shader_group_vote DONE (i965, nvc0, radeonsi)
GL_ARB_spirv_extensions in progress (Nicolai Hähnle, Ian Romanick) GL_ARB_spirv_extensions in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_texture_filter_anisotropic DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*)) GL_ARB_texture_filter_anisotropic DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, radeonsi, llvmpipe, softpipe) GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, nvc0, radeonsi, llvmpipe, softpipe, virgl)
GL_KHR_no_error DONE (all drivers) GL_KHR_no_error DONE (all drivers)
(*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting (*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting
These are the extensions cherry-picked to make GLES 3.1 These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi, virgl
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30) GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_compute_shader DONE (freedreno/a5xx, i965/gen7+, softpipe) GL_ARB_compute_shader DONE (freedreno/a5xx, i965/gen7+, softpipe)
@@ -253,11 +256,11 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
GL_ARB_shading_language_packing DONE (all drivers) GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers) GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (freedreno, nv50, llvmpipe, softpipe, swr) GL_ARB_stencil_texturing DONE (freedreno, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965/gen7+, nv50, llvmpipe, softpipe) GL_ARB_texture_multisample (Multisample textures) DONE (freedreno/a5xx, i965/gen7+, nv50, llvmpipe, softpipe)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample) GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_vertex_attrib_binding DONE (all drivers) GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (freedreno, i965/gen7+,) GS5 Enhanced textureGather DONE (freedreno, i965/gen7+)
GS5 Packing/bitfield/conversion functions DONE (i965/gen6+) GS5 Packing/bitfield/conversion functions DONE (freedreno/a5xx, i965/gen6+)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL) GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
Additional functionality not covered above: Additional functionality not covered above:
@@ -266,28 +269,28 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
glGetBooleani_v - restrict to GLES enums glGetBooleani_v - restrict to GLES enums
gl_HelperInvocation support DONE (i965, r600) gl_HelperInvocation support DONE (i965, r600)
GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+ GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+, radeonsi, virgl
GL_EXT_color_buffer_float DONE (all drivers) GL_EXT_color_buffer_float DONE (all drivers)
GL_KHR_blend_equation_advanced DONE (i965, nvc0) GL_KHR_blend_equation_advanced DONE (i965, nvc0)
GL_KHR_debug DONE (all drivers) GL_KHR_debug DONE (all drivers)
GL_KHR_robustness DONE (i965, nvc0, radeonsi) GL_KHR_robustness DONE (i965, nvc0)
GL_KHR_texture_compression_astc_ldr DONE (freedreno, i965/gen9+) GL_KHR_texture_compression_astc_ldr DONE (freedreno, i965/gen9+)
GL_OES_copy_image DONE (all drivers) GL_OES_copy_image DONE (all drivers)
GL_OES_draw_buffers_indexed DONE (all drivers that support GL_ARB_draw_buffers_blend) GL_OES_draw_buffers_indexed DONE (all drivers that support GL_ARB_draw_buffers_blend)
GL_OES_draw_elements_base_vertex DONE (all drivers) GL_OES_draw_elements_base_vertex DONE (all drivers)
GL_OES_geometry_shader DONE (i965/hsw+, nvc0, radeonsi) GL_OES_geometry_shader DONE (i965/hsw+, nvc0)
GL_OES_gpu_shader5 DONE (all drivers that support GL_ARB_gpu_shader5) GL_OES_gpu_shader5 DONE (all drivers that support GL_ARB_gpu_shader5)
GL_OES_primitive_bounding_box DONE (i965/gen7+, nvc0, radeonsi) GL_OES_primitive_bounding_box DONE (i965/gen7+, nvc0)
GL_OES_sample_shading DONE (i965, nvc0, r600, radeonsi) GL_OES_sample_shading DONE (i965, nvc0, r600)
GL_OES_sample_variables DONE (i965, nvc0, r600, radeonsi) GL_OES_sample_variables DONE (i965, nvc0, r600)
GL_OES_shader_image_atomic DONE (all drivers that support GL_ARB_shader_image_load_store) GL_OES_shader_image_atomic DONE (all drivers that support GL_ARB_shader_image_load_store)
GL_OES_shader_io_blocks DONE (All drivers that support GLES 3.1) GL_OES_shader_io_blocks DONE (All drivers that support GLES 3.1)
GL_OES_shader_multisample_interpolation DONE (i965, nvc0, r600, radeonsi) GL_OES_shader_multisample_interpolation DONE (i965, nvc0, r600)
GL_OES_tessellation_shader DONE (all drivers that support GL_ARB_tessellation_shader) GL_OES_tessellation_shader DONE (all drivers that support GL_ARB_tessellation_shader)
GL_OES_texture_border_clamp DONE (all drivers) GL_OES_texture_border_clamp DONE (all drivers)
GL_OES_texture_buffer DONE (i965, nvc0, radeonsi) GL_OES_texture_buffer DONE (freedreno, i965, nvc0)
GL_OES_texture_cube_map_array DONE (i965/hsw+, nvc0, radeonsi) GL_OES_texture_cube_map_array DONE (i965/hsw+, nvc0)
GL_OES_texture_stencil8 DONE (all drivers that support GL_ARB_texture_stencil8) GL_OES_texture_stencil8 DONE (all drivers that support GL_ARB_texture_stencil8)
GL_OES_texture_storage_multisample_2d_array DONE (all drivers that support GL_ARB_texture_multisample) GL_OES_texture_storage_multisample_2d_array DONE (all drivers that support GL_ARB_texture_multisample)
@@ -296,17 +299,17 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_ARB_bindless_texture DONE (nvc0, radeonsi) GL_ARB_bindless_texture DONE (nvc0, radeonsi)
GL_ARB_cl_event not started GL_ARB_cl_event not started
GL_ARB_compute_variable_group_size DONE (nvc0, radeonsi) GL_ARB_compute_variable_group_size DONE (nvc0, radeonsi)
GL_ARB_ES3_2_compatibility DONE (i965/gen8+) GL_ARB_ES3_2_compatibility DONE (i965/gen8+, radeonsi, virgl)
GL_ARB_fragment_shader_interlock not started GL_ARB_fragment_shader_interlock DONE (i965)
GL_ARB_gpu_shader_int64 DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe) GL_ARB_gpu_shader_int64 DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
GL_ARB_parallel_shader_compile not started, but Chia-I Wu did some related work in 2014 GL_ARB_parallel_shader_compile not started, but Chia-I Wu did some related work in 2014
GL_ARB_post_depth_coverage DONE (i965) GL_ARB_post_depth_coverage DONE (i965, nvc0)
GL_ARB_robustness_isolation not started GL_ARB_robustness_isolation not started
GL_ARB_sample_locations not started GL_ARB_sample_locations DONE (nvc0)
GL_ARB_seamless_cubemap_per_texture DONE (i965, nvc0, radeonsi, r600, softpipe, swr) GL_ARB_seamless_cubemap_per_texture DONE (freedreno, i965, nvc0, radeonsi, r600, softpipe, swr, virgl)
GL_ARB_shader_ballot DONE (i965/gen8+, nvc0, radeonsi) GL_ARB_shader_ballot DONE (i965/gen8+, nvc0, radeonsi)
GL_ARB_shader_clock DONE (i965/gen7+, nv50, nvc0, r600, radeonsi) GL_ARB_shader_clock DONE (i965/gen7+, nv50, nvc0, r600, radeonsi, virgl)
GL_ARB_shader_stencil_export DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr) GL_ARB_shader_stencil_export DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr, virgl)
GL_ARB_shader_viewport_layer_array DONE (i965/gen6+, nvc0, radeonsi) GL_ARB_shader_viewport_layer_array DONE (i965/gen6+, nvc0, radeonsi)
GL_ARB_sparse_buffer DONE (radeonsi/CIK+) GL_ARB_sparse_buffer DONE (radeonsi/CIK+)
GL_ARB_sparse_texture not started GL_ARB_sparse_texture not started
@@ -316,15 +319,18 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_EXT_memory_object DONE (radeonsi) GL_EXT_memory_object DONE (radeonsi)
GL_EXT_memory_object_fd DONE (radeonsi) GL_EXT_memory_object_fd DONE (radeonsi)
GL_EXT_memory_object_win32 not started GL_EXT_memory_object_win32 not started
GL_EXT_semaphore not started GL_EXT_render_snorm DONE (i965, radeonsi)
GL_EXT_semaphore_fd not started GL_EXT_semaphore DONE (radeonsi)
GL_EXT_semaphore_fd DONE (radeonsi)
GL_EXT_semaphore_win32 not started GL_EXT_semaphore_win32 not started
GL_EXT_texture_norm16 DONE (i965, r600, radeonsi, nvc0)
GL_KHR_blend_equation_advanced_coherent DONE (i965/gen9+) GL_KHR_blend_equation_advanced_coherent DONE (i965/gen9+)
GL_KHR_texture_compression_astc_hdr DONE (i965/bxt) GL_KHR_texture_compression_astc_hdr DONE (i965/bxt)
GL_KHR_texture_compression_astc_sliced_3d DONE (i965/gen9+) GL_KHR_texture_compression_astc_sliced_3d DONE (i965/gen9+, radeonsi)
GL_OES_depth_texture_cube_map DONE (all drivers that support GLSL 1.30+) GL_OES_depth_texture_cube_map DONE (all drivers that support GLSL 1.30+)
GL_OES_EGL_image DONE (all drivers) GL_OES_EGL_image DONE (all drivers)
GL_OES_EGL_image_external_essl3 not started GL_OES_EGL_image_external DONE (all drivers)
GL_OES_EGL_image_external_essl3 DONE (all drivers)
GL_OES_required_internalformat DONE (all drivers) GL_OES_required_internalformat DONE (all drivers)
GL_OES_surfaceless_context DONE (all drivers) GL_OES_surfaceless_context DONE (all drivers)
GL_OES_texture_compression_astc DONE (core only) GL_OES_texture_compression_astc DONE (core only)
@@ -332,12 +338,69 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_OES_texture_float_linear DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe) GL_OES_texture_float_linear DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe) GL_OES_texture_half_float DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float_linear DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe) GL_OES_texture_half_float_linear DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_view not started - based on GL_ARB_texture_view GL_OES_texture_view DONE (freedreno, i965/gen8+, r600, radeonsi, nv50, nvc0, softpipe, llvmpipe, swr)
GL_OES_viewport_array DONE (i965, nvc0, radeonsi) GL_OES_viewport_array DONE (i965, nvc0, radeonsi)
GLX_ARB_context_flush_control not started GLX_ARB_context_flush_control not started
GLX_ARB_robustness_application_isolation not started GLX_ARB_robustness_application_isolation not started
GLX_ARB_robustness_share_group_isolation not started GLX_ARB_robustness_share_group_isolation not started
GL_EXT_direct_state_access subfeatures (in the spec order):
GL 1.1: Client commands not started
GL 1.0-1.3: Matrix and transpose matrix commands not started
GL 1.1-1.2: Texture commands not started
GL 1.2: 3D texture commands not started
GL 1.2.1: Multitexture commands not started
GL 1.2.1-3.0: Indexed texture commands not started
GL 1.2.1-3.0: Indexed generic queries not started
GL 1.2.1: EnableIndexed.. Get*Indexed not started
GL_ARB_vertex_program not started
GL 1.3: Compressed texture and multitexture commands not started
GL 1.5: Buffer commands not started
GL 2.0-2.1: Uniform and uniform matrix commands not started
GL_EXT_texture_buffer_object not started
GL_EXT_texture_integer not started
GL_EXT_gpu_shader4 not started
GL_EXT_gpu_program_parameters not started
GL_NV_gpu_program4 n/a
GL_NV_framebuffer_multisample_coverage n/a
GL 3.0: Renderbuffer/framebuffer commands, Gen*Mipmap not started
GL 3.0: CopyBuffer command not started
GL_EXT_geometry_shader4 commands (expose in GL 3.2) not started
GL_NV_explicit_multisample n/a
GL 3.0: Vertex array/attrib/query/map commands not started
Matrix GL tokens not started
GL_EXT_direct_state_access additions from other extensions (complete list):
GL_AMD_framebuffer_sample_positions n/a
GL_AMD_gpu_shader_int64 not started
GL_ARB_bindless_texture not started
GL_ARB_buffer_storage not started
GL_ARB_clear_buffer_object not started
GL_ARB_framebuffer_no_attachments not started
GL_ARB_gpu_shader_fp64 not started
GL_ARB_instanced_arrays not started
GL_ARB_internalformat_query2 not started
GL_ARB_sparse_texture n/a
GL_ARB_sparse_buffer not started
GL_ARB_texture_buffer_range not started
GL_ARB_texture_storage not started
GL_ARB_texture_storage_multisample not started
GL_ARB_vertex_attrib_64bit not started
GL_ARB_vertex_attrib_binding not started
GL_EXT_buffer_storage not started
GL_EXT_external_buffer not started
GL_EXT_separate_shader_objects n/a
GL_EXT_sparse_texture n/a
GL_EXT_texture_storage n/a
GL_EXT_vertex_attrib_64bit not started
GL_EXT_EGL_image_storage n/a
GL_NV_bindless_texture n/a
GL_NV_gpu_shader5 n/a
GL_NV_texture_multisample n/a
GL_NV_vertex_buffer_unified_memory n/a
GL_NVX_linked_gpu_multicast n/a
GLX_NV_copy_buffer n/a
The following extensions are not part of any OpenGL or OpenGL ES version, and The following extensions are not part of any OpenGL or OpenGL ES version, and
we DO NOT WANT implementations of these extensions for Mesa. we DO NOT WANT implementations of these extensions for Mesa.
@@ -349,39 +412,55 @@ we DO NOT WANT implementations of these extensions for Mesa.
Vulkan 1.0 -- all DONE: anv, radv Vulkan 1.0 -- all DONE: anv, radv
Khronos extensions that are not part of any Vulkan version: Vulkan 1.1 -- all DONE: anv, radv
VK_KHR_16bit_storage in progress (Alejandro) VK_KHR_16bit_storage in progress (Alejandro)
VK_KHR_android_surface not started VK_KHR_bind_memory2 DONE (anv, radv)
VK_KHR_dedicated_allocation DONE (anv, radv) VK_KHR_dedicated_allocation DONE (anv, radv)
VK_KHR_descriptor_update_template DONE (anv, radv) VK_KHR_descriptor_update_template DONE (anv, radv)
VK_KHR_display not started VK_KHR_device_group not started
VK_KHR_display_swapchain not started VK_KHR_device_group_creation not started
VK_KHR_external_fence not started VK_KHR_external_fence DONE (anv, radv)
VK_KHR_external_fence_capabilities not started VK_KHR_external_fence_capabilities DONE (anv, radv)
VK_KHR_external_fence_fd not started
VK_KHR_external_fence_win32 not started
VK_KHR_external_memory DONE (anv, radv) VK_KHR_external_memory DONE (anv, radv)
VK_KHR_external_memory_capabilities DONE (anv, radv) VK_KHR_external_memory_capabilities DONE (anv, radv)
VK_KHR_external_memory_fd DONE (anv, radv) VK_KHR_external_semaphore DONE (anv, radv)
VK_KHR_external_memory_win32 not started VK_KHR_external_semaphore_capabilities DONE (anv, radv)
VK_KHR_external_semaphore DONE (radv)
VK_KHR_external_semaphore_capabilities DONE (radv)
VK_KHR_external_semaphore_fd DONE (radv)
VK_KHR_external_semaphore_win32 not started
VK_KHR_get_memory_requirements2 DONE (anv, radv) VK_KHR_get_memory_requirements2 DONE (anv, radv)
VK_KHR_get_physical_device_properties2 DONE (anv, radv) VK_KHR_get_physical_device_properties2 DONE (anv, radv)
VK_KHR_get_surface_capabilities2 DONE (anv)
VK_KHR_incremental_present DONE (anv, radv)
VK_KHR_maintenance1 DONE (anv, radv) VK_KHR_maintenance1 DONE (anv, radv)
VK_KHR_maintenance2 DONE (anv, radv)
VK_KHR_maintenance3 DONE (anv, radv)
VK_KHR_multiview DONE (anv, radv)
VK_KHR_relaxed_block_layout DONE (anv, radv)
VK_KHR_sampler_ycbcr_conversion DONE (anv)
VK_KHR_shader_draw_parameters DONE (anv, radv)
VK_KHR_storage_buffer_storage_class DONE (anv, radv)
VK_KHR_variable_pointers DONE (anv, radv)
Khronos extensions that are not part of any Vulkan version:
VK_KHR_8bit_storage DONE (anv)
VK_KHR_android_surface not started
VK_KHR_create_renderpass2 DONE (anv, radv)
VK_KHR_display DONE (anv, radv)
VK_KHR_display_swapchain DONE (anv, radv)
VK_KHR_draw_indirect_count DONE (radv)
VK_KHR_external_fence_fd DONE (anv, radv)
VK_KHR_external_fence_win32 not started
VK_KHR_external_memory_fd DONE (anv, radv)
VK_KHR_external_memory_win32 not started
VK_KHR_external_semaphore_fd DONE (anv, radv)
VK_KHR_external_semaphore_win32 not started
VK_KHR_get_display_properties2 DONE (anv, radv)
VK_KHR_get_surface_capabilities2 DONE (anv, radv)
VK_KHR_image_format_list DONE (anv, radv)
VK_KHR_incremental_present DONE (anv, radv)
VK_KHR_mir_surface not started VK_KHR_mir_surface not started
VK_KHR_push_descriptor DONE (anv, radv) VK_KHR_push_descriptor DONE (anv, radv)
VK_KHR_sampler_mirror_clamp_to_edge DONE (anv, radv) VK_KHR_sampler_mirror_clamp_to_edge DONE (anv, radv)
VK_KHR_shader_draw_parameters DONE (anv, radv)
VK_KHR_shared_presentable_image not started VK_KHR_shared_presentable_image not started
VK_KHR_storage_buffer_storage_class DONE (anv, radv)
VK_KHR_surface DONE (anv, radv) VK_KHR_surface DONE (anv, radv)
VK_KHR_swapchain DONE (anv, radv) VK_KHR_swapchain DONE (anv, radv)
VK_KHR_variable_pointers DONE (anv, radv)
VK_KHR_wayland_surface DONE (anv, radv) VK_KHR_wayland_surface DONE (anv, radv)
VK_KHR_win32_keyed_mutex not started VK_KHR_win32_keyed_mutex not started
VK_KHR_win32_surface not started VK_KHR_win32_surface not started


@@ -47,7 +47,7 @@ You can find some further To-do lists here:
<b>Common To-Do lists:</b> <b>Common To-Do lists:</b>
</p> </p>
<ul> <ul>
<li><a href="https://cgit.freedesktop.org/mesa/mesa/tree/docs/features.txt"> <li><a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/docs/features.txt">
<b>features.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li> <b>features.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>
</ul> </ul>


@@ -15,6 +15,238 @@
<div class="content"> <div class="content">
<h1>News</h1> <h1>News</h1>
<h2>February 18, 2019</h2>
<p>
<a href="relnotes/18.3.4.html">Mesa 18.3.4</a> is released.
This is a bug-fix release.
</p>
<h2>January 31, 2019</h2>
<p>
<a href="relnotes/18.3.3.html">Mesa 18.3.3</a> is released.
This is a bug-fix release.
</p>
<h2>January 17, 2019</h2>
<p>
<a href="relnotes/18.3.2.html">Mesa 18.3.2</a> is released.
This is a bug-fix release.
</p>
<h2>December 27, 2018</h2>
<p>
<a href="relnotes/18.2.8.html">Mesa 18.2.8</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 18.2.8 will be the final release in the
18.2 series. Users of 18.2 are encouraged to migrate to the 18.3
series in order to obtain future fixes.
</p>
<h2>December 13, 2018</h2>
<p>
<a href="relnotes/18.2.7.html">Mesa 18.2.7</a> is released.
This is a bug-fix release.
</p>
<h2>December 11, 2018</h2>
<p>
<a href="relnotes/18.3.1.html">Mesa 18.3.1</a> is released.
This is a bug-fix release.
</p>
<h2>December 7, 2018</h2>
<p>
<a href="relnotes/18.3.0.html">Mesa 18.3.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>November 28, 2018</h2>
<p>
<a href="relnotes/18.2.6.html">Mesa 18.2.6</a> is released.
This is a bug-fix release.
</p>
<h2>November 15, 2018</h2>
<p>
<a href="relnotes/18.2.5.html">Mesa 18.2.5</a> is released.
This is a bug-fix release.
</p>
<h2>October 31, 2018</h2>
<p>
<a href="relnotes/18.2.4.html">Mesa 18.2.4</a> is released.
This is a bug-fix release.
</p>
<h2>October 19, 2018</h2>
<p>
<a href="relnotes/18.2.3.html">Mesa 18.2.3</a> is released.
This is a bug-fix release.
</p>
<h2>October 5, 2018</h2>
<p>
<a href="relnotes/18.2.2.html">Mesa 18.2.2</a> is released.
This is a bug-fix release.
</p>
<h2>September 24, 2018</h2>
<p>
<a href="relnotes/18.1.9.html">Mesa 18.1.9</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 18.1.9 will be the final release in the
18.1 series. Users of 18.1 are encouraged to migrate to the 18.2
series in order to obtain future fixes.
</p>
<h2>September 21, 2018</h2>
<p>
<a href="relnotes/18.2.1.html">Mesa 18.2.1</a> is released.
This is a bug-fix release.
</p>
<h2>September 7, 2018</h2>
<p>
<a href="relnotes/18.1.8.html">Mesa 18.1.8</a> and
<a href="relnotes/18.2.0.html">Mesa 18.2.0</a> are released.
These are, respectively, a bug-fix release from the 18.1 branch and a
new development release. See the release notes for more information
about the releases.
</p>
<h2>August 24, 2018</h2>
<p>
<a href="relnotes/18.1.7.html">Mesa 18.1.7</a> is released.
This is a bug-fix release.
</p>
<h2>August 13, 2018</h2>
<p>
<a href="relnotes/18.1.6.html">Mesa 18.1.6</a> is released.
This is a bug-fix release.
</p>
<h2>July 27, 2018</h2>
<p>
<a href="relnotes/18.1.5.html">Mesa 18.1.5</a> is released.
This is a bug-fix release.
</p>
<h2>July 13, 2018</h2>
<p>
<a href="relnotes/18.1.4.html">Mesa 18.1.4</a> is released.
This is a bug-fix release.
</p>
<h2>June 29, 2018</h2>
<p>
<a href="relnotes/18.1.3.html">Mesa 18.1.3</a> is released.
This is a bug-fix release.
</p>
<h2>June 15, 2018</h2>
<p>
<a href="relnotes/18.1.2.html">Mesa 18.1.2</a> is released.
This is a bug-fix release.
</p>
<h2>June 3, 2018</h2>
<p>
<a href="relnotes/18.0.5.html">Mesa 18.0.5</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 18.0.5 will be the final release in the
18.0 series. Users of 18.0 are encouraged to migrate to the 18.1
series in order to obtain future fixes.
</p>
<h2>June 1, 2018</h2>
<p>
<a href="relnotes/18.1.1.html">Mesa 18.1.1</a> is released.
This is a bug-fix release.
</p>
<h2>May 18, 2018</h2>
<p>
<a href="relnotes/18.1.0.html">Mesa 18.1.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>May 17, 2018</h2>
<p>
<a href="relnotes/18.0.4.html">Mesa 18.0.4</a> is released.
This is a bug-fix release.
</p>
<h2>May 7, 2018</h2>
<p>
<a href="relnotes/18.0.3.html">Mesa 18.0.3</a> is released.
This is a bug-fix release.
</p>
<h2>April 28, 2018</h2>
<p>
<a href="relnotes/18.0.2.html">Mesa 18.0.2</a> is released.
This is a bug-fix release.
</p>
<h2>April 18, 2018</h2>
<p>
<a href="relnotes/18.0.1.html">Mesa 18.0.1</a> is released.
This is a bug-fix release.
</p>
<h2>April 18, 2018</h2>
<p>
<a href="relnotes/17.3.9.html">Mesa 17.3.9</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 17.3.9 will be the final release in the
17.3 series. Users of 17.3 are encouraged to migrate to the 18.0
series in order to obtain future fixes.
</p>
<h2>April 03, 2018</h2>
<p>
<a href="relnotes/17.3.8.html">Mesa 17.3.8</a> is released.
This is a bug-fix release.
</p>
<h2>March 27, 2018</h2>
<p>
<a href="relnotes/18.0.0.html">Mesa 18.0.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>March 21, 2018</h2>
<p>
<a href="relnotes/17.3.7.html">Mesa 17.3.7</a> is released.
This is a bug-fix release.
</p>
<h2>February 26, 2018</h2>
<p>
<a href="relnotes/17.3.6.html">Mesa 17.3.6</a> is released.
This is a bug-fix release.
</p>
<h2>February 19, 2018</h2>
<p>
<a href="relnotes/17.3.5.html">Mesa 17.3.5</a> is released.
This is a bug-fix release.
</p>
<h2>February 15, 2018</h2>
<p>
<a href="relnotes/17.3.4.html">Mesa 17.3.4</a> is released.
This is a bug-fix release.
</p>
<h2>January 18, 2018</h2> <h2>January 18, 2018</h2>
<p> <p>


@@ -22,6 +22,7 @@
<li><a href="#prereq-general">General prerequisites</a> <li><a href="#prereq-general">General prerequisites</a>
<li><a href="#prereq-dri">For DRI and hardware acceleration</a> <li><a href="#prereq-dri">For DRI and hardware acceleration</a>
</ul> </ul>
<li><a href="#meson">Building with meson</a>
<li><a href="#autoconf">Building with autoconf (Linux/Unix/X11)</a> <li><a href="#autoconf">Building with autoconf (Linux/Unix/X11)</a>
<li><a href="#scons">Building with SCons (Windows/Linux)</a> <li><a href="#scons">Building with SCons (Windows/Linux)</a>
<li><a href="#android">Building with AOSP (Android)</a> <li><a href="#android">Building with AOSP (Android)</a>
@@ -39,9 +40,10 @@ Build system.
</p> </p>
<ul> <ul>
<li>Autoconf is required when building on *nix platforms. <li><a href="https://mesonbuild.com">meson</a> is recommended when building on *nix platforms.
<li>Autoconf is another option when building on *nix platforms.
<li><a href="http://www.scons.org/">SCons</a> is required for building on <li><a href="http://www.scons.org/">SCons</a> is required for building on
Windows and optional for Linux (it's an alternative to autoconf/automake.) Windows and optional for Linux (it's an alternative to autoconf/automake or meson.)
</li> </li>
<li>Android Build system when building as native Android component. Autoconf <li>Android Build system when building as native Android component. Autoconf
is used when building ARC. is used when building ARC.
@@ -57,7 +59,7 @@ willing to maintain support for other compiler get in touch.
<ul> <ul>
<li>GCC 4.2.0 or later (some parts of Mesa may require later versions) <li>GCC 4.2.0 or later (some parts of Mesa may require later versions)
<li>clang - exact minimum requirement is currently unknown. <li>clang - exact minimum requirement is currently unknown.
<li>Microsoft Visual Studio 2013 Update 4 or later is required, for building on Windows. <li>Microsoft Visual Studio 2015 or later is required, for building on Windows.
</ul> </ul>
@@ -72,10 +74,12 @@ you think you've spotted a bug let developers know by filing a
<ul> <ul>
<li><a href="https://www.python.org/">Python</a> - Python is required. <li><a href="https://www.python.org/">Python</a> - Python is required.
Version 2.6.4 or later should work. When building with scons, Python 2.7 is required.
When building with meson, Python 3.5 or newer is required.
When building with autotools, Python 2.7, or 3.5 or later, is required.
</li> </li>
<li><a href="http://www.makotemplates.org/">Python Mako module</a> - <li><a href="http://www.makotemplates.org/">Python Mako module</a> -
Python Mako module is required. Version 0.3.4 or later should work. Python Mako module is required. Version 0.8.0 or later should work.
</li> </li>
<li>lex / yacc - for building the Mesa IR and GLSL compiler. <li>lex / yacc - for building the Mesa IR and GLSL compiler.
<div> <div>
@@ -111,11 +115,31 @@ the packaging tool used by your distro.
... # others ... # others
</pre> </pre>
<h1 id="meson">2. Building with meson</h1>
<h1 id="autoconf">2. Building with autoconf (Linux/Unix/X11)</h1>
<p> <p>
The primary method to build Mesa on Unix systems is with autoconf. Meson is the latest build system in Mesa. It is currently able to build for
*nix systems like Linux and BSD, and will be able to build for Windows as well.
</p>
<p>
The general approach is:
</p>
<pre>
meson builddir/
ninja -C builddir/
sudo ninja -C builddir/ install
</pre>
<p>
Please read the <a href="meson.html">detailed meson instructions</a>
for more information.
</p>
<h1 id="autoconf">3. Building with autoconf (Linux/Unix/X11)</h1>
<p>
Although meson is recommended, another supported way to build on *nix systems
is with autoconf.
</p> </p>
<p> <p>
@@ -133,7 +157,7 @@ for more details.
<h1 id="scons">3. Building with SCons (Windows/Linux)</h1> <h1 id="scons">4. Building with SCons (Windows/Linux)</h1>
<p> <p>
To build Mesa with SCons on Linux or Windows do To build Mesa with SCons on Linux or Windows do
@@ -169,7 +193,7 @@ Additional information is available in <a href="README.WIN32">README.WIN32</a>.
<h1 id="android">4. Building with AOSP (Android)</h1> <h1 id="android">5. Building with AOSP (Android)</h1>
<p> <p>
Currently one can build Mesa for Android as part of the AOSP project, yet Currently one can build Mesa for Android as part of the AOSP project, yet
@@ -188,7 +212,7 @@ Android-x86 and/or other resources.
</p> </p>
<h1 id="libs">5. Library Information</h1> <h1 id="libs">6. Library Information</h1>
<p> <p>
When compilation has finished, look in the top-level <code>lib/</code> When compilation has finished, look in the top-level <code>lib/</code>
@@ -226,7 +250,7 @@ versions of libGL and device drivers.
</p> </p>
<h1 id="pkg-config">6. Building OpenGL programs with pkg-config</h1> <h1 id="pkg-config">7. Building OpenGL programs with pkg-config</h1>
<p> <p>
Running <code>make install</code> will install package configuration files Running <code>make install</code> will install package configuration files


@@ -29,6 +29,9 @@ pre {
/*font-family: monospace;*/ /*font-family: monospace;*/
font-size: 10pt; font-size: 10pt;
/*color: black;*/ /*color: black;*/
background-color: #eee;
margin-left: 2em;
padding: .5em;
} }
iframe { iframe {


@@ -16,18 +16,29 @@
<h1>Compilation and Installation using Meson</h1> <h1>Compilation and Installation using Meson</h1>
<ul>
<li><a href="#basic">Basic Usage</a></li>
<li><a href="#cross-compilation">Cross-compilation and 32-bit builds</a></li>
</ul>
<h2 id="basic">1. Basic Usage</h2> <h2 id="basic">1. Basic Usage</h2>
<p><strong>The Meson build system for Mesa is still under active development, <p><strong>The Meson build system is generally considered stable and ready
and should not be used in production environments.</strong></p> for production</strong></p>
<p>The meson build is currently only tested on linux, and is known to not work <p>The meson build is tested on Linux, macOS, Cygwin and Haiku, FreeBSD,
on macOS, Windows, and haiku. This will be fixed.</p> DragonflyBSD, NetBSD, and should work on OpenBSD.</p>
<p><strong>Mesa requires Meson >= 0.45.0 to build.</strong>
Some older versions of meson do not check that they are too old and will error
out in odd ways.
</p>
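Because older Meson releases may not report that they are too old, a wrapper script could check the version itself before configuring. The following is a minimal sketch, not part of Mesa: the helper names `parse_version` and `meson_is_new_enough` are hypothetical, and the version string is assumed to come from the output of `meson --version`.

```python
import re

# Minimum Meson version stated in the text above.
MIN_MESON = (0, 45, 0)


def parse_version(text):
    """Return the leading numeric part of a version string as a tuple of ints.

    Short versions such as "0.45" are padded to three components so that
    tuple comparison against MIN_MESON behaves as expected.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("not a version string: %r" % (text,))
    parts = tuple(int(p) for p in match.group(1).split("."))
    return parts + (0,) * max(0, 3 - len(parts))


def meson_is_new_enough(version_text, minimum=MIN_MESON):
    """Return True if version_text is at least the required minimum."""
    return parse_version(version_text) >= minimum
```

A caller would pass the string printed by `meson --version` and stop early with a clear message when the function returns `False`.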
<p> <p>
The meson program is used to configure the source directory and generates The meson program is used to configure the source directory and generates
either a ninja build file or Visual Studio® build files. The latter must either a ninja build file or Visual Studio® build files. The latter must
be enabled via the --backend switch, as ninja is the default backend on all be enabled via the <code>--backend</code> switch, as ninja is the default backend on all
operating systems. Meson only supports out-of-tree builds, and must be passed a operating systems. Meson only supports out-of-tree builds, and must be passed a
directory to put built and generated sources into. We'll call that directory directory to put built and generated sources into. We'll call that directory
"build" for examples. "build" for examples.
@@ -42,9 +53,15 @@ To see a description of your options you can run <code>meson configure</code>
along with a build directory to view the selected options for. This will show along with a build directory to view the selected options for. This will show
your meson global arguments and project arguments, along with their defaults your meson global arguments and project arguments, along with their defaults
and your local settings. and your local settings.
</p>
Moes does not currently support listing options before configure a build <p>
Meson does not currently support listing options before configuring a build
directory, but this feature is being discussed upstream. directory, but this feature is being discussed upstream.
For now, we have a <code>bin/meson-options.py</code> script that prints
the options for you.
If that script doesn't work for some reason, you can always look in the
<code>meson_options.txt</code> file at the root of the project.
</p> </p>
<pre> <pre>
@@ -54,13 +71,21 @@ directory, but this feature is being discussed upstream.
<p> <p>
With additional arguments <code>meson configure</code> is used to change With additional arguments <code>meson configure</code> is used to change
options on already configured build directory. All options passed to this options on already configured build directory. All options passed to this
command are in the form -D "command"="value". command are in the form <code>-D "command"="value"</code>.
</p> </p>
<pre> <pre>
meson configure build/ -Dprefix=/tmp/install -Dglx=true meson configure build/ -Dprefix=/tmp/install -Dglx=true
</pre> </pre>
<p>
Note that options taking lists (such as <code>platforms</code>) are
<a href="http://mesonbuild.com/Build-options.html#using-build-options">a bit
more complicated</a>, but the simplest form compatible with Mesa options
is to use a comma to separate values (<code>-D platforms=drm,wayland</code>)
and brackets to represent an empty list (<code>-D platforms=[]</code>).
</p>
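The option forms described above (plain values, comma-separated lists, and brackets for an empty list) can be sketched as a small formatter. This is an illustrative helper only: `meson_define` and `configure_command` are hypothetical names, and the option names are the ones used in the examples in this document.

```python
def meson_define(name, value):
    """Format one option as a -D argument for meson or meson configure."""
    if isinstance(value, bool):
        text = "true" if value else "false"
    elif isinstance(value, (list, tuple)):
        # Comma-separated values; brackets stand for the empty list.
        text = ",".join(str(item) for item in value) if value else "[]"
    else:
        text = str(value)
    return "-D%s=%s" % (name, text)


def configure_command(build_dir, options):
    """Build the argv list for `meson configure`, with options in sorted order."""
    argv = ["meson", "configure", build_dir]
    argv.extend(meson_define(key, options[key]) for key in sorted(options))
    return argv
```

The resulting list can be handed to `subprocess.run` unchanged, which avoids shell quoting problems with values such as `[]`.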
<p> <p>
Once you've run the initial <code>meson</code> command successfully you can use Once you've run the initial <code>meson</code> command successfully you can use
your configured backend to build the project. With ninja, the -C option can be your configured backend to build the project. With ninja, the -C option can be
@@ -76,58 +101,112 @@ Without arguments, it will produce libGL.so and/or several other libraries
depending on the options you have chosen. Later, if you want to rebuild for a depending on the options you have chosen. Later, if you want to rebuild for a
different configuration, you should run <code>ninja clean</code> before different configuration, you should run <code>ninja clean</code> before
changing the configuration, or create a new out of tree build directory for changing the configuration, or create a new out of tree build directory for
each configuration you want to build. each configuration you want to build
<a href="http://mesonbuild.com/Using-multiple-build-directories.html">as
recommended in the documentation</a>
</p> </p>
<p>
Autotools automatically updates translation files as part of the build process,
meson does not do this. Instead if you want translated drirc files you will need
to invoke non-default targets for ninja to update them:
<code>ninja -C build/ xmlpool-pot xmlpool-update-po xmlpool-gmo</code>
</p>
<dl>
<dt><code>Environment Variables</code></dt> <dt><code>Environment Variables</code></dt>
<dd><p>Meson supports the standard CC and CXX envrionment variables for <dd><p>Meson supports the standard CC and CXX environment variables for
changing the default compiler, and CFLAGS, CXXFLAGS, and LDFLAGS for setting changing the default compiler. Meson does support CFLAGS, CXXFLAGS, etc. But
options to the compiler and linker. their use is discouraged because of the many caveats in using them. Instead it
is recommended to use <code>-D${lang}_args</code> and
<code>-D${lang}_link_args</code>. Among the benefits of these options
is that they are guaranteed to persist across rebuilds and reconfigurations.
The default compilers depends on your operating system. Meson supports most of Meson does not allow changing compiler in a configured builddir, you will need
the popular compilers, a complete list is available to create a new build dir for a different compiler.
<a href="http://mesonbuild.com/Reference-tables.html#compiler-ids">here</a>.
These arguments are consumed and stored by meson when it is initialized or
re-initialized. Therefore passing them to meson configure will not do anything,
and passing them to ninja will only do something if ninja decides to
re-initialze meson, for example, if a meson.build file has been changed.
Changing these variables will not cause all targets to be rebuilt, so running
ninja clean is recomended when changing CFLAGS or CXXFLAGS. meson will never
change compiler in a configured build directory.
</p> </p>
<pre> <pre>
CC=clang CXX=clang++ meson build-clang CC=clang CXX=clang++ meson build-clang
ninja -C build-clang ninja -C build-clang
ninja -C build-clang clean ninja -C build-clang clean
touch meson.build meson configure build -Dc_args="-Wno-typedef-redefinition"
CFLAGS=-Wno-typedef-redefinition ninja -C build-clang ninja -C build-clang
</pre> </pre>
<p>Meson also honors DESTDIR for installs</p> <p>
The default compilers depend on your operating system. Meson supports most of
the popular compilers, a complete list is available
<a href="http://mesonbuild.com/Reference-tables.html#compiler-ids">here</a>.
</p>
<p>Meson also honors <code>DESTDIR</code> for installs</p>
</dd> </dd>
<dt><code>LLVM</code></dt> <dt><code>LLVM</code></dt>
<dd><p>Meson includes upstream logic to wrap llvm-config using it's standard <dd><p>Meson includes upstream logic to wrap llvm-config using its standard
dependncy interface. It will search $PATH (or %PATH% on windows) for dependency interface.
llvm-config, so using an LLVM from a non-standard path is as easy as
<code>PATH=/path/with/llvm-config:$PATH meson build</code>.
</p></dd> </p></dd>
<dd><p>
As of meson 0.49.0 meson also has the concept of a
<a href="https://mesonbuild.com/Native-environments.html">"native file"</a>,
these files provide information about the native build environment (as opposed
to a cross build environment). They are ini formatted and can override where to
find llvm-config:
custom-llvm.ini
<pre>
[binaries]
llvm-config = '/usr/local/bin/llvm/llvm-config'
</pre>
Then configure meson:
<pre>
meson builddir/ --native-file custom-llvm.ini
</pre>
</p></dd>
<dd><p>
For selecting llvm-config for cross compiling a
<a href="https://mesonbuild.com/Cross-compilation.html#defining-the-environment">"cross file"</a>
should be used. It uses the same format as the native file above:
cross-llvm.ini
<pre>
[binaries]
...
llvm-config = '/usr/lib/llvm-config-32'
</pre>
Then configure meson:
<pre>
meson builddir/ --cross-file cross-llvm.ini
</pre>
See the <a href="#cross-compilation">Cross Compilation</a> section for more information.
</p></dd>
<dd><p>
For older versions of meson <code>$PATH</code> (or <code>%PATH%</code> on
windows) will be searched for llvm-config (and llvm-config$version and
llvm-config-$version), you can override this environment variable to control
the search: <code>PATH=/path/with/llvm-config:$PATH meson build</code>.
</p></dd>
</dl> </dl>
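For setups that select between several LLVM installations, the native file shown in the LLVM entry could be generated instead of written by hand. The sketch below assumes only the two-line layout from that example; `render_native_file` and `write_native_file` are hypothetical helpers, not part of Meson or Mesa.

```python
def render_native_file(llvm_config_path):
    """Return the text of a native file that pins the llvm-config binary."""
    # Keep the sketch simple: refuse paths that would need escaping.
    if "'" in llvm_config_path or "\n" in llvm_config_path:
        raise ValueError("path cannot be written as a simple quoted string")
    return "[binaries]\nllvm-config = '%s'\n" % llvm_config_path


def write_native_file(filename, llvm_config_path):
    """Write the native file to disk, ready for `meson --native-file`."""
    with open(filename, "w") as handle:
        handle.write(render_native_file(llvm_config_path))
```

After `write_native_file("custom-llvm.ini", "/usr/local/bin/llvm/llvm-config")`, the file can be passed to meson with `--native-file custom-llvm.ini` as described above.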
<dl>
<dt><code>PKG_CONFIG_PATH</code></dt> <dt><code>PKG_CONFIG_PATH</code></dt>
<dd><p>The <dd><p>The
<code>pkg-config</code> utility is a hard requirement for configuring and <code>pkg-config</code> utility is a hard requirement for configuring and
building Mesa on Linux and *BSD. It is used to search for external libraries building Mesa on Unix-like systems. It is used to search for external libraries
on the system. This environment variable is used to control the search on the system. This environment variable is used to control the search path for
path for <code>pkg-config</code>. For instance, setting <code>pkg-config</code>. For instance, setting
<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for <code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for package
package metadata in <code>/usr/X11R6</code> before the standard metadata in <code>/usr/X11R6</code> before the standard directories.</p>
directories.</p>
</dd> </dd>
</dl> </dl>
@@ -136,7 +215,7 @@ One of the oddities of meson is that some options are different when passed to
the <code>meson</code> than to <code>meson configure</code>. These options are the <code>meson</code> than to <code>meson configure</code>. These options are
passed as --option=foo to <code>meson</code>, but -Doption=foo to <code>meson passed as --option=foo to <code>meson</code>, but -Doption=foo to <code>meson
configure</code>. Mesa defined options are always passed as -Doption=foo. configure</code>. Mesa defined options are always passed as -Doption=foo.
<p> </p>
<p>For those coming from autotools be aware of the following:</p> <p>For those coming from autotools be aware of the following:</p>
@@ -145,24 +224,115 @@ configure</code>. Mesa defined options are always passed as -Doption=foo.
<dd><p>This option will set the compiler debug/optimisation levels to aid <dd><p>This option will set the compiler debug/optimisation levels to aid
debugging the Mesa libraries.</p> debugging the Mesa libraries.</p>
<p>Note that in meson this defaults to "debugoptimized", and not setting it to <p>Note that in meson this defaults to <code>debugoptimized</code>, and
"release" will yield non-optimal performance and binary size. Not using "debug" not setting it to <code>release</code> will yield non-optimal
may interfer with debbugging as some code and validation will be optimized performance and binary size. Not using <code>debug</code> may interfere
away. with debugging as some code and validation will be optimized away.
</p> </p>
<p> For those wishing to pass their own -O option, use the "plain" buildtype, <p> For those wishing to pass their own optimization flags, use the <code>plain</code>
which cuases meson to inject no additional compiler arguments, only those in buildtype, which causes meson to inject no additional compiler arguments, only
the C/CXXFLAGS and those that mesa itself defines.</p> those in the C/CXXFLAGS and those that mesa itself defines.</p>
</dd> </dd>
</dl> </dl>
<dl> <dl>
<dt><code>-Db_ndebug</code></dt> <dt><code>-Db_ndebug</code></dt>
<dd><p>This option controls assertions in meson projects. When set to false <dd><p>This option controls assertions in meson projects. When set to <code>false</code>
(the default) assertions are enabled, when set to true they are disabled. This (the default) assertions are enabled, when set to true they are disabled. This
is unrelated to the <code>buildtype</code>; setting the latter to is unrelated to the <code>buildtype</code>; setting the latter to
<code>release</code> will not turn off assertions. <code>release</code> will not turn off assertions.
</p> </p>
</dd> </dd>
</dl> </dl>
<h2 id="cross-compilation">2. Cross-compilation and 32-bit builds</h2>
<p><a href="https://mesonbuild.com/Cross-compilation.html">Meson supports
cross-compilation</a> by specifying a number of binary paths and
settings in a file and passing this file to <code>meson</code> or
<code>meson configure</code> with the <code>--cross-file</code>
parameter.</p>
<p>This file can live at any location, but you can use the bare filename
(without the folder path) if you put it in $XDG_DATA_HOME/meson/cross or
~/.local/share/meson/cross</p>
<p>Below are a few examples of cross files, but keep in mind that you
will likely have to alter them for your system.</p>
<p>
Those running on ArchLinux can use the AUR-maintained packages for some
of those, as they'll have the right values for your system:
<ul>
<li><a href="https://aur.archlinux.org/packages/meson-cross-x86-linux-gnu">meson-cross-x86-linux-gnu</a></li>
<li><a href="https://aur.archlinux.org/packages/meson-cross-aarch64-linux-gnu">meson-cross-aarch64-linux-gnu</a></li>
</ul>
</p>
<p>
32-bit build on x86 linux:
<pre>
[binaries]
c = '/usr/bin/gcc'
cpp = '/usr/bin/g++'
ar = '/usr/bin/gcc-ar'
strip = '/usr/bin/strip'
pkgconfig = '/usr/bin/pkg-config-32'
llvm-config = '/usr/bin/llvm-config32'
[properties]
c_args = ['-m32']
c_link_args = ['-m32']
cpp_args = ['-m32']
cpp_link_args = ['-m32']
[host_machine]
system = 'linux'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'
</pre>
</p>
<p>
64-bit build on ARM linux:
<pre>
[binaries]
c = '/usr/bin/aarch64-linux-gnu-gcc'
cpp = '/usr/bin/aarch64-linux-gnu-g++'
ar = '/usr/bin/aarch64-linux-gnu-gcc-ar'
strip = '/usr/bin/aarch64-linux-gnu-strip'
pkgconfig = '/usr/bin/aarch64-linux-gnu-pkg-config'
exe_wrapper = '/usr/bin/qemu-aarch64-static'
[host_machine]
system = 'linux'
cpu_family = 'aarch64'
cpu = 'aarch64'
endian = 'little'
</pre>
</p>
<p>
64-bit build on x86 windows:
<pre>
[binaries]
c = '/usr/bin/x86_64-w64-mingw32-gcc'
cpp = '/usr/bin/x86_64-w64-mingw32-g++'
ar = '/usr/bin/x86_64-w64-mingw32-ar'
strip = '/usr/bin/x86_64-w64-mingw32-strip'
pkgconfig = '/usr/bin/x86_64-w64-mingw32-pkg-config'
exe_wrapper = 'wine'
[host_machine]
system = 'windows'
cpu_family = 'x86_64'
cpu = 'x86_64'
endian = 'little'
</pre>
</p>
</div>
</body>
</html>


@@ -1,31 +0,0 @@
ARB_texture_float:
Silicon Graphics, Inc. owns US Patent #6,650,327, issued November 18,
2003 [1].
SGI believes this patent contains necessary IP for graphics systems
implementing floating point rasterization and floating point
framebuffer capabilities described in ARB_texture_float extension, and
will discuss licensing on RAND terms, on an individual basis with
companies wishing to use this IP in the context of conformant OpenGL
implementations [2].
The source code to implement ARB_texture_float extension is included
and can be toggled on at compile time, for those who purchased a
license from SGI, or are in a country where the patent does not apply,
etc.
The software is provided "as is", without warranty of any kind, express
or implied, including but not limited to the warranties of
merchantability, fitness for a particular purpose and noninfringement.
In no event shall the authors or copyright holders be liable for any
claim, damages or other liability, whether in an action of contract,
tort or otherwise, arising from, out of or in connection with the
software or the use or other dealings in the software.
You should contact a lawyer or SGI's legal department if you want to
enable this extension.
[1] https://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
[2] https://www.opengl.org/registry/specs/ARB/texture_float.txt


@@ -24,10 +24,12 @@ Some Linux distributions closely follow the latest Mesa releases. On others one
has to use unofficial channels. has to use unofficial channels.
<br> <br>
There are some general directions: There are some general directions:
<ul>
<li>Debian/Ubuntu based distros - PPA: xorg-edgers, oibaf and padoka</li> <li>Debian/Ubuntu based distros - PPA: xorg-edgers, oibaf and padoka</li>
<li>Fedora - Corp: erp and che</li> <li>Fedora - Corp: erp and che</li>
<li>OpenSuse/SLES - OBS: X11:XOrg and pontostroy:X11</li> <li>OpenSuse/SLES - OBS: X11:XOrg and pontostroy:X11</li>
<li>Gentoo/Archlinux - officially provided/supported</li> <li>Gentoo/Archlinux - officially provided/supported</li>
</ul>
</p> </p>
</div> </div>


@@ -23,6 +23,16 @@ Mesa provides feature/development and stable releases.
The table below lists the date and release manager that is expected to do the The table below lists the date and release manager that is expected to do the
specific release. specific release.
<br> <br>
Regular updates will ensure that the schedule for the current and the
next two feature releases is shown in the table.
<br>
In order to keep the whole releasing team up to date with the tools
used, best practices and other details, the member in charge of the
next feature release will be in constant rotation.
<br>
The way the release schedule works is
explained <a href="releasing.html#schedule" target="_parent">here</a>.
<br>
Take a look <a href="submittingpatches.html#criteria" target="_parent">here</a> Take a look <a href="submittingpatches.html#criteria" target="_parent">here</a>
if you'd like to nominate a patch in the next stable release. if you'd like to nominate a patch in the next stable release.
</p> </p>
@@ -39,66 +49,117 @@ if you'd like to nominate a patch in the next stable release.
<th>Notes</th> <th>Notes</th>
</tr> </tr>
<tr> <tr>
<td rowspan="3">17.3</td> <td rowspan="2">18.3</td>
<td>2018-01-26</td> <td>2019-02-27</td>
<td>17.3.4</td> <td>18.3.5</td>
<td>Emil Velikov</td> <td>Emil Velikov</td>
<td></td> <td>
</tr> </tr>
<tr> <tr>
<td>2018-02-09</td> <td>2019-03-13</td>
<td>17.3.5</td> <td>18.3.6</td>
<td>Juan A. Suarez Romero</td>
<td></td>
</tr>
<tr>
<td>2018-02-23</td>
<td>17.3.6</td>
<td>Juan A. Suarez Romero</td>
<td>Final planned release for the 17.3 series</td>
</tr>
<tr>
<td rowspan="7">18.0</td>
<td>2018-01-19</td>
<td>18.0.0-rc1</td>
<td>Emil Velikov</td> <td>Emil Velikov</td>
<td></td> <td>Last planned 18.3.x release</td>
</tr> </tr>
<tr> <tr>
<td>2018-01-26</td> <td rowspan="4">19.0</td>
<td>18.0.0-rc2</td> <td>2019-01-29</td>
<td>Emil Velikov</td> <td>19.0.0-rc1</td>
<td></td> <td>Dylan Baker</td>
<td>
</tr> </tr>
<tr> <tr>
<td>2018-02-02</td> <td>2019-02-05</td>
<td>18.0.0-rc3</td> <td>19.0.0-rc2</td>
<td>Emil Velikov</td> <td>Dylan Baker</td>
<td></td> <td>
</tr> </tr>
<tr> <tr>
<td>2018-02-09</td> <td>2019-02-12</td>
<td>18.0.0-rc4</td> <td>19.0.0-rc3</td>
<td>Emil Velikov</td> <td>Dylan Baker</td>
<td>May be promoted to 18.0.0 final</td> <td>
</tr> </tr>
<tr> <tr>
<td>2018-02-23</td> <td>2019-02-19</td>
<td>18.0.1</td> <td>19.0.0-rc4</td>
<td>Dylan Baker</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td rowspan="4">19.1</td>
<td>2019-04-30</td>
<td>19.1.0-rc1</td>
<td>Andres Gomez</td> <td>Andres Gomez</td>
<td></td> <td>
</tr> </tr>
<tr> <tr>
<td>2018-03-09</td> <td>2019-05-07</td>
<td>18.0.2</td> <td>19.1.0-rc2</td>
<td>Andres Gomez</td> <td>Andres Gomez</td>
<td></td> <td>
</tr> </tr>
<tr> <tr>
<td>2018-03-23</td> <td>2019-05-14</td>
<td>18.0.3</td> <td>19.1.0-rc3</td>
<td>Andres Gomez</td> <td>Andres Gomez</td>
<td></td> <td>
</tr>
<tr>
<td>2019-05-21</td>
<td>19.1.0-rc4</td>
<td>Andres Gomez</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td rowspan="4">19.2</td>
<td>2019-08-06</td>
<td>19.2.0-rc1</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-13</td>
<td>19.2.0-rc2</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-20</td>
<td>19.2.0-rc3</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-27</td>
<td>19.2.0-rc4</td>
<td>Emil Velikov</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td rowspan="4">19.3</td>
<td>2019-10-15</td>
<td>19.3.0-rc1</td>
<td>Juan A. Suarez</td>
<td>
</tr>
<tr>
<td>2019-10-22</td>
<td>19.3.0-rc2</td>
<td>Juan A. Suarez</td>
<td>
</tr>
<tr>
<td>2019-10-29</td>
<td>19.3.0-rc3</td>
<td>Juan A. Suarez</td>
<td>
</tr>
<tr>
<td>2019-11-05</td>
<td>19.3.0-rc4</td>
<td>Juan A. Suarez</td>
<td>Last planned RC/Final release</td>
</tr> </tr>
</table> </table>
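The release-candidate rows in the table above are spaced one week apart, with four candidates per feature release. Assuming that cadence holds (it is inferred from the table, not a rule stated here), the rows for a series can be derived from the rc1 date. The function names below are hypothetical.

```python
from datetime import date, timedelta


def candidate_dates(rc1, count=4, spacing_days=7):
    """Return the dates of rc1..rcN, spaced spacing_days apart."""
    return [rc1 + timedelta(days=spacing_days * i) for i in range(count)]


def schedule_rows(series, rc1):
    """Return (date, version) pairs shaped like the rows of the calendar table."""
    return [
        (day.isoformat(), "%s.0-rc%d" % (series, number))
        for number, day in enumerate(candidate_dates(rc1), start=1)
    ]
```

For example, `schedule_rows("19.0", date(2019, 1, 29))` reproduces the four 19.0 rows listed in the table.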


@@ -21,6 +21,7 @@
<li><a href="#overview">Overview</a> <li><a href="#overview">Overview</a>
<li><a href="#schedule">Release schedule</a> <li><a href="#schedule">Release schedule</a>
<li><a href="#pickntest">Cherry-pick and test</a> <li><a href="#pickntest">Cherry-pick and test</a>
<li><a href="#stagingbranch">Staging branch</a>
<li><a href="#branch">Making a branchpoint</a> <li><a href="#branch">Making a branchpoint</a>
<li><a href="#prerelease">Pre-release announcement</a> <li><a href="#prerelease">Pre-release announcement</a>
<li><a href="#release">Making a new release</a> <li><a href="#release">Making a new release</a>
@@ -54,10 +55,11 @@ For example:
<h1 id="schedule">Release schedule</h1> <h1 id="schedule">Release schedule</h1>
<p> <p>
Releases should happen on Fridays. Delays can occur although those should be keep Releases should happen on Wednesdays. Delays can occur although those
to a minimum. should be kept to a minimum.
<br> <br>
See our <a href="release-calendar.html" target="_parent">calendar</a> for the See our <a href="release-calendar.html" target="_parent">calendar</a>
for information about how the release schedule is planned, and the
date and other details for individual releases. date and other details for individual releases.
</p> </p>
@@ -66,6 +68,9 @@ date and other details for individual releases.
<li>Available approximately every three months. <li>Available approximately every three months.
<li>Initial timeplan available 2-4 weeks before the planned branchpoint (rc1) <li>Initial timeplan available 2-4 weeks before the planned branchpoint (rc1)
on the mesa-announce@ mailing list. on the mesa-announce@ mailing list.
<li>Typically, the final release will happen after 4
candidates. Additional ones may be needed in order to resolve blocking
regressions, though.
<li>A <a href="#prerelease">pre-release</a> announcement should be available <li>A <a href="#prerelease">pre-release</a> announcement should be available
approximately 24 hours before the final (non-rc) release. approximately 24 hours before the final (non-rc) release.
</ul> </ul>
@@ -83,6 +88,12 @@ Note: There is one or two releases overlap when changing branches. For example:
<br> <br>
The final release from the 12.0 series Mesa 12.0.5 will be out around the same The final release from the 12.0 series Mesa 12.0.5 will be out around the same
time (or shortly after) 13.0.1 is out. time (or shortly after) 13.0.1 is out.
<br>
This also means that, because a final release may be delayed by the
need for additional candidates to resolve blocking regressions,
the release manager might have to update
the <a href="release-calendar.html" target="_parent">calendar</a> with
additional bug-fix releases of the current stable branch.
</p> </p>
@@ -111,18 +122,21 @@ the autoconf and scons build.
<p>Done continuously up-to the <a href="#prerelease">pre-release</a> announcement.</p> <p>Done continuously up-to the <a href="#prerelease">pre-release</a> announcement.</p>
<p> <p>
As an exception, patches can be applied up-to the last ~1h before the actual Developers can request, <em>as an exception</em>, patches to be applied up-to
release. This is made <strong>only</strong> with explicit permission/request, the last one hour before the actual release. This is made <strong>only</strong>
and the patch <strong>must</strong> be very well contained. Thus it cannot with explicit permission/request, and the patch <strong>must</strong> be very
affect more than one driver/subsystem. well contained. Thus it cannot affect more than one driver/subsystem.
</p>
<p>
Currently Ilia Mirkin and AMD devs have requested "permanent" exception.
</p> </p>
<p>The following developers have requested a permanent exception:</p>
<ul> <ul>
<li>make distcheck, scons and scons check must pass <li><em>Ilia Mirkin</em>
<li><em>AMD team</em>
</ul>
<p>The following must pass:</p>
<ul>
<li>make distcheck, scons and scons check
<li>Testing with different version of system components - LLVM and others is also <li>Testing with different version of system components - LLVM and others is also
performed where possible. performed where possible.
<li>As a general rule, testing with various combinations of configure <li>As a general rule, testing with various combinations of configure
@@ -130,9 +144,9 @@ switches, depending on the specific patchset.
</ul> </ul>
<p> <p>
Achieved by combination of local ad-hoc scripts, mingw-w64 cross These are achieved by a combination of <a href="#basictesting">local testing</a>,
compilation and AppVeyor plus Travis-CI, the latter as part of their which includes mingw-w64 cross compilation and AppVeyor plus Travis-CI, the
Github integration. latter two as part of their Github integration.
</p> </p>
<p> <p>
@@ -209,6 +223,25 @@ system and making some every day's use until the release may be a good
idea too. idea too.
</p> </p>
<h1 id="stagingbranch">Staging branch</h1>
<p>
A live branch, which contains the currently merged/rejected patches, is available
in the main repository under <code>staging/X.Y</code>. For example:
</p>
<pre>
staging/18.1 - WIP branch for the 18.1 series
staging/18.2 - WIP branch for the 18.2 series
</pre>
<p>
Notes:
</p>
<ul>
<li>People are encouraged to test the staging branch and report regressions.</li>
<li>The branch history is not stable and it <strong>will</strong> be rebased.</li>
</ul>
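Scripts that test the staging branch need to map a release version to its branch name. A minimal sketch of the <code>staging/X.Y</code> naming described above follows; the helper `staging_branch` is hypothetical and only encodes the naming pattern from the examples.

```python
import re


def staging_branch(version):
    """Map a version such as '18.1.3' or '18.2.0-rc1' to 'staging/X.Y'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("expected a version like X.Y or X.Y.Z: %r" % (version,))
    return "staging/%s.%s" % (match.group(1), match.group(2))
```

Since the branch is rebased, a script using this name should fetch and reset to the remote branch rather than pull into an existing local copy.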
<h1 id="branch">Making a branchpoint</h1> <h1 id="branch">Making a branchpoint</h1>
@@ -425,7 +458,7 @@ Ensure the latest code is available - both in your local master and the
relevant branch. relevant branch.
</p> </p>
<h3>Perform basic testing</h3> <h3 id="basictesting">Perform basic testing</h3>
<p> <p>
Most of the testing should already be done during the Most of the testing should already be done during the
@@ -492,10 +525,10 @@ Here is one solution that I've been using.
# Drop LLVM_CONFIG, if applicable: # Drop LLVM_CONFIG, if applicable:
# unset LLVM_CONFIG # unset LLVM_CONFIG
__glxinfo_cmd='glxinfo 2>&amp;1 | egrep -o "Mesa.*|Gallium.*|.*dri\.so"' __glxinfo_cmd='glxinfo 2&gt;&amp;1 | egrep -o "Mesa.*|Gallium.*|.*dri\.so"'
__glxgears_cmd='glxgears 2>&amp;1 | grep -v "configuration file"' __glxgears_cmd='glxgears 2&gt;&amp;1 | grep -v "configuration file"'
__es2info_cmd='es2_info 2>&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"' __es2info_cmd='es2_info 2&gt;&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'
__es2gears_cmd='es2gears_x11 2>&amp;1 | grep -v "configuration file"' __es2gears_cmd='es2gears_x11 2&gt;&amp;1 | grep -v "configuration file"'
test "x$LD_LIBRARY_PATH" != 'x' &amp;&amp; __old_ld="$LD_LIBRARY_PATH" test "x$LD_LIBRARY_PATH" != 'x' &amp;&amp; __old_ld="$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/:"${__old_ld}" export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/:"${__old_ld}"
export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/ export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/


@@ -21,7 +21,43 @@ The release notes summarize what's new or changed in each Mesa release.
</p> </p>
<ul> <ul>
<li><a href="relnotes/17.3.2.html">17.3.3 release notes</a> <li><a href="relnotes/18.3.4.html">18.3.4 release notes</a>
<li><a href="relnotes/18.3.3.html">18.3.3 release notes</a>
<li><a href="relnotes/18.3.2.html">18.3.2 release notes</a>
<li><a href="relnotes/18.2.8.html">18.2.8 release notes</a>
<li><a href="relnotes/18.2.7.html">18.2.7 release notes</a>
<li><a href="relnotes/18.3.1.html">18.3.1 release notes</a>
<li><a href="relnotes/18.3.0.html">18.3.0 release notes</a>
<li><a href="relnotes/18.2.6.html">18.2.6 release notes</a>
<li><a href="relnotes/18.2.5.html">18.2.5 release notes</a>
<li><a href="relnotes/18.2.4.html">18.2.4 release notes</a>
<li><a href="relnotes/18.2.3.html">18.2.3 release notes</a>
<li><a href="relnotes/18.2.2.html">18.2.2 release notes</a>
<li><a href="relnotes/18.1.9.html">18.1.9 release notes</a>
<li><a href="relnotes/18.2.1.html">18.2.1 release notes</a>
<li><a href="relnotes/18.2.0.html">18.2.0 release notes</a>
<li><a href="relnotes/18.1.8.html">18.1.8 release notes</a>
<li><a href="relnotes/18.1.7.html">18.1.7 release notes</a>
<li><a href="relnotes/18.1.6.html">18.1.6 release notes</a>
<li><a href="relnotes/18.1.5.html">18.1.5 release notes</a>
<li><a href="relnotes/18.1.4.html">18.1.4 release notes</a>
<li><a href="relnotes/18.1.3.html">18.1.3 release notes</a>
<li><a href="relnotes/18.1.2.html">18.1.2 release notes</a>
<li><a href="relnotes/18.0.5.html">18.0.5 release notes</a>
<li><a href="relnotes/18.1.1.html">18.1.1 release notes</a>
<li><a href="relnotes/18.1.0.html">18.1.0 release notes</a>
<li><a href="relnotes/18.0.4.html">18.0.4 release notes</a>
<li><a href="relnotes/18.0.3.html">18.0.3 release notes</a>
<li><a href="relnotes/18.0.2.html">18.0.2 release notes</a>
<li><a href="relnotes/18.0.1.html">18.0.1 release notes</a>
<li><a href="relnotes/17.3.9.html">17.3.9 release notes</a>
<li><a href="relnotes/17.3.8.html">17.3.8 release notes</a>
<li><a href="relnotes/18.0.0.html">18.0.0 release notes</a>
<li><a href="relnotes/17.3.7.html">17.3.7 release notes</a>
<li><a href="relnotes/17.3.6.html">17.3.6 release notes</a>
<li><a href="relnotes/17.3.5.html">17.3.5 release notes</a>
<li><a href="relnotes/17.3.4.html">17.3.4 release notes</a>
<li><a href="relnotes/17.3.3.html">17.3.3 release notes</a>
<li><a href="relnotes/17.3.2.html">17.3.2 release notes</a> <li><a href="relnotes/17.3.2.html">17.3.2 release notes</a>
<li><a href="relnotes/17.2.8.html">17.2.8 release notes</a> <li><a href="relnotes/17.2.8.html">17.2.8 release notes</a>
<li><a href="relnotes/17.3.1.html">17.3.1 release notes</a> <li><a href="relnotes/17.3.1.html">17.3.1 release notes</a>

275
docs/relnotes/17.3.4.html Normal file

@@ -0,0 +1,275 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.4 Release Notes / January 15, 2018</h1>
<p>
Mesa 17.3.4 is a bug fix release which fixes bugs found since the 17.3.3 release.
</p>
<p>
Mesa 17.3.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
2d3a4c3cbc995b3e192361dce710d8c749e046e7575aa1b7d8fc9e6b4df28f84 mesa-17.3.4.tar.gz
71f995e233bc5df1a0dd46c980d1720106e7f82f02d61c1ca50854b5e02590d0 mesa-17.3.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104745">Bug 104745</a> - HEVC VDPAU decoding broken on RX 460 with UVD Firmware v1.130</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104818">Bug 104818</a> - mesa fails to build on ia64</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen&lt;8</li>
</ul>
<p>Bas Nieuwenhuizen (10):</p>
<ul>
<li>radv: Fix ordering issue in meta memory allocation failure path.</li>
<li>radv: Fix memory allocation failure path in compute resolve init.</li>
<li>radv: Fix freeing meta state if the device pipeline cache fails to allocate.</li>
<li>radv: Fix fragment resolve init memory allocation failure paths.</li>
<li>radv: Fix bufimage failure deallocation.</li>
<li>radv: Init variant entry with memset.</li>
<li>radv: Don't allow 3d or 1d depth/stencil textures.</li>
<li>ac/nir: Use instance_rate_inputs per attribute, not per variable.</li>
<li>ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.</li>
<li>ac/nir: Fix vector extraction if source vector has &gt;4 elements.</li>
</ul>
<p>Boyuan Zhang (2):</p>
<ul>
<li>radeon/vcn: add and manage render picture list</li>
<li>radeon/uvd: add and manage render picture list</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>configure.ac: add missing llvm dependencies to .pc files</li>
</ul>
<p>Dave Airlie (10):</p>
<ul>
<li>r600/sb: fix a bug emitting ar load from a constant.</li>
<li>ac/nir: account for view index in the user sgpr allocation.</li>
<li>radv: add fs_key meta format support to resolve passes.</li>
<li>radv: don't use hw resolve for integer image formats</li>
<li>radv: don't use hw resolves for r16g16 norm formats.</li>
<li>radv: move spi_baryc_cntl to pipeline</li>
<li>r600/sb: insert the else clause when we might depart from a loop</li>
<li>radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)</li>
<li>radv/gfx9: fix block compression texture views. (v2)</li>
<li>virgl: also remove dimension on indirect.</li>
</ul>
<p>Eleni Maria Stea (1):</p>
<ul>
<li>mesa: Fix function pointers initialization in status tracker</li>
</ul>
<p>Emil Velikov (18):</p>
<ul>
<li>cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext</li>
<li>cherry-ignore: swr: refactor swr_create_screen to allow for proper cleanup on error</li>
<li>cherry-ignore: anv: add explicit 18.0 only nominations</li>
<li>cherry-ignore: radv: fix sample_mask_in loading. (v3.1)</li>
<li>cherry-ignore: meson: multiple fixes</li>
<li>cherry-ignore: swr/rast: support llvm 3.9 type declarations</li>
<li>Revert "cherry-ignore: intel/fs: Use the original destination region for int MUL lowering"</li>
<li>cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs</li>
<li>cherry-ignore: add gen10 fixes</li>
<li>cherry-ignore: add r600/amdgpu 18.0 nominations</li>
<li>cherry-ignore: add i965 shader cache fixes</li>
<li>cherry-ignore: nir: mark unused space in packed_tex_data</li>
<li>radv: Stop advertising VK_KHX_multiview</li>
<li>cherry-ignore: radv: Don't expose VK_KHX_multiview on android.</li>
<li>configure.ac: correct driglx-direct help text</li>
<li>cherry-ignore: add meson fix</li>
<li>cherry-ignore: add a few more meson fixes</li>
<li>Update version to 17.3.4</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>radeon: remove left over dead code</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/shader: Initialize max_driver_temp_used correctly for the first time</li>
</ul>
<p>Grazvydas Ignotas (2):</p>
<ul>
<li>st/va: release held locks in error paths</li>
<li>st/vdpau: release held lock in error path</li>
</ul>
<p>Igor Gnatenko (1):</p>
<ul>
<li>link mesautil with pthreads</li>
</ul>
<p>Indrajit Das (4):</p>
<ul>
<li>st/omx_bellagio: Update default intra matrix per MPEG2 spec</li>
<li>radeon/uvd: update quantiser matrices only when requested</li>
<li>radeon/vcn: update quantiser matrices only when requested</li>
<li>st/va: clear pointers for mpeg2 quantiser matrices</li>
</ul>
<p>Jason Ekstrand (19):</p>
<ul>
<li>i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer</li>
<li>i965: Add more precise cache tracking helpers</li>
<li>i965/blorp: Add more destination flushing</li>
<li>i965: Track the depth and render caches separately</li>
<li>i965: Track format and aux usage in the render cache</li>
<li>Re-enable regular fast-clears (CCS_D) on gen9+</li>
<li>i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage</li>
<li>i965/miptree: Add an explicit tiling parameter to create_for_bo</li>
<li>i965/miptree: Use the tiling from the modifier instead of the BO</li>
<li>i965/bufmgr: Add a create_from_prime_tiled function</li>
<li>i965: Set tiling on BOs imported with modifiers</li>
<li>i965/miptree: Take an aux_usage in prepare/finish_render</li>
<li>i965/miptree: Add an aux_disabled parameter to render_aux_usage</li>
<li>i965/surface_state: Drop brw_aux_surface_disabled</li>
<li>intel/fs: Use the original destination region for int MUL lowering</li>
<li>anv/pipeline: Don't look at blend state unless we have an attachment</li>
<li>anv/cmd_buffer: Re-emit the pipeline at every subpass</li>
<li>anv: Stop advertising VK_KHX_multiview</li>
<li>i965: Call prepare_external after implicit window-system MSAA resolves</li>
</ul>
<p>Jon Turney (3):</p>
<ul>
<li>configure: Default to gbm=no on osx</li>
<li>glx/apple: include util/debug.h for env_var_as_boolean prototype</li>
<li>glx/apple: locate dispatch table functions to wrap by name</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>svga: Prevent use after free.</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.3</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Bind null render targets for shadow sampling + color.</li>
<li>i965: Bump official kernel requirement to Linux v3.9.</li>
</ul>
<p>Lucas Stach (2):</p>
<ul>
<li>etnaviv: dirty TS state when framebuffer has changed</li>
<li>renderonly: fix dumb BO allocation for non 32bpp formats</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: don't ignore pitch for imported textures</li>
</ul>
<p>Matthew Nicholls (2):</p>
<ul>
<li>radv: restore previous stencil reference after depth-stencil clear</li>
<li>radv: remove predication on cache flushes</li>
</ul>
<p>Maxin B. John (1):</p>
<ul>
<li>anv_icd.py: improve reproducible builds</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>winsys/radeon: Compute is_displayable in surf_drm_to_winsys</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>r600: don't do stack workarounds for hemlock</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: create pipeline layout objects for all meta operations</li>
</ul>
<p>Samuel Thibault (1):</p>
<ul>
<li>glx: fix non-dri build</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>ac: fix buffer overflow bug in 64bit SSBO loads</li>
<li>ac: fix visit_ssa_undef() for doubles</li>
</ul>
</div>
</body>
</html>

66
docs/relnotes/17.3.5.html Normal file

@@ -0,0 +1,66 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.5 Release Notes / February 19, 2018</h1>
<p>
Mesa 17.3.5 is a bug fix release which fixes bugs found since the 17.3.4 release.
</p>
<p>
Mesa 17.3.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
bc1ee20366aae2affc37c89228f871f438136f70252005e9f842169bde976788 mesa-17.3.5.tar.gz
eb9228fc8aaa71e0205c1481c5b157752ebaec9b646b030d27478e25a6d7936a mesa-17.3.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.4</li>
<li>Update version to 17.3.5</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>ac/nir: Fix conflict resolution typo in handle_vs_input_decl</li>
</ul>
</div>
</body>
</html>

85
docs/relnotes/17.3.6.html Normal file

@@ -0,0 +1,85 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.6 Release Notes / February 27, 2018</h1>
<p>
Mesa 17.3.6 is a bug fix release which fixes bugs found since the 17.3.5 release.
</p>
<p>
Mesa 17.3.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
d5e10ea3f0d11b06d2b0b235bba372a04278c39bc0e712090bda1f61842db188 mesa-17.3.6.tar.gz
e5915680d44ac9d05defdec529db7459ac9edd441c9845266eff2e2d3e57fbf8 mesa-17.3.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.5</li>
<li>Update version to 17.3.6</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>i965/draw: Do resolves properly for textures used by TXF</li>
<li>i965: Replace draw_aux_buffer_disabled with draw_aux_usage</li>
<li>i965/draw: Set NEW_AUX_STATE when draw aux changes</li>
<li>i965: Stop disabling aux during texture preparation</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Don't disable CCS for RT dependencies when dispatching compute.</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>i965: Don't try to disable render aux buffers for compute</li>
</ul>
</div>
</body>
</html>

312
docs/relnotes/17.3.7.html Normal file

@@ -0,0 +1,312 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.7 Release Notes / March 21, 2018</h1>
<p>
Mesa 17.3.7 is a bug fix release which fixes bugs found since the 17.3.6 release.
</p>
<p>
Mesa 17.3.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f08de6d0ccb3dbca04b44790d85c3ff9e7b1cc4189d1b7c7167e5ba7d98736c0 mesa-17.3.7.tar.gz
0595904a8fba65a8fe853a84ad3c940205503b94af41e8ceed245fada777ac1e mesa-17.3.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Fix CmdCopyImage between uncompressed and compressed images</li>
</ul>
<p>Andriy Khulap (1):</p>
<ul>
<li>i965: Fix RELOC_WRITE typo in brw_store_data_imm64()</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>isl: Don't use surface format R32_FLOAT for typed atomic integer operations</li>
</ul>
<p>Bas Nieuwenhuizen (6):</p>
<ul>
<li>radv: Always lower indirect derefs after nir_lower_global_vars_to_local.</li>
<li>radeonsi: Export signalled sync file instead of -1.</li>
<li>radv: Implement WaitForFences with !waitAll.</li>
<li>radv: Implement waiting on non-submitted fences.</li>
<li>radv: Fix copying from 3D images starting at non-zero depth.</li>
<li>radv: Increase the number of dynamic uniform buffers.</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>glx: Properly handle cases where screen creation fails</li>
</ul>
<p>Daniel Stone (3):</p>
<ul>
<li>i965: Fix bugs in intel_from_planar</li>
<li>egl/wayland: Fix ARGB/XRGB transposition in config map</li>
<li>egl/wayland: Always use in-tree wayland-egl-backend.h</li>
</ul>
<p>Dave Airlie (9):</p>
<ul>
<li>r600: fix cubemap arrays</li>
<li>r600/sb/cayman: fix indirect ubo access on cayman</li>
<li>r600: fix xfb stream check.</li>
<li>ac/nir: to integer the args to bcsel.</li>
<li>r600/cayman: fix fragcood loading recip generation.</li>
<li>radv: don't support tc-compat on multisample d32s8 at all.</li>
<li>virgl: remap query types to hw support.</li>
<li>ac/nir: don't apply slice rounding on txf_ms</li>
<li>r600: implement callstack workaround for evergreen.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>glapi/check_table: Remove 'extern "C"' block</li>
<li>glapi: remove APPLE extensions from test</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.6</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.</li>
<li>ac/nir: Fix compiler warning about uninitialized dw_addr.</li>
<li>glsl/tests: Fix strict aliasing warning about int64/double.</li>
<li>glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>i965: Fix KHR_blend_equation_advanced with some render targets.</li>
</ul>
<p>Frank Binns (1):</p>
<ul>
<li>egl/dri2: fix segfault when display initialisation fails</li>
</ul>
<p>George Kyriazis (1):</p>
<ul>
<li>swr/rast: blend_epi32() should return Integer, not Float</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>
</ul>
<p>Gurchetan Singh (1):</p>
<ul>
<li>mesa: don't clamp just based on ARB_viewport_array extension</li>
</ul>
<p>Iago Toral Quiroga (2):</p>
<ul>
<li>i965/sbe: fix number of inputs for active components</li>
<li>i965/vec4: use a temp register to compute offsets for pull loads</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>radv: Really use correct HTILE expanded words.</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>intel/isl: Add an isl_color_value_is_zero helper</li>
<li>vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails</li>
<li>intel/fs: Set up sampler message headers in the visitor on gen7+</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>configure.ac: pthread-stubs not present on OpenBSD</li>
</ul>
<p>Jordan Justen (3):</p>
<ul>
<li>i965: Create new program cache bo when clearing the program cache</li>
<li>program: Don't reset SamplersValidated when restoring from shader cache</li>
<li>intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview</li>
</ul>
<p>Juan A. Suarez Romero (14):</p>
<ul>
<li>cherry-ignore: Explicit 18.0 only nominations</li>
<li>cherry-ignore: r600/compute: only mark buffer/image state dirty for fragment shaders</li>
<li>cherry-ignore: anv: Move setting current_pipeline to cmd_state_init</li>
<li>cherry-ignore: anv: Be more careful about fast-clear colors</li>
<li>cherry-ignore: Add patches that has a specific version for 17.3</li>
<li>cherry-ignore: r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>
<li>cherry-ignore: intel/compiler: Memory fence commit must always be enabled for gen10+</li>
<li>cherry-ignore: i965: Avoid problems from referencing orphaned BOs after growing.</li>
<li>cherry-ignore: include all Meson related fixes</li>
<li>cherry-ignore: ac/shader: fix vertex input with components.</li>
<li>cherry-ignore: i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.</li>
<li>cherry-ignore: anv/image: Separate modifiers from legacy scanout</li>
<li>cherry-ignore: glsl: Fix memory leak with known glsl_type instances</li>
<li>Update version to 17.3.7</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nvir/nvc0: fix legalizing of ld unlock c0[0x10000]</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Emit CS stall before MEDIA_VFE_STATE.</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: perf: ensure reading config IDs from sysfs isn't interrupted</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: align command buffer starting address to fix some Raven hangs</li>
<li>configure.ac: blacklist libdrm 2.4.90</li>
</ul>
<p>Michal Navratil (1):</p>
<ul>
<li>winsys/amdgpu: allow non page-aligned size bo creation from pointer</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>glsl/linker: fix bug when checking precision qualifier</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>ac/nir: use ordered float comparisons except for not equal</li>
<li>Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"</li>
</ul>
<p>Stephan Gerhold (1):</p>
<ul>
<li>util/build-id: Fix address comparison for binaries with LOAD vaddr &gt; 0</li>
</ul>
<p>Thomas Hellstrom (2):</p>
<ul>
<li>svga: Fix a leftover debug hack</li>
<li>loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: fix MemoryBuffer build break for llvm-6</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>nir: fix interger divide by zero crash during constant folding</li>
</ul>
<p>Tobias Droste (1):</p>
<ul>
<li>gallivm: Use new LLVM fast-math-flags API</li>
</ul>
<p>Vadym Shovkoplias (1):</p>
<ul>
<li>mesa: add glsl version query (v4)</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>swr/rast: Fix macOS macro.</li>
</ul>
</div>
</body>
</html>

147
docs/relnotes/17.3.8.html Normal file

@@ -0,0 +1,147 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.8 Release Notes / April 03, 2018</h1>
<p>
Mesa 17.3.8 is a bug fix release which fixes bugs found since the 17.3.7 release.
</p>
<p>
Mesa 17.3.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
175d2ca9be2af3a8db6cd603986096d75da70f59699528d7b6675d542a305e23 mesa-17.3.8.tar.gz
8f9d9bf281c48e4a8f5228816577263b4c655248dc7666e75034ab422951a6b1 mesa-17.3.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>
<li>st/nine: Fixes warning about implicit conversion</li>
<li>st/nine: Fix non inversible matrix check</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv/pipeline: fail if TCS/TES compile fail</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: get correct offset into LDS for indexed vars.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Make swrast display_sync the correct queue</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>meson/configure: detect endian.h instead of trying to guess when it's available</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>mesa: Don't write to user buffer in glGetTexParameterIuiv on error</li>
<li>i965/vec4: Fix null destination register in 3-source instructions</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Emit texture cache invalidates around blorp_copy</li>
</ul>
<p>Jordan Justen (2):</p>
<ul>
<li>i965: Calculate thread_count in brw_alloc_stage_scratch</li>
<li>i965: Hard code CS scratch_ids_per_subslice for Cherryview</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.7</li>
<li>cherry-ignore: ac/nir: pass the nir variable through tcs loading.</li>
<li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>
<li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>
<li>cherry-ignore: docs: fix 18.0 release note version</li>
<li>Update version to 17.3.8</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move feedback command inside of destroy function</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>nir: fix per_vertex_output intrinsic</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>
<li>nir: fix crash in loop unroll corner case</li>
</ul>
</div>
</body>
</html>

162
docs/relnotes/17.3.9.html Normal file

@@ -0,0 +1,162 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.9 Release Notes / April 18, 2018</h1>
<p>
Mesa 17.3.9 is a bug fix release which fixes bugs found since the 17.3.8 release.
</p>
<p>
Mesa 17.3.9 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
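<p>
A minimal sketch of how an application might interpret the driver-reported
version string. It assumes the desktop GL layout in which the string begins
with "major.minor" or "major.minor.release", optionally followed by a space
and vendor-specific text. The helper names and the sample strings are
illustrative only and are not taken from any particular driver.
</p>

```python
import re

# Desktop GL version strings start with "major.minor" or "major.minor.release",
# optionally followed by a space and vendor-specific information.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:\s|$)")


def parse_gl_version(version_string):
    """Return (major, minor) parsed from a desktop GL_VERSION string."""
    match = _VERSION_RE.match(version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required=(4, 5)):
    """True if the reported version is at least `required` (major, minor)."""
    return parse_gl_version(version_string) >= required


if __name__ == "__main__":
    # Hypothetical strings; real drivers report their own text.
    for sample in ("4.5 (Core Profile) Mesa 17.3.9", "3.0 Mesa 17.3.9"):
        print(sample, "->", parse_gl_version(sample), supports(sample))
```

<p>
Comparing the parsed tuple rather than the raw string avoids lexical
surprises such as "10.0" sorting before "4.5".
</p>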
<h2>SHA256 checksums</h2>
<pre>
4d625f65a1ff4cd8cfeb39e38f047507c6dea047502a0d53113c96f54588f340 mesa-17.3.9.tar.gz
c5beb5fc05f0e0c294fefe1a393ee118cb67e27a4dca417d77c297f7d4b6e479 mesa-17.3.9.tar.xz
</pre>
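<p>
The checksums above can be checked against a downloaded tarball. The
following is a sketch using only the Python standard library; the function
names are illustrative, and it assumes the checksum list is available as
text in the "digest filename" layout shown above.
</p>

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_lines(text):
    """Map file name -> expected digest from 'digest filename' lines."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            digest, name = parts
            expected[name] = digest.lower()
    return expected


def verify(path, name, checksums):
    """True only if `name` is listed and the file at `path` matches it."""
    return sha256_of_file(path) == checksums.get(name)
```

<p>
For example, after parsing the two lines above with
<code>parse_checksum_lines</code>, calling
<code>verify("mesa-17.3.9.tar.xz", "mesa-17.3.9.tar.xz", checksums)</code>
returns <code>True</code> only when the local file matches the published
digest.
</p>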
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>dri_util: when overriding, always reset the core version</li>
<li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>
</ul>
<p>Axel Davy (2):</p>
<ul>
<li>st/nine: Declare lighting consts for ff shaders</li>
<li>st/nine: Do not use scratch for face register</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>ac/nir: Add workaround for GFX9 buffer views.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Initialise modifier to INVALID for DRI2</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>glsl: remove unreachable assert()</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>gbm: remove never-implemented function</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>mesa: Inherit texture view multi-sample information from the original texture images.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/vars_to_ssa: Remove copies from the correct set</li>
<li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>
<li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>
<li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.8</li>
<li>cherry-ignore: Explicit 18.0 only nominations</li>
<li>Update version to 17.3.9</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix number of planes for depth &amp; stencil</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix picking the method for resolve subpass</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965: Extend the negative 32-bit deltas to 64-bits</li>
</ul>
<p>Timothy Arceri (6):</p>
<ul>
<li>gallium/pipebuffer: fix parenthesis location</li>
<li>glsl: always call do_lower_jumps() after loop unrolling</li>
<li>ac: add if/loop build helpers</li>
<li>radeonsi: make use of if/loop build helpers in ac</li>
<li>ac: make use of if/loop build helpers</li>
<li>mesa: free debug messages when destroying the debug state</li>
</ul>
<p>Xiong, James (1):</p>
<ul>
<li>i965: return the fourcc saved in __DRIimage when possible</li>
</ul>
</div>
</body>
</html>


@@ -1,73 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.4.0 Release Notes / TBD</h1>
<p>
Mesa 17.4.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 17.4.1.
</p>
<p>
Mesa 17.4.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>Disk shader cache support for i965 when the MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false"</li>
<li>GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on r600/evergreen+</li>
<li>GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on r600/evergreen+</li>
<li>GL_ARB_shader_storage_buffer_object on r600/evergreen+</li>
<li>GL_ARB_compute_shader on r600/evergreen+</li>
<li>GL_ARB_cull_distance on r600/evergreen+</li>
<li>GL_ARB_enhanced_layouts on r600/evergreen+</li>
<li>GL_ARB_bindless_texture on nvc0/kepler</li>
<li>OpenGL 4.3 on r600/evergreen with hw fp64 support</li>
<li>Support 1 binary format for GL_ARB_get_program_binary on i965</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>TBD</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL</li>
</ul>
</div>
</body>
</html>

docs/relnotes/18.0.0.html Normal file

@@ -0,0 +1,321 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.0 Release Notes / March 27, 2018</h1>
<p>
Mesa 18.0.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 18.0.1.
</p>
<p>
Mesa 18.0.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
93c2d3504b2871ac2146603fb1270f341d36a39695e2950a469c5eac74f98457 mesa-18.0.0.tar.gz
694e5c3d37717d23258c1f88bc134223c5d1aac70518d2f9134d6df3ee791eea mesa-18.0.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>Disk shader cache support for i965 when the MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false"</li>
<li>GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on r600/evergreen+</li>
<li>GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on r600/evergreen+</li>
<li>GL_ARB_shader_storage_buffer_object on r600/evergreen+</li>
<li>GL_ARB_compute_shader on r600/evergreen+</li>
<li>GL_ARB_cull_distance on r600/evergreen+</li>
<li>GL_ARB_enhanced_layouts on r600/evergreen+</li>
<li>GL_ARB_bindless_texture on nvc0/kepler</li>
<li>OpenGL 4.3 on r600/evergreen with hw fp64 support</li>
<li>Support 1 binary format for GL_ARB_get_program_binary on i965.
(For the 18.0 release, 0 formats continue to be supported in
compatibility profiles.)</li>
<li>Cannonlake support on i965 and anv</li>
</ul>
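<p>
As a sketch of the disk shader cache opt-in listed above, the helper below
builds a process environment with MESA_GLSL_CACHE_DISABLE set to one of the
two values named in the list. The helper name is hypothetical, and whether
the cache is actually used still depends on the driver in use.
</p>

```python
import os

# The two values the feature list names as enabling the i965 disk shader cache.
_ENABLING_VALUES = ("0", "false")


def env_with_shader_cache(base_env=None, value="0"):
    """Return a copy of the environment with MESA_GLSL_CACHE_DISABLE set."""
    if value not in _ENABLING_VALUES:
        raise ValueError(f"expected one of {_ENABLING_VALUES}, got {value!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_GLSL_CACHE_DISABLE"] = value
    return env
```

<p>
The returned mapping can be passed as the <code>env</code> argument of
<code>subprocess.run</code> when launching an OpenGL application, so the
setting applies to that one process rather than the whole session.
</p>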
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85564">Bug 85564</a> - Dead Island rendering issues</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97852">Bug 97852</a> - Unreal Engine corrupted preview viewport</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101560">Bug 101560</a> - SPIR-V OpSwitch with int64 not supported even though shaderInt64 is true</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101691">Bug 101691</a> - gfx corruption on windowed 3d-apps running on dGPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102264">Bug 102264</a> - Missing MESA_FORMAT_{B8G8R8A8,B8G8R8X8}_SRGB formats</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102354">Bug 102354</a> - Mesa 17.2 no longer can give SRGB-capable framebuffer on i965, even though Mesa 17.1.x does.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102358">Bug 102358</a> - WarThunder freezes at start, with activated vsync (vblank_mode=2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU hang in Valve games based on Source 1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102503">Bug 102503</a> - Report SRGB framebuffer to SuperTuxKart to workaround SuperTuxKart crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102665">Bug 102665</a> - test_glsl_to_tgsi_lifetime.cpp:53:67: error: &gt;&gt; should be &gt; &gt; within a nested template argument list</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102897">Bug 102897</a> - Separate bind points are not implemented correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102955">Bug 102955</a> - HyperZ related rendering issue in ARK: Survival Evolved</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085">Bug 103085</a> - [ivb byt hsw] piglit.spec.arb_indirect_parameters.tf-count-arrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103101">Bug 103101</a> - [SKL][bisected] DiRT Rally GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103128">Bug 103128</a> - [softpipe] piglit fs-ldexp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103283">Bug 103283</a> - drm_get_device_name_for_fd is broken on FreeBSD</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103496">Bug 103496</a> - svga_screen.c:26:46: error: git_sha1.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103513">Bug 103513</a> - [build failure] radv_shader.c:683:2: error: format not a string literal and no format arguments [-Werror=format-security]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103537">Bug 103537</a> - i965: Shadow of Mordor broken since commit 379b24a40d3d34ffdaaeb1b328f50e28ecb01468 on Haswell</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103653">Bug 103653</a> - Unreal segfault since gallium/u_threaded: avoid syncs for get_query_result</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103658">Bug 103658</a> - addrlib/gfx9/gfx9addrlib.cpp:727:50: error: expected expression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103674">Bug 103674</a> - u_queue.c:173:7: error: implicit declaration of function 'timespec_get' is invalid in C99</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103759">Bug 103759</a> - plasma desktop corrupted rendering</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103784">Bug 103784</a> - [bisected] Egl changes breaks all of EGL</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103801">Bug 103801</a> - [i965] &gt;Observer_ issue</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103808">Bug 103808</a> - [radeonsi, bisected] World of Warcraft scribbling all over screen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103902">Bug 103902</a> - Portal 2 game hangs at startup with latest mesa dev</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103904">Bug 103904</a> - Source engine-based games won't hang at start without R600_DEBUG=vs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of memfd_create follows non-static declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103942">Bug 103942</a> - KHR-GL46.enhanced_layouts.varying* regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103955">Bug 103955</a> - Using array in structure results in wrong GLSL compilation output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104005">Bug 104005</a> - [sklgt4e] GPU hangs in Car_Chase</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104141">Bug 104141</a> - include/c11/threads_posix.h:96: undefined reference to `pthread_once'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104183">Bug 104183</a> - mesa-17.3.0/src/broadcom/qpu/qpu_pack.c:171]: (error) Invalid memcmp() argument</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104199">Bug 104199</a> - [i965 bisected] BIO and EM Vision in &gt;Observer_ is broken since commit af2c320190f3c73180f1610c8df955a7fa2a4d09</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104213">Bug 104213</a> - NULL pointer access crashes on compiling Vulkan compute shaders after &quot;anv: Add support for the variablePointers feature&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104214">Bug 104214</a> - Dota crashes when switching from game to desktop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104226">Bug 104226</a> - [bisected] Anvil accesses uninitialized memory while compiling shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104231">Bug 104231</a> - DispatchSanity_test.GL30 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104246">Bug 104246</a> - Talos Principle Vulkan version crash: spirv_to_nir() returns NULL entry_point</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104271">Bug 104271</a> - i965: Timeout in dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.5</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104288">Bug 104288</a> - Steamroll needs allow_glsl_cross_stage_interpolation_mismatch=true</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104331">Bug 104331</a> - [r600g] Ogre demo &quot;TutorialUAV01&quot; crash at r600_decompress_color_images</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104338">Bug 104338</a> - NULL pointer access crash on Sacha Willems' Vulkan raytracing demo after &quot;spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104359">Bug 104359</a> - Mesa freezes in &quot;vtn_cfg_walk_blocks&quot; with Sacha Willems' hdr, parallaxmapping and specializationconstants Vulkan demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104424">Bug 104424</a> - DOOM 2016 broken by spirv OpStore validation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104490">Bug 104490</a> - [radeonsi/290x] Dota2 fails to start (can't create opengl context)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104492">Bug 104492</a> - Compute Shader: Wrong alignment when assigning struct value to structured SSBO</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104551">Bug 104551</a> - Check if Mako templates for Python are installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104677">Bug 104677</a> - radv_generate_graphics_pipeline_key reads input rate from incorrect binding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104690">Bug 104690</a> - [G33] regression: piglit.spec.!opengl 1_4.draw-batch and gl-1_4-dlist-multidrawarrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104742">Bug 104742</a> - [swrast] piglit gl-1.4-dlist-multidrawarrays regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104746">Bug 104746</a> - [swrast] piglit attribs regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104749">Bug 104749</a> - rasterizer/jitter/JitManager.cpp:252:91: error: no matching function for call to llvm::DIBuilder::createBasicType(const char [8], int, llvm::dwarf::TypeKind)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL</li>
</ul>
</div>
</body>
</html>

docs/relnotes/18.0.1.html Normal file

@@ -0,0 +1,225 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.1 Release Notes / April 18, 2018</h1>
<p>
Mesa 18.0.1 is a bug fix release which fixes bugs found since the 18.0.0 release.
</p>
<p>
Mesa 18.0.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
0c93ba892c0610f5dd87f2e2673b9445187995c395b3ddb33fd4260bfb291e89 mesa-18.0.1.tar.gz
b2d2f5b5dbaab13e15cb0dcb5ec81887467f55ebc9625945b303a3647cd87954 mesa-18.0.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>dri_util: when overriding, always reset the core version</li>
<li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>
</ul>
<p>Axel Davy (5):</p>
<ul>
<li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>
<li>st/nine: Fixes warning about implicit conversion</li>
<li>st/nine: Fix non inversible matrix check</li>
<li>st/nine: Declare lighting consts for ff shaders</li>
<li>st/nine: Do not use scratch for face register</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>ac/nir: Add workaround for GFX9 buffer views.</li>
<li>radv: Don't set instance count using predication.</li>
<li>radv: Always reset draw user SGPRs after secondary command buffer.</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv/pipeline: fail if TCS/TES compile fail</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Initialise modifier to INVALID for DRI2</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Make swrast display_sync the correct queue</li>
</ul>
<p>Dylan Baker (4):</p>
<ul>
<li>meson: don't use compiler.has_header</li>
<li>autotools: include meson_get_version</li>
<li>meson: Set .so version for xa like autotools does</li>
<li>meson: fix megadriver symlinking</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.0</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>meson/configure: detect endian.h instead of trying to guess when it's available</li>
<li>docs: fix 18.0 release note version</li>
<li>gbm: remove never-implemented function</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>mesa: Inherit texture view multi-sample information from the original texture images.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>i965/vec4: Fix null destination register in 3-source instructions</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/vars_to_ssa: Remove copies from the correct set</li>
<li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>
<li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>
<li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>
</ul>
<p>Juan A. Suarez Romero (5):</p>
<ul>
<li>cherry-ignore anv: Be more careful about fast-clear colors</li>
<li>cherry-ignore: ac/shader: fix vertex input with components.</li>
<li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>
<li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>
<li>Update version to 18.0.1</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move feedback command inside of destroy function</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965/perf: fix config registration when uploading to kernel</li>
</ul>
<p>Marc Dietrich (1):</p>
<ul>
<li>meson: fix HAVE_LLVM version define in meson build</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>
</ul>
<p>Mark Thompson (1):</p>
<ul>
<li>st/va: Enable vaExportSurfaceHandle()</li>
</ul>
<p>Rob Clark (3):</p>
<ul>
<li>nir: fix per_vertex_output intrinsic</li>
<li>freedreno/a5xx: fix page faults on last level</li>
<li>freedreno/a5xx: don't align height for PIPE_BUFFER</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: fix picking the method for resolve subpass</li>
<li>radv: fix radv_layout_dcc_compressed() when image doesn't have DCC</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965: Extend the negative 32-bit deltas to 64-bits</li>
</ul>
<p>Timothy Arceri (7):</p>
<ul>
<li>ac: add if/loop build helpers</li>
<li>radeonsi: make use of if/loop build helpers in ac</li>
<li>ac: make use of if/loop build helpers</li>
<li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>
<li>nir: fix crash in loop unroll corner case</li>
<li>gallium/pipebuffer: fix parenthesis location</li>
<li>glsl: always call do_lower_jumps() after loop unrolling</li>
</ul>
<p>Xiong, James (1):</p>
<ul>
<li>i965: return the fourcc saved in __DRIimage when possible</li>
</ul>
</div>
</body>
</html>

144
docs/relnotes/18.0.2.html Normal file

@@ -0,0 +1,144 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.2 Release Notes / April 28, 2018</h1>
<p>
Mesa 18.0.2 is a bug fix release which fixes bugs found since the 18.0.1 release.
</p>
<p>
Mesa 18.0.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
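<p>
Because the reported version depends on the driver, an application may want to
read it at run time rather than assume 4.5. The sketch below only parses a
version string that has already been fetched with glGetString(GL_VERSION); it
creates no GL context. The helper name <code>parse_gl_version</code> and the
sample strings are illustrative assumptions, not taken from Mesa.
</p>

```python
import re


def parse_gl_version(version_string):
    """Extract (major, minor) from a GL_VERSION-style string.

    Assumes the string starts with "major.minor", optionally preceded by an
    "OpenGL ES" prefix, and is followed by vendor-specific text.
    """
    match = re.match(r"(?:OpenGL ES(?:-\w+)? )?(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def supports(version_string, major, minor):
    """Return True if the reported version is at least major.minor."""
    return parse_gl_version(version_string) >= (major, minor)
```

<p>
For example, <code>supports("3.0 Mesa 18.0.2", 4, 5)</code> is false for this
hypothetical string, which is the kind of lower-than-4.5 report the paragraph
above describes.
</p>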
<h2>SHA256 checksums</h2>
<pre>
ffd8dfe3337b474a3baa085f0e7ef1a32c7cdc3bed1ad810b2633919a9324840 mesa-18.0.2.tar.gz
98fa159768482dc568b9f8bf0f36c7acb823fa47428ffd650b40784f16b9e7b3 mesa-18.0.2.tar.xz
</pre>
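<p>
A downloaded tarball can be checked against the checksums listed above. The
following is a minimal sketch using Python's standard hashlib module; the
function names and the local file path are assumptions made for illustration.
</p>

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's digest with a published checksum, ignoring case."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

<p>
Calling <code>matches_published("mesa-18.0.2.tar.xz", checksum)</code> with the
checksum string copied from the list above should return True for an intact
download, assuming the file sits in the current directory.
</p>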
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.</li>
<li>radv: Mark GTT memory as device local for APUs.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>bin/install_megadrivers: fix DESTDIR and -D*-path</li>
<li>meson: don't build classic mesa tests without dri_drivers</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>intel/compiler: Add scheduler deps for instructions that implicitly read g0</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*</li>
</ul>
<p>Johan Klokkhammer Helsing (1):</p>
<ul>
<li>st/dri: Fix dangling pointer to a destroyed dri_drawable</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.1</li>
<li>travis: radv needs LLVM 4.0</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>Update version to 18.0.2</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Fix shadow batches to be the same size as the real BO.</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix number of planes for depth &amp; stencil</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: fix texture_format_needs_swiz</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi/gfx9: fix a hang with an empty first IB</li>
<li>glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract</li>
<li>Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: fix scissor computation when using half-pixel viewport offset</li>
<li>radv/winsys: allow to submit up to 4 IBs for chips without chaining</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: free debug messages when destroying the debug state</li>
</ul>
</div>
</body>
</html>

107
docs/relnotes/18.0.3.html Normal file

@@ -0,0 +1,107 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.3 Release Notes / May 7, 2018</h1>
<p>
Mesa 18.0.3 is a bug fix release which fixes bugs found since the 18.0.2 release.
</p>
<p>
Mesa 18.0.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
58cc5c5b1ab2a44e6e47f18ef6c29836ad06f95450adce635ce3c317507a171b mesa-18.0.3.tar.gz
099d9667327a76a61741a533f95067d76ea71a656e66b91507b3c0caf1d49e30 mesa-18.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv/winsys: fix leaking resources from bo's imported by fd</li>
</ul>
<p>Boyuan Zhang (1):</p>
<ul>
<li>radeon/vcn: fix mpeg4 msg buffer settings</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>gallium/util: Fix incorrect refcounting of separate stencil.</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv/allocator: Don't shrink either end of the block pool</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.2</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>Update version to 18.0.3</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/enc: fix blit setup for YUV LoadImage</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>util/u_queue: fix a deadlock in util_queue_finish</li>
<li>radeonsi/gfx9: workaround for INTERP with indirect indexing</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>i965/tex_image: Avoid the ASTC LDR workaround on gen9lp</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: compute the number of subpass attachments correctly</li>
</ul>
</div>
</body>
</html>

157
docs/relnotes/18.0.4.html Normal file

@@ -0,0 +1,157 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.4 Release Notes / May 17, 2018</h1>
<p>
Mesa 18.0.4 is a bug fix release which fixes bugs found since the 18.0.3 release.
</p>
<p>
Mesa 18.0.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
d1dc3469faccdd73439479426952d71a9e8f684e8d03b6687063c12b13430801 mesa-18.0.4.tar.gz
1f3bcfe7cef0a5c20dae2b41df5d7e0a985e06be0183fa4d43b6068fcba2920f mesa-18.0.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100430">Bug 100430</a> - [radv] graphical glitches on dolphin emulator</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106480">Bug 106480</a> - A2B10G10R10_SNORM vertex attribute doesn't work.</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Translate logic ops.</li>
<li>radv: Fix up 2_10_10_10 alpha sign.</li>
<li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>r600: fix constant buffer bounds.</li>
<li>radv: resolve all layers in compute resolve path.</li>
<li>radv: use compute path for multi-layer images.</li>
</ul>
<p>Deepak Rawat (1):</p>
<ul>
<li>egl/x11: Send invalidate to driver on copy_region path in swap_buffer</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)</li>
</ul>
<p>Jan Vesely (8):</p>
<ul>
<li>clover: Add explicit virtual destructor to argument class</li>
<li>eg/compute: Drop reference on code_bo in destructor.</li>
<li>r600: Cleanup constant buffers on context destruction</li>
<li>eg/compute: Drop reference to kernel_param bo in destructor</li>
<li>pipe-loader: Free driver_name in error path</li>
<li>gallium/auxiliary: Add helper function to count the number of entries in hash table</li>
<li>winsys/radeon: Destroy fd_hash table when the last winsys is removed.</li>
<li>winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL</li>
</ul>
<p>Jose Maria Casanova Crespo (2):</p>
<ul>
<li>intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate</li>
<li>intel/compiler: fix brw_imm_w for negative 16-bit integers</li>
</ul>
<p>Juan A. Suarez Romero (7):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.3</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>cherry-ignore: glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug</li>
<li>cherry-ignore: mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs</li>
<li>cherry-ignore: mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT</li>
<li>cherry-ignore: radv/resolve: do fmask decompress on all layers.</li>
<li>Update version to 18.0.4</li>
</ul>
<p>Kai Wasserbäch (1):</p>
<ul>
<li>opencl: autotools: Fix linking order for OpenCL target</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Don't leak blorp on Gen4-5.</li>
</ul>
<p>Lionel Landwerlin (2):</p>
<ul>
<li>i965: require pixel scoreboard stall prior to ISP disable</li>
<li>anv: emit pixel scoreboard stall before ISP disable</li>
</ul>
<p>Matthew Nicholls (1):</p>
<ul>
<li>radv: fix multisample image copies</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>spirv: Apply OriginUpperLeft to FragCoord</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>mesa: fix error handling in get_framebuffer_parameteriv</li>
</ul>
<p>Ross Burton (1):</p>
<ul>
<li>src/intel/Makefile.vulkan.am: add missing MKDIR_GEN</li>
</ul>
</div>
</body>
</html>

162
docs/relnotes/18.0.5.html Normal file

@@ -0,0 +1,162 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.5 Release Notes / June 3, 2018</h1>
<p>
Mesa 18.0.5 is a bug fix release which fixes bugs found since the 18.0.4 release.
</p>
<p>
Mesa 18.0.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
ea3e00329cea899b1e32db812fd2f426832be37e4baa2e2fd9288a3480f30531 mesa-18.0.5.tar.gz
5187bba8d72aea78f2062d134ec6079a508e8216062dce9ec9048b5eb2c4fc6b mesa-18.0.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78097">Bug 78097</a> - glUniform1ui and friends not supported by display lists</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102390">Bug 102390</a> - centroid interpolation causes broken attribute values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105351">Bug 105351</a> - [Gen6+] piglit's arb_shader_image_load_store-host-mem-barrier fails with a glGetTexSubImage fallback path</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106090">Bug 106090</a> - Compiling compute shader crashes RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106315">Bug 106315</a> - The witness + dxvk suffers flickering garbage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106465">Bug 106465</a> - No test for Image Load/Store on format-incompatible texture buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106479">Bug 106479</a> - NDEBUG not defined for libamdgpu_addrlib</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106481">Bug 106481</a> - No test for Image Load/Store on texture buffer sized greater than MAX_TEXTURE_BUFFER_SIZE_ARB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106504">Bug 106504</a> - vulkan SPIR-V parsing failed at ../src/compiler/spirv/vtn_cfg.c:381</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106587">Bug 106587</a> - Dota2 is very dark when using vulkan render on a Intel &lt;&lt; AMD prime setup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106629">Bug 106629</a> - [SNB,IVB,HSW,BDW] dEQP-EGL.functional.image.create.gles2_cubemap_negative_z_rgb_read_pixels</li>
</ul>
<h2>Changes</h2>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965/glk: Add l3 banks count for 2x6 configuration</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>amd/addrlib: Use defines in autotools build.</li>
<li>radv: Fix SRGB compute copies.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>tgsi/scan: add hw atomic to the list of memory accessing files</li>
</ul>
<p>Francisco Jerez (4):</p>
<ul>
<li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>
<li>i965: Move buffer texture size calculation into a common helper function.</li>
<li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>
<li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>eg/compute: Use reference counting to handle compute memory pool.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>
<li>intel/blorp: Support blits and clears on surfaces with offsets</li>
</ul>
<p>Jose Dapena Paz (1):</p>
<ul>
<li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>
</ul>
<p>Juan A. Suarez Romero (8):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.4</li>
<li>cherry-ignore: i965/miptree: Fix handling of uninitialized MCS buffers</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>cherry-ignore: mesa/st: handle vert_attrib_mask in nir case too</li>
<li>cherry-ignore: Tegra is not supported</li>
<li>cherry-ignore: st/mesa: fix assertion failures with GL_UNSIGNED_INT64_ARB (v2)</li>
<li>cherry-ignore: nv30: ensure that displayable formats are marked accordingly</li>
<li>Update version to 18.0.5</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>
<li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>
<li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>dri3: Stricter SBC wraparound handling</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>i965/miptree: Zero-initialize CCS_D buffers</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>spirv: fix visiting inner loops with same break/continue block</li>
<li>radv: fix centroid interpolation</li>
</ul>
<p>Stuart Young (1):</p>
<ul>
<li>etnaviv: Fix missing rnndb file in tarballs</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: add glUniform*ui{v} support to display lists</li>
</ul>
</div>
</body>
</html>

268
docs/relnotes/18.1.0.html Normal file

@@ -0,0 +1,268 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.0 Release Notes / May 18, 2018</h1>
<p>
Mesa 18.1.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 18.1.1.
</p>
<p>
Mesa 18.1.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
b1c1dbb42597190503d3abc518b12de880623f097c6cb6c293ecf69ae87e6fbf mesa-18.1.0.tar.gz
c855c5b67ef993b7621f76d8b120769ec0415f1c3616eaff44ef7f7f300aceba mesa-18.1.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>OpenGL 3.1 with ARB_compatibility on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe, svga</li>
<li>GL_ARB_bindless_texture on nvc0/maxwell+</li>
<li>GL_ARB_transform_feedback_overflow_query on nvc0</li>
<li>GL_EXT_semaphore on radeonsi</li>
<li>GL_EXT_semaphore_fd on radeonsi</li>
<li>GL_EXT_shader_framebuffer_fetch on i965 on desktop GL (GLES was already supported)</li>
<li>GL_EXT_shader_framebuffer_fetch_non_coherent on i965</li>
<li>GL_KHR_blend_equation_advanced on radeonsi</li>
<li>Disk shader cache support for i965 enabled by default</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99549">Bug 99549</a> - pp: Failed to translate a shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100259">Bug 100259</a> - [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102905">Bug 102905</a> - [R600] Miscompilation of TGSI to VLIW causes artifacts in Gallium Nine with Crysis2 bump mapping</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104335">Bug 104335</a> - [OpenGL CTS][SKL,KBL] KHR-GL45.vertex_attrib_64bit.limits_test occasionally fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104717">Bug 104717</a> - Rocket League: grass rendering broken with nir</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104732">Bug 104732</a> - [radv] Binding descriptor sets disturbs other pipeline bindings</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104794">Bug 104794</a> - piglit.spec.arb_internalformat_query2.samples and num_sample_counts pname checks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104803">Bug 104803</a> - SIGSEGV in state_tracker/st_glsl_to_tgsi_temprename.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104863">Bug 104863</a> - 186 assertions in piglit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104908">Bug 104908</a> - Texture Compression Hint not converted to enum16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104989">Bug 104989</a> - [r600] [bisected] OpenGL applications can't render anything at all</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105026">Bug 105026</a> - glxgears asserts with pp_jimenezmlaa=1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105052">Bug 105052</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105067">Bug 105067</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105088">Bug 105088</a> - brw_nir_uniforms.cpp:256:10: error: non-constant-expression cannot be narrowed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105161">Bug 105161</a> - KHR_blend_equation_advanced doesn't work in GLSL 1.10-1.40 shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105183">Bug 105183</a> - Weird assertion in NIR linker</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105211">Bug 105211</a> - build failure after zwp_dmabuf commit if wayland-protocols is not installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105229">Bug 105229</a> - [KBL SKL BDW HSW] [Regression] KHR-GLES31.core.shader_image_load_store.advanced-sso-simple failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105238">Bug 105238</a> - ast.h:648:16: error: union member 'i' has a non-trivial constructor</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105262">Bug 105262</a> - [R600] [BISECTED] ttf fonts are invisible in many programs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105274">Bug 105274</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105444">Bug 105444</a> - Enable GL disk shader cache when transform feedback is enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105471">Bug 105471</a> - [g33] [bisected] dEQP-GLES2.functional.shaders failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105497">Bug 105497</a> - shader-db crashes on 72 core system after ast_type_qualifier bitset change</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105529">Bug 105529</a> - u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105621">Bug 105621</a> - Build failure on GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105634">Bug 105634</a> - Android build test fails when building brw_oa_metrics.c</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105737">Bug 105737</a> - st_tests_common.cpp:140:42: error: no matching function for call to 'tgsi_get_opcode_info'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105738">Bug 105738</a> - commit f7ffa504a065dc2631fd38cc5fe885b277f4e7e7 causes artifacting in radv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105740">Bug 105740</a> - glsl_types.cpp(524): error: a dynamically-initialized local static variable is not allowed inside of a statement expression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105807">Bug 105807</a> - [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105817">Bug 105817</a> - scons build broken by glSpecializeShaderARB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105820">Bug 105820</a> - [m32] piglit regressions relinking program without shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105952">Bug 105952</a> - radv causes GPU hang on SI</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105960">Bug 105960</a> - [bisected] meson build test fails with: undefined reference to `etna_pm_create_query'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106131">Bug 106131</a> - meson/ninja build missing file gtest.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106133">Bug 106133</a> - make check &quot;OSError: [Errno 24] Too many open files&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106174">Bug 106174</a> - vulkan dota2 broken (segfaulting), found bug commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106180">Bug 106180</a> - [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106450">Bug 106450</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106462">Bug 106462</a> - piglit.spec.arb_vertex_array_bgra.get regression</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Remove incomplete GLX_SGIX_swap_barrier stubs from the Xlib libGL</li>
<li>Remove incomplete GLX_SGIX_swap_group stubs from the Xlib libGL</li>
</ul>
</div>
</body>
</html>

168
docs/relnotes/18.1.1.html Normal file

@@ -0,0 +1,168 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.1 Release Notes / June 1 2018</h1>
<p>
Mesa 18.1.1 is a bug fix release which fixes bugs found since the 18.1.0 release.
</p>
<p>
Mesa 18.1.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
366a35f7530a016f2a8284fb0ee5759eeb216b4d6fa47f0e96b89ad2e43faf96 mesa-18.1.1.tar.gz
d3312a2ede5aac14a47476b208b8e3a401367838330197c4588ab8ad420d7781 mesa-18.1.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>None</p>
<h2>Changes</h2>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965/glk: Add l3 banks count for 2x6 configuration</li>
</ul>
<p>Bas Nieuwenhuizen (7):</p>
<ul>
<li>radv: Fix multiview queries.</li>
<li>radv: Translate logic ops.</li>
<li>radv: Fix up 2_10_10_10 alpha sign.</li>
<li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>
<li>amd/addrlib: Use defines in autotools build.</li>
<li>radv: Fix SRGB compute copies.</li>
<li>radv: Only expose subgroup shuffles on VI+.</li>
</ul>
<p>Christoph Haag (1):</p>
<ul>
<li>radv: fix VK_EXT_descriptor_indexing</li>
</ul>
<p>Dave Airlie (5):</p>
<ul>
<li>radv/resolve: do fmask decompress on all layers.</li>
<li>radv: resolve all layers in compute resolve path.</li>
<li>radv: use compute path for multi-layer images.</li>
<li>virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.</li>
<li>tgsi/scan: add hw atomic to the list of memory accessing files</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>docs: Add sha sums for release</li>
<li>VERSION: bump to 18.1.1 for next release</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>vulkan: don't free uninitialised memory</li>
</ul>
<p>Francisco Jerez (4):</p>
<ul>
<li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>
<li>i965: Move buffer texture size calculation into a common helper function.</li>
<li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>
<li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nv30: ensure that displayable formats are marked accordingly</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>eg/compute: Use reference counting to handle compute memory pool.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>
<li>intel/blorp: Support blits and clears on surfaces with offsets</li>
</ul>
<p>Jose Dapena Paz (1):</p>
<ul>
<li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>
</ul>
<p>Kai Wasserbäch (1):</p>
<ul>
<li>opencl: autotools: Fix linking order for OpenCL target</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>
<li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>
<li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>dri3: Stricter SBC wraparound handling</li>
</ul>
<p>Nanley Chery (4):</p>
<ul>
<li>i965: Add and use a getter for the miptree aux buffer</li>
<li>i965: Add and use a single miptree aux_buf field</li>
<li>i965/miptree: Fix handling of uninitialized MCS buffers</li>
<li>i965/miptree: Zero-initialize CCS_D buffers</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>spirv: fix visiting inner loops with same break/continue block</li>
<li>radv: fix centroid interpolation</li>
</ul>
<p>Stuart Young (1):</p>
<ul>
<li>etnaviv: Fix missing rnndb file in tarballs</li>
</ul>
<p>Thierry Reding (3):</p>
<ul>
<li>tegra: Treat resources with modifiers as scanout</li>
<li>tegra: Fix scanout resources without modifiers</li>
<li>tegra: Remove usage of non-stable UAPI</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: add glUniform*ui{v} support to display lists</li>
</ul>
</div>
</body>
</html>
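
The SHA256 checksums section of the 18.1.1 notes above lists one digest per tarball. A downloaded tarball can be checked against that list with a few lines of Python. This is only a minimal sketch and not part of Mesa: the helper names (`sha256_of_file`, `parse_checksum_lines`, `verify`) are hypothetical, and it assumes each line has the form `digest filename`, optionally preceded by a `SHA256:` label.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_lines(text):
    """Parse 'DIGEST FILENAME' lines into a {filename: digest} dict.

    A leading 'SHA256:' label is tolerated and skipped (assumption).
    Lines that do not look like a 64-character digest plus a name are ignored.
    """
    expected = {}
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].rstrip(":").upper() == "SHA256":
            fields = fields[1:]
        if len(fields) == 2 and len(fields[0]) == 64:
            expected[fields[1]] = fields[0].lower()
    return expected


def verify(path, expected):
    """Return True if the file's digest matches the entry for its base name."""
    name = Path(path).name
    return name in expected and sha256_of_file(path) == expected[name]


# The two lines below are copied from the 18.1.1 release notes.
RELEASE_NOTES_SUMS = """
366a35f7530a016f2a8284fb0ee5759eeb216b4d6fa47f0e96b89ad2e43faf96 mesa-18.1.1.tar.gz
d3312a2ede5aac14a47476b208b8e3a401367838330197c4588ab8ad420d7781 mesa-18.1.1.tar.xz
"""
EXPECTED = parse_checksum_lines(RELEASE_NOTES_SUMS)
```

With a tarball in the current directory, `verify("mesa-18.1.1.tar.xz", EXPECTED)` returns `True` only when the computed digest equals the listed one. A file whose name is not in the list is reported as not verified rather than raising an error.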

170
docs/relnotes/18.1.2.html Normal file

@@ -0,0 +1,170 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.2 Release Notes / June 15 2018</h1>
<p>
Mesa 18.1.2 is a bug fix release which fixes bugs found since the 18.1.1 release.
</p>
<p>
Mesa 18.1.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
a644df23937f4078a2bd9a54349f6315c1955f5e3a4ac272832da51dea4d3c11 mesa-18.1.2.tar.gz
070bf0648ba5b242d7303ceed32aed80842f4c0ba16e5acc1a650a46eadfb1f9 mesa-18.1.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>None</p>
<h2>Changes</h2>
<p>Alex Smith (4):</p>
<ul>
<li>radv: Consolidate GFX9 merged shader lookup logic</li>
<li>radv: Handle GFX9 merged shaders in radv_flush_constants()</li>
<li>radeonsi: Fix crash on shaders using MSAA image load/store</li>
<li>radv: Set active_stages the same whether or not shaders were cached</li>
</ul>
<p>Andrew Galante (2):</p>
<ul>
<li>meson: Test for __atomic_add_fetch in atomic checks</li>
<li>configure.ac: Test for __atomic_add_fetch in atomic checks</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.</li>
</ul>
<p>Cameron Kumar (1):</p>
<ul>
<li>vulkan/wsi: Destroy swapchain images after terminating FIFO queues</li>
</ul>
<p>Dylan Baker (6):</p>
<ul>
<li>docs/relnotes: Add sha256 sums for mesa 18.1.1</li>
<li>cherry-ignore: add commits not to pull</li>
<li>cherry-ignore: Add patches from Jason that he rebased on 18.1</li>
<li>meson: work around gentoo applying -m32 to host compiler in cross builds</li>
<li>cherry-ignore: Add another patch</li>
<li>version: bump version for 18.1.2 release</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>autotools: add missing android file to package</li>
<li>configure: radv depends on mako</li>
<li>i965: fix resource leak</li>
</ul>
<p>Jason Ekstrand (10):</p>
<ul>
<li>intel/eu: Add some brw_get_default_ helpers</li>
<li>intel/eu: Copy fields manually in brw_next_insn</li>
<li>intel/eu: Set flag [sub]register number differently for 3src</li>
<li>intel/blorp: Don't vertex fetch directly from clear values</li>
<li>intel/isl: Add bounds-checking assertions in isl_format_get_layout</li>
<li>intel/isl: Add bounds-checking assertions for the format_info table</li>
<li>i965/screen: Refactor query_dma_buf_formats</li>
<li>i965/screen: Use RGBA non-sRGB formats for images</li>
<li>anv: Set fence/semaphore types to NONE in impl_cleanup</li>
<li>i965/screen: Return false for unsupported formats in query_modifiers</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>mesa/program_binary: add implicit UseProgram after successful ProgramBinary</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>glsl: Add ir_binop_vector_extract in NIR</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Fix batch-last mode to properly swap BOs.</li>
<li>anv: Disable __gen_validate_value if NDEBUG is set.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>r300g/swtcl: make pipe_context uploaders use malloc'd memory as before</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>meson: Fix -latomic check</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>glx: Fix number of property values to read in glXImportContextEXT</li>
</ul>
<p>Nicolas Boichat (1):</p>
<ul>
<li>configure.ac/meson.build: Fix -latomic test</li>
</ul>
<p>Philip Rebohle (1):</p>
<ul>
<li>radv: Use correct color format for fast clears</li>
</ul>
<p>Samuel Pitoiset (3):</p>
<ul>
<li>radv: fix a GPU hang when MRTs are sparse</li>
<li>radv: fix missing ZRANGE_PRECISION(1) for GFX9+</li>
<li>radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold</li>
</ul>
<p>Scott D Phillips (1):</p>
<ul>
<li>intel/tools: add intel_sanitize_gpu to EXTRA_DIST</li>
</ul>
<p>Thomas Petazzoni (1):</p>
<ul>
<li>configure.ac: rework -latomic check</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>ac: fix possible truncation of intrinsic name</li>
<li>radeonsi: fix possible truncation on renderer string</li>
</ul>
</div>
</body>
</html>
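
The version paragraph in the notes above says that the value reported by glGetString(GL_VERSION) depends on the driver and on how the context was created. An application that needs a particular version can therefore parse the reported string instead of assuming 4.5. The sketch below assumes the string has already been read from a live context. The function names are hypothetical, and the sample strings in the comments are illustrative and were not captured from a real driver. It relies only on the layout the OpenGL specifications define: desktop strings start with `major.minor`, and OpenGL ES strings start with an `OpenGL ES` prefix followed by `major.minor`.

```python
import re

# "major.minor" at the start of the string, optionally preceded by the
# "OpenGL ES" prefix (or "OpenGL ES-CM" / "OpenGL ES-CL" on ES 1.x).
# Everything after the two numbers is vendor-specific and is ignored.
_VERSION_RE = re.compile(r"^(?:OpenGL ES(?:-\w+)? )?(\d+)\.(\d+)")


def parse_gl_version(version_string):
    """Return (major, minor) parsed from a GL_VERSION string."""
    match = _VERSION_RE.match(version_string)
    if match is None:
        raise ValueError("unrecognised GL_VERSION string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required=(4, 5)):
    """Return True if the reported version is at least `required`."""
    return parse_gl_version(version_string) >= required


# Illustrative inputs (assumed, not real driver output):
#   "4.5 (Core Profile) Mesa 18.1.2"  parses to (4, 5)
#   "3.0 Mesa 18.1.2"                 parses to (3, 0)
#   "OpenGL ES 3.2 Mesa 18.1.2"       parses to (3, 2)
```

Comparing `(major, minor)` tuples avoids the usual mistake of comparing version strings lexically. Note that `supports` does not distinguish desktop GL from OpenGL ES, so a caller that cares about the API flavour has to check the prefix separately.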

167
docs/relnotes/18.1.3.html Normal file

@@ -0,0 +1,167 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.3 Release Notes / June 29 2018</h1>
<p>
Mesa 18.1.3 is a bug fix release which fixes bugs found since the 18.1.2 release.
</p>
<p>
Mesa 18.1.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
2a1e36280d01ad18ba6d5b3fbd653ceaa109eaa031b78eb5dfaa4df452742b66 mesa-18.1.3.tar.gz
54f08deeda0cd2f818e8d40140040ed013de7852573002453b7f50da9ea738ce mesa-18.1.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105396">Bug 105396</a> - tc compatible htile sets depth of htiles of discarded fragments to 1.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105399">Bug 105399</a> - [snb] GPU hang: after geometry shader emits no geometry, the program hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106756">Bug 106756</a> - Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106774">Bug 106774</a> - GLSL IR copy propagates loads of SSBOs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106903">Bug 106903</a> - radv: Fragment shader output goes to wrong attachments when render targets are sparse</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106907">Bug 106907</a> - Correct Transform Feedback Varyings information is expected after using ProgramBinary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106912">Bug 106912</a> - radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106980">Bug 106980</a> - Basemark GPU vulkan benchmark fails.</li>
</ul>
<h2>Changes</h2>
<p>Andrii Simiklit (1):</p>
<ul>
<li>i965/gen6/gs: Handle case where a GS doesn't allocate VUE</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>radv: Fix output for sparse MRTs.</li>
<li>ac/surface: Set compressZ for stencil-only surfaces.</li>
</ul>
<p>Christian Gmeiner (1):</p>
<ul>
<li>util/bitset: include util/macro.h</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>glsl: allow standalone semicolons outside main()</li>
</ul>
<p>Dylan Baker (8):</p>
<ul>
<li>docs: Add release notes for 18.1.2</li>
<li>cherry-ignore: Add 587e712eda95c31d88ea9d20e59ad0ae59afef4f</li>
<li>meson: Fix auto option for va</li>
<li>meson: Fix auto option for xvmc</li>
<li>meson: Correct behavior of vdpau=auto</li>
<li>cherry-ignore: Ignore cac7ab1192eefdd8d8b3f25053fb006b5c330eb8</li>
<li>cherry-ignore: add a2f5292c82ad07731d633b36a663e46adc181db9</li>
<li>VERSION: bump version to 18.1.3</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>configure: use compliant grep regex checks</li>
<li>glsl/tests/glcpp: reinstate "error out if no tests found"</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>radv: fix reported number of available VGPRs</li>
<li>radv: fix bitwise check</li>
<li>meson: fix i965/anv/isl genX static lib names</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>glsl: Don't copy propagate from SSBO or shared variables either</li>
<li>glsl: Don't copy propagate elements from SSBO or shared variables either</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir: Handle call instructions in foreach_src</li>
<li>nir/validate: Use the type from the tail of call parameter derefs</li>
</ul>
<p>Lukas Rusak (2):</p>
<ul>
<li>meson: only build vl_winsys_dri.c when x11 platform is used</li>
<li>meson: fix private libs when building without glx</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers</li>
<li>ac/gpu_info: report real total memory sizes</li>
<li>ac/gpu_info: add kernel_flushes_hdp_before_ib</li>
<li>radeonsi: always put persistent buffers into GTT on radeon</li>
<li>mesa: fix glGetInteger64v for arrays of integers</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno/ir3: fix base_vertex</li>
</ul>
<p>Samuel Pitoiset (6):</p>
<ul>
<li>radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8</li>
<li>radv: update the ZRANGE_PRECISION value for the TC-compat bug</li>
<li>radv: fix emitting the TCS regs on GFX9</li>
<li>radv: fix HTILE metadata initialization in presence of subpass clears</li>
<li>radv: ignore pInheritanceInfo for primary command buffers</li>
<li>radv: use separate bind points for the dynamic buffers</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>glsl: serialize data from glTransformFeedbackVaryings</li>
</ul>
<p>Tomeu Vizoso (1):</p>
<ul>
<li>virgl: Remove debugging left-overs</li>
</ul>
</div>
</body>
</html>

150
docs/relnotes/18.1.4.html Normal file

@@ -0,0 +1,150 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.4 Release Notes / July 13 2018</h1>
<p>
Mesa 18.1.4 is a bug fix release which fixes bugs found since the 18.1.3 release.
</p>
<p>
Mesa 18.1.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
8acd42e4ac4d1e96ed22344073b3d4fef03d10f225f4eaf3f88c001dfc10e2db mesa-18.1.4.tar.gz
3061488b5d85504092cf4343816cfb2d96f2ad9bc2edec31fc96933d184cf58b mesa-18.1.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106906">Bug 106906</a> - Failed to recongnize keyword “sampler2DRect” and &quot;sampler2DRectShadow&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106928">Bug 106928</a> - When starting a match Rocket League crashes on &quot;Go&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107193">Bug 107193</a> - piglit.spec.arb_compute_shader.linker.bug-93840 fails</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (1):</p>
<ul>
<li>glx: Don't allow glXMakeContextCurrent() with only one valid drawable</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600/sb: cleanup if_conversion iterator to be legal C++</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>docs: Add SHA256 sums to notes for 18.1.3</li>
<li>Bump version for release</li>
</ul>
<p>Iago Toral Quiroga (3):</p>
<ul>
<li>anv/cmd_buffer: make descriptors dirty when emitting base state address</li>
<li>anv/cmd_buffer: clean dirty push constants flag after emitting push constants</li>
<li>anv/cmd_buffer: never shrink the push constant buffer size</li>
</ul>
<p>Ian Romanick (4):</p>
<ul>
<li>i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible</li>
<li>intel/compiler: Relax mixed type restriction for saturating immediates</li>
<li>i965/vec4: Properly handle sign(-abs(x))</li>
<li>i965/fs: Properly handle sign(-abs(x))</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>intel/fs: Split instructions low to high in lower_simd_width</li>
<li>anv: Be more careful about hashing pipeline layouts</li>
<li>intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN</li>
</ul>
<p>Jose Maria Casanova Crespo (3):</p>
<ul>
<li>i965/fs: Register allocator shoudn't use grf127 for sends dest</li>
<li>intel/compiler: grf127 can not be dest when src and dest overlap in send</li>
<li>i965/fs: unspills shoudn't use grf127 as dest since Gen8+</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: fix clear color bo address relocation</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2</li>
<li>glsl/cache: save and restore ExternalSamplersUsed</li>
<li>st/dri: fix a crash in server_wait_sync</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>i965: Fix output register sizes when variable ranges are interleaved</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>nvc0/ir: fix TargetNVC0::insnCanLoadOffset()</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>r600/sb: fix crash in fold_alu_op3</li>
</ul>
<p>Ross Burton (1):</p>
<ul>
<li>egl: fix build race in automake</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix emitting the view index on GFX9</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>glsl: skip comparison opt when adding vars of different size</li>
<li>nir: fix selection of loop terminator when two or more have the same limit</li>
</ul>
<p>zhaowei yuan (1):</p>
<ul>
<li>glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2</li>
</ul>
</div>
</body>
</html>

183
docs/relnotes/18.1.5.html Normal file

@@ -0,0 +1,183 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.5 Release Notes / July 27 2018</h1>
<p>
Mesa 18.1.5 is a bug fix release which fixes bugs found since the 18.1.4 release.
</p>
<p>
Mesa 18.1.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
f966d5d5d373a5b8a16ed5036c1e7f05d4ad46d130f793bf9782c3ac9133a02e mesa-18.1.5.tar.gz
69dbe6f1a6660386f5beb85d4fcf003ee23023ed7b9a603de84e9a37e8d98dea mesa-18.1.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103274">Bug 103274</a> - BRW allocates too much heap memory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107275">Bug 107275</a> - NIR segfaults after spirv-opt</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107295">Bug 107295</a> - Access violation on glDrawArrays with count &gt;= 2048</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107312">Bug 107312</a> - Mesa-git RPM build fails after commit 8cacf38f527d42e41441ef8c25d95d4b2f4e8602</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107366">Bug 107366</a> - NIR verification crashes on piglit tests</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BIT</li>
</ul>
<p>Bas Nieuwenhuizen (7):</p>
<ul>
<li>radv: Select correct entries for binning.</li>
<li>radv: Fix number of samples used for binning.</li>
<li>radv: Disable disabled color buffers in rbplus opts.</li>
<li>nir: Do not use continue block after removing it.</li>
<li>util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.</li>
<li>nir: Fix end of function without return warning/error.</li>
<li>radv: Still enable inmemory &amp; API level caching if disk cache is not enabled.</li>
</ul>
<p>Chad Versace (2):</p>
<ul>
<li>anv/android: Fix type error in call to vk_errorf()</li>
<li>anv/android: Fix Autotools build for VK_ANDROID_native_buffer</li>
</ul>
<p>Chih-Wei Huang (1):</p>
<ul>
<li>Android: fix a missing nir_intrinsics.h error</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>i965: Sweep NIR after linking phase to free held memory</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600: enable tess_input_info for TES</li>
</ul>
<p>Dylan Baker (5):</p>
<ul>
<li>docs: Add sha256 sums for 18.1.4 tarballs</li>
<li>cherry-ignore: add 4a67ce886a7b3def5f66c1aedf9e5436d157a03c</li>
<li>cherry-ignore: Add 1f616a840eac02241c585d28e9dac8f19a297f39</li>
<li>cherry-ignore: add 11712b9ca17e4e1a819dcb7d020e19c6da77bc90</li>
<li>bump version to 18.1.5</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.</li>
<li>meson: Move xvmc test tools from unit tests to installed tools.</li>
</ul>
<p>Harish Krupo (1):</p>
<ul>
<li>egl: Fix missing clamping in eglSetDamageRegionKHR</li>
</ul>
<p>Jan Vesely (3):</p>
<ul>
<li>radeonsi: Refuse to accept code with unhandled relocations</li>
<li>clover: Report error when pipe driver fails to create compute state</li>
<li>clover: Catch errors from executing event action</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV</li>
<li>nir/serialize: Alloc constants off the variable</li>
<li>blorp: Handle the RGB workaround more like other workarounds</li>
<li>intel/blorp: Handle 3-component formats in clears</li>
<li>intel/compiler: Account for built-in uniforms in analyze_ubo_ranges</li>
<li>spirv: Fix a couple of image atomic load/store bugs</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>gallium/tests: Don't ignore S3TC errors.</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nir: fix printing of vec16 type</li>
</ul>
<p>Lepton Wu (1):</p>
<ul>
<li>virgl: Fix flush in virgl_encoder_inline_write.</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>st/mesa: call resource_changed when binding a EGLImage to a texture</li>
</ul>
<p>Mauro Rossi (2):</p>
<ul>
<li>radv: winsys/amdgpu: include missing pthread.h header</li>
<li>android: util/disk_cache: fix building errors in gallium drivers</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>gallium: Check pipe_screen::resource_changed before dereferencing it</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>draw: force draw pipeline if there's more than 65535 vertices</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>anv: fix assert in anv_CmdBindDescriptorSets()</li>
</ul>
<p>Samuel Pitoiset (3):</p>
<ul>
<li>radv: make sure to wait for CP DMA when needed</li>
<li>radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9</li>
<li>radv: fix a memleak for merged shaders on GFX9</li>
</ul>
</div>
</body>
</html>

188
docs/relnotes/18.1.6.html Normal file

@@ -0,0 +1,188 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.6 Release Notes / August 13 2018</h1>
<p>
Mesa 18.1.6 is a bug fix release which fixes bugs found since the 18.1.5 release.
</p>
<p>
Mesa 18.1.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
580e03328ffefe1fd43b19ab7669f20d931601a1c0a4c0f8b9c65d6e81a06df3 mesa-18.1.6.tar.gz
bb7ce759069801804fcfb8152da3457f76cd7b4e0096e4870ff5adcb5c894289 mesa-18.1.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728">Bug 13728</a> - [G965] Some objects in Neverwinter Nights Linux version not displayed correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98699">Bug 98699</a> - &quot;float[a+++4 ? 1:1] f;&quot; crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99730">Bug 99730</a> - Metro Redux game(s) needs override for midshader extension declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106382">Bug 106382</a> - Shader cache breaks INTEL_DEBUG=shader_time</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107117">Bug 107117</a> - mesa-18.1: regression with TFP on intel with modesettings and glamor acceleration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212">Bug 107212</a> - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (1):</p>
<ul>
<li>glx: GLX_MESA_multithread_makecurrent is direct-only</li>
</ul>
<p>Andres Gomez (3):</p>
<ul>
<li>ddebug: use util_snprintf() in dd_get_debug_filename_and_mkdir</li>
<li>gallium/aux/util: use util_snprintf() in test_texture_barrier</li>
<li>glsl: use util_snprintf()</li>
</ul>
<p>Christian Gmeiner (1):</p>
<ul>
<li>etnaviv: fix typo in query names</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600: reduce num compute threads to 1024.</li>
</ul>
<p>Dylan Baker (6):</p>
<ul>
<li>docs: Add sha-256 sums for 18.1.5</li>
<li>nir/meson: fix c vs cpp args for nir test</li>
<li>gallium: fix ddebug on windows</li>
<li>cherry-ignore: add patches that get-pick-list is finding in error</li>
<li>cherry-ignore: Add some additional patches that are for 18.2</li>
<li>bump version to 18.1.6</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>swr: don't export swr_create_screen_internal</li>
<li>automake: require shared glapi when using DRI based libGL</li>
<li>autotools: error out when using the broken --with-{gl, osmesa}-lib-name</li>
<li>autotools: error out when building with mangling and glvnd</li>
<li>autotools: use correct gl.pc LIBS when using glvnd</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>vc4: Fix a leak of the no-vertex-elements workaround BO.</li>
<li>vc4: Respect a sampler view's first_layer field.</li>
<li>vc4: Ignore samplers for finding uniform offsets.</li>
<li>egl: Fix leak of X11 pixmaps backing pbuffers in DRI3.</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>meson, install_megadrivers: Also remove stale symlinks</li>
</ul>
<p>Jan Vesely (2):</p>
<ul>
<li>clover: Reduce wait_count in abort path.</li>
<li>clover: Don't extend illegal integer types.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir: Take if uses into account in ssa_def_components_read</li>
<li>i965/fs: Flag all slots of a flat input as flat</li>
</ul>
<p>Jon Turney (1):</p>
<ul>
<li>meson: use correct keyword to fix a meson warning</li>
</ul>
<p>Jordan Justen (2):</p>
<ul>
<li>i965, anv: Use INTEL_DEBUG for disk_cache driver flags</li>
<li>i965: Disable shader cache with INTEL_DEBUG=shader_time</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>wayland/egl: update surface size on window resize</li>
<li>wayland/egl: initialize window surface size to window size</li>
</ul>
<p>Karol Herbst (2):</p>
<ul>
<li>nir/lower_int64: mark all metadata as dirty</li>
<li>nvc0/ir: return 0 in imageLoad on incomplete textures</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>ac/surface: fix MSAA corruption on Vega due to FMASK tile swizzle</li>
</ul>
<p>Mauro Rossi (2):</p>
<ul>
<li>radv: generate entrypoints for VK_ANDROID_native_buffer</li>
<li>radv: move vk_format_table.c to generated sources</li>
</ul>
<p>Olivier Fourdan (1):</p>
<ul>
<li>dri3: For 1.2, use root window instead of pixmap drawable</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>glsl: handle error case with ast_post_inc, ast_post_dec</li>
</ul>
<p>Vlad Golovkin (1):</p>
<ul>
<li>swr: Remove unnecessary memset call</li>
</ul>
<p>vadym.shovkoplias (1):</p>
<ul>
<li>drirc: Allow extension midshader for Metro Redux</li>
</ul>
</div>
</body>
</html>

104
docs/relnotes/18.1.7.html Normal file

@@ -0,0 +1,104 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.7 Release Notes / August 24 2018</h1>
<p>
Mesa 18.1.7 is a bug fix release which fixes bugs found since the 18.1.6 release.
</p>
<p>
Mesa 18.1.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
0c3c240bcd1352d179e65993214f9d55a399beac852c3ab4433e8df9b6c51c83 mesa-18.1.7.tar.gz
655e3b32ce3bdddd5e6e8768596e5d4bdef82d0dd37067c324cc4b2daa207306 mesa-18.1.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105975">Bug 105975</a> - i965 always reports 0 viewport subpixel bits</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107098">Bug 107098</a> - Segfault after munmap(kms_sw_dt-&gt;ro_mapped)</li>
</ul>
<h2>Changes</h2>
<p>Alexander Tsoy (1):</p>
<ul>
<li>meson: fix build for egl platform_x11 without dri3 and gbm</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Fix missing Android platform define.</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>i965: Advertise 8 bits subpixel precision for viewport bounds on gen6+</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600/eg: rework atomic counter emission with flushes</li>
</ul>
<p>Dylan Baker (7):</p>
<ul>
<li>docs: Add sha256 sums for 18.1.6</li>
<li>cherry-ignore: Add additional 18.2 only patches</li>
<li>cherry-ignore: Add more 18.2 patches</li>
<li>cherry-ignore: Add more 18.2 patches</li>
<li>cherry-ignore: Add a couple of patches with &gt; 1 fixes tags</li>
<li>cherry-ignore: more 18.2 patches</li>
<li>bump version for 18.1.7 release</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>intel: Switch the order of the 2x MSAA sample positions</li>
<li>anv/lower_ycbcr: Use the binding array size for bounds checks</li>
</ul>
<p>Ray Strode (1):</p>
<ul>
<li>gallium/winsys/kms: don't unmap what wasn't mapped</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv/winsys: fix creating the BO list for virtual buffers</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>radv: add Doom workaround</li>
</ul>
</div>
</body>
</html>

180
docs/relnotes/18.1.8.html Normal file

@@ -0,0 +1,180 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.8 Release Notes / September 7 2018</h1>
<p>
Mesa 18.1.8 is a bug fix release which fixes bugs found since the 18.1.7 release.
</p>
<p>
Mesa 18.1.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
8ec62f215dd1bb3910987f9941c6fc31632a0874e618815cf1e8e29445c86e0a mesa-18.1.8.tar.gz
bd1be67fe9c73b517765264ac28911c84144682d28dbff140e1c2deb2f44c21b mesa-18.1.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93355">Bug 93355</a> - [BXT,SKLGT4e] intermittent ext_framebuffer_multisample.accuracy fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101247">Bug 101247</a> - Mesa fails to link GLSL programs with unused output blocks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104809">Bug 104809</a> - anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to not having depthBoundsTest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105904">Bug 105904</a> - Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106738">Bug 106738</a> - No test for miptrees with DRI modifiers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106865">Bug 106865</a> - [GLK] piglit.spec.ext_framebuffer_multisample.accuracy stencil tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107359">Bug 107359</a> - [Regression] [bisected] [OpenGL CTS] [SKL,BDW] KHR-GL46.texture_barrier*-texels, GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners, and GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners fail with some configuration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107477">Bug 107477</a> - [DXVK] Setting high shader quality in GTA V results in LLVM error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107579">Bug 107579</a> - [SNB] The graphic corruption when we reuse the GS compiled and used for TFB when statebuffer contain magic trash in the unused space</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107601">Bug 107601</a> - Rise of the Tomb Raider Segmentation Fault when the game starts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107760">Bug 107760</a> - GPU Hang when Playing DiRT 3 Complete Edition using Steam Play with DXVK</li>
</ul>
<h2>Changes</h2>
<p>Andrii Simiklit (1):</p>
<ul>
<li>i965/gen6/xfb: handle case where transform feedback is not active</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Add missing checks in radv_get_image_format_properties.</li>
<li>radv: Fix CMASK dimensions.</li>
<li>radv: Use a lower max offchip buffer count.</li>
</ul>
<p>Christian Gmeiner (1):</p>
<ul>
<li>tegra: fix memory leak</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Don't expose sRGB formats to clients</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>ac/radeonsi: fix CIK copy max size</li>
</ul>
<p>Dylan Baker (10):</p>
<ul>
<li>docs: Add mesa 18.1.7 notes</li>
<li>cherry-ignore: add a patch</li>
<li>cherry-ignore: Add more 18.2 only patches</li>
<li>meson: Actually load translation files</li>
<li>cherry-ignore: Add more 18.2 patches</li>
<li>cherry-ignore: Add additional patch</li>
<li>cherry-ignore: Add patch that doesn't apply to 18.1</li>
<li>cherry-ignore: Add a couple of two fixes warning patches</li>
<li>cherry-ignore: Add patch that needs more significant patches to function</li>
<li>Bump version to 18.1.8</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: update required mako version</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>radv: place pointer length into cache uuid</li>
</ul>
<p>Gurchetan Singh (2):</p>
<ul>
<li>meson: fix egl build for surfaceless</li>
<li>meson: fix egl build for android</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>i965/vec4: Clamp indirect tes input array reads with 0x0fffffff</li>
<li>i965/vec4: Correctly handle uniform sources in generate_tes_add_indirect_urb_offset</li>
</ul>
<p>Jason Ekstrand (5):</p>
<ul>
<li>anv: Fill holes in the VF VUE to zero</li>
<li>nir/algebraic: Be more careful converting ushr to extract_u8/16</li>
<li>egl/dri2: Add a helper for the number of planes for a FOURCC format</li>
<li>egl/dri2: Guard against invalid fourcc formats</li>
<li>anv/blorp: Do more flushing around HiZ clears</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>egl/wayland: do not leak wl_buffer when it is locked</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: blorp: support multiple aspect blits</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>glapi: actually implement GL_EXT_robustness for GLES</li>
</ul>
<p>Nanley Chery (7):</p>
<ul>
<li>intel/isl: Avoid tiling some 16K-wide render targets</li>
<li>i965: Make blt_pitch public</li>
<li>i965/miptree: Drop an if case from retile_as_linear</li>
<li>i965/miptree: Use the correct BLT pitch</li>
<li>i965/miptree: Use miptree_map in map_blit functions</li>
<li>i965/miptree: Fix can_blit_slice()</li>
<li>i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix passing clip/cull distances from VS to PS</li>
</ul>
<p>vadym.shovkoplias (1):</p>
<ul>
<li>glsl/linker: Allow unused in blocks which are not declated on previous stage</li>
</ul>
</div>
</body>
</html>

178
docs/relnotes/18.1.9.html Normal file

@@ -0,0 +1,178 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.9 Release Notes / September 24 2018</h1>
<p>
Mesa 18.1.9 is a bug fix release which fixes bugs found since the 18.1.8 release.
</p>
<p>
Mesa 18.1.9 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
0f825dc834b1b3e3d9a6c3ce58b42977f0d9a248a7627a36dd3b313ffe41a499 mesa-18.1.9.tar.gz
55f5778d58a710a63d6635f000535768faf7db9e8144dc0f4fd1989f936c1a83 mesa-18.1.9.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103241">Bug 103241</a> - Anv crashes when using 64-bit vertex inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104926">Bug 104926</a> - swrast: Mesa 17.3.3 produces: HW cursor for format 875713089 not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107280">Bug 107280</a> - [DXVK] Batman: Arkham City with tessellation enabled hangs on SKL GT4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107772">Bug 107772</a> - Mesa preprocessor matches if(def)s &amp; endifs incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107779">Bug 107779</a> - Access violation with some games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107810">Bug 107810</a> - The 'va_end' call is missed after 'va_copy' in 'util_vsnprintf' function under windows</li>
</ul>
<h2>Changes</h2>
<p>Andrii Simiklit (4):</p>
<ul>
<li>apple/glx/log: added missing va_end() after va_copy()</li>
<li>mesa/util: don't use the same 'va_list' instance twice</li>
<li>mesa/util: don't ignore NULL returned from 'malloc'</li>
<li>mesa/util: add missing va_end() after va_copy()</li>
</ul>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Use build ID if available for cache UUID.</li>
<li>radv: Only allow 16 user SGPRs for compute on GFX9+.</li>
<li>radv: Set the user SGPR MSB for Vega.</li>
<li>radv: Fix driver UUID SHA1 init.</li>
</ul>
<p>Christopher Egert (1):</p>
<ul>
<li>radeon: fix ColorMask</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>virgl: don't send a shader create with no data. (v2)</li>
</ul>
<p>Dylan Baker (10):</p>
<ul>
<li>docs/relnotes: Add sha256 sums for mesa 18.1.8</li>
<li>cherry-ignore: Add additional 18.2 patch</li>
<li>meson: Print a message about why a libdrm version was selected</li>
<li>cherry-ignore: add another 18.2 patch</li>
<li>cherry-ignore: Add patches that don't apply cleanly and are for developer tools</li>
<li>cherry-ignore: Add more 18.2 patches</li>
<li>cherry-ignore: add 18.2 patchs</li>
<li>cherry-ignore: add a patch that was reverted on master</li>
<li>cherry-ignore: one final update</li>
<li>Bump version to 18.1.9</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>winsys/virgl: avoid unintended behavior</li>
<li>virgl: adjust strides when mapping temp-resources</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>winsys/virgl: correct resource and handle allocation (v2)</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>anv/pipeline: Only consider double elements which actually exist</li>
<li>i965: Workaround the gen9 hw astc5x5 sampler bug</li>
<li>anv: Re-emit vertex buffers when the pipeline changes</li>
<li>anv: Disable the vertex cache when tessellating on SKL GT4</li>
<li>anv: Clamp scissors to the framebuffer boundary</li>
<li>anv/query: Write both dwords in emit_zero_queries</li>
</ul>
<p>Josh Pieper (1):</p>
<ul>
<li>st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)</li>
</ul>
<p>Kenneth Feng (1):</p>
<ul>
<li>amd: Add Picasso device id</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>st/mesa: help fix stencil border color for GL_DEPTH_STENCIL textures</li>
<li>radeonsi: fix HTILE for NPOT textures with mipmapping on SI/CI</li>
<li>r600: fix HTILE for NPOT textures with mipmapping</li>
<li>radeonsi: fix printing a BO list into ddebug reports</li>
</ul>
<p>Mathias Fröhlich (1):</p>
<ul>
<li>tnl: Fix green gun regression in xonotic.</li>
</ul>
<p>Mauro Rossi (3):</p>
<ul>
<li>android: broadcom/genxml: fix collision with intel/genxml header-gen macro</li>
<li>android: broadcom/cle: add gallium include path</li>
<li>android: broadcom/cle: export the broadcom top level path headers</li>
</ul>
<p>Michal Srb (1):</p>
<ul>
<li>st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>loader/dri3: Only wait for back buffer fences in dri3_get_buffer</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>nvir: Always split 64-bit IMAD/IMUL operations</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>intel: compiler option msse2 and mstackrealign</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>glsl: fixer lexer for unreachable defines</li>
</ul>
</div>
</body>
</html>

284
docs/relnotes/18.2.0.html Normal file

@@ -0,0 +1,284 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.0 Release Notes / September 7, 2018</h1>
<p>
Mesa 18.2.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 18.2.1.
</p>
<p>
Mesa 18.2.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
libwayland-egl is now distributed by Wayland (since 1.15,
<a href="https://lists.freedesktop.org/archives/wayland-devel/2018-April/037767.html">see announcement</a>),
and has been removed from Mesa in this release. Make sure you're using
an up-to-date version of Wayland to keep the functionality.
</p>
<h2>SHA256 checksums</h2>
<pre>
b9e6bb3eb7660b0726ba28405ffa0cb77de619e925b910b72f4d7a85c0098596 mesa-18.2.0.tar.gz
22452bdffff8e11bf4284278155a9f77cb28d6d73a12c507f1490732d0d9ddce mesa-18.2.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>OpenGL 4.3 on virgl</li>
<li>OpenGL 4.4 Compatibility profile on radeonsi</li>
<li>OpenGL ES 3.2 on radeonsi and virgl</li>
<li>GL_ARB_ES3_2_compatibility on radeonsi</li>
<li>GL_ARB_fragment_shader_interlock on i965</li>
<li>GL_ARB_sample_locations and GL_NV_sample_locations on nvc0 (GM200+)</li>
<li>GL_ANDROID_extension_pack_es31a on radeonsi</li>
<li>GL_KHR_texture_compression_astc_ldr on radeonsi</li>
<li>GL_NV_conservative_raster and GL_NV_conservative_raster_dilate on nvc0 (GM200+)</li>
<li>GL_NV_conservative_raster_pre_snap_triangles on nvc0 (GP102+)</li>
<li>multisampled images on nvc0 (GM107+) (now supported on GF100+)</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728">Bug 13728</a> - [G965] Some objects in Neverwinter Nights Linux version not displayed correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61761">Bug 61761</a> - glPolygonOffsetEXT, OFFSET_BIAS incorrectly set to a huge number</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=65422">Bug 65422</a> - Rename api_validate.[ch] to draw_validate.[ch]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78097">Bug 78097</a> - glUniform1ui and friends not supported by display lists</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93355">Bug 93355</a> - [BXT,SKLGT4e] intermittent ext_framebuffer_multisample.accuracy fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98699">Bug 98699</a> - &quot;float[a+++4 ? 1:1] f;&quot; crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99116">Bug 99116</a> - Wine DirectDraw programs showing only a blackscreen when using Mesa Gallium drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99730">Bug 99730</a> - Metro Redux game(s) needs override for midshader extension declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100177">Bug 100177</a> - [GM206] Misrendering in XCOM Ennemy Within</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100430">Bug 100430</a> - [radv] graphical glitches on dolphin emulator</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101247">Bug 101247</a> - Mesa fails to link GLSL programs with unused output blocks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102390">Bug 102390</a> - centroid interpolation causes broken attribute values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102678">Bug 102678</a> - gl_BaseVertex should always be zero when the draw command has no &lt;basevertex&gt; parameter</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103274">Bug 103274</a> - BRW allocates too much heap memory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104388">Bug 104388</a> - [snb] GPU HANG: ecode 6:0:0x85fffff8 in fgfs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104626">Bug 104626</a> - broadcom/vc5: double compare</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104809">Bug 104809</a> - anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to not having depthBoundsTest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105351">Bug 105351</a> - [Gen6+] piglit's arb_shader_image_load_store-host-mem-barrier fails with a glGetTexSubImage fallback path</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105396">Bug 105396</a> - tc compatible htile sets depth of htiles of discarded fragments to 1.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105399">Bug 105399</a> - [snb] GPU hang: after geometry shader emits no geometry, the program hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105497">Bug 105497</a> - shader-db crashes on 72 core system after ast_type_qualifier bitset change</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105613">Bug 105613</a> - Compute shader locks up within nested &quot;for&quot; loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105731">Bug 105731</a> - linker error &quot;fragment shader input ... has no matching output in the previous stage&quot; when previous stage's output declaration in a separate shader object</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105904">Bug 105904</a> - Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105975">Bug 105975</a> - i965 always reports 0 viewport subpixel bits</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106090">Bug 106090</a> - Compiling compute shader crashes RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106133">Bug 106133</a> - make check &quot;OSError: [Errno 24] Too many open files&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106163">Bug 106163</a> - r600/sb: optimizer tries to schedule access to different array elements in one instruction group</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106174">Bug 106174</a> - vulkan dota2 broken (segfaulting), found bug commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106180">Bug 106180</a> - [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106232">Bug 106232</a> - LLVM unit tests have error in random number handling</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106315">Bug 106315</a> - The witness + dxvk suffers flickering garbage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106331">Bug 106331</a> - radv doesnt support VK_FORMAT_R32G32B32_SFLOAT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106382">Bug 106382</a> - Shader cache breaks INTEL_DEBUG=shader_time</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106393">Bug 106393</a> - glsl-fs-shader-stencil-export hangs forever</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106450">Bug 106450</a> - glGetIntegerv return wrong value in some cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106462">Bug 106462</a> - piglit.spec.arb_vertex_array_bgra.get regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106479">Bug 106479</a> - NDEBUG not defined for libamdgpu_addrlib</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106480">Bug 106480</a> - A2B10G10R10_SNORM vertex attribute doesn't work.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106499">Bug 106499</a> - [regression, bisected] Several games crash on start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106504">Bug 106504</a> - vulkan SPIR-V parsing failed at ../src/compiler/spirv/vtn_cfg.c:381</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106511">Bug 106511</a> - radv: MSAA broken on SI (assertion failure in vkCreateImage)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106587">Bug 106587</a> - Dota2 is very dark when using vulkan render on a Intel &lt;&lt; AMD prime setup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106594">Bug 106594</a> - [regression,apitrace,bisected] Prison Architect rendered unplayable by multicoloured flickering triangles and overlayed triangles when performing certain actions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106619">Bug 106619</a> - [OpenCL][llvm-svn]build failure addPassesToEmitFile candidate expects 6 arguments, 3 provided</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106629">Bug 106629</a> - [SNB,IVB,HSW,BDW] dEQP-EGL.functional.image.create.gles2_cubemap_negative_z_rgb_read_pixels</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106642">Bug 106642</a> - X server crashes in i965 on desktop startup when DRI3 v1.2 / modifier support is enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106643">Bug 106643</a> - double free when exporting a temporarily imported semaphore</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106673">Bug 106673</a> - [bisected] Steam is unusable since commit 5c33e8c7</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106687">Bug 106687</a> - radv: Fast color clears use incorrect format</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106708">Bug 106708</a> - [SKL/KBL/GLK] 2-3% performance drop in SynMark DrvState and 5-9% drop on SynMark Multithread</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106748">Bug 106748</a> - st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY broke qemu -display sdl,gl=on</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106756">Bug 106756</a> - Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106774">Bug 106774</a> - GLSL IR copy propagates loads of SSBOs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106776">Bug 106776</a> - vma_random unrecognized command line option &quot;-std=c++11&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106778">Bug 106778</a> - Files missing from tarball - intel_sanitize_gpu.*</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106779">Bug 106779</a> - Files missing from tarball - u_debug_stack_android.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106784">Bug 106784</a> - 18.1.1 autotools build fail without mako</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106801">Bug 106801</a> - vma_random_test.cpp:239:18: error: non-constant-expression cannot be narrowed from type 'unsigned long' to 'uint_fast32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106810">Bug 106810</a> - ProgramBinary does not switch program correctly when using transform feedback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106823">Bug 106823</a> - Failed to recongnize keyword of shader code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106830">Bug 106830</a> - [bisected] 32 bit tests (deqp, piglit, glcts, vulkancts) crashing on all platforms</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106861">Bug 106861</a> - fatal error: wayland-egl-backend.h: No such file or directory compilation terminated.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106865">Bug 106865</a> - [GLK] piglit.spec.ext_framebuffer_multisample.accuracy stencil tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106903">Bug 106903</a> - radv: Fragment shader output goes to wrong attachments when render targets are sparse</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106906">Bug 106906</a> - Failed to recongnize keyword “sampler2DRect” and &quot;sampler2DRectShadow&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106907">Bug 106907</a> - Correct Transform Feedback Varyings information is expected after using ProgramBinary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106912">Bug 106912</a> - radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106928">Bug 106928</a> - When starting a match Rocket League crashes on &quot;Go&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106941">Bug 106941</a> - Intel ANV vulkan driver exposing version 1.1.0 which is incorrect</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106986">Bug 106986</a> - glGetQueryiv error when querying number of result bits for GL_ANY_SAMPLES_PASSED_CONSERVATIVE</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106997">Bug 106997</a> - [Regression]. Dying light game is crashing on latest mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107098">Bug 107098</a> - Segfault after munmap(kms_sw_dt-&gt;ro_mapped)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107117">Bug 107117</a> - mesa-18.1: regression with TFP on intel with modesettings and glamor acceleration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107190">Bug 107190</a> - Got seg fault on snb when use INTEL_DEBUG=bat</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107193">Bug 107193</a> - piglit.spec.arb_compute_shader.linker.bug-93840 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212">Bug 107212</a> - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107223">Bug 107223</a> - [GEN9+] 50% perf drop in SynMark Fill* tests (E2E RBC gets disabled?)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107248">Bug 107248</a> - [G45 ILK G965] Texture handling broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107275">Bug 107275</a> - NIR segfaults after spirv-opt</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107276">Bug 107276</a> - radv: OpBitfieldUExtract returns incorrect result when count is zero</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107295">Bug 107295</a> - Access violation on glDrawArrays with count &gt;= 2048</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107305">Bug 107305</a> - glsl/opt_copy_propagation_elements.cpp:72:9: error: delegating constructors are permitted only in C++11</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107312">Bug 107312</a> - Mesa-git RPM build fails after commit 8cacf38f527d42e41441ef8c25d95d4b2f4e8602</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107359">Bug 107359</a> - [Regression] [bisected] [OpenGL CTS] [SKL,BDW] KHR-GL46.texture_barrier*-texels, GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners, and GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners fail with some configuration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107366">Bug 107366</a> - NIR verification crashes on piglit tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107423">Bug 107423</a> - vc4 build failure: &quot;v3d_decoder.c:893: undefined reference to `clif_lookup_bo'&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107443">Bug 107443</a> - Build error on arm64: v3d_decoder.c:837:17: error: format not a string literal and no format arguments [-Werror=format-security]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107460">Bug 107460</a> - radv: OpControlBarrier does not always work correctly (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107477">Bug 107477</a> - [DXVK] Setting high shader quality in GTA V results in LLVM error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107510">Bug 107510</a> - [GEN8+] up to 10% perf drop on several 3D benchmarks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107544">Bug 107544</a> - intel/decoder: out of bounds group_iter</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107550">Bug 107550</a> - &quot;0[2]&quot; as function parameter hits assert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107579">Bug 107579</a> - [SNB] The graphic corruption when we reuse the GS compiled and used for TFB when statebuffer contain magic trash in the unused space</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107601">Bug 107601</a> - Rise of the Tomb Raider Segmentation Fault when the game starts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107610">Bug 107610</a> - Dolphin emulator mis-renders shadow overlay in Super Mario Sunshine</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Removed GL_EXT_polygon_offset; applications should use glPolygonOffset instead.</li>
<li>Removed libwayland-egl, now part of Wayland</li>
</ul>
</div>
</body>
</html>

227
docs/relnotes/18.2.1.html Normal file

@@ -0,0 +1,227 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.1 Release Notes / September 21, 2018</h1>
<p>
Mesa 18.2.1 is a bug fix release which fixes bugs found since the 18.2.0 release.
</p>
<p>
Mesa 18.2.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
SHA256: 45419ccbe1bf9a2e15ffe71ced34615002e1b42c24b917fbe2b2f58ab1970562 mesa-18.2.1.tar.gz
SHA256: 9636dc6f3d188abdcca02da97cedd73640d9035224efd5db724187d062c81056 mesa-18.2.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103241">Bug 103241</a> - Anv crashes when using 64-bit vertex inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107280">Bug 107280</a> - [DXVK] Batman: Arkham City with tessellation enabled hangs on SKL GT4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107772">Bug 107772</a> - Mesa preprocessor matches if(def)s &amp; endifs incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107779">Bug 107779</a> - Access violation with some games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107810">Bug 107810</a> - The 'va_end' call is missed after 'va_copy' in 'util_vsnprintf' function under windows</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107832">Bug 107832</a> - Gallium picking A16L16 formats when emulating INTENSITY16 conflicts with mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107843">Bug 107843</a> - 32bit Mesa build fails with meson.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107879">Bug 107879</a> - crash happens when link program</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107891">Bug 107891</a> - [wine, regression, bisected] RAGE, Wolfenstein The New Order hangs in menu</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (3):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.0</li>
<li>Revert "Revert "glsl: skip stringification in preprocessor if in unreachable branch""</li>
<li>cherry-ignore: i965/tools: 32bit compilation with meson</li>
</ul>
<p>Andrii Simiklit (4):</p>
<ul>
<li>apple/glx/log: added missing va_end() after va_copy()</li>
<li>mesa/util: don't use the same 'va_list' instance twice</li>
<li>mesa/util: don't ignore NULL returned from 'malloc'</li>
<li>mesa/util: add missing va_end() after va_copy()</li>
</ul>
<p>Bas Nieuwenhuizen (5):</p>
<ul>
<li>radv: Support v3 of VK_EXT_vertex_attribute_divisor.</li>
<li>radv: Set the user SGPR MSB for Vega.</li>
<li>radv: Only allow 16 user SGPRs for compute on GFX9+.</li>
<li>radv: Use build ID if available for cache UUID.</li>
<li>radv: Fix driver UUID SHA1 init.</li>
</ul>
<p>Christopher Egert (1):</p>
<ul>
<li>radeon: fix ColorMask</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>virgl: don't send a shader create with no data. (v2)</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>meson: Print a message about why a libdrm version was selected</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>v3d: Fix SRC_ALPHA_SATURATE blending for RTs without alpha.</li>
<li>v3d: Fix setup of the VCM cache size.</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>winsys/virgl: avoid unintended behavior</li>
<li>virgl: adjust strides when mapping temp-resources</li>
</ul>
<p>Fritz Koenig (2):</p>
<ul>
<li>mesa: Additional FlipY applications</li>
<li>mesa: FramebufferParameteri parameter checking</li>
</ul>
<p>Gert Wollny (2):</p>
<ul>
<li>winsys/virgl: correct resource and handle allocation (v2)</li>
<li>mesa/texture: Also check for LA texture when querying intensity component size</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>i965/fs: Don't propagate conditional modifiers from integer compares to adds</li>
</ul>
<p>Jason Ekstrand (11):</p>
<ul>
<li>anv/pipeline: Only consider double elements which actually exist</li>
<li>i965: Workaround the gen9 hw astc5x5 sampler bug</li>
<li>anv: Re-emit vertex buffers when the pipeline changes</li>
<li>anv: Disable the vertex cache when tessellating on SKL GT4</li>
<li>anv: Clamp scissors to the framebuffer boundary</li>
<li>vulkan: Update the XML and headers to 1.1.84</li>
<li>anv: Support v3 of VK_EXT_vertex_attribute_divisor</li>
<li>anv/query: Write both dwords in emit_zero_queries</li>
<li>nir: Add a small pass to rematerialize derefs per-block</li>
<li>nir/loop_unroll: Re-materialize derefs in use blocks before unrolling</li>
<li>nir/opt_if: Re-materialize derefs in use blocks before peeling loops</li>
</ul>
<p>Josh Pieper (1):</p>
<ul>
<li>st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>cherry-ignore: radv: fix descriptor pool allocation size</li>
<li>Update version to 18.2.1</li>
</ul>
<p>Kenneth Feng (1):</p>
<ul>
<li>amd: Add Picasso device id</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>radeonsi: fix HTILE for NPOT textures with mipmapping on SI/CI</li>
<li>winsys/radeon: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI</li>
<li>r600: fix HTILE for NPOT textures with mipmapping</li>
<li>radeonsi: fix printing a BO list into ddebug reports</li>
<li>ac: revert new LLVM 7.0 behavior for fdiv</li>
</ul>
<p>Mathias Fröhlich (1):</p>
<ul>
<li>tnl: Fix green gun regression in xonotic.</li>
</ul>
<p>Mauro Rossi (3):</p>
<ul>
<li>android: broadcom/genxml: fix collision with intel/genxml header-gen macro</li>
<li>android: broadcom/cle: add gallium include path</li>
<li>android: broadcom/cle: export the broadcom top level path headers</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>loader/dri3: Only wait for back buffer fences in dri3_get_buffer</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>nvir: Always split 64-bit IMAD/IMUL operations</li>
</ul>
<p>Samuel Pitoiset (7):</p>
<ul>
<li>radv: fix function names for VK_EXT_conditional_rendering</li>
<li>radv: fix VK_EXT_conditional_rendering visibility</li>
<li>radv: bump the maximum number of arguments to 64</li>
<li>radv: handle loc-&gt;indirect correctly for the first descriptor</li>
<li>radv: fix GPU hangs with 32-bit indirect descriptors</li>
<li>radv: fix flushing indirect descriptors</li>
<li>radv: fix setting global locations for indirect descriptors</li>
</ul>
<p>Sergii Romantsov (3):</p>
<ul>
<li>intel: compiler option msse2 and mstackrealign</li>
<li>i965/tools: 32bit compilation with meson</li>
<li>mesa/meson: 32bit xmlconfig linkage</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>glsl: fixer lexer for unreachable defines</li>
<li>Revert "radeonsi: avoid syncing the driver thread in si_fence_finish"</li>
</ul>
</div>
</body>
</html>

155
docs/relnotes/18.2.2.html Normal file

@@ -0,0 +1,155 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.2 Release Notes / October 5, 2018</h1>
<p>
Mesa 18.2.2 is a bug fix release which fixes bugs found since the 18.2.1 release.
</p>
<p>
Mesa 18.2.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
SHA256: c51711168971957037cc7e3e19e8abe1ec6eeab9cf236d419a1e7728a41cac8a mesa-18.2.2.tar.gz
SHA256: c3ba82b12a89d3d9fed2bdd96b4702dbb7ab675034650a8b1b718320daf073c4 mesa-18.2.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104602">Bug 104602</a> - [apitrace] Graphical artifacts in Civilization VI on RX Vega</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104926">Bug 104926</a> - swrast: Mesa 17.3.3 produces: HW cursor for format 875713089 not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107276">Bug 107276</a> - radv: OpBitfieldUExtract returns incorrect result when count is zero</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107786">Bug 107786</a> - [DXVK] MSAA reflections are broken in GTA V</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108024">Bug 108024</a> - [Debian Stretch]Fail to build because &quot;xcb_randr_lease_t&quot;</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>pci_ids: add new polaris pci id</li>
</ul>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv: only emit ZPASS_DONE for timestamp queries on gfx queues</li>
</ul>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Clamp RCP when 0*inf!=0</li>
<li>st/nine: Avoid redundant SetCursorPos calls</li>
<li>st/nine: Increase maximum number of temp registers</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>meson: Don't compile pipe loader with dri support when not using dri</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Fix sin(0.0) and cos(0.0) accuracy to fix SDL rendering rotation.</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>vulkan/wsi/display: check if wsi_swapchain_init() succeeded</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv,radv: Implement vkAcquireNextImage2</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.1</li>
<li>Update version to 18.2.2</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/uvd: use bitstream coded number for symbols of Huffman tables</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>glsl_to_tgsi: invert gl_SamplePosition.y for the default framebuffer</li>
<li>radeonsi: NaN should pass kill_if</li>
</ul>
<p>Maxime (1):</p>
<ul>
<li>vulkan: Disable randr lease for libxcb &lt; 1.13</li>
</ul>
<p>Michal Srb (1):</p>
<ul>
<li>st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it</li>
</ul>
<p>Rhys Perry (2):</p>
<ul>
<li>nvc0: Update counter reading shaders to new NVC0_CB_AUX_MP_INFO</li>
<li>nvc0: fix bindless multisampled images on Maxwell+</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>anv: Add support for protected memory properties on anv_GetPhysicalDeviceProperties2()</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: use the resolve compute path if dest uses multiple layers</li>
</ul>
<p>Stuart Young (1):</p>
<ul>
<li>docs: Update FAQ with respect to s3tc support</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>radeonsi: add a workaround for bitfield_extract when count is 0</li>
</ul>
</div>
</body>
</html>

167
docs/relnotes/18.2.3.html Normal file

@@ -0,0 +1,167 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.3 Release Notes / October 19, 2018</h1>
<p>
Mesa 18.2.3 is a bug fix release which fixes bugs found since the 18.2.2 release.
</p>
<p>
Mesa 18.2.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
0e13e2342eae74d8848df23595c4bb4b2f8874c9e1213b8466b1fbfa7ef99375 mesa-18.2.3.tar.gz
e2bf83c17e1abdecb1ee81af22652e27e9aa38f963e95e60f34275cc0376304f mesa-18.2.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99507">Bug 99507</a> - Corrupted frame contents with Vulkan version of DOTA2, Talos Principle and Sascha Willems' demos when they're run Vsynched in fullscreen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107857">Bug 107857</a> - GPU hang - GS_EMIT without shader outputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107926">Bug 107926</a> - [anv] Rise of the Tomb Raider always misrendering, segfault and gpu hang.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108012">Bug 108012</a> - Compiler crashes on access of non-existent member incremental operations</li>
</ul>
<h2>Changes</h2>
<p>Boyuan Zhang (1):</p>
<ul>
<li>st/va: use provided sizes and coords for vlVaGetImage</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>anv: add missing unlock in error path.</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>meson: Don't allow building EGL on Windows or MacOS</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>st/nine: do not double-close the fd on teardown</li>
<li>egl: make eglSwapInterval a no-op for !window surfaces</li>
<li>egl: make eglSwapBuffers* a no-op for !window surfaces</li>
<li>vl/dri3: do full teardown on screen_destroy</li>
<li>Revert "mesa: remove unnecessary 'sort by year' for the GL extensions"</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>radv: add missing meson c++ visibility arguments</li>
</ul>
<p>Fritz Koenig (1):</p>
<ul>
<li>i965: Replace checks for rb-&gt;Name with FlipY (v2)</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>virgl, vtest: Correct the transfer size calculation</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>glsl: fix array assignments of a swizzled vector</li>
<li>nv50,nvc0: mark RGBX_UINT formats as renderable</li>
<li>nv50,nvc0: guard against zero-size blits</li>
<li>nvc0: fix blitting red to srgb8_alpha</li>
</ul>
<p>Jason Ekstrand (7):</p>
<ul>
<li>nir/cf: Remove phi sources if needed in nir_handle_add_jump</li>
<li>anv: Use separate MOCS settings for external BOs</li>
<li>intel/fs: Fix a typo in need_matching_subreg_offset</li>
<li>nir/from_ssa: Don't rewrite derefs destinations to registers</li>
<li>anv/batch_chain: Don't start a new BO just for BATCH_BUFFER_START</li>
<li>nir/alu_to_scalar: Use ssa_for_alu_src in hand-rolled expansions</li>
<li>intel: Don't propagate conditional modifiers if a UD source is negated</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.2</li>
<li>Update version to 18.2.3</li>
</ul>
<p>Józef Kucia (1):</p>
<ul>
<li>radeonsi: avoid sending GS_EMIT in shaders without outputs</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>drirc: add a workaround for ARMA 3</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: add a workaround for a VGT hang with prim restart and strips</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>glsl: do not attempt assignment if operand type not parsed correctly</li>
</ul>
<p>Timothy Arceri (11):</p>
<ul>
<li>glsl: ignore trailing whitespace when define redefined</li>
<li>util: disable cache if we have no build-id and timestamp is zero</li>
<li>util: rename timestamp param in disk_cache_create()</li>
<li>util: add disk_cache_get_function_identifier()</li>
<li>radeonsi: use build-id when available for disk cache</li>
<li>nouveau: use build-id when available for disk cache</li>
<li>r600: use build-id when available for disk cache</li>
<li>mesa/st: add force_compat_profile option to driconfig</li>
<li>util: use force_compat_profile for Wolfenstein The Old Blood</li>
<li>util: better handle program names from wine</li>
<li>util: add drirc workarounds for RAGE</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>r600/sb: Fix constant-logical-operand warning.</li>
</ul>
</div>
</body>
</html>

154
docs/relnotes/18.2.4.html Normal file

@@ -0,0 +1,154 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.4 Release Notes / October 31, 2018</h1>
<p>
Mesa 18.2.4 is a bug fix release which fixes bugs found since the 18.2.3 release.
</p>
<p>
Mesa 18.2.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
968bfe78605e9397ddf244933b1fa62edb8429fc55aaec2ae7e20bb1c82abdea mesa-18.2.4.tar.gz
621d1aebb57876d5b6a5d2dcf4eb7e0620e650c6fe5cf3655c65e243adc9cb4e mesa-18.2.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107865">Bug 107865</a> - swr fail to build with llvm-libs 6.0.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108272">Bug 108272</a> - [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108524">Bug 108524</a> - [RADV] GPU lockup on event synchronization</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (2):</p>
<ul>
<li>ac/nir: Use context-specific LLVM types</li>
<li>anv: Fix sanitization of stencil state when the depth test is disabled</li>
</ul>
<p>Alok Hota (2):</p>
<ul>
<li>swr/rast: ignore CreateElementUnorderedAtomicMemCpy</li>
<li>swr/rast: fix intrinsic/function for LLVM 7 compatibility</li>
</ul>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv: fix check for perftest options size</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Emit enqueued pipeline barriers on event write.</li>
</ul>
<p>Connor Abbott (2):</p>
<ul>
<li>ac: Introduce ac_build_expand()</li>
<li>ac: Fix loading a dvec3 from an SSBO</li>
</ul>
<p>David McFarland (1):</p>
<ul>
<li>util: Change remaining uint32 cache ids to sha1</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>meson: don't require libelf for r600 without LLVM</li>
</ul>
<p>Elie Tournier (1):</p>
<ul>
<li>gallium: Correctly handle no config context creation</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>radv: s/abs/fabsf/ for floats</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>radeonsi: Bump number of allowed global buffers to 32</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>spirv: Use the right bit-size for spec constant ops</li>
<li>blorp: Emit a dummy 3DSTATE_WM prior to 3DSTATE_WM_HZ_OP</li>
<li>anv: Flag semaphore BOs as external</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.3</li>
<li>cherry-ignore: Revert "anv/skylake: disable ForceThreadDispatchEnable"</li>
<li>Update version to 18.2.4</li>
</ul>
<p>Liviu Prodea (1):</p>
<ul>
<li>scons: Put to rest zombie texture_float build option.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: fix a VGT hang with primitive restart on Polaris10 and later</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>loader/dri3: Also wait for front buffer fence if we triggered it</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>intel/blorp: Define the clear value bounds for HiZ clears</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno: fix inorder rendering case</li>
<li>freedreno: don't flush when new and old pfb is identical</li>
</ul>
</div>
</body>
</html>

172
docs/relnotes/18.2.5.html Normal file

@@ -0,0 +1,172 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.5 Release Notes / November 15, 2018</h1>
<p>
Mesa 18.2.5 is a bug fix release which fixes bugs found since the 18.2.4 release.
</p>
<p>
Mesa 18.2.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
dddc28928b6f4083a0d5120b58c1c8e2dc189ab5c14299c08a386607fdbbdce7 mesa-18.2.5.tar.gz
b12c32872832e5353155e1e8026e1f1ab75bba9dc5b178d712045684d26c2b73 mesa-18.2.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105731">Bug 105731</a> - linker error &quot;fragment shader input ... has no matching output in the previous stage&quot; when previous stage's output declaration in a separate shader object</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107511">Bug 107511</a> - KHR/khrplatform.h not always installed when needed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108082">Bug 108082</a> - warning: unknown warning option '-Wno-format-truncation' [-Wunknown-warning-option]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108560">Bug 108560</a> - Mesa 32 is built without sse</li>
</ul>
<h2>Changes</h2>
<p>Andre Heider (1):</p>
<ul>
<li>st/nine: fix stack corruption due to ABI mismatch</li>
</ul>
<p>Andrii Simiklit (1):</p>
<ul>
<li>i965/batch: don't ignore the 'brw_new_batch' call for a 'new batch'</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>meson: link gallium nine with pthreads</li>
<li>meson: fix libatomic tests</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>egl/glvnd: correctly report errors when vendor cannot be found</li>
<li>m4: add Werror when checking for compiler flags</li>
</ul>
<p>Eric Engestrom (6):</p>
<ul>
<li>svga: add missing meson build dependency</li>
<li>clover: add missing meson build dependency</li>
<li>wsi/wayland: use proper VkResult type</li>
<li>wsi/wayland: only finish() a successfully init()ed display</li>
<li>configure: install KHR/khrplatform.h when needed</li>
<li>meson: install KHR/khrplatform.h when needed</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>virgl/vtest-winsys: Use virgl version of bind flags</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>intel/tools: include stdarg.h in error2aub</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.4</li>
<li>cherry-ignore: add explicit 18.3 only nominations</li>
<li>cherry-ignore: i965/batch: avoid reverting batch buffer if saved state is an empty</li>
<li>Update version to 18.2.5</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv/android: mark gralloc allocated BOs as external</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>ac: fix ac_build_fdiv for f64</li>
<li>st/va: fix incorrect use of resource_destroy</li>
<li>include: update GL &amp; GLES headers (v2)</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>util/ralloc: Switch from DEBUG to NDEBUG</li>
<li>util/ralloc: Make sizeof(linear_header) a multiple of 8</li>
</ul>
<p>Olivier Fourdan (1):</p>
<ul>
<li>wayland/egl: Resize EGL surface on update buffer for swrast</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>glsl_to_tgsi: don't create 64-bit integer MAD/FMA</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: disable conditional rendering for vkCmdCopyQueryPoolResults()</li>
<li>radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>autotools: library-dependency when no sse and 32-bit</li>
</ul>
<p>Timothy Arceri (4):</p>
<ul>
<li>st/mesa: calculate buffer size correctly for packed uniforms</li>
<li>st/glsl_to_nir: fix next_stage gathering</li>
<li>nir: add glsl_type_is_integer() helper</li>
<li>nir: don't pack varyings ints with floats unless flat</li>
</ul>
<p>Vadym Shovkoplias (1):</p>
<ul>
<li>glsl/linker: Fix out variables linking during single stage</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>r600/sb: Fix constant logical operand in assert.</li>
</ul>
</div>
</body>
</html>

179
docs/relnotes/18.2.6.html Normal file

@@ -0,0 +1,179 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.6 Release Notes / November 28, 2018</h1>
<p>
Mesa 18.2.6 is a bug fix release which fixes bugs found since the 18.2.5 release.
</p>
<p>
Mesa 18.2.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
e0ea1236dbc6c412b02e1b5d7f838072525971a6630246fa82ae4466a6d8a587 mesa-18.2.6.tar.gz
9ebafa4f8249df0c718e93b9ca155e3593a1239af303aa2a8b0f2056a7efdc12 mesa-18.2.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107856">Bug 107856</a> - i965 incorrectly calculates the number of layers for texture views (assert)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108630">Bug 108630</a> - [G965] piglit.spec.!opengl 1_2.tex3d-maxsize spins forever</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108713">Bug 108713</a> - Gallium: use after free with transform feedback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108829">Bug 108829</a> - [meson] libglapi exports internal API</li>
</ul>
<h2>Changes</h2>
<p>Andrii Simiklit (1):</p>
<ul>
<li>i965/batch: avoid reverting batch buffer if saved state is an empty</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Fix opaque metadata descriptor last layer.</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>scons/svga: remove opt from the list of valid build types</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>i965: Fix calculation of layers array length for isl_view</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>meson: Don't set -Wall</li>
<li>meson: Don't force libva to required from auto</li>
</ul>
<p>Emil Velikov (13):</p>
<ul>
<li>bin/get-pick-list.sh: simplify git oneline printing</li>
<li>bin/get-pick-list.sh: prefix output with "[stable] "</li>
<li>bin/get-pick-list.sh: handle "typod" usecase.</li>
<li>bin/get-pick-list.sh: handle the fixes tag</li>
<li>bin/get-pick-list.sh: tweak the commit sha matching pattern</li>
<li>bin/get-pick-list.sh: flesh out is_sha_nomination</li>
<li>bin/get-pick-list.sh: handle fixes tag with missing colon</li>
<li>bin/get-pick-list.sh: handle unofficial "broken by" tag</li>
<li>bin/get-pick-list.sh: use test instead of [ ]</li>
<li>bin/get-pick-list.sh: handle reverts prior to the branchpoint</li>
<li>travis: drop unneeded x11proto-xf86vidmode-dev</li>
<li>glx: make xf86vidmode mandatory for direct rendering</li>
<li>travis: adding missing x11-xcb for meson+vulkan</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Make sure we make ro scanout resources for create_with_modifiers.</li>
</ul>
<p>Eric Engestrom (5):</p>
<ul>
<li>meson: only run vulkan's meson.build when building vulkan</li>
<li>gbm: remove unnecessary meson include</li>
<li>meson: fix wayland-less builds</li>
<li>egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache</li>
<li>glapi: add missing visibility args</li>
</ul>
<p>Erik Faye-Lund (1):</p>
<ul>
<li>mesa/main: remove bogus error for zero-sized images</li>
</ul>
<p>Gert Wollny (3):</p>
<ul>
<li>mesa: Reference count shaders that are used by transform feedback objects</li>
<li>r600: clean up the GS ring buffers when the context is destroyed</li>
<li>glsl: free or reuse memory allocated for TF varying</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir/lower_alu_to_scalar: Don't try to lower unpack_32_2x16</li>
<li>anv: Put robust buffer access in the pipeline hash</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>cherry-ignore: add explicit 18.3 only nominations</li>
<li>cherry-ignore: intel/aub_viewer: fix dynamic state printing</li>
<li>cherry-ignore: intel/aub_viewer: Print blend states properly</li>
<li>cherry-ignore: mesa/main: fix incorrect depth-error</li>
<li>docs: add sha256 checksums for 18.2.5</li>
<li>Update version to 18.2.6</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nir/spirv: cast shift operand to u32</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Add PCI IDs for new Amberlake parts that are Coffeelake based</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>egl/dri: fix error value with unknown drm format</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>winsys/amdgpu: fix a buffer leak in amdgpu_bo_from_handle</li>
<li>winsys/amdgpu: fix a device handle leak in amdgpu_winsys_create</li>
</ul>
<p>Rodrigo Vivi (4):</p>
<ul>
<li>i965: Add a new CFL PCI ID.</li>
<li>intel: aubinator: Adding missed platforms to the error message.</li>
<li>intel: Introducing Amber Lake platform</li>
<li>intel: Introducing Whiskey Lake platform</li>
</ul>
</div>
</body>
</html>

167
docs/relnotes/18.2.7.html Normal file

@@ -0,0 +1,167 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.7 Release Notes / December 13, 2018</h1>
<p>
Mesa 18.2.7 is a bug fix release which fixes bugs found since the 18.2.6 release.
</p>
<p>
Mesa 18.2.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
092351cfbcd430ec595fbd3a3d8d253fd62c29074e1740d7198b00289ab400f8 mesa-18.2.7.tar.gz
9c7b02560d89d77ca279cd21f36ea9a49e9ffc5611f6fe35099357d744d07ae6 mesa-18.2.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106577">Bug 106577</a> - broken rendering with nine and nouveau (GM107)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108245">Bug 108245</a> - RADV/Vega: Low mip levels of large BCn textures get corrupted by vkCmdCopyBufferToImage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108311">Bug 108311</a> - Query buffer object support is broken on r600.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108894">Bug 108894</a> - [anv] vkCmdCopyBuffer() and vkCmdCopyQueryPoolResults() write-after-write hazard</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108909">Bug 108909</a> - Vkd3d test failure test_resolve_non_issued_query_data()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108914">Bug 108914</a> - blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108925">Bug 108925</a> - vkCmdCopyQueryPoolResults(VK_QUERY_RESULT_WAIT_BIT) for timestamps with large query count hangs</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Flush before vkCmdWriteTimestamp() if needed</li>
</ul>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Align large buffers to the fragment size.</li>
<li>radv: Clamp gfx9 image view extents to the allocated image extents.</li>
<li>radv/android: Mark android WSI image as shareable.</li>
<li>radv/android: Use buffer metadata to determine scanout compat.</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>r600: make suballocator 256-bytes align</li>
<li>radv: use 3d shader for gfx9 copies if dst is 3d</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>egl/wayland: bail out when drmGetMagic fails</li>
<li>egl/wayland: plug memory leak in drm_handle_device()</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>v3d: Fix a leak of the transfer helper on screen destroy.</li>
<li>vc4: Fix a leak of the transfer helper on screen destroy.</li>
<li>v3d: Fix a leak of the disassembled instruction string during debug dumps.</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>anv: correctly use vulkan 1.0 by default</li>
<li>wsi/display: fix mem leak when freeing swapchains</li>
<li>vulkan/wsi: fix s/,/;/ typo</li>
</ul>
<p>Gurchetan Singh (3):</p>
<ul>
<li>virgl: quadruple command buffer size</li>
<li>virgl: avoid large inline transfers</li>
<li>virgl: don't mark buffers as unclean after a write</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.6</li>
<li>cherry-ignore: freedreno: Fix autotools build.</li>
<li>cherry-ignore: mesa: Revert INTEL_fragment_shader_ordering support</li>
<li>Update version to 18.2.7</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nv50,nvc0: Fix gallium nine regression regarding sampler bindings</li>
</ul>
<p>Lionel Landwerlin (2):</p>
<ul>
<li>anv: flush pipeline before query result copies</li>
<li>anv/query: flush render target before copying results</li>
</ul>
<p>Michal Srb (2):</p>
<ul>
<li>gallium: Constify drisw_loader_funcs struct</li>
<li>drisw: Use separate drisw_loader_funcs for shm</li>
</ul>
<p>Nicolai Hähnle (2):</p>
<ul>
<li>egl/wayland: rather obvious build fix</li>
<li>meson: link LLVM 'native' component when LLVM is available</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: rework the TC-compat HTILE hardware bug with COND_EXEC</li>
</ul>
<p>Thomas Hellstrom (2):</p>
<ul>
<li>st/xa: Fix a memory leak</li>
<li>winsys/svga: Fix a memory leak</li>
</ul>
<p>Tobias Klausmann (1):</p>
<ul>
<li>amd/vulkan: meson build - use radv_deps for libvulkan_radeon</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>st/xvmc: Add X11 include path.</li>
</ul>
</div>
</body>
</html>

183
docs/relnotes/18.2.8.html Normal file

@@ -0,0 +1,183 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.8 Release Notes / December 27, 2018</h1>
<p>
Mesa 18.2.8 is a bug fix release which fixes bugs found since the 18.2.7 release.
</p>
<p>
Mesa 18.2.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
77512edc0a84e19c7131a0e2e5ebf1beaf1494dc4b71508fcc92d06d65f9f4f5 mesa-18.2.8.tar.gz
1d2ed9fd435d86d95b7215b287258d3e6b1180293a36f688e5a2efc18298d863 mesa-18.2.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108114">Bug 108114</a> - [vulkancts] new VK_KHR_16bit_storage tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108116">Bug 108116</a> - [vulkancts] stencil partial clear tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108910">Bug 108910</a> - Vkd3d test failure test_multisample_array_texture()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108911">Bug 108911</a> - Vkd3d test failure test_clear_render_target_view()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109081">Bug 109081</a> - [bisected] [HSW] Regression in clipping.user_defined.clip_* vulkancts tests</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (3):</p>
<ul>
<li>pci_ids: add new vega10 pci ids</li>
<li>pci_ids: add new vega20 pci id</li>
<li>pci_ids: add new VegaM pci id</li>
</ul>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Fix volumetexture dtor on ctor failure</li>
<li>st/nine: Bind src not dst in nine_context_box_upload</li>
<li>st/nine: Add src reference to nine_context_range_upload</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>nir: properly clear the entry sources in copy_prop_vars</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>meson: Fix ppc64 little endian detection</li>
</ul>
<p>Emil Velikov (9):</p>
<ul>
<li>glx: mandate xf86vidmode only for "drm" dri platforms</li>
<li>bin/get-pick-list.sh: rework handing of sha nominations</li>
<li>bin/get-pick-list.sh: warn when commit lists invalid sha</li>
<li>meson: don't require glx/egl/gbm with gallium drivers</li>
<li>pipe-loader: meson: reference correct library</li>
<li>TODO: glx: meson: build dri based glx tests, only with -Dglx=dri</li>
<li>glx: meson: drop includes from a link-only library</li>
<li>glx: meson: wire up the dispatch-index-check test</li>
<li>glx/test: meson: assorted include fixes</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>v3d: Make sure that a thrsw doesn't split a multop from its umul24.</li>
<li>v3d: Add missing flagging of SYNCB as a TSY op.</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>virgl: wrap vertex element state in a struct</li>
<li>virgl: work around bad assumptions in virglrenderer</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>i965/vec4/dce: Don't narrow the write mask if the flags are used</li>
<li>Revert "nir/lower_indirect: Bail early if modes == 0"</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>clover: Fix build after clang r348827</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>nir/constant_folding: Fix source bit size logic</li>
</ul>
<p>Jon Turney (1):</p>
<ul>
<li>glx: Fix compilation with GLX_USE_WINDOWSGL</li>
</ul>
<p>Juan A. Suarez Romero (7):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.7</li>
<li>cherry-ignore: add explicit 18.3 only nominations</li>
<li>cherry-ignore: meson: libfreedreno depends upon libdrm (for fence support)</li>
<li>cherry-ignore: radv: Fix multiview depth clears</li>
<li>cherry-ignore: nir: properly find the entry to keep in copy_prop_vars</li>
<li>cherry-ignore: intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs</li>
<li>Update version to 18.2.8</li>
</ul>
<p>Kirill Burtsev (1):</p>
<ul>
<li>loader: free error state, when checking the drawable type</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: don't do partial resolve on layer &gt; 0</li>
</ul>
<p>Rhys Perry (2):</p>
<ul>
<li>radv: don't set surf_index for stencil-only images</li>
<li>ac: split 16-bit ssbo loads that may not be dword aligned</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>mesa/st/nir: fix missing nir_compact_varyings</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: switch on EOP when primitive restart is enabled with triangle strips</li>
</ul>
<p>Vinson Lee (2):</p>
<ul>
<li>meson: Fix typo.</li>
<li>meson: Fix libsensors detection.</li>
</ul>
</div>
</body>
</html>

283
docs/relnotes/18.3.0.html Normal file

@@ -0,0 +1,283 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.0 Release Notes / December 7, 2018</h1>
<p>
Mesa 18.3.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 18.3.1.
</p>
<p>
Mesa 18.3.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
libwayland-egl is now distributed by Wayland (since 1.15,
<a href="https://lists.freedesktop.org/archives/wayland-devel/2018-April/037767.html">see announcement</a>),
and has been removed from Mesa in this release. Make sure you're using
an up-to-date version of Wayland to keep the functionality.
</p>
<h2>SHA256 checksums</h2>
<pre>
17a124d4dbc712505d22a7815c9b0cee22214c96c8abb91539a2b1351e38a000 mesa-18.3.0.tar.gz
b63f947e735d6ef3dfaa30c789a9adfbae18aea671191eaacde95a18c17fc38a mesa-18.3.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_AMD_depth_clamp_separate on r600, radeonsi.</li>
<li>GL_AMD_framebuffer_multisample_advanced on radeonsi.</li>
<li>GL_AMD_gpu_shader_int64 on i965, nvc0, radeonsi.</li>
<li>GL_AMD_multi_draw_indirect on all GL 4.x drivers.</li>
<li>GL_AMD_query_buffer_object on i965, nvc0, r600, radeonsi.</li>
<li>GL_EXT_disjoint_timer_query on radeonsi and most other Gallium drivers (ES extension)</li>
<li>GL_EXT_texture_compression_s3tc on all drivers (ES extension)</li>
<li>GL_EXT_vertex_attrib_64bit on i965, nvc0, radeonsi.</li>
<li>GL_EXT_window_rectangles on radeonsi.</li>
<li>GL_KHR_texture_compression_astc_sliced_3d on radeonsi.</li>
<li>GL_NV_fragment_shader_interlock on i965.</li>
<li>EGL_EXT_device_base for all drivers.</li>
<li>EGL_EXT_device_drm for all drivers.</li>
<li>EGL_MESA_device_software for all drivers.</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728">Bug 13728</a> - [G965] Some objects in Neverwinter Nights Linux version not displayed correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91433">Bug 91433</a> - piglit.spec.arb_depth_buffer_float.fbo-depth-gl_depth_component32f-copypixels fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93355">Bug 93355</a> - [BXT,SKLGT4e] intermittent ext_framebuffer_multisample.accuracy fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94957">Bug 94957</a> - dEQP failures on llvmpipe</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98699">Bug 98699</a> - &quot;float[a+++4 ? 1:1] f;&quot; crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99507">Bug 99507</a> - Corrupted frame contents with Vulkan version of DOTA2, Talos Principle and Sascha Willems' demos when they're run Vsynched in fullscreen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99730">Bug 99730</a> - Metro Redux game(s) needs override for midshader extension declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100200">Bug 100200</a> - Default Unreal Engine 4 frag shader fails to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101247">Bug 101247</a> - Mesa fails to link GLSL programs with unused output blocks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102597">Bug 102597</a> - [Regression] mpv, high rendering times (two to three times higher)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103241">Bug 103241</a> - Anv crashes when using 64-bit vertex inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104602">Bug 104602</a> - [apitrace] Graphical artifacts in Civilization VI on RX Vega</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104809">Bug 104809</a> - anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to not having depthBoundsTest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104926">Bug 104926</a> - swrast: Mesa 17.3.3 produces: HW cursor for format 875713089 not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105333">Bug 105333</a> - [gallium-nine] missing geometry after commit ac: replace ac_build_kill with ac_build_kill_if_false</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105371">Bug 105371</a> - r600_shader_from_tgsi - GPR limit exceeded - shader requires 360 registers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105731">Bug 105731</a> - linker error &quot;fragment shader input ... has no matching output in the previous stage&quot; when previous stage's output declaration in a separate shader object</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105904">Bug 105904</a> - Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105975">Bug 105975</a> - i965 always reports 0 viewport subpixel bits</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106231">Bug 106231</a> - llvmpipe blends produce bad code after llvm patch https://reviews.llvm.org/D44785</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106283">Bug 106283</a> - Shader replacements works only for limited use cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106577">Bug 106577</a> - broken rendering with nine and nouveau (GM107)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106833">Bug 106833</a> - glLinkProgram is expected to fail when vertex attribute aliasing happens on ES3.0 context or later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106865">Bug 106865</a> - [GLK] piglit.spec.ext_framebuffer_multisample.accuracy stencil tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106980">Bug 106980</a> - Basemark GPU vulkan benchmark hangs on GFX9</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106997">Bug 106997</a> - [Regression]. Dying light game is crashing on latest mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107088">Bug 107088</a> - [GEN8+] Hang when discarding a fragment if dual source blending is enabled but shader doesn't support it</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107098">Bug 107098</a> - Segfault after munmap(kms_sw_dt-&gt;ro_mapped)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212">Bug 107212</a> - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107223">Bug 107223</a> - [GEN9+] 50% perf drop in SynMark Fill* tests (E2E RBC gets disabled?)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107276">Bug 107276</a> - radv: OpBitfieldUExtract returns incorrect result when count is zero</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107280">Bug 107280</a> - [DXVK] Batman: Arkham City with tessellation enabled hangs on SKL GT4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107313">Bug 107313</a> - Meson instructions on web site are non-optimal</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107359">Bug 107359</a> - [Regression] [bisected] [OpenGL CTS] [SKL,BDW] KHR-GL46.texture_barrier*-texels, GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners, and GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners fail with some configuration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107460">Bug 107460</a> - radv: OpControlBarrier does not always work correctly (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107477">Bug 107477</a> - [DXVK] Setting high shader quality in GTA V results in LLVM error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107483">Bug 107483</a> - DispatchSanity_test.GL31_CORE regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107487">Bug 107487</a> - [intel] [tools] intel gpu tools don't honor -D tools=[]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107488">Bug 107488</a> - gl.h:2090: error: redefinition of typedef GLeglImageOES</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107510">Bug 107510</a> - [GEN8+] up to 10% perf drop on several 3D benchmarks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107511">Bug 107511</a> - KHR/khrplatform.h not always installed when needed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107524">Bug 107524</a> - Broken packDouble2x32 at llvmpipe</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107544">Bug 107544</a> - intel/decoder: out of bounds group_iter</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107547">Bug 107547</a> - shader crashing glsl_compiler (uniform block assigned to vec2, then component substraced by 1)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107550">Bug 107550</a> - &quot;0[2]&quot; as function parameter hits assert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107563">Bug 107563</a> - [RADV] Broken rendering in Unity demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107565">Bug 107565</a> - TypeError: __init__() got an unexpected keyword argument 'future_imports'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107579">Bug 107579</a> - [SNB] The graphic corruption when we reuse the GS compiled and used for TFB when statebuffer contain magic trash in the unused space</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107601">Bug 107601</a> - Rise of the Tomb Raider Segmentation Fault when the game starts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107610">Bug 107610</a> - Dolphin emulator mis-renders shadow overlay in Super Mario Sunshine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107658">Bug 107658</a> - [Regression] [bisected] [OpenGLES CTS] KHR-GLES3.packed_pixels.*rectangle.r*8_snorm</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107734">Bug 107734</a> - [GLSL] glsl-fface-invariant, glsl-fcoord-invariant and glsl-pcoord-invariant should fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107745">Bug 107745</a> - [bisected] [bdw bsw] piglit.spec.arb_fragment_shader_interlock.arb_fragment_shader_interlock-image-load-store failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107760">Bug 107760</a> - GPU Hang when Playing DiRT 3 Complete Edition using Steam Play with DXVK</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107765">Bug 107765</a> - [regression] Batman Arkham City crashes with DXVK under wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107772">Bug 107772</a> - Mesa preprocessor matches if(def)s &amp; endifs incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107779">Bug 107779</a> - Access violation with some games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107786">Bug 107786</a> - [DXVK] MSAA reflections are broken in GTA V</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107806">Bug 107806</a> - glsl_get_natural_size_align_bytes() ABORT with GfxBench Vulkan AztecRuins</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107810">Bug 107810</a> - The 'va_end' call is missed after 'va_copy' in 'util_vsnprintf' function under windows</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107832">Bug 107832</a> - Gallium picking A16L16 formats when emulating INTENSITY16 conflicts with mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107843">Bug 107843</a> - 32bit Mesa build failes with meson.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107856">Bug 107856</a> - i965 incorrectly calculates the number of layers for texture views (assert)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107857">Bug 107857</a> - GPU hang - GS_EMIT without shader outputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107865">Bug 107865</a> - swr fail to build with llvm-libs 6.0.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107869">Bug 107869</a> - u_thread.h:87:4: error: use of undeclared identifier 'cpu_set_t'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107870">Bug 107870</a> - Undefined symbols for architecture x86_64: &quot;_util_cpu_caps&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107879">Bug 107879</a> - crash happens when link program</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107891">Bug 107891</a> - [wine, regression, bisected] RAGE, Wolfenstein The New Order hangs in menu</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107923">Bug 107923</a> - build_id.c:126: multiple definition of `build_id_length'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107926">Bug 107926</a> - [anv] Rise of the Tomb Raider always misrendering, segfault and gpu hang.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107941">Bug 107941</a> - GPU hang and system crash with Dota 2 using Vulkan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107971">Bug 107971</a> - SPV_GOOGLE_hlsl_functionality1 / SPV_GOOGLE_decorate_string</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108012">Bug 108012</a> - Compiler crashes on access of non-existent member incremental operations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108024">Bug 108024</a> - [Debian Stretch]Fail to build because &quot;xcb_randr_lease_t&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108082">Bug 108082</a> - warning: unknown warning option '-Wno-format-truncation' [-Wunknown-warning-option]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108109">Bug 108109</a> - [GLSL] no-overloads.vert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108112">Bug 108112</a> - [vulkancts] some of the coherent memory tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108113">Bug 108113</a> - [vulkancts] r32g32b32 transfer operations not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108115">Bug 108115</a> - [vulkancts] dEQP-VK.subgroups.vote.graphics.subgroupallequal.* fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108164">Bug 108164</a> - [radv] VM faults since 5d6a560a2986c9ab421b3c7904d29bb7bc35e36f</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108245">Bug 108245</a> - RADV/Vega: Low mip levels of large BCn textures get corrupted by vkCmdCopyBufferToImage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108272">Bug 108272</a> - [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108311">Bug 108311</a> - Query buffer object support is broken on r600.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108319">Bug 108319</a> - [GLK BXT BSW] Assertion in piglit.spec.arb_gpu_shader_fp64.execution.built-in-functions.vs-sign-sat-neg-abs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108491">Bug 108491</a> - Commit baa38c14 causes output issues on my VEGA with RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108524">Bug 108524</a> - [RADV] GPU lockup on event synchronization</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108530">Bug 108530</a> - (mesa-18.3) [Tracker] Mesa 18.3 Release Tracker</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108532">Bug 108532</a> - make check nir_copy_prop_vars_test.store_store_load_different_components regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108560">Bug 108560</a> - Mesa 32 is built without sse</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108595">Bug 108595</a> - ir3_compiler valgrind build error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108617">Bug 108617</a> - [deqp] Mesa fails conformance for egl_ext_device</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108630">Bug 108630</a> - [G965] piglit.spec.!opengl 1_2.tex3d-maxsize spins forever</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108635">Bug 108635</a> - Mesa master commit 68dc591af16ebb36814e4c187e4998948103c99c causes XWayland to segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108713">Bug 108713</a> - Gallium: use after free with transform feedback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108829">Bug 108829</a> - [meson] libglapi exports internal API</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108894">Bug 108894</a> - [anv] vkCmdCopyBuffer() and vkCmdCopyQueryPoolResults() write-after-write hazard</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108909">Bug 108909</a> - Vkd3d test failure test_resolve_non_issued_query_data()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108914">Bug 108914</a> - blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this</li>
<h2>Changes</h2>
<ul>
<li>TBD</li>
</ul>
</div>
</body>
</html>

63
docs/relnotes/18.3.1.html Normal file

@@ -0,0 +1,63 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.1 Release Notes / December 11, 2018</h1>
<p>
Mesa 18.3.1 is a bug fix release which fixes bugs found since the 18.3.0 release.
</p>
<p>
Mesa 18.3.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
256d0c3d88e380c1b8e3fc5c6ac34001e3b7c30458b8b852407ec68b8ccd9fda mesa-18.3.1.tar.gz
5b1f827d28684a25f6657289f8b7d47ac56395988c7ac23e0ec9a62b644bdc63 mesa-18.3.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>None</p>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 18.3.0</li>
<li>Update version to 18.3.1</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv,radv: Disable VK_EXT_pci_bus_info</li>
</ul>
</div>
</body>
</html>

265
docs/relnotes/18.3.2.html Normal file

@@ -0,0 +1,265 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.2 Release Notes / January 17, 2019</h1>
<p>
Mesa 18.3.2 is a bug fix release which fixes bugs found since the 18.3.1 release.
</p>
<p>
Mesa 18.3.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
1cde4fafd40cd1ad4ee3a13b364b7a0175a08b7afdd127fb46f918c1e1dfd4b0 mesa-18.3.2.tar.gz
f7ce7181c07b6d8e0132da879af1729523a6c8aa87f79a9d59dfd064024cfb35 mesa-18.3.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106595">Bug 106595</a> - [RADV] Rendering distortions only when MSAA is enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107728">Bug 107728</a> - Wrong background in Sascha Willem's Multisampling Demo</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108114">Bug 108114</a> - [vulkancts] new VK_KHR_16bit_storage tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108116">Bug 108116</a> - [vulkancts] stencil partial clear tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108624">Bug 108624</a> - [regression][bisected] &quot;nir: Copy propagation between blocks&quot; regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108910">Bug 108910</a> - Vkd3d test failure test_multisample_array_texture()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108911">Bug 108911</a> - Vkd3d test failure test_clear_render_target_view()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108943">Bug 108943</a> - Build fails on ppc64le with meson</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109072">Bug 109072</a> - GPU hang in blender 2.80</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109081">Bug 109081</a> - [bisected] [HSW] Regression in clipping.user_defined.clip_* vulkancts tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109151">Bug 109151</a> - [KBL-G][vulkan] dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat failed verification.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109202">Bug 109202</a> - nv50_ir.cpp:749:19: error: cannot use typeid with -fno-rtti</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109204">Bug 109204</a> - [regression, bisected] retroarch's crt-royale shader crash radv</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (3):</p>
<ul>
<li>pci_ids: add new vega10 pci ids</li>
<li>pci_ids: add new vega20 pci id</li>
<li>pci_ids: add new VegaM pci id</li>
</ul>
<p>Alexander von Gluck IV (1):</p>
<ul>
<li>egl/haiku: Fix reference to disp vs dpy</li>
</ul>
<p>Andres Gomez (2):</p>
<ul>
<li>glsl: correct typo in GLSL compilation error message</li>
<li>glsl/linker: specify proper direction in location aliasing error</li>
</ul>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Fix volumetexture dtor on ctor failure</li>
<li>st/nine: Bind src not dst in nine_context_box_upload</li>
<li>st/nine: Add src reference to nine_context_range_upload</li>
</ul>
<p>Bas Nieuwenhuizen (5):</p>
<ul>
<li>radv: Do a cache flush if needed before reading predicates.</li>
<li>radv: Implement buffer stores with less than 4 components.</li>
<li>anv/android: Do not reject storage images.</li>
<li>radv: Fix rasterization precision bits.</li>
<li>spirv: Fix matrix parameters in function calls.</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (3):</p>
<ul>
<li>nir: properly clear the entry sources in copy_prop_vars</li>
<li>nir: properly find the entry to keep in copy_prop_vars</li>
<li>nir: remove dead code from copy_prop_vars</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>radv/xfb: fix counter buffer bounds checks.</li>
<li>virgl/vtest: fix front buffer flush with protocol version 0.</li>
</ul>
<p>Dylan Baker (6):</p>
<ul>
<li>meson: Fix ppc64 little endian detection</li>
<li>meson: Add support for gnu hurd</li>
<li>meson: Add toggle for glx-direct</li>
<li>meson: Override C++ standard to gnu++11 when building with altivec on ppc64</li>
<li>meson: Error out if building nouveau and using LLVM without rtti</li>
<li>autotools: Remove tegra vdpau driver</li>
</ul>
<p>Emil Velikov (12):</p>
<ul>
<li>docs: add sha256 checksums for 18.3.1</li>
<li>bin/get-pick-list.sh: rework handing of sha nominations</li>
<li>bin/get-pick-list.sh: warn when commit lists invalid sha</li>
<li>cherry-ignore: meson: libfreedreno depends upon libdrm (for fence support)</li>
<li>glx: mandate xf86vidmode only for "drm" dri platforms</li>
<li>meson: don't require glx/egl/gbm with gallium drivers</li>
<li>pipe-loader: meson: reference correct library</li>
<li>TODO: glx: meson: build dri based glx tests, only with -Dglx=dri</li>
<li>glx: meson: drop includes from a link-only library</li>
<li>glx: meson: wire up the dispatch-index-check test</li>
<li>glx/test: meson: assorted include fixes</li>
<li>Update version to 18.3.2</li>
</ul>
<p>Eric Anholt (6):</p>
<ul>
<li>v3d: Fix a leak of the transfer helper on screen destroy.</li>
<li>vc4: Fix a leak of the transfer helper on screen destroy.</li>
<li>v3d: Fix a leak of the disassembled instruction string during debug dumps.</li>
<li>v3d: Make sure that a thrsw doesn't split a multop from its umul24.</li>
<li>v3d: Add missing flagging of SYNCB as a TSY op.</li>
<li>gallium/ttn: Fix setup of outputs_written.</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>virgl: wrap vertex element state in a struct</li>
<li>virgl: work around bad assumptions in virglrenderer</li>
</ul>
<p>Francisco Jerez (5):</p>
<ul>
<li>intel/fs: Handle source modifiers in lower_integer_multiplication().</li>
<li>intel/fs: Implement quad swizzles on ICL+.</li>
<li>intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.</li>
<li>intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.</li>
<li>intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass.</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>i965/vec4/dce: Don't narrow the write mask if the flags are used</li>
<li>Revert "nir/lower_indirect: Bail early if modes == 0"</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>clover: Fix build after clang r348827</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>nir/constant_folding: Fix source bit size logic</li>
<li>intel/blorp: Be more conservative about copying clear colors</li>
<li>spirv: Handle any bit size in vector_insert/extract</li>
<li>anv/apply_pipeline_layout: Set the cursor in lower_res_reindex_intrinsic</li>
<li>spirv: Sign-extend array indices</li>
<li>intel/peephole_ffma: Fix swizzle propagation</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nv50/ir: fix use-after-free in ConstantFolding::visit</li>
</ul>
<p>Kirill Burtsev (1):</p>
<ul>
<li>loader: free error state, when checking the drawable type</li>
</ul>
<p>Lionel Landwerlin (5):</p>
<ul>
<li>anv: don't do partial resolve on layer &gt; 0</li>
<li>i965: include draw_params/derived_draw_params for VF cache workaround</li>
<li>i965: add CS stall on VF invalidation workaround</li>
<li>anv: explictly specify format for blorp ccs/mcs op</li>
<li>anv: flush fast clear colors into compressed surfaces</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/mesa: don't leak pipe_surface if pipe_context is not current</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>radeonsi: Fix use of 1- or 2- component GL_DOUBLE vbo's.</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>meson: link LLVM 'native' component when LLVM is available</li>
</ul>
<p>Rhys Perry (3):</p>
<ul>
<li>radv: don't set surf_index for stencil-only images</li>
<li>ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics</li>
<li>ac: split 16-bit ssbo loads that may not be dword aligned</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/drm: fix memory leak</li>
<li>mesa/st/nir: fix missing nir_compact_varyings</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: switch on EOP when primitive restart is enabled with triangle strips</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>tgsi/scan: fix loop exit point in tgsi_scan_tess_ctrl()</li>
<li>tgsi/scan: correctly walk instructions in tgsi_scan_tess_ctrl()</li>
</ul>
<p>Vinson Lee (2):</p>
<ul>
<li>meson: Fix typo.</li>
<li>meson: Fix libsensors detection.</li>
</ul>
</div>
</body>
</html>

208
docs/relnotes/18.3.3.html Normal file

@@ -0,0 +1,208 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.3 Release Notes / January 31, 2019</h1>
<p>
Mesa 18.3.3 is a bug fix release which fixes bugs found since the 18.3.2 release.
</p>
<p>
Mesa 18.3.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
6b9893942fe8011c7736d51448deb6ef80ece2257e0fac27b02e997a6605d5e4 mesa-18.3.3.tar.gz
2ab6886a6966c532ccbcc3b240925e681464b658244f0cbed752615af3936299 mesa-18.3.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108877">Bug 108877</a> - OpenGL CTS gl43 test cases were interrupted due to segment fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109023">Bug 109023</a> - error: inlining failed in call to always_inline __m512 _mm512_and_ps(__m512, __m512): target specific option mismatch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109129">Bug 109129</a> - format_types.h:1220: undefined reference to `_mm256_cvtps_ph'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109229">Bug 109229</a> - glLinkProgram locks up for ~30 seconds</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109242">Bug 109242</a> - [RADV] The Witcher 3 system freeze</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109488">Bug 109488</a> - Mesa 18.3.2 crash on a specific fragment shader (assert triggered) / already fixed on the master branch.</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>bin/get-pick-list.sh: fix the oneline printing</li>
<li>bin/get-pick-list.sh: fix redirection in sh</li>
</ul>
<p>Axel Davy (1):</p>
<ul>
<li>st/nine: Immediately upload user provided textures</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Only use 32 KiB per threadgroup on Stoney.</li>
<li>radv: Set partial_vs_wave for pipelines with just GS, not tess.</li>
<li>nir: Account for atomics in copy propagation.</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>gallium/swr: Fix multi-context sync fence deadlock.</li>
</ul>
<p>Carsten Haitzler (Rasterman) (2):</p>
<ul>
<li>vc4: Use named parameters for the NEON inline asm.</li>
<li>vc4: Declare the cpu pointers as being modified in NEON asm.</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>glsl: Fix copying function's out to temp if dereferenced by array</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>dri_interface: add put shm image2 (v2)</li>
<li>glx: add support for putimageshm2 path (v2)</li>
<li>gallium: use put image shm2 path (v2)</li>
</ul>
<p>Dylan Baker (4):</p>
<ul>
<li>meson: allow building dri driver without window system if osmesa is classic</li>
<li>meson: fix swr KNL build</li>
<li>meson: Fix compiler checks for SWR with ICC</li>
<li>meson: Add warnings and errors when using ICC</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.3.2</li>
<li>cherry-ignore: radv: Fix multiview depth clears</li>
<li>cherry-ignore: spirv: Handle arbitrary bit sizes for deref array indices</li>
<li>cherry-ignore: WARNING: Commit XXX lists invalid sha</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>vc4: Don't leak the GPU fd for renderonly usage.</li>
<li>vc4: Enable NEON asm on meson cross-builds.</li>
</ul>
<p>Eric Engestrom (2):</p>
<ul>
<li>configure: EGL requirements only apply if EGL is built</li>
<li>meson/vdpau: add missing soversion</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>anv/device: fix maximum number of images supported</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>anv/nir: Rework arguments to apply_pipeline_layout</li>
<li>anv: Only parse pImmutableSamplers if the descriptor has samplers</li>
<li>nir/xfb: Fix offset accounting for dvec3/4</li>
</ul>
<p>Karol Herbst (2):</p>
<ul>
<li>nv50/ir: disable tryCollapseChainedMULs in ConstantFolding for precise instructions</li>
<li>glsl/lower_output_reads: set invariant and precise flags on temporaries</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix invalid binding table index computation</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>radeonsi: also apply the GS hang workaround to draws without tessellation</li>
<li>radeonsi: fix a u_blitter crash after a shader with FBFETCH</li>
<li>radeonsi: fix rendering to tiny viewports where the viewport center is &gt; 8K</li>
<li>st/mesa: purge framebuffers when unbinding a context</li>
</ul>
<p>Niklas Haas (1):</p>
<ul>
<li>radv: correctly use vulkan 1.0 by default</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>meson: Fix with_gallium_icd to with_opencl_icd</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>loader: fix the no-modifiers case</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: clean up setting partial_es_wave for distributed tess on VI</li>
</ul>
<p>Timothy Arceri (5):</p>
<ul>
<li>ac/nir_to_llvm: fix interpolateAt* for arrays</li>
<li>ac/nir_to_llvm: fix clamp shadow reference for more hardware</li>
<li>radv/ac: fix some fp16 handling</li>
<li>glsl: use remap location when serialising uniform program resource data</li>
<li>glsl: Copy function out to temp if we don't directly ref a variable</li>
</ul>
<p>Tomeu Vizoso (1):</p>
<ul>
<li>etnaviv: Consolidate buffer references from framebuffers</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>meson: Fix typo.</li>
</ul>
</div>
</body>
</html>

180
docs/relnotes/18.3.4.html Normal file

@@ -0,0 +1,180 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.4 Release Notes / February 18, 2019</h1>
<p>
Mesa 18.3.4 is a bug fix release which fixes bugs found since the 18.3.3 release.
</p>
<p>
Mesa 18.3.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
e22e6fe4c3aca80fe872a0a7285b6c5523e0cfc0bfb57ffcc3b3d66d292593e4 mesa-18.3.4.tar.gz
32314da4365d37f80d84f599bd9625b00161c273c39600ba63b45002d500bb07 mesa-18.3.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109107">Bug 109107</a> - gallium/st/va: change va max_profiles when using Radeon VCN Hardware</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109401">Bug 109401</a> - [DXVK] Project Cars rendering problems</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109543">Bug 109543</a> - After upgrade mesa to 19.0.0~rc1 all vulkan based application stop working [&quot;vulkan-cube&quot; received SIGSEGV in radv_pipeline_init_blend_state at ../src/amd/vulkan/radv_pipeline.c:699]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109603">Bug 109603</a> - nir_instr_as_deref: Assertion `parent &amp;&amp; parent-&gt;type == nir_instr_type_deref' failed.</li>
</ul>
<h2>Changes</h2>
<p>Bart Oldeman (1):</p>
<ul>
<li>gallium-xlib: query MIT-SHM before using it.</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>radv: Only look at pImmutableSamples if the descriptor has a sampler.</li>
<li>amd/common: Use correct writemask for shared memory stores.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>get-pick-list: Add --pretty=medium to the arguments for Cc patches</li>
<li>meson: Add dependency on genxml to anvil</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 18.3.3</li>
<li>cherry-ignore: nv50,nvc0: add explicit settings for recent caps</li>
<li>cherry-ignore: add more 19.0 only nominations from Ilia</li>
<li>cherry-ignore: radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8</li>
<li>Update version to 18.3.4</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Fix copy-and-paste fail in backport of NEON asm fixes.</li>
</ul>
<p>Eric Engestrom (2):</p>
<ul>
<li>xvmc: fix string comparison</li>
<li>xvmc: fix string comparison</li>
</ul>
<p>Ernestas Kulik (2):</p>
<ul>
<li>vc4: Fix leak in HW queries error path</li>
<li>v3d: Fix leak in resource setup error path</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nvc0: we have 16k-sized framebuffers, fix default scissors</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf()</li>
<li>intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode</li>
<li>nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>anv/cmd_buffer: check for NULL framebuffer</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048</li>
</ul>
<p>Kristian H. Kristensen (1):</p>
<ul>
<li>freedreno/a6xx: Emit blitter dst with OUT_RELOCW</li>
</ul>
<p>Leo Liu (2):</p>
<ul>
<li>st/va: fix the incorrect max profiles report</li>
<li>st/va/vp9: set max reference as default of VP9 reference number</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>meson: drop the xcb-xrandr version requirement</li>
<li>gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets &gt; 0</li>
<li>radeonsi: fix EXPLICIT_FLUSH for flush offsets &gt; 0</li>
<li>winsys/amdgpu: don't drop manually added fence dependencies</li>
</ul>
<p>Mario Kleiner (2):</p>
<ul>
<li>egl/wayland: Allow client-&gt;server format conversion for PRIME offload. (v2)</li>
<li>egl/wayland-drm: Only announce formats via wl_drm which the driver supports.</li>
</ul>
<p>Oscar Blumberg (1):</p>
<ul>
<li>radeonsi: Fix guardband computation for large render targets</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: stop frob'ing pipe_resource::nr_samples</li>
</ul>
<p>Rodrigo Vivi (1):</p>
<ul>
<li>intel: Add more PCI Device IDs for Coffee Lake and Ice Lake.</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: fix compiler issues with GCC 9</li>
<li>radv: always export gl_SampleMask when the fragment shader uses it</li>
</ul>
</div>
</body>
</html>

74
docs/relnotes/19.0.0.html Normal file

@@ -0,0 +1,74 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.0.0 Release Notes / TBD</h1>
<p>
Mesa 19.0.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 19.0.1.
</p>
<p>
Mesa 19.0.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>GL_AMD_texture_texture4 on all GL 4.0 drivers.</li>
<li>GL_EXT_shader_implicit_conversions on all drivers (ES extension).</li>
<li>GL_EXT_texture_compression_bptc on all GL 4.0 drivers (ES extension).</li>
<li>GL_EXT_texture_compression_rgtc on all GL 3.0 drivers (ES extension).</li>
<li>GL_EXT_render_snorm on gallium drivers (ES extension).</li>
<li>GL_EXT_texture_view on drivers supporting texture views (ES extension).</li>
<li>GL_OES_texture_view on drivers supporting texture views (ES extension).</li>
<li>GL_NV_shader_atomic_float on nvc0 (Fermi/Kepler only).</li>
<li>Shader-based software implementations of GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, GL_ARB_vertex_attrib_64bit, and GL_ARB_shader_ballot on i965.</li>
<li>VK_ANDROID_external_memory_android_hardware_buffer on Intel</li>
<li>Fixed and re-exposed VK_EXT_pci_bus_info on Intel and RADV</li>
<li>VK_EXT_scalar_block_layout on Intel and RADV</li>
<li>VK_KHR_depth_stencil_resolve on Intel</li>
<li>VK_KHR_draw_indirect_count on Intel</li>
<li>VK_EXT_conditional_rendering on Intel</li>
<li>VK_EXT_memory_budget on RADV</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>TBD</li>
</ul>
<h2>Changes</h2>
<ul>
<li>TBD</li>
</ul>
</div>
</body>
</html>

60
docs/relnotes/19.1.0.html Normal file

@@ -0,0 +1,60 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.0 Release Notes / TBD</h1>
<p>
Mesa 19.1.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 19.1.1.
</p>
<p>
Mesa 19.1.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>GL_EXT_texture_compression_s3tc_srgb on Gallium drivers and i965 (ES extension).</li>
<li>VK_EXT_buffer_device_address on Intel and RADV.</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>TBD</li>
</ul>
<h2>Changes</h2>
<ul>
<li>TBD</li>
</ul>
</div>
</body>
</html>


@@ -35,9 +35,9 @@ You may access the repository either as an
 <p>
 You may also
-<a href="https://cgit.freedesktop.org/mesa/mesa/"
+<a href="https://gitlab.freedesktop.org/mesa/mesa"
 >browse the main Mesa git repository</a> and the
-<a href="https://cgit.freedesktop.org/mesa/demos"
+<a href="https://gitlab.freedesktop.org/mesa/demos"
 >Mesa demos and tests git repository</a>.
 </p>
@@ -52,7 +52,7 @@ To get the Mesa sources anonymously (read-only):
 <li>Install the git software on your computer if needed.<br><br>
 <li>Get an initial, local copy of the repository with:
 <pre>
-git clone git://anongit.freedesktop.org/git/mesa/mesa
+git clone https://gitlab.freedesktop.org/mesa/mesa.git
 </pre>
 <li>Later, you can update your tree from the master repository with:
 <pre>
@@ -60,7 +60,7 @@ To get the Mesa sources anonymously (read-only):
 </pre>
 <li>If you also want the Mesa demos/tests repository:
 <pre>
-git clone git://anongit.freedesktop.org/git/mesa/demos
+git clone https://gitlab.freedesktop.org/mesa/demos.git
 </pre>
 </ol>
@@ -98,24 +98,17 @@ on a particular driver, add a new extension, etc.) in the bugzilla record.
 </ol>
 <p>
-Once your account is established:
-</p>
-<ol>
-<li>Get an initial, local copy of the repository with:
-<pre>
-git clone git+ssh://username@git.freedesktop.org/git/mesa/mesa
-</pre>
-Replace <em>username</em> with your actual login name.<br><br>
-<li>Later, you can update your tree from the master repository with:
-<pre>
-git pull origin
-</pre>
-<li>If you also want the Mesa demos/tests repository:
-<pre>
-git clone git+ssh://username@git.freedesktop.org/git/mesa/demos
-</pre>
-</ol>
+Once your account is established, you can update your push url to use SSH:
+<pre>
+git remote set-url --push <em>origin</em> git@gitlab.freedesktop.org:mesa/mesa.git
+</pre>
+You can also use <a href="https://gitlab.freedesktop.org/profile/personal_access_tokens">personal access tokens</a>
+to push over HTTPS instead (useful for people behind strict proxies).
+In this case, create a token, and put it in the url as shown here:
+<pre>
+git remote set-url --push <em>origin</em> https://<em>USER</em>:<em>TOKEN</em>@gitlab.freedesktop.org/mesa/mesa.git
+</pre>
 <h2>Windows Users</h2>
@@ -149,12 +142,12 @@ code while a branch has the latest stable code.
 </p>
 <p>
-The command <code>git-branch</code> will list all available branches.
+The command <code>git branch</code> will list all available branches.
 </p>
 <p>
 Questions about branch status/activity should be posted to the
-mesa3d-dev mailing list.
+mesa-dev mailing list.
 </p>
 <h2>Developer Git Tips</h2>


@@ -85,7 +85,7 @@ should match the filenames of the corresponding dumped shaders.
 <p>
 Setting <b>MESA_SHADER_CAPTURE_PATH</b> to a directory will cause the compiler
 to write <tt>.shader_test</tt> files for use with
-<a href="https://cgit.freedesktop.org/mesa/shader-db">shader-db</a>, a tool
+<a href="https://gitlab.freedesktop.org/mesa/shader-db">shader-db</a>, a tool
 which compiler developers can use to gather statistics about shaders
 (instructions, cycles, memory accesses, and so on).
 </p>


@@ -31,7 +31,7 @@ the <code>doxygen</code> directory and run <code>make</code>.
 <p>
 For an example of Doxygen usage in Mesa, see a recent source file
-such as <a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/bufferobj.c">bufferobj.c</a>.
+such as <a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/mesa/main/bufferobj.c">bufferobj.c</a>.
 </p>


@@ -0,0 +1,82 @@
Name
MESA_device_software
Name Strings
EGL_MESA_device_software
Contributors
Adam Jackson <ajax@redhat.com>
Emil Velikov <emil.velikov@collabora.com>
Contacts
Adam Jackson <ajax@redhat.com>
Status
DRAFT
Version
Version 2, 2018-10-03
Number
EGL Extension #TODO
Extension Type
EGL device extension
Dependencies
Requires EGL_EXT_device_query.
This extension is written against the EGL 1.5 Specification.
Overview
This extension defines a software EGL "device". The device is not backed by
any actual device node and simply renders into client memory.
By defining this as an extension, EGL_EXT_device_enumeration is able to
sanely enumerate a software device.
New Types
None
New Procedures and Functions
None
New Tokens
None
Additions to the EGL Specification
None
New Behavior
The device list produced by eglQueryDevicesEXT will include a software
device. This can be distinguished from other device classes in the usual
way by calling eglQueryDeviceStringEXT(EGL_EXTENSIONS) and matching this
extension's string in the result.
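
The matching step described above can be sketched in C as follows. This is a minimal illustration and not part of the extension: the helper name has_extension is hypothetical, and it assumes the extension string is a space-separated list, as EGL extension strings normally are. It matches whole tokens because a plain substring search would also accept a longer extension name that merely starts with the one being sought.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole,
 * space-separated token in `extensions`. `name` must not contain spaces. */
static bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);

    if (extensions == NULL || len == 0)
        return false;

    for (const char *p = extensions; (p = strstr(p, name)) != NULL; p += len) {
        /* The match must start at the beginning of a token ... */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        /* ... and end at the end of that same token. */
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');

        if (starts_token && ends_token)
            return true;
    }
    return false;
}
```

In a real application the first argument would be the string returned by eglQueryDeviceStringEXT(device, EGL_EXTENSIONS) for each device reported by eglQueryDevicesEXT, and the second would be "EGL_MESA_device_software".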
Issues
None
Revision History
Version 2, 2018-10-03 (Emil Velikov)
- Drop "fallback" from "software fallback device"
- Add Emil Velikov as contributor
Version 1, 2017-07-06 (Adam Jackson)
- Initial version

@@ -0,0 +1,95 @@
Name
MESA_query_driver
Name Strings
EGL_MESA_query_driver
Contact
Rob Clark <robdclark 'at' gmail.com>
Nicolai Hähnle <Nicolai.Haehnle 'at' amd.com>
Contributors
Veluri Mithun <velurimithun38 'at' gmail.com>
Status
Complete
Version
Version 3, 2019-01-24
Number
EGL Extension 131
Dependencies
EGL 1.0 is required.
Overview
When an application needs to query the name of a driver, or to obtain
the driver's option list (UTF-8 encoded XML), the functions below
are useful.
The XML formally describes all available options and also
includes verbal descriptions in multiple languages. Its main purpose
is to be automatically processed by configuration GUIs.
The XML shall respect the following DTD:
<!ELEMENT driinfo (section*)>
<!ELEMENT section (description+, option+)>
<!ELEMENT description (enum*)>
<!ATTLIST description lang CDATA #REQUIRED
text CDATA #REQUIRED>
<!ELEMENT option (description+)>
<!ATTLIST option name CDATA #REQUIRED
type (bool|enum|int|float) #REQUIRED
default CDATA #REQUIRED
valid CDATA #IMPLIED>
<!ELEMENT enum EMPTY>
<!ATTLIST enum value CDATA #REQUIRED
text CDATA #REQUIRED>
New Procedures and Functions
char* eglGetDisplayDriverConfig(EGLDisplay dpy);
const char* eglGetDisplayDriverName(EGLDisplay dpy);
Description
Passing an EGLDisplay to `eglGetDisplayDriverName` retrieves the name of the
driver. Similarly, passing an EGLDisplay to `eglGetDisplayDriverConfig` retrieves
the driver's configuration options in XML format.
The string returned by `eglGetDisplayDriverConfig` is heap-allocated and the caller
is responsible for freeing it.
EGL_BAD_DISPLAY is generated if `dpy` is not an EGL display connection.
EGL_NOT_INITIALIZED is generated if `dpy` has not been initialized.
If the implementation does not have enough resources to allocate the XML then an
EGL_BAD_ALLOC error is generated.
New Tokens
No new tokens
Issues
None
Revision History
Version 1, 2018-11-05 - First draft (Veluri Mithun)
Version 2, 2019-01-23 - Final version (Veluri Mithun)
Version 3, 2019-01-24 - Mark as complete, add Khronos extension
number, fix parameter name in prototypes,
write revision history (Eric Engestrom)

@@ -0,0 +1,200 @@
Name
INTEL_shader_atomic_float_minmax
Name Strings
GL_INTEL_shader_atomic_float_minmax
Contact
Ian Romanick (ian . d . romanick 'at' intel . com)
Contributors
Status
In progress
Version
Last Modified Date: 06/22/2018
Revision: 4
Number
TBD
Dependencies
OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
ARB_compute_shader is required.
This extension is written against version 4.60 of the OpenGL Shading
Language Specification.
Overview
This extension provides GLSL built-in functions allowing shaders to
perform atomic read-modify-write operations to floating-point buffer
variables and shared variables. Minimum, maximum, exchange, and
compare-and-swap are enabled.
New Procedures and Functions
None.
New Tokens
None.
IP Status
None.
Modifications to the OpenGL Shading Language Specification, Version 4.60
Including the following line in a shader can be used to control the
language features described in this extension:
#extension GL_INTEL_shader_atomic_float_minmax : <behavior>
where <behavior> is as specified in section 3.3.
New preprocessor #defines are added to the OpenGL Shading Language:
#define GL_INTEL_shader_atomic_float_minmax 1
Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)
Modify Section 8.11, "Atomic Memory Functions"
(add a new row after the existing "atomicMin" table row, p. 179)
float atomicMin(inout float mem, float data)
Computes a new value by taking the minimum of the value of data and
the contents of mem. If one of these is an IEEE signaling NaN (i.e.,
a NaN with the most-significant bit of the mantissa cleared), it is
always considered smaller. If one of these is an IEEE quiet NaN
(i.e., a NaN with the most-significant bit of the mantissa set), it is
always considered larger. If both are IEEE quiet NaNs or both are
IEEE signaling NaNs, the result of the comparison is undefined.
(add a new row after the existing "atomicMax" table row, p. 179)
float atomicMax(inout float mem, float data)
Computes a new value by taking the maximum of the value of data and
the contents of mem. If one of these is an IEEE signaling NaN (i.e.,
a NaN with the most-significant bit of the mantissa cleared), it is
always considered larger. If one of these is an IEEE quiet NaN (i.e.,
a NaN with the most-significant bit of the mantissa set), it is always
considered smaller. If both are IEEE quiet NaNs or both are IEEE
signaling NaNs, the result of the comparison is undefined.
(add to "atomicExchange" table cell, p. 180)
float atomicExchange(inout float mem, float data)
(add to "atomicCompSwap" table cell, p. 180)
float atomicCompSwap(inout float mem, float compare, float data)
Interactions with OpenGL 4.6 and ARB_gl_spirv
If OpenGL 4.6 or ARB_gl_spirv is supported, then
SPV_INTEL_shader_atomic_float_minmax must also be supported.
The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.
Issues
1) Why call this extension INTEL_shader_atomic_float_minmax?
RESOLVED: Several other extensions already set the precedent of
VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
that enable floating-point atomic operations. Using that as a base for
the name seems logical.
There already exists NV_shader_atomic_float, but the two extensions have
nearly zero overlap in functionality. NV_shader_atomic_float adds
atomicAdd and image atomic operations that currently shipping Intel GPUs
do not support. Calling this extension INTEL_shader_atomic_float would
likely have been confusing.
Adding something to describe the actual functions added by this extension
seemed reasonable. INTEL_shader_atomic_float_compare was considered, but
that name was deemed to be not properly descriptive. Calling this
extension INTEL_shader_atomic_float_min_max_exchange_compswap is right
out.
2) What atomic operations should we support for floating-point targets?
RESOLVED. Exchange, min, max, and compare-swap make sense, and these are
all supported by the hardware. Future extensions may add other functions.
For buffer variables and shared variables it is not possible to bit-cast
the memory location in GLSL, so existing integer operations, such as
atomicOr, cannot be used. However, the underlying hardware implementation
can do this by treating the memory as an integer. It would be possible to
implement atomicNegate using this technique with atomicXor. It is unclear
whether this provides any actual utility.
3) What should be said about the NaN behavior?
RESOLVED. There are several aspects of NaN behavior that should be
documented in this extension. However, some of this behavior varies based
on NaN concepts that do not exist in the GLSL specification.
* atomicCompSwap performs the comparison as the floating-point equality
operator (==). That is, if either 'mem' or 'compare' is NaN, the
comparison result is always false.
* atomicMin and atomicMax implement the IEEE specification with respect to
NaN. IEEE considers two different kinds of NaN: signaling NaN and quiet
NaN. A quiet NaN has the most significant bit of the mantissa set, and
a signaling NaN does not. This concept does not exist in SPIR-V,
Vulkan, or OpenGL. Let qNaN denote a quiet NaN and sNaN denote a
signaling NaN. atomicMin and atomicMax specifically implement
- fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
- fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
- fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
fmax(qNaN, sNaN) = sNaN
- fmin(sNaN, sNaN) = sNaN. This specification does not define which of
the two arguments is stored.
- fmax(sNaN, sNaN) = sNaN. This specification does not define which of
the two arguments is stored.
- fmin(qNaN, qNaN) = qNaN. This specification does not define which of
the two arguments is stored.
- fmax(qNaN, qNaN) = qNaN. This specification does not define which of
the two arguments is stored.
Further details are available in the Skylake Programmer's Reference
Manuals available at
https://01.org/linuxgraphics/documentation/hardware-specification-prms.
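
The NaN rules listed above can be written down as a small reference model in C. This is only a sketch under stated assumptions, not the hardware or driver implementation: the function and helper names are hypothetical, values are handled as 32-bit IEEE bit patterns so that the host FPU cannot quiet a signaling NaN in transit, and where the text leaves the stored argument undefined (both NaNs of the same kind) the model arbitrarily keeps the current memory value. Signed zeros are deliberately not special-cased here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without changing any bits. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* NaN: exponent all ones and a non-zero mantissa. */
static bool is_nan_bits(uint32_t u)
{
    return (u & 0x7f800000u) == 0x7f800000u && (u & 0x007fffffu) != 0;
}

/* Signaling NaN: a NaN whose most significant mantissa bit is clear. */
static bool is_snan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) == 0;
}

/* Hypothetical model of the value stored by atomicMin (is_max == false)
 * or atomicMax (is_max == true) for memory value `mem` and operand `data`. */
static uint32_t model_minmax_bits(uint32_t mem, uint32_t data, bool is_max)
{
    /* A signaling NaN always wins; with two sNaNs the choice is arbitrary. */
    if (is_snan_bits(mem))
        return mem;
    if (is_snan_bits(data))
        return data;

    /* Two quiet NaNs: the result is a qNaN; which one is unspecified. */
    if (is_nan_bits(mem) && is_nan_bits(data))
        return mem;

    /* A single quiet NaN always loses to the other operand. */
    if (is_nan_bits(mem))
        return data;
    if (is_nan_bits(data))
        return mem;

    /* Ordinary comparison for non-NaN operands. */
    float a = bits_to_float(mem);
    float b = bits_to_float(data);
    if (is_max)
        return (b > a) ? data : mem;
    return (b < a) ? data : mem;
}
```

For example, with 0x7fc00000 as a quiet NaN and 0x3f800000 as 1.0, the model returns the bit pattern of 1.0 for both min and max, matching the first rule in the list.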
4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
arguments?
RESOLVED. atomicMin should store -0.0, and atomicMax should store +0.0.
Due to a known issue in shipping Skylake GPUs, the incorrectly signed 0 is
stored. This behavior may change in later GPUs.
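
The intended signed-zero result can be illustrated with a short C sketch. The function names are hypothetical and the code models only the behavior this issue calls correct (atomicMin stores -0.0, atomicMax stores +0.0), not the Skylake behavior; NaN operands are out of scope here.

```c
#include <math.h>

/* Intended atomicMin result: if both operands are zero, a negative zero
 * wins, because the == and < operators cannot tell the two zeros apart. */
static float min_with_signed_zero(float a, float b)
{
    if (a == 0.0f && b == 0.0f)
        return (signbit(a) || signbit(b)) ? -0.0f : 0.0f;
    return (b < a) ? b : a;
}

/* Intended atomicMax result: if both operands are zero, a positive zero
 * wins; the result is negative only when both zeros are negative. */
static float max_with_signed_zero(float a, float b)
{
    if (a == 0.0f && b == 0.0f)
        return (signbit(a) && signbit(b)) ? -0.0f : 0.0f;
    return (b > a) ? b : a;
}
```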
Revision History
Rev Date Author Changes
--- ---------- -------- ---------------------------------------------
1 04/19/2018 idr Initial version
2 05/05/2018 idr Describe interactions with the capabilities
added by SPV_INTEL_shader_atomic_float_minmax.
3 05/29/2018 idr Remove mention of 64-bit float support.
4 06/22/2018 idr Resolve issue #2.
Add issue #3 (regarding NaN behavior).
Add issue #4 (regarding atomicMin(-0, +0)).

@@ -0,0 +1,81 @@
Name
MESA_framebuffer_flip_y
Name Strings
GL_MESA_framebuffer_flip_y
Contact
Fritz Koenig <frkoenig@google.com>
Contributors
Fritz Koenig, Google
Kristian Høgsberg, Google
Chad Versace, Google
Status
Proposal
Version
Version 1, June 7, 2018
Number
302
Dependencies
OpenGL ES 3.1 is required, for FramebufferParameteri.
Overview
This extension defines a new framebuffer parameter,
GL_FRAMEBUFFER_FLIP_Y_MESA, that changes the behavior of the reads and
writes to the framebuffer attachment points. When GL_FRAMEBUFFER_FLIP_Y_MESA
is GL_TRUE, render commands and pixel transfer operations access the
backing store of each attachment point with a y-inverted coordinate
system. This y-inversion is relative to the coordinate system set when
GL_FRAMEBUFFER_FLIP_Y_MESA is GL_FALSE.
Access through TexSubImage2D and similar calls will notice the effect of
the flip when they are not attached to framebuffer objects because
GL_FRAMEBUFFER_FLIP_Y_MESA is associated with the framebuffer object and
not the attachment points.
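
As an illustration of the y-inversion described above, the following C sketch maps a framebuffer row to the row touched in the backing store. It assumes the inversion is the usual mirror about the horizontal centre line of an attachment that is `height` rows tall; the function name backing_row is hypothetical and the spec text itself does not give a formula.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: row of the backing store accessed for framebuffer
 * row `y` (0 <= y < height) when the flip parameter is on or off. */
static size_t backing_row(size_t y, size_t height, bool flip_y)
{
    return flip_y ? (height - 1 - y) : y;
}
```

An application would enable the behavior on the currently bound framebuffer object with FramebufferParameteri, passing GL_FRAMEBUFFER_FLIP_Y_MESA as <pname> and GL_TRUE as the value.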
IP Status
None
Issues
None
New Procedures and Functions
None
New Types
None
New Tokens
Accepted by the <pname> argument of FramebufferParameteri and
GetFramebufferParameteriv:
GL_FRAMEBUFFER_FLIP_Y_MESA 0x8BBB
Errors
An INVALID_OPERATION error is generated by GetFramebufferParameteriv if the
default framebuffer is bound to <target> and <pname> is FRAMEBUFFER_FLIP_Y_MESA.
Revision History
Version 1, June, 2018
Initial draft (Fritz Koenig)

@@ -20,11 +20,11 @@ Status
Version Version
Version 8, 14-February-2014 Version 9, 09 November 2018
Number Number
TBD. OpenGL Extension #446
Dependencies Dependencies
@@ -32,9 +32,6 @@ Dependencies
GLX_ARB_create_context and GLX_ARB_create_context_profile are required. GLX_ARB_create_context and GLX_ARB_create_context_profile are required.
This extension interacts with GLX_EXT_create_context_es2_profile and
GLX_EXT_create_context_es_profile.
Overview Overview
In many situations, applications want to detect characteristics of a In many situations, applications want to detect characteristics of a
@@ -95,18 +92,13 @@ New Tokens
GLX_RENDERER_VENDOR_ID_MESA GLX_RENDERER_VENDOR_ID_MESA
GLX_RENDERER_DEVICE_ID_MESA GLX_RENDERER_DEVICE_ID_MESA
Accepted as an attribute name in <*attrib_list> in
glXCreateContextAttribsARB:
GLX_RENDERER_ID_MESA 0x818E
Additions to the OpenGL / WGL Specifications Additions to the OpenGL / WGL Specifications
None. This specification is written for GLX. None. This specification is written for GLX.
Additions to the GLX 1.4 Specification Additions to the GLX 1.4 Specification
[Add the following to Section X.Y.Z of the GLX Specification] [Add to Section 3.3.2 "GLX Versioning" of the GLX Specification]
To obtain information about the available renderers for a particular To obtain information about the available renderers for a particular
display and screen, display and screen,
@@ -206,29 +198,6 @@ Additions to the GLX 1.4 Specification
format as the string that would be returned by glGetString of GL_RENDERER. format as the string that would be returned by glGetString of GL_RENDERER.
It may, however, have a different value. It may, however, have a different value.
[Add to section section 3.3.7 "Rendering Contexts"]
The attribute name GLX_RENDERER_ID_MESA specified the index of the render
against which the context should be created. The default value of
GLX_RENDERER_ID_MESA is 0.
[Add to list of errors for glXCreateContextAttribsARB in section section
3.3.7 "Rendering Contexts"]
* If the value of GLX_RENDERER_ID_MESA specifies a non-existent
renderer, BadMatch is generated.
Dependencies on GLX_EXT_create_context_es_profile and
GLX_EXT_create_context_es2_profile
If neither extension is supported, remove all mention of
GLX_RENDERER_OPENGL_ES2_PROFILE_VERSION_MESA from the spec.
If GLX_EXT_create_context_es_profile is not supported, remove all mention of
GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA from the spec.
Issues Issues
1) How should the difference between on-card and GART memory be exposed? 1) How should the difference between on-card and GART memory be exposed?
@@ -408,3 +377,9 @@ Revision History
read GLX_RENDERER_ID_MESA. The VENDOR/DEVICE_ID read GLX_RENDERER_ID_MESA. The VENDOR/DEVICE_ID
example given in issue #17 should be 0x5143 and example given in issue #17 should be 0x5143 and
0xFFFFFFFF respectively. 0xFFFFFFFF respectively.
Version 9, 2018/11/09 - Remove GLX_RENDERER_ID_MESA, which has never been
implemented. Remove the unnecessary interactions
with the GLX GLES profile extensions. Note the
official GL extension number. Specify the section
of the GLX spec to modify.

@@ -71,6 +71,9 @@ GL_MESA_tile_raster_order
GL_TILE_RASTER_ORDER_INCREASING_X_MESA 0x8BB9 GL_TILE_RASTER_ORDER_INCREASING_X_MESA 0x8BB9
GL_TILE_RASTER_ORDER_INCREASING_Y_MESA 0x8BBA GL_TILE_RASTER_ORDER_INCREASING_Y_MESA 0x8BBA
GL_MESA_framebuffer_flip_y
GL_FRAMEBUFFER_FLIP_Y_MESA 0x8BBB
EGL_MESA_drm_image EGL_MESA_drm_image
EGL_DRM_BUFFER_FORMAT_MESA 0x31D0 EGL_DRM_BUFFER_FORMAT_MESA 0x31D0
EGL_DRM_BUFFER_USE_MESA 0x31D1 EGL_DRM_BUFFER_USE_MESA 0x31D1

@@ -21,7 +21,7 @@
<li><a href="#guidelines">Basic guidelines</a> <li><a href="#guidelines">Basic guidelines</a>
<li><a href="#formatting">Patch formatting</a> <li><a href="#formatting">Patch formatting</a>
<li><a href="#testing">Testing Patches</a> <li><a href="#testing">Testing Patches</a>
<li><a href="#mailing">Mailing Patches</a> <li><a href="#submit">Submitting Patches</a>
<li><a href="#reviewing">Reviewing Patches</a> <li><a href="#reviewing">Reviewing Patches</a>
<li><a href="#nominations">Nominating a commit for a stable branch</a> <li><a href="#nominations">Nominating a commit for a stable branch</a>
<li><a href="#criteria">Criteria for accepting patches to the stable branch</a> <li><a href="#criteria">Criteria for accepting patches to the stable branch</a>
@@ -36,14 +36,16 @@
perhaps, in very trivial cases.) perhaps, in very trivial cases.)
<li>Code patches should follow Mesa <li>Code patches should follow Mesa
<a href="codingstyle.html" target="_parent">coding conventions</a>. <a href="codingstyle.html" target="_parent">coding conventions</a>.
<li>Whenever possible, patches should only effect individual Mesa/Gallium <li>Whenever possible, patches should only affect individual Mesa/Gallium
components. components.
<li>Patches should never introduce build breaks and should be bisectable (see <li>Patches should never introduce build breaks and should be bisectable (see
<code>git bisect</code>.) <code>git bisect</code>.)
<li>Patches should be properly <a href="#formatting">formatted</a>. <li>Patches should be properly <a href="#formatting">formatted</a>.
<li>Patches should be sufficiently <a href="#testing">tested</a> before submitting. <li>Patches should be sufficiently <a href="#testing">tested</a> before submitting.
<li>Patches should be submitted to <a href="#mailing">mesa-dev</a> <li>Patches should be <a href="#submit">submitted</a>
for <a href="#reviewing">review</a> using <code>git send-email</code>. to <a href="#mailing">mesa-dev</a> or with
a <a href="#merge-request">merge request</a>
for <a href="#reviewing">review</a>.
</ul> </ul>
@@ -122,9 +124,9 @@ Please use common sense and do <strong>not</strong> blindly add everyone.
<pre> <pre>
$ scripts/get_reviewer.pl --help # to get the help screen $ scripts/get_reviewer.pl --help # to get the help screen
$ scripts/get_reviewer.pl -f src/egl/drivers/dri2/platform_android.c $ scripts/get_reviewer.pl -f src/egl/drivers/dri2/platform_android.c
Rob Herring <robh@kernel.org> (reviewer:ANDROID EGL SUPPORT,added_lines:188/700=27%,removed_lines:58/283=20%) Rob Herring &lt;robh@kernel.org&gt; (reviewer:ANDROID EGL SUPPORT,added_lines:188/700=27%,removed_lines:58/283=20%)
Tomasz Figa <tfiga@chromium.org> (reviewer:ANDROID EGL SUPPORT,authored:12/41=29%,added_lines:308/700=44%,removed_lines:115/283=41%) Tomasz Figa &lt;tfiga@chromium.org&gt; (reviewer:ANDROID EGL SUPPORT,authored:12/41=29%,added_lines:308/700=44%,removed_lines:115/283=41%)
Emil Velikov <emil.l.velikov@gmail.com> (authored:13/41=32%,removed_lines:76/283=27%) Emil Velikov &lt;emil.l.velikov@gmail.com&gt; (authored:13/41=32%,removed_lines:76/283=27%)
</pre> </pre>
</ul> </ul>
@@ -156,18 +158,29 @@ As mentioned at the begining, patches should be bisectable.
A good way to test this is to make use of the `git rebase` command, A good way to test this is to make use of the `git rebase` command,
to run your tests on each commit. Assuming your branch is based off to run your tests on each commit. Assuming your branch is based off
<code>origin/master</code>, you can run: <code>origin/master</code>, you can run:
</p>
<pre> <pre>
$ git rebase --interactive --exec "make check" origin/master $ git rebase --interactive --exec "make check" origin/master
</pre> </pre>
<p>
replacing <code>"make check"</code> with whatever other test you want to replacing <code>"make check"</code> with whatever other test you want to
run. run.
</p> </p>
<h2 id="mailing">Mailing Patches</h2> <h2 id="submit">Submitting Patches</h2>
<p> <p>
Patches should be sent to the mesa-dev mailing list for review: Patches may be submitted to the Mesa project by
<a href="#mailing">email</a> or with a
GitLab <a href="#merge-request">merge request</a>. To prevent
duplicate code review, only use one method to submit your changes.
</p>
<h3 id="mailing">Mailing Patches</h3>
<p>
Patches may be sent to the mesa-dev mailing list for review:
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev"> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">
mesa-dev@lists.freedesktop.org</a>. mesa-dev@lists.freedesktop.org</a>.
When submitting a patch make sure to use When submitting a patch make sure to use
@@ -201,8 +214,70 @@ disabled before sending your patches. (Note that you may need to contact
your email administrator for this.) your email administrator for this.)
</p> </p>
<h3 id="merge-request">GitLab Merge Requests</h3>
<p>
<a href="https://gitlab.freedesktop.org/mesa/mesa">GitLab</a> Merge
Requests (MR) can also be used to submit patches for Mesa.
</p>
<p>
If the MR may have interest for most of the Mesa community, you can
send an email to the mesa-dev email list including a link to the MR.
Don't send the patch to mesa-dev, just the MR link.
</p>
<p>
Add labels to your MR to help reviewers find it. For example:
<ul>
<li>Mesa changes affecting all drivers: mesa
<li>Hardware vendor specific code: amd, intel, nvidia, ...
<li>Driver specific code: anvil, freedreno, i965, iris, radeonsi,
radv, vc4, ...
<li>Other tag examples: gallium, util
</ul>
</p>
<p>
Tick the following when creating the MR. It allows developers to
rebase your work on top of master.
<pre>Allow commits from members who can merge to the target branch</pre>
</p>
<p>
If you revise your patches based on code review and push an update
to your branch, you should maintain a <strong>clean</strong> history
in your patches. There should not be "fixup" patches in the history.
The series should be buildable and functional after every commit
whenever you push the branch.
</p>
<p>
It is your responsibility to keep the MR alive and making progress,
as there are no guarantees that a Mesa dev will independently take
interest in it.
</p>
<p>
Some other notes:
<ul>
<li>Make changes and update your branch based on feedback
<li>Old, stale MR may be closed, but you can reopen it if you
still want to pursue the changes
<li>You should periodically check to see if your MR needs to be
rebased
<li>Make sure your MR is closed if your patches get pushed outside
of GitLab
<li>Please send MRs from a personal fork rather than from the main
Mesa repository, as it clutters it unnecessarily.
</ul>
</p>
<h2 id="reviewing">Reviewing Patches</h2> <h2 id="reviewing">Reviewing Patches</h2>
<p>
To participate in code review, you should monitor the
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">
mesa-dev</a> email list and the GitLab
Mesa <a href="https://gitlab.freedesktop.org/mesa/mesa/merge_requests">Merge
Requests</a> page.
</p>
<p> <p>
When you've reviewed a patch on the mailing list, please be unambiguous When you've reviewed a patch on the mailing list, please be unambiguous
about your review. That is, state either about your review. That is, state either
@@ -229,6 +304,29 @@ which tells the patch author that the patch can be committed, as long
as the issues are resolved first. as the issues are resolved first.
</p> </p>
<p>
These Reviewed-by, Acked-by, and Tested-by tags should also be amended
into commits in a MR before it is merged.
</p>
<p>
When providing a Reviewed-by, Acked-by, or Tested-by tag in a gitlab MR,
enclose the tag in backticks:
</p>
<pre>
`Reviewed-by: Joe Hacker &lt;jhacker@example.com&gt;`</pre>
<p>
This is the markdown format for literal, and will prevent gitlab from hiding
the &lt; and &gt; symbols.
</p>
<p>
Review by non-experts is encouraged. Understanding how someone else
goes about solving a problem is a great way to learn your way around
the project. The submitter is expected to evaluate whether they have
an appropriate amount of review feedback from people who also
understand the code before merging their patches.
</p>
<h2 id="nominations">Nominating a commit for a stable branch</h2> <h2 id="nominations">Nominating a commit for a stable branch</h2>
@@ -246,7 +344,14 @@ release.
Note: resending patch identical to one on mesa-dev@ or one that differs only Note: resending patch identical to one on mesa-dev@ or one that differs only
by the extra mesa-stable@ tag is <strong>not</strong> recommended. by the extra mesa-stable@ tag is <strong>not</strong> recommended.
</p> </p>
<p>
If you are not the author of the original patch, please Cc: them in your
nomination request.
</p>
<p>
The current patch status can be observed in the <a href="releasing.html#stagingbranch">staging branch</a>.
</p>
<h3 id="thetag">The stable tag</h3> <h3 id="thetag">The stable tag</h3>

@@ -17,7 +17,7 @@
<h1>Development Utilities</h1> <h1>Development Utilities</h1>
<dl> <dl>
<dt><a href="https://cgit.freedesktop.org/mesa/demos">Mesa demos collection</a></dt> <dt><a href="https://gitlab.freedesktop.org/mesa/demos">Mesa demos collection</a></dt>
<dd>includes several utility routines in the <code>src/util/</code> <dd>includes several utility routines in the <code>src/util/</code>
directory.</dd> directory.</dd>
@@ -31,7 +31,7 @@
<dd>is a very useful tool for tracking down <dd>is a very useful tool for tracking down
memory-related problems in your code.</dd> memory-related problems in your code.</dd>
<dt><a href="https://scan.coverity.com/projects/mesa">Coverity</a><dt> <dt><a href="https://scan.coverity.com/projects/mesa">Coverity</a></dt>
<dd>provides static code analysis of Mesa. If you create an account <dd>provides static code analysis of Mesa. If you create an account
you can see the results and try to fix outstanding issues.</dd> you can see the results and try to fix outstanding issues.</dd>
</dl> </dl>

@@ -18,8 +18,8 @@
<p> <p>
This page lists known issues with This page lists known issues with
<a href="https://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a> <a href="https://www.spec.org/gwpg/gpc.static/vp11info.html">SPEC Viewperf 11</a>
and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html" target="_main">SPEC Viewperf 12</a> and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html">SPEC Viewperf 12</a>
when running on Mesa-based drivers. when running on Mesa-based drivers.
</p> </p>
@@ -66,13 +66,10 @@ either in Viewperf or the Mesa driver.
<p> <p>
These tests use features of the These tests use features of the
<a href="https://www.opengl.org/registry/specs/NV/fragment_program2.txt" <a href="https://www.opengl.org/registry/specs/NV/fragment_program2.txt">GL_NV_fragment_program2</a>
target="_main"> and
GL_NV_fragment_program2</a> and <a href="https://www.opengl.org/registry/specs/NV/vertex_program3.txt">GL_NV_vertex_program3</a>
<a href="https://www.opengl.org/registry/specs/NV/vertex_program3.txt" extensions without checking if the driver supports them.
target="_main">
GL_NV_vertex_program3</a> extensions without checking if the driver supports
them.
</p> </p>
<p> <p>
When Mesa tries to compile the vertex/fragment programs it generates errors When Mesa tries to compile the vertex/fragment programs it generates errors
@@ -86,8 +83,8 @@ Subsequent drawing calls become no-ops and the rendering is incorrect.
<p> <p>
These tests depend on the These tests depend on the
<a href="https://www.opengl.org/registry/specs/NV/primitive_restart.txt" <a href="https://www.opengl.org/registry/specs/NV/primitive_restart.txt">GL_NV_primitive_restart</a>
target="_main">GL_NV_primitive_restart</a> extension. extension.
</p> </p>
<p> <p>
@@ -124,7 +121,7 @@ never specified.
<p> <p>
A trace captured with A trace captured with
<a href="https://github.com/apitrace/apitrace" target="_main">API trace</a> <a href="https://github.com/apitrace/apitrace">API trace</a>
shows this sequences of calls like this: shows this sequences of calls like this:
<pre> <pre>

@@ -43,6 +43,23 @@ This requires:
Otherwise, OpenGL 2.1 is supported. Otherwise, OpenGL 2.1 is supported.
</p> </p>
<p>
With the Fall 2018 Workstation 15 / Fusion 11 releases, additional
features are supported in the driver:
<ul>
<li>Multisample antialiasing (2x, 4x)
<li>GL_ARB/AMD_draw_buffers_blend
<li>GL_ARB_sample_shading
<li>GL_ARB_texture_cube_map_array
<li>GL_ARB_texture_gather
<li>GL_ARB_texture_query_lod
<li>GL_EXT/OES_draw_buffers_indexed
</ul>
<p>
This requires version 2.15.0 or later of the vmwgfx kernel module and
the VM must be configured for hardware version 16 or later.
</p>
<p> <p>
OpenGL 3.3 support can be disabled by setting the environment variable OpenGL 3.3 support can be disabled by setting the environment variable
SVGA_VGPU10=0. SVGA_VGPU10=0.
@@ -126,7 +143,7 @@ Begin by saving your current directory location:
<ul> <ul>
<li>Mesa/Gallium master branch. This code is used to build libGL, and the direct rendering svga driver for libGL, vmwgfx_dri.so, and the X acceleration library libxatracker.so.x.x.x. <li>Mesa/Gallium master branch. This code is used to build libGL, and the direct rendering svga driver for libGL, vmwgfx_dri.so, and the X acceleration library libxatracker.so.x.x.x.
<pre> <pre>
git clone git://anongit.freedesktop.org/git/mesa/mesa git clone https://gitlab.freedesktop.org/mesa/mesa.git
</pre> </pre>
<li>VMware Linux guest kernel module. Note that this repo contains the complete DRM and TTM code. The vmware-specific driver is really only the files prefixed with vmwgfx. <li>VMware Linux guest kernel module. Note that this repo contains the complete DRM and TTM code. The vmware-specific driver is really only the files prefixed with vmwgfx.
<pre> <pre>
@@ -136,7 +153,7 @@ Begin by saving your current directory location:
Most distros ship with this but it's safest to install a newer version. Most distros ship with this but it's safest to install a newer version.
To get the latest code from git: To get the latest code from git:
<pre> <pre>
git clone git://anongit.freedesktop.org/git/mesa/drm git clone https://gitlab.freedesktop.org/mesa/drm.git
</pre> </pre>
<li>xf86-video-vmware. The chainloading driver, vmware_drv.so, the legacy driver vmwlegacy_drv.so, and the vmwgfx driver vmwgfx_drv.so. <li>xf86-video-vmware. The chainloading driver, vmware_drv.so, the legacy driver vmwlegacy_drv.so, and the vmwgfx driver vmwgfx_drv.so.
<pre> <pre>

9690 include/CL/cl2.hpp Normal file (diff suppressed because it is too large)

Some files were not shown because too many files have changed in this diff.