Compare commits

..

3117 Commits

Author SHA1 Message Date
Juan A. Suarez Romero
b43b55d461 nir/spirv: return after emitting a branch in block
When emitting a branch in a block, it does not make sense to continue
processing further instructions, as they will not be reachable.

This fixes a nasty case with a loop with a branch that both then-part
and else-part exits the loop:

%1 = OpLabel
     OpLoopMerge %2 %3 None
     OpBranchConditional %false %2 %2
%3 = OpLabel
     OpBranch %1
%2 = OpLabel
    [...]

We know that block %1 will branch always to block %2, which is the merge
block for the loop. And thus a break is emitted. If we keep continuing
processing further instructions, we will be processing the branch
conditional and thus emitting the proper NIR conditional, which leads to
instructions after the break.

This fixes dEQP-VK.graphicsfuzz.continue-and-merge.

CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-28 09:47:06 +01:00
Eric Engestrom
0c3287e94d egl/android: replace magic 0=CbCr,1=CrCb with simple enum
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-28 07:44:46 +00:00
Caio Marcelo de Oliveira Filho
6a553bedcc st/nir: count num_uniforms for FS bultin shader
Usually the uniforms will be assigned locations and have their slots
counted automatically, but for builtin shaders the location assignment
is manual.  So count them too otherwise we get num_uniforms == 0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-27 22:18:24 -08:00
Ray Zhang
b344e32cdf glx: fix shared memory leak in X11
call XShmDetach to allow X server to free shared memory

Fixes: bcd80be49a "drisw/glx: use XShm if possible"
Signed-off-by: Ray Zhang <zhanglei002@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-02-28 14:23:02 +10:00
Timothy Arceri
e907337fad radeonsi/nir: move si_lower_nir() call into compiler thread
This helps improve compile times. For example the shader-db dolphin
shader shaders/dolphin/ubershaders/120.shader_test goes from
~1.69 -> ~1.57 seconds on my machine with this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-28 11:54:06 +11:00
Timothy Arceri
7536af670b glsl: fix shader cache for packed param list
Some types of params such as some builtins are always padded. We
need to keep track of this so we can restore the list correctly.

Here we also remove a couple of cache entries that are not actually
required as they get rebuilt by the _mesa_add_parameter() calls.

This patch fixes a bunch of arb_texture_multisample and
arb_sample_shading piglit tests for the radeonsi NIR backend.

Fixes: edded12376 ("mesa: rework ParameterList to allow packing")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-28 11:47:37 +11:00
Yevhenii Kolesnikov
07f4b4e403 i965: Fix allow_higher_compat_version workaround limited by OpenGL 3.0
Added check for higher compat profile being allowed
before assigning certain extensions.

Fixes: 272fe94942 (mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile)

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107052
2019-02-28 10:25:16 +11:00
Lionel Landwerlin
6e184147dd intel/compiler: use correct swizzle for replacement
The optimization in 4cd1a0be76 introduced a replacement of :

cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D
...
cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D

By :

cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D
...
mov(8) vgrf11.y:D, vgrf15.yyyy:D

The first cmp instruction is storing in x while the second mov is
sourcing from y. We need to take into account where the replacement on
the scan_inst destination is going to store thing so that the
replacement mov can source things from the correct location.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4cd1a0be76 ("i965/vec4: Propagate conditional modifiers from more compares to other compares")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-27 20:06:42 +00:00
Jonathan Marek
61e3188633 freedreno: catch failing fd_blit and fallback to software blit
Fixes cases where the fd_blit fails and never happens (ex: blit to etc1)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
e3591b0339 freedreno: use renderonly path for buffers allocated with modifiers
Now that freedreno has create_with_modifiers(), this "hack" is needed to
make some cases work. Copied from vc4.

Fixes: 41ddf1d1

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
6c0fefb448 freedreno: a2xx: fix mipmapping for NPOT textures
Fixes: 3a273a4a

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
4f23767590 freedreno: a2xx: fix fast clear for some gmem configurations
In freedreno_gmem.c, gmem_align of 0x8000 is used. Alignment used here
should be the same.

Fixes: 912a9c8d

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
8eca6df5ed freedreno: a2xx: add use_hw_binning function
Fixes: cb2322c7

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Jonathan Marek
357313ab0f freedreno: a2xx: don't write 4th vertex in mem2gmem
There is only room for 3 vertices now (RECT has 3 vertices).

Fixes: 6ef7700a

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-02-27 18:46:28 +00:00
Erik Faye-Lund
71a76a47cc swr/codegen: fix autotools build
When the output directory was changed, the BUILT_SOURCES and build-rule
target-path was no longer correct, leading to races to generate the
sources and compiling them.

Fix this by updating both sets of paths, so automake see what's going on
here.

Fixes: 773b3ceaca ("swr/rast: Fix autotools and scons codegen")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-02-27 17:59:06 +00:00
Timo Aaltonen
738626daca util/os_misc: Add check for PIPE_OS_HURD
Fix build on Hurd.

Signed-off-by: Timo Aaltonen <tjaalton@debian.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-27 14:56:48 +00:00
Lionel Landwerlin
2fff5966d6 vulkan/overlay: install layer binary in libdir
This will allow multilib.

v2: Drop path from json file, dlopen should be able to locate the lib in libdir

v3: Switch from configure_file to install_data (Dylan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109788
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-27 11:45:42 +00:00
Eric Engestrom
7763e664ce meson/swr: replace hard-coded path with current_build_dir()
Fixes: 93cd9905c8 "swr/rast: Cleanup and generalize gen_archrast"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Alok Hota <alok.hota@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-02-27 11:13:05 +00:00
Gert Wollny
b7201a468d nir: Add posibility to not lower to source mod 'abs' for ops with three sources
This is useful for r600 since there the abs source modifier is not supported
for ops with three sources

v2: Use correct logic to enable lowering to abs source mod (Eric Anhold)

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-27 11:04:06 +00:00
Gurchetan Singh
ce112fcc87 virgl/vtest: deprecate protocol version 1
This is a partial revert of 9d81cd ("virgl: Pass resource size and
transfer offsets").

The adjustments made in the client code means there's various
mismatches when transfering data.

Let's fallback to protocol version 0 and deprecate protocol
version 1.  We can still use the protocol version 1 slots for
a shared memory transfer mechanism later.

Fixes:
  dEQP-GLES31.functional.copy_image.mixed.viewclass_128_bits_mixed.*_renderbuffer

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-02-27 11:02:29 +00:00
Tapani Pälli
b9acfef337 util: fix a warning when building against clang7 headers
Header xmmintrin.h conditionally includes emmintrin.h that defines
_MM_DENORMALS_ZERO_MASK, add ifndef to fix this warning.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-27 08:57:41 +02:00
Tapani Pälli
d1af8115f8 iris: add libmesa_iris_gen8 library to the build
Patch fixes iris build on Android.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-27 08:57:41 +02:00
Tapani Pälli
5e52184f72 android: make libbacktrace optional on USE_LIBBACKTRACE
Otherwise with VNDK enabled we fail linking:
   src/gallium/targets/dri/Android.mk: error: gallium_dri (native:vendor)
   should not link to libbacktrace.vendor (native:vndk_private)

Option makes it possible to use libbacktrace only when VNDK is not
enabled.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-27 08:56:46 +02:00
Tapani Pälli
a3c366c4b2 android: add liblog to libmesa_intel_common build
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-27 08:53:09 +02:00
Alyssa Rosenzweig
b7a5b81d14 panfrost/midgard: Allow flt to run on most units
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-27 03:56:56 +00:00
Alyssa Rosenzweig
4c82abb9b6 panfrost: Expose perf counters in environment
Previously, we were guarded by an #ifdef, which is generally a bad form.
This patch instead guards them behind an environmental variable.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-27 03:56:38 +00:00
Alyssa Rosenzweig
60270c83b5 panfrost: Identify 4-bit channel texture formats
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-27 03:56:17 +00:00
Alyssa Rosenzweig
90fd82c540 panfrost: Add RGB565, RGB5A1 texture formats
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-27 03:55:19 +00:00
Jose Maria Casanova Crespo
4122665dd9 iris: Enable ARB_shader_draw_parameters support
Additional VERTEX_ELEMENT_STATE are used to store basevertex and
baseinstance and drawid updating the DWordLength of the
3DSTATE_VERTEX_ELEMENTS command.

This passes all piglit tests for spec.*draw_parameters.* tests
and VK-GL-CTS KHR-GL45.shader_draw_parameters_tests.* tests.

Now we only mark a dirty_update when parameters are changed or
when we have an indirect draw.

We enable PIPE_CAP_DRAW_PARAMETERS on Iris.

There is no edge flag support in the Vertex Elements setup.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-26 13:28:38 -08:00
Pierre Moreau
1c9fdcefd4 clover: Fix indentation issues
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
5285fff5f9 clover: Only use devices supporting IR_NATIVE
Currently clover will advertise any device that advertises
PIPE_CAP_COMPUTE, even if they do not support PIPE_SHADER_IR_NATIVE,
which is the IR used internally by clover.
This avoids clover advertising devices as available even though they
actually are not supported.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
8f9b4a2be6 clover: Move platform extensions definitions to clover/platform.cpp
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2019-02-26 21:02:07 +01:00
Pierre Moreau
b033620abf clover: Move device extensions definitions to core/device.cpp
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2019-02-26 21:02:07 +01:00
Pierre Moreau
d42f5896c5 clover: Validate program and library linking options
Program linking options are only valid if the library was created with
the `-enable-link-options` option, which itself is only valid when
creating a library, and only when creating an executable.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
fccc6ecb52 clover: Disallow creating libraries from other libraries
If creating a library, do not allow non-compiled object in it, as
executables are not allowed, and libraries would make it really hard to
enforce the "-enable-link-options" flag.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2019-02-26 21:02:07 +01:00
Pierre Moreau
bad161c894 clover/api: Fail if trying to build a non-executable binary
From the OpenCL 1.2 Specification, Section 5.6.2 (about clBuildProgram):

> If program is created with clCreateProgramWithBinary, then the
> program binary must be an executable binary (not a compiled binary or
> library).

Reviewed-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
25d4e65eb7 clover/api: Rework the validation of devices for building
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
505ec3a530 clover: Add an helper for checking if an IR is supported
Reviewed-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
67769c913f clover: Remove the TGSI backend as unused
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Pierre Moreau
669d00ba4c clover: Avoid warnings from new OpenCL headers
* Avoid warnings from references to deprecated CL 1.0, 1.2, 2.0 and 2.1 APIs.
* Avoid warnings from not defining CL_TARGET_OPENCL_VERSION.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-02-26 21:02:07 +01:00
Karol Herbst
ba8d21a8d3 clover: update ICD table to support everything up to 2.2
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2019-02-26 21:02:07 +01:00
Pierre Moreau
dddc5649bf include/CL: Update to the latest OpenCL 2.2 headers
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-02-26 21:02:07 +01:00
Marek Olšák
2ae07830e7 gallium/u_tests: use a compute-only context to test GCN compute ring 2019-02-26 14:58:55 -05:00
Marek Olšák
a1378639ab radeonsi: always use compute rings for clover on CI and newer (v2)
initialize all non-compute context functions to NULL.

v2: fix SI
2019-02-26 14:58:55 -05:00
Bas Nieuwenhuizen
c0110477b5 radv: Interpolate less aggressively.
Seems like dxvk used integer builtins without setting the flat
interpolation decoration.

I believe in the current spec the app is required to set these,
but in the meantime to avoid breaking things in stable releases
(and so close to release for 19.0), only expand the interpolation
to float16 and struct (which cannot be builtins as our spirv parser
lowers the builtin block).

Fixes: f324784104 "radv: Allow interpolation on non-float types."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-26 18:51:35 +00:00
Drew Davenport
1fd79b4b6d util: Don't block SIGSYS for new threads
SIGSYS is needed for programs using seccomp for sandboxing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-26 19:39:14 +01:00
Rob Clark
64206102fc freedreno/ir3: gsampler2DMSArray fixes
Array index should come before sample-id.  And exclude all isam variants
(which take integer texel coords) from adding of offset.

Fixes dEQP-GLES31.functional.texture.multisample.samples_1.use_texture_*_2d_array

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
a06bb486b0 freedreno/ir3/a6xx: fix atomic shader outputs
We also need to put in the output mov.  Possibly we could just fixup the
output register to read it directly from the dummy, but that is more
work and I guess dEQP is probably the only time you encounter this.

Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.const_literal_fragment

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
db1fa21374 freedreno/a6xx: vertex_id is not _zero_based
Fixes dEQP-GLES31.functional.draw_base_vertex.draw_elements_base_vertex.builtin_variable.vertex_id

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
79180a0566 freedreno/a6xx: fix DRAW_IDX_INDIRECT max_indicies
The indirect offset does not effect the index buffer size.  Fixes all of
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_100x100_drawcount_*
with drawcount > 1.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
cabe55a2e7 freedreno/ir3/a6xx: fix non-ssa atomic dst
We weren't propagating the array info for cases where result of atomic
is array/reg.  This can happen, for example, if result is part of a phi
web lowered to regs.

Fixes dEQP-GLES31.functional.ssbo.atomic.compswap.*

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
edd5b3126d freedreno/a6xx: fix ssbo alignment
Fixes a bunch of deqp ssbo tests that use multiple ssbo blocks packed
into a single buffer.

Note the a5xx value seems suspicious, but this is what blob seems to
advertise.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
cb884d8ab2 freedreno/ir3: use nopN encoding when possible
Use the (nopN) encoding for slightly denser shaders.. this lets us fold
nop instructions into the previous alu instruction in certain cases.

Shouldn't change the # of cycles a shader takes to execute, but reduces
the size.  (ex: glmark2 refract goes from 168 to 116 instructions)

Currently only enabled for a6xx, but I think we could enable this for
a5xx and possibly a4xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Rob Clark
04c2520d91 freedreno/a6xx: fix hangs with large shaders
We were overflowing instrlen (which is # of groups of 16 instructions)
in a couple dEQP tests, causing gpu hangs:

dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-26 13:19:44 -05:00
Brian Paul
6dabcb5bcf mesa: fix display list corner case assertion
This fixes a failed assertion in glDeleteLists() for the following
case:

list = glGenLists(1);
glDeleteLists(list, 1);

when those are the first display list commands issued by the
application.

When we generate display lists, we plug in empty lists created with
the make_list() helper.  This function uses the OPCODE_END_OF_LIST
opcode but does not call dlist_alloc() which would set the
InstSize[OPCODE_END_OF_LIST] element to non-zero.

When the empty list was deleted, we failed the InstSize[opcode] > 0
assertion.

Typically, display lists are created with glNewList/glEndList so we
set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc().  That's why
this bug wasn't found before.

To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST]
element in make_list().

The game oolite was hitting this.

Fixes: https://github.com/OoliteProject/oolite/issues/325
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-26 09:56:45 -07:00
Brian Paul
cb52d4482d svga: fix dma.pending > 0 test
The dma.pending field is boolean, so testing for > 0 isn't right.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2019-02-26 09:56:45 -07:00
Brian Paul
96ea977c79 svga: assorted whitespace and formatting fixes
Remove trailing whitespace, etc.

Trivial.
2019-02-26 09:56:45 -07:00
Brian Paul
a81eebf9bc st/mesa: whitespace/formatting fixes in st_cb_texture.c
Remove trailing whitespace, replace tabs w/ spaces, etc.

Trivial.
2019-02-26 09:56:45 -07:00
Eleni Maria Stea
fd37a19ac4 i965: fixed clamping in set_scissor_bits when the y is flipped
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
          https://bugs.freedesktop.org/show_bug.cgi?id=109594

v2:
   - I initially clamped the values inside the if (Y is flipped) case
   and I made a mistake in the calculation: the clamp of the bbox[2] should
   be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
   shouldn't have changed the ScissorRectangleYMax calculation. As the
   fixed code is equivalent with using CLAMP instead of MAX2 at the top of
   the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
   clear, I replaced it. (Nanley Chery)

v3:
   - Reversed the CLAMP change in bbox[3] as the API guarantees that the
   viewport height is positive. (Nanley Chery)

v4:
  - Added nomination for the mesa-stable branch and the link to the second
  bugzilla bug (Nanley Chery)

CC: <mesa-stable@lists.freedesktop.org>
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-26 08:23:26 -08:00
Eduardo Lima Mitev
0bf667984b freedreno/a6xx: Silence compiler warnings
util_format_compose_swizzles() expects 'const unsigned char' and we
are feeding it 'char'.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-02-26 14:15:33 +01:00
Kasireddy, Vivek
7cab8d3661 i965: Add support for sampling from XYUV images
Add support to the i965 DRI driver to sample from XYUV8888 buffers.

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 13:08:52 +00:00
Kasireddy, Vivek
65600d0946 dri: Add XYUV8888 format
In addition to adding this format to the dri_interface header,
add an entry in the android and wayland backends as well.

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 13:08:52 +00:00
Vivek Kasireddy
ff14d06be5 drm-uapi: Update headers from drm-next
Pull new updates from drm-next as of the following commit:

commit a5f2fafece141ef3509e686cea576366d55cabb6
Merge: 71f4e45a4ed3 860433ed2a55
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Feb 20 12:16:30 2019 +1000

    Merge https://gitlab.freedesktop.org/drm/msm into drm-next

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 13:08:51 +00:00
Kasireddy, Vivek
78fb3fd17e nir/lower_tex: Add support for XYUV lowering
The memory layout associated with this format would be:
Byte:      0 1 2 3
Component: V U Y X

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 13:08:51 +00:00
Lionel Landwerlin
913d711e0f imgui: update memory editor
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-26 12:49:07 +00:00
Lionel Landwerlin
ab9ae080ec imgui: update commit
In commit 3950e7c11e ("imgui: bump copy") I forgot to update the
README about what copy of imgui we carry.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-26 12:49:04 +00:00
Eric Engestrom
a213b927f2 driinfo: add DTD to allow the xml to be validated
This DTD can be used to validate the output and make sure any parsers
out there can handle it:
$ xmllint --noout --valid driinfo.xml

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-26 12:48:28 +00:00
Lionel Landwerlin
9646750822 vulkan/overlay: fix includes
The Loader/Validation-Layers repository allow the user to choose where
header files are installed. On my system I choose /usr/include
thinking it was the obvious "base" location, but it turns out the
headers end up being installed right there rather in a vulkan
subdirectory. On Debian/Ubuntu the selected installation path is
/usr/include/vulkan, so just go with that.

Hopefully other distro don't choose another path.

Note that the validation layer doesn't provide a .pc file so we have
no way of querying where the headers are installed.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109739
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 12:29:54 +00:00
Lionel Landwerlin
47ef52d333 vulkan/overlay: fix missing installation of layer
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109739
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 12:29:46 +00:00
Eric Engestrom
318e550549 dri_interface: add missing #include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-26 12:03:20 +00:00
Eric Engestrom
7f5d9c2757 gitlab-ci: always run the containers build
If the first time a fork was created, the job creating the containers was
manually cancelled, this would have left the fork unable to use the CI
(until the next automatic regeneration of the container).

Avoid this by always running the container-generation job, even though
99% of the time it will spin up, see that the container exists and shut
down.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2019-02-26 12:02:14 +00:00
Emil Velikov
40a82e6463 docs: mention "Allow commits from members who can merge..."
Mention the tick-box otherwise only the MR author can rebase the series.

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reivewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-26 11:27:10 +00:00
Emil Velikov
d9d1cb43d7 egl/android: bump the number of drmDevices to 64
It's the current maximum supported by the kernel. Stay consistent with
the rest of Mesa and use the same number.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-26 11:07:23 +00:00
Emil Velikov
02344fe80b loader: use loader_open_device() to handle O_CLOEXEC
Some platforms lack O_CLOEXEC. The loader_open_device() handles those
appropriately, so use the helper.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 11:07:23 +00:00
Emil Velikov
f0a7b463b5 meson: egl: correctly manage loader/xmlconfig
Earlier commit introduced support for haiku yet did not properly
annotate the loader/xmlconfig dependencies.

Thus we ended up adding inc_loader for each !haiku platform - see
659910eda0 9a96bf0ecd c731508b98 ec6cb01e21.

One piece remained though - the wayland platform. Hence the following
would fail:

 meson -Dgallium-drivers=etnaviv -Ddri-drivers=''\
       -Dtools=etnaviv -Dplatforms=wayland -Dglx=disabled \
       build/

Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Reported-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 834d221512 ("meson: Add Haiku platform support v4")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-02-26 11:07:23 +00:00
Emil Velikov
9d84a922b8 egl/dri: de-duplicate dri2_load_driver*
The difference between the three functions is the list of mandatory
driver extensions. Pass that as an argument to the common helper.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-26 11:07:23 +00:00
Samuel Pitoiset
4924dfc851 radv: don't copy buffer descriptors list for samplers
Sampler descriptors don't have a buffer list.

This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.sampler_*.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-26 11:22:28 +01:00
Samuel Pitoiset
9256e0a09d radv: fix out-of-bounds access when copying descriptors BO list
We shouldn't increment the buffer list pointers twice.

This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-26 11:22:22 +01:00
Tapani Pälli
1d5e5ec30a nir: use nir_variable_create instead of open-coding the logic
Fixes: 3d7611e9 "st/nir: use NIR for asm programs"
Reported-by: Matthias Lorenz <oschowa@web.de>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-26 09:00:36 +02:00
Tapani Pälli
22267feff1 nir: initialize value in copy_prop_vars_block
Fixes following valgrind warning:

   ==27561== Conditional jump or move depends on uninitialised value(s)
   ==27561==    at 0x667856B: value_set_ssa_components (nir_opt_copy_prop_vars.c:78)
   ==27561==    by 0x667A1C4: copy_prop_vars_block (nir_opt_copy_prop_vars.c:797)

Fixes: 62332d139c "nir: Add a local variable-based copy propagation pass"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-26 08:56:25 +02:00
Eric Anholt
97566efe5c v3d: Rematerialize MOVs of uniforms instead of spilling them.
If we have a MOV of a uniform value available to spill, that's one of our
best choices.  We can just not spill the value, and emit a new load of the
uniform as the fill.  This saves bothering the TMU and the thrsw, and is
the same cost in uniforms (since the spill offset is a uniform anyway).

This doesn't have a huge impact on shader-db, since there aren't a whole
lot of spills and we usually copy-prop the uniforms at the VIR level such
that the only uniform MOVs are from vir_lower_uniforms:

total instructions in shared programs: 6430292 -> 6430279 (<.01%)
total uniforms in shared programs: 2386023 -> 2385787 (<.01%)
total spills in shared programs: 4961 -> 4960 (-0.02%)
total fills in shared programs: 6352 -> 6350 (-0.03%)

However, I'm interested in dropping the uniforms copy-prop in the backend,
since it would be cheaper to not load repeated uniforms if we have the
registers to spare.  This also saves many spills on
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20, which is what
motivated a bunch of my recent backend work in the first place:

before: 46 spills, 106 fills, 3062 instructions
after: 0 spills, 0 fills, 2611 instructions
2019-02-25 21:33:47 -08:00
Eric Anholt
e0fada983d v3d: Dump the VIR after register spilling if we were forced to.
Spilling is unusual, but one often has to debug it when it happens, so
dump it.
2019-02-25 21:26:24 -08:00
Eric Anholt
2786d2161a v3d: Fix vir_is_raw_mov() for input unpacks.
There are no users at the moment, but I wanted to start using this in
register spilling.
2019-02-25 21:26:24 -08:00
Mathias Fröhlich
1ab2159249 st/mesa: Reduce array updates due to current changes.
Since using bitmasks we can easily check if we have any
current value that is potentially uploaded on array setup.
So check for any potential vertex program input that is not
already a vao enabled array. Only flag array update if there is
a potential overlap.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-02-26 05:42:04 +01:00
Dylan Baker
6f42303646 meson/iris: Use current coding style
Just a few minor style things.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-25 23:37:27 +00:00
Timothy Arceri
603206d0a6 radeonsi: fix query buffer allocation
Fix the logic for buffer full check on alloc.

This patch just takes the fix Nicolai attached to the bug report
and updates it to work on master.

Fixes: e0f0d3675d ("radeonsi: factor si_query_buffer logic out of si_query_hw")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561
2019-02-26 09:55:41 +11:00
Eric Anholt
7c1bf075f3 nir: Just return when asked to rewrite uses of an SSA def to itself.
The nir_builder swizzling improvement to not emit extra MOVs resulted in
nir_lower_tex() trying to rewrite an SSA def to itself, triggering the
assert on all texturing in v3d.  There's no work to be done in this case,
so just stop asserting.

Fixes: 743700be1f ("nir/builder: Don't emit no-op swizzles")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-25 21:25:24 +00:00
Samuel Pitoiset
5671f38085 radv: fix clearing attachments in secondary command buffers
If no framebuffer is bound, get the number of samples and the
image format from the render pass.

This fixes new CTS dEQP-VK.geometry.layered.*.secondary_cmd_buffer.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-25 21:42:50 +01:00
Alok Hota
773b3ceaca swr/rast: Fix autotools and scons codegen
Use new input flags for gen_archrast.py

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:39 -06:00
Alok Hota
16e10b8c30 swr/rast: Add general SWTag statistics
Update Archrast parser to use stats, used with an internal tool

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:36 -06:00
Alok Hota
b45a15a39f swr/rast: Add string handling to AR event framework
For use by an internal tool

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:31 -06:00
Alok Hota
8608a747aa swr/rast: Add initial SWTag proto definitions
Update gen_archrast.py to properly generate event IDs

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:17 -06:00
Alok Hota
93cd9905c8 swr/rast: Cleanup and generalize gen_archrast
Update meson.build to accomodate

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-25 13:05:07 -06:00
Daniel Schürmann
0bd45f96b9 nir: Use SM5 properties to optimize shift(a@32, iand(31, b))
This is a common pattern from HLSL->SPIRV translation
and supported in HW by all current NIR backends.

vkpipeline-db results anv (SKL):

    total instructions in shared programs: 6403130 -> 6402380 (-0.01%)
    instructions in affected programs: 204084 -> 203334 (-0.37%)
    helped: 208
    HURT: 0

    total cycles in shared programs: 1915629582 -> 1918198408 (0.13%)
    cycles in affected programs: 1158892682 -> 1161461508 (0.22%)
    helped: 107
    HURT: 86

shader-db results on i965 (KBL):

    total instructions in shared programs: 15284592 -> 15284568 (<.01%)
    instructions in affected programs: 81683 -> 81659 (-0.03%)
    helped: 24
    HURT: 0

    total cycles in shared programs: 375013622 -> 375013932 (<.01%)
    cycles in affected programs: 40169618 -> 40169928 (<.01%)
    helped: 13
    HURT: 9

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-25 12:59:44 -06:00
Daniel Schürmann
0525bdc225 nir: Define shifts according to SM5 specification.
SPIR-V shifts are undefined for values >= bitsize, but SM5 shifts
are defined to only use the least significant bits.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-25 12:59:43 -06:00
Jason Ekstrand
c4fb6b0c81 intel/eu: Add an EOT parameter to send_indirect_[split]_message
For split indirect sends we have to put the EOT parameter in the
extended descriptor as well as the instruction itself so just calling
brw_inst_set_eot is insufficient.  Moving the EOT handling handling into
the send_indirect_[split]_message helper lets us handle it properly.
2019-02-25 11:35:12 -06:00
Sergii Romantsov
dcc4866419 d3d: meson: do not prefix user provided d3d-drivers-path
The user can select the location where there d3d drivers
are installed by the d3d-drivers-path meson option.

By default path will be $prefix/$libdir/d3d.

Currently we add $prefix to the user provided path.
Resulting in an incorrect or even missing path.

Based on logic of
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-25 16:07:02 +00:00
Sergii Romantsov
f6556ec7d1 dri: meson: do not prefix user provided dri-drivers-path
The user can select the location where there dri drivers
are installed by the dri-drivers-path meson option.

By default path will be $prefix/$libdir/dri.

Currently we add $prefix to the user provided path.
Resulting in an incorrect or even missing path.

v2: fixed dri_search_path by default, rebased to master

v3: new commit-message (Emil Velikov), cc mesa-stable

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698
CC: Rafael Antognolli <rafael.antognolli@intel.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Fixes: 306914db92 (meson: Add dridriverdir variable to dri.pc.)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-25 16:07:02 +00:00
Lionel Landwerlin
30828f4646 intel/aub_viewer: silence more compiler warnings
format not a string literal and no format arguments.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-25 13:11:16 +00:00
Lionel Landwerlin
91df8b1780 intel/aub_viewer: silence compiler warning
buffer_addr may be used uninitialized.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-25 13:11:13 +00:00
Lionel Landwerlin
f1da10e0c5 intel/aub_viewer: printout 48bits addresses
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-25 13:11:05 +00:00
Gert Wollny
875942c059 mesa/core: Enable EXT_depth_clamp for GLES >= 2.0
The extension NV_depth_clamp is written against OpenGL 1.2.1, and
since GLES 2.0 is based on GL 2.0 there is no reason not to enable
this extension also for GLES >= 2.0.

v2: Use EXT_depth_clamp that has been proposed to Khronos

v3: - Fix check for extension availability (Erik Faya-Lund)
    - Also fix the test in is_enabled
v4: - Test both, ARB and EXT extension (Erik)
v5: - Fix white space errors (Erik)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-25 09:44:27 +00:00
Kenneth Graunke
b45186a6cd iris: Properly allow rendering to RGBX formats.
I was converting them at pipe_surface creation time, but not when
answering queries about whether formats support rendering.  This caused
a lot of FBO incomplete errors for formats that ought to be supported.

Fixes "Child of Light", which uses PIPE_FORMAT_R8G8B8X8_UNORM_SRGB.

Also fixes Witcher 1 using wined3d (GL) according to Timur Kristóf.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109738
2019-02-25 01:11:27 -08:00
Kenneth Graunke
fce089c8a2 iris: Drop RGBX -> RGBA for storage image usages
GLSL doesn't expose RGB/RGBX image formats, so this isn't needed.
2019-02-25 00:57:50 -08:00
Kenneth Graunke
6921588d54 mesa: Fix RGBBuffers for renderbuffers with sized internal formats
For texture attachments, 'f' is texImg->_BaseFormat, but for
renderbuffer attachments, 'f' is att->Renderbuffer->InternalFormat.

InternalFormat may be something like GL_RGB8, which causes our
(f == GL_RGB) check to fail.  Switch to using a proper _BaseFormat,
which drops the size.

Fixes dEQP-GLES31.functional.draw_buffers_indexed.random.
max_required_draw_buffers.15 on iris when combined with a driver fix.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
2019-02-25 00:57:42 -08:00
Oscar Blumberg
da9c030763 glsl: Fix function return typechecking
apply_implicit_conversion only converts and check base types but we
need actual type equality for function returns, otherwise you can
return a vec2 from a function declared as returning a float.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-25 08:49:06 +02:00
Jordan Justen
bd0ad651e0 iris: Always use in-tree i915_drm.h
Ref: f1374805a8 "drm-uapi: use local files, not system libdrm"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-24 21:06:40 -08:00
Alyssa Rosenzweig
f943047e48 panfrost: Decode render target swizzle/channels
On MRT-capable systems, the framebuffer format is encoded as a 64-bit
word in the render target descriptor. Previously, the two 32-bit
words were exposed as opaque hex values. This commit identifies a 12-bit
Mali swizzle and a 2-bit channel counter, removing some of the magic. It
also adds decoding support for the AFBC and MSAA enable bits, which were
already known but otherwise ignored in pandecode.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 04:49:50 +00:00
Alyssa Rosenzweig
c6be9969d2 panfrost/midgard: Add fround(_even), ftrunc, ffma
These ops were discovered by invoking the correspondingly names GLSL
functions. The rounding ops here behave exact as expected and are mapped
to their corresponding NIR ops where applicable. The ffma behaves as a
LUT instruction and requires some special argument packing (since
Midgard normally only allows for 2 arguments); this quirk will be
addressed in the future, but for now FMA is still lowered.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:36:26 +00:00
Alyssa Rosenzweig
4a4726af3c panfrost/nondrm: Split out dump_counters
Previously, this function was implied a part of the job submit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:34:16 +00:00
Alyssa Rosenzweig
cdca103d43 panfrost/nondrm: Make COHERENT_LOCAL explicit
This flag corresponds to what was MEM_COHERENT_LOCAL in the vendor
driver, which seems to influence the cache policy, necessary for the
varying temporary storage but nothing else.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:32:45 +00:00
Alyssa Rosenzweig
f44d4653a9 panfrost/nondrm: Flag CPU-invisible regions
Potentially, the kernel could optimize these allocations, or perhaps we
can save on mapping costs.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:31:09 +00:00
Alyssa Rosenzweig
10cc251842 panfrost/meson: Remove subdir for nondrm
This change fixes cross builds with the (temporary) non-DRM overlay.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:27:26 +00:00
Alyssa Rosenzweig
77fea552f6 panfrost: Use tiler fast path (performance boost)
For reasons that are still unclear (speculation included in the comment
added in this patch), the tiler? metadata has a fast path that we were
not enabling; there looks to be a possible time/memory tradeoff, but the
details remain unclear.

Regardless, this patch improves performance dramatically. Particular
wins are for geometry-heavy scenes. For instance, glmark2-es2's
Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped
from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious
when vsync is disabled, as in glmark2-es2-wayland.

With this patch, on GLES 2.0 samples not involving FBOs, it appears
performance is converging with (and sometimes surpassing) the blob.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-25 02:25:50 +00:00
Jason Ekstrand
743700be1f nir/builder: Don't emit no-op swizzles
The nir_swizzle helper is used some on it's own but it's also called by
nir_channel and nir_channels which are used everywhere.  It's pretty
quick to check while we're walking the swizzle anyway whether or not
it's an identity swizzle.  If it is, we now don't bother emitting the
instruction.  Sure, copy-prop will clean it up for us but there's no
sense making more work for the optimizer than we have to.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-24 20:01:27 -06:00
Jason Ekstrand
724371c6b9 nir/split_vars: Don't compact vectors unnecessarily
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-24 20:01:18 -06:00
Erik Faye-Lund
7a6a5d4bfa st/mesa: remove unused header-file
This header has been unused since f8f2520e88 ("st/mesa: Remove
unnecessary headers"). And in the more than 8 years since, this
hasn't been useful. So let's just get rid of it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-24 20:53:37 +01:00
Maya Rashish
021c496135 configure: fix test portability
From the bash manual:

string1 == string2
string1 = string2
       True if the strings are equal.  = should be used with the test
       command for POSIX conformance.
2019-02-24 19:26:15 +00:00
David Shao
6fa923a65d meson: ensure that xmlpool_options.h is generated for gallium targets that need it
Fixes: 68076b8747 "meson: build gallium vdpau state tracker"
Fixes: 22a817af8a "meson: build gallium xvmc state tracker"
Fixes: 5a785d51a6 "meson: build gallium va state tracker"
Fixes: 0ba909f0f1 "meson: build gallium xa state tracker"
Fixes: 1d36dc674d "meson: build gallium omx state tracker"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-24 09:00:39 +00:00
Matthias Lorenz
f91654120b vulkan/overlay: Add fps counter
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109747
2019-02-24 01:07:26 +00:00
Lionel Landwerlin
239b0d8570 Revert "anv: add support for INTEL_DEBUG=bat"
This reverts commit e4d88396d2.

Apologies, I pushed the wrong commit.
2019-02-24 01:06:39 +00:00
Lionel Landwerlin
e4d88396d2 anv: add support for INTEL_DEBUG=bat
As requested by Ken ;)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-23 23:29:04 +00:00
Christian Gmeiner
c56e734496 etnaviv: blt: mark used src resource as read from
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-02-23 16:00:50 +01:00
Christian Gmeiner
7244e76804 etnaviv: rs: mark used src resource as read from
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-02-23 16:00:25 +01:00
Vinson Lee
2bd08b8b9d gallium/auxiliary/vl: Fix duplicate symbol build errors.
CXXLD    gallium_dri.la
duplicate symbol _compute_shader_video_buffer in:
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
duplicate symbol _compute_shader_weave in:
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
duplicate symbol _compute_shader_rgba in:
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
    ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)

Fixes: 9364d66cb7 ("gallium/auxiliary/vl: Add video compositor compute shader render")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
2019-02-22 23:07:26 -08:00
Caio Marcelo de Oliveira Filho
4c160b6bd8 nir: fix MSVC build
Zero initialize struct with {0} instead of {}.
2019-02-22 22:38:05 -08:00
Caio Marcelo de Oliveira Filho
eb13211997 nir/copy_prop_vars: add tests for load/store elements of vectors
Test using array deref on vectors in loads and stores.  These are
marked DISABLED_ as this optimization is currently not done.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
4f3809d389 nir: nir_build_deref_follower accept array derefs of vectors
Code itself already supports it, just make sure we can use it for
those cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
c4beadd28e nir/copy_prop_vars: change test helper to get intrinsics
Replace find_next_intrinsic(intrinsic, after) with
get_intrinsic(intrinsic, index).  This makes slightly more convenient
to check the resulting loads/stores/copies, since in most tests we
know which one we care about.  The cost is to perform more traversals,
but for such tests this is not a problem.

Added the ASSERT_EQ() on count to some tests missing it, so the
indices queried are always expected to find something.

Also, drop two nir_print_shader leftover calls in a test.

v2: Remove redundant assertions.  nir_src_comp_as_uint already
    assert what we need.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
fdcb9779d9 nir/copy_prop_vars: keep track of components in copy_entry
When a copy_entry is SSA, store not only the nir_ssa_def* for each
component, but also the source component they come from.  At the
moment this is always a match (i.e. 'component[i] == i'), because all
the operations for a copy_entry happen using definitions with the same
size.  This prepares the code for array_derefs of vectors, in which
'component[i] != i'.

Also, extract setting all SSA components into a function of its own.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
6624decbb5 nir/copy_prop_vars: add debug helpers
Disabled by default, to be used during development.  Adding those
so I don't rewrite some ad-hoc version of them everytime I'm working
with this pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho
60d9bb9ff5 nir/copy_prop_vars: don't get confused by array_deref of vectors
For now these derefs are not handled, so don't let these get into the
copies list -- which would cause wrong propagations.  For load_derefs,
do nothing.  For store_derefs, invalidate whatever the store is
writing to.  For copy_derefs, invalidate whatever the copy is writing
to.

These cases will happen once derefs to SSBOs/UBOs are kept around long
enough to get optimized by copy_prop_vars.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 21:00:50 -08:00
Timothy Arceri
f48527e51a nir: allow nir_lower_phis_to_scalar() on more src types
Rather than only lowering if all srcs are scalarizable we instead
check that at least one src is scalarizable.

We change undef type to return false otherwise it will cause
regressions when it is the only scalarizable src.

total instructions in shared programs: 13219105 -> 13024547 (-1.47%)
instructions in affected programs: 1153797 -> 959239 (-16.86%)
helped: 581
HURT: 74

total cycles in shared programs: 333968972 -> 324807922 (-2.74%)
cycles in affected programs: 129809402 -> 120648352 (-7.06%)
helped: 571
HURT: 131

total spills in shared programs: 57947 -> 29130 (-49.73%)
spills in affected programs: 53364 -> 24547 (-54.00%)
helped: 351
HURT: 0

total fills in shared programs: 51310 -> 25468 (-50.36%)
fills in affected programs: 44882 -> 19040 (-57.58%)
helped: 351
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-23 11:11:51 +11:00
Alok Hota
6053499f2e swr/rast: bypass size limit for non-sampled textures
This fixes a bug where SWR will fail to render in cases with large
buffer allocations, e.g. very large meshes whose vertex buffers exceed
2GB

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-22 23:35:11 +00:00
Marek Olšák
b326a15eda tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics
This might have decreased performance for radeonsi/tgsi, because most
most shaders claimed they used bindless.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-22 18:00:54 -05:00
Jordan Justen
cf652205cf iris: Add gitlab-ci build testing
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-22 14:08:21 -08:00
Rob Clark
fd360c82f0 freedreno/a6xx: cube image fix
Note that emit_intrinsic_load_image() already swaps a .3d flag with an
.a flag.  I tried doing things the other way around (going back to .3d)
but that didn't work.  And treating cube images as 2d array is also what
blob does, so let's just go with that.

Fixes dEQP-GLES31.functional.image_load_store.cube.load_store.*

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-22 14:05:32 -05:00
Rob Clark
f90c3b4485 freedreno/a6xx: fix border-color offset
Fixes nearly all of dEQP-GLES31.functional.texture.border_clamp.* when
run after a test that binds textures used in vertex shader.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-22 14:05:32 -05:00
Rob Clark
bdedb8277a freedreno/ir3: don't hardcode wrmask
Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.const_literal.vertex.samplercubeshadow
and few other similar tests that do multiple texture fetches into
individual components of a packet output.  Mostly works around the
issue mentioned in ra_block_find_definers().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-22 14:05:32 -05:00
Rob Clark
5d4fa194b8 freedreno: fix race condition
rsc->write_batch can be cleared behind our back, so we need to acquire
the lock *before* deref'ing.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-22 14:05:32 -05:00
Kenneth Graunke
3090c6b9e9 vulkan: Fix 32-bit build for the new overlay layer
vulkan_core.h defines non-dispatchable handles as (struct object *)
on 64-bit systems, but uint64_t on 32-bit systems.  The former can be
implicitly cast to void *, but the latter requires an explicit cast.

While here, %lu is the wrong format specifier for uint64_t on 32-bit
systems, so use PRIu64, fixing a warning.

Reported-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-22 08:56:54 -08:00
Juan A. Suarez Romero
4f917e6a61 anv: advertise 8 subpixel precision bits
On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is
used to select between 8 bit subpixel precision (value 0) or 4 bit
subpixel precision (value 1). As this value is not set, means it is
taking the value 0, so 8 bit are used.

On the other side, in the Vulkan CTS tests, if the reference rasterizer,
which uses 8 bit precision, as it is used to check what should be the
expected value for the tests, is changed to use 4 bit as ANV was
advertising so far, some of the tests will fail.

So it seems ANV is actually using 8 bits.

v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason)

v3: use _8Bit definition as value (Jason)

v4: (by Jason)
anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect

This field was added on gen8 even though there's an identically defined
one in 3DSTATE_SF.

CC: Jason Ekstrand <jason@jlekstrand.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 17:53:55 +01:00
Juan A. Suarez Romero
3b423eeb2d genxml: add missing field values for 3DSTATE_SF
Fill out "Vertex Sub Pixel Precision Select" possible values.

CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 17:53:45 +01:00
Bas Nieuwenhuizen
f324784104 radv: Allow interpolation on non-float types.
In particular structs containing floats and 16-bit floating point
types.

Fixes: 62024fa775 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
Fixes: da29594636 "spirv: Only split blocks"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109735
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-22 17:06:55 +01:00
Bas Nieuwenhuizen
a1fdd4a4a7 radv: Fix float16 interpolation set up.
float16 types can have non-flat interpolation so set up the HW
correctly for that.

Fixes: 62024fa775 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-22 17:06:55 +01:00
Ilia Mirkin
ae2cb72804 nv50: disable compute
It causes more trouble than it's worth. Now vl tries to create compute
shaders without all the proper checking. Since there's really no
(current) way to use compute on nv50, just mark it disabled.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109742
Fixes: f6ac0b5d71 ("gallium/auxiliary/vl: Add compute shader to support video compositor render")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-22 09:42:41 -05:00
Lionel Landwerlin
1d626fc028 intel: fix urb size for CFL GT1
Same 192Kb amount as SKL/KBL GT1 applies.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: de7ed0ba55 ("i965/CFL: Add PCI Ids for Coffee Lake.")
2019-02-22 11:53:49 +00:00
Samuel Iglesias Gonsálvez
bd2c5a8203 isl: the display engine requires 64B alignment for linear surfaces
v2: Add PRM quote (Lionel)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-22 11:45:45 +00:00
Gert Wollny
2ee197d6e8 virgl: Enable mixed color FBO attachemnets only when the host supports
it

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2019-02-22 10:44:08 +01:00
Mauro Rossi
338dacc341 android: intel/isl: remove redundant building rules
Fixes the following building error:

including ./external/mesa/Android.mk ...
build/core/base_rules.mk:183: *** external/mesa/src/intel:
MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by external/mesa/src/intel.
make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1

ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c
and that source file includes isl_tiled_memcpy.c source

Fixes: 96bb328 ("iris: add Android build")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-22 07:56:11 +02:00
Kenneth Graunke
b21de090d6 Revert "iris: Enable auxiliary buffer support"
This reverts commit cd0ced49e7.

It breaks glxgears rendering.
2019-02-21 15:50:46 -08:00
Kenneth Graunke
e2cb0c5e0e iris: Enable -msse2 and -mstackrealign
This is needed for gen_clflush.h intrinsics to work on 32-bit builds.
i965 and anv both set these, and iris needs to as well.

Tested-by: Mark Janes <mark.a.janes@intel.com>
2019-02-21 14:51:15 -08:00
Francisco Jerez
7272fe9c08 intel/fs: Rely on undocumented unrestricted regioning for 32x16-bit integer multiply.
Even though the hardware spec claims that any "integer DWord multiply"
operation is affected by the regioning restrictions of CHV/BXT/GLK,
this is inconsistent with the behavior of the simulator and with
empirical evidence -- Return false from has_dst_aligned_region_restriction()
for such instructions as a micro-optimization.

Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Francisco Jerez
e03be78252 intel/fs: Implement extended strides greater than 4 for IR source regions.
Strides up to 32B can be implemented for the source regions of most
instructions by leveraging either the vertical or the horizontal
stride of the hardware Align1 region.  The main motivation for this is
that currently the lower_integer_multiplication() pass will happily
double the stride of one of the 32-bit sources, which can blow up if
the stride of the original source was already the maximum value
allowed by the hardware.

An alternative would be to use the regioning legalization pass in
order to lower such strides into the composition of multiple legal
strides, but that would be somewhat less efficient.

This showed up as a regression from my commit cbea91eb57
in Vulkan 1.1 CTS tests on CHV/BXT platforms, however it was really a
pre-existing problem that had affected conformance on other platforms
without native support for integer multiplication.  CHV/BXT were
getting around it because the code I removed in that commit had the
"fortunate" side effect of emitting narrower regions that didn't hit
the hardware stride limit after lowering.  Beyond fixing the
regression this fixes ~90 additional Vulkan 1.1 subgroup CTS tests on
ICL (that's why this patch is marked for inclusion in mesa-stable even
though the original regressing patch was not).

According to Jason, a nearly equivalent change had been committed
previously as e8c9e65185 and then (mistakenly?) reverted as
a31d038208.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328
Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Francisco Jerez
7f9f6263c1 intel/fs: Cap dst-aligned region stride to maximum representable hstride value.
This is required in combination with the following commit, because
otherwise if a source region with an extended 8+ stride is present in
the instruction (which we're about to declare legal) we'll end up
emitting code that attempts to write to such a region, even though
strides greater than four are still illegal for the destination.

Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Francisco Jerez
e2f475ddff intel/fs: Lower integer multiply correctly when destination stride equals 4.
Because the "low" temporary needs to be accessed with word type and
twice the original stride, attempting to preserve the alignment of the
original destination can potentially lead to instructions with illegal
destination stride greater than four.  Because the CHV/BXT alignment
restrictions are now being enforced by the regioning lowering pass run
after lower_integer_multiplication(), there is no real need to
preserve the original strides anymore.

Note that this bug can be reproduced on stable branches, but
back-porting would be non-trivial, because the fix relies on the
regioning lowering pass recently introduced.

Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Francisco Jerez
c3c27762f7 intel/fs: Exclude control sources from execution type and region alignment calculations.
Currently the execution type calculation will return a bogus value in
cases like:

  mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u

Which will be considered to have a 32-bit integer execution type even
though the actual indirect move operation will be carried out with
16-bit precision.

Similarly there's no need to apply the CHV/BXT double-precision region
alignment restrictions to such control sources, since they aren't
directly involved in the double-precision arithmetic operations
emitted by these virtual instructions.  Applying the CHV/BXT
restrictions to control sources was expected to be harmless if mildly
inefficient, but unfortunately it exposed problems at codegen level
for virtual instructions (namely the SHUFFLE instruction used for the
Vulkan 1.1 subgroup feature) that weren't prepared to accept control
sources with an arbitrary strided region.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328
Reported-by: Mark Janes <mark.a.janes@intel.com>
Fixes: efa4e4bc5f "intel/fs: Introduce regioning lowering pass."
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 14:07:25 -08:00
Timothy Arceri
d9e08e753b nir: clone instruction set rather than removing individual entries
This reduces the time spent in nir_opt_cse() by almost a half.

The massif tool from callgrind reported no change in peak
memory use with the large doliphin uber shaders I used for
testing.

Reviewed-by: Thomas Helland<thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 08:36:36 +11:00
Jordan Justen
cd0ac3a6af genxml: Remove extra space in gen4/45/5 field name
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 13:17:10 -08:00
Jordan Justen
a9b0b72a78 genxml/gen_bits_header.py: Use regex to strip no alphanum chars
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 13:15:59 -08:00
Kenneth Graunke
cd0ced49e7 iris: Enable auxiliary buffer support
This currently regresses KHR-GL4x.compute_shader.resource-texture,
but that's a pre-existing bug (https://bugs.freedesktop.org/109113)
which should be fixed up once we have fast clear support.
2019-02-21 10:26:12 -08:00
Rafael Antognolli
db81445837 iris: Flag ALL_DIRTY_BINDINGS on aux state change.
If we change the aux state for a given resource, we need to re-emit the
binding table pointers for any stage that has such resource bound. Since
we don't track that, flag IRIS_ALL_DIRTY_BINDINGS and emit all of them.
2019-02-21 10:26:12 -08:00
Rafael Antognolli
95589652a1 iris: Skip resolve if there's no context.
If iris_resource_get_handle() gets called without a context, we can't
resolve the resource. Hopefully it shouldn't be compressed anyway, so
let's just add an assert to ensure it's correct.
2019-02-21 10:26:12 -08:00
Rafael Antognolli
36138bb7fc iris/clear: Pass on render_condition_enabled. 2019-02-21 10:26:12 -08:00
Rafael Antognolli
8190165d13 iris: Avoid leaking if we fail to allocate the aux buffer.
Otherwise we could leak the aux state map or the aux BO.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
7da53d7188 iris: Only resolve compute resources for compute shaders 2019-02-21 10:26:12 -08:00
Kenneth Graunke
95a36bd55c iris: Fix aux usage in render resolve code 2019-02-21 10:26:12 -08:00
Rafael Antognolli
4f191feb0c iris: Pin HiZ buffers when rendering. 2019-02-21 10:26:12 -08:00
Rafael Antognolli
dfd54f9954 iris: Flush before hiz_exec. 2019-02-21 10:26:12 -08:00
Kenneth Graunke
f3f7d45a63 iris: Allow disabling aux via INTEL_DEBUG options 2019-02-21 10:26:12 -08:00
Kenneth Graunke
4634b754f4 iris: do flush for buffers still 2019-02-21 10:26:12 -08:00
Kenneth Graunke
15822f33ad iris: make surface states for CCS_D too
CCS_E can fall back to CCS_D with incompatible format views

CCS_D is pretty useless without fast clears and we may as well use NONE,
but we're surely going to hook those up at some point, so may as well
just go ahead and do it now...
2019-02-21 10:26:12 -08:00
Rafael Antognolli
689b590069 iris: Skip msaa16 on gen < 9.
Also needed to add gen information to KEY_INIT.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
fd2038b22a iris: Set program key fields for MCS 2019-02-21 10:26:12 -08:00
Kenneth Graunke
92c310fd3f iris: don't use hiz for MSAA buffers 2019-02-21 10:26:12 -08:00
Kenneth Graunke
2cddc953cd iris: some initial HiZ bits 2019-02-21 10:26:12 -08:00
Kenneth Graunke
9b1126c990 iris: disable aux for external things 2019-02-21 10:26:12 -08:00
Kenneth Graunke
45f4dab62b iris: Resolves for compute 2019-02-21 10:26:12 -08:00
Kenneth Graunke
ecc897b8ad iris: consider framebuffer parameter for aux usages 2019-02-21 10:26:12 -08:00
Kenneth Graunke
b77d2dc71b iris: Make blit code use actual aux usages 2019-02-21 10:26:12 -08:00
Kenneth Graunke
bfc76d3525 iris: store modifier info in res 2019-02-21 10:26:12 -08:00
Kenneth Graunke
56f1fe3eac iris: pin the buffers 2019-02-21 10:26:12 -08:00
Kenneth Graunke
f8aa9aa353 iris: resolve before transfer maps 2019-02-21 10:26:12 -08:00
Kenneth Graunke
c53a67d469 iris: be sure to skip buffers in resolve code
Buffers don't have ISL surfaces, and this can get us into trouble.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
5eb75345b8 iris: try to fix copyimage vs copybuffers 2019-02-21 10:26:12 -08:00
Kenneth Graunke
d8f3bc1c4c iris: actually use the multiple surf states for aux modes 2019-02-21 10:26:12 -08:00
Kenneth Graunke
3c979b0e6d iris: add some draw resolve hooks 2019-02-21 10:26:12 -08:00
Kenneth Graunke
53c484ba8a iris: blorp using resolve hooks 2019-02-21 10:26:12 -08:00
Kenneth Graunke
77a1070d36 iris: Initial import of resolve code 2019-02-21 10:26:12 -08:00
Kenneth Graunke
f879349398 iris: create aux surface if needed 2019-02-21 10:26:12 -08:00
Kenneth Graunke
3efd5299af iris: Fill out SURFACE_STATE entries for each possible aux usage 2019-02-21 10:26:12 -08:00
Kenneth Graunke
3cfc6a207b iris: Fill out res->aux.possible_usages 2019-02-21 10:26:12 -08:00
Kenneth Graunke
a7bc4d6074 iris: Add iris_resource fields for aux surfaces
But without fast clears or HiZ per-level tracking just yet.
2019-02-21 10:26:12 -08:00
Jordan Justen
d0996d5fab iris: Emit default L3 config for the render pipeline
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:12 -08:00
Kenneth Graunke
51ddc40084 iris: Always emit at least one BLEND_STATE 2019-02-21 10:26:12 -08:00
Kenneth Graunke
d6dd57d43c iris: Add missing depth cache flushes 2019-02-21 10:26:12 -08:00
Kenneth Graunke
1b5c342f33 iris: Simplify iris_get_depth_stencil_resources
We can safely assume that the given resource is depth, depth/stencil,
or stencil already.  The stencil-only case is easily detectable with
a single format check, and all other cases are handled identically.

This saves some CPU overhead.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
07ec1f0b25 iris: Make an IRIS_MAX_MIPLEVELS define 2019-02-21 10:26:12 -08:00
Rafael Antognolli
455c959689 iris: Store internal_format when getting resource from handle. 2019-02-21 10:26:12 -08:00
Kenneth Graunke
973f01d55a iris: Move create and bind driver hooks to the end of iris_program.c
This just moves the code for dealing with pipe_shader_state /
pipe_compute_state / iris_uncompiled_shader to the end of the file.
Now that those do precompiles, they want to call the actual compile
functions.  Putting them at the end eliminates the need for a bunch
of prototypes.
2019-02-21 10:26:12 -08:00
Timur Kristóf
cacf84ed5f iris: implement clearing render target and depth stencil
v2 (Kenneth Graunke): split color/depthstencil cases, fix iris_clear
2019-02-21 10:26:12 -08:00
Kenneth Graunke
8ab82bd1fd iris: Drop XXX about checking for swizzling
Caio noted that this is not necessary on Gen8+:

   "Before Gen8, there was a historical configuration control field to
    swizzle address bit[6] for in X/Y tiling modes.  This was set in
    three different places: TILECTL[1:0], ARB_MODE[5:4], and
    DISP_ARB_CTL[14:13].  For Gen8 and subsequent generations, the
    swizzle fields are all reserved, and the CPU's memory controller
    performs all address swizzling modifications."

Since we don't support earlier hardware, we can skip it entirely.
2019-02-21 10:26:12 -08:00
Kenneth Graunke
bf23e79629 iris: Set HasWriteableRT correctly
A bit of irritating state cross dependency here, but nothing too hard
2019-02-21 10:26:12 -08:00
Kenneth Graunke
d612cd1bf8 iris: Set 3DSTATE_WM::ForceThreadDispatchEnable
The Vulkan driver only sets this if color writes are disabled, which
is more conservative - but would require us to inspect blend state.

(If color writes are enabled, we don't need to force anything, because
the internal signal is already correct.  But it shouldn't hurt to do so.)
2019-02-21 10:26:12 -08:00
Kenneth Graunke
27d751cdd8 iris: Drop XXX about alpha testing
I was misreading i965 - the 3DSTATE_WM::PixelShaderKillsPixel bit from
Gen < 8 needed all of this, but the 3DSTATE_PS_EXTRA bit only needs
prog_data->uses_kill.
2019-02-21 10:26:12 -08:00
Andre Heider
bffb65d28e iris: improve PIPE_CAP_VIDEO_MEMORY bogus value
-1 is a little too bogus for most games ;)

Signed-off-by: Andre Heider <a.heider@gmail.com>
2019-02-21 10:26:12 -08:00
Andre Heider
f89a578818 iris: fix build with gallium nine
Signed-off-by: Andre Heider <a.heider@gmail.com>
2019-02-21 10:26:12 -08:00
Kenneth Graunke
be49fb051d iris: Stop chopping off the first nine characters of the renderer string 2019-02-21 10:26:12 -08:00
Kenneth Graunke
15341778ba iris: rework num textures to util_lastbit 2019-02-21 10:26:12 -08:00
Kenneth Graunke
974229df46 iris: Add PIPE_CAP_MAX_VARYINGS 2019-02-21 10:26:11 -08:00
Kenneth Graunke
1cd001aa63 iris: Make a iris_batch_reference_signal_syncpt helper function.
Suggested by Chris Wilson.  More obvious what's going on.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
9376799bd6 iris: Use READ_ONCE and WRITE_ONCE for snapshots_landed
Suggested by Chris Wilson, if only to make it obvious to the human
readers that these are volatile reads.  It may also be necessary for
the compiler in a few cases.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
18e31a9b31 iris: Fix accidental busy-looping in query waits
When switching from bo_wait to sync-points, I missed that we turned an
if (not landed) bo_wait into a while (not landed) check_syncpt(), which
has a timeout of 0.  This meant, rather than sleeping until the batch
is complete, we'd busy-loop, continually asking the kernel "is the batch
done yet???".  This is not what we want at all - if we wanted a busy
loop, we'd just loop on !snapshots_landed.  We want to sleep.

Add an effectively infinite timeout so that we sleep.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
3b1ac8244e iris: Add a timeout_nsec parameter, rename check_syncpt to wait_syncpt
I want to be able to wait with a non-zero timeout from elsewhere.
2019-02-21 10:26:11 -08:00
Sagar Ghuge
c24a574e6c iris: Don't allocate a BO per query object
Instead of allocating 4K BO per query object, we can create a large blob
of memory and split it into pieces as required.

Having one BO for multiple query objects, we don't want to wait on all
of them, instead when we write last snapshot, we create a sync point, and
check syncpoints while waiting on particular object.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-02-21 10:26:11 -08:00
Kenneth Graunke
a1ebac3750 iris: Implement ALT mode for ARB_{vertex,fragment}_shader
Fixes gl-1.0-spot-light
2019-02-21 10:26:11 -08:00
Kenneth Graunke
732c3a90a4 iris: Fix bug in bound vertex buffer tracking
res might be NULL, at which point this is an unbind.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
4bfd12bbf7 iris: minor tidying 2019-02-21 10:26:11 -08:00
Kenneth Graunke
b1bacbf038 iris: Unreference some more things on state module teardown 2019-02-21 10:26:11 -08:00
Kenneth Graunke
e092ed9213 iris: Drop dead state_size hash table
I inherited this from i965.  It would be nice to track the state size
so INTEL_DEBUG=color,bat decoding can print the right number of e.g.
binding table entries or blend states, but...without a single point
of entry for state, it's a little tricky to get right.  Punt for now,
and drop the dead code in the meantime.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
6e41f1b459 iris: Drop comment about ISP_DIS
i965 re-emits 3DSTATE_CONSTANT_* on every batch, so there's no point in
restoring the constants from the context.  Iris actually re-pins the
constant buffers properly across the batch, and avoids re-emitting the
constant packets unless it's necessary.  So, we don't want ISP_DIS.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
edd3ce5a63 iris: Enable PIPE_CAP_COMPACT_ARRAYS 2019-02-21 10:26:11 -08:00
Kenneth Graunke
1db394f46b iris: Remap stream output indexes back to VARYING_SLOT_*.
Previously I had a hack in st/mesa to make it stop remapping
VARYING_SLOT_* into the naively compacted slots, which aren't
what we want.  But that wasn't very feasible, as we'd have to
update all drivers, or add capability bits, and it gets messy fast.

It turns out that I can map back to VARYING_SLOT_* in about 5 LOC,
so let's just do that.  It removes the need for hacks, and is easy.

This also fixes KHR-GL46.enhanced_layouts.xfb_capture_struct, which
apparently with my hack was still getting the wrong slot info.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
5d3d757178 iris: Zero the compute predicate when changing the render condition
1. Set a render condition.  We emit it immediately on the render
   engine, and stash q->bo as ice->state.compute_predicate in case
   the compute engine needs it.

2. Clear the render condition.  We were incorrectly leaving a stale
   compute_predicate kicking around...

3. Dispatch compute.  We would then read the stale compute predicate,
   and try to load it into MI_PREDICATE_DATA.  But q->bo may have been
   freed altogether, causing us to try and use garbage memory as a BO,
   adding it to the validation list, failing asserts, and tripping
   EINVALs in execbuf.

Huge thanks to Mark Janes for narrowing this sporadic GL CTS failure
down to a list of 48 tests I could easily run to reproduce it.  Huge
thanks to the Valgrind authors for the memcheck tool that immediately
pinpointed the problem.
2019-02-21 10:26:11 -08:00
Caio Marcelo de Oliveira Filho
4fd1f70e62 iris: always include an extra constbuf0 if using UBOs
In st_nir_lower_uniforms_to_ubo() all UBO access in the shader have
its index incremented to open room for uniforms in constbuf0.  So if
we use UBOs, we always need to include the extra binding entry in the
table.

To avoid doing this checks both when compiling the shader and when
assigning binding tables, store the num_cbufs in iris_compiled_shader.

Fixes a bunch of tests from Piglit and CTS that use UBOs but don't use
uniforms or system values.  Note that some tests fitting this criteria
were passing because the UBOs were moved to be push
constants (avoiding the problem).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 10:26:11 -08:00
Kenneth Graunke
4801af2f26 iris: Do binder address allocations per-context, not globally.
iris_bufmgr allocates addresses across the entire screen, since buffers
may be shared between multiple contexts.  There used to be a single
special address, IRIS_BINDER_ADDRESS, that was per-context - and all
contexts used the same address.  When I moved to the multi-binder
system, I made a separate memory zone for them.  I wanted there to be
2-3 binders per context, so we could cycle them to avoid the stalls
inherent in pinning two buffers to the same address in back-to-back
batches.  But I figured I'd allow 100 binders just to be wildly
excessive/cautious.

What I didn't realize was that we need 2-3 binders per *context*,
and what I did was allocate 100 binders per *screen*.  Web browsers,
for example, might have 1-2 contexts per tab, leading to hundreds of
contexts, and thus binders.

To fix this, we stop allocating VMA for binders in bufmgr, and let
the binder handle it itself.  Binders are per-context, and they can
assign context-local addresses for the buffers by simply doing a
ringbuffer style approach.  We only hold on to one binder BO at a
time, so we won't ever have a conflicting address.

This fixes dEQP-EGL.functional.multicontext.non_shared_clear.

Huge thanks to Tapani Pälli for debugging this whole mess and
figuring out what was going wrong.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-21 10:26:11 -08:00
Kenneth Graunke
0f33204f05 iris: Fix memzone_for_address for the surface and binder zones
We use > for IRIS_MEMZONE_DYNAMIC because IRIS_BORDER_COLOR_POOL_ADDRESS
lives at the very start of that zone.  However, IRIS_MEMZONE_SURFACE and
IRIS_MEMZONE_BINDER are normal zones.  They used to be a single zone
(surface) with a single binder BO at the beginning, similar to the
border color pool.  But when I moved us to multiple binders, I made them
have a real zone (if a small one).  So both zones should use >=.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-21 10:26:11 -08:00
Kenneth Graunke
3bcb1a7fcd iris: Don't whack SO dirty bits when finishing a BLORP op
Re-emitting 3DSTATE_SO_BUFFERS can be hazardous, as it could zero
offsets.  Plus, it's just not necessary - BLORP doesn't change these.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
b9697dd820 iris: Fix SO issue with INTEL_DEBUG=reemit, set fewer bits
INTEL_DEBUG=reemit was breaking streamout tests, by re-emitting
3DSTATE_SO_BUFFER commands that tell the HW to zero the SO write
offsets.  We would need to alter them to use 0xFFFFFFFF for the offset.

Also, have each upload function only flag bits relevant to its own
pipeline.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
61798e3c88 iris: CS stall on VF cache invalidate workarounds
See commit 31e4c9ce40 in i965.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
c81941f1e7 iris: Pay attention to blit masks
For combined depth/stencil formats, we may want to only blit one half.
If PIPE_BLIT_Z is set, blit depth; if PIPE_BLIT_S is set, blit stencil.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7837fec740 iris: Assert about blits with color masking
st/mesa never asks for this today, but in theory someone might, and we
don't support it.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
0f677b0d87 iris: Don't enable smooth points when point sprites are enabled
dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_*.primitives.points
2019-02-21 10:26:11 -08:00
Kenneth Graunke
3b336a1513 iris: Allow sample mask of 0
I think this was an attempt to work around various sample mask bugs I
had early on.  It's not correct.  A sample mask of 0 is legal and means
to disable all samples.

Fixes dEQP-GLES31.functional.texture.multisample.*.*sample_mask*
2019-02-21 10:26:11 -08:00
Kenneth Graunke
e17333ea1e iris: fail to create screen for older unsupported HW
loader shouldn't try, but let's be paranoid
2019-02-21 10:26:11 -08:00
Kenneth Graunke
1f91f688e8 iris: Switch to the new PIPELINE_STATISTICS_QUERY_SINGLE capability
I had a hack in place earlier to pass the query type as q->index
for the regular statistics query, but we ended up adjusting the
interface and adding a new query type.  Use that instead, fixing
pipeline statistics queries since the rebase.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
a23c06cabc iris: Use new PIPE_STAT_QUERY enums rather than hardcoded numbers. 2019-02-21 10:26:11 -08:00
Kenneth Graunke
5aef30b886 iris: Fix Broadwell WaDividePSInvocationCountBy4
We were dividing by 4 in calculate_result_on_gpu(), and also in
iris_get_query_result().  We should stop doing the latter, and instead
divide by 4 in calculate_result_on_cpu() as well.

Otherwise, if snapshots were available, and you hit the
calculate_result_on_cpu() path, but requested it be written to a QBO,
you'd fail to get a divide.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7f318bf2ac iris: Delete genx->bound_vertex_buffers
This is actually stored in ice->state, as it isn't gen-specific
2019-02-21 10:26:11 -08:00
Kenneth Graunke
02991e2878 iris: Drop a dead comment 2019-02-21 10:26:11 -08:00
Kenneth Graunke
572fad1e84 iris: Don't check other batches for our batch BO
This is an awkward corner case.  We create batches in order, each of
which creates and pins a BO.  The other batches may not be set up yet,
so it may not be safe to ask whether they reference a BO.

Just avoid this for now.  We could avoid it for other context-local BOs
too, but we currently don't have a flag for that (and I'm not certain
whether it's worth it).
2019-02-21 10:26:11 -08:00
Kenneth Graunke
8eda6f2288 iris: Handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE somewhat
Various places in the transfer code need to know whether they must
read the existing resource's values.  Rather than checking both flags
everywhere, just make PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE also flag
PIPE_TRANSFER_DISCARD_RANGE - if we can discard everything, we can
discard a subrange, too.

Obviously, we can do better for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE,
but eventually u_threaded_context should handle swapping out buffers
for new idle buffers, anyway.  In the meantime, this is at least better.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
bacc722d13 iris: Flush the render cache in flush_and_dirty_for_history
BLORP uses the render engine to write to buffers, and we need to flush
that data out to the actual surface (finishing the write).  Then, the
rest of this function invalidates any caches that might have stale data
which needs to be refetched.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7a9e87c224 iris: Implement multi-slice copy_region
I don't know if this is required - surprisingly, I haven't seen it
matter - but I'd like to use it for multi-slice transfer maps.  We may
as well do the right thing.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
307f3f9924 iris: Leave a comment about why Broadwell images are broken
There are a variety of ways to fix this, many of which are simple, but
I could use some advice on which ones other people prefer, and so we'll
punt until after the holidays.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7ed1383c0a iris: Fix surface states for Gen8 lowered-to-untype images
We have to use SURFTYPE_BUFFER and ISL_FORMAT_RAW for these.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
477e7d575b iris: Fill out brw_image_params for storage images on Broadwell 2019-02-21 10:26:11 -08:00
Kenneth Graunke
7e35333c73 iris: Don't make duplicate system values
We were relying on CSE/GVN/etc to coalesce all intrinsics that load the
same value, but that's a bad idea.  We might have a couple intrinsics
that reload the same value.  If so, we only want to set up the uniform
on the first one we see.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
bc3bb28645 iris: Don't enable push constants just because there are system values
System values are built-in uniforms.  We set them up as UBO values, and
might pull or push them.  UBO push analysis will take care of that.  We
only want to enable push constants if there's an actual range being
pushed.  Otherwise, we might get into a scenario where 3DSTATE_PS
enables push constants but 3DSTATE_CONSTANT_PS isn't pushing anything.

This fixes GPU hangs in Broadwell image load store tests which have
unused image param system values but no other uniforms.  (We shouldn't
be making those anyway, but that's a separate fix...)
2019-02-21 10:26:11 -08:00
Kenneth Graunke
2ca0d913ea iris: Fix framebuffer layer count
cso_fb->layers is only valid for no-attachment framebuffers.  Use the
helper function to get the real value, then stash it so we don't have
to call the helper function on the old value for comparison, or at draw
time for Force Zero RTA Index setting.

This fixes Force Zero RTA Index being set even when attempting layered
rendering.
2019-02-21 10:26:11 -08:00
Dave Airlie
df60241ff7 iris: handle qbo fragment shader invocation workaround 2019-02-21 10:26:11 -08:00
Dave Airlie
5ae2e5aa94 iris: add fs invocations query workaround for broadwell 2019-02-21 10:26:11 -08:00
Dave Airlie
8806b29e16 iris: setup gen8 caps 2019-02-21 10:26:11 -08:00
Dave Airlie
1bbf095473 iris: limit gen8 to 8 samples 2019-02-21 10:26:11 -08:00
Dave Airlie
823609b1a3 iris/WIP: add broadwell support
This adds all the state changes, MOCS changes,
2019-02-21 10:26:11 -08:00
Kenneth Graunke
5be72d9a20 iris: Delete bogus comment about cube array counting.
Both 'z' and 'depth' are counted in slices, according to the Gallium
docs (context.rst).  In our temporary memory, we allocate `box.depth`
slices, so we need to rebase the starting slice (box.z) down to 0,
and back again when writing on unmap.

There's nothing strange about cubes here.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
73709be0c3 iris: Fix compute scratch pinning
Thanks to Eero Tamminen for helping catch this.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
3ab3aa23c2 iris: Add a more long term TODO about timebase scaling 2019-02-21 10:26:11 -08:00
Kenneth Graunke
7ddc1f8ded iris: Only resolve inputs for actual shader stages
We don't need to consider compute at render time, and don't need to
consider disabled stages.  4% on drawoverhead.
2019-02-21 10:26:11 -08:00
Rhys Kidd
6c17e7d95f iris: Fix assertion in iris_resource_from_handle() tiling usage
Assertion error:

  iris_resource_from_handle: Assertion `res->bo->tiling_mode ==
      isl_tiling_to_i915_tiling(res->surf.tiling)' failed.

This patch fixes 16 piglit tests on KBL:
glx/glx-multithread-texture
glx/glx-query-drawable-glx_fbconfig_id-glxpbuffer
glx/glx-query-drawable-glx_fbconfig_id-glxpixmap
glx/glx-query-drawable-glx_preserved_contents
glx/glx-query-drawable-glxpbuffer-glx_height
glx/glx-query-drawable-glxpbuffer-glx_width
glx/glx-query-drawable-glxpixmap-glx_height
glx/glx-query-drawable-glxpixmap-glx_width
glx/glx-swap-pixmap
glx/glx-swap-pixmap-bad
glx/glx-tfp
glx/glx-visuals-depth -pixmap
glx/glx-visuals-stencil -pixmap
spec/egl 1.4/eglcreatepbuffersurface and then glclear
spec/egl 1.4/largest possible eglcreatepbuffersurface and then glclear
spec/egl_nok_texture_from_pixmap/basic

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
2019-02-21 10:26:11 -08:00
Kenneth Graunke
73d525f188 iris: Fix scratch space allocation on Icelake.
Gen9-10 have fewer than 4 subslices per slice, so they need this to be
rounded up.  Gen11 isn't documented as needing this hack, and it can
also have more than 4 subslices, so the hack actually can break things.

Fixes tests/spec/arb_enhanced_layouts/execution/component-layout/
sso-vs-gs-fs-array-interleave
2019-02-21 10:26:11 -08:00
Kenneth Graunke
154e3e45bb iris: better MOCS 2019-02-21 10:26:11 -08:00
Dave Airlie
aaaf611130 iris: fix gpu calcs for timestamp queries 2019-02-21 10:26:11 -08:00
Kenneth Graunke
3c45d03049 iris: only mark depth/stencil as writable if writes are actually enabled 2019-02-21 10:26:11 -08:00
Kenneth Graunke
3a938a4b23 iris: more dead comments 2019-02-21 10:26:11 -08:00
Kenneth Graunke
e169cb09c3 iris: pin and re-pin the scratch BO 2019-02-21 10:26:11 -08:00
Kenneth Graunke
dd0d47a5d2 iris: delete finished comments 2019-02-21 10:26:11 -08:00
Kenneth Graunke
32ee2e4c27 iris: always pin the binder...in the compute context, too.
not sure why this hasn't tripped things up
2019-02-21 10:26:11 -08:00
Kenneth Graunke
fbfe07c4f3 iris: Track blend enables, save outbound for resolve code 2019-02-21 10:26:11 -08:00
Kenneth Graunke
5481887ca8 iris: whitespace fixes 2019-02-21 10:26:11 -08:00
Kenneth Graunke
b2fa90706e iris: Make a alloc_surface_state helper
This does the gtt_offset addition for us
2019-02-21 10:26:11 -08:00
Kenneth Graunke
b358c4b92b iris: Use a surface state fill helper
This will check aux_usage eventually
2019-02-21 10:26:11 -08:00
Kenneth Graunke
b92ca4d0f6 iris: don't print the pointer in INTEL_DEBUG=submit
lots of noise in diff, hope was it would be useful for gdb, but the
the GEM handle is good enough
2019-02-21 10:26:11 -08:00
Kenneth Graunke
ad969a00c0 iris: Fix the prototype for iris_bo_alloc_tiled
This now matches the actual function in iris_bufmgr.c, as well as the
equivalent brw_bufmgr.c function...
2019-02-21 10:26:11 -08:00
Kenneth Graunke
598a78849e iris: Fix for PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET
This fixes ext_transform_feedback-builtin-varyings gl_Position after the
combination of my transform feedback reworks and my vertex buffer
reworks (?)
2019-02-21 10:26:11 -08:00
Kenneth Graunke
392fba5f31 iris: drop unnecessary genx->streamout field 2019-02-21 10:26:11 -08:00
Kenneth Graunke
5307ff6a5f iris: Implement DrawTransformFeedback()
We get the count by dividing the offset by the stride.
2019-02-21 10:26:11 -08:00
Jason Ekstrand
2e103fff63 iris: Copy anv's MI_MATH helpers for multiplication and division
(import done by Ken but with author set to Jason because it's his
code that's being imported, so he deserves the credit)
2019-02-21 10:26:11 -08:00
Kenneth Graunke
52baba80f3 iris: only get space for one offset in stream output targets
Target corresponds to a buffer, buffer only records one offset, not
multiple.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
31357bae4b iris: Move iris_stream_output_target def to iris_context.h
now that it doesn't have genxml
2019-02-21 10:26:10 -08:00
Kenneth Graunke
cf4931e586 iris: Don't bother packing 3DSTATE_SO_BUFFER at create time
We have to do half the packet late anyway, we may as well just do it
all at set time.  This also lets us move the struct def out of genxml
2019-02-21 10:26:10 -08:00
Kenneth Graunke
754d678b0a iris: Add _MI_ALU helpers that don't paste
This lets you pass arguments as function parameters
2019-02-21 10:26:10 -08:00
Kenneth Graunke
5094062bbe iris: Reorder LRR parameters to have dst first.
LRI and LRM both put dst first, be consistent.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
2f5d85661f iris: rewrite set_vertex_buffer and VB handling
I was using the Gallium API wrong.  set_* functions with start_slot
and count parameters are supposed to update a subrange of the items.
I had been trashing all bound vertex buffers and starting over.

This should hopefully also make it easier to slot in additional
VERTEX_BUFFER_STATEs at draw time, say, for shader draw parameters.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
286b8b8f99 iris: handle PatchVerticesIn as a system value. 2019-02-21 10:26:10 -08:00
Tapani Pälli
96bb328e9b iris: add Android build
Note that at least following additional libs/components require changes
since they refer to BOARD_GPU_DRIVERS variable which is used to select
the driver:

  - mixins
  - minigbm
  - libdrm
  - drm_gralloc

v2: (feedback by Gustaw Smolarczyk) Fix trailing \ in a few cases

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-21 10:26:10 -08:00
Kenneth Graunke
97e82e80f9 iris: override alpha to one src1 blend factors
No idea why this used to pass and doesn't after updating...seems like
we should have been handling it all along...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
90b2745148 iris: Always do rasterizer discard in clipper
but continue doing it in SOL if possible because it's faster

Fixes ./bin/ext_transform_feedback-discard-drawarrays - simpler too
2019-02-21 10:26:10 -08:00
Kenneth Graunke
5f511798d0 iris: Fix primitive generated query active flag 2019-02-21 10:26:10 -08:00
Kenneth Graunke
99cab4d381 iris: Enable guardband clipping 2019-02-21 10:26:10 -08:00
Kenneth Graunke
f062dcdfbb iris: Clamp viewport extents to the framebuffer dimensions
Fixes arb_framebuffer_no_attachments-query's resize subtest.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
fb2df1b5d5 iris: Fix clear dimensions
Fixes depthstencil-render-miplevels 1024 s=z24_s8
2019-02-21 10:26:10 -08:00
Kenneth Graunke
2e79e46d23 iris: Drop continues in resolve
Now that we u_bit_scan we know it exists
2019-02-21 10:26:10 -08:00
Kenneth Graunke
5fde1fa988 iris: Replace num_textures etc with a bitmask we can scan
More accurate bounds, plus can skip dead ones
2019-02-21 10:26:10 -08:00
Kenneth Graunke
7ad7d0beea iris: Fix set_sampler_views with start > 0 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1c6fea8e7b iris: fix set_sampler_views to not unbind, be better about bounds 2019-02-21 10:26:10 -08:00
Kenneth Graunke
598ce8e88e iris: fix overhead regression from flushing for storage images
st calls us with count = 32 but a NULL pointer...we only really care
about the highest non-NULL image...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
4749f6cc4f iris: Fix NOS mechanism
Set bits, not values
2019-02-21 10:26:10 -08:00
Kenneth Graunke
a24734a2d7 iris: re-pin inherited streamout buffers 2019-02-21 10:26:10 -08:00
Kenneth Graunke
19803d0aa7 iris: reemit SBE when sprite coord origin changes
fixes arb_point_sprite-checkerboard
2019-02-21 10:26:10 -08:00
Kenneth Graunke
480c62bc7e iris: omask can kill 2019-02-21 10:26:10 -08:00
Kenneth Graunke
bd031eb2e8 iris: reject all clipping when we can't use streamout render disabled 2019-02-21 10:26:10 -08:00
Kenneth Graunke
72cf2185c8 iris: make clipper statistics dynamic 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1114f0c1ce iris: CS stall for stream out -> VB
i965 doesn't do this, but I suspect it just stalls a lot and doesn't hit
this.  Fixes ext_transform_feedback-position render among others.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c03fbb41aa iris: fix dma buf import strides 2019-02-21 10:26:10 -08:00
Kenneth Graunke
90274bd48f iris: fix alpha channel for RGB BC1 formats 2019-02-21 10:26:10 -08:00
Jason Ekstrand
47d4ea1a16 iris: Allocate buffer resources separately
(cleaned up by Ken - make sure a bunch of things were more obviously
not using res->surf, do allow checking res->surf.tiling == LINEAR,
drop format cpp checks that aren't needed, drop memzone handling for
images, assume buffers / non-buffers in a few places...)
2019-02-21 10:26:10 -08:00
Kenneth Graunke
585c95f8cc iris: Don't bother considering if the underlying surface is a cube
Dave fixed it to consider whether the sampler view is a cube.
With that, there's no point (possibly harm) in looking if the original
resource was a cube...if it's an array view, we don't want to treat it
as a cube anymore...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
773adeb9e9 iris: move some non-buffer case code in a bit 2019-02-21 10:26:10 -08:00
Kenneth Graunke
2c0f001295 iris: Stop leaking iris_uncompiled_shaders like mad
Now shader-db actually executes.  We still need a plan for culling
dead iris_compiled_shaders...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
68d531d7d7 iris: Destroy the bufmgr
Plugs a 12360 byte leak
2019-02-21 10:26:10 -08:00
Kenneth Graunke
7c29c3d01e iris: Fix IRIS_MEMZONE_COUNT to exclude the border color pool
This is supposed to exclude single address zones.  We were getting
too many VMA allocators but failing to set them up, which worked out
because we also forgot to destroy them...
2019-02-21 10:26:10 -08:00
Kenneth Graunke
6cb211121b iris: Unref unbound_tex resource
Plugs a 12536 byte leak
2019-02-21 10:26:10 -08:00
Kenneth Graunke
f73fdb4001 iris: Destroy the border color pool
This plugs a 12224 byte leak
2019-02-21 10:26:10 -08:00
Kenneth Graunke
3d55e9a2aa iris: Destroy transfer helper on screen teardown
Plugs a 16 byte leak
2019-02-21 10:26:10 -08:00
Kenneth Graunke
bdc1269eb2 iris: Fix failed to compile TCS message 2019-02-21 10:26:10 -08:00
Kenneth Graunke
fbf3124771 iris: Rework tiling/modifiers handling
We were being very picky about things being Y tiled.  But, not
everything can be - for example, > 16382 surfaces on SKL GT1-3
have to fall back to linear.

Instead, give ISL options and let it pick.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
761a5fb36a iris: fix conditional compute, don't stomp predicate for pipelined queries 2019-02-21 10:26:10 -08:00
Kenneth Graunke
40b12c103c iris: check query first
this lets us avoid the predicate bit in more cases, which is nice
2019-02-21 10:26:10 -08:00
Kenneth Graunke
0c3ea03e4b iris: for BLORP, only use the predicate enable bit when USE_BIT 2019-02-21 10:26:10 -08:00
Dave Airlie
7bbf3ff4a9 iris: add conditional render support 2019-02-21 10:26:10 -08:00
Kenneth Graunke
dbe198d6ba iris: drop key_size_for_cache
dead since my program cache API rework.  we could still use it for one
function, but it's so trivial to pass the size, that it's probably not
worth the extra code
2019-02-21 10:26:10 -08:00
Dave Airlie
e4115eaca0 iris: iris add load register reg32/64
These will be needed for broadwell and conditional render
2019-02-21 10:26:10 -08:00
Dave Airlie
311a1b3198 iris: execute compute related query on compute batch.
This only happens for the compute invocations query.
2019-02-21 10:26:10 -08:00
Dave Airlie
00645ea01c iris: fix cube texture view 2019-02-21 10:26:10 -08:00
Kenneth Graunke
39d1056d10 iris: fix some SO overflow query bugs and tidy the code a bit 2019-02-21 10:26:10 -08:00
Dave Airlie
527e5bcdc7 iris: add initial transform feedback overflow query paths (V3)
v2: fix cpu overflow calc
v3: use a struct
2019-02-21 10:26:10 -08:00
Kenneth Graunke
0ded23a552 iris: actually flush for storage images 2019-02-21 10:26:10 -08:00
Kenneth Graunke
69e97670bc iris: add an extra BT assert from Chris Wilson 2019-02-21 10:26:10 -08:00
Kenneth Graunke
4312784674 iris: add assertions about binding table starts 2019-02-21 10:26:10 -08:00
Kenneth Graunke
240615695d iris: drop pull constant binding table entry
nothing uses this
2019-02-21 10:26:10 -08:00
Kenneth Graunke
10d04cdaa4 iris: Use program's num textures not the state tracker's bound
the state tracker might bind more textures than the program is using.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
855ff47d36 iris: Enable precompiles 2019-02-21 10:26:10 -08:00
Kenneth Graunke
ed4ffb9715 iris: rework program cache interface
This exposes iris_upload_shader() without having to bind it, which will
be useful for precompiles.  It also lets us examine the old programs and
flag dirty bits at a higher level, rather than cramming all that
knowledge into the cache layer.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
701a6b6006 iris: Use wrappers for create_xs_state rather than a switch statement 2019-02-21 10:26:10 -08:00
Kenneth Graunke
e628095b9a iris: fix comment location 2019-02-21 10:26:10 -08:00
Kenneth Graunke
e5df8913e1 iris: export iris_upload_shader 2019-02-21 10:26:10 -08:00
Kenneth Graunke
d525b3dfad iris: fix prototype warning 2019-02-21 10:26:10 -08:00
Kenneth Graunke
84a8c63527 iris: Re-pin even if nothing is dirty 2019-02-21 10:26:10 -08:00
Kenneth Graunke
415ede346d iris: Flush for history at various moments
When we blit, transfer, or copy_resource to a buffer, we need to flush
to ensure any stale data for that buffer is invalidated in the caches.

bind_history will inform us which caches need to be flushed.

Also, for any push constant buffers, we need to flag those dirty so
that we re-emit 3DSTATE_CONSTANT_*, causing the data to be re-pushed.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c8579e708e iris: add iris_flush_and_dirty_for_history 2019-02-21 10:26:10 -08:00
Kenneth Graunke
d169747a3e iris: Track a binding history for buffer resources
This will let us know what caches to flush / state to dirty when
altering the contents of a buffer.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
f49f506b13 iris: drop long dead XXX comment 2019-02-21 10:26:10 -08:00
Kenneth Graunke
5dbd6df9f7 iris: Do the 48-bit vertex buffer address invalidation workaround 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1b1ea23766 iris: Fix VIEWPORT/LAYER in stream output info
Fixes glsl-1.50-transform-feedback-builtins and
ext_transform_feedback-builtin-varyings gl_PointSize
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c5b22441f1 iris: Fix buffer -> buffer copy_region
Size can be too large for a surf, blorp_buffer_copy chops things up
into segments we can actually handle

Fixes map_buffer_range_test and copy_buffer_coherency
2019-02-21 10:26:10 -08:00
Kenneth Graunke
beb2d5e065 iris: Lie about indirects
fixes interpolateAt tests
2019-02-21 10:26:10 -08:00
Kenneth Graunke
b9ccb00e2c iris: Enable ctx->Const.UseSTD430AsDefaultPacking
hooray for obscurely named pipe caps with bizarre descriptions!
2019-02-21 10:26:10 -08:00
Kenneth Graunke
39cb10613c iris: update comment 2019-02-21 10:26:10 -08:00
Kenneth Graunke
f9612e7682 iris: RT flush for memorybarrier with texture bit
PIXEL_BUFFER_BARRIER_BIT turns into PIPE_BARRIER_TEXTURE and it ought
to trigger an RT flush, according to brw_memory_barrier
2019-02-21 10:26:10 -08:00
Kenneth Graunke
2c23721397 iris: PIPE_CONTROL workarounds for GPGPU mode 2019-02-21 10:26:10 -08:00
Kenneth Graunke
f1a7392be1 iris: Put batches in an array
We keep re-making this array all over the place
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c2a77efa71 iris: put render batch first in fence code
this shouldn't matter, but it will make the next refactor easier
2019-02-21 10:26:10 -08:00
Kenneth Graunke
d918c09975 iris: flush the compute batch too if border pool is redone 2019-02-21 10:26:10 -08:00
Kenneth Graunke
017b556609 iris: leave a TODO 2019-02-21 10:26:10 -08:00
Chris Wilson
f459c56be6 iris: Add fence support using drm_syncobj 2019-02-21 10:26:10 -08:00
Kenneth Graunke
db199d9d07 iris: Add wait fences to properly sync between render/compute
When flushing a batch due to a data dependency, we need to not only
kick off the other batch's work, but stall our execution until it
completes.  Just wait on last_syncpt after flushing it.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
d69bc4ac12 iris: Hang on to the last batch's sync-point, so we can wait on it 2019-02-21 10:26:10 -08:00
Chris Wilson
fae74234d9 iris: Tag each submitted batch with a syncobj
(adjusted by Ken to make the signalling sync object immediately on
batch reset, rather than batch finish time.  this will work better
with deferred flushes...)
2019-02-21 10:26:10 -08:00
Kenneth Graunke
3e332af611 iris: Drop vestiges of throttling code 2019-02-21 10:26:10 -08:00
Chris Wilson
54347c078e iris: Merge two walks of the exec_bos list 2019-02-21 10:26:10 -08:00
Kenneth Graunke
3455f57575 iris: replace vestiges of fence fds with newer exec_fence API
patch by me and Chris Wilson
2019-02-21 10:26:10 -08:00
Kenneth Graunke
11da219be9 iris: Avoid synchronizing due to the workaround BO 2019-02-21 10:26:10 -08:00
Kenneth Graunke
30d7bebc8a iris: Avoid cross-batch synchronization on read/reads
This avoids flushing batches just because e.g. both are reading the same
dynamic state streaming buffer, or shader assembly buffer.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
b21e916a62 iris: Combine iris_use_pinned_bo and add_exec_bo 2019-02-21 10:26:10 -08:00
Kenneth Graunke
fb4c898842 iris: Use iris_use_pinned_bo rather than add_exec_bo directly
less special this way
2019-02-21 10:26:10 -08:00
Chris Wilson
e5528151a7 iris: Fix assigning the output handle for exporting for KMS
Fixes gbm_bo_get_handle() used for KMS in glamor.
2019-02-21 10:26:10 -08:00
Chris Wilson
01e729f883 iris: Tidy exporting the flink handle 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1b69b14c2a iris: Fix SLM
Now that Jason has set up the L3 we can do this.  Also, my assert was
useless because we hadn't set up the field in the first place.  Oops.
2019-02-21 10:26:10 -08:00
Jason Ekstrand
f9c5e277ac iris: Don't set constant read lengths at upload time
They're set in derived_data as part of store_cs_state
2019-02-21 10:26:10 -08:00
Jason Ekstrand
a90a0e22cb iris: Configure the L3$ on the compute context 2019-02-21 10:26:10 -08:00
Kenneth Graunke
25a41b1aef iris: properly pin stencil buffers 2019-02-21 10:26:10 -08:00
Kenneth Graunke
8545e39808 iris: Fix TCS/TES slot unification
TCS outputs, TES inputs...not TCS inputs

Fixes some barrier tests
2019-02-21 10:26:10 -08:00
Kenneth Graunke
da5590496e iris: more todo notes 2019-02-21 10:26:10 -08:00
Kenneth Graunke
9878ea842f iris: scissored and mirrored blits 2019-02-21 10:26:10 -08:00
Kenneth Graunke
25f194d5ac iris: more TODO 2019-02-21 10:26:10 -08:00
Kenneth Graunke
5207a5f5d5 iris: Fix independent alpha blending.
independent_blend_enable means per-RT blending, not RGB != A
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c06f6d12a5 iris: "Fix" transfer maps of buffers
x should be in bytes, not cpp units

This generally worked out because PIPE_BUFFER is supposedly required
to be R8_UINT or R8_UNORM.  I hear some state trackers pass
PIPE_FORMAT_NONE instead, however, which would make this break.

Just do the right thing directly, to be defensive and clear.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
b2c04aa3a0 iris: Fix SourceAlphaBlendFactor 2019-02-21 10:26:10 -08:00
Kenneth Graunke
89833eddab iris: leave another TODO 2019-02-21 10:26:10 -08:00
Kenneth Graunke
983e2ae7d2 iris: only clip lower if there's something to clip against 2019-02-21 10:26:10 -08:00
Kenneth Graunke
e11c497fc6 iris: fix sysval only binding tables 2019-02-21 10:26:10 -08:00
Kenneth Graunke
2ddbc1025e iris: don't forget to upload CS consts 2019-02-21 10:26:10 -08:00
Kenneth Graunke
f1f84a1ae7 iris: drop param stuffs 2019-02-21 10:26:10 -08:00
Kenneth Graunke
1b5d35319e iris: don't trip on param asserts
I'd rather not rewrite i965's compute system value handling right now :(
2019-02-21 10:26:10 -08:00
Kenneth Graunke
f4829a2fe1 iris: don't support pull constants.
I don't think it matters, we won't have any params anyway, but let's
be sure it doesn't try
2019-02-21 10:26:10 -08:00
Kenneth Graunke
911f9e8f3f iris: regather info so we get CLIP_DIST slots, not CLIP_VERTEX 2019-02-21 10:26:09 -08:00
Kenneth Graunke
6d19fe376d iris: enable push constants if we have sysvals but no uniforms 2019-02-21 10:26:09 -08:00
Kenneth Graunke
1ef68d77c0 iris: drop iris_setup_push_uniform_range
it doesn't do anything, we have no params.  I guess I thought there
would be some, but they all get dead code eliminated even if we try
to make them exist in the first place.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
7eeb124c02 iris: fix more uniform setup 2019-02-21 10:26:09 -08:00
Kenneth Graunke
50743eb748 iris: fix num clip plane consts 2019-02-21 10:26:09 -08:00
Kenneth Graunke
a98634a28f iris: actually upload clip planes. 2019-02-21 10:26:09 -08:00
Kenneth Graunke
c60ce3f4fd iris: bypass params and do it ourselves
the backend keeps dead code eliminating them all, so we can't do that,
plus we don't want to because params[] is lame
2019-02-21 10:26:09 -08:00
Kenneth Graunke
78fc760bab iris: dodge backend UCP lowering 2019-02-21 10:26:09 -08:00
Kenneth Graunke
deb6d588a6 iris: fix system value remapping 2019-02-21 10:26:09 -08:00
Kenneth Graunke
2b0a2915dc iris: hook up key stuff for clip plane lowering 2019-02-21 10:26:09 -08:00
Kenneth Graunke
2876dd1a37 iris: lower user clip planes 2019-02-21 10:26:09 -08:00
Kenneth Graunke
80c856cbee iris: only bother with params if there are any... 2019-02-21 10:26:09 -08:00
Kenneth Graunke
2186d83185 iris: fill out params array with built-ins, like clip planes 2019-02-21 10:26:09 -08:00
Kenneth Graunke
d3e8ff143d iris: add param domain defines 2019-02-21 10:26:09 -08:00
Kenneth Graunke
ecb28b2802 iris: drop unnecessary param[] setup from iris_setup_uniforms
the backend just considers these dead anyway
2019-02-21 10:26:09 -08:00
Kenneth Graunke
ed08f022f0 iris: Defer cbuf0 upload to draw time 2019-02-21 10:26:09 -08:00
Kenneth Graunke
e98cf9c24b iris: Clone the NIR
The backend compiler used to do this for us, but after a rebase, it's
now the driver's responsibility.  This lets us alter it for say, clip
vertex lowering, at the global level rather than the per-variant level.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
587e438128 iris: Print the batch name when decoding 2019-02-21 10:26:09 -08:00
Kenneth Graunke
2727a942a4 iris: partial set_query_active_state
used to avoid OQ during clears for example

fixes occlusion_query_meta_no_fragments
2019-02-21 10:26:09 -08:00
Kenneth Graunke
64af1d9248 iris: Fix multiple RTs with non-independent blending
rt[i] isn't filled out in this case, so we have to use rt[0]
2019-02-21 10:26:09 -08:00
Kenneth Graunke
58507c02ce iris: Fix TextureBarrier
I don't know how I came up with the old one, this is now what i965 does
Also we now do compute batches too
2019-02-21 10:26:09 -08:00
Kenneth Graunke
e5d84bbd36 iris: Fix MSAA smooth points
Fixes bin/ext_framebuffer_multisample-point-smooth 2 -auto -fbo
2019-02-21 10:26:09 -08:00
Kenneth Graunke
4d219b0eb3 iris: implement scratch space!
we borrow the approach from anv rather than i965, as it works better
with pre-baked state that needs to contain scratch BO addresses

fixes a bunch of varying packing tests
2019-02-21 10:26:09 -08:00
Kenneth Graunke
9511b89ef9 iris: tidy more warnings 2019-02-21 10:26:09 -08:00
Kenneth Graunke
846316b258 iris: Enable msaa_map transfer helpers
This does the downsampling for us.  It'll use BLORP anyway because
it uses blit(), and that uses BLORP.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
9ec927497e iris: Actually create/destroy HW contexts
The intention is that render and compute use their own contexts,
and each is PIPELINE_SELECT'd to the right pipeline.  But we hadn't
actually made them, so we got the fd-default context.

Thanks to Chris Wilson for catching this!
2019-02-21 10:26:09 -08:00
Kenneth Graunke
cb5f47f585 iris: Don't leak the compute batch 2019-02-21 10:26:09 -08:00
Kenneth Graunke
fbe5d75f11 iris: cross batch flushing 2019-02-21 10:26:09 -08:00
Kenneth Graunke
c3cc525c7a iris: Cross-link iris_batches so they can potentially flush each other
This makes e.g. the render batch aware of the compute batch, so it can
ask questions like "is this BO referenced by some other batch?" and do
something about that.
2019-02-21 10:26:09 -08:00
Dave Airlie
ed016b2a0b iris: fix crash in sparse vertex array
this fixes crash in array-stride piglit.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
bcac11c8f1 iris: Use at least 1x1 size for null FB surface state.
Otherwise we get 0 - 1 = 0xffffffff and fail to pack SURFACE_STATE.

Fixes some object namespace pollution gltexsubimage2d tests
2019-02-21 10:26:09 -08:00
Kenneth Graunke
9c8fdf8133 iris: Drop B5G5R5X1 support
This is oddly renderable but not supported for sampling, which is the
opposite of other X formats.  Just skip it and fall back to BGRA.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
4b31f506f8 iris: Enable A8/A16_UNORM in an inefficient manner
These are currently just use the 'A' hardware formats, rather than the
faster 'R' formats.  glBitmap handling needs these, it seems. :(
2019-02-21 10:26:09 -08:00
Kenneth Graunke
80497af192 iris: Enable ARB_shader_stencil_export 2019-02-21 10:26:09 -08:00
Kenneth Graunke
3e6aaa1ba5 iris: Disable a PIPE_CONTROL workaround on Icelake 2019-02-21 10:26:09 -08:00
Kenneth Graunke
84a419432d iris: Flag constants dirty on program changes
3DSTATE_CONSTANT_* looks at prog_data->ubo_ranges.  We were getting
saved by iris_set_constant_buffers() usually happening when changing
programs (as they usually change uniforms too), but with the clear
shader that doesn't use uniforms, we weren't getting one and were
leaving push constants enabled, screwing things up.

Also clean up a bit of a mess left by the hacks - we were missing
bindings in the VS/FS/CS case, among other issues...
2019-02-21 10:26:09 -08:00
Kenneth Graunke
317ba8796f iris: allow binding a null vertex buffer
PBO upload apparently does this...
2019-02-21 10:26:09 -08:00
Kenneth Graunke
aef1ba5ce4 iris: fix overhead regression from "don't stomp each other's dirty bits"
The change from dirty = 0ull to dirty &= ~NOT_MY_BITS broke the "nothing
to do?  skip it!" optimization.  thanks to Chris for noticing this!
2019-02-21 10:26:09 -08:00
Kenneth Graunke
525d89cafc iris: delete dead code 2019-02-21 10:26:09 -08:00
Kenneth Graunke
8a98e90415 iris: Fix refcounting of grid surface 2019-02-21 10:26:09 -08:00
Jason Ekstrand
8e8868d5ad iris/compute: Zero out the last grid size on indirect dispatches 2019-02-21 10:26:09 -08:00
Jason Ekstrand
c16e711ff2 iris/compute: Don't increment the grid size offset
It may be in the dynamic state buffer but the fact that we have a
resource takes care of that.  We don't need to add in the address of
the dynamic state buffer again.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
a3e813c5af iris: SO_DECL_LIST fix 2019-02-21 10:26:09 -08:00
Kenneth Graunke
927c4a21bd iris: Fall back to 1x1x1 null surface if no framebuffer supplied
If the state tracker never gave us the framebuffer dimensions via
a set_framebuffer_state() call, just fall back to the unbound texture
null surface, which is 1x1x1.  Otherwise we'd use a NULL resource
(no pun intended).
2019-02-21 10:26:09 -08:00
Kenneth Graunke
5d1a9db720 iris: Fix off by one in scissoring, empty scissors, default scissors 2019-02-21 10:26:09 -08:00
Kenneth Graunke
938d63b2e8 iris: Move snapshots_landed to the front.
Transform feedback overflow queries need to write additional data,
and it would be nice to have this field remain at a consistent offset.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
ba2a4207f9 iris: Clamp UBO and SSBO access to the actual BO size, for safety 2019-02-21 10:26:09 -08:00
Kenneth Graunke
a9b32f2bbf iris: Fix texture buffer / image buffer sizes.
Also fix image buffers with offsets.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
d1f8947792 iris: fix SF_CLIP_VIEWPORT array indexing with multiple VPs
fixes bunches of viewport stuffs
2019-02-21 10:26:09 -08:00
Kenneth Graunke
5bd49a47b6 iris: flag CC_VIEWPORT when changing num viewports
this also has a loop over num_viewports
2019-02-21 10:26:09 -08:00
Kenneth Graunke
d98967d936 iris: fix UBOs with bindings that have an offset 2019-02-21 10:26:09 -08:00
Kenneth Graunke
3f70956a4e iris: try and avoid pointless compute submissions
if apps don't use compute shaders, we don't even want to kick off the
compute initialization batch
2019-02-21 10:26:09 -08:00
Kenneth Graunke
97125e9bb3 iris: fix SBA flushing by refactoring code 2019-02-21 10:26:09 -08:00
Kenneth Graunke
8fa99481e7 iris: do PIPELINE_SELECT for render engine, add flushes, GLK hacks 2019-02-21 10:26:09 -08:00
Kenneth Graunke
b2d223b6bf iris: hack to avoid memorybarriers out the wazoo
we don't want to emit piles of pipe controls to a compute batch if
it isn't necessary...

prevents double-batch-wraps in cs-op-selection-bool-bvec4-bvec4
(but it's still kinda a big ol' hack...)
2019-02-21 10:26:09 -08:00
Kenneth Graunke
b3a40c27a2 iris: don't let render/compute contexts stomp each other's dirty bits
only clear what you process
2019-02-21 10:26:09 -08:00
Kenneth Graunke
f8796079da iris: better dirty checking 2019-02-21 10:26:09 -08:00
Kenneth Graunke
06a993dac2 iris: rewrite grid surface handling
now we only upload a new grid when it's actually changed, which saves us
from having to emit a new binding table every time it changes.

this also moves a bunch of non-gen-specific stuff out of iris_state.c
2019-02-21 10:26:09 -08:00
Kenneth Graunke
155e1a63d5 iris: XXX for compute state tracking :/
Maybe we should just move dirty to batch, it would help with the
reset stuff too
2019-02-21 10:26:09 -08:00
Kenneth Graunke
643030f4fb iris: fix whitespace 2019-02-21 10:26:09 -08:00
Kenneth Graunke
b0dc11993e iris: bail if SLM is needed 2019-02-21 10:26:09 -08:00
Kenneth Graunke
973b937cac iris: leave XXX about unnecessary binding table uploads 2019-02-21 10:26:09 -08:00
Kenneth Graunke
7fb8c20d7b iris: drop unnecessary #ifdefs 2019-02-21 10:26:09 -08:00
Kenneth Graunke
549db5b90e iris: drop XXX that Jordan handled 2019-02-21 10:26:09 -08:00
Jordan Justen
942bdb2906 iris/compute: Support indirect compute dispatch
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
b35c8f2182 iris/compute: Push subgroup-id
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
229450a2a6 iris/compute: Flush compute batch on memory-barriers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
fb4637797e iris/compute: Provide binding table entry for gl_NumWorkGroups
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
fcd0364857 iris/compute: Wait on compute batch when mapping
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
ea416d0b5d iris/program: Don't try to push ubo ranges for compute
We only can push constants for compute shaders from one range.

Gallium glsl-to-nir (src/mesa/state_tracker/st_glsl_to_nir.cpp) lowers
all uniform accesses to a ubo.

Unfortunately we also load the subgroup-id as a uniform in the
compiler. Since we use the 1 push range for this subgroup-id, we then
lose the ability to actually push the ubo with all the normal user
uniform values.

In other words, there is lots of room for performance improvement, but
at least retrieving the uniforms as pull-constants is functional for
now.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
c7cfa4000f iris/compute: Get group counts from grid->grid
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
fd9ccd8b5d iris/compute: Flush compute batches
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
9b5cda95aa iris/compute: Add MEDIA_STATE_FLUSH following WALKER
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
6ebd04ac8f iris: Add iris_restore_compute_saved_bos
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
622aaa290f iris: Add IRIS_DIRTY_CONSTANTS_CS
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Jordan Justen
25f1625edf iris/compute: Set mask bits on PIPELINE_SELECT
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Kenneth Graunke
9fc672428d iris: little bits of compute basics 2019-02-21 10:26:09 -08:00
Kenneth Graunke
860ce6af3f iris: drop XXX's about swizzling
pretty sure this is unnecessary on modern HW
2019-02-21 10:26:09 -08:00
Kenneth Graunke
12de56f53d iris: drop dead format //'s
these just aren't supported
2019-02-21 10:26:09 -08:00
Kenneth Graunke
f6c68066a6 iris: yes 2019-02-21 10:26:09 -08:00
Kenneth Graunke
752abeb690 iris: initial compute caps
RET macro borrowed from freedreno
2019-02-21 10:26:09 -08:00
Kenneth Graunke
4da28c2c22 iris: Enable fb fetch
needed for ES 3.2
2019-02-21 10:26:09 -08:00
Kenneth Graunke
be905bd461 iris: advertise GL_ARB_shader_texture_image_samples 2019-02-21 10:26:09 -08:00
Jordan Justen
6441e906e8 iris: Set num_uniforms in bytes
Ref: brw_nir_lower_uniforms, type_size_scalar_bytes

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-02-21 10:26:09 -08:00
Kenneth Graunke
c29fd34259 iris: move images next to textures in binding table 2019-02-21 10:26:09 -08:00
Kenneth Graunke
0d9c5b4e7e iris: null for non-existent cbufs
prevents BTs from being shifted down incorrectly
2019-02-21 10:26:09 -08:00
Kenneth Graunke
98e8f80e7d iris: actually set image access 2019-02-21 10:26:09 -08:00
Jason Ekstrand
d9aee25a46 iris: Don't lower image formats for write-only images 2019-02-21 10:26:09 -08:00
Kenneth Graunke
a06f0fe517 iris: set image access correctly 2019-02-21 10:26:09 -08:00
Kenneth Graunke
5d1dadfc38 iris: bother with BTIs 2019-02-21 10:26:09 -08:00
Kenneth Graunke
f5b887da6c iris: implement set_shader_images hook 2019-02-21 10:26:09 -08:00
Kenneth Graunke
26a54ae4b2 iris: lower storage image derefs 2019-02-21 10:26:09 -08:00
Kenneth Graunke
e97a24da89 iris: set the binding table size
we weren't doing mark_surface_used on images (i965 does it while
uploading the unnecessary image uniforms), so our binding tables were
too small...
2019-02-21 10:26:09 -08:00
Kenneth Graunke
28b41992c8 iris: X32_S8X24 :/
This can happen when faking Z32_S8X24 and setting StencilSampling = true

I guess we'll just turn it into S8_UINT...

Fixes KHR-GL45.texture_swizzle.functional
2019-02-21 10:26:09 -08:00
Kenneth Graunke
6e7957a22d iris: enable I/L formats 2019-02-21 10:26:09 -08:00
Kenneth Graunke
bfbebbaa36 iris: Use R/RG instead of I/L/A when sampling 2019-02-21 10:26:09 -08:00
Kenneth Graunke
94569a6458 iris: rework format translation apis 2019-02-21 10:26:09 -08:00
Kenneth Graunke
b9eeed3e8f iris: Allow PIPE_CONTROL with Stall at Scoreboard and RT flush
It's nonsensical, but not illegal, and mandatory on Icelake
2019-02-21 10:26:09 -08:00
Kenneth Graunke
65d1cda995 iris: add gen11 to genX_call 2019-02-21 10:26:09 -08:00
Kenneth Graunke
0fdcb20803 iris: inline stage_from_pipe to avoid unused warnings 2019-02-21 10:26:09 -08:00
Kenneth Graunke
6fbb6ba290 iris: pipe to scs -> iris_pipe.h 2019-02-21 10:26:09 -08:00
Kenneth Graunke
87351b8dfe iris: force persample interp cap 2019-02-21 10:26:09 -08:00
Kenneth Graunke
90b9efc1f9 iris: stencil texturing 2019-02-21 10:26:09 -08:00
Kenneth Graunke
9b229d266d iris: fix Z32_S8 depth sampling
We were accidentally using the ISL_FORMAT_R32_FLOAT_X8X24_TYPELESS
format, which is NOT what we use.  We just store R32_FLOAT depth.

fixes Piglit's texwrap GL_ARB_depth_buffer_float
2019-02-21 10:26:09 -08:00
Kenneth Graunke
822f91508e iris: don't mark contains_draw = false when chaining batches
chaining to a new batch reuses create_batch(), but we don't need to do
the work of pinning BOs we inherit from a previous batch...when that is
actually part of the same execbuf invocation.

instead, just flag it when setting primary_batch_size = 0, in
iris_batch_reset
2019-02-21 10:26:09 -08:00
Kenneth Graunke
294ce58a30 iris: vma_free bo->size, not bo_size
this is more obviously correct.  I think the two end up being the same
in practice, since this is in the alloc_from_cache case, and presumably
bo from the bucket has bo->size == bucket->size, and bo_size also is
bucket->size...

still.  better to do the obvious thing.

brw_bufmgr already does it this way.
2019-02-21 10:26:09 -08:00
Kenneth Graunke
2f24000662 iris: drop a bunch of pipe_sampler_state stuff we don't need 2019-02-21 10:26:09 -08:00
Kenneth Graunke
c6016d3761 iris: just mark snapshots_landed from the CPU
otherwise, get results may check q->map->snapshots_landed...before our
commands to initialize it to false have actually executed...so it'd get
some random garbage from the BO...
2019-02-21 10:26:09 -08:00
Kenneth Graunke
3c0ef22edb iris: Enable ARB_shader_vote
The easiest get out the vote campaign ever
2019-02-21 10:26:08 -08:00
Kenneth Graunke
0395eba20f iris: magic number 36 -> #define 2019-02-21 10:26:08 -08:00
Kenneth Graunke
57f8a623c5 iris: better query file comment 2019-02-21 10:26:08 -08:00
Kenneth Graunke
d3a5d87219 iris: early return properly 2019-02-21 10:26:08 -08:00
Kenneth Graunke
07ff8c752f iris: 36-bit overflow fixes 2019-02-21 10:26:08 -08:00
Kenneth Graunke
dff174c103 iris: Need to | 1 when asking for timestamps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
1d91eba7dc iris: glGet timestamps, more correct timestamps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
36fbcfb06c iris: ...and SO prims emitted queries
looks like we have queries

some fails still due to races between snapshots_written and start/end
not being garbage...not sure what that's about
2019-02-21 10:26:08 -08:00
Kenneth Graunke
ec82be57e8 iris: timestamps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
23572cdd07 iris: drop explicit pinning
writes will already rw_bo or ro_bo that
2019-02-21 10:26:08 -08:00
Kenneth Graunke
d8875fe406 iris: primitives generated query support 2019-02-21 10:26:08 -08:00
Kenneth Graunke
ffae6e3105 iris: pipeline stats 2019-02-21 10:26:08 -08:00
Kenneth Graunke
7840d0e091 iris: play chicken with timer queries for now
they have been crashy in the past and I don't want to risk tanking my
laptop right before my XDC talk
2019-02-21 10:26:08 -08:00
Kenneth Graunke
0b095c665d iris: gpr0 to bool
I think OQ is basically working now.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
f5a8908bd1 iris: fix random failures via CS stall...but why? 2019-02-21 10:26:08 -08:00
Kenneth Graunke
ad14795805 iris: flush batch when asking for result via QBO 2019-02-21 10:26:08 -08:00
Kenneth Graunke
cf261caad9 iris: results write 2019-02-21 10:26:08 -08:00
Kenneth Graunke
d4e4517569 iris: gen10+ workarounds and break fix 2019-02-21 10:26:08 -08:00
Kenneth Graunke
dca5632de1 iris: initial query code 2019-02-21 10:26:08 -08:00
Kenneth Graunke
dd478913d5 iris: LRM/SRM/SDI hooks 2019-02-21 10:26:08 -08:00
Kenneth Graunke
af9fe0d472 iris: rw_bo for pipe controls
this is used for WRITE IMMEDIATE...
but maybe we don't want to for the workaround BO?
2019-02-21 10:26:08 -08:00
Kenneth Graunke
30c370ed4b iris: use 0 for TCS passthrough program string ID
the passthrough shader doesn't need a real program string ID - that's
basically used for ARB programs indicating total program source code
changes, or other pre-baked uniform changes, etc...none of which a
passthrough shader has...so we don't need a unique identifier to
distinguish them.  We want to use a consistent value so we find
existing passthrough shaders in the cache.
2019-02-21 10:26:08 -08:00
Caio Marcelo de Oliveira Filho
54e23442e2 iris: Add support for TCS passthrough
If no TCS is provided, create a "passthrough" TCS that will take the
default values set in the API as constants and pass to the TES, along
with any other inputs it expects.  The code to create the NIR shader
is the same as in i965.

Tested with

    ./piglit run -t 'tess' quick_shader r

and fixed a dozen crashes from that list.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
5395658c61 iris: inherit the index buffer properly 2019-02-21 10:26:08 -08:00
Kenneth Graunke
a858b69880 iris: delete bogus comment
Caio asked what was wrong.  There is nothing wrong.  :)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
f2f506fa43 iris: properly re-pin stencil buffers 2019-02-21 10:26:08 -08:00
Kenneth Graunke
aaced066e8 iris: fix context restore of 3DSTATE_CONSTANT ranges
if clean we want to DO the pinning...not SKIP the pinning.

thanks to Jordan Justen for catching this!
2019-02-21 10:26:08 -08:00
Kenneth Graunke
58a6c99ebe iris: silence const warning
not sure why this is labeled const, I'm pretty sure we are taking the
reference and owning this, so there's no particular reason we can't
change it.  it certainly seems to be working for non-compute.  and,
freedreno's ir3_shader.c seems to do this as well.  still...gross :/
2019-02-21 10:26:08 -08:00
Kenneth Graunke
897f8d9232 iris: refactor program CSO stuff 2019-02-21 10:26:08 -08:00
Caio Marcelo de Oliveira Filho
fb4a3e2736 iris: Fix uses of gl_TessLevel*
The backend compiler expects the gl_TessLevel* variables to be mapped
as inputs instead of system values.  Use the new PIPE_CAP to get this
behavior from GLSL compiler.

Tested with:
tests/spec/arb_tessellation_shader/execution/vs-tcs-tes-tessinner-tessouter-inputs-quads.shader_test
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2b956a093a iris: totally untested icelake support 2019-02-21 10:26:08 -08:00
Kenneth Graunke
921790b080 iris: initialize "don't suck" bits, as Ben likes to call them 2019-02-21 10:26:08 -08:00
Kenneth Graunke
73a4cef220 iris: refactor LRIs in context setup
we're going to have more of them, so reduce the boilerplate
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2d1db44e8e iris: enable ARB_enhanced_layouts 2019-02-21 10:26:08 -08:00
Kenneth Graunke
c0422d623c iris: re-pin binding table contents if we didn't re-emit them
fixes glsl-vs-loop and other regressions from multibinder.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2963276a58 iris: move binder pinning outside the dirty == 0 check
This might be a new batch with back to back non-dirty calls, if so we
need to inherit the old binder...
2019-02-21 10:26:08 -08:00
Chris Wilson
1a61a211f0 iris: fix memzone_for_address since multibinder changes 2019-02-21 10:26:08 -08:00
Kenneth Graunke
f6924e2379 iris: update comments for multibinder 2019-02-21 10:26:08 -08:00
Kenneth Graunke
5cb0527c4f iris: fix SO offset writes for multiple streams 2019-02-21 10:26:08 -08:00
Kenneth Graunke
eff081cdd9 iris: Support multiple binder BOs, update Surface State Base Address 2019-02-21 10:26:08 -08:00
Kenneth Graunke
148e315d96 iris: fix null FB and unbound tex surface state addresses 2019-02-21 10:26:08 -08:00
Kenneth Graunke
f838400a59 iris: set EXEC_OBJECT_CAPTURE on all driver internal buffers 2019-02-21 10:26:08 -08:00
Kenneth Graunke
938afd484a iris: fix constant buffer 0 to be absolute
thanks to Jason for catching this.  Fixes some va64 tests.  Surprisingly
not much else, as apparently getting to UBO range 4 is uncommon!
2019-02-21 10:26:08 -08:00
Kenneth Graunke
5a2257bb2f iris: don't unconditionally emit 3DSTATE_VF / 3DSTATE_VF_TOPOLOGY
this was just laziness on my part
2019-02-21 10:26:08 -08:00
Kenneth Graunke
4c27cb031c iris: skip over whole function if dirty == 0
kinda pointless in non-pathological cases, but does boost our score in
the drawarrays case by 50%...
2019-02-21 10:26:08 -08:00
Kenneth Graunke
888efcd192 iris: Allow inlining of require/get_command_space
eliminates so many callqs for ptr++
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2ebce6f8c8 iris: use Eric's new caps helper
this does change a couple caps...PRIMITIVE_RESTART_FOR_PATCHES...
2019-02-21 10:26:08 -08:00
Kenneth Graunke
3e7a41f228 iris: new caps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
52eb8d5593 iris: fix blend state memcpy
thanks to Jason for noticing grumpy valgrind
2019-02-21 10:26:08 -08:00
Kenneth Graunke
9ce92fa036 iris: Skip primitive ID overrides if the shader wrote a custom value
Fixes glsl-1.50/execution/geometry/primitive-id-out
2019-02-21 10:26:08 -08:00
Kenneth Graunke
47d3019c4a iris: fix crash when binding optional shader for the first time 2019-02-21 10:26:08 -08:00
Kenneth Graunke
6331b754df iris: handle level/layer in direct maps
needed now that we do 1D linear
2019-02-21 10:26:08 -08:00
Kenneth Graunke
9f7654139b iris: use linear for 1D textures
This gets us the gen9 compact linear storage
2019-02-21 10:26:08 -08:00
Kenneth Graunke
b2a5e1ebb3 iris: big old hack for tex-miplevel-selection
copied from ilo.  I don't understand this at all..
2019-02-21 10:26:08 -08:00
Kenneth Graunke
e4d22b16c8 iris: fix sampler state setting 2019-02-21 10:26:08 -08:00
Kenneth Graunke
b3bb33c4c1 iris: try to hack around binder issue 2019-02-21 10:26:08 -08:00
Kenneth Graunke
d2516358f9 iris: fix line-aa-width
we should probably move the roundf to st_atom_raster
2019-02-21 10:26:08 -08:00
Kenneth Graunke
701b47a197 iris: implement get_sample_position
Fixes arb_sample_shading/builtin-gl-sample-position
2019-02-21 10:26:08 -08:00
Kenneth Graunke
7ed4b80233 iris: z_res -> s_res
fixes crashes introduced a few commits ago
2019-02-21 10:26:08 -08:00
Kenneth Graunke
d1cb4b330a iris: reenable R32G32B32 texture buffers
This dropped us from GL 4.2 to GL 3.3 by mistake.  Thanks to Dave for
catching this!
2019-02-21 10:26:08 -08:00
Chris Wilson
367f6bbd01 iris: Record reusability of bo on construction
We know that if the bufmgr->reuse is set to false or if the bo is too
large for a bucket, the same will be true when we come to free the bo.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
abe7dbfa4a iris: Reduce binder alignment from 64 to 32
3DSTATE_BINDING_TABLE_POINTER_XS's alignment requirement is only 32B.

Makes us waste less precious binder space.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
04e8c5bb43 iris: precompute hashes for cache tracking
saves a touch of cpu overhead in the new resolve tracking
2019-02-21 10:26:08 -08:00
Chris Wilson
d209cc5170 iris: AMD_pinned_memory
(rebased by Ken, mainly set res->internal_format)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
93c1921ce2 iris: proper cache tracking
this is copied from the i965 aux resolve stuff...minus the aux resolves
2019-02-21 10:26:08 -08:00
Kenneth Graunke
5e30b1083b iris: Move cache tracking to iris_resolve.c 2019-02-21 10:26:08 -08:00
Kenneth Graunke
42dccb1233 iris: use consistent copyright formatting
some of them had typos, didn't say 'authors or copyright holders',
or other mistakes.  This is now https://opensource.org/licenses/MIT
text, formatted consistently.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
1d33982e9b iris: track depth/stencil writes enabled 2019-02-21 10:26:08 -08:00
Kenneth Graunke
3fecb1c44d iris: Move iris_sampler_view declaration to iris_resource.h
We'll need this for resolve tracking.  There's also no genxml stuff here
2019-02-21 10:26:08 -08:00
Kenneth Graunke
b75b52530a iris: Move things to iris_shader_state
We didn't originally have this struct, so we had lots of ad-hoc arrays.
Now that we have it, it makes sense to group things there.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
410a555bfb iris: move iris_shader_state from ice->shaders.state to ice->state.shaders
it's more state related...
2019-02-21 10:26:08 -08:00
Kenneth Graunke
33701d5341 iris: Drop bogus sampler state saving
We do this in an earlier loop.  This was just reading things out of the
array, and saving them back over the same array...but in the wrong slots
2019-02-21 10:26:08 -08:00
Kenneth Graunke
aba2cee711 iris: rename pipe to base 2019-02-21 10:26:08 -08:00
Kenneth Graunke
7705f62cb6 iris: don't emit SBE all the time 2019-02-21 10:26:08 -08:00
Kenneth Graunke
630d602900 iris: port non-bucket alignment bugfix
Sergii's 24839663a4
2019-02-21 10:26:08 -08:00
Kenneth Graunke
ad6ba5a712 iris: drop pwrite
nobody uses it
2019-02-21 10:26:08 -08:00
Kenneth Graunke
aad70ad8a1 iris: drop dead assignments
Eric's commit 9a6a631762
2019-02-21 10:26:08 -08:00
Kenneth Graunke
2bd7d6fa71 iris: last VUE map NOS, handle > 16 FS inputs
not sure if the UNCOMPILED_FS flagging is still needed, should
reevaluate those hacks at some point
2019-02-21 10:26:08 -08:00
Kenneth Graunke
ee8cb7e0ee iris: implement ARB_clear_texture 2019-02-21 10:26:08 -08:00
Kenneth Graunke
84b30a2900 iris: call maybe_flush for each blorp operation
otherwise with high layer counts we may exceed two batches worth of
commands... (!)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
0e059e4829 iris: assert depth is 1 in resource_copy_region
given the dstz parameter I don't think it does multiple slices..
2019-02-21 10:26:08 -08:00
Kenneth Graunke
03933a2d1b iris: blorp blit multiple slices
fixes getteximage-depth
2019-02-21 10:26:08 -08:00
Kenneth Graunke
84832ab7d4 iris: Fix tiled memcpy for cubes...and for array slices
tiled_memcpy_map was not offsetting map->ptr based on the slice,
while unmap was.  also, we were doing offsetting wrong for cubes.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
bce7398646 iris: disallow RGB32 formats too 2019-02-21 10:26:08 -08:00
Kenneth Graunke
ea19d359cc iris: Convert RGBX to RGBA for rendering.
Fixes a bunch of RGB bugs.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
906becec70 iris: we can do multisample Z resolves 2019-02-21 10:26:08 -08:00
Kenneth Graunke
1f156f004b iris: deal with Marek's new MSAA caps
storage sample count is equal to sample count for us, for now,
so 0 the pipe cap and ignore the new parameter
2019-02-21 10:26:08 -08:00
Kenneth Graunke
532cf23d25 iris: say no to more formats
copied from brw_surface_formats.c
2019-02-21 10:26:08 -08:00
Kenneth Graunke
d5146ba670 iris: actually do stencil blits 2019-02-21 10:26:08 -08:00
Kenneth Graunke
ad76389f88 iris: refcounting, who needs it?
that's right, we do!
2019-02-21 10:26:08 -08:00
Kenneth Graunke
be60e3247c iris: drop stencil handling now that u_transfer_helper does it 2019-02-21 10:26:08 -08:00
Kenneth Graunke
b932938d01 iris: use u_transfer_helper for depth stencil packing/unpacking 2019-02-21 10:26:08 -08:00
Kenneth Graunke
853230b5e6 iris: WTF transfers
stencil unfortunately is stored in the Weird Tile Format (WTF or Tile-W)
which needs special CPU detiling code.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
d93a20e258 iris: allow S8 as a stencil format 2019-02-21 10:26:08 -08:00
Kenneth Graunke
7972599eab iris: actually emit stencil packets 2019-02-21 10:26:08 -08:00
Kenneth Graunke
753646dd6b iris: clear stencil 2019-02-21 10:26:08 -08:00
Kenneth Graunke
9ec2d3640e iris: depth or stencil fixes 2019-02-21 10:26:08 -08:00
Kenneth Graunke
763f9095ea iris: fill out more caps 2019-02-21 10:26:08 -08:00
Kenneth Graunke
2d578e71d5 iris: get angry about execbuf failures
want this to be easy to detect for now
2019-02-21 10:26:08 -08:00
Kenneth Graunke
a378ee3607 iris: simplify batch len qword alignment
Split from a patch by Chris Wilson so I can test it independently
2019-02-21 10:26:08 -08:00
Kenneth Graunke
621cb43f41 iris: rename ring to engine
makes more sense these days.  split from a patch by Chris Wilson
2019-02-21 10:26:08 -08:00
Kenneth Graunke
1a9651f29a iris: remember to set bo->userptr 2019-02-21 10:26:08 -08:00
Chris Wilson
796ad6fe97 iris: Wrap userptr for creating bo 2019-02-21 10:26:08 -08:00
Kenneth Graunke
5911fb8801 iris: sync bugfixes from brw_bufmgr
I wrote softpin support here first, then debugged and landed it in brw;
some of those fixes need to get brought back.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
dfe1ee4f6f iris: comment everything
1. Write the code
2. Add comments
3. PROFIT (or just avoid cost of explaining or relearning things...)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
387a414f2c iris: add minor comments 2019-02-21 10:26:08 -08:00
Dave Airlie
9d39e69219 iris: fix some hangs around null framebuffers
This fixes some cases in fbo-none* and framebuffer_no_attachments.

I'm not sure this is correct otherwise, the tests don't all pass yet

No idea if this is in any way the correct answer
2019-02-21 10:26:08 -08:00
Chris Wilson
02b82fe80a iris: Set resource modifier on handle
Required for gdm_bo_create_with_modifiers
2019-02-21 10:26:08 -08:00
Kenneth Graunke
682aeff8d0 iris: we don't support textureGatherOffsets, need it lowered 2019-02-21 10:26:08 -08:00
Kenneth Graunke
03dc99475d iris: cube arrays are cubes too 2019-02-21 10:26:08 -08:00
Kenneth Graunke
80c7096672 iris: fix sample mask
0xffffffff does not mean 1, it means enable as many as there actually
are.  we don't get set_sample_mask() calls until some masking is
actually applied...i.e. it doesn't get updated based on # of samples
in the FBO changing.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
e990558152 iris: drop pipe_shader_state
looking at the freedreno code, this is totally unnecessary!  we can just
store the NIR and be happy, and not have any vestiges of TGSI.

plus we can reuse this structure for compute shaders, without needing a
pipe_compute_state base.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
834b97c34b iris: fix GS output component limit
this is total, so should be 1024, not 128
2019-02-21 10:26:08 -08:00
Kenneth Graunke
c9f9a6f61b iris: Avoid croaking when trying to create FBO surfaces with bad formats
create_surface happens before st_validate_attachment, which actually
does the "hey, this is a render target now, is that OK?" check

Fixes asserts in ./bin/arb_texture_view-rendering-formats, allowing the
rest of the tests to run.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
8da91ebb68 iris: enable texture gather 2019-02-21 10:26:08 -08:00
Kenneth Graunke
f3dd70182d iris: BIG OL' HACK for UBO updates
We need to re-push data when UBO changes.  This will need to be replaced
with a usage history based flushing system later.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
a7311ef068 iris: update a todo comment 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8e7b0deee2 iris: Don't reserve new binding table section unless things are dirty 2019-02-21 10:26:07 -08:00
Kenneth Graunke
870f2e8434 iris: implement texture/memory barriers 2019-02-21 10:26:07 -08:00
Kenneth Graunke
82ee971497 iris: drop unused bo parameter 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f0159d5ca3 iris: update bindings when changing programs
the binding table layout depends on program info.

not known to fix anything yet.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
b0e9c5797b iris: fix for disabling ssbos 2019-02-21 10:26:07 -08:00
Kenneth Graunke
b7b061c4e2 iris: fix SSBO indexing
st/nir offsets SSBO indexes by MaxABOs.  This is not what we want,
as it bloats the binding tables.  We'll need to adjust it to use
info->num_abos as the offset and buffer base instead.  For now,
just use the inefficient format to get us rolling.  We can add a
PIPE_CAP later.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
376c7253f8 iris: enable SSBOs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
75709d982b iris: fix TBO alignment to match 965 2019-02-21 10:26:07 -08:00
Kenneth Graunke
77b9219818 iris: unbind compiled shaders if none are present
avoids the case where you have a stale compiled shader bound, but no
uncompiled shader bound, which is not just boats, but an entire marina
2019-02-21 10:26:07 -08:00
Kenneth Graunke
fd5ed7b46b iris: shorten loop
num_ubos doesn't include Tim's magic UBO for regular uniforms, so +1
2019-02-21 10:26:07 -08:00
Kenneth Graunke
bf795b0244 iris: emit binding table for atomic counters and SSBOs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
2d5f545464 iris: implement set_shader_buffers
for SSBOs/ABOs.  We just stream out SURFACE_STATE for now...since it's
a set_* API...and the buffer offset may change...not sure where else
we'd do it.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
541cb60e7e iris: export get_shader_info 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f0558ca22c iris: fix msaa flipping filters 2019-02-21 10:26:07 -08:00
Kenneth Graunke
2c73d7e3f1 iris: expose more things that we already support 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5b8dd5f303 iris: fix blorp filters
we have to switch to blorp enums after the rebase, but also we were
probably doing it wrong for MSAA before this.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
3aa1fcc65a iris: hack around samples confusion 2019-02-21 10:26:07 -08:00
Kenneth Graunke
2c15f38a29 iris: point sprite enables 2019-02-21 10:26:07 -08:00
Kenneth Graunke
c60a4de1f5 iris: reemit blend state for alpha test function changes
fixes bin/fbo-alphatest-formats GL_EXT_texture_snorm
2019-02-21 10:26:07 -08:00
Kenneth Graunke
a4036635b1 iris: fix Z24
This was backwards.

thanks to Jason Ekstrand for realizing that I was seeing the wrong bits.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
a12a370d7b iris: fix EmitNoIndirect
we were using pipe stages, which are ordered dumbly for historical
reasons.  we want gl_shader_stage here.  this got us the wrong options
2019-02-21 10:26:07 -08:00
Kenneth Graunke
5bd861de8b iris: assert about passthrough shaders to make this easier to detect
otherwise it just silently fails and looks like some obscure problem
2019-02-21 10:26:07 -08:00
Kenneth Graunke
5e19885d5a iris: fill out MAX_PATCH_VERTICES 2019-02-21 10:26:07 -08:00
Kenneth Graunke
3e9e3121e5 iris: fix SGVS when there are no valid vertex elements
tessellation nop.shader_test now passes
2019-02-21 10:26:07 -08:00
Kenneth Graunke
5520a54bc5 iris: vertex ID, instance ID 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a9083bdb71 iris: don't emit SO_BUFFERS and SO_DECL_LIST unless streamout is enabled
Otherwise on the first draw, if XFB isn't enabled, we get a pile of
MI_NOOPS where SO_BUFFERS should be
2019-02-21 10:26:07 -08:00
Kenneth Graunke
ebb960c6d3 iris: compile a TCS...don't bother with passthrough yet 2019-02-21 10:26:07 -08:00
Kenneth Graunke
9aa8be3d8e iris: TES program key inputs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
fcee21da6b iris: fix texture buffer stride 2019-02-21 10:26:07 -08:00
Kenneth Graunke
3c41d4cf3f iris: fix sampler views of TBOs
we can't read levels/layers, they're invalid for PIPE_BUFFER
2019-02-21 10:26:07 -08:00
Kenneth Graunke
6e7e49cc4f iris: fix crash 2019-02-21 10:26:07 -08:00
Kenneth Graunke
841fc3e3ca iris: record FS NOS 2019-02-21 10:26:07 -08:00
Kenneth Graunke
d223b316ad iris: NOS mechanics 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a6d480f892 iris: bind state helper function 2019-02-21 10:26:07 -08:00
Kenneth Graunke
48b826cdaf iris: s/hwcso/state/g 2019-02-21 10:26:07 -08:00
Kenneth Graunke
aeb6fc8782 iris: bits of multisample program key 2019-02-21 10:26:07 -08:00
Kenneth Graunke
e6b1cc2106 iris: save query type 2019-02-21 10:26:07 -08:00
Kenneth Graunke
44ba48eba7 iris: draw indirect support? 2019-02-21 10:26:07 -08:00
Kenneth Graunke
b030671298 iris: fix CC_VIEWPORT
I was confusing depth bounds test with depth clamping
2019-02-21 10:26:07 -08:00
Kenneth Graunke
fdbc205552 iris: multislice transfer maps 2019-02-21 10:26:07 -08:00
Kenneth Graunke
44248d16d2 iris: disable 6x MSAA support 2019-02-21 10:26:07 -08:00
Kenneth Graunke
bc1b4db3b3 iris: fix sample mask for MSAA-off 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7b8c0f058e iris: actually pin the buffers 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5635abadef iris: fix SO_DECL_LIST 2019-02-21 10:26:07 -08:00
Kenneth Graunke
dc3b927e97 iris: bother setting program_string_id...
not sure how useful this really is...

./bin/ext_transform_feedback-tessellation triangles flat_first
is hitting a case where we rebind the same VS program, but with
different streamout info...which isn't in the key...but is in the
cache...so we don't rebuild it...
2019-02-21 10:26:07 -08:00
Kenneth Graunke
9c1cefff52 iris: set even if no outputs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
cef0b8b13b iris: streamout 2019-02-21 10:26:07 -08:00
Kenneth Graunke
059c096eff iris: SO buffers 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5c00f5fdca iris: Implement 3DSTATE_SO_DECL_LIST 2019-02-21 10:26:07 -08:00
Kenneth Graunke
6794f1ffb9 iris: rearrange iris_resource.h 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a3f77eceb4 iris: slab allocate transfers
apparently we need this for u_threaded_context
2019-02-21 10:26:07 -08:00
Kenneth Graunke
5165308169 iris: don't crash on shader perf logs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f20fc950a7 iris: fix depth bounds clamp enables
fixes depthrange-clear among others
2019-02-21 10:26:07 -08:00
Kenneth Graunke
eb274a31bc iris: fix clip flagging on fb changes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
0232fbc2c4 iris: comment out l/a/i/la
in hopes of r/rg fallbacks
2019-02-21 10:26:07 -08:00
Kenneth Graunke
cf34dd7a61 iris: actually handle array layers in blits 2019-02-21 10:26:07 -08:00
Kenneth Graunke
33a17d566f iris: keep DISCARD_RANGE
this isn't really an iris_bo_map flag, but the various resource mappers
want to check for it to avoid making temp copies.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
c0ab9c9890 iris: actually set cube bit properly 2019-02-21 10:26:07 -08:00
Kenneth Graunke
d849501f4c iris: rename map->stride 2019-02-21 10:26:07 -08:00
Kenneth Graunke
36301bbe40 iris: fix zoffset asserts with 2DArray/Cube 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7f39f4843f iris: SBE change stash
not used yet, but want to flag it so I don't forget
2019-02-21 10:26:07 -08:00
Kenneth Graunke
8a080223e6 iris: just malloc one iris_genx_state instead of a bunch of oddball pieces
Things that are gen-specific can go in iris_genx_state.  Things that are
gen-agnostic can go directly in ice->state.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
a7e0edffb6 iris: dead pointer 2019-02-21 10:26:07 -08:00
Kenneth Graunke
ccec5bab5b iris: implement border color, fix other sampler nonsense 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8a16249285 iris: border color memory zone :(
They took away our pointer bits, so now we need a pile of special code
to handle this instead of just using u_upload_mgr. :(
2019-02-21 10:26:07 -08:00
Kenneth Graunke
1c19e3b21f iris: don't include binder in surface VMA range 2019-02-21 10:26:07 -08:00
Kenneth Graunke
1cea195a95 iris: state ref tuple 2019-02-21 10:26:07 -08:00
Kenneth Graunke
c0e80a8d0a iris: null surface for unbound textures
avoids crashes...may not be really right
2019-02-21 10:26:07 -08:00
Kenneth Graunke
d358a4a040 iris: depth clears 2019-02-21 10:26:07 -08:00
Kenneth Graunke
470fb01a7a iris: fix GS dispatch mode 2019-02-21 10:26:07 -08:00
Kenneth Graunke
01483c7933 iris: fix 3DSTATE_VERTEX_ELEMENTS / VF_INSTANCING for 0 elements 2019-02-21 10:26:07 -08:00
Kenneth Graunke
4c9067ae1d iris: don't emit garbage 3DSTATE_VERTEX_BUFFERS when there aren't any 2019-02-21 10:26:07 -08:00
Kenneth Graunke
adf0c20461 iris: geometry shader support 2019-02-21 10:26:07 -08:00
Kenneth Graunke
de08ac9b0f iris: TES uniform fixes
not that we have a TES, but...
2019-02-21 10:26:07 -08:00
Kenneth Graunke
d207f97840 iris: larger polygon offset 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5188e54e97 iris: fix provoking vertex ordering
had this backwards
2019-02-21 10:26:07 -08:00
Kenneth Graunke
cbbd6a61c4 iris: maybe-flush before blorp operations
otherwise if we have a lot of back-to-back blorp operations we can
potentially overflow even the chained batch
2019-02-21 10:26:07 -08:00
Kenneth Graunke
e0f3971280 iris: lightmodel flat 2019-02-21 10:26:07 -08:00
Kenneth Graunke
4d04111bfb iris: implement copy image 2019-02-21 10:26:07 -08:00
Kenneth Graunke
40fd2fd603 iris: fall back to u_generate_mipmap
It just does blits between layers, which is all we'd do anyway,
and it already should use BLORP because of iris_blit().  Plus it
handles 3D, which our code in i965 doesn't.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
6cf04c6ded iris: clear fix 2019-02-21 10:26:07 -08:00
Kenneth Graunke
d416b81779 iris: shader dirty bits 2019-02-21 10:26:07 -08:00
Kenneth Graunke
b7cd3a083a iris: rework DEBUG_REEMIT
don't want to have to special case this everywhere
2019-02-21 10:26:07 -08:00
Kenneth Graunke
72416a2d0d iris: clears 2019-02-21 10:26:07 -08:00
Kenneth Graunke
eef0d33cee iris: better boxing on maps 2019-02-21 10:26:07 -08:00
Kenneth Graunke
419fac2fc6 iris: fix fragcoord ytransform
the TGSI in the name is a misnomer, it actually controls wpos_ytransform
lowering in NIR these days.
2019-02-21 10:26:07 -08:00
Kenneth Graunke
e67951227d iris: Disable unsupported mirror clamp modes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
234cf647a4 iris: tidy comments about mirroring modes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a3a998f19a iris: iris - fix QWord aligned endings after batch chaining rework
I need to save the primary batch size after expanding it to include
MI_BATCH_BUFFER_END and the QWord padding NOP
2019-02-21 10:26:07 -08:00
Kenneth Graunke
aacbcbbf47 iris: colorize batchbuffer failures to make them stand out 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8e2b71b190 iris: bad inherited comments 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8c54433275 iris: Handle batch submission failure "better"
We used to not reset the batch, and just keep appending to it, so you'd
get the same invalid contents over and over.

I'd also really like to know about this, so aborting seems wise for now,
if not for the long term
2019-02-21 10:26:07 -08:00
Kenneth Graunke
d0b55ca782 iris: don't always flush 2019-02-21 10:26:07 -08:00
Kenneth Graunke
9226ebfa85 iris: print second batch size separately 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f12b079c0e iris: actually init num_viewports
fixes regressions
2019-02-21 10:26:07 -08:00
Kenneth Graunke
81f899c148 iris: scissor count fixes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
92d6a70853 iris: fix VP iteration 2019-02-21 10:26:07 -08:00
Kenneth Graunke
4a94628513 iris: fix num viewports to be based on programs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
b17215800c iris: fix viewport counts and settings
seeing

   set_viewport_state 0 1
   set_viewport_state 1 15

which gives us a total of 16 viewports, updated incrementally
so keep old values around and update them...
2019-02-21 10:26:07 -08:00
Kenneth Graunke
636cf8971e iris: max VP index 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7cdc6b1173 iris: emit 3DSTATE_SBE_SWIZ 2019-02-21 10:26:07 -08:00
Kenneth Graunke
26db2ea782 iris: avoid crashing on unbound constant resources
instead, read from the workaround BO
2019-02-21 10:26:07 -08:00
Kenneth Graunke
a7770501a7 iris: fix caps so tests run again 2019-02-21 10:26:07 -08:00
Kenneth Graunke
a6aeca9727 iris: fix major refcounting bug with resources
DONTBLOCK -> NULL was happening after taking a reference, causing those
to live forever

This resolves the OOM problems
2019-02-21 10:26:07 -08:00
Kenneth Graunke
49f9c88801 iris: support signed vertex buffer offsets 2019-02-21 10:26:07 -08:00
Kenneth Graunke
0a43c9defa iris: print refcounts in INTEL_DEBUG=submit 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7d1e6f1fa1 iris: redo VB CSO a bit 2019-02-21 10:26:07 -08:00
Kenneth Graunke
432790bacd iris: print binder utilization in INTEL_DEBUG=submit 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f8179dc760 iris: clean up some warnings so I can see through the noise 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5f3a7ee701 iris: use pipe resources not direct BOs 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5619c15ecc iris: indentation 2019-02-21 10:26:07 -08:00
Kenneth Graunke
27d45eb2f2 iris: don't leak keyboxes when searching for an existing program 2019-02-21 10:26:07 -08:00
Kenneth Graunke
7d504f3d52 iris: don't leak sampler state table resources 2019-02-21 10:26:07 -08:00
Kenneth Graunke
8e186cef2c iris: rzalloc iris_compiled_shader so memcmp works even if padding creeps in 2019-02-21 10:26:07 -08:00
Kenneth Graunke
5f722bf7c4 iris: remove 4 bytes of padding in iris_compiled_shader 2019-02-21 10:26:07 -08:00
Kenneth Graunke
0db86016f7 iris: pc fixes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
f9f8ea7070 iris: more leak fixes 2019-02-21 10:26:07 -08:00
Kenneth Graunke
c763ecaa65 iris: plug leaks 2019-02-21 10:26:07 -08:00
Kenneth Graunke
477ea6c39a iris: clear dirty 2019-02-21 10:26:07 -08:00
Kenneth Graunke
23987df412 iris: some dirty fixes
two scissor bits, constants not being flagged, ZeroRTA, clip not being
flagged
2019-02-21 10:26:07 -08:00
Kenneth Graunke
ccf37c7da9 iris: bindings dirty tracking 2019-02-21 10:26:07 -08:00
Kenneth Graunke
bbc6d15b59 iris: flag DIRTY_WM properly 2019-02-21 10:26:06 -08:00
Kenneth Graunke
3f863cf680 iris: fix the validation list on new batches 2019-02-21 10:26:06 -08:00
Kenneth Graunke
80dee31846 iris: save pointers to streamed state resources
will be used for cross-batch validation list fixing
2019-02-21 10:26:06 -08:00
Kenneth Graunke
daceb04bc0 iris: put back the always flush - fixes some things :( 2019-02-21 10:26:06 -08:00
Kenneth Graunke
149408a360 iris: untested SAMPLER_STATE pin BO fix 2019-02-21 10:26:06 -08:00
Kenneth Graunke
de782e5b39 iris: delete some pointless STATIC_ASSERTS
these were useful when I was patching relocs
2019-02-21 10:26:06 -08:00
Kenneth Graunke
3eebea88dc iris: untested index buffer upload 2019-02-21 10:26:06 -08:00
Kenneth Graunke
9247546181 iris: state cleaning 2019-02-21 10:26:06 -08:00
Kenneth Graunke
7c40cdc12f iris: comment about reemitting and flushing 2019-02-21 10:26:06 -08:00
Kenneth Graunke
d46c5b7c6c iris: allow mapped buffers during execution (faster) 2019-02-21 10:26:06 -08:00
Kenneth Graunke
92de0f5aa6 iris: disable __gen_validate_value in release mode 2019-02-21 10:26:06 -08:00
Kenneth Graunke
08d1f13818 iris: drop assert for now 2019-02-21 10:26:06 -08:00
Kenneth Graunke
a9e357caac iris: fix release builds 2019-02-21 10:26:06 -08:00
Kenneth Graunke
73f3c2cad0 iris: better VFI 2019-02-21 10:26:06 -08:00
Chris Wilson
2cbd42cddd iris: IndexFormat = size/2
brw uses:
  IndexFormat = index_size >> 1

anv uses:
  IndexFromat = index_type[index_size]
2019-02-21 10:26:06 -08:00
Kenneth Graunke
5dcf62bb43 iris: use u_transfer helpers for now 2019-02-21 10:26:06 -08:00
Kenneth Graunke
48dc8bd4b0 iris: fix pull bufs that aren't the first user upload 2019-02-21 10:26:06 -08:00
Kenneth Graunke
eed7f7253e iris: fill out pull constant buffers 2019-02-21 10:26:06 -08:00
Kenneth Graunke
90046b43cc iris: make surface states for cbufs 2019-02-21 10:26:06 -08:00
Kenneth Graunke
4e007dbb30 iris: have more than one const_offset 2019-02-21 10:26:06 -08:00
Kenneth Graunke
9ea05ccf1f iris: completely rewrite binder
now we get a new one per batch, and flush if it fills up
2019-02-21 10:26:06 -08:00
Kenneth Graunke
26cc609927 iris: better ubo handling 2019-02-21 10:26:06 -08:00
Chris Wilson
a504b98e72 iris: fix import from dri2/3 2019-02-21 10:26:06 -08:00
Kenneth Graunke
badefe50a0 iris: fix constant packet length to match i965 2019-02-21 10:26:06 -08:00
Kenneth Graunke
201a4d923c iris: maybe slightly less boats uniforms 2019-02-21 10:26:06 -08:00
Kenneth Graunke
a6dd9caf0d iris: flush always 2019-02-21 10:26:06 -08:00
Kenneth Graunke
04d1a3a7de iris: transfers 2019-02-21 10:26:06 -08:00
Kenneth Graunke
7437c28c0d iris: util_copy_framebuffer_state (ported from Rob's v3d patches) 2019-02-21 10:26:06 -08:00
Kenneth Graunke
f6017da83f iris: fix VF INSTANCING length 2019-02-21 10:26:06 -08:00
Kenneth Graunke
7fb7704b2e iris: more depth stuffs...
still missing stencil
2019-02-21 10:26:06 -08:00
Kenneth Graunke
02890c75b5 iris: fix 3DSTATE_VERTEX_ELEMENTS length 2019-02-21 10:26:06 -08:00
Kenneth Graunke
601ee4c189 iris: fix whitespace 2019-02-21 10:26:06 -08:00
Kenneth Graunke
4d24874236 iris: Lower the max number of decoded VBO lines
saint foo, vbo lines!
2019-02-21 10:26:06 -08:00
Kenneth Graunke
48ddd7212d iris: fix decoding and undo testing code 2019-02-21 10:26:06 -08:00
Kenneth Graunke
f31eea1f00 iris: fix batch chaining...
don't chain a batch just for the end
2019-02-21 10:26:06 -08:00
Kenneth Graunke
5b914a6d58 iris: caps 2019-02-21 10:26:06 -08:00
Kenneth Graunke
604a1a1614 iris: chaining not growing 2019-02-21 10:26:06 -08:00
Kenneth Graunke
053fb51125 iris: just turn batch reset_and_clear_caches into reset 2019-02-21 10:26:06 -08:00
Kenneth Graunke
ca735c5e0c iris: delete growing code and just die for now
we need proper batch chaining.  without relocations, we can't grow,
since we've only allocated so much VMA for the batch, and the mechanism
only works if we can pin it at the old address
2019-02-21 10:26:06 -08:00
Kenneth Graunke
7167c6d508 iris: blorp bug fixes
I wrote this earlier, but it got lost somehow...
2019-02-21 10:26:06 -08:00
Kenneth Graunke
3650f8dfa1 iris: properly reject formats, fixes RGB32 rendering with texture float 2019-02-21 10:26:06 -08:00
Kenneth Graunke
4510098b9c iris: proper # of uniforms
or at least closer...we were using bytes, we want 256-bit units...
2019-02-21 10:26:06 -08:00
Kenneth Graunke
6091dc470f iris: proper length for VE packet? 2019-02-21 10:26:06 -08:00
Kenneth Graunke
64a3f7423a iris: uniforms for VS 2019-02-21 10:26:06 -08:00
Kenneth Graunke
d4a64e0a64 iris: bump GL version to 4.2 2019-02-21 10:26:06 -08:00
Kenneth Graunke
44993d451c iris: some depth stuff :( 2019-02-21 10:26:06 -08:00
Kenneth Graunke
eb12cc70f0 iris: assert surf init 2019-02-21 10:26:06 -08:00
Kenneth Graunke
a4a426008b iris: no more drawing rectangle in blorp
there's some bug here as Jason's patches for only emitting 3DS_DR once
got reverted by Mark later on, apparently they regressed MSAA tests.

need to sort that out.
2019-02-21 10:26:06 -08:00
Kenneth Graunke
0e3870b9de iris: blorp URB 2019-02-21 10:26:06 -08:00
Kenneth Graunke
01fe6df0ed iris: make blorp pin the binder 2019-02-21 10:26:06 -08:00
Kenneth Graunke
063fc7bbb0 iris: linear staging buffers - fast CPU access... 2019-02-21 10:26:06 -08:00
Kenneth Graunke
84abf77c67 iris: hacky flushing for now 2019-02-21 10:26:06 -08:00
Kenneth Graunke
75a1639262 iris: drop the 48b printout, we never use anything else 2019-02-21 10:26:06 -08:00
Kenneth Graunke
86d7fd71f4 iris: add INTEL_DEBUG=reemit 2019-02-21 10:26:06 -08:00
Kenneth Graunke
b8a11ad256 iris: fix blorp prog data crashes 2019-02-21 10:26:06 -08:00
Kenneth Graunke
e2ba98ba39 iris: more blorp 2019-02-21 10:26:06 -08:00
Kenneth Graunke
1bba60a4bf iris: fix sampler view crashes 2019-02-21 10:26:06 -08:00
Kenneth Graunke
e22da1e7b1 iris: drop bogus binder free
I was malloc'ing it but then I changed my mind and embedded it directly
2019-02-21 10:26:06 -08:00
Kenneth Graunke
698d45b725 iris: more blitting code to make readpixels work 2019-02-21 10:26:06 -08:00
Kenneth Graunke
c9d9e44720 iris: bits of blorp code 2019-02-21 10:26:06 -08:00
Kenneth Graunke
79466c1313 iris: move bo_offset_from_sba
for wider use
2019-02-21 10:26:06 -08:00
Kenneth Graunke
60d708bb80 iris: copy over i965's cache tracking
needed to split out vtbl so I can pipe control without ice
2019-02-21 10:26:06 -08:00
Kenneth Graunke
dbd4770397 iris: pull in newer comments 2019-02-21 10:26:06 -08:00
Kenneth Graunke
841b3b9003 iris: Defines for base addresses rather than numbers everywhere 2019-02-21 10:26:06 -08:00
Kenneth Graunke
c75a1254a4 iris: Move get_command_space to iris_batch.c
for reuse in blorp.  it's a better interface anyway.
2019-02-21 10:26:06 -08:00
Kenneth Graunke
39e795d473 iris: fix texturing! 2019-02-21 10:26:06 -08:00
Kenneth Graunke
4929f020c3 iris: better SBE 2019-02-21 10:26:06 -08:00
Kenneth Graunke
8bf167c9e9 iris: vma - fix assert 2019-02-21 10:26:06 -08:00
Kenneth Graunke
10e4f1e68c iris: vma fixes - don't free binder address 2019-02-21 10:26:06 -08:00
Kenneth Graunke
5a101e6434 iris: bo reuse 2019-02-21 10:26:06 -08:00
Kenneth Graunke
21acc00490 iris: crazy pipe control code
imported from ~kwg/mesa pcx-2, gen < 8 code dropped
2019-02-21 10:26:06 -08:00
Kenneth Graunke
87aa880795 iris: fixes 2019-02-21 10:26:06 -08:00
Kenneth Graunke
3fbf7294b1 iris: fixes from i965 2019-02-21 10:26:06 -08:00
Kenneth Graunke
999ed6e213 iris: port bug fix from i965 2019-02-21 10:26:05 -08:00
Kenneth Graunke
19d11a6df3 iris: fix index 2019-02-21 10:26:05 -08:00
Kenneth Graunke
010e845af7 iris: increase allocator alignment 2019-02-21 10:26:05 -08:00
Kenneth Graunke
35afa8c8f3 iris: better BT asserts
Probably nothing is working because texture upload isn't implemented
2019-02-21 10:26:05 -08:00
Kenneth Graunke
0148bd6839 iris: decoder fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
5d2673ba7e iris: set sampler views 2019-02-21 10:26:05 -08:00
Kenneth Graunke
34164ce622 iris: isv freeing fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
012154c20f iris: TES stash
TODO: key setup
2019-02-21 10:26:05 -08:00
Kenneth Graunke
d890aee15d iris: SBA once at context creation, not per batch
hooray!
2019-02-21 10:26:05 -08:00
Kenneth Graunke
e0eac28bd4 iris: fix a scissor bug 2019-02-21 10:26:05 -08:00
Kenneth Graunke
0707ff3f2f iris: assemble SAMPLER_STATE table at bind time
It's useless to allocate SAMPLER_STATEs in GPU memory on creation like
we do for SURFACE_STATES, because they need to be organized into a
contiguous block of memory.  But we can do that at bind time, rather
than draw time.
2019-02-21 10:26:05 -08:00
Kenneth Graunke
199c080926 iris: same treatment for sampler views 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f51204a160 iris: allocate SURFACE_STATEs up front and stop streaming them 2019-02-21 10:26:05 -08:00
Kenneth Graunke
bf90d8a125 iris: delete more trash 2019-02-21 10:26:05 -08:00
Kenneth Graunke
1398c99aff iris: canonicalize addresses.
Back to working!  Woo!
2019-02-21 10:26:05 -08:00
Kenneth Graunke
b69a85bc4d iris: validation dumping improvements
backported from i965.  don't bother with (pinned) because everything is.
2019-02-21 10:26:05 -08:00
Kenneth Graunke
24bcf1054b iris: update vb BO handling now that we have softpin 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9ac81f1890 iris: decoder fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9955e8334b iris: binder fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
65073c2217 iris: hook up batch decoder 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6cbd1d1692 iris: binders 2019-02-21 10:26:05 -08:00
Kenneth Graunke
209692c716 iris: include p_defines.h in iris_bufmgr.h
for PIPE_TRANSFER_WRITE and friends
2019-02-21 10:26:05 -08:00
Kenneth Graunke
1af84d345a iris: set EXEC_OBJECT_WRITE 2019-02-21 10:26:05 -08:00
Kenneth Graunke
651be7cf3d iris: rewrite to use memzones and not relocs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
68229caa38 iris: more uploaders 2019-02-21 10:26:05 -08:00
Kenneth Graunke
3861d24e23 iris: Also set SUPPORTS_48B? Not sure if necessary. 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e95ad5994a iris: dump gtt offset in dump_validation_list 2019-02-21 10:26:05 -08:00
Kenneth Graunke
d78be0188e iris: fix icache memzone 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e4aa8338c3 iris: Soft-pin the universe
Breaks everything, woo!
2019-02-21 10:26:05 -08:00
Kenneth Graunke
3693307670 iris: some thinking about binding tables 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f6be3d4f3a iris: bufmgr updates.
Drop BO_ALLOC_BUSY (best not to hand people a loaded gun...)
Drop vestiges of alignment
2019-02-21 10:26:05 -08:00
Kenneth Graunke
902a122404 iris: stop adding 9 to our varyings 2019-02-21 10:26:05 -08:00
Kenneth Graunke
a235da3e68 iris: set strides on transfers 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6891f70d87 iris: enable a few more formats 2019-02-21 10:26:05 -08:00
Kenneth Graunke
7130c43d96 iris: decode batches if they fail to submit 2019-02-21 10:26:05 -08:00
Kenneth Graunke
23367688e9 iris: NOOP pad batches correctly 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f3150e9ecd iris: warn if execbuf fails 2019-02-21 10:26:05 -08:00
Kenneth Graunke
a50a3a8edf iris: uniform bits...badly 2019-02-21 10:26:05 -08:00
Kenneth Graunke
213b70a222 iris: sample mask...not 0.
We now have a first triangle!
2019-02-21 10:26:05 -08:00
Kenneth Graunke
1a6bb266cf iris: write DISABLES are not write ENABLES...whoops 2019-02-21 10:26:05 -08:00
Kenneth Graunke
50a2596f46 iris: fix extents 2019-02-21 10:26:05 -08:00
Kenneth Graunke
ffcd84f55a iris: catastrophic state pointer mistake 2019-02-21 10:26:05 -08:00
Kenneth Graunke
1739dc0d5e iris: more SF CL VPs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
ade381fb9c iris: fix dmabuf retval comparisons
0 means success
2019-02-21 10:26:05 -08:00
Kenneth Graunke
ed42ae2f9b iris: more sketchy SBE 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9be4b3baaf iris: compctrl
oh, also run things
2019-02-21 10:26:05 -08:00
Kenneth Graunke
db15993cfd iris: actually pin the instruction cache buffers 2019-02-21 10:26:05 -08:00
Kenneth Graunke
bda9a77b47 iris: smaller blend state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f9d834d588 iris: don't do samplers for disabled stages 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e21bddeb4f iris: render targets! 2019-02-21 10:26:05 -08:00
Kenneth Graunke
8503578e82 iris: fix silly unused batch with addr macro 2019-02-21 10:26:05 -08:00
Kenneth Graunke
352ec1f378 iris: warning fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
54ba8a60d5 iris: basic SBE code 2019-02-21 10:26:05 -08:00
Kenneth Graunke
5af16f5e20 iris: alpha testing in PSB 2019-02-21 10:26:05 -08:00
Kenneth Graunke
c96132d5fd iris: blend state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
bb3c0be7a8 iris: dummy constants 2019-02-21 10:26:05 -08:00
Kenneth Graunke
538decc0de iris: URB configs. 2019-02-21 10:26:05 -08:00
Kenneth Graunke
b1115799e6 iris: actually set KSP offsets 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6f1c07d7dd iris: actually softpin at an address 2019-02-21 10:26:05 -08:00
Kenneth Graunke
acdff2f9a6 iris: actually destroy the cache 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9437e135ed iris: rewrite program cache to use u_upload_mgr 2019-02-21 10:26:05 -08:00
Kenneth Graunke
67ca2be992 iris: no NEW_SBA 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e7a729ba34 iris: shuffle comments 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6ecc93f764 iris: bits of WM key 2019-02-21 10:26:05 -08:00
Kenneth Graunke
bba13b1501 iris: move key pop to state module
shader key population needs to read state
2019-02-21 10:26:05 -08:00
Kenneth Graunke
5864c9414a iris: fix SBA 2019-02-21 10:26:05 -08:00
Kenneth Graunke
5ae278da18 iris: use vtbl to avoid multiple symbols, fix state base address 2019-02-21 10:26:05 -08:00
Kenneth Graunke
876417f9e8 iris: softpin some things 2019-02-21 10:26:05 -08:00
Kenneth Graunke
c493fee73f iris: drop const from prog data parameters
we ralloc steal things, which makes it a little bogus
2019-02-21 10:26:05 -08:00
Kenneth Graunke
cf7ba838ad iris: more comes from bits filled in
tomorrow, fix the build system to avoid symbol clashes somehow...
we're getting gen9 functions because they happen to be listed before 10
in the link list.
2019-02-21 10:26:05 -08:00
Kenneth Graunke
8dffc9b195 iris: index buffer BO 2019-02-21 10:26:05 -08:00
Kenneth Graunke
8665dfd602 iris: WM.
I could have added a dirty bit for this, but it doesn't seem worth it
2019-02-21 10:26:05 -08:00
Kenneth Graunke
bae5414594 iris: initial gpu state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
0477591355 iris: reorganize commands to match brw 2019-02-21 10:26:05 -08:00
Kenneth Graunke
3e684d0eb7 iris: don't forget about TE 2019-02-21 10:26:05 -08:00
Kenneth Graunke
d71d2028ef iris: convert IRIS_DIRTY_* to #defines
enums are SIGNED.  so IRIS_DIRTY_VS << 4 gets sign extended, making it
not equal to IRIS_DIRTY_FS.  Surprising!
2019-02-21 10:26:05 -08:00
Kenneth Graunke
cfd5fcc256 iris: emit shader packets 2019-02-21 10:26:05 -08:00
Kenneth Graunke
1cf21cc813 iris: actually save derived state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
57c1b71418 iris: promote iris_program_cache_item to iris_compiled_shader 2019-02-21 10:26:05 -08:00
Kenneth Graunke
581459a9fe iris: some shader bits 2019-02-21 10:26:05 -08:00
Kenneth Graunke
df401aaa11 iris: scissor slots 2019-02-21 10:26:05 -08:00
Kenneth Graunke
dc4453d886 iris: bind_state -> compute state 2019-02-21 10:26:05 -08:00
Kenneth Graunke
2f100c6e31 iris: 3DPRIMITIVE fields 2019-02-21 10:26:05 -08:00
Kenneth Graunke
b3646e2b48 iris: fix VF instancing length so we don't get garbage in batch 2019-02-21 10:26:05 -08:00
Kenneth Graunke
317263ab11 iris: vertex packet fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
129fae5a90 iris: fix VBs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
fc5ddc64f9 iris: fix assert 2019-02-21 10:26:05 -08:00
Kenneth Graunke
e91289908a iris: fix indentation 2019-02-21 10:26:05 -08:00
Kenneth Graunke
41b32a4eda iris: hack to stop crashing on samplers for now 2019-02-21 10:26:05 -08:00
Kenneth Graunke
dcfb06375a iris: initialize dirty bits to ~0ull 2019-02-21 10:26:05 -08:00
Kenneth Graunke
0a513d63a1 iris: actually advance forward when emitting commands 2019-02-21 10:26:05 -08:00
Kenneth Graunke
24cc627612 iris: actually flush the commands 2019-02-21 10:26:05 -08:00
Kenneth Graunke
082911409e iris: actually APPEND commands, not stomp over the top and never incr 2019-02-21 10:26:05 -08:00
Kenneth Graunke
b332ff489c iris: VB fixes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
50b1e01996 iris: DEBUG=bat
Deleted in the interest of making the branch compile at each step
2019-02-21 10:26:05 -08:00
Kenneth Graunke
6e01bc0637 iris: VB addresses 2019-02-21 10:26:05 -08:00
Kenneth Graunke
b574b56325 iris: reference VB BOs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
4dc683f64b iris: so, sba then. 2019-02-21 10:26:05 -08:00
Kenneth Graunke
d900a235b1 iris: try and have an iris address 2019-02-21 10:26:05 -08:00
Kenneth Graunke
f31ae76216 iris: flag SBA updates when instruction BO changes 2019-02-21 10:26:05 -08:00
Kenneth Graunke
7d90cc8da4 iris: bit of SBA code
genxml MOCS is stupid, addresses are hard news at 11
2019-02-21 10:26:05 -08:00
Kenneth Graunke
ff5c886fb3 iris: move MAX defines to iris_batch.h
for SBA
2019-02-21 10:26:05 -08:00
Kenneth Graunke
7bfc8f7d7d iris: kill iris_new_batch
reset and new are too similar, and this had exactly one caller
2019-02-21 10:26:05 -08:00
Kenneth Graunke
b701096ab9 iris: make iris_batch target a particular ring 2019-02-21 10:26:05 -08:00
Kenneth Graunke
64f043570d iris: lower io 2019-02-21 10:26:05 -08:00
Kenneth Graunke
695bd55d1a iris: do the FS...asserts because we don't lower uniforms yet 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6aa15cadf3 iris: import program cache code 2019-02-21 10:26:05 -08:00
Kenneth Graunke
4525dda75f iris: reworks, FS compile pieces 2019-02-21 10:26:05 -08:00
Kenneth Graunke
628a71c2e3 iris: parse INTEL_DEBUG 2019-02-21 10:26:05 -08:00
Kenneth Graunke
d62b0b9ee8 iris: draw->restart_index is uninitialized if PR is not enabled 2019-02-21 10:26:05 -08:00
Kenneth Graunke
5fad62cef1 iris: fix bogus index buffer reference 2019-02-21 10:26:05 -08:00
Kenneth Graunke
95fe254cf2 iris: fix prim type 2019-02-21 10:26:05 -08:00
Kenneth Graunke
793276cd8b iris: msaa sample count packing problems
0 -> ffffffffffffffffffffffffffff
2019-02-21 10:26:05 -08:00
Kenneth Graunke
0252fb36e9 iris: actually save VBs 2019-02-21 10:26:05 -08:00
Kenneth Graunke
ed6ee3e270 iris: fix/rework line stipple 2019-02-21 10:26:05 -08:00
Kenneth Graunke
231935efa2 iris: init the batch! 2019-02-21 10:26:05 -08:00
Kenneth Graunke
9ca58ca517 iris: delete iris_pipe.c, shuffle code around 2019-02-21 10:26:05 -08:00
Kenneth Graunke
455e2d6dce iris: disable execbuf for now 2019-02-21 10:26:05 -08:00
Kenneth Graunke
86e0c08b14 iris: make an ice->render_batch field
we may want a second one for transfers
2019-02-21 10:26:05 -08:00
Kenneth Graunke
ffd7f13b4d iris: drop unused field 2019-02-21 10:26:05 -08:00
Kenneth Graunke
8097dc9dd9 iris: shader debug log 2019-02-21 10:26:05 -08:00
Kenneth Graunke
6c7a276470 iris: maps 2019-02-21 10:26:05 -08:00
Kenneth Graunke
49896861ce iris: linear resources 2019-02-21 10:26:05 -08:00
Kenneth Graunke
c820f5a4bd iris: some program code 2019-02-21 10:26:04 -08:00
Kenneth Graunke
d48dc416fa iris: basic push constant alloc 2019-02-21 10:26:04 -08:00
Kenneth Graunke
21c016b496 iris: emit 3DSTATE_SAMPLER_STATE_POINTERS 2019-02-21 10:26:04 -08:00
Kenneth Graunke
7b80f4587d iris: sampler states 2019-02-21 10:26:04 -08:00
Kenneth Graunke
60208d12b4 iris: COLOR_CALC_STATE 2019-02-21 10:26:04 -08:00
Kenneth Graunke
9367c44639 iris: fix crash - CSO binding can be NULL (when destroying context) 2019-02-21 10:26:04 -08:00
Kenneth Graunke
efea4d96d9 iris: some draw info, vbs, sample mask 2019-02-21 10:26:04 -08:00
Kenneth Graunke
d6ad9f4732 iris: a bit of depth
still need to allocate separate stencil
2019-02-21 10:26:04 -08:00
Kenneth Graunke
7abe5aefd3 iris: fix SF_CL length 2019-02-21 10:26:04 -08:00
Kenneth Graunke
c1c6c3a18a iris: don't segfault on !old_cso 2019-02-21 10:26:04 -08:00
Kenneth Graunke
3eadb1b3a1 iris: framebuffers 2019-02-21 10:26:04 -08:00
Kenneth Graunke
e7c9bddda7 iris: stipples and vertex elements 2019-02-21 10:26:04 -08:00
Kenneth Graunke
d0aab78dc3 iris: sampler views 2019-02-21 10:26:04 -08:00
Kenneth Graunke
831d630b8b iris: Surfaces! 2019-02-21 10:26:04 -08:00
Kenneth Graunke
4ec5f8be3e iris: SF_CLIP_VIEWPORT 2019-02-21 10:26:04 -08:00
Kenneth Graunke
970836c34e iris: scissors 2019-02-21 10:26:04 -08:00
Kenneth Graunke
7c875deaf0 iris: RASTER + SF + some CLIP, fix DIRTY vs. NEW 2019-02-21 10:26:04 -08:00
Kenneth Graunke
02f583b0a0 iris: initial gpu state, merges 2019-02-21 10:26:04 -08:00
Kenneth Graunke
a13d417ac1 iris: merge pack
this lets us merge dynamic and pre-baked state, also like anv
2019-02-21 10:26:04 -08:00
Kenneth Graunke
aee39df710 iris: packing with valgrind.
borrowed macros from anv!
2019-02-21 10:26:04 -08:00
Kenneth Graunke
d3d6ef37f6 iris: initial render state upload 2019-02-21 10:26:04 -08:00
Kenneth Graunke
26fb5a8ae2 iris: port over batchbuffer updates 2019-02-21 10:26:04 -08:00
Kenneth Graunke
14ca30507f iris: viewport state, sort of 2019-02-21 10:26:04 -08:00
Kenneth Graunke
2dce0e94a3 iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.
This commit introduces a new Gallium driver for Intel Gen8+ GPUs,
named 'iris_dri.so' after the hardware.

Developed by:
- Kenneth Graunke (overall driver)
- Dave Airlie (shaders, conditional render, overflow query, Gen8 port)
- Chris Wilson (fencing, pinned memory, ...)
- Jordan Justen (compute shaders)
- Jason Ekstrand (image load store)
- Caio Marcelo de Oliveira Filho (tessellation control passthrough)
- Rafael Antognolli (auxiliary buffer fixes)
- The rest of the i965 contributors and the Mesa community
2019-02-21 10:26:04 -08:00
James Zhu
eac822eac1 gallium/auxiliary/vl: Fix transparent issue on compute shader with rgba
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646
Problem 1,4: they are caused by imcomplete blend comute shader
implementation. So Reverts rgba back to frament shader.

Fixes: 9364d66cb7 (Add video compositor compute shader render)
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Tested-by: Bruno Milreu <bmilreu@gmail.com>
2019-02-21 13:11:53 -05:00
Lionel Landwerlin
20c370c6b1 vulkan: add an overlay layer
Just a starting point to display frame timings & drawcalls/submissions
per frame.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
+1-by: Mike Lothian <mike@fireburn.co.uk>
+1-by: Tapani Pälli <tapani.palli@intel.com>
+1-by: Eric Engestrom <eric.engestrom@intel.com>
+1-by: Yurii Kolesnykov <root@yurikoles.com>
+1-by: myfreeweb <greg@unrelenting.technology>
+1-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 18:06:05 +00:00
Lionel Landwerlin
89f03d1872 imgui: make sure our copy of imgui doesn't clash with others in the same process
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
+1-by: Mike Lothian <mike@fireburn.co.uk>
+1-by: Tapani Pälli <tapani.palli@intel.com>
+1-by: Eric Engestrom <eric.engestrom@intel.com>
+1-by: Yurii Kolesnykov <root@yurikoles.com>
+1-by: myfreeweb <greg@unrelenting.technology>
+1-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 18:06:05 +00:00
Lionel Landwerlin
3950e7c11e imgui: bump copy
Updated at :

commit f977871854af941289f2a9090dcc90f7aa3449a8
Author: omar <omarcornut@gmail.com>
Date:   Fri Feb 15 13:10:22 2019 +0100

    ImFont: Minor adjustment to the structure.
    Examples: Removed unused variable.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
+1-by: Mike Lothian <mike@fireburn.co.uk>
+1-by: Tapani Pälli <tapani.palli@intel.com>
+1-by: Eric Engestrom <eric.engestrom@intel.com>
+1-by: Yurii Kolesnykov <root@yurikoles.com>
+1-by: myfreeweb <greg@unrelenting.technology>
+1-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 18:06:05 +00:00
Lionel Landwerlin
51047cd2e8 build: move imgui out of src/intel/tools to be reused
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
+1-by: Mike Lothian <mike@fireburn.co.uk>
+1-by: Tapani Pälli <tapani.palli@intel.com>
+1-by: Eric Engestrom <eric.engestrom@intel.com>
+1-by: Yurii Kolesnykov <root@yurikoles.com>
+1-by: myfreeweb <greg@unrelenting.technology>
+1-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 18:06:05 +00:00
Jason Ekstrand
f98fd9d15a nir/lower_clip_cull: Fix an incorrect assert
Copy+paste error.  It was supposed to test cull and not clip.

Fixes: 4e69fba534 "nir: Rewrite lower_clip_cull_distance_arrays..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-21 12:05:12 -06:00
Jason Ekstrand
f9b2f10a41 nir: Fix a compile warning 2019-02-21 09:44:42 -06:00
Rob Clark
908d5ee9eb freedreno/a6xx: enable tiled images
Turns out we can write to tiled images as well as read.  This avoids
having to linearize or do the tiling in the shader.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-21 09:06:06 -05:00
Alejandro Piñeiro
0629b2a462 nir, glsl: move pixel_center_integer/origin_upper_left to shader_info.fs
On GLSL that info is set as a layout qualifier when redeclaring
gl_FragCoord, so somehow tied to a specific variable. But in practice,
they behave as a global of the shader. On ARB programs they are set
using a global OPTION (defined at ARB_fragment_coord_conventions), and
on SPIR-V using ExecutionModes, that are also not tied specifically to
the builtin.

This patch moves that info from nir variable and ir variable to nir
shader and gl_program shader_info respectively, so the map is more
similar to SPIR-V, and ARB programs, instead of more similar to GLSL.

FWIW, shader_info.fs already had pixel_center_integer, so this change
also removes some redundancy. Also, as struct gl_program also includes
a shader_info, we removed gl_program::OriginUpperLeft and
PixelCenterInteger, as it would be superfluous.

This change was needed because recently spirv_to_nir changed the order
in which execution modes and variables are handled, so the variables
didn't get the correct values. Now the info is set on the shader
itself, and we don't need to go back to the builtin variable to set
it.

Fixes: e68871f6a ("spirv: Handle constants and types before execution
                   modes")

v2: (Jason)
   * glsl_to_nir: get the info before glsl_to_nir, while all the rest
     of the info gathering is happening
   * prog_to_nir: gather the info on a general info-gathering pass,
     not on variable setup.

v3: (Jason)
   * Squash with the patch that removes that info from ir variable
   * anv: assert that OriginUpperLeft is true. It should be already
     set by spirv_to_nir.
   * blorp: set origin_upper_left on its core "compile fragment
     shader", not just on some specific places (for this we added an
     helper on a previous patch).
   * prog_to_nir: no need to gather specifically this fragcoord modes
     as the full gl_program shader_info is copied.
   * spirv_to_nir: assert that we are a fragment shader when handling
     this execution modes.

v4: (reported by failing gitlab pipeline #18750)
   * state_tracker: update too due changes on ir.h/gl_program

v5:
   * blorp: minor change after change on previous patch
   * radeonsi: update due this change.

v6: (Timothy Arceri)
   * prog_to_nir: remove extra whitespace
   * shader_info: don't use :1 on origin_upper_left
   * glsl: program.fs.origin_upper_left/pixel_center_integer can be
     move out of the shader list loop
2019-02-21 11:47:59 +01:00
Alejandro Piñeiro
675eabb560 blorp: introduce helper method blorp_nir_init_shader
This initializes the nir shader that will be used by blorp. Right now
it doesn't do too much beyond calling nir_builder_init_simple_shader,
and setting a name. More stuff will be added on following patches.

v2: there is a case were it is used a VERTEX_SHADER (Alejandro)
2019-02-21 11:47:51 +01:00
Alyssa Rosenzweig
705723e6be panfrost: Verify and print brx condition in disasm
The condition code in extended branches is repeated 8 times for unclear
reasons; accordingly, the code would be disassembled as "unknown5555",
"unknownAAAA", etc. This patch correctly masks off the lower two bits to
find the true code to print, verifying that the code is repeated as
believed to be necessary (providing some assurance for compiler quality
and an assert trip in case we encounter a shader in the wild that breaks
the convention).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:09:06 +00:00
Alyssa Rosenzweig
779e140b1a panfrost: Dynamically set discard branch targets
discard and discard_if are both implemented with the branching pipeline
on Midgard; essentially, we branch to the end of the fragment shader in
a special "discard" mode, setting the condition as necessary.
Previously, we hardcoded the form of this instruction, which worked for
very simple shaders but was incorrect for anything remotely interesting.
This patch instead emits logical branches in the IR, which are flattened
to real discard ops the same way other branches are, allowing targets to
be computed correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:08:59 +00:00
Alyssa Rosenzweig
5abb7b559e panfrost/midgard: Emit extended branches
Previously, we only emitted compact branches; however, the offset range
of these branches is too small for many real world shaders. This patch
implements support for emitting extended branches and switches to always
using them for control flow. This incurs a code size and possibly
performance penalty, but expands the range of working shaders and
provides opportunity for further optimization.

Support for emitting compact branches is retained but this code path is
presently unused. In the future, we'll want to heuristically determine
which type of branch should be emitted for optimal codegen.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:08:47 +00:00
Alyssa Rosenzweig
813bb34fd8 panfrost: Rectify doubleplusungood extended branch
Midgard features "compact branches" and "extended branches", i.e.
corresponds to short jumps and far jumps. The form of the extended
branch was previously incorrect in the ISA headers; this patch corrects
it and updates the disassembler (simultaneous to preserve
bisectability).

Additionally, we fix some a corner case in the disassembly of extended
branches, and we now prefix extended branches with "brx", to visually
differentiate from compact branches prefixed with "br".

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:07:39 +00:00
Alyssa Rosenzweig
2c74709517 panfrost/midgard: Fix nested/chained if-else
An if-else statement is compiled to a conditional branch (from the start
to the second block) and an unconditional branch (from the end of the
first block to the end of the else). We previously incorrectly computed
the block index of the unconditional branch to be exactly one after that
of the conditional branch, valid for a single if-else statement but
nothing fancier. This patch correctly computes the unconditional branch
target, fixing more complex if-else chains.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:06:26 +00:00
Alyssa Rosenzweig
5e55c11a1b panfrost/midgard: Refactor tag lookahead code
Each Midgard instruction is scheduled to a particular instruction type
("tag"). Presumably the hardware prefetches memory based on tag, so it
is required to report out the first tag to the command stream and the
next tag of a branch target. This procedure was implemented in two
separate parts of the compiler (one time with a slight bug relating to
empty blocks); this patch refactors to unite the two routines and solve
the bug when branching to empty blocks.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:05:59 +00:00
Alyssa Rosenzweig
396eb1440a panfrost: Implement pantrace (command stream dump)
Historically, Panfrost debugging entailed the use of the LD_PRELOADable
`panwrap` tool. This setup is a tad fragile; Panfrost can be traced
directly without the intermediate layer. pantrace implements the
quivalent functionality of panwrap into Panfrost proper, allowing dumps
to work regardless of the kernel layer in use.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:03:21 +00:00
Alyssa Rosenzweig
f611782045 panfrost: Add pandecode (command stream debugger)
The `panwrap` utility can be LD_PRELOAD'd into a GLES app, intercepting
communication between the driver and the kernel. Modern panwrap versions
do no processing of their own; instead, they create a trace directory.
This directory contains the following files:

 - control.log: a line-by-line plain text file, denoting important
   syscalls (mmaps and job submits) along with their arguments

 - memory_*.bin, shader_*.bin: binary dumps of mapped memory

Together, these files contain enough information to reconstruct the
command stream and shaders of (at minimum) a single frame.

The `pandecode` utility takes this directory structure as input,
reconstructing the mapped memory and using the job submit command as an
entrypoint. It then walks the descriptors as the hardware would, parsing
and pretty-printing. Its final output is the pretty-printed command
stream interleaved with the disassembled shaders, suitable for driver
debugging. For instance, the behaviour of two driver versions (one
working, one broken) can be compared by diff'ing their decoded logs.

pandecode/decode.c was originally a part of `panwrap`; it is the oldest
living code in the project. Its history is generally not worth
preserving.

panwrap itself will continue to live downstream for the foreseeable
future, as it is specifically written for the vendor kernel. It is
possible, however, to produce equivalent traces directly from Panfrost,
bypassing the intermediate wrapping layer for well-behaved drivers.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 07:01:48 +00:00
Alyssa Rosenzweig
fb3bbd0c1c panfrost: Stub out separate stencil functions
This is not yet functional, but it resolves a crash in various apps and
provides a framework for further work.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-21 06:58:50 +00:00
Marek Olšák
edbd2c1ff5 radeonsi: use SDMA for uploading data through const_uploader
v2: use tc.stream_uploader in si buffer_transfer_map if not called from
    the driver thread

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-02-20 21:04:29 -05:00
Marek Olšák
54f7545cd7 gallium/u_upload_mgr: allow use of FLUSH_EXPLICIT with persistent mappings
for radeonsi

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-02-20 21:04:29 -05:00
Marek Olšák
dc8a2c139d gallium/u_threaded: always unmap const_uploader
radeonsi will require this. It's a no-op for drivers supporting persistent
mappings.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-02-20 21:04:29 -05:00
Marek Olšák
8ef6f68fa5 st/mesa: always unmap the uploader in st_atom_array.c
This is a no-op for drivers supporting persistent mappings.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-02-20 21:04:29 -05:00
Jason Ekstrand
1a93fc382b nir/xfb: Handle compact arrays in gather_xfb_info
This makes us properly handle gl_ClipDistance and gl_CullDistance.

Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-21 00:08:42 +00:00
Jason Ekstrand
558c314504 nir/xfb: Work in terms of components rather than slots
We needed to better handle cases where a chunk of a variable starts at
some non-zero location_frac and rolls over into the next slot but may
not be more than 4 dwords.  For example, if gl_CullDistance is an array
of 3 things and has location_frac = 2, it will span across two vec4s but
is not, itself, bigger than a vec4.  If you ignore the clip/cull special
case, it's not allowed to happen for anything else because the only
things that can span more than one slot is dvec3 and dvec4 and they're
both bigger than a vec4.  The current code uses this attrib_slot thing
where we count attribute slots and iterate over them.  However, that
doesn't work in the case above because gl_CullDistance will have an
attrib_slot count of 1 even though it does span two slots.  We could fix
this by adjusting attrib_slot but we already have comp_mask and it's
easier to just handle it that way.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-21 00:08:42 +00:00
Jason Ekstrand
4e69fba534 nir: Rewrite lower_clip_cull_distance_arrays to do a lot less lowering
Instead of going to all the work of to combine them into one array, just
make two arrays and use location_frac to colocate them within CLIP0.
Then the back-end can sort things out and stack them on top of each
other.  Thanks to ef99f4c8, we also don't need to set compact anymore.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-21 00:08:42 +00:00
Jason Ekstrand
8f0fe71cc5 nir/xfb: Properly align 64-bit values
Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-21 00:08:42 +00:00
Jason Ekstrand
30b548fc62 compiler/types: Add a contains_64bit helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-21 00:08:42 +00:00
Rob Clark
323958908e freedreno/a6xx: samplerBuffer fixes
Use the 'UNK31' bit (which should probably be called 'BUFFER') for
samplerBuffer case, which increases the size of supported buffer
texture beyond 2^15 elements.

Also need to fix the 2nd coord injected to handle the tex instructions
that take integer coords.

Fixes dEQP-GLES31.functional.texture.texture_buffer.render.as_fragment_texture.buffer_size_131071
and similar

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
50dd773a2d freedreno/ir3/a6xx: use ldib for ssbo reads
... instead of isam.  It seems like when using isam, plus atomics, we
can have the problem of old data being in the texture cache.  Plus this
way we don't have to load a component at a time.

Note that blob still seems to use isam in some cases.  I suppose it might
be preferable in the case of loading a single component, when atomics
are not in the picture (or that the ssbo does not need to otherwise be
coherent).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
c543a2cf6f freedreno/ir3: sync instr/disasm and add ldib encoding
Resync disasm and instr header from envytools, and add ldib encoding.
This replaces an opcode from a3xx which was never seen in practice,
since that seemed easier than dealing with the same opc # meaning a
different thing on a6xx.  (Not really sure if 'sti' was actually a
real thing, I think it was only seen in fuzzing.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
cadf6def0c freedreno/ir3/a6xx: fix load_ssbo barrier type.
Silly copy/pasta bug, since load_image is actually the same instruction
but different barrier class.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
0df0fc28a5 freedreno/ir3: rename put_dst()
This was overlooked when it moved to ir3_context.c and ceased to be
static..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
7fe9e790e7 freedreno: fix crash w/ masked non-SSA dst
Fixes
dEQP-GLES3.functional.shaders.indexing.varying_array.vec3_dynamic_write_dynamic_loop_read
regression.

Fixes: c1a27ba9ba freedreno/ir3: HIGH reg w/a for a6xx
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
8c486083d0 freedreno/a6xx: 3d and cube image fixes
Fixes dEQP-GLES31.functional.image_load_store.{3d,cube}.store.*
and a bunch more

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
97479df8aa freedreno/ir3: fix crash in compile fail case
The variant will be NULL if RA failed.  Which isn't ideal, but at least
lets not segfault and bring down the rest of the dEQP run with us.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Rob Clark
f5ee8c54ed freedreno/ir3: fix legalize for vecN inputs
The wrmask is handled in regmask_get()/regmask_set(), but it wasn't
being propagated from SSA src to dst.  So for example, an SSBO read
value that is passed in as src2.y component to atomic op, wasn't
getting the (sy) flag set.  Causing lots of fail.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-20 18:50:08 -05:00
Bas Nieuwenhuizen
688f5e456a radv: Disable depth clamping even without EXT_depth_range_unrestricted.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-20 23:24:31 +00:00
Bas Nieuwenhuizen
9f7e0523ce radv: Implement VK_EXT_depth_clip_enable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-20 23:24:31 +00:00
Timothy Arceri
03783253b1 nir: remove non-ssa support from nir_copy_prop()
Even in a very basic shader this reduces the time spent in
nir_copy_prop() by ~17%.

No shader-db changes for radeonsi NIR or i965.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-21 10:18:24 +11:00
Bas Nieuwenhuizen
1ef2855692 radv: Handle clip+cull distances more generally as compact arrays.
Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .

That MR keeps the clip and cull arrays split.

So we have to handle
 - compact arrays with location_frac != 0
 - VARYING_SLOT_CLIP_DIST1

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-20 22:49:52 +00:00
Eric Anholt
8cfc17bdda kmsro: Add the rest of the current set of tinydrm drivers.
While I haven't tested them all, given that they're all using the same
allocation paths and modifiers in the kernel they should be fine to use in
the same way.

v2: Rebase on other kmsro changes.
v3: Skip repeated '[with_gallium_kmsro,' in the meson build.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-20 21:49:41 +00:00
Andrii Simiklit
f4f4ec941e i965: re-emit index buffer state on a reset option change.
Seems like we forget to update the index buffer (ib) status and
IndexedDrawCutIndexEnable or CutIndexEnable flag is left unchanged it
leads to ignoring of glEnable/glDisable functions for GL_PRIMITIVE_RESTART
in some cases. The index buffer (ib) status should be re-emmited after the
reset option change to avoid some unexpected behavior.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109451
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Signed-off-by: Andrii Simiklit <asimiklit.work@gmail.com>
2019-02-20 20:27:56 +02:00
Kenneth Graunke
d6337b59f6 nir: Don't forget if-uses in new nir_opt_dead_cf liveness check
Commit 08bfd710a2. (nir/dead_cf: Stop
relying on liveness analysis) introduced a new check that iterated
through a SSA def's uses, to see if it's used.  But it only checked
normal uses, and not uses which are part of an 'if' condition.  This
led to it thinking more nodes were dead than possible.

Fixes Piglit's variable-indexing/tcs-output-array-float-index-wr test
(and related tests) with the out-of-tree Iris driver.

Fixes: 08bfd710a2 nir/dead_cf: Stop relying on liveness analysis
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 09:44:06 -08:00
Kristian H. Kristensen
b9eed05e7f freedreno/a6xx: Support MSAA resolve blits on blitter
This gets stencil and depth resolves working properly.

Fixes:

  dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8
  dEQP-GLES3.functional.fbo.msaa.2_samples.depth24_stencil8
  dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8
  dEQP-GLES3.functional.fbo.msaa.4_samples.depth24_stencil8
  dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_color
  dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_color

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-20 08:56:21 -08:00
Kristian H. Kristensen
686211f4c9 freedreno/a6xx: Copy stencil as R8_UINT
Blitter does support it after all. Previous attempt to use R8_UINT
failed because we overwrote the a6xx format in emit_blit_texture(),
but some of the later setup still looked at the gallium format.

If we overwrite it in the pipe_blit_info before we even call into
emit_blit_texture() it works properly.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-20 08:56:21 -08:00
Kristian H. Kristensen
e827ea8c83 freedreno: Update headers
Add support for multisampled sources for the blitter.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-20 08:56:21 -08:00
Eric Engestrom
a16c398668 anv: use anv_shader_bin_write_to_blob()'s return value
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 16:40:13 +00:00
Eric Engestrom
d3115f34a6 anv: drop unused imports
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 14:28:55 +00:00
Eric Engestrom
8cbfcab425 anv: make sure the extensions stay sorted
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 14:28:55 +00:00
Eric Engestrom
bc76ce1033 anv: sort vendors extensions after KHR and EXT
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 14:28:55 +00:00
Eric Engestrom
427aa9d154 anv: sort extensions alphabetically
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 14:28:55 +00:00
Tapani Pälli
886cee1f96 anv: anv: refactor error handling in anv_shader_bin_write_to_blob()
v2: blob manages error state internally, just return
    true if errors did not occur (Jason)

CID: 1442546
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 15:39:19 +02:00
Carlos Garnacho
30a01cd923 wayland/egl: Ensure EGL surface is resized on DRI update_buffers()
Fullscreening and unfullscreening a totem window while playing a video
sometimes results in the video subsurface not changing size along. This
is also reproducible with epiphany.

If a surface gets resized while we have an active back buffer for it, the
resized dimensions won't get neither immediately applied on the resize
callback, nor correctly synchronized on update_buffers(), as the
(now stale) surface size and currently attached buffer size still do match.

There's actually 2 things to synchronize here, first the surface query
size might not be updated yet to the wl_egl_window's (i.e. resize_callback
happened while there is a back buffer), and second the wayland buffers
would need dropping if new surface size differs with the currently attached
buffer. These are done in separate steps now.

https://bugzilla.redhat.com/show_bug.cgi?id=1650929
https://bugs.freedesktop.org/show_bug.cgi?id=109594

Fixes: a9fb331ea7 ("wayland/egl: update surface size on window resize")
Signed-off-by: Carlos Garnacho <carlosg@gnome.org>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Bastien Nocera <hadess@hadess.net>
Tested-by: Denys Kostin <denys.kostin@globallogic.com>
2019-02-20 12:04:33 +01:00
Lionel Landwerlin
f509213675 anv: implement VK_EXT_depth_clip_enable
A new extension allowing the user to explictly specify the clipping
behavior.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-20 09:57:58 +00:00
Lionel Landwerlin
fa4e103c32 vulkan: Update the XML and headers to 1.1.101 2019-02-20 09:57:58 +00:00
Samuel Iglesias Gonsálvez
63a919a3ce isl: remove the cache line size alignment requirement
The cacheline size was a requirement for using the BLT engine, which
we don't use anymore except for a few things on old HW, so we drop it.

Fixes CTS's CL#3500 test:

dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r8g8b8_unorm

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-20 08:28:31 +01:00
Bas Nieuwenhuizen
572854e706 radv: Clean up a bunch of compiler warnings.
Random unused vars.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-20 03:21:09 +01:00
Bas Nieuwenhuizen
7631feaa00 radv: Sync ETC2 whitelisted devices.
Fixes: 4bb6c49375 "radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-02-20 02:55:41 +01:00
Timothy Arceri
3d7611e9a6 st/nir: use NIR for asm programs
This uses prog_to_nir to translate ARB assembly programs to NIR.

Co-authored by Tim Arceri, Dave Airlie, and Ken Graunke:

 - [Tim Arceri]: original patch
 - [Dave Airlie]: fix crashes with parameter names
 - [Ken Graunke]:
   - Rebase on SCALAR_ISA cap, lower wpos_ytransform too.
   - Rebase on streamout fixes.
   - Lower system values for fragcoord support.
   - Don't try to use prog_to_nir for ATI_fragment_shader programs.
   - Create TGSI for fixed-function or ARB vertex shaders even if the
     driver prefers NIR, so we can create draw module shaders for
     feedback/select emulation, which rely on TGSI.

Tested on:
- iris (Intel Skylake/Kabylake): Piglit & GL CTS - Ken Graunke
- radeonsi (AMD Vega 64): Piglit - Ken Graunke
- vc4/v3d - Piglit - Eric Anholt
- freedreno - dEQP - Kristian Høgsberg

Fixes lit_degenerate_case on vc4 and v3d, and vp-address-01,
vp-arl-constant-array-huge-offset-neg, and vp-arl-neg-array on v3d.
No Piglit regressions on radeonsi; no dEQP regressions on freedreno.

Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 15:56:26 -08:00
Kenneth Graunke
3b4929ec6e st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.
Even if the driver wants to use NIR shaders, we may need to have TGSI
tokens for creating draw module vertex shaders for the feedback/select
render modes.

So...if the st_vertex_program has any TGSI...copy it to the variant.

Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 15:56:19 -08:00
Kenneth Graunke
ba7519ca36 radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow
ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL
leaves it undefined).  Performing fpow lowering in NIR would break this
behavior, preventing us from using prog_to_nir.

According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common
expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>,
which presumably does a zero-wins multiply.

Lowering in NIR results in a non-legacy multiply, where:

   pow(0, 0) = 2^(log2(0) * 0)
             = 2^(-INF * 0)
             = 2^(-NaN)
             = -NaN

which isn't the desired result.

This reverts:
- commit d6b7539206
  (ac/nir: remove emission of nir_op_fpow)
- commit 22430224fe
  (radeonsi/nir: enable lowering of fpow)

and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir
after enabling prog_to_nir in st/mesa later in this series.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 15:56:19 -08:00
Timothy Arceri
9c4d5926aa radeonsi/nir: set shader_buffers_declared properly
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-20 10:46:19 +11:00
Timothy Arceri
94a3df62d7 radeonsi/nir: set colors_read properly
shader-db results for VEGA64:

Totals from affected shaders:
SGPRS: 1976 -> 1976 (0.00 %)
VGPRS: 1240 -> 1144 (-7.74 %)
Spilled SGPRs: 145 -> 145 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 34632 -> 34604 (-0.08 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 261 -> 285 (9.20 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-20 10:46:19 +11:00
Timothy Arceri
05cc1dd764 radeonsi/nir: set input_usage_mask properly
shader-db results for VEGA64:

Totals from affected shaders:
SGPRS: 791528 -> 792616 (0.14 %)
VGPRS: 421624 -> 410784 (-2.57 %)
Spilled SGPRs: 1639 -> 1674 (2.14 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 16103516 -> 16063696 (-0.25 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 136307 -> 137830 (1.12 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-20 10:46:19 +11:00
Timur Kristóf
9429bcc4b0 radeonsi/nir: Use uniform location when calculating const_file_max.
The nine state tracker can produce NIR uniform variables
whose location is explicitly set. radeonsi did not take that
into account when calculating const_file_max, resulting in
rendering glitches. This patch fixes that.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-20 10:37:47 +11:00
Mario Kleiner
afb15d14ca drirc: Add sddm-greeter to adaptive_sync blacklist.
This is the sddm login screen.

Fixes: a9c36dbf9c ("drirc: Initial blacklist for adaptive sync")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-02-19 18:03:05 -05:00
Marek Olšák
bff8da6c59 driconf: add Civ6Sub executable for Civilization 6
I'm getting Civ6Sub instead of Civ6.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 17:59:17 -05:00
Marek Olšák
ae21bdf47c radeonsi: always enable NIR for Civilization 6 to fix corruption
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104602

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 17:59:17 -05:00
Marek Olšák
ccbfe44e5f radeonsi: add driconf option radeonsi_enable_nir
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 17:59:17 -05:00
Kenneth Graunke
f9c835eb56 mesa: Align doubles to a 64-bit starting boundary, even if packing.
In the new Intel Iris driver, I am using Tim's new packed uniform
storage system.  It works great, with one caveat: our scalar compiler
backend assumes that uniform offsets will be aligned to the underlying
data type.  For example, doubles must be 64-bit aligned, floats 32-bit,
half-floats 16-bit, and so on.  It does not need any other padding.

Currently, _mesa_add_parameter aligns everything to 32-bit offsets,
creating doubles that have an unaligned offset.  This patch alters
that code to align doubles to 64-bit offsets.

This may be slightly less optimal for drivers which can support full
packing, and allow reads from unaligned offsets at no penalty.  We could
make this extra alignment optional.  However, it only comes into play
when intermixing double and single precision uniforms.  Doubles are
already not too common, and intermixed values (floats then doubles)
is probably even less common.  At most, we burn a single 32-bit slot
to the alignment, which is not that expensive.  So, it doesn't seem
worthwhile to add the extra complexity.

Eventually, we'll likely want to update this code to allow half-float
values to be packed tighter than 32-bit offsets.  At that point, we'll
probably want to revisit what drivers ultimately want, and add options.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 13:26:58 -08:00
Kenneth Graunke
3c2c6bd1c7 compiler: Make is_64bit(GL_*) helper more broadly available
I'd like to use this in the prog_parameter.c code, so I need to move it
into C, make it non-static, and so on.  This probably isn't the ideal
place for it, but I couldn't think of a better one.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 13:26:58 -08:00
Eric Engestrom
daf8ada08d gitlab-ci: automatically run the CI on pushes to ci/* branches
Last commit limited the CI to master and MRs, but to avoid having to
manually trigger CI runs, let's add a 3rd, automatic way: by pushing to
a branch named `ci/*` (or `ci-*` or just `ci`) (which you can delete
afterwards, the pipeline results will remain).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-19 16:57:32 +00:00
Eric Engestrom
861ade7042 gitlab-ci: limit the automatic CI to master and MRs
Runs on random other branches (stables RCs, personal forks) can still be
triggered manually via the web interface, or an app using the API.

This should massively help with the current voracious state of our CI.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-19 16:57:28 +00:00
Eric Engestrom
f84f833981 tegra/autotools: add missing libdrm cflags
Fixes: f1374805a8 "drm-uapi: use local files, not system libdrm"
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109647
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-19 13:29:05 +00:00
Eric Engestrom
b787403a21 tegra/meson: add missing dep_libdrm
Fixes: f1374805a8 "drm-uapi: use local files, not system libdrm"
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109645
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-19 13:29:00 +00:00
Rhys Perry
238730daef ac/nir: implement half-float nir_op_ldexp
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:46 +00:00
Rhys Perry
6971e8d342 ac/nir: implement half-float nir_op_frsq
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:41 +00:00
Rhys Perry
2038aec22a ac/nir: implement half-float nir_op_frcp
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:35 +00:00
Rhys Perry
4261edc067 ac/nir: make ac_build_fdiv support 16-bit floats
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:29 +00:00
Rhys Perry
6790b3a8db ac/nir: make ac_build_isign work on all bit sizes
v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:20 +00:00
Rhys Perry
bbbfdef683 ac/nir: make ac_build_clamp work on all bit sizes
v2: don't use ac_get_zerof() and ac_get_onef()
v3: rename "intr" to "name"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:58 +00:00
Rhys Perry
7e5004e30a ac/nir: fix 64-bit nir_op_f2f16_rtz
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:44 +00:00
Rhys Perry
c4ea20c0a0 ac/nir: implement 8-bit nir_load_const_instr
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:33 +00:00
Rhys Perry
0ca550e01a radv: ensure export arguments are always float
So that the signature is correct and consistent, the inputs to a export
intrinsic should always be 32-bit floats.

This and the previous commit fixes a large amount crashes from
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_*
tests

Fixes: b722b29f10 ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:22 +00:00
Rhys Perry
64065aa504 radv: bitcast 16-bit outputs to integers
16-bit outputs are stored as 16-bit floats in the outputs array, so they
have to be bitcast.

Fixes: b722b29f10 ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:18 +00:00
Eric Engestrom
23b485c920 gitlab-ci: use ccache to speed up builds
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-19 10:09:51 +00:00
Eric Anholt
dbe3af67a4 v3d: Move i2b and f2b support into emit_comparison.
This lets us save a resolve to NIR true/false for ifs and discard_if.  No
change in shader-db.
2019-02-18 18:18:37 -08:00
Eric Anholt
0bba9c8489 v3d: Emit a simpler negate for the iabs implementation.
One program affected in my shader-db.

instructions in affected programs: 110 -> 108 (-1.82%)
2019-02-18 18:13:09 -08:00
Eric Anholt
1a775d43c9 v3d: Delay emitting ldvpm on V3D 4.x until it's actually used.
For V3D 3.x, we emitted the ldvpms all at the top so that we didn't need
to do VPM setup when the load_inputs are out of order.  For V3D 4.x, we
can reduce register pressure by delaying our loads until they're actually
needed.  This also avoids a bunch of silly MOVs in the pre-opt VIR dump.

total instructions in shared programs: 6421415 -> 6419933 (-0.02%)
total uniforms in shared programs: 2393139 -> 2393140 (<.01%)
total threads in shared programs: 153864 -> 153906 (0.03%)
2019-02-18 18:09:07 -08:00
Eric Anholt
5a84d46896 v3d: Stop tracking num_inputs for VPM loads.
It's unused in the VS (since we need vattr_sizes[] anyway), so move it to
FS prog data.
2019-02-18 18:09:07 -08:00
Eric Anholt
581eba072d v3d: Add a function to describe what the c->execute.file check means.
This is what pointed out that we were misusing the check for last_thrsw in
the previous commit.
2019-02-18 18:09:07 -08:00
Eric Anholt
441294962c v3d: Fix the check for "is the last thrsw inside control flow"
The execute.file check used to be good enough, until I stopped setting up
the execute mask for uniform ifs.

No known tests fixed, noticed while doing a refactor.

Fixes: 0805060573 ("v3d: Handle dynamically uniform IF statements with uniform control flow.")
2019-02-18 18:09:07 -08:00
Eric Anholt
07d5b5a972 v3d: Fix f2b32 behavior.
Now that we don't have the vir_PF() magic, it's obvious that we were doing
the wrong thing for f2b32 by allowing -0.0 to produce true instead of
false.
2019-02-18 18:09:07 -08:00
Eric Anholt
3022b4bd82 v3d: Kill off vir_PF(), which is hard to use right.
You were allowed to pass in any old temp so that you could hopefully fold
the PF up into the def of the temp.  If we couldn't find one, it
implicitly generated a MOV(nop, reg).  However, that PF could have
different behavior depending on whether the def being folded into was a
float or int opcode, which the caller doesn't necessarily control.

Due to the fragility of the function, just switch all callers over to
vir_set_pf().  This also encourages the callers to use a _dest call for
the inst they're putting the PF on, eliminating a bunch of temps in the
pre-optimization VIR.

shader-db says the change is in the noise:

total instructions in shared programs: 6226247 -> 6227184 (0.02%)
instructions in affected programs: 851068 -> 852005 (0.11%)
2019-02-18 18:09:06 -08:00
Eric Anholt
6186a8d44e v3d: Do bool-to-cond for discard_if as well.
Turns this minimal conditional discard (glsl-fs-discard-01.shader_test):

0x3de0b086c5fe9000 fcmp.pushn  -, r1, r5; mov  r2, 0
0x3dec3086bbfc001f nop                  ; mov.ifa  r2, -1
0x3c047186bbe80000 nop                  ; mov.pushz  -, r2
0x3dea3186ba837000 setmsf.ifna  -, 0    ; nop

into:

0x3c00b186c582a000 fcmp.pushn  -, r2, r5; nop
0x3de83186ba837000 setmsf.ifa  -, 0     ; nop

total instructions in shared programs: 6229820 -> 6226247 (-0.06%)
2019-02-18 18:09:06 -08:00
Eric Anholt
718eef62cb v3d: Refactor bcsel and if condition handling.
Both were doing the same thing to try to get a condition to predicate on.
Noticed when I wanted to do this for discard_if as well.

No change in shader-db.
2019-02-18 18:09:06 -08:00
Eric Anholt
4586f9f902 v3d: Add a helper function for getting a nop register.
Just a little refactor to explain what's going on with QFILE_NULL.
2019-02-18 18:09:06 -08:00
Eric Anholt
339155122b v3d: Drop our hand-lowered nir_op_ffract.
The NIR lowering works fine, though it causes some slight noise due to
what looks like choices about propagating constants up multiply chains
changing.

total instructions in shared programs: 6229671 -> 6229820 (<.01%)
total uniforms in shared programs: 2312171 -> 2312324 (<.01%)
2019-02-18 18:09:06 -08:00
Eric Anholt
16f5085490 v3d: Drop a perf note about merging unpack_half_*, which has been implemented.
This is handled with copy-propagation now.
2019-02-18 18:09:06 -08:00
Eric Anholt
146e432b49 v3d: Fix incorrect flagging of ldtmu as writing r4 on v3d 4.x.
Fixes some stalls in 3DMMES's main vertex shader.

total instructions in shared programs: 6280751 -> 6211270 (-1.11%)
instructions in affected programs: 2935050 -> 2865569 (-2.37%)
2019-02-18 18:09:06 -08:00
Eric Anholt
cd5e0b2729 v3d: Use the early_fragment_tests flag for the shader's disable-EZ field.
Apparently we need disable-EZ flagged, not just "does Z writes".

Fixes
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
on 7278, even though it passed in simulation.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 051a41d3d5 ("v3d: Add support for the early_fragment_tests flag.")
2019-02-18 18:09:06 -08:00
Eric Anholt
332b969c4e v3d: Sync indirect draws on the last rendering.
Fixes intermittent fails in
dEQP-GLES31.functional.draw_indirect.compute_interop.separate.drawelements_compute_cmd_and_data_and_indices
and others (particularly when run as part of a CTS run)
2019-02-18 18:09:06 -08:00
Eric Anholt
32f16b0b1e v3d: Clear the GMP on initialization of the simulator.
Otherwise, we might have pages accessible that shouldn't be and miss out
on errors.  This is unlikely for most tests since v3d_hw_get_mem() is big
enough that it'll be a freshly zeroed mmap, but if screens are destroyed
and recreated then we'd be reusing the old v3d_hw_get_mem() contents.
2019-02-18 18:09:06 -08:00
Emil Velikov
ba652394a3 docs: update calendar, add news item and link release notes for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-18 18:38:14 +00:00
Emil Velikov
d7108dac73 docs: add sha256 checksums for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bfb5bdaa97)
2019-02-18 18:36:23 +00:00
Emil Velikov
a1ccff4aaf docs: add release notes for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b26488dead)
2019-02-18 18:36:21 +00:00
Ilia Mirkin
57441af8bf i965: always enable EXT_float_blend
From the table in isl_format.c, it appears that all generations
support blending on 32-bit float surfaces.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-18 12:13:54 -05:00
Ilia Mirkin
9fec653093 st/mesa: enable GL_EXT_float_blend when possible
If the driver supports PIPE_BIND_BLENABLE on RGBA32F, flip
EXT_float_blend on (which will affect ES3 contexts).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-18 12:13:54 -05:00
Ilia Mirkin
070a5e5d92 mesa: add explicit enable for EXT_float_blend, and error condition
If EXT_float_blend is not supported, error out on blending of FP32
attachments in an ES2 context.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-18 12:13:54 -05:00
Samuel Pitoiset
47616810ed radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled
This version is better and safer.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 18:06:07 +01:00
Rob Clark
d6c43cceff freedreno/ir3: handle quirky atomic dst for a6xx
The new encoding returns a value via the 2nd src.  The legalize pass
needs to be aware of this to set the correct needs_sy flag, otherwise we
can, in cases where the atomic dst is not used, overwrite the register
that hardware will asynchronously load result into without (sy) flag, so
it gets clobbered by the atomic result.

This fixes a whole lot of rando ssbo+atomic fails, like
dEQP-GLES31.functional.ssbo.layout.single_basic_type.packed.highp_vec4.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-18 12:01:36 -05:00
Rob Clark
28fc6733cd freedreno/a6xx: fix helper_invocation (sampler mask/id)
Since gl_HelperInvocation is lowered to:

  !((1 << sample_id) & sample_mask_in))

Not setting these enable bits was causing it be broken.  (And probably a
bunch of other stuff too.)

Fixes dEQP-GLES31.functional.shaders.helper_invocation.*

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-18 10:37:54 -05:00
Samuel Pitoiset
32ab7a59bb radv: remove unused variable in gather_push_constant_info()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-18 13:30:16 +01:00
Lionel Landwerlin
8c87d029bc i965: scale factor changes should trigger recompile
Found by inspection.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3da858a6b9 ("intel/compiler: add scale_factors to sampler_prog_key_data")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-18 12:18:13 +00:00
Samuel Pitoiset
0d8f096293 radv: write the alpha channel of MRT0 when alpha coverage is enabled
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:22 +01:00
Samuel Pitoiset
2cf5433b99 ac: use new LLVM 8 intrinsic when loading 16-bit values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:20 +01:00
Samuel Pitoiset
f0223143a8 ac: add ac_build_llvm8_tbuffer_load() helper
It uses the new LLVM intrinsics.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:17 +01:00
Tapani Pälli
9762a9f893 mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachment
This fixes invalid access to Attachment array which would occur if caller
would exceed MaxColorAttachments. In practice this should not ever happen
because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be
valid and InvalidateFramebuffer will error out before but this should
make coverity happy.

v2: const, remove _EXT (Ian)

CID: 1442559
Fixes: 0c42b5f3cb "mesa: wire up InvalidateFramebuffer"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-18 07:51:55 +02:00
Alyssa Rosenzweig
2c6a7fbeb7 panfrost: Fix clipping region
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:50 +00:00
Alyssa Rosenzweig
fa1b36ddc2 panfrost: Preserve w sign in perspective division
This fixes issues where polygons that should be culled (due to negative
w, for instance) may not be.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:34 +00:00
Alyssa Rosenzweig
49985cebea panfrost: Cleanup mali_viewport (clipping) code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:03 +00:00
Alyssa Rosenzweig
a94463732a panfrost: Swap order of tiled texture (de)alloc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:10:33 +00:00
Alyssa Rosenzweig
4a4ed53c01 panfrost: Free imported BOs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:10:06 +00:00
Alyssa Rosenzweig
b5a01296f4 panfrost: Fix various leaks unmapping resources
v2: Don't check for NULL before free()

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:09:41 +00:00
Kenneth Graunke
535251487b nir: Don't reassociate add/mul chains containing only constants
The idea here is to reassociate a * (b * c) into (a * c) * b, when
b is a non-constant value, but a and c are constants, allowing them
to be combined.

But nothing was enforcing that 'b' must be non-constant, which meant
that running opt_algebraic in a loop would never terminate if the IR
contained non-folded constant expressions like 256 * 0.5 * 2.  Normally,
we call constant folding in such a loop too, but IMO it's better for
nir_opt_algebraic to be robust and not rely on that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109581
Fixes: 32e266a9a5 i965: Compile fp64 funcs only if we do not have 64-bit hardware support

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-16 23:36:14 -08:00
Chris Wilson
e9882b879b i965: Assert the execobject handles match for this device
Object handles are local to the device fd, so double check we are not
mixing together objects from multiple screens on execbuf submission.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-16 23:35:29 -08:00
Rob Clark
99b90ecd35 freedreno/a6xx: cache flush harder
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
1af0c5d320 freedreno/a6xx: compute support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
5118dcf8c3 freedreno/a6xx: image/ssbo state emit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
2183d9cff7 freedreno/a6xx: border-color offset helper
Soon we'll need this logic to deal w/ image/SSBO case, so split out a
helper rather than duplicate the logic.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
c1a27ba9ba freedreno/ir3: HIGH reg w/a for a6xx
It seems like some instructions (noticed this w/ cat3), cannot read HIGH
regs.. cat1 (mov/cov) can, and possibly some/all of cat2.

The blob seems to stick w/ an extra mov into low regs.  So lets do the
same.

This fixes WGID on a6xx, which unsurprisingly is related to a lot of
deqp compute fails.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
947848524d freedreno/ir3: add a6xx+ SSBO/image support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark
b46d5b8a84 freedreno/ir3: add a6xx instruction encoding
For the handful of instructions that use a new encoding.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
2e0ea3f09c freedreno/ir3: add image/ssbo <-> ibo/tex mapping
Images and SSBOs don't map directly to the hw.  They end up being part
texture and part something else.  Starting with a6xx, the hack used for
a5xx to smash the image tex state into hw texture state starting from
MAX counting down won't work, because we start using tex state also for
SSBO read.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
75f3a5245e freedreno/ir3: fix ncomp for _store_image() src
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
feee3050d3 freedreno/ir3: split out a4xx+ instructions
Note that image/ssbo support is currently only implemented for a5xx.
But the instruction encoding is the same for a4xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
42af0640f6 freedreno/ir3: split out image helpers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
aefdb9bed2 freedreno/a6xx: clean up some open-coded bits
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark
b51de44dea freedreno/a6xx: move stream-out emit to helper
Split out of the main fd6_emit() code, since it was already getting to
be a pretty giant function.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:26:14 -05:00
Rob Clark
c0d6be11d6 freedreno/ir3: fix varying packing vs. tex sharp edge
We probably need to rethink how we detect which instruction first
defines higher register classes.  But for now, this at least fixes
the symptom.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:26:14 -05:00
Samuel Pitoiset
52bdb043af radv: fix invalid element type when filling vertex input default values
The elements added into a vector should have the same type as the
first one, otherwise this hits an assertion in LLVM.

Fixes: 4b3549c084 ("radv: reduce the number of loaded channels for vertex input fetches")
reported-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-16 15:33:18 +01:00
Eleni Maria Stea
7188e2ba15 i965: Removed the field etc_format from the struct intel_mipmap_tree
After the previous changes to emulate the ETC/EAC formats using the
secondary shadow miptree, the etc_format field of the intel_mipmap_tree
struct became redundant and the remaining check that used it has been
replaced. (Nanley Chery)

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-15 15:54:41 -08:00
Eleni Maria Stea
248f2e7888 i965: Enabled the OES_copy_image extension on Gen 7 GPUs
OES_copy_image extension was disabled on Gen7 due to the lack of support
for ETC2 images. Enabled it back. (Kenneth Graunke)

v2:
  - Removed the blank lines in the comments above OES_copy_image and
  OES_texture_view extensions in intel_extensions.c (Nanley Chery)

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-15 15:54:41 -08:00
Eleni Maria Stea
db0c379c06 i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.

v2:
  - Added assertion that the miptree doesn't need update at the time we
  update the texture surface. (Nanley Chery)

v3:
  - As we now update the tree before the rendering we don't need to copy
  the data during the unmap anymore. Removed the unnecessary update from
  the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery)

v4:
  - Fixed unrelated empty line removal (Nanley Chery)
  - As now the intel_upate_etc_shadow of intel_mipmap_tree.c is only
  called inside its following function, we don't need to declare it at
  the top of the file anymore. (Nanley Chery)

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-15 15:54:41 -08:00
Eleni Maria Stea
d8eb7287fe i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
compressed EAC/ETC2 images to non-compressed RGBA images. When
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not the compressed format that was expected.

Trying to fix this problem, we use a secondary shadow miptree to store the
decompressed data for the rendering and the main miptree to store the
compressed for the Get functions to work. Each time that the main miptree
is written with compressed data, we decompress them to RGB and update the
shadow. Then we use the shadow for rendering.

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reversed the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery) the following change was
   necessary: Splitted the previous update function that was updating all
   the mipmap levels and use two functions instead: one that updates one
   level and one that updates all of them. Used the first during unmap
   and the second before the rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time and reversed all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)

v3:
  - Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always in
  rgba order. (Nanley Chery)
  - Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
  Chery)
  - Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
  - Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
  - Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
  - Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)

v4:
  - Removed the functions intel_miptree_(map|unmap)_etc and the check if
   we need to call them as with the new changes, they became unreachable.
   (Nanley Chery)
  - We'd rather calculate the level width and height using the shadow
  miptree instead of the main in intel_miptree_update_etc_shadow_levels of
  intel_mipmap_tree.c (Nanley Chery)
  - Fixed the format in the mt_surface_usage, set at the miptree creation,
   in miptree_create of intel_mipmap_tree.c (Nanley Chery)

v5:
  - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
  - Update the flag shadow_needs_update outside the function
  intel_miptree_update_etc_shadow (Nanley Chery)
  - Fixed indentation error (Nanley Chery)

v6:
  - Fixed typo in commit message (Nanley Chery)
  - Simplified the assignment of the mt_fmt in the miptree_create of the
  intel_mipmap_tree.c (Nanley Chery)
  - Combined declarations and assignments where it was possible in the
  intel_miptree_update_etc_shadow and
  intel_miptree_update_etc_shadow_levels of the intel_mipmap_tree.c
  (Nanley Chery)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-02-15 15:54:41 -08:00
Nanley Chery
c6dada70f0 i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea <estea@igalia.com>
2019-02-15 15:54:41 -08:00
Timothy Arceri
a801196ec9 nir: remove simple dead if detection from nir_opt_dead_cf()
This was probably useful when it was first written, however it
looks to be no longer necessary.

As far as I can tell these days dce is smart enough to remove useless
instructions from if branches. Once this is done
nir_opt_peephole_select() will end up removing the empty if.

Removing this support reduces the dolphin uber shader compilation
time spent in nir_opt_dead_cf() by a little over 7x.

No shader-db changes on i965 or radeonsi.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-02-16 10:45:31 +11:00
Alok Hota
f695e43354 swr/rast: Add translation support to streamout
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:29 -06:00
Alok Hota
a7fa0cc0a5 swr/rast: simdlib cleanup, clipper stack space fixes
Reduce stack space used by clipper, which had lead to crashes in some
versions for MSVC

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:23 -06:00
Alok Hota
f9c29a301a swr/rast: convert DWORD->uint32_t, QWORD->uint64_t
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:19 -06:00
Alok Hota
c503b58878 swr/rast: Refactor scratch space variable names
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:14 -06:00
Alok Hota
0b4db43705 swr/rast: FP consistency between POSH/RENDER pipes
- Ensure all threads have optimal floating-point control state
- Disable auto-generation of fused FP ops for VERTEX shader stage
- Disable "fast" FP ops for VERTEX shader stage

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:09 -06:00
Alok Hota
dc7b3c95a4 swr/rast: Move knob defaults to generated cpp file
Reduces amount of compile churn when testing different default values

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:54:04 -06:00
Alok Hota
05e4ff33f5 swr/rast: Flip BitScanReverse index calculation
The intrinsic returns the number of leading zeros, not the bit number of
the first nonzero, so just flip it based on the mask size

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:53:58 -06:00
Alok Hota
ae400a9b11 swr/rast: Correctly align 64-byte spills/fills
Fixes crashes on some compute shaders when running on AVX512

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:53:54 -06:00
Alok Hota
78bab66479 swr/rast: Disable use of __forceinline by default
- Was not useful to inline in release builds
- FORCEINLINE can be used if absolutely necessary

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:52:51 -06:00
Alok Hota
20d5c88760 swr/rast: Convert system memory pointers to gfxptr_t
Fulfills an unused internal interface

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-15 14:52:32 -06:00
Bas Nieuwenhuizen
4b03a19a0b radv: Use correct num formats to detect whether we should be use 1.0 or 1.
normalized and scaled formats also return floats.

Fixes: 4b3549c084 ("radv: reduce the number of loaded channels for vertex input fetches")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-15 20:24:16 +00:00
Ian Romanick
979b43b347 nir/algebraic: Simplify comparison with sequential integers starting with 0
All of the affected shaders are Unreal4 demos.

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 15437170 -> 15437001 (<.01%)
instructions in affected programs: 21536 -> 21367 (-0.78%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.93 x̃: 4
helped stats (rel) min: 0.68% max: 1.01% x̄: 0.80% x̃: 0.80%
95% mean confidence interval for instructions value: -4.07 -3.79
95% mean confidence interval for instructions %-change: -0.83% -0.77%
Instructions are helped.

total cycles in shared programs: 383007896 -> 383007378 (<.01%)
cycles in affected programs: 158640 -> 158122 (-0.33%)
helped: 38
HURT: 4
helped stats (abs) min: 1 max: 48 x̄: 13.89 x̃: 6
helped stats (rel) min: 0.03% max: 1.01% x̄: 0.33% x̃: 0.19%
HURT stats (abs)   min: 2 max: 3 x̄: 2.50 x̃: 2
HURT stats (rel)   min: 0.06% max: 0.09% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: -16.90 -7.77
95% mean confidence interval for cycles %-change: -0.39% -0.19%
Cycles are helped.

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8213746 -> 8213745 (<.01%)
instructions in affected programs: 127 -> 126 (-0.79%)
helped: 1
HURT: 0

total cycles in shared programs: 187734146 -> 187734144 (<.01%)
cycles in affected programs: 2132 -> 2130 (-0.09%)
helped: 1
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-15 11:11:02 -08:00
Ian Romanick
ad05920258 nir/algebraic: Convert some f2u to f2i
Section 5.4.1 (Conversion and Scalar Constructors) of the GLSL 4.60 spec
says:

     It is undefined to convert a negative floating-point value to an
     uint.

Assuming that (uint)some_float behaves like (uint)(int)some_float allows
some optimizations in the i965 backend to proceed.

This basically undoes the small amount of damage done by
"intel/compiler: Avoid propagating inequality cmods if types are
different".

v2: Replicate part of the commit message as a comment in the code.
Suggested by Jason.

shader-db results compairing *before* "intel/compiler: Avoid propagating
inequality cmods if types are different" and after this commit:

Skylake
total cycles in shared programs: 383007996 -> 383007896 (<.01%)
cycles in affected programs: 85208 -> 85108 (-0.12%)
helped: 13
HURT: 8
helped stats (abs) min: 2 max: 26 x̄: 10.77 x̃: 6
helped stats (rel) min: 0.09% max: 0.65% x̄: 0.28% x̃: 0.14%
HURT stats (abs)   min: 2 max: 12 x̄: 5.00 x̃: 3
HURT stats (rel)   min: 0.04% max: 0.32% x̄: 0.12% x̃: 0.07%
95% mean confidence interval for cycles value: -9.31 -0.21
95% mean confidence interval for cycles %-change: -0.24% <.01%
Cycles are helped.

Broadwell
total cycles in shared programs: 415251194 -> 415251370 (<.01%)
cycles in affected programs: 83750 -> 83926 (0.21%)
helped: 7
HURT: 13
helped stats (abs) min: 10 max: 12 x̄: 11.43 x̃: 12
helped stats (rel) min: 0.30% max: 0.30% x̄: 0.30% x̃: 0.30%
HURT stats (abs)   min: 2 max: 36 x̄: 19.69 x̃: 22
HURT stats (rel)   min: 0.05% max: 0.89% x̄: 0.44% x̃: 0.47%
95% mean confidence interval for cycles value: 0.76 16.84
95% mean confidence interval for cycles %-change: <.01% 0.37%
Inconclusive result (%-change mean confidence interval includes 0).

Haswell
total instructions in shared programs: 13823885 -> 13823886 (<.01%)
instructions in affected programs: 2249 -> 2250 (0.04%)
helped: 0
HURT: 1

total cycles in shared programs: 390094243 -> 390094001 (<.01%)
cycles in affected programs: 85640 -> 85398 (-0.28%)
helped: 15
HURT: 6
helped stats (abs) min: 4 max: 26 x̄: 18.53 x̃: 18
helped stats (rel) min: 0.09% max: 0.66% x̄: 0.47% x̃: 0.42%
HURT stats (abs)   min: 2 max: 14 x̄: 6.00 x̃: 2
HURT stats (rel)   min: 0.04% max: 0.37% x̄: 0.15% x̃: 0.04%
95% mean confidence interval for cycles value: -17.36 -5.69
95% mean confidence interval for cycles %-change: -0.44% -0.14%
Cycles are helped.

Ivy Bridge
total cycles in shared programs: 180986448 -> 180986552 (<.01%)
cycles in affected programs: 34835 -> 34939 (0.30%)
helped: 0
HURT: 10
HURT stats (abs)   min: 2 max: 18 x̄: 10.40 x̃: 10
HURT stats (rel)   min: 0.06% max: 0.36% x̄: 0.28% x̃: 0.30%
95% mean confidence interval for cycles value: 4.67 16.13
95% mean confidence interval for cycles %-change: 0.20% 0.35%
Cycles are HURT.

Sandy Bridge
total cycles in shared programs: 154603969 -> 154603970 (<.01%)
cycles in affected programs: 171514 -> 171515 (<.01%)
helped: 25
HURT: 14
helped stats (abs) min: 1 max: 4 x̄: 1.80 x̃: 1
helped stats (rel) min: 0.02% max: 0.10% x̄: 0.04% x̃: 0.04%
HURT stats (abs)   min: 1 max: 8 x̄: 3.29 x̃: 3
HURT stats (rel)   min: 0.03% max: 0.28% x̄: 0.10% x̃: 0.11%
95% mean confidence interval for cycles value: -0.91 0.96
95% mean confidence interval for cycles %-change: -0.02% 0.04%
Inconclusive result (value mean confidence interval includes 0).

No changes on Iron Lake or GM45.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-15 11:11:02 -08:00
Matt Turner
ac21dd4aee intel/compiler/test: Add unit test for mismatched signedness comparison
v2 (idr): Move adding the test to after adding the fix.  Reordering the
two commits prevents possible headaches for git-bisect with scripts that
always do 'ninja check'.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-15 11:11:02 -08:00
Matt Turner
2dff9a66b6 intel/compiler: Avoid propagating inequality cmods if types are different
v2: Fix silly bug in logic.  s/||/&&/

All but one of the affected shaders is in an Unreal4 demo.  The other is
in Tomb Raider.  All of the cases that Ian investigated appear to be
sequences like the following

    if (int(uint(some_float)) < 0) /* other relations too */
        ...

At least in Tomb Raider, it's not obvious that this sequence came from
the original shader.

In some of the Unreal demos, the shader contains code like

    if (int(uint(textureLod(...))) > 0)
        ...

which explicitly generates the offending sequence.

All Gen6+ platforms had similar results (Skylake shown):
total instructions in shared programs: 15437170 -> 15437187 (<.01%)
instructions in affected programs: 4492 -> 4509 (0.38%)
helped: 0
HURT: 17
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.05% max: 0.73% x̄: 0.66% x̃: 0.73%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.57% 0.75%
Instructions are HURT.

total cycles in shared programs: 383007996 -> 383007992 (<.01%)
cycles in affected programs: 20542 -> 20538 (-0.02%)
helped: 6
HURT: 7
helped stats (abs) min: 2 max: 6 x̄: 5.33 x̃: 6
helped stats (rel) min: 0.11% max: 0.36% x̄: 0.32% x̃: 0.36%
HURT stats (abs)   min: 4 max: 4 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.27% max: 0.27% x̄: 0.27% x̃: 0.27%
95% mean confidence interval for cycles value: -3.30 2.69
95% mean confidence interval for cycles %-change: -0.19% 0.19%
Inconclusive result (value mean confidence interval includes 0).

No changes on Iron Lake or GM45.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: nagrigoriadis@gmail.com
Tested-by: Danylo Piliaiev <danylo.piliaiev@gmail.com>
2019-02-15 11:11:02 -08:00
Matt Turner
e50db60d16 intel/compiler/test: Set devinfo->gen = 7
We emit an FBL instruction which only exists since Gen7. This prevents
the test from segfaulting when run with TEST_DEBUG=1.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-15 11:11:02 -08:00
James Zhu
9364d66cb7 gallium/auxiliary/vl: Add video compositor compute shader render
Add compute shader initilization, assign and cleanup in vl_compositor API.
Set video compositor compute shader render as default when pipe support it.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
James Zhu
f6ac0b5d71 gallium/auxiliary/vl: Add compute shader to support video compositor render
Add compute shader to support video compositor render.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
James Zhu
299e2bc046 gallium/auxiliary/vl: Rename csc_matrix and increase its size.
Rename csc_matrix to shader_params, and increase shader_params size
to store more constants for compute shader,

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
James Zhu
7b7b5f2029 gallium/auxiliary/vl: Split vl_compositor graphic shaders from vl_compositor API
Split vl_compositor graphic shaders from vl_compositor API in order to share
vl_compositor API with vl_compositor compute shader later.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
James Zhu
b34d7c5daa gallium/auxiliary/vl: Move dirty define to header file
Move dirty define to header file to share with compute shader.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-02-15 10:07:03 -05:00
Juan A. Suarez Romero
1fb24080b7 nir: remove jump from two merging jump-ending blocks
In opt_peel_initial_if optimization, when moving the continue list to
end of the continue block, before the jump, could happen that the
continue list itself also ends with a jump.

This would mean that we would have two jump instructions in a row: the
first one from the continue list and the second one from the contine
block.

As inserting an instruction after a jump is not allowed (and it does not
make sense, as it will not be executed), remove the jump from the
continue block and keep the one from continue list, as it will be
executed first.

CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-15 15:16:24 +01:00
Juan A. Suarez Romero
69be9934a7 nir: move ALU instruction before the jump instruction
opt_split_alu_of_phi moves ALU instruction to the end of continue block.

But if the continue block ends with a jump instruction (an explicit
"continue" instruction) then the ALU must be inserted before the jump,
as it is illegal to add instructions after the jump.

CC: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 0881e90c09 ("nir: Split ALU instructions in loops that read phis")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-15 15:14:36 +01:00
Andres Gomez
a43596df62 mesa: INVALID_VALUE for wrong type or format in Clear*Buffer*Data
Instead of generating a GL_INVALID_ENUM error when the type or format
is incorrect while using glClear{Named}Buffer{Sub}Data, generate
GL_INVALID_VALUE.

From page 72 (page 94 of the PDF) of the OpenGL 4.6 spec:

  " An INVALID_VALUE error is generated if type is not one of the
    types in table 8.2.

    An INVALID_VALUE error is generated if format is not one of the
    formats in table 8.3."

Fixes the following test:
KHR-GL45.direct_state_access.buffers_errors

v2: correct the doxygen documentation.

Cc: Pi Tabred <servuswiegehtz@yahoo.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-15 14:28:06 +02:00
Gurchetan Singh
67426ccd42 virgl: use virgl_transfer_inline_write even less
We've noticed the Team Fortress 2 engine seems to do many small
calls to glSubData(..). Let's pick our heuristic based on the
resource base width, not the size of a particular upload.
This will cause transfers to be batched together in the transfer
queue.

Revelant glbench microbenchmark --

Before: buffer_upload_dynamic_element_array_131072 = 131.17 mbytes_sec
After: buffer_upload_dynamic_element_array_131072 = 6828.24 mbytes_sec
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
f0e71b1088 virgl: use transfer queue
This improves Unigine Valley benchmark by 3 to 10 fps (depending
on the scene).

It also improves the Team Fortress 2 benchmark from 6 fps to 13
fps (host: 20 fps).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
4a7857b377 virgl: introduce transfer queue
Transfers will be placed here at unmap time instead of incurring
a VM exit. There's an attempt to deduplicate intersecting 1D transfers,
which are surprisingly common.

This can also help with mipmapped texture upload and smaller
textures, where the majority of the time is spent in the guest
kernel / QEMU -- not virglrenderer.  This is shown by the GLbench
texture upload benchmark:

Before:
    texture_upload_rgba_teximage2d_32 = 64.23 mtexel_sec
After:
    texture_upload_rgba_teximage2d_32 = 367.44 mtexel_sec

v2: Split up list iteration functions (@gerddie)
v3: Support for optimizing glBufferSubData
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
9c4930946a virgl: add encoder functions for new protocol
Let's encode the new protocol with new helper functions.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
5510cc67e0 virgl: make winsys modifications for encoded transfers
The idea is to have two command buffers:

1) One for transfers
2) One for commands, which can include transfers

At flush time, (2) will be filled.  Otherwise, (1) will be
used to submit transfers if there are enough of them.

v2: Pass size directly to cmd_buf_create (@gerddie)
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
90e9650585 virgl: add extra checks in virgl_res_needs_flush_wait
This is motivated by the following scenario:

glSubBufferData(GL_ARRAY_BUFFER, ...)
glFlush(..)
glSubBufferData(GL_ARRAY_BUFFER, ...)
glSubBufferData(GL_ARRAY_BUFFER, ...)
glSubBufferData(GL_ARRAY_BUFFER, ...)

This increases @davidriley's Team Fortress 2 apitrace from
1 fps to 6 fps and helps with the Chromium glbench
microbenchmarks:

Before: texture_update_rgba_texsubimage2d_2048 = 554.96 mtexel_sec
   buffer_upload_dynamic_array_12 = 0.02 mbytes_sec
   buffer_upload_dynamic_array_576 = 1.07 mbytes_sec
After: texture_update_rgba_texsubimage2d_2048 = 612.29 mtexel_sec
   buffer_upload_dynamic_array_12 = 2.22 mbytes_sec
   buffer_upload_dynamic_array_576 = 164.89 mbytes_sec
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
ab6ea6e9ce virgl: pass virgl transfer to virgl_res_needs_flush_wait
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
d98fbd9c92 virgl: keep track of number of computations
It's good to keep track of these things.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
35515985a9 virgl: limit command length to 16 bits
Much of our logic is based around the idea the upper 16 bits
of a command dword can encode the length of the command.

Now that the command buffer >= 2^16 - 1, we should check for
this.

v2: alignment, and only check VIRGL_ENCODE_MAX_DWORDS
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
503ffe46bb virgl: use virgl_transfer in inline write
Let's define a helper function and use it.

This commit also allows resources to be emitted into different command
buffers.

Like the ioctls, send 0 for layer_stride and stride.  If we actually
send the real values, there are various assumptions in virglrenderer
for non-1D buffers that may need to be modified.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
0fcd48bac5 virgl: add protocol for resource transfers
Mostly similar to VIRGL_CCMD_RESOURCE_INLINE_WRITE.  However, this
uses the resource's already attached iovecs rather than the command
buffer to transfer the data.

v2: Used (1 << 16) not (1 << 15) [@gerddie]
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:05 +01:00
Gurchetan Singh
168c3ffce3 virgl: when creating / freeing transfers, pass slab pool directly
This will allow us to destroy transfers w/o having a pointer
to the context.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
d5c2dacc15 virgl: unmap uploader at flush time
This should save some memory when allocating and freeing transfers.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
14f265b533 virgl: make alignment smaller when uploading index user buffers
Since we're just uploading to guest memory, let's just align to dword
size.

Fixes: e0f932 ("u_upload_mgr: pass alignment to u_upload_data manually")
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
7626e6e189 virgl: track level cleanliness rather than resource cleanliness
This allows a minor optimization for texture upload.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
c19aedcf1a virgl: don't mark unclean after a flush
The guest memory is still clean until host GL touches it,
which we should track elsewhere.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
5b6a2ae987 virgl: use virgl_resource_dirty helper
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Gurchetan Singh
1d294ad264 virgl: add ability to do finer grain dirty tracking
There are levels to cleanliness.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-02-15 11:19:04 +01:00
Alyssa Rosenzweig
acc52fff20 panfrost: Improve logging and patch memory leaks
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:47:54 +00:00
Alyssa Rosenzweig
c70ed4ca18 panfrost: Don't align framebuffer dims
Fixes regressions with EGL clients

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:46:30 +00:00
Alyssa Rosenzweig
5155bcf099 panfrost: Implement PIPE_QUERY_OCCLUSION_COUNTER
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:46:02 +00:00
Alyssa Rosenzweig
2d22b5380c panfrost: Identify MALI_OCCLUSION_PRECISE bit
Setting this is required for desktop-style occlusion queries.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:45:56 +00:00
Tapani Pälli
595af46f0f drirc/i965: add option to disable 565 configs and visuals
We have cases where we would not like to expose these.

v2: call the option allow_rgb565_configs for consistency
    with existing allow_rgb10_configs (Eric, Jason)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-15 09:38:36 +02:00
Alyssa Rosenzweig
97aa05470a panfrost: Backport driver to Mali T600/T700
There are a few differenes between Mali T860 (Panfrost's primary
reference target) and the older Midgard generations (T600/T700):

 - Miscellaneous different magic numbers. It's not clear what these
numbers mean on either the old or new configurations yet.

 - Errata fixes. T800 is the final Midgard generation and presumably the
least buggy. Older Midgard has some extra hardware errata we have to
workaround.

- SFBD vs MFBD split. Essentially, older Midgard use a Single
FrameBuffer Descriptor (SFBD), which corresponds to single
render-target rendering. Newer Midgard (T760+) use a Multiple
FrameBuffer Descriptor (MFBD), allowing multiple RTs. On ES 2.0, these
descriptors serve the same function, but we implement both, depending on
the version of the hardware.

- CPU bitness. 32-bit systems generally use 32-bit GPU descriptors, and
vice versa for 64-bit. Our target T760 systems are 32-bit whereas our
target T860 systems are 64-bit. More work is needed in this area.

This patch fixes support in these areas for supporting older Midgard
hardware. It is tested on Mali T760 and Mali T860.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:22:42 +00:00
Alyssa Rosenzweig
f96e871c26 panfrost: Fix build; depend on libdrm
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-15 07:19:43 +00:00
Jason Ekstrand
08bfd710a2 nir/dead_cf: Stop relying on liveness analysis
The liveness analysis pass is fairly expensive because it has to build
large bit-sets and run a fix-point algorithm on them.  Instead of
requiring liveness for detecting if values escape a CF node, just take
advantage of the structured nature of NIR and use block indices instead.
This only requires the block index metadata which is the fastest we have
metadata to generate.

No shader-db changes on Kaby Lake

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-14 23:06:29 -06:00
Jason Ekstrand
b50465d197 nir/dead_cf: Inline cf_node_has_side_effects
We want to handle live SSA values differently and it's going to involve
walking the instructions.  We can make it a single instruction walk if
we combine it with cf_node_has_side_effects.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-14 23:05:28 -06:00
Jason Ekstrand
367b0ede4d intel/fs: Bail in optimize_extract_to_float if we have modifiers
This fixes a bug in runscape where we were optimizing x >> 16 to an
extract and then negating and converting to float.  The NIR to fs pass
was dropping the negate on the floor breaking a geometry shader and
causing it to render nothing.

Fixes: 1f862e923c "i965/fs: Optimize float conversions of byte/word..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109601
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-02-14 23:02:44 -06:00
Ilia Mirkin
8c859367df swr: set PIPE_CAP_MAX_VARYINGS correctly
Unfortunately swr was missed in the original commit. The number of
varyings should generally match up to what's reported as the shader
caps for fragment inputs.

Fixes: 6010d7b8e8 (gallium: add PIPE_CAP_MAX_VARYINGS)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Alok Hota <alok.hota@intel.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-14 20:29:36 -05:00
Jason Ekstrand
5064464931 intel/fs: Silence a compiler warning
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-14 16:04:47 -06:00
Jason Ekstrand
9b202239ba anv: Silence some compiler warnings in release builds
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-14 16:04:45 -06:00
Jason Ekstrand
cd60c995a6 anv/blorp: Delete a pointless assert
Just a little higher up in the function we assert that the aspect masks
are actually equal so there's no reason for the weaker check.  Also, the
temporary variables were causing compiler warnings in release builds.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-14 16:04:42 -06:00
Jason Ekstrand
b14d7a6b60 nir: Silence a couple of warnings in release builds
[28/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_gather_xfb_info.c.o'.
../src/compiler/nir/nir_gather_xfb_info.c: In function ‘nir_gather_xfb_info’:
../src/compiler/nir/nir_gather_xfb_info.c:171:13: warning: variable ‘max_offset’ set but not used [-Wunused-but-set-variable]
    unsigned max_offset[NIR_MAX_XFB_BUFFERS] = {0};
             ^~~~~~~~~~
[36/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_instr_set.c.o'.
../src/compiler/nir/nir_instr_set.c:502:1: warning: ‘instr_each_src_and_dest_is_ssa’ defined but not used [-Wunused-function]
 instr_each_src_and_dest_is_ssa(nir_instr *instr)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-14 16:04:35 -06:00
Kenneth Graunke
6775665e5e spirv: Eliminate dead input/output variables after translation.
spirv_to_nir can generate input/output variables which are illegal
for the current shader stage, which would cause nir_validate_shader
to balk.  After my recent commit to start decorating arrays as compact,
dEQP-VK.spirv_assembly.instruction.graphics.module.same_module started
hitting validation errors due to outputs in a TCS (not intended for the
TCS at all) not being per-vertex arrays.

Thanks to Jason Ekstrand for suggesting this approach.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109573
Fixes: ef99f4c8d1 compiler: Mark clip/cull distance arrays as compact before lowering.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-02-14 11:03:56 -08:00
Kenneth Graunke
39aee57523 anv: Put MOCS in the correct location
My patch to switch from struct-based MOCS to numeric MOCS accidentally
divided all MOCS entries by 2 in the Vulkan driver.

MOCS on Gen9+ is just an array index into a table.  But in the hardware
packets, the index starts at bit 1.  So we need to shift it.

Fixes: 0b44644ca6 (genxml: Consistently use a numeric "MOCS" field)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-14 11:03:28 -08:00
Ian Romanick
9a918050e0 spirv: Add missing break
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: c6465fec0c ("spirv: add SpvCapabilityInt64Atomics")
CID: 1442555
2019-02-14 08:35:59 -08:00
Eric Engestrom
c2b4b46fa9 util/tests: compile to something sensible in release builds
assert()-based tests make no sense without asserts, so make sure asserts
are compiled in, even if the rest of the code has asserts turned off.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-14 12:52:34 +00:00
Eric Engestrom
f7c56475d2 anv/tests: compile to something sensible in release builds
assert()-based tests make no sense without asserts, so make sure asserts
are compiled in, even if the rest of the code has asserts turned off.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-14 12:52:34 +00:00
Eric Engestrom
4c1ca5b074 etnaviv: drop duplicate #define
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Eric Engestrom
7f68b38439 st/dri: drop duplicate #define
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Eric Engestrom
2fa165e757 gbm: drop duplicate #defines
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Eric Engestrom
f1374805a8 drm-uapi: use local files, not system libdrm
There was an issue recently caused by the system header being included
by mistake, so let's just get rid of this include path and always
explicitly #include "drm-uapi/FOO.h"

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Eric Engestrom
69e4c273c4 drm-uapi/README: remove explicit list of driver names
These headers are used by a lot more than just the intel drivers nowadays.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Samuel Pitoiset
227df98fa6 radv: fix radv_fixup_vertex_input_fetches()
We should check that num_channels is 4, otherwise that breaks
the world. Sorry for the short breakage.

Fixes: 4b3549c084 ("radv: reduce the number of loaded channels for vertex input fetches")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-14 09:44:35 +01:00
Samuel Pitoiset
4b3549c084 radv: reduce the number of loaded channels for vertex input fetches
It's unnecessary to load more channels than the vertex attribute
format. The remaining channels are filled with 0 for y and z,
and 1 for w.

29077 shaders in 15096 tests
Totals:
SGPRS: 1321605 -> 1318869 (-0.21 %)
VGPRS: 935236 -> 932252 (-0.32 %)
Spilled SGPRs: 24860 -> 24776 (-0.34 %)
Code Size: 49832348 -> 49819464 (-0.03 %) bytes
Max Waves: 242101 -> 242611 (0.21 %)

Totals from affected shaders:
SGPRS: 93675 -> 90939 (-2.92 %)
VGPRS: 58016 -> 55032 (-5.14 %)
Spilled SGPRs: 172 -> 88 (-48.84 %)
Code Size: 2862740 -> 2849856 (-0.45 %) bytes
Max Waves: 15474 -> 15984 (3.30 %)

This mostly helps Croteam games (Talos/Sam2017).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:10:56 +01:00
Samuel Pitoiset
210aec3612 radv: store vertex attribute formats as pipeline keys
The formats will be used for reducing the number of loaded channels.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:10:09 +01:00
Samuel Pitoiset
45382baef6 radv: use MAX_{VBS,VERTEX_ATTRIBS} when defining max vertex input limits
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:09:51 +01:00
Samuel Pitoiset
2154fac6f3 ac: make use of ac_build_expand_to_vec4() in visit_image_store()
And make ac_build_expand() a static function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:09:48 +01:00
Eric Anholt
338d399fd0 freedreno: Use the NIR lowering for isign.
I think this will save an instruction and hopefully not increase any other
costs (possibly the immediate -1 and 1?), but I haven't actually tested.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 00:32:30 +00:00
Eric Anholt
8f3694e1ab intel: Use the NIR lowering for isign.
Drops one instruction from fs-sign-int.shader_test.  No change in
shader-db due to it having 0 instances of sign(genIType).  This may hurt
isign64 if algebraic runs before int64 lowering, but I wasn't sure how to
mark the algebraic opt as "every bit size but 64".

v2: Update commit message about shader-db.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2019-02-14 00:32:30 +00:00
Eric Anholt
3f22b35a43 v3d: Use the NIR lowering for isign instead of rolling our own.
min/max instead of comparisons saves 2 instructions on
fs-sign-int.shader_test.
2019-02-14 00:32:30 +00:00
Eric Anholt
42d2cae907 nir: Move panfrost's isign lowering to nir_opt_algebraic.
I wanted to reuse this from v3d.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-14 00:32:30 +00:00
Timothy Arceri
68baf96824 nir: turn an ssa check in nir_search into an assert
Everything should be in ssa form when we call this. This is a
hotpath so replace the check with an assert.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-02-14 09:35:32 +11:00
Timothy Arceri
46a4d2c867 nir: turn ssa check into an assert
Everthing should be in ssa form when this is called. Checking
for it here is expensive so turn this into an assert instead.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-02-14 09:35:32 +11:00
Timothy Arceri
0a89c9779a nir: prehash instruction in nir_instr_set_add_or_rewrite()
There is no need to hash the instruction twice, especially as we
end up adding it in the majority of cases.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-14 09:35:32 +11:00
Dylan Baker
279060cd32 meson: Add dependency on genxml to anvil
Currently the Intel "anvil" driver races with the generation of genxml
files, while i965 has an explicit dependency. This patch adds the same
dependency to anvil.

Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-13 22:01:00 +00:00
Samuel Pitoiset
334da034d8 radv: always export gl_SampleMask when the fragment shader uses it
For some reasons, this breaks trees rendering in Project Cars.

Fixes: 85010585cd ("radv: only enable gl_SampleMask if MSAA is enabled too")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109401
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-13 23:01:30 +01:00
Alok Hota
736241892f gallium/aux: add PIPE_CAP_MAX_VARYINGS to u_screen
Allows drivers using `u_pipe_screen_get_param_defaults` to use a
fallback value for the new pipe cap. Default value of 8 based on GL 2.1
MAX_VARYING_FLOATS

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-13 15:08:14 -06:00
Kristian H. Kristensen
e8566d7098 .mailmap: Add a few more alises for myself
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 12:03:41 -08:00
Samuel Pitoiset
5e18000d1b radv/winsys: fix BO list creation when RADV_DEBUG=allbos is set
Fixes: 50fd253bd6 ("radv/winsys: Add priority handling during submit.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-13 20:51:40 +01:00
Kristian H. Kristensen
0a41ddbd4e freedreno/a6xx: Fix point coord
Use ir3_next_varying() for iterating through varyings and unset the
global point coord invert bit.

Fixes:

  dEQP-GLES3.functional.shaders.builtin_variable.pointcoord

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 11:14:06 -08:00
Kristian H. Kristensen
2fbd2d5f58 freedreno/a6xx: Front facing needs UNK3 bit
We need to set UNK3 in GRAS_CNTL and RB_RENDER_CONTROL0 for the value
to be reliably delivered.

Fixes:

  dEQP-GLES3.functional.shaders.builtin_variable.frontfacing

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 11:14:06 -08:00
Kristian H. Kristensen
1831238c8e freedreno/a6xx: Update headers
This pulls in changes for compute shaders and a6xx ssbo/image support.
FACENESS bit moved from position 1 to 2 and there's a global invert
bit for point coord.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 11:14:06 -08:00
Kristian H. Kristensen
182e5c011f freedreno/a6xx: Clean up mixed use of swap and swizzle for texture state
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-13 11:03:29 -08:00
Rob Clark
61094629cb freedreno/a6xx: small compiler warning fix
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-13 13:54:05 -05:00
Dylan Baker
aff52dd2c6 get-pick-list: Add --pretty=medium to the arguments for Cc patches
Because none of them have been picked up for 19.0 due to this bug
being reintroduced.

v2: - Fix fixes tags

Fixes: e6b3a3b201
       ("bin/get-pick-list.sh: handle "typod" usecase.")
Fixes: fac10169bb
       ("bin/get-pick-list.sh: prefix output with "[stable] "")
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-13 08:59:30 -08:00
Eric Engestrom
68a9383c6f gitlab-ci: limit ninja to 4 threads max
I tried bumping the limit on make and scons instead, but that just
thrashed the runners, so let's not do that (sorry @daniels :]).

Instead, remove the automatic thread management from ninja and limit it
to 4 instead, in line with make and scons.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-13 16:15:43 +00:00
Konstantin Kharlamov
fccc9d3de6 mapi: work around GCC LTO dropping assembly-defined functions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109391

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-13 14:20:51 +00:00
Caio Marcelo de Oliveira Filho
017349997f nir: fix example in opt_peel_loop_initial_if description
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-12 20:33:20 -08:00
Karol Herbst
7e08f22a72 nir/opt_if: don't mark progress if nothing changes
if we have something like this:

loop {
   ...
   if x {
      break;
   } else {
      continue;
   }
}

opt_if_loop_last_continue returns true marking progress allthough nothing
changes.

Fixes: 5921a19d4b "nir: add if opt opt_if_loop_last_continue()"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-13 00:21:35 +01:00
Oscar Blumberg
3c540e0a74 radeonsi: Fix guardband computation for large render targets
Stop using 12.12 quantization for viewports that are not contained in
the lower 4k corner of the render target as the hardware needs to keep
both absolute and relative coordinates representable.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-12 17:21:46 -05:00
Chia-I Wu
2f8734e13b egl: fix KHR_partial_update without EXT_buffer_age
EGL_BUFFER_AGE_EXT can be queried without EXT_buffer_age.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-12 19:14:34 +00:00
Kenneth Graunke
5a006b026d mesa: Advertise EXT_float_blend in ES 3.0+ contexts.
This extension simply drops a draw time restriction:

    "Furthermore, an INVALID_OPERATION error is generated by
     DrawArrays and the other drawing commands defined in section
     2.8.3 (10.5 in ES 3.1) if blending is enabled (see below) and
     any draw buffer has 32-bit floating-point format components."

We never correctly enforced this restriction anyway, so we were
basically already implementing it.  We just need to advertise it
for our behavior to be correct.

The extension requires EXT_color_buffer_float, but we already enable
that via dummy_true.  So we can dummy_true this one as well.

Found while debugging WebGL conformance tests.  Does not fix any.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-12 10:57:25 -08:00
Alok Hota
d3dfa86a30 gallium/swr: Param defaults for unhandled PIPE_CAPs
Without using this function, we fail the -Wswitch flag when compiling
the default debugoptimized mode in Meson

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-12 18:55:14 +00:00
Juan A. Suarez Romero
1ad26f9417 anv/cmd_buffer: check for NULL framebuffer
This can happen when we record a VkCmdDraw in a secondary buffer that
was created inheriting from the primary buffer, but with the framebuffer
set to NULL in the VkCommandBufferInheritanceInfo.

Vulkan 1.1.81 spec says that "the application must ensure (using scissor
if neccesary) that all rendering is contained in the render area [...]
[which] must be contained within the framebuffer dimesions".

While this should be done by the application, commit 465e5a86 added the
clamp to the framebuffer size, in case of application does not do it.
But this requires to know the framebuffer dimensions.

If we do not have a framebuffer at that moment, the best compromise we
can do is to just apply the scissor as it is, and let the application to
ensure the rendering is contained in the render area.

v2: do not clamp to framebuffer if there isn't a framebuffer

v3 (Jason):
- clamp earlier in the conditional
- clamp to render area if command buffer is primary

v4: clamp also x and y to render area (Jason)

v5: rename used variables (Jason)

Fixes: 465e5a86 ("anv: Clamp scissors to the framebuffer boundary")
CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-12 19:19:13 +01:00
Marek Olšák
6c64413b6f radeonsi: use MEM instead of MEM_GRBM in COPY_DATA.DST_SEL
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-12 13:08:54 -05:00
Marek Olšák
f8e4c9df47 radeonsi: add AMD_DEBUG env var as an alternative to R600_DEBUG
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-12 13:08:54 -05:00
Samuel Pitoiset
1b8983c25b radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8
This fixes a critical issue.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:39:30 +01:00
Samuel Pitoiset
bd1186572f radv: add support for push constants inlining when possible
This removes some scalar loads from shaders, but it increases
the number of SET_SH_REG packets. This is currently basic but
it could be improved if needed. Inlining dynamic offsets might
also help.

Original idea from Dave Airlie.

29077 shaders in 15096 tests
Totals:
SGPRS: 1321325 -> 1357101 (2.71 %)
VGPRS: 936000 -> 932576 (-0.37 %)
Spilled SGPRs: 24804 -> 24791 (-0.05 %)
Code Size: 49827960 -> 49642232 (-0.37 %) bytes
Max Waves: 242007 -> 242700 (0.29 %)

Totals from affected shaders:
SGPRS: 290989 -> 326765 (12.29 %)
VGPRS: 244680 -> 241256 (-1.40 %)
Spilled SGPRs: 1442 -> 1429 (-0.90 %)
Code Size: 8126688 -> 7940960 (-2.29 %) bytes
Max Waves: 80952 -> 81645 (0.86 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:54 +01:00
Samuel Pitoiset
8364ffe823 radv: keep track of the number of remaining user SGPRs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:52 +01:00
Samuel Pitoiset
5f9379ca35 radv: gather if shaders load dynamic offsets separately
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:49 +01:00
Samuel Pitoiset
5806d99984 radv: gather more info about push constants
This is needed in order to inline some push constants when possible.
This also adds a new helper for initializing the pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:34 +01:00
Samuel Pitoiset
129a9f4937 radv: fix compiler issues with GCC 9
"The C standard says that compound literals which occur inside of
the body of a function have automatic storage duration associated
with the enclosing block. Older GCC releases were putting such
compound literals into the scope of the whole function, so their
lifetime actually ended at the end of containing function. This
has been fixed in GCC 9. Code that relied on this extended lifetime
needs to be fixed, move the compound literals to whatever scope
they need to accessible in."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109543
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 14:48:08 +01:00
Tapani Pälli
2a2e69f975 i965: add P0x formats and propagate required scaling factors
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lin Johnson <johnson.lin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-12 08:43:04 +02:00
Tapani Pälli
3da858a6b9 intel/compiler: add scale_factors to sampler_prog_key_data
Patch propagates given scale_factors to lowering options.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-12 08:42:25 +02:00
Tapani Pälli
722f96bfc8 dri: add P010, P012, P016 for 10bit/12bit/16bit YUV420 formats
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lin Johnson <johnson.lin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-12 08:42:02 +02:00
Tapani Pälli
19a85a704b nir: add option to use scaling factor when sampling planes YUV lowering
Patch adds nir_lower_tex_options as parameter to sample_plane so that
we don't need to extend nir_tex_instr for this.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-12 08:41:20 +02:00
Kenneth Graunke
3eedc8f7b1 i965: Use info->textures_used instead of prog->SamplersUsed.
prog->SamplersUsed is set by the linker when validating resource limits,
while info->textures_used is gathered after NIR optimizations, which may
have eliminated some unused surfaces.

This may let us skip some work.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:50 -08:00
Kenneth Graunke
59ae985631 i965: Drop unnecessary 'and' with prog->SamplerUnits
textures_used_by_txf is a subset of textures_used which is a subset
of prog->SamplerUnits.  This should do nothing.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:48 -08:00
Kenneth Graunke
f5c7df4dc9 nir: Gather texture bitmasks in gl_nir_lower_samplers_as_deref.
Eric and I would like a bitmask of which samplers are used, similar to
prog->SamplersUsed, but available in NIR.  The linker uses SamplersUsed
for resource limit checking, but later optimizations may eliminate more
samplers.  So instead of propagating it through, we gather a new one.
While there, we also gather the existing textures_used_by_txf bitmask.

Gathering these bitfields in nir_shader_gather_info is awkward at best.
The main reason is that it introduces an ordering dependency between the
two passes.  If gathering runs before lower_samplers_as_deref, it can't
look at var->data.binding.  If the driver doesn't use the full lowering
to texture_index/texture_array_size (like radeonsi), then the gathering
can't use those fields.  Gathering might be run early /and/ late, first
to get varying info, and later to update it after variant lowering.  At
this point, should gathering work on pre-lowered or post-lowered code?
Pre-lowered is also harder due to the presence of structure types.

Just doing the gathering when we do the lowering alleviates these
ordering problems.  This fixes ordering issues in i965 and makes the
txf info gathering work for radeonsi (though they don't use it).

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:45 -08:00
Kenneth Graunke
120f9b8362 nir: Use sampler derefs in drawpixels and bitmap lowering.
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:44 -08:00
Kenneth Graunke
04bdc56872 program: Make prog_to_nir create texture/sampler derefs.
Until now, prog_to_nir has been setting texture_index and sampler_index
directly.  This is different than GLSL shaders, which create variable
dereferences and rely on lowering passes to reach this final form.

radeonsi uses variable dereferences for samplers rather than
texture_index and sampler_index, so it doesn't even make sense to set
them there.  By moving to derefs, we ensure that both GLSL and ARB
programs produce the same final form that the driver desires.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:40 -08:00
Kenneth Graunke
6a4be25a90 st/nir: Use sampler derefs in built-in shaders.
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:38 -08:00
Kenneth Graunke
ba9c1c8217 st/nir: Lower sampler derefs for builtin shaders.
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:36 -08:00
Kenneth Graunke
8d1646e0e1 st/nir: Pull sampler lowering into a helper function.
This will make it easier to reuse across GLSL / ARB / built-ins.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:35 -08:00
Kenneth Graunke
243c11dc16 i965: Call nir_lower_samplers for ARB programs.
An upcoming patch will start building derefs in prog_to_nir, at which
point we'll need to lower them to indexes.

This gets both GLSL and non-GLSL shaders using the same paths.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:30 -08:00
Kenneth Graunke
529a0711c1 glsl: Don't look at sampler uniform storage for internal vars
Passes like nir_lower_drawpixels add additional sampler variables,
and set an explicit binding which never changes.  These extra samplers
don't have proper uniform storage associated with them, and there is no
way to update bindings via the API.  So, for any 'hidden' variables,
just trust that there's an explicit binding set.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:28 -08:00
Kenneth Graunke
d34e434989 glsl: Allow gl_nir_lower_samplers*() without a gl_shader_program
I would like to be able to run gl_nir_lower_samplers() to turn texture
and sampler variable dereferences into indexes and offsets, even for
ARB programs, and built-in shaders.  This would make sampler handling
more consistent across the various types of shaders.

For GLSL programs, the gl_nir_lower_samplers_as_deref() pass looks up
the variable bindings in the shader program's uniform storage.  But
ARB programs and built-in shaders don't have a gl_shader_program, and
uniform storage doesn't exist.  In this case, we simply skip that
lookup, and trust var->data.binding to be set correctly by whoever
created the shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:34:22 -08:00
Kenneth Graunke
f45dd6d31b st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048
Piglit's vp-max-array test creates a vertex program containing a uniform
array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB.  Mesa
will then add additional state-var parameters for things like the MVP
matrix.

radeonsi currently exposes a value of 4096, derived from constant buffer
upload size.  This means the array will have 4096 elements, and the
extra MVP state-vars would get a prog_src_register::Index of over 4096.

Unfortunately, prog_src_register::Index is a signed 13-bit integer, so
values beyond 4096 end up turning into negative numbers.  Negative
source indexes are only valid for relative addressing, so this ends up
generating illegal IR.

In prog_to_nir, this would cause an out of bounds array access.
st_mesa_to_tgsi checks for a negative value, assumes it's bogus,
and remaps it to parameter 0 in order to get something in-range.
This isn't right - instead of reading the MVP matrix, it would read
the first element of the vertex program's large array.  But the test
only checks that the program compiles, so we never noticed that it
was broken.

This patch limits the size of the program limits, with the understanding
that we may need to generate additional state-vars internally.  i965 has
exposed 1024 for this limit for years, so I don't expect lowering it to
2048 will cause any practical problems for radeonsi or other drivers.

Fixes vp-max-array with prog_to_nir.c.

Cc: "19.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-11 21:09:51 -08:00
Francisco Jerez
374eb3cd6f intel/dump_gpu: Disambiguate between BOs from different GEM handle spaces.
This fixes a rather astonishing problem that came up while debugging
an issue in the Vulkan CTS.  Apparently the Vulkan CTS framework has
the tendency to create multiple VkDevices, each one with a separate
DRM device FD and therefore a disjoint GEM buffer object handle space.
Because the intel_dump_gpu tool wasn't making any distinction between
buffers from the different handle spaces, it was confusing the
instruction state pools from both devices, which happened to have the
exact same GEM handle and PPGTT virtual address, but completely
different shader contents.  This was causing the simulator to believe
that the vertex pipeline was executing a fragment shader, which didn't
end up well.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-11 12:27:22 -08:00
Kristian H. Kristensen
e404c6879d freedreno/a6xx: Fall back to masked RGBA blits for depth/stencil
The blitter doesn't seem to have a write mask, so for depth only and
stencil only blits to Z24S8 we cast the Z24S8 buffer to an RGBA UNORM8
buffer and fall back to pipeline blits with corresponding write mask.

Fixes

  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_stencil_only
  dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_depth
  dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth
  dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_depth
  dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth
  dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8
  dEQP-GLES3.functional.fbo.msaa.4_samples.stencil_index8

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
f03ba155d5 freedreno/a6xx: Add format argument to fd6_tex_swiz()
We need to allow overriding the format with that of the image or
sampler view, so we can't take it from the resource in fd6_tex_swiz().

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
bc8c813d5a freedreno/a6xx: Support y-inverted blits
The src coordinates are s24.8. For an inverted blit that ends at y=0
we need to program -1 for sy2, so we need to handle negative values
correctly.

Fixes

  dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y
  dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y
  dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y
  dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_color
  dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_color

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
03a01e5d23 freedreno/a6xx: Support some depth/stencil blits on blitter
We can rewrite almost all depth stencil blits to various red-only
blits.  The exception is depth-only or stencil-only blits into z24s8
combined depth stencil buffer. We can fall back for depth-only, but
stencil-only remains broken.

Fixes

  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_basic
  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_scale
  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic
  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale
  dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
e9592da2b4 freedreno/a6xx: Move blit check so as to restore comment
The explanation for the compressed format check is broken across two
comments:

	/* We can blit if both or neither formats are compressed formats... */
	/* ... but only if they're the same compression format. */

but the ok_format() checks were inserted between, breaking up the flow
of the sentence.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
d2639f2eac freedreno: Don't tell the blitter what it can't do
Call ctx->blit() and let it reject blits it can't do instead of giving
up on stencil blits and blits u_blitter can't do.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
8cf1303698 freedreno: Consolidate u_blitter functions in freedreno_blitter.c
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
701d30dda8 freedreno/a6xx: Combine emit_blit and fd6_blit
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
6d1a7bdba3 freedreno/a6xx: Use the right resource for separate stencil stride
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
24b4172375 freedreno: Log number of draw for sysmem passes
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
a201cb157d freedreno/a6xx: Drop render condition check in blitter
We already check earlier in the call chain in fd_blit().
glBlitFramebuffer always sets render_condition_enable and thus we
would never try the blitter path for that.

Now that we get all of dEQP-GLES3.functional.fbo.blit.conversion.*
down this path, it turs out that the

  fail_if(info->mask != util_format_get_mask(info->src.format));
  fail_if(info->mask != util_format_get_mask(info->dst.format));

conditions weren't accurate.  util_format_get_mask() returns
PIPE_MASK_RGBA for any format with any color channels, while
info->mask is the exact set of channels to blit.  So we reject things
we could blit - for example, PIPE_FORMAT_R16G16_FLOAT where info->mask
is RG while util_format_get_mask() returns RGBA - and accept things we
can't.  It turns out that the blitter is happy to blit different
number of channels, but fails to blit formats with different numerical
formats and srgb formats.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-02-11 12:26:21 -08:00
Kristian H. Kristensen
4f7a9c23ed freedreno/a6xx: regen headers
Update for a6xx.xml.h to incorporate a few new bits and changes to
blit src rect coordinate types.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-02-11 12:26:21 -08:00
Leo Liu
a0a52a0367 st/va/vp9: set max reference as default of VP9 reference number
If there is no information about number of render targets

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-11 14:44:16 -05:00
Leo Liu
21cdb828a3 st/va: fix the incorrect max profiles report
Add "PIPE_VIDEO_PROFILE_MAX" to enum, so it will make sure here will
be correct when adding more profiles in the future.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109107

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-11 14:44:16 -05:00
Guttula, Suresh
2cf2a56739 st/va:Add support for indirect manner by returning VA_STATUS_ERROR_OPERATION_FAILED
Based on VA Spec,DeriveImage() returns VA_STATUS_ERROR_OPERATION_FAILED if driver
dont have support for internal surface formats.Currently vaDeriveImage()
failed for non-contiguous planes and operation failed error string is
required to support indirect manner i.e. vaCreateImage()+vaPutImage()
incase vaDeriveImage() failed with VA_STATUS_ERROR_OPERATION_FAILED.

This patch will notify to the client as operation failed with proper
error sting,so that client will fallback to vaCreateImage()+vaPutImage().

v2: updated commit message based on VA spec.

Signed-off-by: suresh guttula <suresh.guttula@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2019-02-11 14:44:16 -05:00
Marek Olšák
114a899cc8 winsys/amdgpu: cs_check_space sets the minimum IB size for future IBs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
766e920cdb winsys/amdgpu: clean up IB buffer size computation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
8c1cb393fc winsys/amdgpu: remove occurence of INDIRECT_BUFFER_CONST
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
881ef14b32 winsys/amdgpu: use a separate fence list for syncobjs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
9f00123d51 winsys/amdgpu: unify fence list code
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
ddfe209a0d winsys/amdgpu: don't drop manually added fence dependencies
wow, it's hard to believe that fence and syncobjs dependencies were ignored.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:48 -05:00
Marek Olšák
61c678d4bc radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:06 -05:00
Marek Olšák
4522f01d4e gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-11 12:35:04 -05:00
Jason Ekstrand
9e6a6ef0d4 nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks
When nir_rematerialize_derefs_in_use_blocks_impl was first written, I
attempted to optimize things a bit by not bothering to re-materialize
the sources of deref instructions figuring that the final caller would
take care of that.  However, in the case of more complex deref chains
where the first link or two lives in block A and then another link and
the load/store_deref intrinsic live in block B it doesn't work.  The
code in rematerialize_deref_in_block looks at the tail of the chain,
sees that it's already in block B and skips it, not realizing that part
of the chain also lives in block A.

The easy solution here is to just rematerialize deref sources of deref
instructions as well.  This may potentially lead to a few more deref
instructions being created by the conditions required for that to
actually happen are fairly unlikely and, thanks to the caching, it's all
linear time regardless.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603
Fixes: 7d1d1208c2 "nir: Add a small pass to rematerialize derefs per-block"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-11 10:57:23 -06:00
Jason Ekstrand
fd77606b5b intel/fs: Use enumerated array assignments in fb read TXF setup
It's more clear and means we don't have to update the array every time
we add an optional texture instruction argument

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-11 10:57:09 -06:00
Michel Dänzer
d6c55f6c62 gitlab-ci: Re-use docker image from the main repo in forked repos
Instead of generating it from scratch in each forked repo. This should
save time, energy and storage. (The xserver & xf86-video-amdgpu CI
scripts do basically the same)

v2:
* Hardcode "mesa" instead of using $CI_PROJECT_NAME, to avoid breakage
  if the project name is changed after forking (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-11 12:24:31 +01:00
Ilia Mirkin
cc79a1483f nvc0: we have 16k-sized framebuffers, fix default scissors
For some reason we don't use view volume clipping by default, and use
scissors instead. These scissors were set to an 8k max fb size, while
the driver advertises 16k-sized framebuffers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
2019-02-10 23:36:23 -05:00
Alyssa Rosenzweig
85e2bb58ca panfrost: Specify supported draw modes per-context
Midgard has native support for QUADS and POLYGONS; Bifrost seemingly
does not. Thus, Midgard generally skips prim_convert whereas Bifrost
needs the pass; this patch allows the setting of allowed primitives to
occur on a per-context basis (for runtime hardware selection).

v2: Use (POLYGONS + 1) instead of LINES_ADJACENCY.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
2019-02-11 03:23:00 +00:00
Dave Airlie
90c6880df7 radv: remove alloc parameter from pipeline init
clang points out this isn't used.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-11 10:04:40 +10:00
Dave Airlie
a523ae0cac radv/llvm: initialise passes member.
Fixes coverity warning

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-11 08:59:02 +10:00
Dave Airlie
d2e82c2682 glsl: glsl to nir fix uninit class member.
The constructor should init this to NULL

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-02-11 08:55:07 +10:00
Alyssa Rosenzweig
2458797256 panfrost: Elucidate texture op scheduling comment
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:51:57 +00:00
Alyssa Rosenzweig
658961aec3 panfrost: Remove speculative if 0'd format bit code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:51:51 +00:00
Alyssa Rosenzweig
b1213a3947 panfrost: Remove if 0'd dead code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:50:35 +00:00
Alyssa Rosenzweig
e91e1786c5 panfrost: Add kernel-agnostic resource management
Various methods relating to resource management were previously marked
as kernel-specific, forcing them to stay downstream in the vendor
overlay and eventually be duplicated for DRM code. This patch adds back
this code in kernel-neutral space, allowing for code sharing and
minimising the diff to downstream.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:44:32 +00:00
Alyssa Rosenzweig
4ed23b193a panfrost: Don't hardcode number of nir_ssa_defs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:42:52 +00:00
Alyssa Rosenzweig
97dcad8d3e panfrost: Clean-up one-argument passing quirk
Most Midgard instructions take two-arguments logically; there are always
two arguments at the assembly level. For the few instructions that take
only a single argument, generally the second argument slot is unused,
with a zero inline constant occupying the space. fmov/imov are the
exception, where the first argument is filled with r24 and the logical
argument is in the second slot.

Previously, these constraints were handled by a delicate, buggy series
of hacks. This commit removes these hacks. Instead, we look at the
logical number of arguments (from NIR), switching between two argument
and one-argument-one-zero style. We then introduce a quirk for the
flipped style, which applies to fmov/imov.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-10 00:41:25 +00:00
Karol Herbst
49397a3c84 glsl_type: initialize offset and location to -1 for glsl_struct_field
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-09 13:52:15 +01:00
Kenneth Graunke
55e00a2ea8 nouveau: Silence unhandled cap warnings
Nouveau apparently uses the u_screen helper but prints a warning in the
default case, so running any GL program would start grumbling.

Fixes: 8fa54bc549 gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-08 16:26:00 -08:00
Caio Marcelo de Oliveira Filho
ee670d09af intel/compiler: use 0 as sampler in emit_mcs_fetch
The sampler will be ignored since the underlying 'ld_mcs' operation
won't use it, so just fill the field with 0 instead of the texture to
make it clearer that's the case.

This will also avoid is_high_sampler() to kick in unnecessarily, in
case we are using the operation for a texture with index >= 16.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 14:51:56 -08:00
Eric Engestrom
e8e544436c wsi: query the ICD's max dimensions instead of hard-coding them
anv and radv both happened to already return 2^14 for these, but
querying the ICD is safer and will help if vdreno (or whatever it's
called) doesn't have the same max.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 18:54:57 +00:00
Ian Romanick
b031c64349 nir: Convert a bcsel with only phi node sources to a phi node
v2: Remove the original ALU instruciton after all of its readers are
modified to read the new ALU instruction.

v3: Fix an issue where a bcsel that may not be executed on a loop
iteration due to a break statement is converted to a phi (and therefore
incorrectly "executed").  Noticed by Tim.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109216
Fixes: 8fb8ebfbb0 ("intel/compiler: More peephole select")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
0881e90c09 nir: Split ALU instructions in loops that read phis
A single shader in Unigine Superposition is affected by this change.
A single iadd is moved to the end of a loop.  This iadd is involved in
a complex set of logic to terminate the loop, and an extra mov
instruction is inserted.  This shader really needs the optimization
suggested by bugzilla #94747, and I expect that to make this tiny
regression go away.

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 15047543 -> 15047545 (<.01%)
instructions in affected programs: 565 -> 567 (0.35%)
helped: 0
HURT: 2

total cycles in shared programs: 369977253 -> 369978253 (<.01%)
cycles in affected programs: 127910 -> 128910 (0.78%)
helped: 0
HURT: 2

v2: Skip nir_op_vec{2,3,4} and nir_op_[fi]mov instructions to avoid
infinite optimization loops.  Remove the original ALU instruciton after
all of its readers are modified to read the new ALU instruction.

v3: Extend to the more general case.  The if the prev-block value from
the phi is not undef, this means the ALU instruction has to be
duplicated in both the prev-block and the continue-block.

Fixes: 8fb8ebfbb0 ("intel/compiler: More peephole select")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
0c0c69729b nir: Select phi nodes using prev_block instead of continue_block
This simplifies some changes coming later.

Fixes: 8fb8ebfbb0 ("intel/compiler: More peephole select")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
8d8f80af3a nir: Refactor code that checks phi nodes in opt_peel_loop_initial_if
This will be used in a couple more places soon.

The function name is... horribly long.  Neither Matt nor I could think
of any thing that was shorter and still more descriptive than
"is_phi_foo".  I'm willing to entertain suggestions.

Fixes: 8fb8ebfbb0 ("intel/compiler: More peephole select")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
4d65d2b12e nir: Document some fields of nir_loop_terminator
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
28ef5bb74c intel/compiler: Silence warning about value that may be used uninitialized
For some reason, this warning only occurs for me in release builds.

In file included from src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:25:0:
src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c: In function ‘brw_nir_lower_mem_access_bit_sizes’:
src/compiler/nir/nir_builder.h:501:26: warning: ‘src_swiz[2]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       alu_src.swizzle[i] = swiz[i];
       ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:225:16: note: ‘src_swiz[2]’ was declared here
       unsigned src_swiz[4];
                ^~~~~~~~

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Ian Romanick
78169870e4 nir: Silence zillions of unused parameter warnings in release builds
Fixes: cd56d79b59 "nir: check NIR_SKIP to skip passes by name"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-08 10:37:06 -08:00
Eric Engestrom
3dc5faf523 gitlab-ci: workaround docker bug for users with uppercase characters
CI_REGISTRY_IMAGE == lower($CI_REGISTRY/$CI_PROJECT_PATH)

Suggested-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-08 17:45:57 +00:00
Andrii Simiklit
2b7d5c3217 i965: consider a 'base level' when calculating width0, height0, depth0
I guess that when we calculating the width0, height0, depth0
to use for function 'intel_miptree_create' we need to consider
the 'base level' like it is done in the 'intel_miptree_create_for_teximage'
function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107987
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-07 21:40:50 -08:00
Timothy Arceri
26aa460940 nir: rewrite varying component packing
There are a number of reasons for the rewrite.

1. Adding support for packing tess patch varyings in a sane way.

2. Making use of qsort allowing the code to be much easier to
   follow.

3. Fixes a bug where different interp types caused component
   packing to be skipped for all varyings in some scenarios.

4. Allows us to add a crude live range analysis for deciding
   which components should be packed together. This support can
   optionally be added in a future patch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
2f53260417 nir: add is_packing_supported_for_type() helper
This will be used in the following patches to determine if we
support packing the components of a varying.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
e041123841 nir: add glsl_type_is_32bit() helper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
7b01d5c354 nir: add support for marking used patches when packing varyings
This adds support needed for marking the varyings as used but we
don't actually support packing patches in this patch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
d0af13cfb4 st/glsl_to_nir: call nir_remove_dead_variables() after lowing local indirects
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Timothy Arceri
d0abbaa528 util: move BITFIELD macros to util/macros.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 02:54:56 +00:00
Karol Herbst
cbd1ad6165 st/mesa: require RGBA2, RGB4, and RGBA4 to be renderable
If the driver does not support rendering to these formats but does
support texturing, we can end up in incompatibilities between textures
and renderbuffers that are then copied to.

Fixes KHR-GL45.copy_image.functional on nvc0

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-07 21:51:45 -05:00
Karol Herbst
6010d7b8e8 gallium: add PIPE_CAP_MAX_VARYINGS
Some NVIDIA hardware can accept 128 fragment shader input components,
but only have up to 124 varying-interpolated input components. We add a
new cap to express this cleanly. For most drivers, this will have the
same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader.

Fixes KHR-GL45.limits.max_fragment_input_components

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
[imirkin: rebased, improved docs/commit message]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-07 21:51:45 -05:00
Alyssa Rosenzweig
738346fa23 kmsro: Silence warning if missing
Regardless of whether the build uses kmsro, kmsro is the default driver
descriptor when the static loader is used. Thus, in an edge case where
the static loader is used, no static targets are loaded, and kmsro is
not compiled, a spurious warning is printed. There's no harm in
executing the stub function in this case, but it's not "an error" to not
have kmsro in the build; the driver missing warning should not printed
kmsro.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-02-08 01:48:37 +00:00
Lionel Landwerlin
f1bcb9be46 radv: assert that colorAttachment is valid for CmdClearAttachment
This partially reverts a change from b7a93cbded ("radv: Handle
VK_ATTACHMENT_UNUSED in CmdClearAttachment") which fixed actual issues
but also started to accept invalid values for the colorAttachment
field.

This change asserts that the field is valid for the current pass.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b7a93cbded ("radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-08 00:18:16 +00:00
Lionel Landwerlin
a934a3d124 anv: assert that color attachment are valid
This reverts commit d76e777988.

Let's make this obvious that there is an application issue if it tries
to access an attachment that doesn't exist in the current pass.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d76e777988 ("anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-08 00:18:16 +00:00
Dave Airlie
3c153b3982 docs: update qbo support for virgl
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-02-08 09:06:36 +10:00
Eric Engestrom
6e0effbd34 travis: fix osx make build
This variable was removed in commit 087af992a2 "travis: remove
unused linux code path" because it looked like it was only used by the
Linux build. Turns out I was wrong, so let's restore it.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-02-07 20:14:14 +00:00
Jason Ekstrand
eaf5e4a24d README: Drop the badges from the readme
They have been added as badges directly to the GitLab project.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-07 12:46:17 -06:00
Eric Engestrom
358d0cfab2 driconf: drop unused macro
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-07 13:40:26 +00:00
Eric Engestrom
00be88aab8 meson: add script to print the options before configuring a builddir
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-07 13:22:41 +00:00
Alyssa Rosenzweig
d43ec104b7 panfrost: Include glue for out-of-tree legacy code
In addition to the DRM interface in active development, for legacy
kernels Panfrost has a small, optional, out-of-tree glue repository. For
various reasons, this legacy code should not be included in Mesa proper,
but this commit allows it to coexist peacefully with upstream Panfrost.
If the nondrm repo is cloned/symlinked to the directory
`src/gallium/drivers/panfrost/nondrm`, legacy functionality will be
built. Otherwise, the driver will build normally, though a runtime error
message will be printed if a legacy kernel is detected.

This workaround is icky, but it allows a nearly-upstream Panfrost to
work on real hardware, today. Ideally, this patch will be reverted when
the Panfrost kernel module is mature and we drop legacy support.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-07 01:58:32 +00:00
Alyssa Rosenzweig
7da251fc72 panfrost: Check in sources for command stream
This patch includes the command stream portion of the driver,
complementing the earlier compiler. It provides a base for future work,
though it does not integrate with any particular winsys.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-07 01:57:50 +00:00
Alyssa Rosenzweig
8f4485ef1a panfrost: Use u_pipe_screen_get_param_defaults
Switching to the defaults function cleans up pan_screen.h markedly and
futureproofs for when new PIPE_CAPs are added.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Eric Anholt <eric@anholt.net>
2019-02-07 01:57:19 +00:00
Alyssa Rosenzweig
8f9f99d84d kmsro: Move DRM entrypoints to shared block
As kmsro allows an essentially mix-and-match hodgepodge of display
drivers and renderonly GPUs, it doesn't make sense to couple the display
driver entrypoint definition with the driver. Instead, we move *all*
kmsro entrypoints to a shared kmsro block at the end (avoiding clutter
and distraction since this list may snowball in the future).

v2: Alphabetize driver list.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-07 01:50:16 +00:00
Rhys Perry
5b6f522fc2 nvc0: add compute invocation counter
The strategy is to keep a CPU-side counter of the direct invocations,
and a GPU-side counter of the indirect invocations, and then add them
together for queries.

The specific technique is a macro which multiplies a list of integers
together and accumulates the product into SCRATCH registers held inside
of the context. Another macro will read those values out and add them to
the passed-in cpu-side counter to be stored in a query buffer the same
way that all the other statistics are stored.

Original implementation by Rhys Perry, redone by Ilia Mirkin to use the
SCRATCH temporaries.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-06 19:35:57 -05:00
Karol Herbst
cce4955721 gm107/ir: add fp64 rsq
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Karol Herbst
815a8e59c6 gm107/ir: add fp64 rcp
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Karol Herbst
12669d2970 gk104/ir: Use the new rcp/rsq in library
[imirkin: add a few more "long" prefixes to safen things up]
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Boyan Ding
656ad06051 gk110/ir: Use the new rcp/rsq in library
v2: (Karol Herbst <kherbst@redhat.com>
 * fix Value setup for the builtins

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
[imirkin: track the fp64 flag when switching ops to calls]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Boyan Ding
7937408052 gk110/ir: Add rsq f64 implementation
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Boyan Ding
04593d9a73 gk110/ir: Add rcp f64 implementation
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
6adb9b38bf nvc0: stick zero values for the compute invocation counts
Not quite perfect, but at least we don't end up with random values in
the query buffer.

Fixes KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
e00799d3dc nv50,nvc0: use condition for occlusion queries when already complete
For the NO_WAIT variants, we would jump into the ALWAYS case for both
nested and inverted occlusion queries. However if the query had
previously completed, the application could reasonably expect that the
render condition would follow that result.

To resolve this, we remove the nesting distinction which unnecessarily
created an imbalance between the regular and inverted cases (since
there's no "zero" condition mode). We also use the proper comparison if
we know that the query has completed (which could happen as a result of
an earlier get_query_result call).

Fixes KHR-GL45.conditional_render_inverted.functional

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
162352e671 nvc0: fix 3d images on kepler
Looks like SUBFM.3D and SUEAU are perfectly capable of dealing with 3d
tiling, they just need the correct inputs. Supply them.

We also have to deal with the case where a 2d "layer" of a 3d image is
bound. In this case, we supply the z coordinate separately to the
shader, which has to optionally treat every 2d case as if it could be a
slice of a 3d texture.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
5de5beedf2 nvc0/ir: fix second tex argument after levelZero optimization
We used to pre-set a bunch of extra arguments to a texture instruction
in order to force the RA to allocate a register at the boundary of 4.
However with the levelZero optimization, which removes a LOD argument
when it's uniformly equal to zero, we undid that logic by removing an
extra argument. As a result, we could end up with insufficient alignment
on the second wide texture argument.

Instead we switch to a different method of achieving the same result.
The logic runs during the constraint analysis of the RA, and adds unset
sources as necessary right before being merged into a wide argument.

Fixes MISALIGNED_REG errors in Hitman when run with bindless textures
enabled on a GK208.

Fixes: 9145873b15 ("nvc0/ir: use levelZero flag when the lod is set to 0")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
4443b6ddf2 nvc0/ir: always use CG mode for loads from atomic-only buffers
Atomic operations don't update the local cache, which means that we
would have to issue CCTL operations in order to get the updated values.
When we know that a buffer is primarily used for atomic operations, it's
easier to just avoid the caching at that level entirely.

The same issue persists for non-atomic buffers, which will have to be
fixed separately.

Fixes the failing dEQP-GLES31.functional.atomic_counter.* tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Ilia Mirkin
399215eb7a nvc0: add support for handling indirect draws with attrib conversion
The hardware does not natively support FIXED and DOUBLE formats. If
those are used in an indirect draw, they have to be converted. Our
conversion tries to be clever about only converting the data that's
needed. However for indirect, that won't work.

Given that DOUBLE or FIXED are highly unlikely to ever be used with
indirect draws, read the indirect buffer on the CPU and issue draws
directly.

Fixes the failing dEQP-GLES31.functional.draw_indirect.random.* tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 19:35:57 -05:00
Kristian H. Kristensen
0f7a20e91e freedreno/a6xx: Use tiling for all resources
We used to restrict this to just PIPE_BIND_SAMPLER_VIEW resources, but
most resources benefit from being tiled.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-02-06 15:28:48 -08:00
Kristian H. Kristensen
357ea7da51 freedreno/a6xx: Emit blitter dst with OUT_RELOCW
We're writing to the bo and the kernel needs to know for
fd_bo_cpu_prep() to work.

Fixes: f93e431272 ("freedreno/a6xx: Enable blitter")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-06 15:22:25 -08:00
Bas Nieuwenhuizen
13ab63bb62 radv: Implement VK_EXT_buffer_device_address.
v2: Also update the release notes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:37:38 +01:00
Bas Nieuwenhuizen
3259e7b036 radv: Do not use the bo list for local buffers.
The kernel already does it for us.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:19 +01:00
Bas Nieuwenhuizen
8a15950211 amd/common: Implement global memory accesses.
Needed for VK_EXT_buffer_device_address.

The pointers are implmemented as i8*, since I could not figure
out how to emulate setting struct offsets in LLVM based on the
SPIR-V offsets (and more weird stuff like row major matrices).

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:11 +01:00
Bas Nieuwenhuizen
5703ecf651 amd/common: Do not use 32-bit loads for shared memory.
We use a straight glsl->llvm type conversion so types should already be right.

Also even though the writemasks were changed we we not actually doing 32-bit
things, so this fails miserably.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:06 +01:00
Bas Nieuwenhuizen
8d1718590b amd/common: handle nir_deref_cast for shared memory from integers.
Can happen e.g. after a phi.

Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:02 +01:00
Bas Nieuwenhuizen
830fd0efc1 amd/common: Handle nir_deref_type_ptr_as_array for shared memory.
Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:58 +01:00
Bas Nieuwenhuizen
dbdb44d575 amd/common: Fix stores to derefs with unknown variable.
Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:54 +01:00
Bas Nieuwenhuizen
3c24fc64c7 amd/common: Use correct writemask for shared memory stores.
The check was for 1 bit being set, which is clearly not what we want.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:49 +01:00
Bas Nieuwenhuizen
00253ab2c4 radv: Fix the shader info pass for not having the variable.
For example with VK_EXT_buffer_device_address or
 VK_KHR_variable_pointers.

Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:45 +01:00
Bas Nieuwenhuizen
58c8dadd32 amd/common: Implement ptr->int casts in ac_to_integer.
For the implicit casts inherent in nir.

This should probably have been done for shared memory for
VK_KHR_variable_pointers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:40 +01:00
Bas Nieuwenhuizen
e00d9a9a72 amd/common: Add gep helper for pointer increment.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:36 +01:00
Bas Nieuwenhuizen
39ab4e12f7 radv: Only look at pImmutableSamples if the descriptor has a sampler.
Equivalent of ANV patch c7f4a2867c

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:32 +01:00
Eric Engestrom
40b53a7203 xvmc: fix string comparison
Fixes: 6fca18696d "g3dvl: Update XvMC unit tests."
Cc: Younes Manton <younes.m@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 18:15:43 +00:00
Eric Engestrom
110a6e1839 xvmc: fix string comparison
Fixes: c7b65dcaff "xvmc: Define some Xv attribs to allow users
                             to specify color standard and procamp"
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 18:15:43 +00:00
Eric Engestrom
ba26bc4ef0 gitlab-ci: add meson glvnd build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
5459900f38 travis: remove unused scons code path
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
087af992a2 travis: remove unused linux code path
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
73275147fe gitlab-ci: add make Gallium ST Other build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
360a7bfbe9 gitlab-ci: add make Gallium ST Clover LLVM-7 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
39315a747b gitlab-ci: add make Gallium ST Clover LLVM-6.0 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
e80f88c48a gitlab-ci: add make Gallium ST Clover LLVM-5.0 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
cc85f50029 gitlab-ci: add make Gallium ST Clover LLVM-4.0 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
984e295500 gitlab-ci: add make Gallium ST Clover LLVM-3.9 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
d0dff24cbb gitlab-ci: add make Gallium Drivers "Other" build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
055cfbc6de gitlab-ci: add make Gallium Drivers RadeonSI build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
7b26a19f31 gitlab-ci: add make Gallium Drivers SWR build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
bbdc563c11 gitlab-ci: add make loaders/classic DRI build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
f33517bda7 gitlab-ci: add meson gallium ST "Other" build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
8dab707ab8 gitlab-ci: add meson gallium ST Clover (LLVM 7.0) build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
8744ac0904 gitlab-ci: add meson gallium ST Clover (LLVM 6.0) build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
b5a70af062 gitlab-ci: add meson gallium ST Clover (LLVM 5.0) build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
d407ead204 gitlab-ci: add meson gallium "other drivers" build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
06e8f1961b gitlab-ci: add meson gallium RadeonSI build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
360c814bfe gitlab-ci: add meson gallium SWR build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
d73265e20d gitlab-ci: add meson loader/classic DRI build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
6a19ec9daa gitlab-ci: add scons SWR build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
d4c6d4d5cb gitlab-ci: add scons llvm 3.5 build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
06b245b438 gitlab-ci: add a scons no-llvm build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
89a7467899 gitlab-ci: add a make vulkan build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
46d23c0a46 gitlab-ci: add a meson vulkan build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Eric Engestrom
329f5cd780 gitlab-ci: add ubuntu container
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-06 17:56:30 +00:00
Marek Olšák
42a1cd034d radeonsi: use local ws variable in si_need_dma_space
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Marek Olšák
2c4911c652 radeonsi: don't leak an index buffer if draw_vbo fails
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Marek Olšák
d72c319867 radeonsi: make allocator_zeroed_memory unmappable and use bigger buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Marek Olšák
5068dec5de radeonsi: clear allocator_zeroed_memory with SDMA
so that it can be used in parallel IBs.

This also removes the SO_FILLED_SIZE hack.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Marek Olšák
7d4c935654 radeonsi: initialize textures using DCC to black when possible
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-02-06 11:17:21 -05:00
Jonathan Marek
3361305f57 freedreno: a2xx: fix fast clear
Fixes: 912a9c8d

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-06 14:34:57 +00:00
Eric Engestrom
54fa5eceae egl: use coherent variable names
`EGLDisplay` variables (the opaque Khronos type) have mostly been
consistently called `dpy`, as this is the name used in the Khronos
specs.

However, `_EGLDisplay` variables (our internal struct) have been
randomly called `dpy` when there was no local variable clash with
`EGLDisplay`s, and `disp` otherwise.

Let's be consistent and use `dpy` for the Khronos type, and `disp`
for our struct.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-02-06 11:53:24 +00:00
Alyssa Rosenzweig
a81d5587d6 meson: Remove panfrost from default driver list
Until the kernel side matures and the full driver is upstreamed, to
avoid end-user surprises, Panfrost should only be built for the
adventurous.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-06 02:59:00 +00:00
Eric Anholt
3c08ecf147 v3d: Whitespace consistency fix. 2019-02-05 15:46:42 -08:00
Eric Anholt
940501a446 v3d: Fix copy-propagation of input unpacks.
I had a single function for "does this do float input unpacking" with two
major flaws: It was missing the most common thing to try to copy propagate
a f32 input nunpack to (the VFPACK to an FP16 render target) along with
several other ALU ops, and also would try to propagate an f32 unpack into
a VFMUL which only does f16 unpacks.

instructions in affected programs: 659232 -> 655895 (-0.51%)
uniforms in affected programs: 132613 -> 135336 (2.05%)

and a couple of programs increase their thread counts.

The uniforms hit appears to be a pattern in generated code of doing (-a >=
a) comparisons, which when a is abs(b) can result in the abs instruction
being copy propagated once but not fully DCEed.
2019-02-05 15:46:04 -08:00
Eric Anholt
e5c6938590 v3d: Fix input packing of .l for rounding/fdx/fdy.
Avoids a regression in
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.* once we start
copy-propagating more input packs.
2019-02-05 15:45:23 -08:00
Eric Anholt
1a4170952d v3d: Fix pack/unpack of VFPACK operand unpacks.
We want to be able to copy propagate our texture unpacks into the vfpack.
2019-02-05 15:45:23 -08:00
Eric Anholt
d0fdbd4211 v3d: Fix dumping of shaders with alpha test.
We were trying to print a NULL entry from the table.
2019-02-05 15:42:14 -08:00
Eric Anholt
bdef17b052 v3d: Store the actual mask of color buffers present in the key.
If you only bound rt 1+, we'd still emit a write to the rt0 that isn't
present (noticed while debugging an
ext_framebuffer_multisample-alpha-to-coverage-no-draw-buffer-zero
regression in another change).
2019-02-05 15:42:04 -08:00
Eric Anholt
17a649af05 v3d: Fix precompile of FRAG_RESULT_DATA1 and higher outputs.
I was just leaving the other MRT targets than DATA0 out, by accident.
2019-02-05 15:35:49 -08:00
Kristian H. Kristensen
ba4b22011a st/nir: Use src/ relative include path for autotools
Fixes: cdc53fa81c
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-05 14:19:51 -08:00
Kenneth Graunke
8fa54bc549 gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit.
Iris would like to use compact arrays for tesslevels and clip/cull
distances.  radeonsi will likely want to switch to these at some point,
since it'll be necessary for GL_ARB_gl_spirv support, but it's not ready
for them just yet.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
cf731564e6 st/nir: Call nir_lower_clip_cull_distance_arrays().
Today, st always sets LowerCombinedClipCullDistance, causing the GLSL IR
lowering to run, giving us vec4[2] arrays.  I would like to disable this
and instead run the NIR lowering so that we get compact float[] arrays
instead.

Calling the new pass is a noop if the GLSL IR pass has already run, so
it's safe to call the pass unconditionally.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
15c6902117 nir: Avoid splitting compact arrays into per-element variables.
Compact arrays are used for special variables like clip and cull
distances, or tessellation levels.  Drivers using compact arrays
assume that these values will always be actual arrays.  We don't
want to turn a float[1] gl_CullDistance into a single float; that
would confuse drivers.

Today, i965 uses compact arrays, and Gallium drivers use
nir_lower_io_arrays_to_elements, so we haven't had any overlap
that would demonstrate the issue.  Iris will use both.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
ba9dcc80fb nir: Avoid clip/cull distance lowering multiple times.
A couple places in st/nir assume that cull distances have been lowered
away, so it will need to call this lowering pass for drivers which opt
out of the GLSL IR lowering.  The Intel backend also calls this pass,
for i965 and anv.  We need to only do it once.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
5730364d69 nir: Bail on clip/cull distance lowering if GLSL IR already did it.
We have a GLSL IR pass to convert clip/cull distance float[] arrays
into vec4[2] arrays.  In ff281e6204, we attempted to skip this pass
if the GLSL IR lowering had already run.  But, that code was not quite
right, as we forgot to strip away the per-vertex IO array layer for
geometry and tessellation shader varyings.

If the GLSL IR pass has run, the variables will not be marked as
"compact".  So we can simply check that and bail.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
ef99f4c8d1 compiler: Mark clip/cull distance arrays as compact before lowering.
nir_lower_clip_cull_distance_arrays() marks the combined clip/cull
distance array as compact.  However, when translating in from GLSL
or SPIR-V, we were not marking the original float[] arrays as compact.

We should do so.  That way, we can detect these corner cases properly.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-05 13:58:46 -08:00
Kenneth Graunke
3327c93510 nir: Record info->fs.pixel_center_integer in lower_system_values
radeonsi uses a system value for gl_FragCoord rather than an input var.
These get translated into load_frag_coord NIR intrinsics, which lose the
pixel_center_integer and origin_upper_left decorations.  To cope with
this, Tim added a shader_info field for pixel_center_integer, and made
glsl_to_nir set it accordingly.

prog_to_nir also needs to handle these fragcoord conventions.  Instead
of duplicating the logic to set the info field, just move it to
nir_lower_system_values so it'll happen regardless of who makes the NIR.

(For what it's worth, we don't need an info flag for origin_upper_left,
because radeonsi lowers origin conventions in nir_lower_wpos_ytransform
before nir_lower_system_values destroys the variable and qualifiers.)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:51:52 -08:00
Kenneth Graunke
536abd453b program: Extend prog_to_nir handle system values.
Some drivers, such as radeonsi, use a system value for gl_FragCoord
rather than an input variable.  In this case, our Mesa IR will have
a PROGRAM_SYSTEM_VALUE register, which we need to translate.

This makes prog_to_nir work for Gallium drivers which expose the
PIPE_CAP_TGSI_FS_POSITION_IS_SYSVAL capability bit.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:51:51 -08:00
Kenneth Graunke
fa38ca25f6 program: Use u_bit_scan64 in prog_to_nir.
We can simply iterate the bits rather than using util_last_bit and
checking each one up until that point.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:51:50 -08:00
Kenneth Graunke
a01ad3110a st/mesa: Add NIR versions of the PBO upload/download shaders.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:42 -08:00
Kenneth Graunke
a02349b9e7 st/mesa: Add a NIR version of the OES_draw_texture built-in shaders.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:41 -08:00
Kenneth Graunke
be492affa8 st/mesa: Add NIR versions of the clear shaders.
We implement the basic VS and FS, as well as the VS that does layered
clears by writing gl_Layer from the vertex shader.  Drivers which need
a geometry shader for writing layer continue falling back to TGSI, as
I didn't need this and so didn't bother implementing it.  (We certainly
could, however, if people want to add it in the future.)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:39 -08:00
Kenneth Graunke
3f28b245b5 st/mesa: Add NIR versions of the drawpixels Z/stencil fragment shaders.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:37 -08:00
Kenneth Graunke
2d45f9fa25 st/mesa: Add a NIR version of the drawpixels/bitmap VS copy shader.
This provides a native NIR version of the DrawPixels/Bitmap passthrough
vertex shader.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:36 -08:00
Kenneth Graunke
cdc53fa81c st/nir: Make new helpers for constructing built-in NIR shaders.
The state tracker generates several built-in shaders in order to
perform scissored clears, upload/download PBOs, and so on.  These
are currently constructed using TGSI, using ureg and u_simple_shader.

I want to have NIR versions of these shaders, for my Gallium driver
that has a NIR backend but no TGSI support.  To that end, we'll want
a few helpers to help construct simple shaders.

This patch adds two new helpers:

- st_nir_finish_builtin_shader() takes a manually constructed NIR
  shader, applies lowering passes (like st_link_nir would do for GLSL),
  and constructs the pipe_shader_state.

- st_nir_make_passthrough_shader() makes a simple passthrough shader,
  which copies inputs to outputs.  This is similar to u_simple_shaders.

v2: Set info->fs.untyped_color_outputs for vc4/v3d (thanks Eric!).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:33 -08:00
Kenneth Graunke
4f799264d1 st/nir: Move varying setup code to a helper function.
I want to reuse this for built-in shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eric Anholt <eric@anholt.net>
2019-02-05 13:43:02 -08:00
Jason Ekstrand
36734987a5 nir/deref: Drop zero ptr_as_array derefs
They are effectively (&x)[0] or *&x which does nothing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-05 15:17:19 -06:00
Eric Anholt
aaef12702f nir: Move V3D's "the shader was TGSI, ignore FS output types" flag to NIR.
Ken's rework of mesa/st builtins to NIR means that we'll have more NIR
shaders with color output types that are mismatched with the render target
types.  Since this is behavior that GLSL doesn't require, add it as a
shader_info option so the driver can know that it needs to ignore the FS
output's base type in favor of the actual render target's.  This prevents
needing additional variants in several mesa/st paths (clear, pbo upload,
pbo download), given that the driver already has to handle the variants
for any TGSI being passed to it (from u_blitter, for example).

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-05 12:12:33 -08:00
Emil Velikov
8943eb8f03 anv: wire up the state_pool_padding test
Cc: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 927ba12b53 ("anv/tests: Adding test for the state_pool padding.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com><Paste>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-02-05 11:39:36 -08:00
Karol Herbst
a61c388d07 nvc0/ir: replace cvt instructions with add to improve shader performance
gives me an performance boost of 0.2% in pixmark_piano on my gk106, gm204 and
gp107.

reduces the amount of generated convert instructions by roughly 30% in
shader-db.

v2: only for 32 bit operations
    move some common code out of the switch
    handle OP_SAT with modifiers
v3: only for registers and const memory
    rework if clauses
    merge isCvt into this patch
v4: merge isCvt into its use

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-02-05 20:35:38 +01:00
Bart Oldeman
a203eaa4f4 gallium-xlib: query MIT-SHM before using it.
When Mesa is compiled for gallium-xlib using e.g.
./configure --enable-glx=gallium-xlib --disable-dri --disable-gbm
-disable-egl
and is used by an X server (usually remotely via SSH X11 forwarding)
that does not support MIT-SHM such as XMing or MobaXterm, OpenGL
clients report error messages such as
Xlib:  extension "MIT-SHM" missing on display "localhost:11.0".
ad infinitum.

The reason is that the code in src/gallium/winsys/sw/xlib uses
MIT-SHM without checking for its existence, unlike the code
in src/glx/drisw_glx.c and src/mesa/drivers/x11/xm_api.c.
I copied the same check using XQueryExtension, and tested with
glxgears on MobaXterm.

This issue was reported before here:
https://lists.freedesktop.org/archives/mesa-users/2016-July/001183.html

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
2019-02-05 17:53:35 +00:00
Alok Hota
6e5eb4ead6 swr/rast: update SWR rasterizer shader stats
Primarily refactoring internal stats types

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-02-05 11:41:25 -06:00
Michel Dänzer
c0a540f320 loader/dri3: Use strlen instead of sizeof for creating VRR property atom
sizeof counts the terminating null character as well, so that also
contributed to the ID computed for the X11 atom. But the convention is
for only the non-null characters to contribute to the atom ID.

Fixes: 2e12fe425f "loader/dri3: Enable adaptive_sync via
                     _VARIABLE_REFRESH property"
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-02-05 17:18:44 +00:00
Jonathan Marek
4f0a3c9f9e nir: add missing vec opcodes in lower_bool_to_float
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-05 15:34:15 +00:00
Gert Wollny
b0b3de2be7 mesa: release references to image textures when a context is destroyed
When a texture is still bound as an image and the context it was bound in
is destroyed but not the texture, then the texture will still hold the
resource and will not be freed when it is finally destroyed. Hence, release
these references when the context is destroyed.

This leak was triggered by virglrenderer:
https://gitlab.freedesktop.org/virgl/virglrenderer/issues/86

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-05 10:53:41 +00:00
Gert Wollny
f1f3640f6f radeonsi: release tokens after creating the shader program
ureg_get_tokens clears the reference to the tokens, and create_compute_state makes
a copy, hence the tokens must be explicitely released.

Fixes: Direct leak of 256 byte(s) in 1 object(s) allocated from:
    #0 0x7ff729cf3c60 in realloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdbc60)
    #1 0x7ff721b1240c in tokens_expand ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:234
    #2 0x7ff721b1c9c0 in get_tokens ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:257
    #3 0x7ff721b1c9c0 in copy_instructions ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2040
    #4 0x7ff721b1c9c0 in ureg_finalize ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2090
    #5 0x7ff721b1e919 in ureg_get_tokens ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2167
    #6 0x7ff721f8b35a in si_create_dma_compute_shader ../../samba/mesa/src/gallium/drivers/radeonsi/si_shaderlib_tgsi.c:219
    #7 0x7ff722043ed9 in si_compute_do_clear_or_copy ../../samba/mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:156
    #8 0x7ff7220448d3 in si_clear_buffer ../../samba/mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:247
    #9 0x7ff7220350e8 in vi_dcc_clear_level ../../samba/mesa/src/gallium/drivers/radeonsi/si_clear.c:274

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-05 11:50:54 +01:00
Caio Marcelo de Oliveira Filho
8c7c543936 isl: assert that Gen8+ don't have bit6_swizzling
v2: Rewrite the condition to more clearly match the comment. (Jordan)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho
5299c9cbcc anv: skip bit6 swizzle detection in Gen8+
It is always false on Gen8+.  Also, move the variable definition near
its use.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho
60740eade3 i965: skip bit6 swizzle detection in Gen8+
It is always false on Gen8+.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho
51547bbc5a nir: keep the phi order when splitting blocks
All things being equal is better to keep the original order.  Since
the new block is empty, push the phis in order to tail.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
2019-02-04 20:41:13 -08:00
Ilia Mirkin
38f542783f nv50,nvc0: add explicit settings for recent caps
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-02-04 23:36:46 -05:00
Alyssa Rosenzweig
e67e072637 panfrost: Implement Midgard shader toolchain
This patch implements the free Midgard shader toolchain: the assembler,
the disassembler, and the NIR-based compiler. The assembler is a
standalone inaccessible Python script for reference purposes. The
disassembler and the compiler are implemented in C, accessible via the
standalone `midgard_compiler` binary. Later patches will use these
interfaces from the driver for online compilation.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-05 01:26:28 +00:00
Alyssa Rosenzweig
61d3ae6e0b panfrost: Initial stub for Panfrost driver
This patch adds an initial stub for the Gallium driver, containing
simple screen functions and the majority of the driver headers but no
actual functionality. It further adds the winsys glue for linking in
this stub driver via kmsro on Rockchip/Amlogic boards.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-05 01:19:30 +00:00
Marek Olšák
742d6cdb42 radeonsi: fix crashing performance counters (division by zero)
Fixes: e2b9329f17 "radeonsi: move remaining perfcounter code into si_perfcounter.c"
2019-02-04 18:46:25 -05:00
Marek Olšák
a03ecbaeec radeonsi: handle render_condition_enable in si_compute_clear_render_target 2019-02-04 18:46:25 -05:00
Sonny Jiang
984fd73515 radeonsi: use compute for clear_render_target when possible
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-02-04 18:46:25 -05:00
Kenneth Graunke
dc46317d1a st/mesa: Set pipe_image_view::shader_access in PBO readpixels.
Commit 8b626a22b2 introduced a new
pipe_image_view::shader_access field, indicating the access mode
specified in the shader.  st/mesa's built-in PBO download shader
creates a write-only image buffer, so we should flag it as such.

Nobody uses this field yet (Iris will), so we don't need to backport
this fix to stable branches.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-04 11:17:56 -08:00
Rodrigo Vivi
56c3b4971d intel: Add more PCI Device IDs for Coffee Lake and Ice Lake.
Align with kernel commits:

5e0f5a58b167 ("drm/i915/cfl: Adding another PCI Device ID.")
03ca3cf8e9aa ("drm/i915/icl: Adding few more device IDs for Ice Lake")

Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-04 10:05:25 -08:00
Danylo Piliaiev
64d3b148fe anv: Fix VK_EXT_transform_feedback working with varyings packed in PSIZ
Transform feedback did not set correct SO_DECL.ComponentMask for
varyings packed in VARYING_SLOT_PSIZ:
 gl_Layer         - VARYING_SLOT_LAYER    in VARYING_SLOT_PSIZ.y
 gl_ViewportIndex - VARYING_SLOT_VIEWPORT in VARYING_SLOT_PSIZ.z
 gl_PointSize     - VARYING_SLOT_PSIZ     in VARYING_SLOT_PSIZ.w

Fixes: 36ee2fd61c "anv: Implement the basic form of VK_EXT_transform_feedback"

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 15:30:43 +00:00
Danylo Piliaiev
b7a93cbded radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment
From the Vulkan 1.0.98 spec for vkCmdClearAttachments:

"If any attachment to be cleared in the current subpass is VK_ATTACHMENT_UNUSED,
then the clear has no effect on that attachment."

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that
element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED,
or must be a valid color attachment."

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_DEPTH_BIT, then the current subpass' depth/stencil attachment
must either be VK_ATTACHMENT_UNUSED, or must have a depth component"

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_STENCIL_BIT, then the current subpass' depth/stencil attachment
must either be VK_ATTACHMENT_UNUSED, or must have a stencil component"

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 14:50:43 +02:00
Danylo Piliaiev
d76e777988 anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment
From the Vulkan 1.0.98 spec for vkCmdClearAttachments:

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that
element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED,
or must be a valid color attachment."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-04 14:49:50 +02:00
Samuel Pitoiset
0d0affad3c radv: don't flush src stages when dstStageMask == BOTTOM_OF_PIPE
Original patch by Fredrik Höglund.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
9efa3405a7 radv: do not set preserveAttachments for internal render passes
We don't use that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
80e809d993 radv: drop useless checks when resolving subpass color attachments
The Vulkan spec says:
   "If pResolveAttachments is not NULL, for each resolve attachment
    that does not have the value VK_ATTACHMENT_UNUSED, the
    corresponding color attachment must not have the value
    VK_ATTACHMENT_UNUSED."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
76c17cfd8d radv: execute external subpass barriers after ending subpasses
Outgoing dependencies (ie. external) should happen after the subpass.
This doesn't change anything for subpass resolves as we already
make sure that attachments are shader readable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
b482c030f5 radv: accumulate all ingoing external dependencies to the first subpass
In case two or more subpasses declare ingoing external dependencies.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
eaab35e5e3 radv: handle subpass dependencies correctly
The different masks should be accumulated. For example if two
subpasses declare an outgoing dependency (ie. dst ==
VK_SUBPASS_EXTERNAL).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
6430616e77 radv: track if subpasses have color attachments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
1e810f1c53 radv: add radv_render_pass_add_subpass_dep() helper
To share common code that handles subpass dependencies.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
2472907563 radv: move some render pass things to radv_render_pass_compile()
radv_render_pass_compile() is common to vkCreateRenderPass()
and vkCreateRenderPass2().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset
b509013060 radv: handle final layouts at end of every subpass and render pass
That shouldn't change anything as we check if the last
subpass id is the final subpass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:18:38 +01:00
Samuel Pitoiset
5699ac0078 radv: determine the last subpass id for every attachments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:59 +01:00
Samuel Pitoiset
e1a0a268c6 radv: use the new attachments array when starting subpasses
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:57 +01:00
Samuel Pitoiset
a20c2e38d8 radv: store the list of attachments for every subpass
This reworks how the depth stencil attachment is used for
simplicity. This also introduces radv_render_pass_compile()
helper that will be used for further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:54 +01:00
Samuel Pitoiset
a7c7d811f1 radv: move subpass image transitions to radv_cmd_buffer_begin_subpass()
Instead of doing them in radv_cmd_buffer_set_subpass().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:52 +01:00
Samuel Pitoiset
291a933786 radv: add radv_cmd_buffer_begin_subpass() helper
To unify some code in BeginRenderPass() and NextSubpass().
Based on Intel ANV driver.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:50 +01:00
Samuel Pitoiset
41199e2eeb radv: remove useless MAYBE_UNUSED in CmdBeginRenderPass()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:46 +01:00
Samuel Pitoiset
545552c9b9 radv: remove unused radv_render_pass_attachment::view_mask
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:42 +01:00
Samuel Pitoiset
0f932bbede radv: bail out when no image transitions will be performed
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:40 +01:00
Marek Olšák
1e85cfb91a meson: drop the xcb-xrandr version requirement
autotools doesn't have any requirement. This fixes meson on Ubuntu 16.04.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-03 18:39:57 -05:00
Eric Engestrom
808bf59cac wsi/display: add comment
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2019-02-02 23:08:03 +00:00
Jason Ekstrand
0aa5a97b03 relnotes: Add VK_EXT_buffer_device_address 2019-02-02 08:42:14 -06:00
Jason Ekstrand
48ed2a7bb0 anv: Implement VK_EXT_buffer_device_address
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-01 17:09:42 -06:00
Jason Ekstrand
e644ed468f intel/fs: Implement nir_intrinsic_global_atomic_*
eviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
a91f392073 intel/fs: Use SENDS for A64 writes on gen9+
eviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
1c25bf4373 intel/fs: Implement load/store_global with A64 untyped messages
eviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
b4f0d062cd intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode
Previously, we only applied the fix to shaders with a dispatch mode of
SIMD8 but the code it relies on for SIMD16 mode only applies to SIMD16
instructions.  If you have a SIMD8 instruction in a SIMD16 shader,
neither would trigger and the restriction could still be hit.

Fixes: 232ed89802 "i965/fs: Register allocator shoudn't use grf127..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
79724a0756 intel/fs: Properly handle 64-bit types in LOAD_PAYLOAD
By just assigning dst.type to src[i].type, we ensure that the offset at
the end of the loop actually offsets it by the right number of
registers.  Otherwise, we'll get into a case where we copy with a Q type
and then offset with a D type and things get out of sync.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:10:57 -06:00
Jason Ekstrand
f02914a991 intel/fs/cse: Split create_copy_instr into three cases
Previously, we tried to combine all cases where the instruction being
CSE'd writes to more than one MOV worth of registers into one case with
a bit of special casing for LOAD_PAYLOAD.  This commit splits things so
that LOAD_PAYLOAD is entirely it's own case.  This makes tweaking the
LOAD_PAYLOAD case simpler in the next commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:10:40 -06:00
Jason Ekstrand
f409a08e5f intel/nir: Add global support to lower_mem_access_bit_sizes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:08:29 -06:00
Oscar Blumberg
fea5b8e5ad intel/fs: Fix memory corruption when compiling a CS
Missing check for shader stage in the fs_visitor would corrupt the
cs_prog_data.push information and trigger crashes / corruption later
when uploading the CS state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 10:53:33 -08:00
Jason Ekstrand
ab940b0d97 spirv: Support LocalSizeId and LocalSizeHintId execution modes
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Jason Ekstrand
7223590c42 spirv: Handle OpExecutionModeId
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Jason Ekstrand
e68871f6a4 spirv: Handle constants and types before execution modes
We already defer handling the actual execution modes until after we've
created the shader.  This just moves it a tiny bit further so we
actually have constants and types and can handle OpExecutionModeId.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Jason Ekstrand
7d862ef530 spirv: Rework handling of spec constant workgroup size built-ins
Instead of handling it as part of the handling of constant instructions,
just stash the vtn_value when we see the decoration and handle it
explicitly later.  This will let us re-order handling of constant
instructions without breaking the Vulkan SPIR-V requirement that
decorating a specialization constant as the WorkgroupSize built-in
overrides the workgroup size set as an execution mode.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Jason Ekstrand
9b37e93e42 spirv: Replace vtn_constant_value with vtn_constant_uint
The uint version is less typing, supports different bit sizes, and is
probably a bit more safe because we're actually verifying that the
SPIR-V value is an integer scalar constant.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-01 17:34:02 +00:00
Samuel Pitoiset
5e7f800f32 radv: fix build
Fixes: 9b9ccee4d6 ("radv: take LDS into account for compute shader occupancy stats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-01 15:31:55 +01:00
Timothy Arceri
9b9ccee4d6 radv: take LDS into account for compute shader occupancy stats
Ported from d205faeb6c.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-01 22:25:30 +11:00
Timothy Arceri
a53d68d318 ac/radv/radeonsi: add ac_get_num_physical_sgprs() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-01 22:25:30 +11:00
Gurchetan Singh
574186f0e8 docs: add GL_EXT_texture_compression_s3tc_srgb to release notes
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-01 10:01:59 +00:00
Gurchetan Singh
dc9a15aefb st/mesa: expose EXT_texture_compression_s3tc_srgb
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-01 10:01:59 +00:00
Gurchetan Singh
a2ab400719 i965: Set flag for EXT_texture_compression_s3tc_srgb
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-01 10:01:59 +00:00
Gurchetan Singh
db24132d80 mesa/main: Expose EXT_texture_compression_s3tc_srgb
Required for the following test:

bin/compressedteximage GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT

pass when emulating GL on GLES.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-01 10:01:59 +00:00
Timothy Arceri
0f3a8e1b64 st/glsl_to_nir: remove dead local variables
Without this we do not end up with a deterministic NIR because
temporary register variables are added in random order. NIR must
be deterministic because we use it to produce a sha for the
radeonsi backends disk cache.

This fixes the shader cache for a bunch of shaders.

Another positive is that this results in a large reduction in the
size of the NIR that the state tracker stores to the disk cache.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 15:56:02 +11:00
Dylan Baker
4052142de7 meson: remove -std=c++11 from intel/tools
for meson all C++ code is already compiled as C++11, so it's
unnecessary. It's also the wrong way to do this, if we really needed
this the correct way is to set:

```meson
executable(
  ...
  override_options : ['cpp_std=c++11'],
)
```

Which ensures not only that the correct syntax for the current
compiler is used, but also that meson doesn't create arguments like
`-std=c++14 ... -std=c++11`

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Dylan Baker
8e49b32f63 meson: fix style in intel/tools
The `:` in options should always have one space before and after `foo
: bar`, and lists do not get spaces around the braces: `[foo]` not `[
foo ]`

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Dylan Baker
d93d53fa72 meson: remove build_by_default : true
Which is and has always been the default. This is largely an artifact
of how the building of these tools was controlled when the meson build
was originally created.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Emil Velikov
1240c3cb10 docs: update calendar, add news item and link release notes for 18.3.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-31 21:17:38 +00:00
Emil Velikov
83160c6c05 docs: add sha256 checksums for 18.3.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7475d7727f)
2019-01-31 21:15:20 +00:00
Emil Velikov
4d0732dc39 docs: add release notes for 18.3.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 190a79f462)
[Emil: drop VERSION hunk]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

 Conflicts:
	VERSION
2019-01-31 21:14:56 +00:00
Neha Bhende
69d736b17a st/mesa: Fix topogun-1.06-orc-84k-resize.trace crash
We need to initialize all fields in rs->prim explicitly while
creating new rastpos stage.

Fixes: bac8534267 ("st/mesa: allow glDrawElements to work with GL_SELECT
feedback")

v2: Initializing all fields in rs->prim as per Ilia.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-31 12:21:59 -07:00
Dylan Baker
c812c740e6 android,autotools,i965: Fix location of float64_glsl.h
Android.mk and autotools disagree about where generated files should
go, which wasn't a problem until we wanted to build a dist
tarball. This corrects the problem by changing the output and include
paths to be the same on android and autotools (meson already has the
correct include path).

Fixes: 7d7b30835c
       ("automake: Fix path to generated source")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-31 19:04:30 +00:00
Marek Olšák
d49c16a597 gallium: allow more PIPE_RESOURCE_ driver flags
radeonsi has 8 and will probably have 9 soon.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-31 13:10:42 -05:00
Eric Anholt
ab4d5775b0 v3d: Fix image_load_store clamping of signed integer stores.
This was copy-and-paste fail, that oddly showed up in the CTS's
reinterprets of r32f, rgba8, and srgba8 to rgba8i, but not r32ui and r32i
to rgba8i or reinterprets to other signed int formats.

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")
2019-01-31 08:39:40 -08:00
Eric Anholt
db2ae51121 mesa: Skip partial InvalidateFramebuffer of packed depth/stencil.
One of the CTS cases tries to invalidate just stencil of packed
depth/stencil, and we incorrectly lost the depth contents.

Fixes dEQP-GLES3.functional.fbo.invalidate.whole.unbind_read_stencil
Fixes: 0c42b5f3cb ("mesa: wire up InvalidateFramebuffer")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-31 08:37:46 -08:00
Rob Clark
39cfdf9930 freedreno: more fixing release tarball
Fixes: aa0fed10d3 freedreno: move ir3 to common location
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-31 09:59:18 -05:00
Rob Clark
e252656d14 freedreno: fix release tarball
Fixes: b4476138d5 freedreno: move drm to common location
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-31 09:59:18 -05:00
Emmanuel Gil Peyrot
0d4dd59ae5 docs: make bugs.html easier to find
Thanks to Yann Kervran for the report and suggestions.

Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-31 14:31:48 +00:00
Dave Airlie
9279a28f07 virgl: ARB_query_buffer_object support
v1.1: fix size define.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-01-31 11:23:38 +10:00
Dave Airlie
38658c6d4d virgl: enable elapsed time queries
GL underneath always has GL_TIME_ELAPSED so always enable these.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-01-31 11:23:30 +10:00
Dylan Baker
da48cba61e automake: Add --enable-autotools to distcheck flags
Fixes: e68777c87c
       ("autotools: Deprecate the use of autotools")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-01-30 19:32:44 +00:00
Marek Olšák
ffbd37d8e9 radeonsi: fix a comment typo in si_fine_fence_set 2019-01-30 14:32:05 -05:00
Marek Olšák
f4eb746ef7 r600: add -Wstrict-overflow=0 to meson to silence the warning
same as radeonsi
2019-01-30 12:49:45 -05:00
Marek Olšák
d50bef9831 winsys/amdgpu: remove amdgpu_drm.h definitions
trivial
2019-01-30 12:38:56 -05:00
Marek Olšák
16672f16da radeonsi: unify error paths in si_texture_create_object 2019-01-30 12:35:22 -05:00
Marek Olšák
2361558eb7 radeonsi: merge & rename texture BO metadata functions 2019-01-30 12:35:22 -05:00
Marek Olšák
1c12d56e4d radeonsi: enable dithered alpha-to-coverage for better quality
same as AMDVLK.

GL_NV_alpha_to_coverage_dither_control allows controlling this behavior.
The default is implementation-dependent.
2019-01-30 12:35:22 -05:00
Dylan Baker
b4986d2e0c gallium: wrap u_screen in extern "C" for c++
Some drivers (notabily SWR) are written in C++, and as such they need
access to C headers with extern "C". So lets add that.
2019-01-30 15:12:27 +00:00
Gert Wollny
45903cddc3 mesa/core: Enable EXT_texture_sRGB_R8 also for desktop GL
As of Nov/30/2018 the extension is also valid for OpenGL >= 1.2, so
enable it accordingly and also add the required view class entry.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-30 11:32:40 +00:00
Samuel Pitoiset
9c762c01c8 radv/winsys: fix hash when adding internal buffers
This fixes serious stuttering in Shadow Of The Tomb Raider.

Fixes: 50fd253bd6 ("radv/winsys: Add priority handling during submit.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-30 12:29:10 +01:00
Erik Faye-Lund
3b6f95ad66 mesa: expose NV_conditional_render on GLES
The extension spec has been updated to include GLES 2 support, so let's
enable it there.

v2: fixup ABI-check as well

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-30 09:43:44 +01:00
Ernestas Kulik
90458bef54 v3d: Fix leak in resource setup error path
Reported by Coverity: in the case of unsupported modifier request, the
code does not jump to the “fail” label to destroy the acquired resource.

CID: 1435704
Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
2019-01-29 16:14:13 -08:00
Ernestas Kulik
f6e49d5ad0 vc4: Fix leak in HW queries error path
Reported by Coverity: in the case where there exist hardware and
non-hardware queries, the code does not jump to err_free_query and leaks
the query.

CID: 1430194
Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Fixes: 9ea90ffb98 ("broadcom/vc4: Add support for HW perfmon")
2019-01-29 16:14:13 -08:00
Eric Anholt
6053c7bb43 v3d: Fix a release build set-but-unused compiler warning. 2019-01-29 16:02:51 -08:00
Eric Anholt
0c05198d6b v3d: Always enable the NEON utile load/store code.
I can't imagine the new HW block being paired with a v6 CPU, so don't
bother with the CPU detection that vc4 had to do.

Improves 1024x1024 TexImage on my 7278 by 47.3229% +/- 0.679632%
2019-01-29 16:00:25 -08:00
Emil Velikov
385843ac3c vc4: Declare the last cpu pointer as being modified in NEON asm.
Earlier commit addressed 7 of the 8 instances available.

v2: Rebase patch back to master (by anholt)

Cc: Carsten Haitzler (Rasterman) <raster@rasterman.com>
Cc: Eric Anholt <eric@anholt.net>
Fixes: 300d3ae8b1 ("vc4: Declare the cpu pointers as being modified in NEON asm.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-29 16:00:25 -08:00
Dylan Baker
75ad254acf docs: Add relnotes stub for 19.1 2019-01-29 15:32:16 -08:00
Dylan Baker
dba0989ac1 bump version for 19.0 branch 2019-01-29 15:30:25 -08:00
Dylan Baker
90a7a9c973 automake: Add include dir for nir src directory
Fixes: 6281f26f06
       ("v3d: Add support for shader_image_load_store.")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-29 23:24:57 +00:00
Dylan Baker
82365595e9 automake: Add float64.glsl to dist tarball
Fixes: b63a1f8e40
       ("glsl: Create file to contain software fp64 functions")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-29 23:24:57 +00:00
Dylan Baker
7d7b30835c automake: Fix path to generated source
Fixes: b63a1f8e40
       ("glsl: Create file to contain software fp64 functions")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-29 23:24:57 +00:00
Matt Turner
9de90caca8 nir: Optimize double-precision lower_round_even()
Use the trick of adding and then subtracting 2**52 (52 is the number of
explicit mantissa bits a double-precision floating-point value has) to
implement round-to-even.

Cuts the number of instructions on SKL of the piglit test
fs-roundEven-double.shader_test from 109 to 21.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-29 15:02:23 -08:00
Marek Olšák
3e249b853e ac: use the correct LLVM processor name on Raven2
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-29 17:46:55 -05:00
Eric Anholt
f7769b5121 v3d: Fix the autotools build.
Noticed while looking at the gitlab-CI MR.
2019-01-29 14:00:27 -08:00
Jonathan Marek
31a1348a66 freedreno: fix sysmem rendering being used when clear is used
This batch->cleared value is only used to decide to use sysmem rendering
or not, so it should include any buffers that are affected by a clear.

This is required because the a2xx fast clear doesn't work with sysmem
rendering. The a22x "normal" clear path doesn't work with sysmem either.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-29 20:22:33 +00:00
Jonathan Marek
c93d77431f freedreno: fix depth usage logic
Depth can be used even when there is no restore/resolve of depth. This
happens when the depth buffer is invalidated after rendering to avoid
the resolve operation.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-29 20:22:33 +00:00
Jonathan Marek
bcefa0f1cb freedreno: fix invalidate logic
Set dirty bits on invalidate to trigger invalidate logic in fd_draw_vbo.

Also, resource_written for color needs to be after the invalidate logic.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-29 20:22:32 +00:00
Jonathan Marek
786f9639d6 mesa/st: wire up DiscardFramebuffer
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-29 20:22:32 +00:00
Rob Clark
0c42b5f3cb mesa: wire up InvalidateFramebuffer
And before someone actually starts implementing DiscardFramebuffer()
lets rework the interface to something that is actually usable.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-29 20:22:32 +00:00
Jonathan Marek
e685566612 st/dri: invalidate_resource depth/stencil before flush_resource
This allows freedreno to be aware of the depth invalidate when flushing
batches on flush_resource.

AFAIK, the only other driver which might care about this change is vc4,
where I think it should help by allowing the depth invalidate to work with
GALLIUM_HUD.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-29 20:22:32 +00:00
Mario Kleiner
820dfcea43 egl/wayland-drm: Only announce formats via wl_drm which the driver supports.
Check if a pixel format is supported by the Wayland servers gpu driver
before exposing it to the client via wl_drm, so we avoid reporting formats
to the client which the server gpu can't handle.

Restrict this reporting to the new color depth 30 formats for now, as the
ARGB/XRGB8888 and RGB565 formats are probably supported by every gpu under
the sun.

Atm. this is mostly useful to allow proper PRIME renderoffload for depth
30 formats on the typical Intel iGPU + NVidia dGPU "NVidia Optimus" laptop
combo.

Tested on Intel, AMD, NVidia with single-gpu setup and on a Intel + NVidia
Optimus setup.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-01-29 20:03:20 +00:00
Mario Kleiner
a34b0d68bb egl/wayland: Allow client->server format conversion for PRIME offload. (v2)
Support PRIME render offload between a Wayland server gpu and a Wayland
client gpu with different channel ordering for their color formats,
e.g., between Intel drivers which currently only support ARGB2101010
and XRGB2101010 import/display and nouveau which only supports ABGR2101010
rendering and display on nv-50 and later.

In the wl_visuals table, we also store for each format an alternate
sibling format which stores colors at the same precision, but with
different channel ordering, e.g., ARGB2101010 <-> ABGR2101010.

If a given client-gpu renderable format is not supported by the server
for import, but the alternate format is supported by the server, expose
the client-gpu renderable format as a valid EGLConfig to the client. At
eglSwapBuffers time, during the blitImage() detiling blit from the client
backbuffer to the linear buffer, the client format is converted to the
server supported format. As we have to do a copy for PRIME anyway,
this channel swizzling conversion comes essentially for free.

Note that even if a server gpu in principle does support sampling
from the clients native format, this conversion will be a performance
advantage if it allows to convert to the servers preferred format
for direct scanout, as the Wayland compositor may then be able to
directly page-flip a fullscreen client wl_buffer onto the primary
plane, or onto a hardware overlay plane, avoiding an extra data copy
for desktop composition.

Tested so far under Weston with: nouveau single-gpu, Intel single-gpu,
AMD single-gpu, "Optimus" Intel server iGPU for display + NVidia
client dGPU for rendering.

v2: Implement minor review comments by Eric Engestrom: Add some
    comment and assert, and some style fixes for clarity.
    No functional change.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-01-29 20:03:20 +00:00
Jason Ekstrand
a920979d4f intel/fs: Use split sends for surface writes on gen9+
Surface reads don't need them because they just have the one address
payload.  With surface writes, on the other hand, we can put the address
and the data in the different halves and avoid building the payload all
together.

The decrease in register pressure and added freedom in register
allocation resulting from this change reduces spilling enough to improve
the performance of one customer benchmark by about 2x.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
014edff0d2 intel/fs: Add interference between SENDS sources
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
eab1c55590 intel/fs: Support SENDS in SHADER_OPCODE_SEND
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
cca199fd85 intel/disasm: Properly disassemble split sends
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
8babaa84e8 intel/eu: Add support for the SENDS[C] messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d6a6e10390 intel/inst: Indent some code
We're about to add some more if cases so let's have the giant re-indent
in it's own patch to make review easier.

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d96969120d intel/inst: Fix the ia16_addr_imm helpers
These have clearly never seen any use.... On gen8, the bottom 4 bits are
missing so we need to shift them off before we call set_bits and shift
again when we get the bits.  Found by inspection.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
e46fb33143 intel/disasm: Rework SEND decoding to use descriptors
Instead of fetching the information out of the instruction directly,
fetch the descriptor and then pluck the information out of the
descriptor.  The current scheme works ok for SEND but with SENDS, it all
falls to pieces because the descriptor is completely shuffled around.

This commit doesn't actually convert everything.  One notable exception
is URB messages which don't even use descriptors in emit_urb_WRITE yet.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
13a6fabc62 intel/eu: Add more message descriptor helpers
We want to be able to extract data from descriptors as well as unify a
bit of the descriptor construction.

One of the unifications we do is to unify the read/write and dataport
descriptors.  On gen4-5, read/write are substantially different and the
read descriptors change between gen4 and gen4.x.  On gen6, they unified
layouts between read, write, and dataport.  Then, on gen8, they added
one bit to the message type field but left it reserved MBZ for
read/write messages.  This commit chooses to treat that as if they
expanded the field everywhere and just didn't have enough enum values
for read/write to bother with the extra bit.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
c3aa436bfe intel/eu/validate: SEND restrictions also apply to SENDC
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
fee6bd8d8e intel/eu: Use GET_BITS in brw_inst_set_send_ex_desc
It's a bit more readable

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
b284d222db intel/fs: Use SHADER_OPCODE_SEND for varying UBO pulls on gen7+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
8514eba693 intel/fs: Use SHADER_OPCODE_SEND for texturing on gen7+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
f547cebbe0 intel/fs: Use a logical opcode for IMAGE_SIZE
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d2d3e04501 intel/fs: Use SHADER_OPCODE_SEND for surface messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
7f1cf046cd intel/fs: Add a generic SEND opcode
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
ba3c5300f9 intel/eu: Rework surface descriptor helpers
This commit pulls the surface descriptor helpers out into brw_eu.h and
makes them no longer depend on the codegen infrastructure.  This should
allow us to use them directly from the IR code instead of the generator.
This change is unfortunately less mechanical than perhaps one would like
but it should be fairly straightforward.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
5b17379631 intel/eu: Add has_simd4x2 bools to surface_write functions
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
2ce93b88c0 intel/fs: Take an explicit exec size in brw_surface_payload_size()
Instead of magically falling back to SIMD8 for atomics and typed
messages on Ivy Bridge, explicitly figure out the exec size and pass
that into brw_surface_payload_size.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
cf42b0f9e2 intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf()
Like all the other sends, it's just mlen * REG_SIZE.

Fixes: 3cbc02e469 "intel: Use TXS for image_size when we have..."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
009c0bd840 intel/defines: Explicitly cast to uint32_t in SET_FIELD and SET_BITS
If you pass a bool in as the value to set, the C standard says that it
gets converted to an int prior to shifting.  If you try to set a bool to
bit 31, this lands you in undefined behavior.  It's better just to add
the explicit cast and let the compiler delete it for us.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
077b9557a4 intel/fs: Get rid of fs_inst::equals
There are piles of fields that it doesn't check so using it is a lie.
The only reason why it's not causing problem is because it has exactly
one user which only uses it for MOV instructions (which aren't very
interesting) and only on Sandy Bridge and earlier hardware.  Just get
rid of it and inline it in the one place that it's actually used.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Rob Clark
446a14bc0a freedreno: minor cleanups
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-29 12:30:50 -05:00
Rob Clark
c3baa077bf freedreno: stop frob'ing pipe_resource::nr_samples
Previously we tried to normalize nr_samples to MAX2(1, nr_samples) to
avoid having to deal with 0 vs 1 everywhere.  But this causes problems
in mesa/st, for example st_finalize_texture() will think there is a
nr_samples mismatch and recreate the texture.  Somehow this manifests
as corrupt x11 font rendering on generations that do not support MSAA
(but apparently works fine on a5xx and a6xx which do support MSAA.)

Fixes: cf0c7258ee freedreno/a5xx: MSAA
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-29 12:30:50 -05:00
Rob Clark
1a6ddfe5ee freedreno/a6xx: fix blitter nr_samples check
nr_samples for non-MSAA case could be either zero or one.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-29 12:22:08 -05:00
Rob Clark
9106a0fe33 freedreno/a5xx: fix blitter nr_samples check
nr_samples for non-MSAA case could be either zero or one.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-29 12:21:19 -05:00
Bas Nieuwenhuizen
69edc972fc radv: Enable VK_EXT_memory_priority.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-29 15:56:56 +01:00
Bas Nieuwenhuizen
50fd253bd6 radv/winsys: Add priority handling during submit.
Switched to the raw bo list api to avoid having to use 2 arrays for
everything.

This was introduced in libdrm 2.4.97 which we already depend upon.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-29 15:56:52 +01:00
Bas Nieuwenhuizen
ead54d4a42 radv/winsys: Set winsys bo priority on creation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-29 15:56:41 +01:00
Samuel Pitoiset
3a8d6c0880 radv: re-enable fast depth clears for 16-bit surfaces on VI
This has been disabled some months ago because it introduced
rendering issues with Shadow Of Warrier II (DXVK). This game is
no longer affected, I wonder if 824cfc1ee5 ("radv: rework the
TC-compat HTILE hardware bug with COND_EXEC") fixed the problem.
I checked The Forest on my Polaris, and it renders fine too.

According to Phillip, this gives +5.5% with Rise Of The Tomb
Raider and DXVK. This is because DXVK  uses 16-bit depth surfaces
while the native port from Feral uses 32-bit depth surfaces.

Unfortunately, Shadow Of The Tomb Raider isn't affected because
it clears each layer of a D16 array texture individually. So it
doesn't hit the fast clear path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-29 15:20:55 +01:00
Eric Anholt
932ed9c00b vc4: Enable NEON asm on meson cross-builds.
The core Mesa with_asm_arch and USE_ARM_ASM flags are disabled for meson
cross-builds because of the need to run host binaries on the build system.
vc4 doesn't need to do that, so skip with_asm_arch to enable NEON on my
cross-builds.

Fixes: ebcb4c2156 ("meson: Enable VC4's NEON assembly support.")
2019-01-28 16:45:48 -08:00
Carsten Haitzler (Rasterman)
300d3ae8b1 vc4: Declare the cpu pointers as being modified in NEON asm.
Otherwise, the compiler is free to reuse the register containing the input
for another call and assume that the value hasn't been modified.  Fixes
crashes on texture upload/download with current gcc.

We now have to have a temporary for the cpu2 value, since outputs must be
lvalues.

(commit message by anholt)

Fixes: 4d30024238 ("vc4: Use NEON to speed up utile loads on Pi2.")
2019-01-28 16:45:45 -08:00
Carsten Haitzler (Rasterman)
522f688471 vc4: Use named parameters for the NEON inline asm.
This makes the asm code more intelligible and clarifies the functional
change in the next commit.

(commit message and commit squashing by anholt)
2019-01-28 16:40:46 -08:00
Jonathan Marek
f6292c32cc kmsro: Add freedreno renderonly support
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:25:27 -05:00
Jonathan Marek
7d458c0c69 freedreno: a2xx: add perfcntrs
Based on a5xx perfcntrs implementation.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Jonathan Marek
cccec0b457 freedreno: a2xx: minor solid_vertexbuf fixups
The big thing here is the 0x60 offset for the mem2gmem copy which I missed
in my last patch.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Jonathan Marek
912a9c8d8c freedreno: a2xx: clear fixes and fast clear path
This fixes the depth/stencil clear on a20x, and adds a fast clear path.

The fast clear path is only used for a20x, needs performance tests on a22x.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Jonathan Marek
cb2322c7c0 freedreno: a2xx: a20x hw binning
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Jonathan Marek
501c6e70d4 freedreno: update a2xx registers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Timothy Arceri
fb78a6cb72 glsl: use remap location when serialising uniform program resource data
This allows us to avoid expensive string compares since we already have
a map to the pointers.

These compares were taking ~30 seconds for a single shader compile
in Godot due to it using 64,000+ uniforms.

Fixes: c4cff5f402 ("glsl: add basic support for resource list to shader cache")

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109229
2019-01-29 09:39:54 +11:00
Vinson Lee
be5b271ea7 meson: Fix typo.
meson.build:166:21: ERROR:  Unknown method "verson_compare" for a string.

Fixes: c1efa240c9 ("meson: Add warnings and errors when using ICC")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-28 10:47:32 -08:00
Jonathan Marek
7c930d99ad freedreno: a2xx: enable early-Z testing
Enable earlyZ when alpha test is disabled.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-28 13:04:41 -05:00
Jonathan Marek
32b1d2d716 freedreno: a2xx: ir2 cleanup
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-28 13:04:41 -05:00
Rob Herring
41a0acd6a1 Switch imx to kmsro and remove the imx winsys
The kmsro winsys is equivalent to the imx winsys, so we can switch
to it and remove the imx one.

Signed-off-by: Rob Herring <robh@kernel.org>
2019-01-28 11:50:08 -06:00
Rob Herring
827e0d6654 kmsro: Add etnaviv renderonly support
Enable using etnaviv for KMS renderonly. This still needs KMS driver
name mapping to kmsro to be used automatically.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Herring <robh@kernel.org>
2019-01-28 11:45:43 -06:00
Eric Anholt
272b6cf58f kmsro: Extend to include hx8357d.
This allows vc4 to initialize on the Adafruit PiTFT 3.5" touchscreen with
the hx8357d tinydrm driver

v2: Whitespace fix noted by Eric Engestrom, update commit message for the
    driver being merged.
v3: Rebase on Rob Herring's pipe-loader changes.

Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2019-01-28 09:35:45 -08:00
Rob Herring
511e7b6f61 pipe-loader: Fallback to kmsro driver when no matching driver name found
If we can't find a driver matching by name, then use the kmsro driver.
This removes the need for needing a driver descriptor for every possible
KMS driver.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-28 09:35:45 -08:00
Eric Anholt
ed65aeec78 pl111: Rename the pl111 driver to "kmsro".
The vc4 driver can do prime sharing to many different KMS-only devices,
such as the various tinydrm drivers for SPI-attached displays.  Rename the
driver away from "pl111" to represent what it will actually support:
various sorts of KMS displays with the renderonly layer used to attach a
GPU.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-28 09:35:45 -08:00
Samuel Pitoiset
afeef3cacf radv: set noalias/dereferenceable LLVM attributes based on param types
Instead of using this useless array_params_mask variable.
This should set these two attributes to streamout buffers too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:38 +01:00
Samuel Pitoiset
320b058d32 radv: simplify allocating user SGPRS for descriptor sets
Unnecesary to check the current stages if desc_set_used_mask
is used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:36 +01:00
Samuel Pitoiset
d1994ed229 radv: remove radv_userdata_info::indirect field
Always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:33 +01:00
Gert Wollny
212c0c630a mesa/main: Expose EXT_sRGB_write_control
Use EXT_framebuffer_sRGB to expose EXT_sRGB_write_control on GLES. Remove
the checks for desktion GL in the enable calls, since EXT_framebuffer_sRGB
now also indicates support for switching the linear-sRGB color
space conversion on GLES.

Thanks to Ilia Mirkin for all the helpful discussions that helped to rework
this series.

v2: Fix alphabetical listing of extensions (Tapani Pälli)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-01-28 12:18:40 +01:00
Gert Wollny
1013dfece1 mesa/main/version: Lower the requirements for GLES 3.0
GLES 3.0 does not actually require support for EXT_framebuffer_sRGB, it
only needs support for sRGB attachments to framebuffers and framebuffer
objects as defined in ARB_framebuffer_objects.

v2: Clarify that ARB_framebuffer_objects is needed.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
76c3f6fb3f mesa/main: Use flag for EXT_sRGB instead of EXT_framebuffer_sRGB where possible
All drivers that support EXT_framebuffer_sRGB also support EXT_sRGB, but
in order to keep this commit minial, and not to break any drivers both
flags are checked.

v2: - Use only EXT_sRGB (Ilia Mirkin)
    - Move adding the flag EXT_sRGB to gl_extensions to a separate patch

v3: use _mesa_has_EXT_framebuffer_sRGB instead of extension flag
    The _mesa_has function also checks for the correct versions and
    should be preferred over using the flags directly (Erik)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
8f9dfb7d88 mesa/st: rework support for sRGB framebuffer attachements
For GLES sRGB framebuffer attachemnt support is provided in two steps:
sRGB attachments like described in EXT_sRGB (and GLES 3.0) that enable
linear to sRGB color space transformation automatically, and the ability
to switch formats of the render target surface between sRGB and linear
that introduces full support for EXT_framebuffer_sRGB.
Set the according flags to reflect these two levels of sRGB support.

As a difference between desktopm GL and GLES, on desktop GL for a sRGB
framebuffer attachment the linear-sRGB conversion is turned off by default,
and for GLES it is turned on. This needs to be taken into account when
initally creating a surface, i.e. on desktop GL creation of a sRGB surface
is preferred, but on GLES sRGB surfaces are only created when explicitely
requested.

v2: - Use the new CAPS name

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
385081cd17 i965: Set flag for EXT_sRGB
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: <Gurchetan Singh gurchetansingh@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
7577c82fed mesa:main: Add flag for EXT_sRGB to gl_extensions
EXT_sRGB is an (incomplete) GLES extension that provides support for sRGB
framebuffer attachments, hence it can be used to check for this support
as an alternative to EXT_framebuffer_sRGB that provies the same
functionality but also sRGB write control support.

However, since EXT_sRGB  is incomplete and superseted by GLES 3.0 it will
not be exposed as an extension.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Gert Wollny
2845939d6a virgl: Set sRGB write control CAP based on host capabilities
v2: - Use the renamed CAPS
    - add assetions to make sure that mesa doesn't try to switch
      destination surface formats when it is not supported. (Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-01-28 12:18:40 +01:00
Gert Wollny
8021f1875e Gallium: Add new CAPS to indicate whether a driver can switch SRGB write
Add a new cap that indicates whether the drivers supports
enabling/disabling the conversion from linear space to sRGB
for a framebuffer attachment. In Driver terms that this CAP indicates
whether the driver can switcht between a linear and and a sRGB surface
format for draw destinations witout changing the sourface itself.

v2: rename CAP to DEST_SURFACE_SRGB_CONTROL to reflect its
    purpouse better (pointed out by Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-28 12:18:40 +01:00
Neil Roberts
75b3719c4f spirv: Don't use special semantics when counting vertex attribute size
Under Vulkan, the double vertex attributes take up the same size
regardless of whether they are vertex inputs or any other stage
interface.

Under OpenGL (ARB_gl_spirv), from GLSL 4.60 spec, section 4.3.9
Interface Blocks:

   "It is a compile-time error to have an input block in a vertex
    shader or an output block in a fragment shader. These uses are
    reserved for future use."

So we also don't need to check if it is an vertex input or not, and
use false in any case.

v2: (changes made by Alejandro Piñeiro)
    * Update required after "spirv: Handle location decorations on
      block interface members" own updates (original patch was sent
      several months ago)
    * After Neil suggesting it, confirm that this change can be also
      done for OpenGL (ARB_gl_spirv). Expand commit message.

v3: update after changing name of main method on a previous patch

Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 11:42:46 +01:00
Neil Roberts
5c797f7354 glsl_types: Rename parameter of glsl_count_attribute_slots
glsl_count_attribute_slots takes a parameter to specify whether the
type is being used as a vertex input because on GL double attributes
only take up one slot. Vulkan doesn’t make this distinction so this
patch renames the argument to is_gl_vertex_input in order to make it
more clear that it should always be false on Vulkan.

v2: minor variable renaming (s/member/member_type) (Tapani)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 11:42:46 +01:00
Neil Roberts
dfc3a7cb3c spirv/nir: handle location decorations on block interface members
Previously the code was taking any location decoration on the block
and using that to calculate the member locations for all of the
members. I think this was assuming that there would only be one
location decoration for the entire block. According to the Vulkan spec
it is possible to add location decorations to individual members:

   “If the structure type is a Block but without a Location, then each
    of its members must have a Location decoration. If it is a Block
    with a Location decoration, then its members are assigned
    consecutive locations in declaration order, starting from the
    first member which is initially the Block. Any member with its own
    Location decoration is assigned that location. Each remaining
    member is assigned the location after the immediately preceding
    member in declaration order.”

This patch makes it instead keep track of which members have been
assigned an explicit location. It also has a space to store the
location for the struct as a whole. Once all the decorations have been
processed it iterates over each member to fill in the missing
locations using the rules described above.

So, this commit is needed to get working a case like this, on both
Vulkan and OpenGL using SPIR-V (ARB_gl_spirv):

     out block {
            layout(location = 2) vec4 c;
            layout(location = 3) vec4 d;
            layout(location = 0) vec4 a;
            layout(location = 1) vec4 b;
     } name;

v2: (changes made by Alejandro Piñeiro)
   * Update after introducing struct member splitting (See commit b0c643d)
   * Update after only exposing interface_type for blocks, not to any struct
   * Update after last changes done for xfb support

v3: use "assign" instead of "add" on the new method added (Tapani)

Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 11:42:46 +01:00
Christian Gmeiner
34458c1cf6 etnaviv: add linear sampling support
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-01-28 07:36:12 +01:00
Christian Gmeiner
42ca4dda2d etnaviv: update headers from rnndb
Update to etna_viv commit 4d2f857.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-01-28 07:36:09 +01:00
Christian Gmeiner
5b4a155d2b etnaviv: extend etna_resource with an addressing mode
Defines how sampler (and pixel pipes) needs to access the data
represented with a resource. The used default is mode is
ETNA_ADDRESSING_MODE_TILED.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-01-28 07:36:05 +01:00
Ilia Mirkin
d1d2bb8c07 nvc0: don't put text segment into bufctx
The text segment is shared among multiple contexts, while each one has
its own bufctx. So when reallocating the text segment, some contexts may
end up with stale values in their bufctx's. Instead limit the exposure
to the bufctx to within a single draw.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-27 21:47:09 -05:00
Timothy Arceri
0907ae35ad radv/ac: fix some fp16 handling
Fixes: b722b29f10 ("radv: add support for 16bit input/output")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 10:41:48 +11:00
Eric Anholt
c496b60ed8 v3d: Create separate sampler states for the various blend formats.
The sampler border color is encoded in the TMU's blending format (half
floats, 32-bit floats, or integers) and must be clamped to the format's
range unorm/snorm/int ranges by the driver.  Additionally, the TMU doesn't
know about how we're abusing the swizzle to support BGRA, A, and LA, so we
have to pre-swizzle the border color for those.

We don't really want to spend half a kb on sampler states in most cases,
so skip generating the variants when the border color is unused or is
0,0,0,0.
2019-01-27 08:30:03 -08:00
Eric Anholt
5fe4250a2c v3d: Move the sampler state to the long-lived state uploader.
Samplers are small (8-24 bytes), so allocating 4k for them is a huge
waste.
2019-01-27 08:30:03 -08:00
Eric Anholt
09472006ff v3d: Use the symbolic names for wrap modes from the XML. 2019-01-27 08:30:03 -08:00
Eric Anholt
c51d125d18 v3d: Fix stencil sampling from a separate-stencil buffer.
When the sampler view is in sample-stencil mode, we need to return uint
stencil values.  To do that, fill in the format table to return R8I, and
have the sampler view point at the separate stencil buffer.

Fixes dEQP-GLES31.functional.stencil_texturing.format.depth32f_stencil8_2d
2019-01-27 08:30:03 -08:00
Eric Anholt
8a0b0a8f37 v3d: Fix stencil sampling from packed depth/stencil.
We need to pick the 8-bit unorm value out, not the depth component.
2019-01-27 08:30:03 -08:00
Eric Anholt
fcdbd441a2 v3d: Fix release-build warning about utile_h. 2019-01-27 08:30:03 -08:00
Eric Anholt
edb1fcd963 v3d: Flush blit jobs immediately after generating them.
Fixes OOMs in the CTS's packed_pixels.varied_rectangle.* tests -- the
series of texture uploads at the start before texturing occurred would end
up all sitting around as cached jobs for reuse.  By flushing immediately,
peak active BO usage goes from 150M to 40M.

We could maybe put some limits on how many jobs we keep around, but blits
seem particularly unlikely to get reused for other drawing.
2019-01-27 08:30:03 -08:00
Eric Anholt
ac333ffa59 v3d: Fix BO stats accounting for imported buffers. 2019-01-27 08:30:03 -08:00
Eric Anholt
060575bea8 v3d: Drop maximum number of texture units down to 16.
This is the GLES 3.2 minmax, and also what the closed source driver does.
Avoids hitting OOMs in the CTS's
dEQP-GLES3.functional.texture.units.all_units.only_cube.1.
2019-01-27 08:30:03 -08:00
Eric Anholt
3e743d8cd8 v3d: Avoid duplicating limits defines between gallium and v3d core.
We don't want to pull the compiler into every include in the gallium
driver, so just make a new little header to store the limits.
2019-01-27 08:30:03 -08:00
Eric Anholt
fe6a21c867 v3d: Fix overly-large vattr_sizes structs.
We want one vector size per vector, not per component.
2019-01-27 08:30:03 -08:00
Eric Anholt
533b3f0541 v3d: Rename gallium-local limits defines from VC5 to V3D.
The compiler has its limits under V3D_* (like most V3D stuff), so sync up
with that.
2019-01-27 08:30:03 -08:00
Bas Nieuwenhuizen
b4870a15ae radv: Remove unused variable.
Trivial.
2019-01-27 13:51:35 +01:00
Niklas Haas
804cc44d09 radv: add device->instance extension dependencies
From the vulkan spec 33.3 "Extension Dependencies":

"Any device extension that has an instance extension dependency that is
not enabled by vkCreateInstance is considered to be unsupported, hence
it must not be returned by vkEnumerateDeviceExtensionProperties for any
VkPhysicalDevice child of the instance."

Therefore we need to check whether the instance-level extensions are
actually enabled when deciding to support a device-level extension or
not.

Furthermore, we need to do this for all instance-level extensions of any
(transitive) device-level extension dependency, due to the following
paragraph:

"If an extension is supported (as queried by
vkEnumerateInstanceExtensionProperties or
vkEnumerateDeviceExtensionProperties), then required extensions of that
extension must also be supported for the same instance or physical
device."

Finally, because some of these vulkan extensions may be implicitly
promoted to future vulkan core API versions, we can also satisfy the
dependency if the vulkan API version is high enough.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-27 13:50:35 +01:00
Niklas Haas
d12dc39396 radv: correctly use vulkan 1.0 by default
From the vulkan spec 3.2 "Instances":

"Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing an
apiVersion of 0 is equivalent to providing an apiVersion of
VK_MAKE_VERSION(1,0,0)."

Fixes: ffa15861ef "radv: UseEnumerateInstanceVersion for the default version."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-27 12:49:28 +01:00
Niklas Haas
d9bd3b1cb8 glsl: fix block member alignment validation for vec3
Section 7.6.2.2 (Standard Uniform Block Layout) of the GL spec says:

    The base offset of the first member of a structure is taken from the
    aligned offset of the structure itself. The base offset of all other
    structure members is derived by taking the offset of the last basic
    machine unit consumed by the previous member and adding one.

The current code does not reflect this last sentence - it effectively
instead aligns up the next offset up to the alignment of the previous
member. This causes an issue in exactly one case:

layout(std140) uniform block {
    layout(offset=0) vec3 var1;
    layout(offset=12) float var2;
};

As per section 7.6.2.1 (Uniform Buffer Object Storage) and elsewhere, a
vec3 consumes 3 floats, i.e. 12 basic machine units. Therefore, `var1`
in the example above consumes units 0-11, with 12 being the first
available offset afterwards. However, before this commit, mesa
incorrectly assumes `var2` must start at offset=16 when using explicit
offsets, which results in a compile-time error. Without explicit
offsets, the shaders actually work fine, indicating that mesa is already
correctly aligning these fields internally. (Just not in the code that
handles explicit buffer offset parsing)

This patch should fix piglit tests:
ssbo-explicit-offset-vec3.vert
ubo-explicit-offset-vec3.vert

Signed-off-by: Niklas Haas <git@haasn.xyz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-27 03:00:03 -05:00
Jason Ekstrand
86e5f76d3d spirv: Add support for SPV_EXT_physical_storage_buffer
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
fb282a68bc spirv: Implement OpConvertPtrToU and OpConvertUToPtr
This only implements the actual opcodes and does not implement support
for using them with specialization constants.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
837ed2ba51 spirv: Handle OpTypeForwardPointer
We handle forward declarations by creating the pointer type with it's
storage type based on storage class and just waiting to fill out the
actual deref type until we get the OpTypePointer.  Because any
composites using the forward declared type only care about the storage
type (i.e. uint64_t, uvec2, etc.) when creating their glsl_type, this
works fine and we can defer the actual deref_type as far as we need.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
4602e705e4 spirv: Drop a bogus assert
This was valid back when the only valid types of pointers were uint32
and uvec2.  Now that we're allowing more variety, it could be just about
anything so we'll just drop the assert.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
9e34781aef nir: Allow SSBOs and global to alias
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 13:41:50 -06:00
Jason Ekstrand
9839ce8bf9 nir/validate: Allow array derefs of vectors for nir_var_mem_global
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:39:18 -06:00
Jason Ekstrand
5f5503d498 nir/lower_io: Add support for nir_var_mem_global
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:39:18 -06:00
Jason Ekstrand
314d2c90c3 nir/lower_io: Add a 32 and 64-bit global address formats
These are simple scalar addresses.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 13:39:18 -06:00
Jason Ekstrand
e461926ef2 nir: Add load/store/atomic global intrinsics
These correspond roughly to reading/writing OpenCL global pointers.  The
idea is that they just take a bare address and load/store from it.  Of
course, exactly what this address means is driver-dependent.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 13:39:18 -06:00
Axel Davy
6380fedb60 st/nine: Enable debug info if NDEBUG is not set
We want to have debug info as well if using
meson's debugoptimized when ndebug is off.

v2: use u_debug functions that do something
even if DEBUG is not set.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2019-01-26 19:53:19 +01:00
Axel Davy
d7433c22e6 st/nine: Immediately upload user provided textures
Fixes regression caused by
42d672fa6a
st/nine: Bind src not dst in nine_context_box_upload

Before that patch, for user provided textures,
when the texture was destroyed, the safety
check for pending uploads, which according to
the code "Following condition cannot happen currently",
was flushing the queue and thus triggering the upload.

After the patch, the texture destruction was delayed after
the upload. However the user frees the texture buffer,
as it thinks the texture released.

Instead of reverting the faulty patch,
this patch instead flushes the csmt queue right away
after queuing the upload for this type of textures.
This is more future-proof, as we may want to bind the
surface for other reasons in the future.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-26 19:53:00 +01:00
Matt Turner
a7d629a590 i965: Always compile fp64 funcs when needed
Compilation of user-specified shaders with software fp64 works by
compiling on demand an "fp64-funcs" shader implementing various fp64
operations and then linking it into the "user shader".

In

   commit 64b8c86d37
   Author: Timothy Arceri <tarceri@itsqueeze.com>
   Date:   Thu Jan 17 17:16:29 2019 +1100

       glsl: be much more aggressive when skipping shader compilation

we changed the behavior of the shader cache to skip compilation earlier
when we get a cache hit.

After the aforementioned commit, compiling a user program using fp64
would store into the cache an entry for the fp64-funcs shader.
Subsequent compilations of uncached user shaders using fp64 would fail
in compile_fp64_funcs() after finding a cache entry for the fp64-funcs,
but being unprepared to read from the cache.

It's unclear to me how to retrieve the cached NIR of the fp64-funcs (if
it even is cached), so just call _mesa_glsl_compile_shader() with
force_recompile=true in order to ensure we generate the fp64-funcs
successfully.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-26 10:33:22 -08:00
Matt Turner
18b467c066 intel/compiler: Add a file-level description of brw_eu_validate.c
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-26 10:33:22 -08:00
Jonathan Marek
41ddf1d150 freedreno: add renderonly scanout
This allows creating a fd_screen with a renderonly object which will be
used to allocated scanout resources.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
[slight tweak to fix uninitialized 'prsc' in debug print]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-26 10:47:21 -05:00
Rob Clark
cd79b5e0c2 freedreno/a2xx: fix unused variable warning
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-26 10:44:31 -05:00
Timothy Arceri
8e9ad592c3 tgsi: remove culldist semantic from docs
The semantic was removed in e6d9389366.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-26 12:04:53 +11:00
Timothy Arceri
5d66f7103f ac/nir_to_llvm: fix clamp shadow reference for more hardware
Fixes the following piglit test on my VEGA and matches the behaviour in the
tgsi backend.

tests/spec/glsl-1.10/execution/samplers/glsl-fs-shadow2D-clamp-z.shader_test

Fixes: 625dcbbc45 ("amd/common: pass address components individually to ac_build_image_intrinsic")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-26 12:03:24 +11:00
Eric Anholt
08f4a904b3 gallium: Make sure we return is_unorm/is_snorm for compressed formats.
The util helpers were looking for a non-void channels in a non-mixed
format and returning its snorm/unorm state.  However, compressed formats
don't have non-void channels, so they always returned false.  V3D wants to
use util_format_is_[su]norm for its border color clamping workarounds, so
fix the functions to return the right answer for these.

This now means that we ignore .is_mixed.  I could retain the is_mixed
check, but it doesn't seem like a useful feature -- the only code I could
find that might care is freedreno's blit, which has some notes about how
things are wonky in this area anyway.

Reviewed-by: <Roland Scheidegger sroland@vmware.com>
2019-01-25 13:06:50 -08:00
Eric Anholt
104c7883e7 gallium: Fix comment about possible colorspaces.
Two typos, and missing one of the colorspaces.

Reviewed-by: <Roland Scheidegger sroland@vmware.com>
2019-01-25 13:06:47 -08:00
Eric Anholt
54abd2e084 gallium: Enable unit tests as actual meson unit tests.
These tests don't need swrast, so we can always enable them when
build_tests is set.  Most of them run to successful completion quickly
(.9s on my SKL).

Reviewed-by: <Roland Scheidegger sroland@vmware.com>
2019-01-25 13:06:45 -08:00
Emil Velikov
3b6aaab7e9 mapi: print function declarations for shared glapi
Earlier commit aimed to remove unneeded function declarations. Namely
OpenGL entrypoints which are not applicable for OpenGLES*

Although it did not consider the shared glapi which needs all,
including hidden ones. Resulting in warning/errors like the following

../build/src/mapi/shared-glapi/glapi_mapi_tmp.h:26014:15:
error: no previous prototype for ‘shared_dispatch_stub_1414’ [-Werror=missing-prototypes]

This patch addressed that.

Cc: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reported-by: Eric Anholt <eric@anholt.net>
Fixes: 6148cce388 ("mapi: drop unneeded gl_dispatch_stub declarations")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-01-25 13:04:04 -08:00
Rob Clark
4aa64940c6 freedreno: limit tiling to PIPE_BIND_SAMPLER_VIEW
1ce5d757d0 dropped this limit.. which is probably the right thing to
do.  But it results in an extra tiled->linear blit for glReadPixels()
(ie. dEQP/piglit) which is hitting some intermittent corruption (looks
like cache) on a6xx, causing a lot of spurious fails.

Since we are getting close to 19.0 branchpoint, re-instate this limit
for now, until the blitter problems are resolved.

Fixes: 1ce5d757d0 freedreno: core buffer modifier support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-25 10:20:05 -05:00
Samuel Pitoiset
378e2d2414 radv: fix computing number of user SGPRs for streamout buffers
Streamout buffers are emitted like push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-25 15:36:16 +01:00
Jose Fonseca
65b8d723fd appveyor: Revert commits adding Cygwin support.
This reverts commits 00ad77b9f6 and
5334dafee2.

This avoids Appveyor build breakage due to Cygwin, but more importantly,
there are several problems with these patches, as highlighted to my
recent mesa-dev mail.  So better to revert for now, and pursue Cygwin
support after these have been address.
2019-01-25 14:13:26 +00:00
Tapani Pälli
540939ecee android: fix build issues with libmesa_anv_gen* libraries
We need this include path to find nir/nir_xfb_info.h.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-25 15:21:06 +02:00
Andrii Simiklit
4759bb2fcf intel/batch-decoder: fix a vb end address calculation
According to the loop implementation (in 'ctx_print_buffer' function),
which advances dword by dword over vertex buffer(vb),
the vb size should be aligned by 4 bytes too.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-25 15:12:30 +02:00
Andrii Simiklit
db39a44f10 intel/batch-decoder: fix vertex buffer size calculation for gen<8
It should be incremented by one according to
how it is calculated by 'emit_vertex_buffer_state':
  "\#if GEN_GEN < 8
      .BufferAccessType = step_rate ? INSTANCEDATA : VERTEXDATA,
      .InstanceDataStepRate = step_rate,
   \#if GEN_GEN >= 5
      .EndAddress = ro_bo(bo, end_offset - 1),
   \#endif
   \#endif"

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-25 15:12:07 +02:00
Eric Engestrom
69e9440367 meson/vdpau: add missing soversion
This mirrors what autotools does in src/gallium/state_trackers/vdpau/Makefile.am
and src/gallium/targets/vdpau/Makefile.am:

  VDPAU_MAJOR = 1
  VDPAU_MINOR = 0
  libvdpau_gallium_la_LDFLAGS = -version-number $(VDPAU_MAJOR):$(VDPAU_MINOR)

Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
Fixes: 68076b8747 "meson: build gallium vdpau state tracker"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-25 12:10:00 +00:00
Eric Engestrom
9af77fcf98 anv: drop always-successful VkResult
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-25 09:45:27 +00:00
Rafael Antognolli
f2ece26601 anv/allocator: Avoid race condition in anv_block_pool_map.
Accessing bo->map and then pool->center_bo_offset without a lock is
racy. One way of avoiding such race condition is to store the bo->map +
center_bo_offset into pool->map at the time the block pool is growing,
which happens within a lock.

v2: Only set pool->map if not using softpin (Jason).
v3: Move things around and only update center_bo_offset if not using
softpin too (Jason).

Cc: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: Ian Romanick <idr@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109442
Fixes: fc3f588320
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-24 17:39:40 -08:00
Dylan Baker
c1efa240c9 meson: Add warnings and errors when using ICC
ICC tries to be helpful by not erroring when it sees something that it
doesn't understand, which is completely the opposite of helpful. Meson
0.49.0 does much better at handling this by really trying to make ICC
error, but there are some things in mesa that still get ignored until
0.49.1

v2: - Fix id check, which is 'intel' not 'icc'

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
2019-01-24 19:14:50 +00:00
Dylan Baker
7cb7f35bc7 meson: Fix compiler checks for SWR with ICC
This is a bit fragile, as the way this "fixes" the check is to move the
one that we know is correct before the one that is incorrectly reported
as working. In meson 0.49.1 (which isn't out yet) this is fixed that the
incorrect check is reported as a failure.

Fixes: e0b037d697
       ("meson: Build SWR driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109129
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-24 19:14:50 +00:00
Dylan Baker
3ba7ab8d2c meson: fix swr KNL build
There's a typo in one of the #defines that breaks compilation.

Fixes: e0b037d697
       ("meson: Build SWR driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109023
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-24 19:14:50 +00:00
Matt Turner
70a7ece035 gallivm: Return true from arch_rounding_available() if NEON is available
LLVM uses the single instruction "FRINTI" to implement llvm.nearbyint.
Fixes the rounding tests of lp_test_arit.

Bug: https://bugs.gentoo.org/665570
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-01-24 11:07:24 -08:00
Matt Turner
385ee7c3d0 gallium: Enable ASIMD/NEON on aarch64.
NEON (now called ASIMD) is available on all aarch64 CPUs. Our code was
missing an aarch64 path, leading to util_cpu_caps.has_neon always being
false on aarch64.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-24 11:07:24 -08:00
Dave Airlie
1f6b92b476 gallium: use put image shm2 path (v2)
This fixes the drisw paths to use the new shm2 interface, so that
we don't trigger the X server overflow checks when the x offset is non-zero.

This just hides the versioning in drisw, and either passes the src_x
or adds the offset fixup for the fallback path.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-01-25 04:27:45 +10:00
Dave Airlie
00af91ca46 glx: add support for putimageshm2 path (v2)
v2: pass x,0 in as the offset coords at glx level not earlier

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-01-25 04:27:45 +10:00
Dave Airlie
db83a2b40f dri_interface: add put shm image2 (v2)
This adds a new interface to the swrast interface to fix an shm put image bug.

The current code adds the x,y src offsets into the offset parameters,
however if the x offset is > 0, and the put image copies up to the height
of the image, this can trigger an X server validation check to fail and
the renderering to get BadMatch.

This patch fixes it to pass the x offset coord in as a src x.

We cannot pass the Y coordinate due to the horrible code mangling the
image w/h vs stride in swrastXPutImage.

v2: drop srcx,y from api

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-01-25 04:27:45 +10:00
Emil Velikov
281421e1bc mapi: remove machinery handling CSV files
We haven't have one in years, so just drop the code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
8a0012692a mapi: remove old, unused ES* generator code
As of earlier commit, everyone has switched to the new script for the ES
dispatch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
a41214ca3e mapi/es2api: remove no longer present entrypoints
With the previous scripts API from the following was incorrectly
exported. Drop them from the list, since they're no longer around.

GL_EXT_blend_func_extended
GL_EXT_texture_integer

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
05f8558b27 mapi/es*api: remove GL_EXT_multi_draw_arrays entrypoints
Now we use the upstream XML file and a cleaner generator. Thus the
symbols are no longer exported and we can drop them from this list.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
5661ce6c64 mapi/es*api: remove GL_OES_EGL_image entrypoints
As some point in the past we fixed the scripts so, these are no longer
exported. Drop them from the list.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
9f86f1da7c Revert "mapi/new: sort by slot number"
This reverts commit a1f5d9412cf7cacb3534635f6c2409fafbe6574e.

We no longer needed to sort - it was meant only to ease compare against
the old generated files.
2019-01-24 18:13:25 +00:00
Emil Velikov
3bf08292d2 scons: wire the new generator for es1 and es2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
0842bc879b meson: wire the new generator for es1 and es2
v2: use ${foo})_py naming (Dylan)
v3: use symbolic name for genCommon.py

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v2)
2019-01-24 18:13:25 +00:00
Emil Velikov
656845301d autotools: wire the new generator for es1 and es2
The output produced functionally identical, with the following changes:
 - A cosmetic: swapped ABI compatible types [ GLclampf -> GLfloat, etc ]
 - B cosmetic: renamed parameters [ zNear -> n, etc ]
 - C dropped extension entrypoints - invalid/incorrect

To make things easier to validate, normalise both old/new headers run
the sed patterns A, B and C to both sets.

A
      s/\<GLclampf\>/GLfloat/g; s/\<GLclampx\>/GLfixed/g;
      s/\<GLvoid\>/void/g;

B
      s/\ \* / */g; s/\<texture\>/target/g;
      s/\<plane\>/p/g; s/\<depth\>/d/g; s/\<modeAlpha\>/modeA/g;
      s/\<shader\>/program/g; s/\<obj\>/shaders/g; s/\<equation\>/eqn/g;
      s/\<param\>/data/g; s/\<params\>/data/g; s/\<buffers\>/buffer/g;
      s/\<src\>/mode/g; s/\<count\>/n/g; s/\<zNear\>/n/g; s/\<zFar\>/f/g;
      s/\<zfail\>/dpfail/g; s/\<zpass\>/dppass/g; s/\<buf\>/index/g;
      s/\<value\>/target/g; s/\<cap\>/target/g; s/\<maskNumber\>/index/g;
      s/\<srcRGB\>/sfactorRGB/g; s/\<dstRGB\>/dfactorRGB/g;
      s/\<srcAlpha\>/sfactorAlpha/g; s/\<dstAlpha\>/dfactorAlpha/g;
      s/\<primitiveMode\>/mode/g; s/\<primcount\>/instancecount/g;
      s/\<top\>/t/g; s/\<bottom\>/b/g; s/\<left\>/l/g; s/\<right\>/r/g;
      s/\<x\>/v0/g; s/\<y\>/v1/g; s/\<z\>/v2/g; s/\<w\>/v3/g;
      s/\<sfactor\>/mode/g; s/\<dfactor\>/dst/g; s/\<attribindex\>/bindingindex/g;
      s/\<internalFormat\>/internalformat/g; s/\<bufSize\>/bufsize/g;

C
glMultiDrawArraysEXT
glMultiDrawElementsEXT

glBindFragDataLocationEXT

glGetTexParameterIivEXT
glGetTexParameterIuivEXT
glTexParameterIivEXT
glTexParameterIuivEXT

v2:
 - gl_dispatch_stub declarations are addressed with previous patch
 - the public_entries table is no longer generated

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
389bc2bc6e mapi/new: remove duplicate GLvoid/void substitution
We already do it a few lines above - drop the duplicate.

Note that for consistency sake, we keep the substitution since the GL
API is a mixed bad - some use GLvoid while others a normal void.

We might want to merge this back in GLVND.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
5fa6c34949 mapi/new: fixup the GLDEBUGPROCKHR typedef to the non KHR one
This way we can reuse the latter, which is already present in the
headers that we use. Thus we can drop the manual typedef we generate.

We might want to merge this back in GLVND.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
babec55f7e mapi/new: don't print info we don't need for ES1/ES2
There is no need for the noop functions, the public_stubs and
public_entries table or table size defines. Remove those.

Pretty much all of this is applicable to GLVND, although it
requires preparatory work.

v2:
 - python style fixes (Dylan)
 - use "gldispatch" instead of not "glesv1" "glesv2"
 - remove the public_entries table/array (Erik)

v3:
 - use if == "gldispatch", instead of "in" (Kyle)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v2)
2019-01-24 18:13:25 +00:00
Emil Velikov
5b1bdce156 mapi/new: split out public_entries handling
The only instance that requires the public_entries table is the
dispatch library - split that into another function.

We have to be careful with when undefining the guard, so split it out.

We might want to merge this back in GLVND.
Minor GLVND cleanup will be needed first.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
313f977224 mapi/new: reinstate _NO_HIDDEN suffixes in the new generator
Strictly speaking we can rework the rest of the code so we do not need
those. That said, this will require a series on it's own so let's carry
this local quirk for now.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
451805f810 mapi/new: use the static_data offsets in the new generator
Otherwise the incorrect ones will be used, effectively breaking the ABI.

Note: some entries in static_data.py list a suffixed API, while (for ES*
at least) we expect the one w/o suffix.

v2:
 - rework path handling (Dylan)
 - use else if chain (Erik)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
bba375c016 mapi/new: sort by slot number
Makes it easier to compare the newly generated header against the old
one. Will be reverted after the transition.
2019-01-24 18:13:25 +00:00
Emil Velikov
06eb3fe371 mapi/new: import mapi scripts from glvnd
Currently we have over 20 scripts that generate the libGL* dispatch and
various other functionality. More importantly we're using local XML
files instead of the Khronos provides one(s). Resulting in an
increasing complexity of writing, maintaining and bugfixing.

One fairly annoying bug is handling of statically exported symbols.
Today, if we enable a GL extension for GLES1/2, we add a special tag to
the xml. Thus the ES dispatch gets generated, but also since we have no
separate notion of GL/ES1/ES2 static functions it also gets exported
statically.

This commit adds step one towards clearing and simplifying our setup.
It imports the mapi generator from GLVND.

  012fe39 ("Remove a couple of duplicate typedefs.")

v2: use local genCommon.py

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
cd0f11bac5 mapi: move genCommon.py to src/mapi/new
The helper will also be used by the new Khronos gl.xml aware generator.

v2: Move existing one, instead of duplicating it.
v3: Correct genCommon.py references in meson [Erik]
v4: Drop the file from the EGL EXTRA_DIST [Erik]

Suggested-by: Kyle Brenneman <kbrenneman@nvidia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
a08a793180 genCommon.py: Fix typo in _LIBRARY_FEATURE_NAMES.
Port glvnd commit 37fc6caa4b8 ("Fix typo in _LIBRARY_FEATURE_NAMES.")
from Michal Srb.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
cf317bf093 mapi: add all _glapi_table entrypoints to static_data.py
Currently various parts of mesa use the glapi_table differently.

Some use _glapi_get_proc_offset() to get the offset, while others
directly reference the specific offset via _gloffset_Function.

Add all static entries, to ensure things don't break as we flip to the
upstream XML + new mapi generator.

Note: the offsets are also used for the alias remap table, thus we need
to ensure we honour the correct offsets range or it will break.

Currently this is done via MAX_OFFSETS constant, although a better
solution is in the works.

v2: add FramebufferTexture2DMultisampleEXT
v3: add MAX_OFFSETS guard

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
fe9f5c0e21 mapi: sort static entrypoints numerically
A few of the entrypoints were incorrectly placed. Sort those to align
with the rest of the list.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Emil Velikov
5a81e8d40e Revert "mesa/main: remove ARB suffix from glGetnTexImage"
This reverts commit f1998e15ff.

This changes the ABI, such that glGetnTexImageARB entry-point from the
GLAPI gets removed. Thus accessing many functions by offset (as we do)
will result in getting the wrong one.

Follow-up work will swap the by-offset handling, but for now revert
this patch.

Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:25 +00:00
Erik Faye-Lund
6148cce388 mapi: drop unneeded gl_dispatch_stub declarations
These declarations are not used anywhere - be that generated code or
otherwise.

[Emil: format the hunk from Erik into a patch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:24 +00:00
Emil Velikov
ca152234e1 mesa: correctly use os.path.join in our python scripts
With Windows in mind, using forward slash isn't the right thing to do.
Even if it just works, we might want to fix it.

As here, use __file__ instead of argv[0] and sys.path.insert over
sys.path.append. With the path tweak being reportedly faster.

Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-24 18:13:24 +00:00
Emil Velikov
9cc8e12505 freedreno: automake: ship ir3_nir_trig.py in the tarball
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:24 +00:00
Eric Engestrom
8ed966b506 egl/glvnd: sync egl.xml from Khronos
Fixes: 98984b7cdd "egl: add glvnd entrypoints for EGL_MESA_query_driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 16:55:21 +00:00
Eric Engestrom
d2ca270511 travis: bump libdrm to 2.4.97
Fixes: c02f761bdf "winsys/amdgpu: use the new BO list API"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-24 14:50:33 +00:00
Veluri Mithun
85edfc04b8 egl: Implementation of egl dri2 drivers for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:52 +00:00
Eric Engestrom
98984b7cdd egl: add glvnd entrypoints for EGL_MESA_query_driver
Fixes: fbdd7bde29863935106c "egl: Implement EGL API for MESA_query_driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:47 +00:00
Veluri Mithun
6afce78128 egl: Implement EGL API for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:47 +00:00
Eric Engestrom
7d9274388b egl: update headers from Khronos
Cheating a tiny bit as these headers aren't in the Khronos repo yet, but
I expect them to be within a couple days.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:44 +00:00
Eric Engestrom
381d0e753a egl: finalize EGL_MESA_query_driver
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 14:37:36 +00:00
Matt Turner
e166003cb7 intel/compiler: Reset default flag register in brw_find_live_channel()
emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its
flag_subreg set, so that the IR knows which flag is accessed. However
the flag is only used on Gen7 in Align1 mode.

To avoid setting unnecessary bits in the instruction words, get the
information we need and reset the default flag register. This allows
round-tripping through the assembler/disassembler.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-23 22:48:29 -08:00
Kenneth Graunke
74c9c906f9 gallium: Add forgotten docs for PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS.
Thanks to Ilia for catching this.
2019-01-23 17:16:22 -08:00
Mark Janes
022800a058 Revert "Implement EGL API for MESA_query_driver"
This reverts commit ff621a5055.

with default warnings configuration, this commit generates:

   ../src/egl/main/eglapi.c:2654:1: error: no previous prototype for
            ‘eglGetDisplayDriverConfig’ [-Werror=missing-prototypes]
2019-01-23 16:29:13 -08:00
Mark Janes
9e9fa13c81 Revert "Implementation of egl dri2 drivers for MESA_query_driver"
This reverts commit 2720f78ef2.
2019-01-23 16:28:47 -08:00
Veluri Mithun
2720f78ef2 Implementation of egl dri2 drivers for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-23 22:29:14 +00:00
Veluri Mithun
ff621a5055 Implement EGL API for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-23 22:29:14 +00:00
Veluri Mithun
499869908b Add extension doc for MESA_query_driver
Signed-off-by: Veluri Mithun <velurimithun38@gmail.com>

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-23 22:29:14 +00:00
Sergii Romantsov
cfca5cd958 nir: Length of boolean vtn_value now is 1
During conversion type-length was lost due to math.

v2 (Jason Ekstrand):
 - Use a size/offset of 4 bytes

Fixes: 44227453ec (nir: Switch to using 1-bit Booleans for almost everything)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109353
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-23 15:43:06 -06:00
Marek Olšák
42aea4f1a7 st/mesa: fix PRIMITIVES_GENERATED query after the "pipeline stat single" changes
When this functionality was added, the PRIMITIVES_GENERATED query was
accidentally omitted. This causes issues for drivers that support
transform feedback."

Fixes: d644698b44 ("gallium: Add the ability to query a single
pipeline statistics counter")

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-23 14:32:57 -05:00
Marek Olšák
c89e8470e5 st/mesa: purge framebuffers when unbinding a context
This fixes pipe_surface "leaks".

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-01-23 14:32:57 -05:00
Erik Faye-Lund
5c17c01815 docs: add note about sending merge-requests from forks
Sending MRs from the main Mesa repository increase clutter in the
repository, and decrease visibility of project-wide branches. So it's
better if MRs are sent from forks instead.

Let's add a note about this, in case its not obvious to everyone.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-23 18:14:06 +01:00
Rob Clark
5a4af871e3 freedreno: set modifier when exporting buffer
Fixes an assert we start hitting with kms/gbm:

  #0  0x0000007fbf3d6e3c in raise () from /lib64/libc.so.6
  #1  0x0000007fbf3c4a68 in abort () from /lib64/libc.so.6
  #2  0x0000007fbf3d04e8 in __assert_fail_base () from /lib64/libc.so.6
  #3  0x0000007fbf3d0550 in __assert_fail () from /lib64/libc.so.6
  #4  0x0000007fbf5a73c4 in gbm_dri_bo_create (gbm=0x5820f0, width=2160, height=1440, format=875713112, usage=0, modifiers=0x695e00, count=1) at ../src/gbm/backends/dri/gbm_dri.c:1150
  #5  0x0000007fbf5a49c4 in gbm_bo_create_with_modifiers (gbm=0x5820f0, width=2160, height=1440, format=875713112, modifiers=0x695e00, count=1) at ../src/gbm/main/gbm.c:491
  #6  0x0000007fbbac3d64 in get_back_bo (dri2_surf=0x6f4cc0) at ../src/egl/drivers/dri2/platform_drm.c:258
  #7  0x0000007fbbac4318 in dri2_drm_image_get_buffers (driDrawable=0x704490, format=4098, stamp=0x6fc730, loaderPrivate=0x6f4cc0, buffer_mask=1, buffers=0x7fffffe210) at ../src/egl/drivers/dri2/platform_drm.c:409
  #8  0x0000007fbf5a5318 in image_get_buffers (driDrawable=0x704490, format=4098, stamp=0x6fc730, loaderPrivate=0x70e150, buffer_mask=1, buffers=0x7fffffe210) at ../src/gbm/backends/dri/gbm_dri.c:135
  #9  0x0000007fbe4308c4 in dri_image_drawable_get_buffers (drawable=0x6fc730, images=0x7fffffe210, statts=0x6f2660, statts_count=1) at ../src/gallium/state_trackers/dri/dri2.c:339
  #10 0x0000007fbe430c44 in dri2_allocate_textures (ctx=0x614b30, drawable=0x6fc730, statts=0x6f2660, statts_count=1) at ../src/gallium/state_trackers/dri/dri2.c:466
  #11 0x0000007fbe435580 in dri_st_framebuffer_validate (stctx=0x714160, stfbi=0x6fc730, statts=0x6f2660, count=1, out=0x7fffffe3b8) at ../src/gallium/state_trackers/dri/dri_drawable.c:85
  #12 0x0000007fbe7b2c84 in st_framebuffer_validate (stfb=0x6f2190, st=0x714160) at ../src/mesa/state_tracker/st_manager.c:222
  #13 0x0000007fbe7b4884 in st_api_make_current (stapi=0x7fbf0430d8 <st_gl_api>, stctxi=0x714160, stdrawi=0x6fc730, streadi=0x6fc730) at ../src/mesa/state_tracker/st_manager.c:1074
  #14 0x0000007fbe434f44 in dri_make_current (cPriv=0x703c20, driDrawPriv=0x704490, driReadPriv=0x704490) at ../src/gallium/state_trackers/dri/dri_context.c:301
  #15 0x0000007fbe42c910 in driBindContext (pcp=0x703c20, pdp=0x704490, prp=0x704490) at ../src/mesa/drivers/dri/common/dri_util.c:579
  #16 0x0000007fbbabab40 in dri2_make_current (drv=0x69d170, disp=0x69c6e0, dsurf=0x6f4cc0, rsurf=0x6f4cc0, ctx=0x70cb40) at ../src/egl/drivers/dri2/egl_dri2.c:1456
  #17 0x0000007fbbaa8ef4 in eglMakeCurrent (dpy=0x69c6e0, draw=0x6f4cc0, read=0x6f4cc0, ctx=0x70cb40) at ../src/egl/main/eglapi.c:862
  #18 0x0000007fbf5736ac in InternalMakeCurrentVendor (dpy=dpy@entry=0x614fb0, draw=draw@entry=0x6f4cc0, read=read@entry=0x6f4cc0, context=context@entry=0x70cb40, apiState=apiState@entry=0x6fc940, vendor=0x6975f0) at libegl.c:861
  #19 0x0000007fbf573764 in InternalMakeCurrentDispatch (dpy=0x614fb0, draw=0x6f4cc0, read=0x6f4cc0, context=0x70cb40, vendor=0x6975f0) at libegl.c:630
  #20 0x0000000000403640 in init_egl (egl=0x5805a8 <gl>, gbm=0x580528 <gbm>, samples=0) at ../common.c:263
  #21 0x0000000000403c1c in init_cube_smooth (gbm=0x580528 <gbm>, samples=0) at ../cube-smooth.c:225
  #22 0x0000000000408618 in main (argc=1, argv=0x7fffffe8d8) at ../kmscube.c:145

Fixes: 1ce5d757d0 freedreno: core buffer modifier support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-23 10:21:00 -05:00
Samuel Pitoiset
963c044c55 radv: always pass the GFX9 fence data to si_cs_emit_cache_flush()
Remove two useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:14 +01:00
Samuel Pitoiset
5f0b17d581 radv: compute the GFX9 fence VA at allocation time
Instead of doing every time we emit cache flushes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:12 +01:00
Samuel Pitoiset
e7ac792400 radv: only allocate the GFX9 fence and EOP BOs for the gfx queue
It's invalid to emit a ZPASS_DONE event on the compute queue, and
the fence BO is unused on the compute queue (ie. we don't flush
CB or DB caches).
This saves some space in the upload BO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:09 +01:00
Samuel Pitoiset
bd098884f1 radv: remove old_fence parameter from si_cs_emit_write_event_eop()
This parameter is actually useless as the immediate value
can always be zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:07 +01:00
Samuel Pitoiset
698afa177e radv: improve gathering of load_push_constants with dynamic bindings
For example, if a pipeline has two stages VS and FS. And if only
the fragment stage needs dynamic bindings, we shouldn't allocate
an extra user SGPR for the vertex stage. Of course, if the vertex
stage loads constants, it needs an user SGPR.

This should reduce the number of SET_SH_REG packets that are emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 09:43:53 +01:00
Caio Marcelo de Oliveira Filho
e0485a1dd7 gallium: Add PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS
In the Intel backend, it makes the most sense to treat gl_TessLevelInner
and gl_TessLevelOuter as ordinary shader inputs.  For Radeon, it makes
more sense to treat them as system values which get special handling.

We already have a compiler option for this, but the Iris driver will
need a capability bit so we can set it appropriately.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-23 00:35:56 -08:00
Ilia Mirkin
8e26d534be nv50,nvc0: mark textures dirty on fb update
We may have to flush the cache if there are any textures presently bound
that refer to the outgoing framebuffer. This is only checked at
validation time.

Fixes a number of dEQP-GLES3.functional.fbo.color.repeated_clear.sample.*
tests, which would bind a texture, then clear it while the binding was
in effect, and then render to a different texture. This seems legal
under the "no feedback loops" rule.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-22 23:16:01 -05:00
Timothy Arceri
678ef2a4a5 ac/nir_to_llvm: fix interpolateAt* for structs
This fixes the arb_gpu_shader5 interpolateAt* tests that contain
structs.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-01-23 10:41:37 +11:00
Timothy Arceri
559e5b0408 ac/nir_to_llvm: add bindless support for uniform handles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-23 10:41:37 +11:00
Timothy Arceri
f0ed59076f radeonsi/nir: add missing piece for bindless image support
This fixes some piglit tests and is was TGSI does.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-23 10:41:37 +11:00
Rob Clark
1ce5d757d0 freedreno: core buffer modifier support
Split out of a patch from Fritz Koenig to decouple from a6xx UBWC
enablement, and added fd_resource_create_with_modifiers().
2019-01-22 16:33:27 -05:00
Rob Clark
c56fe4118a loader: fix the no-modifiers case
Normally modifiers take precendence over use flags, as they are more
explicit.  But if the driver supports modifiers, but the xserver does
not, then we should fallback to the old mechanism of allocating a buffer
using 'use' flags.

Fixes: 069fdd5f9f
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-01-22 16:33:27 -05:00
Fritz Koenig
7c4b9510d1 freedreno: add query for dmabuf modifiers 2019-01-22 16:33:27 -05:00
Fritz Koenig
ddbe6171e6 freedreno: drm_fourcc.h header include
Add Qualcomm modifier for UBWC
2019-01-22 16:33:27 -05:00
Brian Paul
956c219c8f svga: add new gallium formats to the format conversion table
Fixes a static assertion which broke the build.

Fixes: 3ee240890 "gallium: add SINT formats to have exact counterparts to SNORM formats"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Neha Bhende<bhenden@vmware.com>
2019-01-22 12:58:04 -07:00
Marek Olšák
d85917deaf radeonsi: rename rfence -> sfence
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:34:03 -05:00
Marek Olšák
260ff57647 radeonsi: rename rbo, rbuffer to buf or buffer
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:34:01 -05:00
Marek Olšák
63b91f25bc radeonsi: rename rsrc -> ssrc, rdst -> sdst
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:33:04 -05:00
Marek Olšák
4666f36c04 radeonsi: rename rquery -> squery
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:32:59 -05:00
Marek Olšák
501ff90a95 radeonsi: rename r600_resource -> si_resource
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:32:18 -05:00
Lionel Landwerlin
a75b12ce66 vulkan: make generated enum to strings helpers available from c++
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-22 18:20:53 +00:00
Marek Olšák
1cfbed7587 radeonsi: remove r600 from comments
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:26:45 -05:00
Marek Olšák
e0a6399eb4 winsys/amdgpu: rename rfence, rsrc, rdst -> afence, asrc, adst
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:26:45 -05:00
Marek Olšák
2792ec2cdd radeonsi: rename rview -> sview
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:26:45 -05:00
Marek Olšák
96610f625d radeonsi: rename rscreen -> sscreen
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:25:57 -05:00
Marek Olšák
86e25ed5a3 radeonsi: disable render cond & pipeline stats for internal compute dispatches 2019-01-22 12:24:35 -05:00
Sonny Jiang
1b25d340b7 radeonsi: use compute for resource_copy_region when possible
v2: marek: fix snorm8 blits

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-01-22 12:24:35 -05:00
Jiang, Sonny
8daf5bb209 radeonsi: add compute_last_block to configure the partial block fields 2019-01-22 12:22:46 -05:00
Marek Olšák
b443465fb9 gallium/util: add util_format_snorm8_to_sint8 (from radeonsi) 2019-01-22 12:21:43 -05:00
Marek Olšák
3ee240890c gallium: add SINT formats to have exact counterparts to SNORM formats
for radeonsi
2019-01-22 12:21:43 -05:00
Marek Olšák
4d5f8f39f3 radeonsi: move PKT3_WRITE_DATA generation into a helper function
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
c252273f98 radeonsi: don't use WRITE_DATA.DST_SEL == MEM_GRBM on >= CIK
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
a545415eb9 radeonsi: fix the top-of-pipe fence on SI
SI doesn't have MEM.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
e402961e1d radeonsi: correct WRITE_DATA.DST_SEL definitions
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
c605738113 radeonsi: compile clear and copy buffer compute shaders on demand
same as all other shaders
2019-01-22 11:59:27 -05:00
Marek Olšák
f139589069 radeonsi: remove redundant call to emit_cache_flush in compute clear/copy
launch_grid calls it.
2019-01-22 11:59:27 -05:00
Marek Olšák
e3d283eaca radeonsi: use buffer_store_format_x & xy 2019-01-22 11:59:27 -05:00
Marek Olšák
4c4c8bb1f0 radeonsi: fix rendering to tiny viewports where the viewport center is > 8K
This fixes an assertion failure with GL CTS when cts-runner is used.
(not a specific test)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-22 11:59:27 -05:00
Marek Olšák
caa2dcd730 radeonsi: fix a u_blitter crash after a shader with FBFETCH
This fixes an assertion failure with GL CTS when cts-runner is used.
(not a specific test)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-22 11:59:27 -05:00
Marek Olšák
c02f761bdf winsys/amdgpu: use the new BO list API 2019-01-22 11:59:27 -05:00
Jason Ekstrand
ac0f8a6ea0 anv: Implement transform feedback queries
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
7f4d9bb7b8 genxml: Add SO_PRIM_STORAGE_NEEDED and SO_NUM_PRIMS_WRITTEN
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
673f33c77d anv: Implement CmdBegin/EndQueryIndexed
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
2be89cbd82 anv: Implement vkCmdDrawIndirectByteCountEXT
Annoyingly, this requires that we implement integer division on the
command streamer.  Fortunately, we're only ever dividing by constants so
we can use the mulh+add+shift trick and it's not as bad as it sounds.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
36ee2fd61c anv: Implement the basic form of VK_EXT_transform_feedback
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
39925d60ec anv: Add pipeline cache support for xfb_info
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
e3bd49eaa7 anv: Add but do not enable VK_EXT_transform_feedback
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Alejandro Piñeiro
6b50b0a4a8 nir/xfb: distinguish array of structs vs array of blocks
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
ac704e777c nir/xfb: Properly handle arrays of blocks
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Alejandro Piñeiro
5649a0a6e8 nir/xfb: don't assert when xfb_buffer/stride is present but not xfb_offset
In order to allow nir_gather_xfb_info to be used on OpenGL,
specifically ARB_gl_spirv.

So, from OpenGL 4.6 spec, section 11.1.2.1, "Output Variables":

    "outputs specifying both an *XfbBuffer* and an *Offset* are
     captured, while outputs not specifying both of these are not
     captured. Values are captured each time the shader writes to such
     a decorated object."

This implies that are captured if both are present, and not if one of
those are lacking. Technically, it doesn't explicitly point that
having just one or the other is a mistake. In some cases, glslang is
adding some extra XfbBuffer without XfbOffset around, and mentioning
that technically that is not a bug (see issue#1526)

And for the case of Vulkan, as the same glslang issue mentions, it is
not clear if that should be a mistake or not. But even if it is a
mistake, it is not really needed to be checked on the driver, and we
can let the validation layers to check that.

v2: simplify explicit_xfb_buffer and explicit_offset checks (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
4f99ac9144 nir/xfb: Fix offset accounting for dvec3/4
Before, we were double-counting the component slots when we had a dvec3
or dvec4.  Instead, just add them in once and manually offset the
recorded output offset.

Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
96fa23bca5 nir: Preserve offsets in lower_io_to_scalar_early
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Samuel Pitoiset
b2bbd978d0 nir: fix lowering arrays to elements for XFB outputs
If we have a transform feedback output like:

float[2] x2_out (VARYING_SLOT_VAR1.x, 0, 0)

which is lowered by nir_lower_io_arrays_to_elements to,

float x2_out (VARYING_SLOT_VAR1.x, 0, 0)
float x2_out@5 (VARYING_SLOT_VAR2.x, 0, 0)

We have to update the destination offset to avoid overwriting
the same value.

v2 (Jason Ekstrand):
 - Compute the correct offsets for arrays of vectors and/or doubles

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Samuel Pitoiset
9f4e0aa7c1 nir: do not remove varyings used for transform feedback
When a xfb buffer is explicitely declared on a varying
variable, we shouldn't remove it at link time.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
9c14440e81 spirv: Only set interface_type on blocks
Instead of setting interface_type to whatever the per-vertex type is, we
only set it on blocks.  This allows later passes to tell the difference
between variables that are in blocks and those that aren't.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
da29594636 spirv: Only split blocks
Instead of splitting every per-vertex struct, just split the ones that
are actually blocks.  The reason for the split is so that we have
separate variables for separate locations, qualifiers, and builtin
decorations.  The vulkan spec only allows these on members of blocks.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
662cfb121b spirv: Initialize struct member offsets to -1
This is the "no offset specified" value.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
b4eae8444e anv: Always emit at least one vertex element
This seems to make the simulator happier.  The early return wasn't
really protecting anything and the code that follows will happily
initialize the dummy element to STORE_0 and emit it.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Eric Engestrom
610f956fde configure: EGL requirements only apply if EGL is built
Issue was hit with this configuration:
  --disable-{egl,gbm} --with-platform=drm

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 3208fd2e46 ("configure: move platform handling further up")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-22 16:12:40 +00:00
Jonathan Marek
fc4f6b2f12 freedreno: a2xx: add partial lower_scalar pass for ir2
Some instructions can only be scalar on a2xx, lower these only

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek
9f614c74b7 freedreno: a2xx: add ir2 copy propagation
Two cases:
* replacing srcs which refer to MOV instructions
* replacing MOVs used to write to exports

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek
c7dbf0b280 freedreno: a2xx: insert scalar MOV to allow 2 source scalar
If we want to use a scalar instruction with two sources, both sources have
to be in the same register. This covers a common case by inserting a scalar
MOV into a previous instruction with only a vector alu instruction.

A better method would be to have the sources end up in the same register in
the first place, but when one source is a constant this is the only way.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek
67610a0323 freedreno: a2xx: NIR backend
This patch replaces the a2xx TGSI compiler with a NIR compiler.

It also adds several new features:
-gl_FrontFacing, gl_FragCoord, gl_PointCoord, gl_PointSize
-control flow (including loops)
-texture related features (LOD/bias, cubemaps)
-filling scalar ALU slot when possible

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Tapani Pälli
da3ca69afa nir: cleanup glsl_get_struct_field_offset, glsl_get_explicit_stride
Take away const qualifier from return type of these functions as
-Wignored-qualifiers points out it is ignored for these cases.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:09:15 +02:00
Eric Engestrom
41a0c00392 travis: fix autotools build after --enable-autotools switch addition
Fixes: e68777c87c "autotools: Deprecate the use of autotools"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 10:29:19 +00:00
Jason Ekstrand
27af1cc2a6 spirv: Update the JSON and headers from Khronos master
This corresponds to commit 79b6681aadcb53c27d1052e on GitHub.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 18:55:05 -06:00
Jason Ekstrand
ca8c6c9781 nir: Mark deref UBO and SSBO access as non-scalar
Fixes: 63b9aa2e25 "spirv: Add support for using derefs for..."
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 18:41:47 -06:00
Karol Herbst
5ee0adfb6e nir/spirv: handle ContractionOff execution mode
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Rob Clark
fa737042ad nir/vtn: add caps for some cl related capabilities
vtn supports these, so don't squalk if user is happy with enabling
these.

v2: add new members sorted

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Karol Herbst
ce08e5f39c vtn: handle SpvExecutionModelKernel
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Karol Herbst
8bb46de08b mesa: add MESA_SHADER_KERNEL
used for CL kernels

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Jason Ekstrand
2aa78e46e9 anv/pipeline: Add a pdevice helper variable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-21 11:57:00 -06:00
Jason Ekstrand
344171b9ee relnotes: Add newly added Vulkan extensions
Both the Intel and RADV people have been really bad about adding things
to the release notes.  We should start actually paying attention.

Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-21 11:46:06 -06:00
Jason Ekstrand
c7f4a2867c anv: Only parse pImmutableSamplers if the descriptor has samplers
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-21 11:45:58 -06:00
Rhys Perry
f0ba826054 radv: prevent dirtying of dynamic state when it does not change
DXVK often sets dynamic state without actually changing it.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
e4c6423c5e radv: avoid context rolls when binding graphics pipelines
It's common in some applications to bind a new graphics pipeline without
ending up changing any context registers.

This has a pipline have two command buffers: one for setting context
registers and one for everything else. The context register command buffer
is only emitted if it differs from the previous pipeline's.

v2: ensure late scissor emission is done when radv_emit_rbplus_state() is
    called
v2: make use of cmd_buffer->state.workaround_scissor_bug
v3: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
5564a797f2 radv: add missed situations for scissor bug workaround
v2: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
5d1a29071a radv: pass radv_draw_info to radv_emit_draw_registers()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Jonathan Marek
5886c5d092 freedreno: a2xx: sysmem rendering
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:34 -05:00
Jonathan Marek
bec6e4b054 freedreno: a2xx: fix non-zero texture base offsets
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:27 -05:00
Jonathan Marek
02ab85afd8 freedreno: a2xx: fix VERTEX_REUSE/DEALLOC on a20x
On a20x, set VGT_VERTEX_REUSE_BLOCK_CNTL to 2 and don't change it. Small
rearrangement on a220 to reduce the size of draw commands.

Only set DEALLOC_CNTL on a20x because the correct a220 value is not known.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:22 -05:00
Jonathan Marek
0286a11b7e freedreno: a2xx: fix gmem2mem viewport
Fixes cases where previous viewport values might case gmem2mem to fail.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:16 -05:00
Jonathan Marek
64b12520a2 freedreno: a2xx: cleanup REG_A2XX_PA_CL_VTE_CNTL
Doesn't change much, but reduces the size of fd2_emit_state

gmem2mem does not need to change the value: no Z clipping on resolve
mem2gmem now needs to restore the common value after rendering

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:10 -05:00
Jonathan Marek
6ef7700ac6 freedreno: a2xx: cleanup init_shader_const
Only 3 vertices are used so we can drop the data for vertex 4

It doesn't make sense to have 1.1 for some coordinates, use 1.0 instead

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:21:51 -05:00
Karol Herbst
0a793c78a3 nir: add bit_size parameter to system values with multiple allowed bit sizes
v2: add assert to verify we have at least one valid bit_size
v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:17:18 +01:00
Karol Herbst
4125211e9c nir: add legal bit_sizes to intrinsics
With OpenCL some system values match the address bits, but in GLSL we also
have some system values being 64 bit like subgroup masks.

With this it is possible to adjust the builder functions so that depending
on the bit_sizes the correct bit_size is used or an additional argument is
added in case of multiple possible values.

v2: validate dest bit_size
v3: generate hex values in python code
    remove useless imports
    rename and move bit_sizes
v4: add 1 to legal bit_sizes for front_face

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
27bd07e230 nir/validate: allow to check against a bitmask of bit_sizes
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
b9fec2b38c nir: replace more nir_load_system_value calls with builder functions
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
987744be98 glsl/lower_output_reads: set invariant and precise flags on temporaries
fixes a couple of deqp tests (on nvc0 and potential other drivers):
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-21 00:16:50 +01:00
Rhys Kidd
8002eaab6c nv50,nvc0: add missing CAPs for unsupported features
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-20 13:51:01 -05:00
Karol Herbst
acdad24585 nir/spirv: handle SpvStorageClassCrossWorkgroup
v2: rename nir_var_global to nir_var_mem_global

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:42 +01:00
Karol Herbst
36a76b7192 nir: rename nir_var_shared to nir_var_mem_shared
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
6fefd69724 nir: rename nir_var_ssbo to nir_var_mem_ssbo
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
3afc1e068f nir: rename nir_var_ubo to nir_var_mem_ubo
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
9b24028426 nir: rename nir_var_function to nir_var_function_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
e5daef9587 nir: rename nir_var_private to nir_var_shader_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Lionel Landwerlin
ad99c1670a intel/genxml: add missing MI_PREDICATE compare operations
Doesn't save us a great deal of lines but at least they get decoded in
aubinators.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-01-19 15:47:36 +00:00
Lionel Landwerlin
79514cc5fb anv: document cache flushes & invalidations
A little bit of explanation regarding how vkCmdPipelineBarrier()
works.

v2: Avoid referring to data port cache when it's actually sampler
    caches (Jason)
    Complete explanation for indirect draws (Jason)

v3: s/samplers/sampler/ (Jason)
    s/UBOs/data port/
    Add documentation for VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2019-01-19 15:45:41 +00:00
Lionel Landwerlin
3c4c18341a anv: narrow flushing of the render target to buffer writes
In commit 9a7b319903 ("anv/query: flush render target before
copying results") we tracked all the render target writes to apply a
flushes in the vkCopyQueryResults(). But we can narrow this down to
only when we write a buffer (which is the only input of
vkCopyQueryResults).

v2: Drop newer render target write flags introduce by 1952fd8d2c
    ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+")

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2019-01-19 15:45:41 +00:00
Timothy Arceri
6ca652faf3 glsl: be much more aggressive when skipping shader compilation
Currently we only add a cache key for a shader once it is linked.
However games like Team Fortress 2 compile a whole bunch of shaders
which are never actually linked. These compiled shaders can take
up a bunch of memory.

This patch changes things so that we add the key for the shader to
the cache as soon as it is compiled. This means on a warm cache we
can avoid the wasted memory from these shaders. Worst case scenario
is we need to compile the shaders at link time but this can happen
anyway if the shader has been evicted from the cache.

Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a
warm cache from start up to the game menu.

V2: only add key to cache when compilation is successful.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-01-19 13:12:25 +11:00
Francisco Jerez
c84ec70b3a intel/fs: Promote execution type to 32-bit when any half-float conversion is needed.
The docs are fairly incomplete and inconsistent about it, but this
seems to be the reason why half-float destinations are required to be
DWORD-aligned on BDW+ projects.  This way the regioning lowering pass
will make sure that the destination components of W to HF and HF to W
conversions are aligned like the corresponding conversion operation
with 32-bit execution data type.

Tested-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 16:09:39 -08:00
Timothy Arceri
9e669ed22b ac/nir_to_llvm: fix interpolateAt* for arrays
This builds on the recent interpolate fix by Rhys ee8488ea3b.

This fixes the arb_gpu_shader5 interpolateAt* tests that contain
arrays.

Fixes: ee8488ea3b ("ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 10:59:38 +11:00
Timothy Arceri
860a9e4849 Revert "glsl: be much more aggressive when skipping shader compilation"
This reverts commit 64b8c86d37.

Reverting for now as it was causing some segfaults.
2019-01-19 10:45:07 +11:00
Kristian H. Kristensen
5486c9d526 freedreno/a6xx: Turn on texture tiling by default
The color swap isn't available for tiled formats and it's not needed
either. We pick one channel order and use for all non-linear formats.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-18 14:27:15 -08:00
Kristian H. Kristensen
60c6778dda freedreno: Synchronize batch and flush for staging resource
Staging blit downloads would wait on the src resource instead of the
staging resource and didn't make sure to submit the blit batch first.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-18 14:27:12 -08:00
Timothy Arceri
64b8c86d37 glsl: be much more aggressive when skipping shader compilation
Currently we only add a cache key for a shader once it is linked.
However games like Team Fortress 2 compile a whole bunch of shaders
which are never actually linked. These compiled shaders can take
up a bunch of memory.

This patch changes things so that we add the key for the shader to
the cache as soon as it is compiled. This means on a warm cache we
can avoid the wasted memory from these shaders. Worst case scenario
is we need to compile the shaders at link time but this can happen
anyway if the shader has been evicted from the cache.

Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a
warm cache from start up to the game menu.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-01-19 08:24:47 +11:00
Timothy Arceri
c9d7b0f184 glsl: don't skip GLSL IR opts on first-time compiles
This basically reverts c2bc0aa7b1.

By running the opts we reduce  memory using in Team Fortress 2
from 1.5GB -> 1.3GB from start-up to game menu.

This will likely increase Deus Ex start up times as per commit
c2bc0aa7b1. However currently 32bit games like Team Fortress 2
can run out of memory on low memory systems, so that seems more
important.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-19 08:24:43 +11:00
Caio Marcelo de Oliveira Filho
cd56d79b59 nir: check NIR_SKIP to skip passes by name
Passes' function names, separated by comma, listed in NIR_SKIP
environment variable will be skipped in debug mode.  The mechanism is
hooked into the _PASS macro, like NIR_PRINT.

The extra macro NIR_SKIP is available as a developer convenience, to
skip at pointer other than the passes entry points.

v2: Fix typo in NIR_SKIP macro. (Bas)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 12:31:49 -08:00
Danylo Piliaiev
1952fd8d2c anv: Implement VK_EXT_conditional_rendering for gen 7.5+
Conditional rendering affects next functions:
- vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect
- vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR
- vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase
- vkCmdClearAttachments

Value from conditional buffer is cached into designated register,
MI_PREDICATE is emitted every time conditional rendering is enabled
and command requires it.

v2: by Jason Ekstrand
  - Use vk_find_struct_const instead of manually looping
  - Move draw count loading to prepare function
  - Zero the top 32-bits of MI_ALU_REG15

v3: Apply pipeline flush before accessing conditional buffer
 (The issue was found by Samuel Iglesias)

v4: - Remove support of Haswell due to possible hardware bug
    - Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to
       define registers in one place.

v5: thanks to Jason Ekstrand and Lionel Landwerlin
    - Workaround the fact that MI_PREDICATE_RESULT is not
      accessible on Haswell by manually calculating
      MI_PREDICATE_RESULT and re-emitting MI_PREDICATE
      when necessary.

v6: suggested by Lionel Landwerlin
    - Instead of calculating the result of predicate once - re-emit
      MI_PREDICATE to make it easier to investigate error states.

v7: suggested by Jason
    - Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL
      if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set.

v8: suggested by Lionel
    - Precompute conditional predicate's result to
      support secondary command buffers.
    - Make prepare_for_draw_count_predicate more readable.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 18:31:44 +00:00
Danylo Piliaiev
ed6e2bf263 anv: Implement VK_KHR_draw_indirect_count for gen 7+
v2: by Jason Ekstrand
  - Move out of the draw loop population of registers
    which aren't changed in it.
  - Remove dependency on ALU registers.
  - Clarify usage of PIPE_CONTROL
  - Without usage of ALU registers patch works for gen7+

v3: set pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 18:31:44 +00:00
Dylan Baker
9e989b860a bin/meson-cmd-extract: Also handle cross and native files
Native file support in command line serialization isn't present in meson
0.49, but will be for 0.49.1 and 0.50

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-18 09:37:01 -08:00
Jason Ekstrand
b54df1b6df anv: Re-sort the extensions list
I like to keep things in good order so that you can find them.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-18 10:32:23 -06:00
Jason Ekstrand
eb32dad07c intel/fs: Don't touch accumulator destination while applying regioning alignment rule
In some shaders, you can end up with a stride in the source of a
SHADER_OPCODE_MULH.  One way this can happen is if the MULH is acting on
the top bits of a 64-bit value due to 64-bit integer lowering.  In this
case, the compiler will produce something like this:

mul(8)    acc0<1>UD   g5<8,4,2>UD   0x0004UW      { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

The new region fixup pass looks at the MUL and sees a strided source and
unstrided destination and determines that the sequence is illegal.  It
then attempts to fix the illegal stride by replacing the destination of
the MUL with a temporary and emitting a MOV into the accumulator:

mul(8)    g9<2>UD     g5<8,4,2>UD   0x0004UW      { align1 1Q };
mov(8)    acc0<1>UD   g9<8,4,2>UD                 { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

Unfortunately, this new sequence isn't correct because MOV accesses the
accumulator with a different precision to MUL and, instead of filling
the bottom 32 bits with the source and zeroing the top 32 bits, it
leaves the top 32 (or maybe 31) bits alone and full of garbage.  When
the MACH comes along and tries to complete the multiplication, the
result is correct in the bottom 32 bits (which we throw away) and
garbage in the top 32 bits which are actually returned by MACH.

This commit does two things:  First, it adds an assert to ensure that we
don't try to rewrite accumulator destinations of MUL instructions so we
can avoid this precision issue.  Second, it modifies
required_dst_byte_stride to require a tightly packed stride so that we
fix up the sources instead and the actual code which gets emitted is
this:

mov(8)    g9<1>UD     g5<8,4,2>UD                 { align1 1Q };
mul(8)    acc0<1>UD   g9<8,8,1>UD   0x0004UW      { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

Fixes: efa4e4bc5f "intel/fs: Introduce regioning lowering pass"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-18 10:18:52 -06:00
Jason Ekstrand
0a7ac6d543 intel/eu: Stop overriding exec sizes in send_indirect_message
For a long time, we based exec sizes on destination register widths.
We've not been doing that since 1ca3a94427 but a few remnants
accidentally remained.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2019-01-18 10:18:52 -06:00
Samuel Pitoiset
f682ed11c3 radv: initialize the per-queue descriptor BO only once
Totally useless to write the descriptors inside the loop.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:32 +01:00
Samuel Pitoiset
72d9745a40 radv: do not write unused descriptors to the per-queue BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:30 +01:00
Samuel Pitoiset
8c164ea8f5 radv: reduce size of the per-queue descriptor BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:28 +01:00
Samuel Pitoiset
83cc87ead4 radv: drop unused code related to 16 sample locations
The driver only supports up to 8 sample locations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:24 +01:00
Karol Herbst
80dae7022e gm107/ir: disable TEXS for tex with derivAll set
fixes deqp tests:
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isamplercube_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usamplercube_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2dshadow_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.isampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.usampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler2dshadow_vertex

Fixes: f821e80213
       "gm107/ir: use scalar tex instructions where possible"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-18 03:27:51 +01:00
Karol Herbst
30b5c9eda2 nv50/ir: disable tryCollapseChainedMULs in ConstantFolding for precise instructions
fixes dEQP-GLES2.functional.shaders.invariance.mediump.loop_3

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-18 02:03:30 +01:00
Bas Nieuwenhuizen
8424cd8fbd nir: Account for atomics in copy propagation.
Otherwise writes get propagated across atomics if no barrier is
used. Without barrier writes should still be visible in the same
invocation, so an atomic has to be considered a write.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b3c6146925 "nir: Copy propagation between blocks"
Fixes: 62332d139c "nir: Add a local variable-based copy propagation pass"
2019-01-18 00:55:35 +01:00
Rafael Antognolli
927ba12b53 anv/tests: Adding test for the state_pool padding.
Add a test that checks that we can use the extra space allocated for
padding while allocating larger anv_states.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:26 -08:00
Rafael Antognolli
731c4adcf9 anv/allocator: Add support for non-userptr.
If softpin is supported, create new BOs for the required size and add the
respective BO maps. The other main change of this commit is that
anv_block_pool_map() now returns the map for the BO that the given
offset is part of. So there's no block_pool->map access anymore (when
softpin is used.

v3:
 - set fd to -1 on softpin case (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:24 -08:00
Rafael Antognolli
643248b66a anv: Remove state flush.
We have all the state buffers snooped, so we don't need to clflush
everything anymore.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:22 -08:00
Rafael Antognolli
5d61c74f3d anv/allocator: Enable snooping on block pool and anv_bo_pool BOs.
We are not going to use userptr for anv block pool BOs anymore. However,
so far we have been relying on the fact that userptr BOs are snooped on
non-llc platforms. Let's make sure that the block pool BOs are still
snooped, and we can also remove the clflush'ing that we do on all state
buffers.

And since we plan to remove the flushes, set the anv_bo_pool BOs to
cached (snooped on non-LLC platforms) too. For LLC platforms, they are
all cached by default, so this becomes a no-op.

v5:
 - Add snooping to anv_bo_pool BOs too (Jason).
 - Remove anv_gem_set_domain.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:20 -08:00
Rafael Antognolli
dfc9ab2ccd anv/allocator: Add padding information.
It's possible that we still have some space left in the block pool, but
we try to allocate a state larger than that state. This means such state
would start somewhere within the range of the old block_pool, and end
after that range, within the range of the new size.

That's fine when we use userptr, since the memory in the block pool is
CPU mapped continuously. However, by the end of this series, we will
have the block_pool split into different BOs, with different CPU
mapping ranges that are not necessarily continuous. So we must avoid
such case of a given state being part of two different BOs in the block
pool.

This commit solves the issue by detecting that we are growing the
block_pool even though we are not at the end of the range. If that
happens, we don't use the space left at the end of the old size, and
consider it as "padding" that can't be used in the allocation. We update
the size requested from the block pool to take the padding into account,
and return the offset after the padding, which happens to be at the
start of the new address range.

Additionally, we return the amount of padding we used, so the caller
knows that this happens and can return that padding back into a list of
free states, that can be reused later. This way we hopefully don't waste
any space, but also avoid having a state split between two different
BOs.

v3:
 - Calculate offset + padding at anv_block_pool_alloc_new (Jason).
v4:
 - Remove extra "leftover".

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:19 -08:00
Rafael Antognolli
7ed0898a8d anv/allocator: Rework chunk return to the state pool.
This commit tries to rework the code that split and returns chunks back
to the state pool, while still keeping the same logic.

The original code would get a chunk larger than we need and split it
into pool->block_size. Then it would return all but the first one, and
would split that first one into alloc_size chunks. Then it would keep
the first one (for the allocation), and return the others back to the
pool.

The new anv_state_pool_return_chunk() function will take a chunk (with
the alloc_size part removed), and a small_size hint. It then splits that
chunk into pool->block_size'd chunks, and if there's some space still
left, split that into small_size chunks. small_size in this case is the
same size as alloc_size.

The idea is to keep the same logic, but make it in a way we can reuse it
to return other chunks to the pool when we are growing the buffer.

v2:
 - Include Jason's suggestions to the algorithm that returns chunks.
 - Update comments.

v3:
 - Disallow returning 0 blocks (Jason).
 - fix min_size in the loop (Jason).
 - remove temporary variables (Jason)
v4:
 - return_chunk() should never return blocks larger than
 pool->block_size.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:17 -08:00
Rafael Antognolli
6a1f4c96cc anv: Remove some asserts.
They won't be true anymore once we add support for multiple BOs with
non-userptr.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:14 -08:00
Rafael Antognolli
f39dad7e4e anv: Validate the list of BOs from the block pool.
We now have multiple BOs in the block pool, but sometimes we still
reference only the first one in some instructions, and use relative
offsets in others. So we must be sure to add all the BOs from the block
pool to the validation list when submitting commands.

v2:
   - Don't add block pool BOs to the dependency list right before
   execbuf (Jason)
   - Call anv_execbuf_add_bo() to each BO in the block pools (Jason)
   - Use anv_execbuf_add_bo_set() to add surface state dependencies to
   execbuf.

v3:
   - Add comment to the non-softpin case (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:10 -08:00
Rafael Antognolli
11a5d4620b anv: Split code to add BO dependencies to execbuf.
This part of the anv_execbuf_add_bo() code is totally independent of the
BO being added. Let's split it out, so we can reuse it later.

v3: rename to anv_execbuf_add_bo_set (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:08 -08:00
Rafael Antognolli
f874604f45 anv/allocator: Add support for a list of BOs in block pool.
So far we use only one BO (the last one created) in the block pool. When
we switch to not use the userptr API, we will need multiple BOs. So add
code now to store multiple BOs in the block pool.

This has several implications, the main one being that we can't use
pool->map as before. For that reason we update the getter to find which
BO a given offset is part of, and return the respective map.

v3:
 - Simplify anv_block_pool_map (Jason).
 - Use fixed size array for anv_bo's (Jason)
v4:
 - Respect the order (item, container) in anv_block_pool_foreach_bo
 (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:04 -08:00
Rafael Antognolli
e3dc56d731 anv: Update usage of block_pool->bo.
Change block_pool->bo to be a pointer, and update its usage everywhere.
This makes it simpler to switch it later to a list of BOs.

v3:
 - Use a static "bos" field in the struct, instead of malloc'ing it.
 This will be later changed to a fixed length array of BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:02 -08:00
Rafael Antognolli
fc3f588320 anv/allocator: Remove pool->map.
After switching to using anv_state_table, there are very few places left
still using pool->map directly. We want to avoid that because it won't
be always the right map once we split it into multiple BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:00 -08:00
Rafael Antognolli
54e21e145e anv/allocator: Rename anv_free_list2 to anv_free_list.
Now that we removed the original anv_free_list, we can now use its name.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:58 -08:00
Rafael Antognolli
234c9d8a40 anv/allocator: Remove anv_free_list.
The next commit already renames anv_free_list2 -> anv_free_list since
the old one is gone.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:56 -08:00
Rafael Antognolli
e2179aceaf anv/allocator: Use anv_state_table on back_alloc too.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:52 -08:00
Rafael Antognolli
d18267fb48 anv/allocator: Use anv_state_table on anv_state_pool_alloc.
Use anv_state_pool_return_blocks() to return blocks to the pool, instead
of manually pushing them.

v3:
 - return blocks from the end of the chunk (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:50 -08:00
Rafael Antognolli
6a1dcfe73d anv/allocator: Add helper to push states back to the state table.
The use of anv_state_table_add() combined with anv_state_table_push(),
specially when adding a bunch of states to the table, is very verbose.
So we add this helper that makes things easier to digest.

We also already add the anv_state_table member in this commit, so things
can compile properly, even though it's not used.

v2: assert that the states are always aligned to their size (Jason)
v3: Add "table" member to anv_state_pool in this commit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:47 -08:00
Rafael Antognolli
e8b6e0a5ba anv/allocator: Add getter for anv_block_pool.
We will need the anv_block_pool_map to find the map relative to some BO
that is not at the start of the block pool.

v2: just return a pointer instead of a struct (Jason)
v4: Update comment (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:43 -08:00
Rafael Antognolli
6a2d5ae305 anv/allocator: Add anv_state_table.
Add a structure to hold anv_states. This table will initially be used to
recycle anv_states, instead of relying on a linked list implemented in
GPU memory. Later it could be used so that all anv_states just point to
the content of this struct, instead of making copies of anv_states
everywhere.

One has to call anv_state_table_add(), which returns an index for the
state in the table, and then get a pointer to such index, and finally
fill in the rest of the struct.

TODO:
   1) There's a lot of common code between this table backing store
   memory and the anv_block_pool buffer, due to how we grow it. I think
   it's possible to refactory this and reuse code on both places.

   2) Add unit tests.

v3:
 - Rename state table memfd (Jason)
 - Return VK_ERROR_OUT_OF_HOST_MEMORY on more places (Jason)
 - anv_state_table_grow returns VkResult (Jason)
 - Rename variables to be more informative (Jason)
 - Return errors on state table grow.
 - Rename anv_state_table_push/pop to anv_free_list_push2/pop2
   This will be renamed again to remove the trailing "2" later.

v4:
 - Remove exit(-1) from anv_state_table (Jason).
 - Use uint32_t "next" field in anv_free_entry (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:34 -08:00
Rafael Antognolli
27478ce00e anv/tests: Fix block_pool_no_free test.
There were 2 problems with this test.

First it was comparing highest, which was -1, with an uint32_t. So the
current value would never be higher than that, and the assert would
always be false. It just never reached this point because of the next
problem.

It was always looking for the highest value of each thread and storing
it in thread_max. So a test case like this wouldn't work:

[Thread]: [Blocks]
   [0]: [0, 32, 64, 96]
   [1]: [128, 160, 192, 224]
   [2]: [256, 288, 320, 352]

Not only that would skip values and iterate only over thread number 2,
instead of walking through all of them, but thread_max was also
initialized to -1. And then compared to unsigned blocks[i][next[i].

We fix that by getting the smallest value of each thread, and checking
if it is lower than thread_min, which is initialized to INT32_MAX. And
then we end up walking through all the blocks of all threads. We also
change "blocks" to be int32_t instead of uint32_t, since in some places
(alloc_blocks) it was already referenced as int32_t, and that fixes the
comparison to -1.

v2:
 - keep highest initialized to -1, and change blocks to be int32_t.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:05:58 -08:00
Lionel Landwerlin
4149d41f2e anv: fix invalid binding table index computation
The ++ operator strikes again.

Fixes: f92c5bc8f3 ("anv/device: fix maximum number of images supported")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 11:49:10 -08:00
Eric Engestrom
c4c5c90255 docs: explain how to see what meson options exist
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-17 17:05:41 +00:00
Emil Velikov
406623f5b1 docs: update calendar, add news item and link release notes for 18.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-17 11:37:41 +00:00
Emil Velikov
9d58641bf2 docs: add sha256 checksums for 18.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8320a07221)
2019-01-17 11:32:20 +00:00
Emil Velikov
2dad014496 docs: add release notes for 18.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 95a3b709c0)
2019-01-17 11:32:19 +00:00
Iago Toral Quiroga
f92c5bc8f3 anv/device: fix maximum number of images supported
We had defined MAX_IMAGES as 8, which we used to size the array for
image push constant data. The comment there stated that this was for
gen8, but anv_nir_apply_pipeline_layout runs for all gens and writes
that array, asserting that we don't exceed that number of images,
which imposes a limit of MAX_IMAGES on all gens.

Furthermore, despite this, we are exposing up to 64 images per shader
stage on all gens, gen8 included.

This patch lowers the number of images we expose in gen8 to 8 and
keeps 64 images for gen9+ while making sure that only pre-SKL gens
use push constant space to handle images.

v2:
 - <= instead of < in the assert (Eric, Lionel)
 - Change the way the assertion is written (Eric)

v3:
 - Revert the way the assertion is written to the form it had in v1,
   the version in v2 was not equivalent and was incorrect. (Lionel)

v4:
 - gen9+ doesn't need push constants for images at all (Jason)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v3)
2019-01-17 07:59:00 +01:00
Tapani Pälli
a311aa631d anv: do not advertise AHW support if extension not enabled
Fixes following failing vk-gl-cts cases on Linux desktop:

   dEQP-VK.api.external.memory.android_hardware_buffer.suballocated.buffer.info
   dEQP-VK.api.external.memory.android_hardware_buffer.suballocated.image.info
   dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.image.info
   dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.buffer.info

Fixes: 517103abf1 "anv/android: add ahardwarebuffer external memory properties"
Reported-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-01-17 07:22:02 +02:00
Eric Anholt
99ef66c325 vc4: Don't leak the GPU fd for renderonly usage.
Noticed while debugging V3D -- the ro->gpu_fd was freshly opened in ro
setup, and it needs to stay open until screen close (since it may be used
by renderonly) and should be the same one used by the vc4 screen.

Fixes: 7029ec05e2 ("gallium: Add renderonly-based support for pl111+vc4.")
2019-01-16 16:28:41 -08:00
Eric Anholt
0605726776 v3d: Don't leak the GPU fd for renderonly usage.
The CTS was running out of fds, because of the ro->gpu_fd never being
closed.  ro->gpu_fd should match the screen (in case the caller of
v3d_drm_screen_create_renderonly() has a scanout_for_resource() that uses
gpu_fd) and the screen is expected to close its fd at the end, fixing the
resource leak.

Fixes: e113b21cb7 ("v3d: Add renderonly support.")
2019-01-16 16:28:41 -08:00
Eric Anholt
59527a36e9 v3d: Restructure RO allocations using resource_from_handle.
I had bugs in the old path where I was laying out as tiled (so we'd render
tiled) but then only allocating space in the shared object for linear
rendering.  The resource_from_handle makes it so the same layout choices
are made in both the import and export scanout cases.  Also, fixes a leak
of the fd that was tripping up the CTS.

Now that we're checking PIPE_BIND_SHARED to choose to use RO, the
DRM_FORMAT_MOD_LINEAR check wasn't needed any more.

Fixes visual corruption and MMU faults in X in renderonly mode.

Fixes: bd09bb1629 ("v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.")
2019-01-16 16:28:41 -08:00
Eric Anholt
d70eb2302b v3d: If the modifier is not known on BO import, default to linear for RO.
Part of fixing DRI3 rendering with RO on X11.

Fixes: e113b21cb7 ("v3d: Add renderonly support.")
2019-01-16 16:28:41 -08:00
Timothy Arceri
cb527d2c4c ac/nir_to_llvm: add support for structs to get_sampler_desc()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Timothy Arceri
b12316cc92 ac/nir_to_llvm: fix regression in bindless support
This wasn't ported over when deref support was implemented.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Timothy Arceri
e106e0f2dd radeonsi/nir: get correct type for images inside structs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Timothy Arceri
292887ac0d ac/nir_to_llvm: fix type handling in image code
The current code only strips off arrays and cannot find the type
for images that are struct members.

Instead of trying to get the image type from the variable, we just
get it directly from the deref instruction.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Rhys Perry
8a52e4cc4f radv: use dithered alpha-to-coverage
This matches the behaviour of AMDVLK and hides banding.
It is also seems to be allowed by the Vulkan spec.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-16 20:49:23 +00:00
Alok Hota
187a6506a3 swr/rast: Store cached files in multiple subdirs
This improves cache filesystem performance, especially during CI tests
Also updated jitcache magic number due to codegen parameter changes
Removed 2 `if constexpr` to prevent C++17 requirement
2019-01-16 13:53:30 -06:00
Alok Hota
bb98be61f4 swr/rast: New execution engine per JIT
Fixes relocation errors with LLVM 7.0.0
2019-01-16 13:53:30 -06:00
Alok Hota
b135db5d58 swr/rast: Scope MEM_CLIENT enum for mem usages
Avoids confusion with other defaulted integer parameters

- fixed some unspecified usages
- removed unnecessary includes
- removed unecessary protected access specifier in buckets framework
2019-01-16 13:53:30 -06:00
Alok Hota
c722ad7379 swr/rast: Unaligned and translations in gathers
- added graphics address translation in odd gathers
- added support for unaligned gathers in fetch shader
- changed how 2+ GB offsets are handled to make them compatible with
unaligned offsets
2019-01-16 13:53:30 -06:00
Alok Hota
9459863dfa swr/rast: partial support for Tiled Resources
- updated sample from TRTT surfaces correctly
- implemented mapped status return for TRTT surfaces
- implemented per-sample instruction minLod clamp
- updated bilinear filter weight calculation to be closer to D3D specs
- implemented "ReducedTexcoordRange" operation from D3D specs to avoid
loss of precision on high-value normalized coordinates
2019-01-16 13:53:30 -06:00
Alok Hota
9cacf9d877 swr/rast: Add annotator to interleave isa text
To make debugging simpler
2019-01-16 13:53:30 -06:00
Alok Hota
c9fa2ee343 swr/rast: Use gfxptr_t value in JitGatherVertices
Use gfxptr_t type value for stream pointer uses in gather and similar
calls
2019-01-16 13:53:30 -06:00
Gert Wollny
e68777c87c autotools: Deprecate the use of autotools
Since Meson will eventually be the only build system deprecate autotools
now. It can still be used by invoking configure with the flag
  --enable-autotools

NAKed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2019-01-16 09:52:42 -08:00
Dylan Baker
431e9abaab meson: allow building dri driver without window system if osmesa is classic
This was already enabled for gallium based osmesa with gallium drivers
in 9d10581897, so do the same for classic
driver with classic osmesa.

Fixes: cbbd5bb889
       ("meson: build classic osmesa")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-16 17:49:51 +00:00
Bruce Cherniak
ed7673afd2 gallium/swr: Fix multi-context sync fence deadlock.
Various recreation scenarios lead to API thread getting stuck in
swr_fence_finish().  This is a multi-context issue, whereby one context
overwrites the fence read-value with a previous sync's lesser value.
The fence sync value is supposed to be always increasing.

In swr_fence_cb(), only update the "read" value if the new value is
greater.

(This may seem like we're not waiting on the other context to finish, but
had we needed for it to finish there would have been a wait prior to
submitting a new sync.)

cc: mesa-stable@lists.freedesktop.org
2019-01-16 09:26:36 -06:00
Samuel Pitoiset
d5d7b5e950 ac/nir: don't trash L1 caches for store operations with writeonly memory
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-16 13:57:22 +01:00
Kenneth Graunke
5b51d754d0 st/mesa: Optionally override RGB/RGBX dst alpha blend factors
Intel's blending hardware does not properly return 1.0 for destination
alpha for RGBX formats; it requires the factors to be overridden to
either zero or one.  Broadcom vc4 and v3d also could use this override.
While overriding these factors is safe in general, Nouveau and Radeon
would prefer not to.  Their blending hardware already returns correct
values for RGB/RGBX formats, and would like to avoid the resulting
per-buffer blending and independent blend factors (rgb != a) since it
can cause additional overhead.

I considered simply handling this in the driver, but it's not as nice.
pipe_blend_state doesn't have any format information, so we'd need the
hardware blend state to depend on both pipe_blend_state and
pipe_framebuffer_state.  Furthermore, Intel GPUs don't have a native
RGBX_SNORM format, so I avoid exposing one, which makes Gallium fall
back to RGBA_SNORM.  The pipe_surfaces we get in the driver have an RGBA
format, making it impossible to tell that there shouldn't be an alpha
channel.  One could argue that st not handling it in that case is a bug.
To work around this, we'd have to expose RGBX pipe formats, mapped to
RGBA hardware formats, and add format swizzling special cases.  All
doable, but it ends up being more code than I'd like.

st_atom_blend already has access to the right information and it's
trivial to accomplish there, so we just add a cap bit and do that.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-15 20:53:44 -08:00
Marek Olšák
11735d6c9c winsys/amdgpu: fix whitespace 2019-01-15 19:10:16 -05:00
Pierre Moreau
0b736f7fd4 meson: Fix with_gallium_icd to with_opencl_icd
`with_gallium_icd` is never used throughout the different Meson build
files, whereas `with_opencl_icd` tracks whether or not `gallium-opencl`
was set to "icd".

Fixes: 42ea0631f1
         ("meson: build clover")
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-15 13:06:50 -08:00
Kenneth Graunke
d644698b44 gallium: Add the ability to query a single pipeline statistics counter
Gallium historically has treated pipeline statistics queries as a single
query, PIPE_QUERY_PIPELINE_STATISTICS, which returns a block of 11
values.  This was originally patterned after the D3D1x API.  Much later,
Brian introduced an OpenGL extension that exposed these counters - but
it exposes 11 separate queries, each of which returns a single value.

Today, st/mesa simply queries all 11 values, and returns a single value.
While pipeline statistics counters aren't typically performance
critical, this is still not a great fit.  A D3D1x->GL translator might
request all 11 counters by creating 11 separate GL queries...which
Gallium would map to reads of all 11 values each time, resulting in a
total 121 counter reads.  That's not ideal.

This patch adds a new cap, PIPE_CAP_QUERY_PIPELINE_STATISTICS_SINGLE,
and corresponding query type PIPE_QUERY_PIPELINE_STATISTICS_SINGLE.
When calling create_query(), q->index should be set to one of the
PIPE_STAT_QUERY_* enums to select a counter.  Unlike the block query,
this returns the value in pipe_query_result::u64 (as it's a single
value) instead of the pipe_query_data_pipeline_statistics group.

We update st/mesa to expose ARB_pipeline_statistics_query if either
capability is set, preferring the new SINGLE variant when available.

Thanks to Roland, Ilia, and Marek for helping me sort this out.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-15 11:43:04 -08:00
Kenneth Graunke
f967273fb4 st/mesa: Rearrange PIPE_QUERY_PIPELINE_STATISTICS result fetching.
This just changes the order of the switch statements, so we only
look at target if the query type is PIPE_QUERY_PIPELINE_STATISTICS.

The next commit will introduce a new SINGLE query type which can be
used for the same GL query types, and it won't want this processing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-15 11:43:04 -08:00
Kenneth Graunke
e760be08b4 st/mesa: Make an enum for pipeline statistics query result indices.
Gallium handles pipeline statistics queries as a single query
(PIPE_QUERY_PIPELINE_STATISTICS) which returns a struct with 11 values.
Sometimes it's useful to refer to each of those values individually,
rather than as a group.  To avoid hardcoding numbers, we define a new
enum for each value.  Here, the name and enum value correspond to the
index in the struct pipe_query_data_pipeline_statistics result.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-15 11:43:04 -08:00
Dylan Baker
4a131a1330 meson: Add a script to extract the cmd line used for meson
Upstream I'm persuing a more comprehensive solution, but this should
prove a suitable stop-gap measure in the meantime.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109325
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-15 17:38:47 +00:00
Samuel Pitoiset
7bef192018 radv: add support for VK_EXT_memory_budget
A simple Vulkan extension that allows apps to query size and
usage of all exposed memory heaps.

The different usage values are not really accurate because
they are per drm-fd, but they should be close enough.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:37 +01:00
Samuel Pitoiset
9784400a6b radv: add two small helpers for getting VRAM and visible VRAM sizes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:35 +01:00
Samuel Pitoiset
a6e5ce5130 radv: remove unnecessary returns in GetPhysicalDevice*Properties()
These functions return nothing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:17 +01:00
Bas Nieuwenhuizen
568e7a2998 radv: Set partial_vs_wave for pipelines with just GS, not tess.
Looking at -pro we need to enable it for pipelines with just a
GS too.

This seems to reduce the hangs from
https://bugs.freedesktop.org/show_bug.cgi?id=109242 on a RX 550 to
the point where I can't reproduce, after the false start with the
wd_switch_on_eop patch due to flakiness.

(but people are reporting it does not fix the issue completely for
 them on polaris 11)

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-15 10:22:30 +01:00
Marek Olšák
5183e794af radeonsi: also apply the GS hang workaround to draws without tessellation
ported from AMDVLK.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 18:55:58 -05:00
Eric Anholt
bd09bb1629 v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.
We don't have a way to talk to RO about modifiers it can do yet, so assume
the minimum.
2019-01-14 15:40:55 -08:00
Eric Anholt
f72820c851 v3d: Add support for CS barrier() intrinsics. 2019-01-14 15:40:55 -08:00
Eric Anholt
9b45b06d7c v3d: Add support for CS shared variable load/store/atomics.
CS shared variables are handled effectively as SSBO access to a temporary
buffer that will be allocated at CS dispatch time.
2019-01-14 15:40:55 -08:00
Eric Anholt
01d913cf90 v3d: Add support for CS workgroup/invocation id intrinsics.
We get a payload for the ivec3 workgroup and an int local invocation
index, and we use the core lowering to turn into the global invocation id
and the local invocation id ivec3s.
2019-01-14 15:40:55 -08:00
Eric Anholt
6281f26f06 v3d: Add support for shader_image_load_store.
This is only exposed on V3D 4.1+, because we didn't have the TMU write
operations for images on 3.3 (To do GLES 3.1 there, you have to lower it
to SSBO load/stores, which is a problem to solve later).
2019-01-14 15:40:55 -08:00
Eric Anholt
5932c2f0b9 v3d: Add SSBO/atomic counters support.
So far I assume that all the buffers get written.  If they weren't, you'd
probably be using UBOs instead.
2019-01-14 15:40:55 -08:00
Eric Anholt
6c8edcb89c v3d: Drop the GLSL version level.
This was an arbitrary "we support lots of stuff" value when I started the
driver.  However, at 400 we expose OES_gpu_shader5, which claims support
for dynamically indexing samplers, which the driver doesn't do yet.
2019-01-14 13:18:02 -08:00
Eric Anholt
1a63227ea0 v3d: Add support for matrix inputs to the FS.
We've been relying on linking splitting up our varying matrices into
separate vectors, but with SSO that doesn't happen.  Supporting matrix
inputs isn't too hard, though.
2019-01-14 13:18:02 -08:00
Eric Anholt
49b7e26fac v3d: Add an isr to the simulator to catch GMP violations.
Otherwise, the simulator raises the GMP interrupt and waits for it to be
handled, and v3d ends up spinning in v3d_hw_tick().  Aborting right when
violation happens gives us a chance to look at the backtrace of whatever
thread triggered the violation.
2019-01-14 13:18:02 -08:00
Eric Anholt
3790ee07e6 v3d: Fix txf_ms 2D_ARRAY array index.
We need to pass the array index through our coordinate transform
unchanged.  Fixes
dEQP-GLES31.functional.texture.multisample.samples_1.*_2d_array
2019-01-14 13:18:02 -08:00
Eric Anholt
619a28b845 v3d: Add support for GL_ARB_framebuffer_no_attachments.
Fixes
dEQP-GLES31.functional.state_query.integer.max_framebuffer_height_getboolean
when GLES3 is enabled.
2019-01-14 13:18:02 -08:00
Eric Anholt
051a41d3d5 v3d: Add support for the early_fragment_tests flag.
If this flag hasn't been set by the shader and it has some visible side
effects, then we need to disable EZ.
2019-01-14 13:18:02 -08:00
Eric Anholt
b417a9f7b2 v3d: Add support for flushing dirty TMU data at job end.
This will be needed for SSBOs and image_load_store.
2019-01-14 13:18:02 -08:00
Samuel Pitoiset
ad6ceb2872 ac: add missing 16-bit types to glsl_base_to_llvm_type()
Fix crashes with
dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.*16

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 21:18:23 +01:00
Bas Nieuwenhuizen
76b12fa564 radv: Only use 32 KiB per threadgroup on Stoney.
Causes hangs on some machines.

What works for dEQP-VK.tessellation.shader_input_output.barrier:

- running num_patches = 6 (which limits LDS to 32 KiB)
- running num_patches = 8, and artificially cutting LDS size at 32 KiB.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-14 19:58:27 +00:00
Marek Olšák
76df5e8f52 st/dri: fix dri2_format_table for argb1555 and rgb565
The bug caused that rgb565 framebuffers used argb1555.

Fixes: 433ca3127a

Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-14 14:54:19 -05:00
Jason Ekstrand
2d2737dcfe nir: Add a bool to float32 lowering pass
From @jekstrand's nir-1-bit-bool branch, with improved ior/inot lowering.

ior: fmax instead of fadd allows removing the fsat.

inot: seq(x, 0) can be better than fsub(1, x). On a2xx, it works better
with the scalar instruction set.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-01-14 19:27:06 +00:00
Caio Marcelo de Oliveira Filho
09c3ff01df src/intel: use new hash table and set creation helpers
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:33 -08:00
Caio Marcelo de Oliveira Filho
9fdded0cc3 src/compiler: use new hash table and set creation helpers
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:28 -08:00
Caio Marcelo de Oliveira Filho
ee23e8b17c util: Helper to create sets and hashes with pointer keys
These combinations are common enough and deserve a shortcut.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:21 -08:00
Samuel Pitoiset
929df7afaf ac/nir: set cache policy when loading/storing buffer images
This was missing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 17:59:51 +01:00
Samuel Pitoiset
af2a85df74 ac/nir: add get_cache_policy() helper and use it
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 17:59:49 +01:00
Jason Ekstrand
5e4f9ea363 anv: Implement VK_KHR_depth_stencil_resolve 2019-01-14 10:16:52 -06:00
Jason Ekstrand
9f44088468 anv: Move resolve_subpass to genX_cmd_buffer.c
We may have to do transitions around certain kinds of resolves so it
helps to have it genX code.
2019-01-14 10:16:52 -06:00
Jason Ekstrand
930b17161f anv/blorp: Refactor MSAA resolves into an exportable helper function
This function is modeled after the aux_op functions except that it has a
lot more parameters because it deals with two images as well as source
and destination regions.
2019-01-14 10:16:52 -06:00
Jason Ekstrand
c92c449361 anv: Rename has_resolve to has_color_resolve 2019-01-14 10:16:52 -06:00
Jason Ekstrand
4bd976e3b8 intel/blorp: Add two more filter modes 2019-01-14 10:16:52 -06:00
Andres Gomez
3ec9ab80b8 bin/get-pick-list.sh: fix redirection in sh
"&>" is bash specific.

Fixes: e0dbfc9953 ("bin/get-pick-list.sh: warn when commit lists invalid sha")
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-01-14 17:40:15 +02:00
Andres Gomez
716ed41a36 bin/get-pick-list.sh: fix the oneline printing
"--summary" will also print extended header information such as
creations, renames and mode changes.

Let's just use "--no-patch", which suppresses the diff output.

v2: Use "--no-patch" instead of the "-s" abbreviation (Eric).

Fixes: 559c32d241 ("bin/get-pick-list.sh: simplify git oneline printing")
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-01-14 17:36:56 +02:00
Michel Dänzer
1a20b56798 amd/common: Restore v4i32 suffix for llvm.SI.load.const intrinsic
It was accidentally dropped in commit e4803ab7d2 "amd/common: use
llvm.amdgcn.s.buffer.load for LLVM 8.0", breaking the universe with LLVM
7.

Trivial.
2019-01-14 12:52:52 +01:00
Nicolai Hähnle
7fbd48fdc0 amd/common/vi+: enable SMEM loads with GLC=1
Only on LLVM 8.0+, which supports the new intrinsic.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 08:30:15 +01:00
Nicolai Hähnle
e4803ab7d2 amd/common: use llvm.amdgcn.s.buffer.load for LLVM 8.0
llvm.SI.load.const is deprecated.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 08:30:12 +01:00
Iago Toral Quiroga
1c1ae6376c anv/pipeline_cache: free NIR shader cache
Fixes: f6aa9f7185 'anv/pipeline_cache: Add support for caching NIR'
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-14 07:59:27 +01:00
Danylo Piliaiev
0862929bf6 glsl: Fix copying function's out to temp if dereferenced by array
Function's out variable could be an array dereferenced by an array:
 func(v[w[i]]);
or something more complicated.

Copy index in any case.

Fixes: 76c27e47b9 ("glsl: Copy function out to temp if we don't directly ref a variable")

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-14 12:04:07 +11:00
Kenneth Graunke
04c2f12ab2 i965: Drop mark_surface_used mechanism.
The original idea was that the backend compiler could eliminate
surfaces, so we would have it mark which ones are actually used,
then shrink the binding table accordingly.  Unfortunately, it's a
pretty blunt mechanism - it can only prune things from the end,
not the middle - since we decide the layout before we even start
the backend compiler, and only limit the size.  It also basically
gives up if it sees indirect array access.

Besides, we do the vast majority of our surface elimination in NIR
anyway, not the backend - and I don't see that trend changing any
time soon.  Vulkan abandoned this plan a long time ago, and I don't
use it in Iris, but it's still been kicking around in i965.

I hacked shader-db to print the binding table size in bytes, and
observed no changes with this patch.  So, this code appears to do
nothing useful.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-13 09:35:32 -08:00
Eric Engestrom
bdf6a5c1d2 egl: fix python lib deprecation warning
DeprecationWarning: the imp module is deprecated in favour of importlib

Instead of complicated logic, just import the file directly.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-13 13:59:08 +00:00
Jason Ekstrand
b938d5fbef spirv: Emit switch conditions on-the-fly
Instead of emitting all of the conditions for the cases of a switch
statement up-front, emit them on-the-fly as we emit the code for each
case.  The original justification for this was that we were going to
have to build a default case anyway which would need them all.  However,
we can just trust CSE to clean up the mess in that case.  Emitting each
condition right before the if statement that uses it reduces register
pressure and, in one customer benchmark, reduces spilling and improves
performance by about 2x.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
821b6861ec nir/gcm: Support deref instructions
Even though no one's been brave enough to ever use this pass, I like to
keep it functionally working.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
24c8108ea6 intel/nir: Call nir_opt_deref in brw_nir_optimize
It's an optimization so we should probably be calling it in the
optimization loop.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
e57e26121a spirv: Contain the GLSLang issue #179 workaround to old GLSLang
Instead of applying the workaround universally, detect semi-old GLSLang
via the generator ID and only enable the workaround on old GLSLang.
This isn't nearly as precise as one would like it to be because the
first GLSLang generator id version bump was on October 7, 2017 which is
about 1.5 years after the bug was fixed.  However, it at least lets us
disable it for non-GLSLang and for more modern versions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
b57c1ec421 spirv: Whack sampler/image pointers to uniform
A long time in a galaxy far far away, there was a GLSLang bug with how
it handled samplers passed in as function parameters.  (The bug can be
found here: https://github.com/KhronosGroup/glslang/issues/179.)
Unfortunately, that version was shipped in several apps and has been
causing heartburn for our SPIR-V parser ever since.

Recent changes to NIR uncovered a moderately old bug in how we work
around this issue.  In particular, we ended up with a deref_cast from
uniform to local which is not a no-op cast so nir_opt_deref wasn't
getting rid of the cast.  The only reason why it worked before was
because someone just happened to call nir_fixup_deref_modes which
"fixed" the cast (that shouldn't be happening) and then a later round of
copy-prop would get rid of it.  The fact that the deref_cast survived
that long without causing trouble for other parts of NIR is a bit
surprising.

Just whacking the mode of the pointer seems to fix it fairly
unobtrusively.  Currently, only apps with this bug will have a local
variable containing an image or sampler.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109304
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Kenneth Graunke
2b876bc922 st/nir: Lower TES gl_PatchVerticesIn to a constant if linked with a TCS.
If the TCS and TES are linked together, we can simply replace the TES's
gl_PatchVerticesIn system value with a constant, possibly allowing extra
optimization or letting the driver avoid uploading a special value.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-01-11 13:07:54 -08:00
Jonathan Marek
3d182601bb glsl/nir: keep bool types when native_integers=false
With the new handling of bool types, the conversion to float in glsl_to_nir
should not apply to bool types anymore.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-11 19:16:11 +00:00
Jonathan Marek
b27ad17115 glsl/nir: ftrunc for native_integers=false float to int cast
out_type in the default cast case is always GLSL_TYPE_FLOAT, so we get a
mov otherwise.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-11 19:16:11 +00:00
Jonathan Marek
d3b47e073e glsl/nir: int constants as float for native_integers=false
All alu instructions emitted with native_integers=false expect float
(or bool in some cases) constants, so this change is necessary.

This will cause changes with some intrinsics which had integer sources,
such as nir_intrinsic_load_uniform. Apparently it might cause issues with
some opt passes, but perhaps those don't apply in OpenGL ES 2.0 cases?

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-11 19:16:11 +00:00
Jason Ekstrand
1ede463b6e intel/peephole_ffma: Fix swizzle propagation
The num_components value passed into get_mul_for_src is used to only
compose the parts of the swizzle that we know will be used so we don't
compose invalid swizzle components.  However, we had a bug where we
passed the number of components of the add all the way through.  For the
given source, we need the number of components read from that source.
In the case where we have a narrow add, say 2 components, that is
sourced from a chain of wider instructions, we may not compose all the
swizzles.  All we really need to do is pass through the right number of
components at each level.

Fixes: 2231cf0ba3 "nir: Fix output swizzle in get_mul_for_src"
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-01-11 10:44:08 -06:00
Kenneth Graunke
ae683ed3bc nir: Allow a non-existent sampler deref in nir_lower_samplers_as_deref
GL_ARB_gl_spirv does not provide a sampler deref for e.g. texelFetch(),
so we can't assume that both are present and identical.  Simply lower
each if it is present.

Fixes regressions in GL_ARB_gl_spirv tests since I switched everyone to
using this pass.  Thanks to Alejandro Piñeiro for catching these.

Fixes: f003859f97 nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-11 07:54:32 -08:00
Eric Engestrom
e12b0b5c6d travis: avoid using unset llvm-config
Fixes the following errors:
  usage: which [-as] program ...
  /Users/travis/.travis/job_stages: line 110: --version: command not found

... caused by the use of an undefined $LLVM_CONFIG

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-11 14:38:35 +00:00
Eric Engestrom
c8ae891035 egl: remove unused include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-11 14:37:47 +00:00
Eric Engestrom
d75fbff667 egl: add missing includes
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-11 14:37:47 +00:00
Iago Toral Quiroga
4b1e436bc9 anv/pipeline_cache: fix incorrect guards for NIR cache
Fixes: f6aa9f7185 'anv/pipeline_cache: Add support for caching NIR'
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-11 12:45:18 +01:00
Kenneth Graunke
ad9832d17b blorp: Pass the batch to lookup/upload_shader instead of context
This will allow drivers to pin shader buffers if necessary.

i965 and anv do not need to do this today, but iris will.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-10 20:52:04 -08:00
Kenneth Graunke
084a1cdbb7 blorp: Add blorp_get_surface_address to the driver interface.
Currently, BLORP expects drivers to provide two functions for dealing
with buffers: blorp_emit_reloc and blorp_surface_reloc.  Both record a
relocation and combine the BO address and offset into a full 64-bit
address.  Traditionally, blorp_surface_reloc has written that combined
address to an implicitly-known buffer where surface states are stored.
(In contrast, blorp_emit_reloc returns the value.)

The upcoming Iris driver stores surface states in multiple buffers,
which makes it impossible for blorp_surface_reloc to write the combined
address - it only takes an offset, not the actual buffer to write to.

This commit adds a third function, blorp_get_surface_address, which
combines and returns an address, which is then passed to ISL's surface
state fill functions.  Softpin-only drivers can return a real address
here and skip writing it in blorp_surface_reloc.  Relocation-based
drivers are have options.  They can simply return 0 from the new
function, and continue writing the address from blorp_surface_reloc.
Or, they can return a presumed address from blorp_get_surface_address,
and have other relocation processing write the real value later.

For now, i965 and anv simply return 0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-10 20:51:53 -08:00
Ilia Mirkin
2165636e9c docs: fix gallium screen cap docs
Make sure that the next line starts with spaces so that bullets are
maintained throughout, add `` around a few more special tokens, and fix
SAMPLE_COUNT_TEXTURE -> SAMPLE_COUNT.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-01-10 21:44:09 -05:00
Danylo Piliaiev
a2db6b4254 glsl: Make invariant outputs in ES fragment shader not to cause error
In all GLSL ES versions output variables in fragment shader are allowed
to be invariant.

 From Section 4.6.1 ("The Invariant Qualifier") GLSL ES 1.00 spec:
 "Only the following variables may be declared as invariant:
   ...
   - Built-in special variables output from the fragment shader."

 From Section 4.6.1 ("The Invariant Qualifier") GLSL ES 3.00 spec:
 "Only variables output from a shader can be candidates for invariance."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107842
2019-01-11 13:01:11 +11:00
Jason Ekstrand
eb4b1477dc anv/pipeline: Cache the pre-lowered NIR
This adds a second level of caching for the pre-lowered NIR that's only
based off of the shader module, entrypoint and specialization constants.
This is enough for spirv_to_nir as well as our first round of lowering
and optimization.  Caching at this level should allow for faster shader
recompiles due to state changes.

The NIR caching does not get serialized to disk via either the
VkPipelineCache serialization mechanism or the transparent on-disk
cache.  We could but it's usually not that expensive to fall back to
SPIR-V for the odd cache miss especially if it only happens once for
several misses and it simplifies the cache.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Jason Ekstrand
f6aa9f7185 anv/pipeline_cache: Add support for caching NIR
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Jason Ekstrand
8dfda5ebbe anv/pipeline: Hash shader modules and spec constants separately
The stuff hashed by anv_pipeline_hash_shader is exactly the inputs to
anv_shader_compile_to_nir so it can be used for NIR caching.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Jason Ekstrand
b90e55a5d5 compiler/types: Serialize/deserialize subpass input types correctly
They have glsl_sampler_dim enum values of 8 and 9 which don't work when
you & them with 0x7.  Fortunately, we have plenty of bits.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Jason Ekstrand
73ddfbeb85 anv/pipeline: Move wpos and input attachment lowering to lower_nir
This lets us make anv_pipeline_compile_to_nir take a device instead of a
pipeline.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 19:15:27 -06:00
Matt Turner
32e266a9a5 i965: Compile fp64 funcs only if we do not have 64-bit hardware support
Brown bag fix...
2019-01-10 15:22:17 -08:00
Jason Ekstrand
8ea8727a87 anv/pipeline: Constant fold after apply_pipeline_layout
Thanks to the new NIR load_descriptor intrinsic added by the UBO/SSBO
lowering series, we weren't getting UBO pushing because the UBO range
detection pass couldn't see the constants it needed.  This fixes that
problem with a quick round of constant folding.  Because we're folding
we no longer need to go out of our way to generate constants when we
lower the vulkan_resource_index intrinsic and we can make it a bit
simpler.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-10 20:34:00 +00:00
Rob Clark
031e94dc72 freedreno/a6xx: fix 3d+tiled layout
The last round of fixing 3d layer+level layout skipped the tiled case,
since tiled texture support was not in place yet.  This finishes the
job.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Rob Clark
c92c18c70c freedreno/a6xx: move tile_mode to sampler-view CSO
This is known when the CSO is created, so no need to patch it in later.

Also, it seems like smaller textures where the first level is small
enough to be linear, it seems like we should set linear tile mode.

See: dEQP-GLES3.functional.texture.format.unsized.rgb_unsigned_byte_3d_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Rob Clark
eb625d30b7 freedreno/a6xx: separate stencil restore/resolve fixes
Previously we'd use format/etc from the primary (z32) buffer for the
stencil (s8), due to confusion about rsc vs psurf.  Rework this to drop
extra arg and push down handling of separate stencil case (and make sure
we take the fmt from the right place).

This doesn't completely fix separate-stencil, but at least it avoids the
GPU scribbling over random other cmdstream buffers and causing a bunch
of bogus fails in dEQP.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Rob Clark
04aff7e42b freedreno: make cmdstream bo's read-only to GPU
If nothing else, this will make problems with cmdstream getting blit
over with pixels easier to track down (ie. faults when it first happens
rather than strange failures later from corrupted cmdstream when a
stateobj is later reused).

(NOTE this somewhat depends on the kernel supporting the flag, and the
iommu implementation.  But the worst case is just that the cmdstream
ends up writeable as before.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Guido Günther
286de96af8 etnaviv: fix typo in cflush_all description
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-01-10 18:46:10 +01:00
Eric Engestrom
53fbde4df3 radv: remove a few more unnecessary KHR suffixes
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2019-01-10 16:53:44 +00:00
Rhys Perry
0210243923 nir: fix copy-paste error in nir_lower_constant_initializers
Fixes: 393b59e077
    ('nir: Rework nir_lower_constant_initializers() to handle functions')
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-10 10:51:52 -06:00
Andres Gomez
6c3164cd08 docs: complete the calendar and release schedule documentation
As suggested by Emil Velikov.

Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-01-10 15:53:02 +02:00
Andres Gomez
428164d87f glsl/linker: specify proper direction in location aliasing error
The check for location aliasing was always asuming output variables
but this validation is also called for input variables.

Fixes: e2abb75b0e ("glsl/linker: validate explicit locations for SSO programs")
Cc: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-10 15:51:57 +02:00
Andres Gomez
e2e03f84f9 editorconfig: Add max_line_length property
The property is supported by the most of the editors, but not all:
https://github.com/editorconfig/editorconfig/wiki/EditorConfig-Properties#max_line_length

Cc: Eric Engestrom <eric@engestrom.ch>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-10 15:50:34 +02:00
Tapani Pälli
864cc419eb intel/isl: move tiled_memcpy static libs from i965 to isl
Patch moves intel_tiled_memcpy[_sse41] libraries to isl, renames some
functions and types and makes the required build system changes for
meson, automake and Android. No functional changes are introduced.

v2: code cleanups, move isl_get_memcpy_type to i965 (Jason)
v3: move isl_mem_copy_fn to priv header, cleanups (Jason, Dylan)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-10 08:02:30 +02:00
Matt Turner
406f603b34 i965: Enable 64-bit GLSL extensions
Now that we have software implementations of ARB_gpu_shader_int64 and
ARB_gpu_shader_fp64 we can unconditionally enable these extensions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
613ac3aaa2 i965: Compile fp64 software routines and lower double-ops
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
18b4e87370 intel/compiler: Heap-allocate temporary storage
Shaders containing software implementations of double-precision
operations can be very large such that we cannot stack-allocate
an array of grf_count*16.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
622d429128 intel/compiler: Expand size of the 'nr' field
Shaders containing software implementations of double-precision
operations can be very large such that we have more the 2^16 virtual
registers during optimization.

Move the 'nr' field to the union containing the immediate storage and
expand it to 32-bits.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
7e4e9da90d intel/compiler: Prevent warnings in the following patch
The next patch replaces an unsigned bitfield with a plain unsigned,
which triggers gcc to begin warning on signed/unsigned comparisons.

Keeping this patch separate from the actual move allows bisectablity and
generates no additional warnings temporarily.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
2b801b6668 intel/compiler: Rearrange code to avoid future problems
A follow on commit will move nr to the same union as the immediate
data, so we should assert these invariants before we overwrite the nr
field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
3b967e1724 intel/compiler: Avoid false positive assertions
A follow on patch will move the 'nr' field to the union containing the
immediate field, so prepare by checking that we're only testing these
assertions if the .file is correct.

The assertions with != ARF were kind of silly to begin with because the
<128 check is specifically only for things in the GRF.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
8534742404 intel/compiler: Split 64-bit MOV-indirects if needed
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Matt Turner
e76772af6c intel/compiler: Lower 64-bit MOV/SEL operations 2019-01-09 16:42:40 -08:00
Matt Turner
2623653126 nir: Unset metadata debug bit if no progress made
NIR metadata validation verifies that the debug bit was unset (by a call
to nir_metadata_preserve) if a NIR optimization pass made progress on
the shader. With the expectation that the NIR shader consists of only a
single main function, it has been safe to call nir_metadata_preserve()
iff progress was made.

However, most optimization passes calculate progress per-function and
then return the union of those calculations. In the case that an
optimization pass makes progress only on a subset of the functions in
the shader metadata validation will detect the debug bit is still set on
any unchanged functions resulting in a failed assertion.

This patch offers a quick solution (short of a larger scale refactoring
which I do not wish to undertake as part of this series) that simply
unsets the debug bit on unchanged functions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
e633fae5cb nir: Add lowering support for 64-bit operations to software
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
fe2cbcf3ee nir: Create nir_builder in nir_lower_doubles_impl()
We're going to use it more in a future patch, and this avoids a lot of
gross code.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
ecb115eb3f nir: Add and set info::uses_64bit
Will be used to communicate that a shader uses 64-bit operations to the
concerned lowering passes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
41f3e9e5f5 nir: Implement lowering of 64-bit shift operations
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
62d55f1281 nir: Wire up int64 lowering functions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Jason Ekstrand
adab27e741 nir: Add some more int64 lowering helpers
[mattst88]: Found in an old branch of Jason's.

Jason implemented: inot, iand, ior, iadd, isub, ineg, iabs, compare,
                   imin, imax, umin, umax
Matt implemented:  ixor, bcsel, b2i, i2b, i2i8, i2i16, i2i32, i2i64,
                   u2u8, u2u16, u2u32, u2u64, and fixed ilt

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
dde73e646f nir: Tag entrypoint for easy recognition by nir_shader_get_entrypoint()
We're going to have multiple functions, so nir_shader_get_entrypoint()
needs to do something a little smarter.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 16:42:40 -08:00
Matt Turner
393b59e077 nir: Rework nir_lower_constant_initializers() to handle functions
Previously it assumed that only a single function (the entrypoint)
existed and attempted to lower constant initializers of shader outputs
for each function, for instance.
2019-01-09 16:42:40 -08:00
Sagar Ghuge
f998ce4111 glsl: Add "built-in" functions to do fp32_to_int64(fp32)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
2632c12477 glsl: Add "built-in" functions to do fp32_to_uint64(fp32)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
876a4b85fe glsl: Add "built-in" functions to do fp64_to_int64(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
21e9bb2b3f glsl: Add utility function to round and pack int64_t value
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
5a674fd789 glsl: Add "built-in" functions to do fp64_to_uint64(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
5a87441807 glsl: Add utility function to round and pack uint64_t value
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
c9d333a6b7 glsl: Add "built-in" functions to do int64_to_fp32(int64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
d5cf6e92b4 glsl: Add "built-in" functions to do uint64_to_fp32(uint64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
b830efb191 glsl: Add "built-in" functions to do int64_to_fp64(int64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
7c5b982b89 glsl: Add "built-in" functions to do uint64_to_fp64(uint64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Matt Turner
15757bc80b glsl: Add "built-in" functions to convert bool to double
And vice versa.

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
e213f3871f glsl: Add "built-in" functions to do ffract(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
5c9a659f50 glsl: Add "built-in" function to do ffloor(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
83762afa66 glsl: Add "built-in" functions to do fmin/fmax(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
92ac2169fb glsl: Add "built-in" functions to do ffma(fp64)
Definitely not actually a fused-multiply add.

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
3db81b5d9f glsl: Add "built-in" functions to do round(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
48891ab441 glsl: Add "built-in" functions to do trunc(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
2119094b1d glsl: Add "built-in" functions to do sqrt(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
cad58fc5e7 glsl: Add "built-in" functions to do fp32_to_fp64(fp32)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
407bd1bbf9 glsl: Add "built-in" functions to do fp64_to_fp32(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
f499942b31 glsl: Add "built-in" functions to do int_to_fp64(int)
v2: use mix
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
773190f281 glsl: Add "built-in" functions to do fp64_to_int(fp64)
v2: use mix

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
cbf090b809 glsl: Add "built-in" functions to do uint_to_fp64(uint)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
a3551ee61f glsl: Add "built-in" functions to do fp64_to_uint(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
4a93401546 glsl: Add "built-in" functions to do mul(fp64, fp64)
v2: use mix
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
f111d72596 glsl: Add "built-in" functions to do add(fp64, fp64)
v2: use mix and findMSB to optimise.
v3: [Sagar] Fix zFrac0 == 0u case in __normalizeRoundAndPackFloat64

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
c036fc97a2 glsl: Add "built-in" functions to do lt(fp64, fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
3e4d5ea7b8 glsl: Add utility function to extract 64-bit sign
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Elie Tournier
ec6e823a99 glsl: Add "built-in" functions to do eq/ne(fp64, fp64) 2019-01-09 16:42:40 -08:00
Elie Tournier
c802cdde9d glsl: Add "built-in" function to do sign(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
eac66f0248 glsl: Add "built-in" functions to do neg(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
0428951b9d glsl: Add "built-in" function to do abs(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Matt Turner
b63a1f8e40 glsl: Create file to contain software fp64 functions
The following patches will add implementations of various
double-precision operations to this file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Ian Romanick
412472da5c glsl: Add utility to convert text files to C strings
Will be used to convert the .glsl source file containing software fp64
routines to a .h file that can be included while building the compiler.

This commit contains two squashed together: the first from Ian adding
the utility (with the existing title), and the second from Dylan making
the code both python2 and python3 compatible.

This is somewhat modeled after the xxd utility that comes with Vim.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>

xxd.py: Make python2 and 3 compatible

This makes use of unicode_literals, so that undecorated strings are
considered text (python2 unicode, python3 str) and not bytes in python2
and text in python3. It makes use of io.open, which provides python2
with python3's open behavior (it's an alias in python3), in particular
support for the 't' and 'b' option. Finally, it decorates all of the
string literals with the 'b' prefix, so that python interprets them as
bytes.

I've removed the stdin and stdout options, as python2 always requires
these to be bytes, but python3 always treats them as text (there is a
way to get at the underlying bytes buffer, but that's even more
complexity), and makes the input files required arguments.

In the meson we use the '@INPUT@' shorthand instead of listing each
input, as meson will expand that to [prog_python, '@INPUT0@', @INPUT1@,
..., @OUTPUT@, ...]
2019-01-09 16:42:40 -08:00
Timothy Arceri
76c27e47b9 glsl: Copy function out to temp if we don't directly ref a variable
Otherwise we can end up with IR that looks like this:

    (
      (declare (temporary ) vec4 f@8)
      (assign  (xyzw) (var_ref f@8)  (var_ref f) )
      (call f16  ((swiz y (var_ref f@8) )))

      (assign  (xyzw) (var_ref f)  (var_ref f@8) )
    ))

When we really need:

      (declare (temporary ) float inout_tmp)
      (assign  (x) (var_ref inout_tmp)  (swiz y (var_ref f) ))
      (call f16  ((var_ref inout_tmp) ))

      (assign  (y) (var_ref f)  (swiz y (swiz xxxx (var_ref inout_tmp) )))
      (declare (temporary ) void void_var)

The GLSL IR function inlining code seemed to produce correct code
even without this but we need the correct IR for GLSL IR -> NIR to
be able to understand whats going on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Matt Turner
63f6d7afd6 glsl: Add function support to glsl_to_nir
Based on a patch from Tim Arceri, but I had to substantially rewrite it
as a result of the NIR derefs rework.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Francisco Jerez
230a8a541d intel/fs: Remove FS_OPCODE_UNPACK_HALF_2x16_SPLIT opcodes.
These are broken on a future platform, but it turns out we don't need
to fix them, since they're just type-converting moves with strided
source.  Kill them.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
cbea91eb57 intel/fs: Remove nasty open-coded CHV/BXT 64-bit workarounds.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
2c99c7a56c intel/fs: Remove existing lower_conversions pass.
It's redundant with the functionality provided by lower_regioning now.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
efa4e4bc5f intel/fs: Introduce regioning lowering pass.
This legalization pass is meant to handle situations where the source
or destination regioning controls of an instruction are unsupported by
the hardware and need to be lowered away into separate instructions.
This should be more reliable and future-proof than the current
approach of handling CHV/BXT restrictions manually all over the
visitor.  The same mechanism is leveraged to lower unsupported type
conversions easily, which obsoletes the lower_conversions pass.

v2: Give conditional modifiers the same treatment as predicates for
    SEL instructions in lower_dst_modifiers() (Iago).  Special-case a
    couple of other instructions with inconsistent conditional mod
    semantics in lower_dst_modifiers() (Curro).

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
b94519971a intel/fs: Constify fs_inst::can_do_source_mods().
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
c301f447ea intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass.
Currently the visitor attempts to enforce the regioning restrictions
that apply to double-precision instructions on CHV/BXT at NIR-to-i965
translation time.  It is possible though for the copy propagation pass
to violate this restriction if a strided move is propagated into one
of the affected instructions.  I've only reproduced this issue on a
future platform but it could affect CHV/BXT too under the right
conditions.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
464e79144f intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.
I triggered this bug while prototyping code for a future platform on
IVB.  Could be a problem today though if a strided move is
copy-propagated into a type-converting move with DF destination.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
bc781a0323 intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.
This seems to be a problem in combination with the lower_regioning
pass introduced by a future commit, which can modify a SIMD-split
instruction causing its execution size to become illegal again.  A
subsequent call to lower_simd_width() would hit this bug on a future
platform.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
812ede088f intel/fs: Implement quad swizzles on ICL+.
Align16 is no longer a thing, so a new implementation is provided
using Align1 instead.  Not all possible swizzles can be represented as
a single Align1 region, but some fast paths are provided for
frequently used swizzles that can be represented efficiently in Align1
mode.

Fixes ~90 subgroup quad swap Vulkan CTS tests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
c5f9c0009d intel/fs: Handle source modifiers in lower_integer_multiplication().
lower_integer_multiplication() implements 32x32-bit multiplication on
some platforms by bit-casting one of the 32-bit sources into two
16-bit unsigned integer portions.  This can give incorrect results if
the original instruction specified a source modifier.  Fix it by
emitting an additional MOV instruction implementing the source
modifiers where necessary.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Andrii Simiklit
0206ffc28d anv/pipeline: remove unnecessary null-pointer check
Looks like it is impossible that 'last' variable is a null
because at least the get_vs_prog_data shouldn't return a null pointer.
So this check is unnecessary starts from commit:
99d497c5b6 "anv/pipeline: Replace get_fs_input_map with ..."

This small issue is found by cppcheck.

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-09 12:29:12 -06:00
Indrajit Das
d2c170eb35 st/va: Return correct status from vlVaQuerySurfaceStatus
This ensures that during encoding, applications can get
the correct status of the surface before submitting
more operations on the same.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
2019-01-09 11:34:22 -05:00
Roland Scheidegger
0c226d40ef Revert "llvmpipe: Always return some fence in flush (v2)"
This reverts commit f6a6da8131.

With this commit we see massive amounts of asserts triggering
in lp_fence_wait(), assert(f->issued), for instance with libgl_xlib
state tracker and piglit. Not entirely sure if the assert could
just be removed.
2019-01-09 17:28:53 +01:00
Marek Olšák
e986c1ca1d st/mesa: don't leak pipe_surface if pipe_context is not current
We have found some pipe_surface leaks internally.

This is the same code as surface_destroy in radeonsi.
Ideally, surface_destroy would be in pipe_screen.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-01-09 11:08:44 -05:00
Marek Olšák
fd82a1d1d6 st/mesa: don't reference pipe_surface locally in PBO code
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-01-09 11:08:44 -05:00
Marek Olšák
5da442338b st/mesa: unify window-system renderbuffer initialization
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-01-09 11:08:44 -05:00
Mario Kleiner
5e30e54e05 radeonsi: Fix use of 1- or 2- component GL_DOUBLE vbo's.
With Mesa 18.1, commit be973ed21f, si_llvm_load_input_vs()
changed the number of source 32-bit wide dword components
used for fetching vertex attributes into the vertex shader
from a constant 4 to a variable num_channels number, depending
on input data format, with some special case handling for
input data formats like 64-Bit doubles.

In the case of a GL_DOUBLE input data format with one
or two components though, e.g, submitted via ...

a) glTexCoordPointer(1, GL_DOUBLE, 0, buffer);
b) glTexCoordPointer(2, GL_DOUBLE, 0, buffer);

... the input format would be SI_FIX_FETCH_RG_64_FLOAT,
but no special case handling was implemented for that
case, so in the default path the number of 32-bit
dwords would be set to the number of float input components
derived from info->input_usage_mask. This ends with corrupted
input to the vertex shader, because fetching a 64-bit double
from the vbo requires fetching two 32-bit dwords instead of 1,
and fetching a two double input requires 4 dword fetches
instead of 2, so in these cases the vertex shader receives
incomplete/truncated input data:

a) float v = gl_MultiTexCoord0.x;  -> v.x is corrupted.
b) vec2  v = gl_MultiTexCoord0.xy; -> v.x is assigned
   correctly, but v.y is corrupted.

This happens with the standard TGSI IR compiled shaders.
Under NIR with R600_DEBUG=nir, we got correct behavior
because the current radeonsi nir code always assigns
info->input_usage_mask = TGSI_WRITEMASK_XYZW, thereby
always fetches 4 dwords regardless of what the shader
actually needs.

Fix this by properly assigning 2 or 4 dword fetches for
one or two component GL_DOUBLE input.

Fixes: be973ed21f ("radeonsi: load the right number of
       components for VS inputs and TBOs")

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-01-09 11:08:44 -05:00
Rhys Perry
ee8488ea3b ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics
Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is
enabled with DXVK.
It also fixes artifacts with Fallout 4's god rays with DXVK.
Various piglit interpolateAt*() tests under NIR are also fixed.

v2: formatting fix
    update commit message to include Fallout 4 and the Fixes tag

Fixes: f4e499ec79 ('radv: add initial non-conformant radv vulkan driver')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2019-01-09 14:57:07 +00:00
Samuel Pitoiset
b8c4f523b4 radv: skip draws with instance_count == 0
Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 14:22:38 +01:00
Samuel Pitoiset
a2b5cc3c39 radv: enable variable pointers
The Vulkan spec 1.1.97 says:
   "variablePointers specifies whether the implementation supports
    the SPIR-V VariablePointers capability. When this feature is
    not enabled, shader modules must not declare the
    VariablePointers capability."

As the SPIR-V feature is enabled, we should turn on the
extension feature as well.

All dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.*
pass with the khronos internal repo. Note that a bunch of them
fails with the public repo, but it's expected as they violate the
specification.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 12:32:18 +01:00
Samuel Pitoiset
d58b11e709 radv: get rid of bunch of KHR suffixes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-09 12:26:48 +01:00
Maya Rashish
a2ddb710fd radeon: fix printf format specifier.
From glibc printf(3):

Z      A nonstandard synonym for z that predates the appearance of z.
       Do not use in new code.

Z may not exist on non-glibc systems. Prefer the standard symbol.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-01-09 14:15:06 +11:00
Tomasz Figa
f6a6da8131 llvmpipe: Always return some fence in flush (v2)
If there is no last fence, due to no rendering happening yet, just
create a new signaled fence and return it, to match the expectations of
the EGL sync fence API.

Fixes random "Could not create sync fence 0x3003" assertion failures from
Skia on Android, coming from the following code:

https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427

Reproducible especially with thread count >= 4.

One could make the driver always keep the reference to the last fence,
but:

 - the driver seems to explicitly destroy the fence whenever a rendering
   pass completes and changing that would require a significant functional
   change to the code. (Specifically, in lp_scene_end_rasterization().)

 - it still wouldn't solve the problem of an EGL sync fence being created
   and waited on without any rendering happening at all, which is
   also likely to happen with Android code pointed to in the commit.

Therefore, the simple approach of always creating a fence is taken,
similarly to other drivers, such as radeonsi.

Tested with piglit llvmpipe suite with no regressions and following
tests fixed:

egl_khr_fence_sync
 conformance
  eglclientwaitsynckhr_flag_sync_flush
  eglclientwaitsynckhr_nonzero_timeout
  eglclientwaitsynckhr_zero_timeout
  eglcreatesynckhr_default_attributes
  eglgetsyncattribkhr_invalid_attrib
  eglgetsyncattribkhr_sync_status

v2:
 - remove the useless lp_fence_reference() dance (Nicolai),
 - explain why creating the dummy fence is the right approach.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
2019-01-09 02:06:13 +01:00
Eric Anholt
700aeaf9c8 glsl: Fix buffer overflow with an atomic buffer binding out of range.
The binding is checked against the limits later in the function, so we
need to make sure we don't overflow before the check here.

Fixes this valgrind warning (and sometimes segfault):

==1460== Invalid write of size 4
==1460==    at 0x74C98DD: ast_declarator_list::hir(exec_list*, _mesa_glsl_parse_state*) (ast_to_hir.cpp:4943)
==1460==    by 0x74C054F: _mesa_ast_to_hir(exec_list*, _mesa_glsl_parse_state*) (ast_to_hir.cpp:159)
==1460==    by 0x7435C12: _mesa_glsl_compile_shader (glsl_parser_extras.cpp:2130)

in

dEQP-GLES31.functional.debug.negative_coverage.get_error.compute.
   exceed_atomic_counters_limit

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-01-08 15:44:58 -08:00
Eric Anholt
211b826790 nir: Make nir_deref_instr_build/get_const_offset actually use size_align.
I think this was copy-and-paste mistake -- nir_opt_large_constants was
passing in glsl_get_natural_size_align_bytes() given brw_nir.c's arguments
to the opt pass.

I wanted to reuse this function for handling constant offsets of arrays of
images in V3D.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-08 15:40:53 -08:00
Danylo Piliaiev
9f29d90327 glsl/linker: Fix unmatched TCS outputs being reduced to local variable
Always match TCS outputs since they are shared by all invocations
within the patch and should not be converted to local variables.

This is one of the issues found in Downward.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104297
2019-01-09 10:31:13 +11:00
Eric Anholt
db3b6b6bca v3d: Enable GL_ARB_texture_gather on V3D 4.x.
This is part of GLES 3.1, and with the NIR lowering we're now passing the
GLES31 testcases.
2019-01-08 13:03:44 -08:00
Eric Anholt
6051c11d17 nir: Add nir_lower_tex support for Broadcom's swizzled TG4 results.
V3D returns the texels in a different order in the resulting vec4 from
what GLSL wants, so we need to put in a swizzle.  Fixes
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8.base_level.level_1

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 13:03:41 -08:00
Bas Nieuwenhuizen
3fcec4a550 freedreno: Move register constant files to src/freedreno.
This way they can be shared. Build tested with meson, but not too sure
on the autotools stuff though.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Rob Clark <robdclark@gmail.com>
2019-01-08 21:46:14 +01:00
Caio Marcelo de Oliveira Filho
baabfb1959 nir: fix warning in nir_lower_io.c
Initialize the variable with NULL.  Fixes the following

    In file included from ../src/compiler/nir/nir_lower_io.c:34:
    ../src/compiler/nir/nir_lower_io.c: In function ‘nir_lower_explicit_io’:
    ../src/compiler/nir/nir.h:668:11: warning: ‘addr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
        return src;
               ^~~
    ../src/compiler/nir/nir_lower_io.c:735:17: note: ‘addr’ was declared here
        nir_ssa_def *addr;
                     ^~~~

v2: Avoid using a 'default' case so we get help from the compiler when
    new deref types are added. (Lionel)

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 12:29:56 -08:00
Chia-I Wu
3cb65cf8aa freedreno/drm: sync uapi again
"pad" was missing in Mesa's msm_drm.h.  sizeof(drm_msm_gem_info)
remains the same, but now the compiler initializes the field to
zero.

Buffer allocation results in EINVAL without this for me.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Kristian Høgsberg <hoegsberg@gmail.com>
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
2019-01-08 19:55:28 +00:00
Chia-I Wu
6eeb1fe491 meson: fix EGL/X11 build without GLX
dep_xcb and others were not set under this configuration.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-08 10:58:48 -08:00
Eric Engestrom
b38a48a569 wsi: drop unneeded KHR suffix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:48:03 +00:00
Eric Engestrom
4f5a526789 anv: drop unneeded KHR suffix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:47:56 +00:00
Karol Herbst
d0c6ef2793 nir: rename global/local to private/function memory
the naming is a bit confusing no matter how you look at it. Within SPIR-V
"global" memory is memory accessible from all threads. glsl "global" memory
normally refers to shader thread private memory declared at global scope. As
we already use "shared" for memory shared across all thrads of a work group
the solution where everybody could be happy with is to rename "global" to
"private" and use "global" later for memory usually stored within system
accessible memory (be it VRAM or system RAM if keeping SVM in mind).
glsl "local" memory is memory only accessible within a function, while SPIR-V
"local" memory is memory accessible within the same workgroup.

v2: rename local to function as well
v3: rename vtn_variable_mode_local as well

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:51:46 +01:00
Dylan Baker
401dca1c73 autotools: Remove tegra vdpau driver
This has never functioned and probably wont ever function, due to the
way gallium media state trackers are architected and the tegra video
decoder is architected.

Cc: Thierry Reding <thierry.reding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: 1755f608f5
       ("tegra: Initial support")
2019-01-08 09:42:56 -08:00
Pierre Moreau
ba55cb2bcd clover/meson: Ignore 'svn' suffix when computing CLANG_RESOURCE_DIR
The version exported by LLVM in its CMake configuration files can
include the “svn” suffix when building a development version (for
example “8.0.0svn”). However the exported clang headers are still found
under “lib/clang/8.0.0/”, without the “svn” suffix.
Meson takes care of removing the “svn” suffix from the version when
using the dependency’s `version()` method.

This processing is already performed in “configure.ac” when using
autotools.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-08 08:53:38 -08:00
Lionel Landwerlin
add5a2ec92 anv: flush fast clear colors into compressed surfaces
In the following scenario :

   1. Create image format R8G8B8A8_UNORM
   2. Create image view format R8G8B8A8_SRGB
   3. Clear the view through a sub pass to a particular color
   4. Barrier on the image to from color attachment to source transfer
   5. Copy the image into a linear buffer to check the content

The step 4 resolving the clear color is unaware of the SRGB format of
the view, because the blorp resolve operations operate on images the
color associated with the resolve will not operate on SRGB format but
UNORM. Leading to the wrong color being written into surfaces.

This change forces a clear color resolve at the end of the render pass
so following resolves won't have to deal with the clear color with a
format that doesn't match the image's format.

On gfxbench vulkan_5_normal 1280x720, this appear to cost us ~0.5fps,
from 49.316 down to 48.949.

v2: Only fast clear resolve when image & view have different formats
    (Lionel)

v3: Update warning (Jason)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2019-01-08 16:37:00 +00:00
Lionel Landwerlin
366eb656ac anv: explictly specify format for blorp ccs/mcs op
Resolve operations can happen when dealing with view (begin/end
subpasses) in which case the view's format needs to apply, not the
image's format.

v2: Relayout arguments of a ccs_op() call (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Cc: mesa-stable@lists.freedesktop.org
2019-01-08 16:36:56 +00:00
Tapani Pälli
c292414765 dri3: initialize adaptive_sync as false before configQueryb
Fixes following errors from valgrind output:

   ==23388== Conditional jump or move depends on uninitialised value(s)
   ==23388==    at 0x48B4924: loader_dri3_drawable_init (loader_dri3_helper.c:381)
   ==23388==    by 0x48A97D2: dri3_create_drawable (dri3_glx.c:386)
   ==23388==    by 0x489E190: driFetchDrawable (dri_common.c:369)
   ==23388==    by 0x48A9187: dri3_bind_context (dri3_glx.c:195)
   ==23388==    by 0x488B75C: MakeContextCurrent (glxcurrent.c:220)
   ==23388==    by 0x488B8DB: glXMakeCurrent (glxcurrent.c:267)
   ==23388==    by 0x10A987: ??? (in /usr/bin/glxgears)
   ==23388==    by 0x4BEB412: (below main) (in /usr/lib64/libc-2.28.so)
   ==23388==
   ==23388== Conditional jump or move depends on uninitialised value(s)
   ==23388==    at 0x48B5A40: loader_dri3_swap_buffers_msc (loader_dri3_helper.c:923)
   ==23388==    by 0x48A9B7E: dri3_swap_buffers (dri3_glx.c:587)
   ==23388==    by 0x4887A81: glXSwapBuffers (glxcmds.c:857)
   ==23388==    by 0x10ADED: ??? (in /usr/bin/glxgears)
   ==23388==    by 0x4BEB412: (below main) (in /usr/lib64/libc-2.28.so)

Fixes: 2e12fe425f "loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2019-01-08 08:15:07 +02:00
Dave Airlie
4298a85ae8 virgl: use primconvert provoking vertex properly
This stores the raster state and calls the correct primconvert interface
using the currently bound raster state.

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-01-08 12:06:41 +10:00
Jason Ekstrand
754eff07d2 anv: Sort properties and features switch statements
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-07 18:41:15 -06:00
Jason Ekstrand
05d72d6d48 spirv: Sort supported capabilities
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-07 18:41:15 -06:00
Jason Ekstrand
34af63fa22 anv: Enable the new deref-based UBO/SSBO path
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
63b9aa2e25 spirv: Add support for using derefs for UBO/SSBO access
For now, it's hidden behind a cap.  Hopefully, we can eventually drop
that along with all the manual offset code in spirv_to_nir.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
3a7c5667c8 spirv: Make better use of vtn_pointer_uses_ssa_offset
The choice of whether or not we should use block_load/store isn't a
choice between external and not so much as a choice between deref
instructions and manually calculated offsets.  In vtn_pointer_from_ssa,
we guard the index+offset case behind vtn_pointer_uses_ssa_offset and
then branch out from there.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
adc155a815 spirv: Add explicit pointer types
Instead of baking in uvec2 for UBO and SSBO pointers and uint for push
constant and shared memory pointers, make it configurable.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
be039cb467 spirv: Choose atomic deref type with pointer_uses_ssa_offset
Previously, we hard-coded the rule about workgroup variables and the
builder lower_workgroup_access_to_offsets flag.  Instead base it on the
handy helper we have for exactly this sort of thing.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
5c3cb9c3ce spirv: Add error checking for Block and BufferBlock decorations
Variable pointers being well-defined across the block boundary requires
a couple of very specific SPIR-V validation rules.  Normally, we'd trust
the validator to catch these but since CTS tests have been found in the
wild which violate them, we'll carry our own checks.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
e90b738f20 nir/vulkan: Add a descriptor type to vulkan resource intrinsics
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
f393b10b3f nir/lower_io: Add "explicit" IO lowering
This new pass is for lowering explicitly laid out memory coming in from
SPIR-V or a similar source.  It's quite a bit more complicated than the
normal lower_io because we have to be able to handle matrices.  The
way the stride information is stored for matrices is awkward and dealing
with row-major matrices is especially painful.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
52dd43c7ef nir/validate: Allow array derefs on vectors in more modes
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
013ee5732b nir/intrinsics: Add access flags to load/store_deref
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
7755171e4c nir/intrinsics: Allow deref sources to consume anything
This commit adds a new num_components value for intrinsic sources of -1
which means that it consumes everything and the number of components
effectively isn't validated.  This is useful for deref sources which
just take the result of the deref and we leave it up to the driver to
decide what that size should be.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
d0fe52a456 nir/validate: Allow derefs in phi nodes
We added this assert when first moving derefs over to instructions to
ensure that deref chains could go all the way back to the variables.
Now that we're going to start using derefs for things that we can do
variable pointers on such as UBOs and SSBOs, we need to be able to run
derefs through phi nodes, selects, and basically anything else.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
7e85480a67 nir/remove_dead_variables: Properly handle deref casts
We already detect any incomplete deref chains (where the deref is used
for something other than another deref or a load/store) and flag the
variable as used thanks to deref_used_for_not_store.  All that's left to
do is to properly skip casts when cleaning up.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
78d80f7db2 nir/deref: Skip over casts in fixup_deref_modes
This pass is used when, for instance, we lazily change the mode of
variables rather than replacing the variable with a new one.  Since we
only do this in cases where we know we have full deref chains, it's ok
to just skip them in fixup_deref_modes.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
d8e3edb784 nir/deref: Support casts and ptr_as_array in comparisons
The code which constructs deref paths already gives you the path
starting at the nearest deref_cast or deref_var.  All we need to do for
casts is handle the case where the start of the path isn't a deref_var.
For ptr_as_array derefs, we just bail if we have any after the
divergence point between the two derefs.  We may be able to do better in
the future but this works for now.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
a1c688517d nir/opt_deref: Properly optimize ptr_as_array derefs
When handling casts, we can't blindly propagate the parent of a cast
into a ptr_as_array deref because doing so might loose the stride
information from the cast.  Instead, before we can propagate into
ptr_as_array derefs, we need to check that the cast is a cast of an
array deref and that the stride matches.  For other types of derefs, we
can continue to propagate casts as normal because they don't need the
stride.  We also add an optimization which can combine a ptr_as_array
deref with it parent if it is also an array deref of some form.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
427558a717 nir/validate: Don't allow derefs in if conditions
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
e94a027af8 nir: Add a ptr_as_array deref type
These correspond directly to SPIR-V's OpPtrAccessChain.  As such, they
treat whatever their parent gives them as if it's the first element in
some array and dereferences that array.  If the parent is, itself, an
array deref, then the two indices can just be added together to get the
final array deref.  However, it can also be used in cases where what you
have is a dereference to some random vec2 value somewhere.  In this
case, we require a cast before the ptr_as_array and use the ptr_stride
field in the cast to provide a stride for the ptr_as_array derefs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
fc9c4f89b8 nir: Move propagation of cast derefs to a new nir_opt_deref pass
We're going to want to do more deref optimizations going forward and
this gives us a central place to do them.  Also, cast propagation will
get a bit more complicated with the addition of ptr_as_array derefs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
bf1a1eed88 spirv: Propagate layout decorations to created glsl_types
Instead of just storing the decorations in the vtn_type, propagate them
all the way through to the glsl_type.  For array strides, this means we
need to handle them earlier so we break array stride handling into it's
own function and explicitly call it for both pointer and array types.

Due to type deduplication in the SPIR-V, we may have explicit layout
decorations on all sorts of types that don't actually want them.  In
order to prevent these leaking into unfortunate places in NIR, we
explicitly strip them off before creating NIR variables and when casting
pointers to non-external memory.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
6cebeb4f71 glsl_type: Add support for explicitly laid out matrices and arrays
SPIR-V allows for matrix and array types to be decorated with explicit
byte stride decorations and matrix types to be decorated row- or
column-major.  This commit adds support to glsl_type to encode this
information.  Because this doesn't work nicely with std430 and std140
alignments, we add asserts to ensure that we don't use any of the std430
or std140 layout functions with explicitly laid out types.

In SPIR-V, the layout information for matrices is applied to the parent
struct member instead of to the matrix type itself.  However, this is
gets rather clumsy when you're walking derefs trying to compute offsets
because, the moment you hit a matrix, you have to crawl back the deref
chain and find the struct.  Instead, we take the same path here as we've
taken in spirv_to_nir and put the decorations on the matrix type itself.

This also subtly adds support for strided vector types.  These don't
come up in SPIR-V directly but you can get one as the result of taking a
column from a row-major matrix or a row from a column-major matrix.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
7f70b3e555 glsl_type: Simplify glsl_channel_type
This is C++ so we can just poke at the fields of glsl_type if we wish
and calling get_instance is way easier and more reliable than handling
each instance separately.  While we're at it, we re-arrange the base
type labels to match the enum order and add 8-bit type support.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
d8a11bfc08 glsl_type: Add a C wrapper to get struct field offsets
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
d34f19feba glsl_type: Drop the glsl_get_array_instance C helper
It was added in bce6f99875 even though it's completely redundant with
glsl_array_type().

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
a700a82bda nir: Distinguish between normal uniforms and UBOs
Previously, NIR had a single nir_var_uniform mode used for atomic
counters, UBOs, samplers, images, and normal uniforms.  This commit
splits this into nir_var_uniform and nir_var_ubo where nir_var_uniform
is still a bit of a catch-all but the nir_var_ubo is specific to UBOs.
While we're at it, we also rename shader_storage to ssbo to follow the
convention.

We need this so that we can distinguish between normal uniforms and UBO
access at the deref level without going all the way back variable and
seeing if it has an interface type.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
c9a4135e14 nir: Allow storing to shader_storage
I have no idea how shader_storage made it into the list of banned
variable modes for stores but it clearly should be allowed.  This only
doesn't cause us a problem today because we never actually use derefs on
shader_storage variables.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
cd93b0a670 nir/validate: Require array indices to match the deref bit size
This doesn't currently change anything because array indices are
required to be 32 bits and all derefs are also 32 bits.  However, we
will one day have 64-bit derefs for OpenCL.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
abfe674c54 spirv: Handle arbitrary bit sizes for deref array indices
We already had code in link_as_ssa to handle bit sizes; we just need to
use it.  While we're at it we clean up link_as_ssa a bit and add an
explicit bit_size parameter in preparation for a day when we have derefs
that aren't 32 bit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
bfe31c5e46 nir/builder: Add nir_i2i and nir_u2u helpers which take a bit size
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com
2019-01-08 00:38:29 +00:00
Jason Ekstrand
639c236e74 spirv: Emit NIR deref instructions on-the-fly
This simplifies our deref handling by emitting the actual NIR deref
instructions on-the-fly instead of of building up a deref chain and then
emitting them at the last moment.  In order for this to work with the
parts of the compiler that assume they can chase deref chains, we have
to run nir_rematerialize_derefs_in_use_blocks_impl to put the derefs
back in the right places.  Otherwise, in cases such as loop continues
where the SPIR-V blocks are not in the same order as the NIR blocks, we
may end up with a deref chain with a parent that does not dominate it's
child and nir_repair_ssa_impl will insert phis in the deref chain.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
c59f07684c spirv: Sign-extend array indices
The SPIR-V spec was recently updated to clarify that array indices are
treated as signed integers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
f8992eb5ba anv/apply_pipeline_layout: Set the cursor in lower_res_reindex_intrinsic
The loop through instructions doesn't set the cursor for us so unless we
set it somewhere, we may end up emitting instructions in the wrong
place.  The only reason why we haven't been bitten by this in the past
is that it only happens in a few variable pointers cases and the CTS
tests for those don't use much control flow so things were getting
emitted in the correct order by accident.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
42b2f3e91f spirv: Handle any bit size in vector_insert/extract
This crops up both in the actual SPIR-V VectorInsert/Extract opcodes as
well as various places where we deal with vector derefs.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Jason Ekstrand
a392ddb781 glsl_type: Support serializing 8 and 16-bit types
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:29 +00:00
Bas Nieuwenhuizen
70ed049cc6 spirv: Fix matrix parameters in function calls.
They can be handled exactly the same as arrays, we just need to handle
the base type correctly in the switches.

Fixes: a45b6fb452 "spirv: Pass SSA values through functions"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109204
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-08 01:30:03 +01:00
Bas Nieuwenhuizen
3cc940277a radv: Fix rasterization precision bits.
Note that these limits are exact, not a "precision is at least x",
as texel coords also get snapped to a multiple of this step size
before filtering.

This fixes CTS tests

dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat
dEQP-VK.texture.explicit_lod.2d.sizes.57x35_nearest_linear_mipmap_nearest_repeat

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109151
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:27:30 +01:00
Kenneth Graunke
f003859f97 nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref
These days, we have two sampler lowering passes.  The newer one,
gl_nir_lower_samplers_as_deref, is used by radeonsi.  It rewrites
variables to drop structures out of sampler deref chains, to make
life simpler.  It then sets var->data.binding for non-bindless
sampler and image variables based on the GL uniform storage's
opaque index values.

The older one converts sampler deref chains (nir_tex_src_texture_deref)
to a numerical offset (nir_tex_src_texture_offset).  It also stores the
constant-valued portion of that number in tex->texture_index, making
life really simple for drivers that don't support indirects.  It too
pokes at GL uniform storage's opaque index values.

Logically, we can do the first pass (simplify derefs, set bindings)
then the second (turn derefs to offsets, set texture_index).  This
patch does exactly that, eliminating some redundancy (only one pass
has to poke at GL uniform storage), and gaining proper var->data.binding
values for drivers using the full lowering.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-07 14:25:04 -08:00
Kenneth Graunke
c69f9297cf nir: Fix gl_nir_lower_samplers_as_deref's structure type handling.
We recurse to remove structures, and at each step, re-modify the
resulting type for our link in the deref chain.  For arrays, the
result of recursion is the new underlying type - so we wrap it with
the array dimensionality again.  For structs, we want to simply use
the new underlying type, skipping the struct altogether.

The correct way to do this is to do nothing at all.  Previously, we
had reset type to next->type, which is the /old/ field type, not the
new field type we obtained by recursing.  This undid our recursive work.

Fixes about 338 tests with nested structs, such as:

dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.nested_structs_arrays.sampler2D_samplerCube_fragment

Note that currently only radeonsi uses this pass, and NIR support is
disabled there by default, so the breakage was likely not seen by most
people.  The next commit uses this pass for more drivers, so this fix
prevents regressions from that change.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-07 14:25:04 -08:00
Bas Nieuwenhuizen
be6cee51c0 amd/common: Add some parentheses to silence warning.
[1/59] Compiling C object 'src/amd/common/src@amd@common@@amd_common@sta/ac_nir_to_llvm.c.o'.
../mesa/src/amd/common/ac_nir_to_llvm.c: In function ‘get_inst_tessfactor_writemask’:
../mesa/src/amd/common/ac_nir_to_llvm.c:4089:32: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses]
   writemask = ((1 << num_comps + 1) - 1) << first_component;
                      ~~~~~~~~~~^~~
../mesa/src/amd/common/ac_nir_to_llvm.c:4091:33: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses]
   writemask = (((1 << num_comps + 1) - 1) << first_component) << 4;

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:37 +01:00
Bas Nieuwenhuizen
64c83efaee radv: Remove unused variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:33 +01:00
Bas Nieuwenhuizen
656c1c488c radv: Remove device path.
unused and gcc complains about strncpy. (from what I can see because
strncpy does not leave a 0 byte on truncate. That said we don't use
it so this does not fix a real bug).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:14 +01:00
Marek Olšák
492ad9a402 ac: remove unused variable from ac_build_ddxy
trivial
2019-01-07 14:51:25 -05:00
Andres Gomez
0cc01f45e7 glsl: correct typo in GLSL compilation error message
v2: Add the "fix" tag (Erik).

Fixes: 037f68d81e ("glsl: apply align layout qualifier rules to block offsets")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-07 19:07:33 +02:00
Jason Ekstrand
027835b1da vulkan: Update the XML and headers to 1.1.97
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 10:00:01 -06:00
Andres Gomez
6decc6b1d9 docs: update 18.3 and add 19.x cycles for the release calendar
v2: replace incorrect "<td/>" with "<td>" (Eric).

Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-01-07 17:19:47 +02:00
Bas Nieuwenhuizen
110564fdec anv/android: Do not reject storage images.
We do the ImageFormatProperties check already, and rejecting an usage
flag when both ImageFormatProperties and the WSI (which is Android)
support it is not allowed.

Intel does support storage for some of the support WSI formats, such
as R8G8B8A8_UNORM, and looking at the ISL_SURF_USAGE_DISABLE_AUX_BIT,
the imported images do not have any form of compression that would
prevent this fix.

v2: Also consider STORAGE bit for Gralloc usage bits.
     (From Kevin Strasser <kevin.strasser@intel.com>)

Fixes: 053d4c328f "anv: Implement VK_ANDROID_native_buffer (v9)"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-07 15:20:55 +01:00
Bas Nieuwenhuizen
9a45a190ad radv: Implement buffer stores with less than 4 components.
We started using it in the btoi paths for r32g32b32, and the LLVM IR
checker will complain about it because we end up with intrinsics with
the wrong type extension in the name.

Fixes: 593996bc02 ("radv: implement buffer to image operations for R32G32B32")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 14:54:14 +01:00
Jon Turney
00ad77b9f6 appveyor: Add a Cygwin build script 2019-01-07 13:40:58 +00:00
Jon Turney
5334dafee2 appveyor: put build steps in a script, rather than inline in appveyor.yml 2019-01-07 13:40:57 +00:00
Lucas Stach
d015888efb etnaviv: annotate variables only used in debug build
Some of the status variables in the compiler are only used in asserts
and thus may be unused in release builds. Annotate them accordingly
to avoid 'unused but set' warnings from the compiler.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-01-07 11:51:02 +01:00
Lucas Stach
b56d903b5a etnaviv: enable full overwrite in a few more cases
Take into account the render target format when checking if the color
mask affects all channels of the RT. This allows to enable full
overwrite in a few cases where a non-alpha format is used.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-01-07 11:50:23 +01:00
Timothy Arceri
6dade5d534 nir: avoid uninitialized variable warning
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109231
2019-01-07 10:57:00 +11:00
Timothy Arceri
17fac39398 st/glsl: refactor st_link_nir()
The functional change here is moving the nir_lower_io_to_scalar_early()
calls inside st_nir_link_shaders() and moving the st_nir_opts() call
after the call to nir_lower_io_arrays_to_elements().

This fixes a bug with the following piglit test due to the current code
not cleaning up dead code after we lower arrays. This was causing an
assert in the new duplicate varyings link time opt introduced in
70be9afccb.

tests/spec/glsl-1.10/execution/vsfs-unused-array-member.shader_test

Moving the nir_lower_io_to_scalar_early() calls also allows us to tidy
up the code a little and merge some loops.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-07 10:54:20 +11:00
Eric Anholt
8847370424 v3d: Use the core tex lowering.
Even without any clever optimization on the unpack operations, this gives
us a useful value for the channels read field, which we can use to avoid
ldtmu instructions to the no-op register.

instructions in affected programs: 890712 -> 881974 (-0.98%)
2019-01-04 15:59:59 -08:00
Eric Anholt
f217a94542 nir: Add nir_lower_tex options to lower sampler return formats.
I've been doing this in the nir-to-vir and nir-to-qir backends of v3d and
vc4, but nir could potentially do some useful stuff for us (like avoiding
unpack/repacks) if we give it the information.

v2: Skip lowering for txs/query_levels
v3: Fix a crash on old-style shadow
v4: Rename to tex_packing, use nir_format_unpack_sint/uint helpers, pack
    the enum.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-04 15:59:57 -08:00
Eric Anholt
a74f2aeb4f nir: Allow nir_format_unpack_int/sint to unpack larger values.
For V3D, I want to unpack 4-16-bit packed integers for 8 and 16-bit
integer samplers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-04 15:59:30 -08:00
Jason Ekstrand
19c608fe43 intel/blorp: Be more conservative about copying clear colors
In 92eb5bbc68 we attempted to avoid copying clear colors whenever
we weren't doing a resolve.  However, this broke MSAA resolves because
we need the clear color in the source.  This patch makes blorp much more
conservative such that it only avoids the clear color copy if either
aux_usage == NONE or it's explicitly doing a fast-clear.

Fixes: 92eb5bbc68 "intel/blorp: Only copy clear color when doing..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107728
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-01-04 17:57:43 -06:00
Eric Anholt
81b9361b68 v3d: Stop scalarizing our uniform loads.
We can pull a whole vector in a single indirect load.  This saves a bunch
of round-trips to the TMU, instructions for setting up multiple loads,
references to the UBO base in the uniforms, and apparently manages to
reduce register pressure as well.

instructions in affected programs: 3086665 -> 2454967 (-20.47%)
uniforms in affected programs: 919581 -> 721039 (-21.59%)
threads in affected programs: 1710 -> 3420 (100.00%)
spills in affected programs: 596 -> 522 (-12.42%)
fills in affected programs: 680 -> 562 (-17.35%)

Improves 3dmmes performance by 2.29312% +/- 0.139825% (n=5)
2019-01-04 15:41:23 -08:00
Eric Anholt
f8a8de8b9a v3d: Do UBO loads a vector at a time.
In the process of adding support for SSBOs and CS shared vars, I ended up
needing a helper function for doing TMU general ops.  This helper can be
that starting point, and saves us a bunch of round-trips to the TMU by
loading a vector at a time.
2019-01-04 15:41:23 -08:00
Eric Anholt
b0e0086257 v3d: Remove dead switch cases and comments from v3d_nir_lower_io.
Moving things to NIR left this mess around.  All we lower now is uniforms.
2019-01-04 15:41:23 -08:00
Eric Anholt
f8e6b364b0 v3d: Fix up VS output setup during precompiles.
I noticed that a VS I was debugging was missing all of its output stores
-- outputs_written was for POS, VAR0, VAR3, while the shader's variables
were POS, VAR9, and VAR12.  I'm not sure what outputs_written is supposed
to be doing here, but we can just walk the declared variables and avoid
both this bug and the emission of extra stvpms for less-than-vec4
varyings.
2019-01-04 15:41:23 -08:00
Eric Anholt
e1385e879d v3d: Reinstate the new shader-db output after v3d_compile() refactor.
I misplaced it in the rebase conflicts.
2019-01-04 15:26:19 -08:00
Caio Marcelo de Oliveira Filho
bbf9ee9b18 nir: remove dead code from copy_prop_vars
When copy_prop_vars also took care of dead write handling, intrin was
used as part of store_to_entry.  Now it isn't, so this assignment
isn't used really used.  Add a comment clarifying what happens to
intrin.

Fixes: 4dfa7adc10 "nir: Remove handling of dead writes from copy_prop_vars"
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-04 15:18:41 -08:00
Lionel Landwerlin
31e4c9ce40 i965: add CS stall on VF invalidation workaround
Even with the previous commit, hangs are still happening. The problem
there is that the VF cache invalidate do happen immediately without
waiting for previous rendering to complete. What happens is that we
invalidate the cache the moment the PIPE_CONTROL is parsed but we
still have old rendering in the pipe which continues to pull data into
the cache with the old high address bits. The later rendering with the
new high address bits then doesn't have the clean cache that it
expects/needs.

v2: Update commit message/explanation with Jason's

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: a363bb2cd0 ("i965: Allocate VMA in userspace for full-PPGTT systems.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109072
2019-01-04 11:18:54 +00:00
Lionel Landwerlin
92b7407090 i965: include draw_params/derived_draw_params for VF cache workaround
These buffers are using VB slots and should be included in the
workaround decision.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: a363bb2cd0 ("i965: Allocate VMA in userspace for full-PPGTT systems.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109072
2019-01-04 11:18:54 +00:00
Lionel Landwerlin
da634a4acb intel/blorp: emit VF caching workaround before 3DSTATE_VERTEX_BUFFERS
Probably no difference but it's nice to have i965 & blorp emit things
in the same order.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-04 11:18:51 +00:00
Lionel Landwerlin
e5ed217545 i965: limit VF caching workaround to gen8/9/10
Documentation of the 3DSTATE_VERTEX_BUFFERS packet says this is only
needed before ICL.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-04 11:18:48 +00:00
Andres Gomez
f0312cfa93 glsl/linker: complete documentation for assign_attribute_or_color_locations
Commit 27f1298b9d ("glsl/linker: validate attribute aliasing before optimizations")
forgot to complete the documentation.

Cc: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-04 09:04:31 +02:00
Gurchetan Singh
6b7aea9d85 virgl: remove empty file
Fixes: 174f53 ("virgl: consolidate transfer code")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-01-03 20:59:29 +01:00
Gurchetan Singh
ca66457b05 virgl: don't flush an empty range
Otherwise, the gl-1.0-long-dlist Piglit test crashes.

Fixes: db7757 ("virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT")
Reported by airlied@

v2: Exit on any invalid range (Erik)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109190
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2019-01-03 20:59:29 +01:00
Eric Engestrom
393a756e6a docs: advertise distro-provided meson cross-files
Hopefully we can kick start the revolution and other distros will start
providing them as well :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-03 18:53:21 +00:00
Eric Engestrom
8b363bc42e docs: fix the meson aarch64 cross-file
`gcc-ar` is preferred over the generic `ar`, and the `arm` family is
for 32-bit ARM [1].

[1] https://mesonbuild.com/Reference-tables.html#cpu-families

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-01-03 18:53:21 +00:00
Jakob Bornecrantz
6a9be6fc0c virgl/vtest: Use default socket name from protocol header
No functional change as the socket name is the same,
just removing the double definition of the path.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2019-01-03 15:50:38 +00:00
Rob Clark
e869481ef3 freedreno: fix staging resource size for arrays
A 2d-array texture (for example), should get the # of array elements
from box->depth, rather than depth0 which is minified.

Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray_bias_float_fragment
with tiled textures.

Reported-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:11:40 -05:00
Rob Clark
67a7f6f244 freedreno: remove blit_via_copy_region()
If we hit the memcpy() path for copy_region(), that will try to do a
transfer_map(), which goes badly for blits to/from staging triggered
by transfer_map() or transfer_unmap().

We could possibly add fd_blit2() which has allow_transfer_map param,
and call that for staging blits.  But I'm not really sure if trying
the blit via copy_region() is very useful.  At least for newer gens
that implement fd_context::blit(), it probably isn't.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:10:32 -05:00
Rob Clark
2fc17e16a3 freedreno/a6xx: rework blitter API
Switch over to using fd_context::blit(), in the same way that a5xx does.
The previous patch wires fd_resource_copy_region() up to the blitter so
a6xx no longer needs to bypass the core layer to accelerate this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:10:23 -05:00
Rob Clark
53b8eb78d5 freedreno: try blitter for fd_resource_copy_region()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:10:16 -05:00
Rob Clark
228eddd7ee freedreno: rework blit API
First step to unify the way fd5 and fd6 blitter works.  Currently a6xx
bypasses the blit API in order to also accelerate resource_copy_region()

But this approach can lead to infinite recursion:

  #0  fd_alloc_staging (ctx=0x5555936480, rsc=0x7fac485f90, level=0, box=0x7fbab29220) at ../src/gallium/drivers/freedreno/freedreno_resource.c:291
  #1  0x0000007fbdebed04 in fd_resource_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/drivers/freedreno/freedreno_resource.c:479
  #2  0x0000007fbe5c5068 in u_transfer_helper_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/auxiliary/util/u_transfer_helper.c:243
  #3  0x0000007fbde2dcb8 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47c780, src_level=0, src_box_in=0x7fbab2945c) at ../src/gallium/auxiliary/util/u_surface.c:350
  #4  0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173
  #5  0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587
  #6  0x0000007fbde2f3d0 in util_try_blit_via_copy_region (ctx=0x5555936480, blit=0x7fbab29430) at ../src/gallium/auxiliary/util/u_surface.c:864
  #7  0x0000007fbdec02c4 in fd_blit (pctx=0x5555936480, blit_info=0x7fbab29588) at ../src/gallium/drivers/freedreno/freedreno_resource.c:993
  #8  0x0000007fbdf08408 in fd6_blit (pctx=0x5555936480, info=0x7fbab29588) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:546
  #9  0x0000007fbdebdc74 in do_blit (ctx=0x5555936480, blit=0x7fbab29588, fallback=false) at ../src/gallium/drivers/freedreno/freedreno_resource.c:129
  #10 0x0000007fbdebe58c in fd_blit_from_staging (ctx=0x5555936480, trans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:326
  #11 0x0000007fbdebea38 in fd_resource_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:416
  #12 0x0000007fbe5c5c68 in u_transfer_helper_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/auxiliary/util/u_transfer_helper.c:516
  #13 0x0000007fbde2de24 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47b8e0, src_level=0, src_box_in=0x7fbab2997c) at ../src/gallium/auxiliary/util/u_surface.c:376
  #14 0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173
  #15 0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587
  ...

Instead rework the API to push the fallback back to core code, so that
we can rework resource_copy_region() to have it's own fallback path,
and then finally convert fd6 over to work in the same way.

This also makes ctx->blit() optional, and cleans up some unnecessary
callers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:09:52 -05:00
Rob Clark
f1c88336e6 freedreno: skip depth resolve if not written
For multi-pass rendering, it is common to keep the same depth buffer
from previous pass, to discard geometry that would be hidden by later
draws.  In the later passes with depth-test enabled, but depth-write
disabled, there is no reason to do gmem2mem resolve.

TODO probably do something similar for stencil.. although stencil
buffer isn't used as commonly these days

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-03 08:09:24 -05:00
Timothy Arceri
4d3f6cb973 nir: merge some basic consecutive ifs
After trying multiple times to merge if-statements with phis
between them I've come to the conclusion that it cannot be done
without regressions. The problem is for some shaders we end up
with a whole bunch of phis for the merged ifs resulting in
increased register pressure.

So this patch just merges ifs that have no phis between them.
This seems to be consistent with what LLVM does so for radeonsi
we only see a change (although its a large change) in a single
shader.

Shader-db results i965 (SKL):

total instructions in shared programs: 13098176 -> 13098152 (<.01%)
instructions in affected programs: 1326 -> 1302 (-1.81%)
helped: 4
HURT: 0

total cycles in shared programs: 332032989 -> 332037583 (<.01%)
cycles in affected programs: 60665 -> 65259 (7.57%)
helped: 0
HURT: 4

The cycles estimates reported by shader-db for i965 seem inaccurate
as the only difference in the final code is the removal of the
redundent condition evaluations and jumps.

Also the biggest code reduction (~7%) for radeonsi was in a tomb
raider tressfx shader but for some reason this does not get merged
for i965.

Shader-db results radeonsi (VEGA):

Totals from affected shaders:
SGPRS: 232 -> 232 (0.00 %)
VGPRS: 164 -> 164 (0.00 %)
Spilled SGPRs: 59 -> 59 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 14584 -> 13520 (-7.30 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13 -> 13 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-01-03 15:17:16 +11:00
Timothy Arceri
19cafe8084 nir: add rewrite_phi_predecessor_blocks() helper
This will also be used by the if merge pass in the following commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-01-03 15:17:16 +11:00
Timothy Arceri
5122fbc4ba nir: simplify does_varying_match()
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-03 11:47:56 +11:00
Timothy Arceri
8d05ee2005 nir: make use of does_varying_match() helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-03 11:47:56 +11:00
Timothy Arceri
0016166d19 nir: make nir_opt_remove_phis_impl() static
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-03 11:47:56 +11:00
Eric Anholt
d2b899c0ec v3d: Refactor compiler entrypoints.
Before, I had per-stage entryoints with some helpers shared between them.
As I extended for compute shaders and shader-db, it turned out that the
other common code in the middle wanted to be shared too.
2019-01-02 14:12:29 -08:00
Eric Anholt
0805060573 v3d: Handle dynamically uniform IF statements with uniform control flow.
Loops will be trickier, since we need some analysis to figure out if the
breaks/continues inside are uniform.  Until we get that in NIR, this gets
us some quick wins.

total instructions in shared programs: 6192844 -> 6174162 (-0.30%)
instructions in affected programs: 487781 -> 469099 (-3.83%)
2019-01-02 14:12:29 -08:00
Eric Anholt
5e9ee6e841 v3d: Fold comparisons for IF conditions into the flags for the IF.
total instructions in shared programs: 6193810 -> 6192844 (-0.02%)
instructions in affected programs: 800373 -> 799407 (-0.12%)
2019-01-02 14:12:29 -08:00
Eric Anholt
078dc176bc v3d: Don't try to fold non-SSA-src comparisons into bcsels.
There could have been a write of a src in between the comparison and the
bcsel that would invalidate the comparison.
2019-01-02 14:12:29 -08:00
Eric Anholt
2e0433b687 v3d: Move the "Find the ALU instruction generating our bool" out of bcsel.
This will be reused for if statements.
2019-01-02 14:12:29 -08:00
Eric Anholt
c3ae0aa264 v3d: Simplify the emission of comparisons for the bcsel optimization.
I wanted to reuse the comparison stuff for nir_ifs, but for that I just
want the flags and no destination value.  Splitting the conditions from
the destinations ended up cleaning the existing code up, anyway.
2019-01-02 14:12:29 -08:00
Eric Anholt
49d8e2aff1 v3d: Don't forget to include RT writes in precompiles.
Looking at some assembly dumps for an optimization, we were clearly
missing important parts of the shader!
2019-01-02 14:12:29 -08:00
Eric Anholt
3a81c753a3 v3d: Fix segfault when failing to compile a program.
We'll still fail at draw time, but this avoids a regression in shader-db
execution once I enable TLB writes in precompiles.

Fixes: b38e4d313f ("v3d: Create a state uploader for packing our shaders together.")
2019-01-02 14:12:29 -08:00
Marek Olšák
3ae57957be radeonsi: always unmap texture CPU mappings on 32-bit CPU architectures
Team Fortress 2 32-bit version runs out of the CPU address space.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:59 -05:00
Marek Olšák
edfca1f8dc radeonsi: remove unused variables in si_insert_input_ptr
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:58 -05:00
Marek Olšák
cba475b3e7 radeonsi: use u_decomposed_prims_for_vertices instead of u_prims_for_vertices
It seems to be the same, but this doesn't use integer division with
a variable divisor.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:56 -05:00
Marek Olšák
54bc87469a radeonsi: make si_cp_wait_mem more configurable
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:54 -05:00
Marek Olšák
9d2c3a1fe0 radeonsi: call si_fix_resource_usage for the GS copy shader as well
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:53 -05:00
Marek Olšák
d28e208213 radeonsi: don't emit redundant PKT3_NUM_INSTANCES packets
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:50 -05:00
Caio Marcelo de Oliveira Filho
7d6babf995 nir: add a way to print the deref chain
Makes debugging easier when we care about the deref chain and not the
deref instruction itself.  To make it take a const pointer, constify
some of the static functions in nir_print.c.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 10:09:04 -08:00
Dylan Baker
a2596450ac meson: Error out if building nouveau and using LLVM without rtti
Nouveau requires rtti. Often LLVM is configured without rtti, and code
with and without cannot be linked safely. Lets just error out if nouveau
is requested and llvm is built without rtti.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109202
Fixes: c5a97d658e
       ("meson: fix builds against LLVM built without rtti")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-02 09:30:12 -08:00
Alexander von Gluck IV
1b97a72328 egl/haiku: Fix reference to disp vs dpy
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 00992700c9 "egl: set the EGLDevice when creating a display"
2019-01-02 13:45:09 +00:00
Iago Toral Quiroga
ec79069856 compiler/spirv: use 32-bit polynomial approximation for 16-bit asin()
The 16-bit polynomial execution doesn't meet Khronos precision requirements.
Also, the half-float denorm range starts at 2^(-14) and with asin taking input
values in the range [0, 1], polynomial approximations can lead to flushing
relatively easy.

An alternative is to use the atan2 formula to compute asin, which is the
reference taken by Khronos to determine precision requirements, but that
ends up generating too many additional instructions when compared to the
polynomial approximation. Specifically, for the Intel case, doing this
adds +41 instructions to the program for each asin/acos call, which looks
like an undesirable trade off.

So for now we take the easy way out and fallback to using the 32-bit
polynomial approximation, which is better (faster) than the 16-bit atan2
implementation and gives us better precision that matches Khronos
requirements.

v2:
 - Fallback to 32-bit using recursion (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:39 +01:00
Iago Toral Quiroga
fda3f6d424 compiler/spirv: implement 16-bit frexp
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:35 +01:00
Iago Toral Quiroga
7d3c34197a compiler/spirv: implement 16-bit hyperbolic trigonometric functions
v2:
 - use nir_fadd_imm and nir_fmul_imm helpers (Jason)

v3:
 - since we need to define one for fsub use it for fdiv too (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
88663ba67c compiler/spirv: implement 16-bit exp and log
v2
 - use nir_fmul_imm helper (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
f18554e2ce compiler/spirv: implement 16-bit atan2
v2:
 - fix huge_val for 16-bit, it was mean't to be 2^14 not 10^14.

v3:
 - rebase on top of new bool sized opcodes
 - use nir_b2f helper
 - use nir_fmul_imm helper

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
1c8de08ec9 compiler/spirv: implement 16-bit atan
v2:
 - use nir_fadd_imm and nir_fmul_imm helpers (Jason)
 - rebased on top of new sized boolean opcodes
 - use nir_b2f helper

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
df118535ca compiler/spirv: implement 16-bit acos
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
dbbbe24d76 compiler/spirv: implement 16-bit asin
v2:
  - use nir_fmul_imm and nir_fadd_imm helpers (Jason)

v3:
 - missed one case where we need to replace nir_imm_float
   with nir_imm_floatN_t (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
95b7c29c2c compiler/spirv: handle 16-bit float in radians() and degrees()
v2:
 - use nir_imm_fmul helper (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
aeee683780 compiler/nir: add nir_fadd_imm() and nir_fmul_imm() helpers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Iago Toral Quiroga
5fc9ad1cb0 compiler/nir: add a nir_b2f() helper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-02 07:54:05 +01:00
Timothy Arceri
70be9afccb nir: link time opt duplicate varyings
If we are outputting the same value to more than one output
component rewrite the inputs to read from a single component.

This will allow the duplicate varying components to be optimised
away by the existing opts.

shader-db results i965 (SKL):

total instructions in shared programs: 12869230 -> 12860886 (-0.06%)
instructions in affected programs: 322601 -> 314257 (-2.59%)
helped: 3080
HURT: 8

total cycles in shared programs: 317792574 -> 317730593 (-0.02%)
cycles in affected programs: 2584925 -> 2522944 (-2.40%)
helped: 2975
HURT: 477

shader-db results radeonsi (VEGA):

SGPRS: 31576 -> 31664 (0.28 %)
VGPRS: 17484 -> 17064 (-2.40 %)
Spilled SGPRs: 184 -> 167 (-9.24 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 583340 -> 569368 (-2.40 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 6162 -> 6270 (1.75 %)
Wait states: 0 -> 0 (0.00 %)

vkpipeline-db results RADV (VEGA):

Totals from affected shaders:
SGPRS: 14880 -> 15080 (1.34 %)
VGPRS: 10872 -> 10888 (0.15 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 674016 -> 668396 (-0.83 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 2708 -> 2704 (-0.15 %)
Wait states: 0 -> 0 (0.00 %

V2: bunch of tidy ups suggested by Jason

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
d828694b80 nir: rework nir_link_opt_varyings()
This just cleans things up a little and make things more safe for
derefs.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
c0aba8b0dc nir: add can_replace_varying() helper
This will be reused by the following patch.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
50de3f80a8 nir: rename nir_link_constant_varyings() nir_link_opt_varyings()
The following patches will add support for an additional
optimisation so this function will no longer just optimise varying
constants.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
0a4378ce56 st/glsl_to_nir: call nir_lower_load_const_to_scalar() in the st
This will help the new opt introduced in the following patches
allowing us to remove extra duplicate varyings.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Timothy Arceri
2ef0f944f5 radeonsi: make use of ac_are_tessfactors_def_in_all_invocs()
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 10:01:31 +11:00
Timothy Arceri
2832bc972b ac/nir_to_llvm: add ac_are_tessfactors_def_in_all_invocs()
The following patch will use this with the radeonsi NIR backend
but I've added it to ac so we can use it with RADV in future.

This is a NIR implementation of the tgsi function
tgsi_scan_tess_ctrl().

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 10:01:24 +11:00
Timothy Arceri
2817a4ec0b radeonsi: remove unrequired param in si_nir_scan_tess_ctrl()
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 10:01:15 +11:00
Timothy Arceri
4dda445750 tgsi/scan: correctly walk instructions in tgsi_scan_tess_ctrl()
The previous code used a do while loop and continues after walking
a nested loop/if-statement. This means we end up evaluating the
last instruction from the nested block against the while condition
and potentially exit early if it matches the exit condition of the
outer block.

Fixes: 386d165d8d ("tgsi/scan: add a new pass that analyzes tess factor writes")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 09:53:01 +11:00
Timothy Arceri
dd061eb044 tgsi/scan: fix loop exit point in tgsi_scan_tess_ctrl()
This just happened not to crash/assert because all loops have at
least 1 if-statement and due to a second bug we end up matching
the same ENDIF to exit both the iteration over the if-statment
and the loop.

The second bug is fixed in the following patch.

Fixes: 386d165d8d ("tgsi/scan: add a new pass that analyzes tess factor writes")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 09:53:01 +11:00
Ilia Mirkin
8f98ff362c nv30: disable rendering to 3D textures
There's no way to tell the 3D engine about swizzling on such textures.
While rendering to NPOT ones may be possible, there's no great way to
expose that in gallium, nor would there be any practical benefit.

Fixes the non-compressed-format "copyteximage 3D" failures. Something
odd going on with the compressed formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-01 15:11:14 -05:00
Bas Nieuwenhuizen
8c93ef5de9 radv: Do a cache flush if needed before reading predicates.
This caused random failures for two conditional rendering tests:

dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_discard
dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_no_discard

These wrote the predicate with the vertex shader, did a barrier and then
started the conditional rendering. However the cache flushes for the barrier
only happen on first draw, so after the predicate has been read.

Fixes: e45ba51ea4 "radv: add support for VK_EXT_conditional_rendering"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-12-31 20:52:08 +01:00
Erik Faye-Lund
86089a7316 anv/autotools: make sure tests link with -msse2
Without this, I get the following error when building the tests with
autotools on i686:

---8<---
src/intel/common/gen_clflush.h: In function ‘gen_clflush_range’:
src/intel/common/gen_clflush.h:37:7: warning: implicit declaration of function ‘__builtin_ia32_clflush’; did you mean ‘__builtin_ia32_pause’? [-Wimplicit-function-declaration]
       __builtin_ia32_clflush(p);
       ^~~~~~~~~~~~~~~~~~~~~~
       __builtin_ia32_pause
src/intel/common/gen_clflush.h: In function ‘gen_flush_range’:
src/intel/common/gen_clflush.h:45:4: warning: implicit declaration of function ‘__builtin_ia32_mfence’; did you mean ‘__builtin_ia32_fnclex’? [-Wimplicit-function-declaration]
    __builtin_ia32_mfence();
    ^~~~~~~~~~~~~~~~~~~~~
    __builtin_ia32_fnclex
---8<---

The erros are generated for each of these files:
- mesa/src/intel/vulkan/tests/state_pool_no_free.c
- mesa/src/intel/vulkan/tests/state_pool.c
- mesa/src/intel/vulkan/tests/block_pool_no_free.c
- mesa/src/intel/vulkan/tests/state_pool_free_list_only.c

This is obviously because gen_clflush.h contains code that uses
intrinsics that are only available with SSE3. Since the driver already
uses SSE3, it seems reasonable to add this to the tests as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Eric Engeström <eric@engestrom.ch>
2018-12-31 17:28:21 +01:00
Erik Faye-Lund
89679e18a9 anv/meson: make sure tests link with -msse2
Without this, I get the following error when building the tests using
meson on i686:

---8<---
In file included from ../../../mesa/src/intel/vulkan/anv_private.h:46,
                 from ../../../mesa/src/intel/vulkan/tests/state_pool_no_free.c:26:
../../../mesa/src/intel/common/gen_clflush.h: In function ‘gen_clflush_range’:
../../../mesa/src/intel/common/gen_clflush.h:37:7: error: implicit declaration of function ‘__builtin_ia32_clflush’; did you mean ‘__builtin_ia32_pause’? [-Werror=implicit-function-declaration]
       __builtin_ia32_clflush(p);
       ^~~~~~~~~~~~~~~~~~~~~~
       __builtin_ia32_pause
../../../mesa/src/intel/common/gen_clflush.h: In function ‘gen_flush_range’:
../../../mesa/src/intel/common/gen_clflush.h:45:4: error: implicit declaration of function ‘__builtin_ia32_mfence’; did you mean ‘__builtin_ia32_fnclex’? [-Werror=implicit-function-declaration]
    __builtin_ia32_mfence();
    ^~~~~~~~~~~~~~~~~~~~~
    __builtin_ia32_fnclex
---8<---

The errors are generated for each of these files:
- mesa/src/intel/vulkan/tests/state_pool_no_free.c
- mesa/src/intel/vulkan/tests/state_pool.c
- mesa/src/intel/vulkan/tests/block_pool_no_free.c
- mesa/src/intel/vulkan/tests/state_pool_free_list_only.c

This is obviously because gen_clflush.h contains code that uses
intrinsics that are only available with SSE3. Since the driver already
uses SSE3, it seems reasonable to add this to the tests as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-31 17:27:33 +01:00
Ilia Mirkin
207fb558e4 nv30: fix some s3tc layout issues
s3tc layouts are a bit finicky - they're packed, but not swizzled.
Adjust logic to allow for that case:

  - Don't set a uniform pitch for POT-sized compressed textures
  - Adjust define_rect API to be less confused about block sizes
  - Only mark a texture as linear if it has a uniform pitch set

This has been tested to fix xonotic (as well as the s3tc-* piglits)
on nv3x and keeps it working on nv4x.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 23:32:21 -05:00
Ilia Mirkin
ad251330e8 nv30: use correct helper to get blocks in y direction
This doesn't matter since all compressed formats supported by this
hardware use square blocks, but best to use the correct helper.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 23:32:21 -05:00
Ilia Mirkin
b04c1907c8 nv30: add support for multi-layer transfers
This logic mirrors what we do on nv50. The relatively new
texture_subdata callback can cause this to happen with 3D textures,
which is triggered at least by xonotic, and probably many piglits.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 23:32:21 -05:00
Ilia Mirkin
b34cfd4749 nv30: fix rare issue with fp unbinding not finding the bufctx
If the last-active context gets deleted, the pushbuf doesn't have a
bufctx to reference. Then there could be a sequence of binds which would
trigger a reset on that bin before validation was done. Instead we just
pass in the bufctx in question directly.

All other instances of PUSH_RESET happen strictly after a validation is
run.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 19:44:43 -05:00
Ilia Mirkin
ef3eac9545 nv30: avoid setting user_priv without setting cur_ctx
The whole user_priv thing is a mess, but as long as it's there, it
basically has to map 1:1 to the cur_ctx. Unfortunately we were setting
user_priv to some context, then that context could get deleted without
any draws/validations in it, leading user_priv to become NULL, with
cur_ctx still pointing at some old context. Then we wouldn't run the
switch logic, which in turn led to a NULL bufctx being dereferenced.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30 19:44:43 -05:00
Eric Anholt
ad1e59cf8d v3d: Add support for gl_HelperInvocation.
We can just look at the MSF flags -- if they're unset, then we're
definitely in a helper invocation.  Fixes
dEQP-GLES31.functional.shaders.helper_invocation.* with GLES3.1 enabled.
2018-12-30 08:05:11 -08:00
Eric Anholt
20021e3473 v3d: Add support for textureSize() on MSAA textures.
Fixes failures in
dEQP-GLES31.functional.shaders.builtin_functions.texture_size.samples_1_texture_2d
in the GLES3.1 suite.
2018-12-30 08:05:11 -08:00
Eric Anholt
f695d62fe5 v3d: Add support for requesting the sample offsets. 2018-12-30 08:05:11 -08:00
Eric Anholt
906fca1b4b v3d: Add support for non-constant texture offsets.
Fixes
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.clamp_to_edge_repeat
and others.
2018-12-30 08:05:11 -08:00
Eric Anholt
47caefc7b4 v3d: Force sampling from base level for tg4.
This is what the GLSL ES 310 spec tells us to do, but apparently the
"gather mode" flag doesn't imply it in the HW.  Fixes
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear
2018-12-30 08:05:11 -08:00
Eric Anholt
f9bdce9966 v3d: Add a note for a potential performance win on multop/umul24.
Noticed while debugging a testcase.
2018-12-30 08:05:11 -08:00
Eric Anholt
b36757448d v3d: Dead-code eliminate unused flags updates.
The greedy comparison folding in bcsel means that we may have left the
original bool-generating NIR ALU instruction dead, but DCE wasn't
eliminating the VIR code for it because of the flags updates.

total instructions in shared programs: 5186024 -> 5100894 (-1.64%)
instructions in affected programs: 1448695 -> 1363565 (-5.88%)
2018-12-30 08:05:11 -08:00
Eric Anholt
20e3526298 v3d: Don't generate temps for comparisons.
This was just generated work for vir_opt_dead_code and cluttered up the
dumps.
2018-12-30 08:04:54 -08:00
Eric Anholt
ebde5afb93 v3d: Move "does this instruction have flags" from sched to generic helpers.
I wanted to reuse it for DCE of flags updates.
2018-12-30 08:03:51 -08:00
Eric Anholt
39b1112189 v3d: Drop incorrect dependency for flpop.
It is just shifting probably-means-flags bits out of a value, it doesn't
actually update the flags on its own.
2018-12-30 08:03:51 -08:00
Eric Anholt
a7c9fd7573 v3d: Drop unused count_nir_instrs() helper.
This was for shader-db, but I haven't cared about NIR instruction counts
in a long time.
2018-12-30 08:03:51 -08:00
Eric Anholt
696f63f1b4 v3d: Hook up some shader-db output to GL_ARB_debug_output.
This allows the original shader-db project's run.c runner to parse things
easily, and is probably a good thing to have for GL_ARB_debug_output in
general.  I formatted it more like Intel's so I can mostly reuse their
report script.
2018-12-30 08:03:51 -08:00
Eric Anholt
87b251a940 v3d: Add a "precompile" debug flag for shader-db.
I've been using my apitrace-based shader-db so far, but it's slow
(apitrace decompression), intrusive (apitrace windows spamming the
screen), and doesn't have much coverage.  The original shader-db provides
a lot more coverage and compiles faster, at the expense of not having the
actual runtime variant key.  As v3d has a lot less runtime variation than
vc4 did, this tradeoff makes more sense.
2018-12-29 13:52:09 -08:00
Eric Anholt
9ec6a3d621 v3d: Fix uniform pretty printing assertion failure with branches.
Fixes: 248a7fb392 ("v3d: Do uniform pretty-printing in the QPU dump.")
2018-12-29 13:52:09 -08:00
Dylan Baker
133a5b8383 meson: Override C++ standard to gnu++11 when building with altivec on ppc64
Otherwise there will be symbol collisions for the vector name.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108943
Distro Bug: https://bugs.gentoo.org/673622
Fixes: 42ea0631f1
       ("meson: build clover")
Acked-by: Matt Turner <mattst88@gmail.com>
2018-12-28 11:04:57 -08:00
Lionel Landwerlin
f7bccf6ab4 intel/aub_viewer: highlight true booleans
Useful to spot PIPE_CONTROL flags.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:46 +00:00
Lionel Landwerlin
6ba61ea391 intel/aub_viewer: fold binding/sampler table items
Makes things easier to read rather than a long block of text.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:43 +00:00
Lionel Landwerlin
7ab8c80625 intel/aub_viewer: fix shader view
Not decoding the shader at the right offset.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:40 +00:00
Lionel Landwerlin
f3ed4a058d intel/aub_viewer: print address of missing shader
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:21 +00:00
Lionel Landwerlin
0382e11989 intel/aub_viewer: fixup 0x address prefix
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:18 +00:00
Lionel Landwerlin
8e2fda411a intel/aub_viewer: fix shader get_bo
Instruction addresses are always in ppgtt space.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-12-28 16:48:08 +00:00
Nicholas Kazlauskas
e260493f2a radeonsi: Enable adaptive_sync by default for radeon
It's better to let most applications make use of adaptive sync
by default. Problematic applications can be placed on the blacklist
or the user can manually disable the feature.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2018-12-28 17:08:14 +01:00
Nicholas Kazlauskas
2e12fe425f loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property
The DDX driver can be notified of adaptive sync suitability by
flagging the application's window with the _VARIABLE_REFRESH property.

This property is set on the first swap the application performs
when adaptive_sync is set to true in the drirc.

It's performed here instead of when the loader is initialized for
two reasons:

(1) The window's drawable can be missing during loader init.
    This can be observed during the Unigine Superposition benchmark.

(2) Adaptive sync will only be enabled closer to when the application
    actually begins rendering.

If adaptive_sync is false then the _VARIABLE_REFRESH property
is deleted on loader init.

The property is only managed on the glx DRI3 backend for now. This
should cover most common applications and games on modern hardware.

Vulkan support can be implemented in a similar manner but would likely
require splitting the function out into a common helper function.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2018-12-28 16:44:47 +01:00
Nicholas Kazlauskas
a9c36dbf9c drirc: Initial blacklist for adaptive sync
Applications that don't present at a predictable rate (ie. not games)
shouldn't have adapative sync enabled. This list covers some of the
common desktop compositors, web browsers and video players.

[ Michel Dänzer: Added entry for firefox-esr ]

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2018-12-28 16:44:27 +01:00
Nicholas Kazlauskas
7407670036 util: Add adaptive_sync driconf option
This option lets the user decide whether mesa should notify the
window manager / DDX driver that the current application is adaptive
sync capable.

It's off by default.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
2018-12-28 16:38:06 +01:00
Nicholas Kazlauskas
759b940389 util: Get program name based on path when possible
Some programs start with the path and command line arguments in
argv[0] (program_invocation_name). Chromium is an example of
an application using mesa that does this.

This tries to query the real path for the symbolic link /proc/self/exe
to find the program name instead. It only uses the realpath if it
was a prefix of the invocation to avoid breaking wine programs.

Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-28 15:41:01 +01:00
Tomeu Vizoso
bf1dfcc3e8 etnaviv: Consolidate buffer references from framebuffers
We were leaking surfaces because the references taken in
etna_set_framebuffer_state weren't being released on context destroy.

Instead of just directly releasing those references in
etna_context_destroy, use the util_copy_framebuffer_state helper.

Take the chance to remove the duplicated buffer references in
compiled_framebuffer_state to avoid confusion.

The leak can be reproduced with a client that continuously creates and
destroys contexts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reported-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-12-28 10:22:01 +01:00
Dave Airlie
d1ce7eba8b virgl/vtest: fix front buffer flush with protocol version 0.
Older versions of virglrenderer before 33da7361aec486290df0aec4ad8dfa8ff6adde2c
in vtest mode, misrender gears.

Fixes: 9d81cd8e7c (virgl: Pass resource size and transfer offsets)
Reviewed-By: Gert  Wollny <gert.wollny@collabora.com>
2018-12-28 16:50:38 +10:00
Dylan Baker
6adbd9ac74 docs/autoconf: Mark autoconf as being replaced
I know it's not what anyone wants, but how about we start with a
message in the documentation that encourages people to try meson.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-27 09:03:20 -08:00
Dylan Baker
4c32964f49 docs/install: Update python dependency section
Note that meson requires python 3, scons requires python 2, and
autotools works with either.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-27 09:03:20 -08:00
Dylan Baker
a57dbe6971 docs/meson: Update LLVM section with information about native files
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-27 09:03:17 -08:00
Dylan Baker
40ec5fec0a docs/install: Add meson to the main install page
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
2018-12-27 09:03:07 -08:00
Juan A. Suarez Romero
fe7919acad docs: update calendar, add news item and link release notes for 18.2.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-12-27 17:37:33 +01:00
Juan A. Suarez Romero
0d53451890 docs: add sha256 checksums for 18.2.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 24c31bc0e2)
2018-12-27 17:35:04 +01:00
Juan A. Suarez Romero
008478e340 docs: add release notes for 18.2.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 785e09e3b3)
2018-12-27 17:35:02 +01:00
Ilia Mirkin
2269ab8588 nv50,nvc0: add missing CAPs for unsupported features
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26 20:28:07 -05:00
Ilia Mirkin
1d10bb2025 nvc0: enable GL_NV_shader_atomic_float on pre-Maxwell
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
0dd55db10f nv50/ir: add support for converting ATOMFADD to proper ir
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
9867f2a1f7 st/mesa: expose GL_NV_shader_atomic_float when ATOMFADD is supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
4d5a6a1649 st/mesa: select ATOMFADD when source type is float
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
d139231b32 gallium: add PIPE_CAP_TGSI_ATOMFADD to indicate support
ATOMFADD is a little special -- make drivers have to specify it
explicitly.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
5574414edc tgsi: add ATOMFADD operation
This is supported by at least NVIDIA hardware, and exposeable via GL
extensions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-26 20:04:57 -05:00
Ilia Mirkin
bac8534267 st/mesa: allow glDrawElements to work with GL_SELECT feedback
Not sure if this ever worked, but the current logic for setting the
min/max index is definitely wrong for indexed draws. While we're at it,
bring in all the usual logic from the non-indirect drawing path.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109086
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-12-26 19:30:33 -05:00
Eric Anholt
7d7ecfbcbc gallium/ttn: Fix setup of outputs_written.
We need a 64-bit value, otherwise we only handle the low 32, and happen to
sign-extend to claim to write all varying slots if VARYING_SLOT_VAR2 was
used.

Fixes: 4d0b2c7aaa ("ttn: Update shader->info as we generate code.")
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-26 11:42:09 -08:00
Lionel Landwerlin
e2ae5f2f0a anv: don't do partial resolve on layer > 0
We've made the choice not to use fast clears on layer > 0 with
multilayer images. This is partly because we would need to store
multiple clear colors for each layer, making the existing memory
layout, already including aux surfaces, fast clear color, image state,
etc... even more complex.

Partial resolves are the operations transfering the clear colors into
the auxiliary buffers. This operation is currently implemented in
Blorp by loading the clear color from the image's BO, into a shader
that then samples from the auxiliary buffer and writes the color only
if it isn't there already.

The problem here is that because we store only one clear color for all
layers and it is used for partial resolves. If you trigger a partial
clear on a layer > 0, then you're likely to deal with a color that is
not what you actually want. In the particular issues below, we have
multiple layers, each cleared with a different color but the partial
resolve just writes the wrong color into the auxiliary buffers for
layers > 0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108910
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Cc: mesa-stable@lists.freedesktop.org
2018-12-24 09:42:46 +00:00
Axel Davy
c6b37e5412 st/nine: Increase the limit of cached ff shaders
100 is too small for some games, which triggers recompilations
every frame. Increase to 1024.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-12-23 08:14:50 +01:00
Axel Davy
104681c5d5 st/nine: Add src reference to nine_context_range_upload
Just like nine_context_box_upload, nine_context_range_upload
should reference the src, which holds the ram source buffer.

Fixes: https://github.com/iXit/Mesa-3D/issues/327
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
2018-12-23 08:14:50 +01:00
Axel Davy
42d672fa6a st/nine: Bind src not dst in nine_context_box_upload
nine_context_box_upload uploads a ram buffer (from src)
to a pipe_resource (dst).
We already have a refcount on the pipe_resource,
what needs to be protected from release is the ram buffer,
thus a reference to src.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
2018-12-23 08:14:50 +01:00
Axel Davy
f91f748fab st/nine: Fix volumetexture dtor on ctor failure
The dtor is called on allocation failure,
thus we must check the volumes are allocated
before trying to release them.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
2018-12-23 08:14:50 +01:00
Axel Davy
1cc8192ad0 st/nine: Switch to presentation buffer if resize is detected
This enables to match the window size
on resize on all cases, as it only works
currently with presentation buffers.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-12-23 08:14:50 +01:00
Axel Davy
c442dd7890 st/nine: Use helper to release swapchain buffers later
This patch introduces a structure to release the
present_handles only when they are fully released
by the server, thus making
"DestroyD3DWindowBuffer" actually release the buffer
right away when called.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-12-23 08:14:50 +01:00
Rob Clark
51a44c3aac freedreno/a6xx: fix 3d texture layout
Maybe not 100% perfect, but seems to be a pretty good approximation of
that.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:29:15 -05:00
Rob Clark
8f60f1381d freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:28:50 -05:00
Rob Clark
be9ec158d7 freedreno/a6xx: improve setup_slices() debug msgs
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:28:24 -05:00
Rob Clark
2b497fc507 freedreno/a6xx: simplify special case for 3d layout
This logic can be re-written as the two cases for 3d (ie. before/after
the miplevel sizes start reducing) vs everything else.  I think it is
easier to read this way.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:27:57 -05:00
Rob Clark
d71a50f831 freedreno: combine fd_resource_layer_offset()/fd_resource_offset()
We really only need this logic in one place.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:27:37 -05:00
Rob Clark
6667dde098 freedreno/ir3: don't treat all inputs/outputs as vec4
This was a hold-over from the early TGSI days, and mostly not needed
with NIR.  This avoids burning an entire 4 consecutive scalar regs
for vec3 outputs, for example.  Which fixes a few places that we were
doing worse that we should on register usage.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:27:21 -05:00
Rob Clark
3453814622 freedreno/ir3: fix fallout of extra assert
Fixes the following crash that happened after d6110d4d

The problem happens if we first compile a "vanilla" shader with nothing
lowered in NIR, which perform the final lowering passes on so->shader->
nir (including nir_lower_locals_to_regs()), and then later we have
compile a shader with some lowering.  The second time through we would
have already done nir_lower_locals_to_regs().

Arguably this was already a bug, just one we hadn't noticed yet.

Fixes: d6110d4d54 intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-21 19:04:22 -05:00
Kenneth Graunke
626f2477ab st/nir: Drop unused gl_program parameter in VS input handling helper.
Nobody uses this, so let's drop it.  This makes the helper callable
from places without a gl_program.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:32 -08:00
Kenneth Graunke
3a78b46e59 st/nir: Gather info after applying lowering FS variant features
DrawPixels lowering, for example, adds new varyings that need to be
accounted for in inputs_read.  The earlier info gathering at link time
cannot account for this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:30 -08:00
Kenneth Graunke
bcb6f19947 st/mesa: Combine the DrawPixels and Bitmap passthrough VS programs.
They're now identical, so we can just compile it once.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:29 -08:00
Kenneth Graunke
80dd9dfe33 st/mesa: Don't open code the drawpixels vertex shader.
Now that we always copy color, we can just use the util function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:28 -08:00
Kenneth Graunke
ed1a356c5e st/mesa: Drop !passColor optimization in drawpixels shaders.
The glDrawPixels passthrough vertex shader copies position and texcoord
vertex attributes to varying outputs.  It also optionally copies a third
gl_Color attribute, which sometimes is unnecessary.  Until now, we've
compiled separate variants of the shader, one of which does this extra
copy, and the other of which doesn't.  We have done this since 2007.

But, the vertex shader runs for a whopping four vertices, and so the
cost of a copying a single input to output is likely inconsequential.
In theory, we could bind one fewer vertex element - but we always bind
all three regardless.  So, we don't even get that savings.

This patch unifies the two, so we always copy the optional color,
and save having to compile the variant.  It also makes the VS input
interface match up with the vertex element state without any dead
(unused) input attributes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:25 -08:00
Kenneth Graunke
42d31e0516 st/mesa: Drop dead 'passthrough_fs' field.
Dead since 2015 (commit 5142564734).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-21 15:29:20 -08:00
Bas Nieuwenhuizen
bba5749484 radv: Fix wrongly positioned paren.
Trivial.

Fixes: 9f0bfbed11 "radv: Work around non-renderable 128bpp compressed 3d textures on GFX9."
2018-12-21 21:06:55 +01:00
Dylan Baker
1e872d1486 docs: add note about using backticks for rbs in gitlab
So that gitlab will render the < and > correctly allowing the tag to be
copy-n-pasted without additional formatting.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-12-21 17:43:56 +00:00
Alex Deucher
516160d717 pci_ids: add new VegaM pci id
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-12-21 11:51:34 -05:00
Roland Scheidegger
171983dc89 gallivm: abort when trying to use non-existing intrinsic
Whenever llvm removes an intrinsic (we're using), we're hitting segfaults
due to llvm doing calls to address 0 in the jitted code instead.
However, Jose figured out we can actually detect this with
LLVMGetIntrinsicID(), so use this to abort, so we don't have to wonder
what got broken. (Of course, someone still needs to fix the code to
no longer use this intrinsic.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-12-21 17:37:00 +01:00
Roland Scheidegger
f3b1acff48 gallivm: don't use pavg.b intrinsic on llvm >= 6.0
This intrinsic disppeared with llvm 6.0, using it ends up in segfaults
(due to llvm issuing call to NULL address in the jited shaders).
Add code doing the same thing as the autoupgrade code in llvm so it
can be matched and replaced back with a pavgb.

While here, also improve lp_test_format, so it tests both with and without
cache (as it was, it tested the cache versions only, whereas cache is
actually disabled in llvmpipe, and in any case even with it enabled
vertex and geometry shaders wouldn't use it). (Although at least for
the unorm8 uncached fetch, the code is still quite different to what
llvmpipe is using, since that would use unorm8x16 type, whereas
the test code is using unorm8x4 type, hence disabling some intrinsic
paths.)

Fixes: 6f4083143b ("gallivm: use llvm jit code for decoding s3tc")

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-12-21 17:35:05 +01:00
Emil Velikov
a8d020c3dc travis: meson: port gallium build combinations over
This commit adds a number of build combinations:

 - Gallium Drivers {SWR, RadeonSI, Others)
Each one has different LLVM requirements. Building SWR alone is twice
as slow as all other drivers combined.

 - Gallium ST Clover LLVM {5,6,7}
Because C++ API changes all the time. Analogous to above building
Clover takes as much time as building all other ST combined.

 - Gallium ST Others
Nouveau is used, instead of i915g since meson has explicit target
tracking. Meaning that a configure error is thrown if we use i915g
with say va, vdpau or others.

Note: LLVM prior to 5.0 is intentionally dropped. If needed we can add
that later.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 01:34:59 +00:00
Emil Velikov
39634f2f35 travis: meson: add explicit handling to gallium ST
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 13:52:20 +00:00
Emil Velikov
51318c32fe travis: meson: explicitly control the DRI loaders
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 13:42:36 +00:00
Emil Velikov
e890aaabed travis: meson: add unwind handling
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-12 13:33:14 +00:00
Emil Velikov
266ae2225e travis: meson: use FOO_DRIVERS directly
It makes for a shorter MESON_OPTIONS and cleaner handling.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 13:18:54 +00:00
Dylan Baker
31c162ad22 travis: meson: enable unit tests
v2: [Emil] pass the argument directly to meson

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 10:34:51 -08:00
Dylan Baker
116f0fb216 travis: Don't try to read libdrm out of configure.ac
Since we're going to delete it shortly

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 11:09:21 -08:00
Dylan Baker
ecf96413bb travis: meson: use native files to override llvm-config
This is the supported way to do this, and should be more robust and
reliable.

v2: [Emil]
 - enable backslash escapes
 - don't hardcode the path
 - pass the argument directly to meson

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-11 10:40:25 -08:00
Emil Velikov
81173fd69f travis: printout llvm-config --version
Provides quick and easy feedback.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 10:38:20 +00:00
Emil Velikov
de72c1fe6c travis: meson: print the configured state
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 17:43:07 +00:00
Emil Velikov
7c38d7b7c8 travis: flip to distro xenial, drop sudo false
The latter is the default these days and Travis will be removing sudo
soonish.

Flipping to xenial, allows us to remove a bunch of hacks we have. Plus
it prevents us from adding new ones, to workaround what seems like a
gcc/binutils bug. For example (from the upcoming meson build):

FAILED: ccache c++  -o src/gallium/targets/pipe-loader/pipe_r600.so ...
  ... src/util/libmesa_util.a ... /usr/lib/x86_64-linux-gnu/libz.so ...

src/util/libmesa_util.a(disk_cache.c.o): In function `deflate_and_write_to_disk':
_build/../src/util/disk_cache.c:746: undefined reference to `deflateInit_'
_build/../src/util/disk_cache.c:765: undefined reference to `deflate'
...

As we can see, even though libz.so is explicitly passed after the
object that requires it - the linker still fails to see the symbols.
Avoid all those situations - flip the switch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 11:20:41 +00:00
Emil Velikov
12187550f9 configure: add CXX11_CXXFLAGS to LLVM_CXXFLAGS
Seemingly with LLVM7 and GCC 5.0, the former won't properly advertise
-std=c++11 and the latter will choke.

dd this temporary workaround, otherwise we'll get errors like:

In file included from /usr/include/c++/5/type_traits:35:0,
                 from /usr/lib/llvm-7/include/llvm/Support/type_traits.h:18,
                 from /usr/lib/llvm-7/include/llvm/ADT/Optional.h:22,
                 from /usr/lib/llvm-7/include/llvm/ADT/STLExtras.h:20,
                 from /usr/lib/llvm-7/include/llvm/ADT/StringRef.h:13,
                 from /usr/lib/llvm-7/include/llvm/Target/TargetMachine.h:17,
                 from ../../../src/amd/common/ac_llvm_helper.cpp:36:
/usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 11:56:40 +00:00
Emil Velikov
f331419f26 glx/test: meson: assorted include fixes
Swap '..' with the symbolic inc_glx and add glproto as dependency. That
will pull the correct include, effectively fixing the tests on macOS.

Fixes: a47c525f32 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 19:24:14 +00:00
Emil Velikov
e139d7a8a3 glx: meson: wire up the dispatch-index-check test
Accidentally dropped with earlier commit.!

Fixes: 4ccb981673 ("meson: Use consistent style for tests")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 19:07:52 +00:00
Emil Velikov
b44875e2dc glx: meson: drop includes from a link-only library
When producing the final libGL.so/libGLX_mesa.so we only link the local
static helper lib (libglx). Thus there's no reason for the includes.

Fixes: a47c525f32 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 17:55:08 +00:00
Emil Velikov
9527f9ea26 TODO: glx: meson: build dri based glx tests, only with -Dglx=dri
The library itself (libGL) is only built when -Dglx=dri, yet it's
accompanying tests are build even with -Dglx=xlib.

Adjust the guards, so we don't build the tests when they are not
applicable

v2:
 - Reword commit message (Dylan)
 - Drop build_by_default hunk (Dylan)

Fixes: a47c525f32 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 17:47:36 +00:00
Emil Velikov
2eedb79e1a pipe-loader: meson: reference correct library
The library is called libgalliumvl_stub - note singular.

Fixes: 42ea0631f1 ("meson: build clover")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 04:10:50 +00:00
Emil Velikov
9d10581897 meson: don't require glx/egl/gbm with gallium drivers
The gallium drivers do not require a DRI loader. Drop the artificial
and unnecessary restriction.

Fixes: af9d276134 ("meson: build libmesa_gallium")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-13 03:54:03 +00:00
Emil Velikov
e0dbfc9953 bin/get-pick-list.sh: warn when commit lists invalid sha
We had cases where people would list old/invalid sha in the commit.
Add a trivial checker to catch those and throw a warning.

CC: Juan A. Suarez <jasuarez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-12-21 14:39:52 +00:00
Emil Velikov
6b296f64af bin/get-pick-list.sh: rework handing of sha nominations
Currently our is_sha_nomination does:
 - folds any whitespace, attempting to extract sha-like information
 - checks that at least one of the shas has landed

Split it in two and do sha-like validation first.

This way, commits with mesa-stable and sha nominations will feature the
fixes/revert/etc instead of stable (a) or will be omitted if not
applicable for the respective branch (b).

Misc examples from 18.3

(a)
-[   stable ] 5bc509363b glx: make xf86vidmode mandatory for direct rendering
+[    fixes ] 5bc509363b glx: make xf86vidmode mandatory for direct rendering

(b)
-[   stable ] 9a7b319903 anv/query: flush render target before copying results

CC: Juan A. Suarez <jasuarez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-12-21 14:39:34 +00:00
Eric Anholt
17218a0406 vc4: Hook up perf_debug() output to GL_ARB_debug_output as well.
This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.
2018-12-20 11:31:25 -08:00
Rhys Kidd
acc481ad79 vc4: Wire up core pipe_debug_callback
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-20 11:31:19 -08:00
Eric Anholt
ba36312fbd v3d: Hook up perf_debug() output to GL_ARB_debug output as well.
This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.
2018-12-20 11:31:19 -08:00
Rhys Kidd
d3991d2472 v3d: Wire up core pipe_debug_callback
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-20 11:31:16 -08:00
Eric Anholt
d80761b8f3 v3d: Drop shadow comparison state from shader variant key.
The shadow state is now in the sampler.
2018-12-20 11:29:30 -08:00
Eric Anholt
0e2758daad v3d: Fix simulator mode on i915 render nodes.
i915 render nodes refuse the dumb ioctls, so the simulator would crash on
the original non-apitrace shader-db.  Replace them with direct i915 calls
if we detect that we're on one of their gem fds.
2018-12-20 11:29:30 -08:00
Dylan Baker
0ff7eed289 docs/meson: Recommend not using CFLAGS and friends
Because of the many caveats involved, using -Dc_args instead of CFLAGS
is recommended both by meson upstream and by us.

v2: - Fix typo

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-20 11:16:40 -08:00
Samuel Pitoiset
9606310081 radv: enable shaderStorageImageMultisample feature on GFX8+
Untested on older chips.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:19 +01:00
Samuel Pitoiset
6b976024a8 radv: add support for FMASK expand
Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:17 +01:00
Samuel Pitoiset
fa16da53d8 radv: initialize FMASK for images in fully expanded mode
The value depends on the number of samples.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:15 +01:00
Samuel Pitoiset
65d82c84d2 ac/nir: restrict fmask lookup to image load intrinsics
We don't ever want to do the fmask lookup on a atomic or
store, the fmask should have been decompressed if the
surface has been moved to IMAGE_LAYOUT.

Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:11 +01:00
Samuel Pitoiset
f45e43e156 spirv: add support for SpvCapabilityStorageImageMultisample
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:09 +01:00
Samuel Pitoiset
5b1ec10e4c radv: compute optimal VM alignment for imported buffers
This fixes GPU hangs on GFX9 with
dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.*

Copied from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 17:34:04 +01:00
Bas Nieuwenhuizen
9f0bfbed11 radv: Work around non-renderable 128bpp compressed 3d textures on GFX9.
Exactly what title says, the new addrlib does not allow the above with
certain dimensions that the CTS seems to hit. Work around it by not
allowing the app to render to it via compat with  other 128bpp formats
and do not render to it ourselves during copies.

Fixes: 776b911365 "amd/addrlib: update Mesa's copy of addrlib"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-20 15:07:20 +01:00
Samuel Pitoiset
5c7935f8fc radv: fix subpass image transitions with multiviews
The driver needs to decompress all image layers if a fast
depth/color clear has been performed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 13:36:37 +01:00
Samuel Pitoiset
0a7e767e58 radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8
This workaround has been introduced by 135e4d434f for fixing
DXVK GPU hangs with many games. It is no longer needed since
LLVM r345718.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 12:09:57 +01:00
Samuel Pitoiset
576040f2e5 ac/nir: remove the bitfield_extract workaround for LLVM 8
This workaround has been introduced by 3d41757788 and it
is no longer needed since LLVM r346422.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-20 09:40:16 +01:00
Iago Toral Quiroga
d6110d4d54 intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
The former expects to see SSA-only things, but the latter injects registers.

The assertions in the lowering where not seeing this because they asserted
on the bit_size values only, not on the is_ssa field, so add that assertion
too.

Fixes: 11dc130779 "nir: Add a bool to int32 lowering pass"
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-20 08:02:44 +01:00
Ilia Mirkin
1250383e36 st/mesa: remove sampler associated with buffer texture in pbo logic
A long time ago, when this was first implemented, not having a sampler
bound would cause problems on Fermi. I didn't work out the reasons, but
the solution was simple -- just put the samplers back in.

Since then, regular texturing paths appear to have lost their associated
samplers which required a fuller investigation and fix in nouveau. Now
that this is done, this code should no longer need a sampler state for
fetching texels from a buffer texture.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-20 00:27:16 -05:00
Roland Scheidegger
6f4083143b gallivm: use llvm jit code for decoding s3tc
This is (much) faster than using the util fallback.
(Note that there's two methods here, one would use a cache, similar to
the existing code (although the cache was disabled), except the block
decode is done with jit code, the other directly decodes the required
pixels. For now don't use the cache (being direct-mapped is suboptimal,
but it's difficult to come up with something better which doesn't have
too much overhead.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-12-20 06:03:20 +01:00
Jason Ekstrand
ec1d5841fa radv/query: Use 1-bit booleans in query shaders
Fixes: 44227453ec "nir: Switch to using 1-bit Booleans for almost..."
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-19 16:36:40 -06:00
Jason Ekstrand
6896c91c10 radv/query: Add a nir_test_flag helper
This is little more than an iadd_imm right now but it will help in the
next commit where we refactor things further.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-19 16:36:26 -06:00
Eduardo Lima Mitev
c2ebc38052 freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat()
An earlier patch that introduced the function failed to handle the case
where an image format layout qualifier is not specified, which is allowed
on desktop GL profiles. In these cases, nir_variable's image format is
GL_NONE, and we don't need to print a debug message for those.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-19 22:49:05 +01:00
Eric Anholt
90818558f0 docs: Add an encouraging note about providing reviews and acks.
Across several projects I've seen new contributors say "I wasn't sure if I
should provide a review tag since I'm not really an expert in this area."
Everyone I know already applies some implicit weighting to reviews from
different people, so encourage participation.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-19 12:49:17 -08:00
Eric Anholt
463df0ffe2 docs: Add a note that MRs should still include any r-b or a-b tags.
v2: Mention "Tested-by" too

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-19 12:48:13 -08:00
Eric Anholt
fcfb7f573c v3d: Load and store aligned utiles all at once.
This calls the expensive uif offset function once per utile, but it still
gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over
calling it on each pixel.
2018-12-19 10:27:26 -08:00
Eric Anholt
7c56b7a6ea v3d: Add a fallthrough path for utile load/store of 32 byte lines.
Now that V3D has 8 byte per pixel formats exposed, we've got stride==32
utiles to load and store.  Just handle them through the non-NEON paths for
now.
2018-12-19 10:27:26 -08:00
Eric Anholt
f6a0f4f41e vc4: Move the utile load/store functions to a header for reuse by v3d.
These implementations of whole-utile load/stores would be the same for
v3d, though the layouts of blocks of utiles has changed.
2018-12-19 10:27:26 -08:00
Eric Anholt
8ee752194c v3d: Implement texture_subdata to reduce teximage upload copies.
This lets us store the non-PBO glTexImage data directly into the tiled
image without making an extra untiled memcpy for the gallium transfer.
Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around
in the kernel mapping and unmapping the transfer's temporary area.
2018-12-19 10:27:26 -08:00
Eric Anholt
e09d8aecb4 v3d: Remove dead prototypes for load/store utile functions. 2018-12-19 10:27:26 -08:00
Eric Anholt
fcf881adda v3d: Don't try to create shadow tiled temporaries for 1D textures.
They're raster order anyway, so we'd assertion fail along with wasting
bandwidth.

Fixes: 6ad9e8690d ("v3d: Add support for texturing from linear.")
2018-12-19 10:27:21 -08:00
Eric Anholt
b5adc744ba v3d: Fix check for TFU job completion in the simulator.
We're waiting for the jobs-completed count to increment (with wrapping),
not to reach its starting state.  This mostly ended up working out because
the next v3d_hw_tick() for a submit CL would end up doing the TFU
operation first, but it did fail when a blit was used for glReadPixels()
at the end of a test.

Fixes: ee0549ff9a ("v3d: Add the V3D TFU submit interface to the simulator.")
2018-12-19 10:26:04 -08:00
Eric Anholt
365728dc5d v3d: Put the dst bo first in the list of BOs for TFU calls.
In the UAPI, the first BO is the destination, and the one the kernel
should do an exclusive reservation on.  Currently we only do exclusive
reservations, anyway.  However, in the simulator path I was only copying
back the "destination" BO (actually src in this case), and this caused
regressions once I fixed the simulator to actually complete TFU before
returning (since otherwise, the TFU op would happen at the start of the
next CL submit and the draw would get the right contents).

Fixes: 976ea90bdc ("v3d: Add support for using the TFU to do some blits.")
2018-12-19 10:26:04 -08:00
Caio Marcelo de Oliveira Filho
947f7b452a nir: properly find the entry to keep in copy_prop_vars
When copy propagation handles a store/copy, it iterates the current
copy entries to remove aliases, but keeps the "equal" entry (if
exists) to be updated.

The removal step may swap the entries around (to ensure there are no
holes), invalidating previous iteration pointers.  The bug was saving
such pointer to use later.  Change the code to first perform the
removals and then find the remaining right entry.

This was causing updates to be lost since they were being made to an
entry that was not part of the current copies.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108624
Fixes: b3c6146925 "nir: Copy propagation between blocks"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-19 09:33:36 -08:00
Michel Dänzer
9d8395bf0e winsys/amdgpu: Pull in LLVM CFLAGS
Fixes build failure if the LLVM headers aren't in a standard include
directory.

Fixes: ec22dd34c8 "radeonsi: move SI_FORCE_FAMILY functionality to
                     winsys"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-12-19 17:54:18 +01:00
Caio Marcelo de Oliveira Filho
0ddc911f4d nir: properly clear the entry sources in copy_prop_vars
When updating a copy entry source value from a "non-SSA" (the data
come from a copy instruction) to a "SSA" (the data or parts of it come
from SSA values), it was possible to hold invalid data in ssa[0]
depending on the writemask.  Because the union, ssa[0] could contain a
pointer to a nir_deref_instr left-over from previous non-SSA usage.

Change code to clean up the array before use to avoid invalid data
around.

Fixes: 62332d139c "nir: Add a local variable-based copy propagation pass"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-19 08:35:48 -08:00
Eric Engestrom
0e4c7c3d5b docs: format code blocks a bit nicely
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-19 16:32:30 +00:00
Eric Engestrom
b0319d0768 docs: add meson cross compilation instructions
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-19 16:31:51 +00:00
Gurchetan Singh
b45aa6290b virgl: move resource creation / import / destruction to common code
We can remove some duplicated code.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
1d3d311133 virgl: move resource metadata into base resource
A resource is just a buffer with some metadata.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
db77573d7b virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT
Previously, we ignored the the glUnmap(..) operation and
flushed before we flush the cbuf.  Now, let's just flush
the data when we unmap.

Neither method is optimal, for example:

glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT)
glFlushMappedBufferRange(.., 25, 30)
glFlushMappedBufferRange(.., 65, 70)

We'll end up flushing 25 --> 70.  Maybe we can fix this later.

v2: Add fixme comment in the code (Elie)

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
11939f6fa2 virgl: make virgl_buffers use resource helpers
We can reuse the helpers we created.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
4e2c77cd51 virgl: make transfer code with PIPE_BUFFER targets
util_format_get_blocksize returns 1 for R8 formats (all
PIPE_BUFFERs are R8).

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
174f530008 virgl: consolidate transfer code
We could allocate and destroy transfers in one place.

v2: Keep l_stride around.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
13626b46f1 virgl: store layer_stride in metadata
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
2a44acc83b virgl: move vrend_get_tex_image_offset to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
f749229a8e virgl: move virgl_resource_layout to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
a63da9c062 virgl: move texture metadata to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
6e7d396ad3 virgl: remove unnessecary code
With commit 89b479, we moved to tracking buffer cleanliness
when binding.

TEST=dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
6d13d1aadb virgl: texture_transfer_pool --> transfer_pool
It's used for all types of resources.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Nicolai Hähnle
d73a25f2c0 radeonsi: const-ify the si_query_ops
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:07 +01:00
Nicolai Hähnle
c85b0dea0a radeonsi: split perfcounter queries from si_query_hw
Remove a level of indirection to make the code more explicit -- should
make it easier to follow what's going on.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:04 +01:00
Nicolai Hähnle
e0f0d3675d radeonsi: factor si_query_buffer logic out of si_query_hw
This is a move towards using composition instead of inheritance for
different query types.

This change weakens out-of-memory error reporting somewhat, though this
should be acceptable since we didn't consistently report such errors in
the first place.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:01 +01:00
Nicolai Hähnle
0fc6e573dd radeonsi: move query suspend logic into the top-level si_query struct
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:59 +01:00
Nicolai Hähnle
e2b9329f17 radeonsi: move remaining perfcounter code into si_perfcounter.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:57 +01:00
Nicolai Hähnle
7dd289d9e4 radeonsi: track constant buffer bind history in si_pipe_set_constant_buffer
Other callers of si_set_constant_buffer don't need it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:54 +01:00
Nicolai Hähnle
829d417914 radeonsi: use si_set_rw_shader_buffer for setting streamout buffers
Reduce the number of places that encode buffer descriptors.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:52 +01:00
Nicolai Hähnle
ce785f5ffd radeonsi: add an si_set_rw_shader_buffer convenience function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:50 +01:00
Nicolai Hähnle
556c4c42b7 radeonsi: avoid using hard-coded SI_NUM_RW_BUFFERS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:48 +01:00
Nicolai Hähnle
1e49d72317 radeonsi: show the fixed function TCS in debug dumps
This is rather important for merged VS/TCS as LSHS shaders...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:45 +01:00
Nicolai Hähnle
6e67e79de4 radeonsi: const-ify si_set_tesseval_regs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:42 +01:00
Nicolai Hähnle
5c841a1b1e radeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:39 +01:00
Nicolai Hähnle
0d58dcc3cf radeonsi: don't set RAW_WAIT for CP DMA clears
There is never a read-after-write hazard because the command doesn't read.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:34 +01:00
Nicolai Hähnle
23af72af25 radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:32 +01:00
Nicolai Hähnle
f18b2ac0db radeonsi: add si_init_draw_functions and make some functions static
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:30 +01:00
Nicolai Hähnle
555cb668cc radeonsi: extract declare_vs_blit_inputs
Prepare for some later refactoring.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:27 +01:00
Nicolai Hähnle
ec22dd34c8 radeonsi: move SI_FORCE_FAMILY functionality to winsys
This helps some debugging cases by initializing addrlib with
slightly more appropriate settings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:25 +01:00
Nicolai Hähnle
0ef263d62f ac/surface: 3D and cube surfaces are never displayable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:22 +01:00
Nicolai Hähnle
8efaffa893 amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan
Allow for a unified but efficient treatment of adding a bitmask over a
wave or an entire threadgroup.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:19 +01:00
Nicolai Hähnle
300876a9a7 amd/common: scan/reduce across waves of a workgroup
Order-aware scan/reduce can trade-off LDS traffic for external atomics
memory traffic in producer/consumer compute shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:17 +01:00
Nicolai Hähnle
3963402fd3 amd/common: add ac_build_ifcc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:15 +01:00
Nicolai Hähnle
3c77f26ccc amd/common: whitespace fixes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:12 +01:00
Nicolai Hähnle
76c5ad1995 amd/sid_tables: add additional python3 compatibility imports
This happened to bite me while doing some experiments.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:09 +01:00
Nicolai Hähnle
6f0322b16a r600: remove redundant semicolon
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:00:49 +01:00
Nicolai Hähnle
7230cb8f2b ddebug: always flush when requested, even when hang detection is disabled
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 11:59:18 +01:00
Nicolai Hähnle
539fdc49f1 ddebug: simplify watchdog loop and fix crash in the no-timeout case
The following race condition could occur in the no-timeout case:

  API thread               Gallium thread            Watchdog
  ----------               --------------            --------
  dd_before_draw
  u_threaded_context draw
  dd_after_draw
    add to dctx->records
    signal watchdog
                                                     dump & destroy record
                           execute draw
                           dd_after_draw_async
                             use-after-free!

Alternatively, the same scenario would assert in a debug build when
destroying the record because record->driver_finished has not signaled.

Fix this and simplify the logic at the same time by
- handing the record pointers off to the watchdog thread *before* each
  draw call and
- waiting on the driver_finished fence in the watchdog thread

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 11:59:10 +01:00
Tapani Pälli
3627c9efff anv/android: turn on VK_ANDROID_external_memory_android_hardware_buffer
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:42 +02:00
Tapani Pälli
3dc424a4f4 anv: ignore VkSamplerYcbcrConversion on non-yuv formats
This fulfills a requirement for clients that want to utilize same
code path for images with external formats (VK_FORMAT_UNDEFINED) and
"regular" RGBA images where format is known. This is similar to how
OES_EGL_image_external works.

To support this, we allow color conversion samplers for non-YUV
formats but skip setting up conversion when format does not have
can_ycbcr flag set.

v2: add comment and bundle can_ycbcr to the existing break
    condition (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
a7b7772cfb anv: support VkSamplerYcbcrConversionInfo in vkCreateImageView
If a conversion struct was passed, then initialize view using
format from the conversion structure.

v2: use vk_format directly from the anv_format struct
v3: added some assertions (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
bb0721aea4 anv: add VkFormat field as part of anv_format
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
c070b0e25f anv: support VkExternalFormatANDROID in vkCreateSamplerYcbcrConversion
If external format is used, we store the external format identifier in
conversion to be used later when creating VkImageView.

v2: rebase to b43f955037 changes
v3: added assert, ignore components when creating external
    format conversion (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
f1654fa7e3 anv/android: support creating images from external format
Since we don't know the exact format at creation time, some initialization
is done only when bound with memory in vkBindImageMemory.

v2: demand dedicated allocation in vkGetImageMemoryRequirements2 if
    image has external format

v3: refactor prepare_ahw_image, support vkBindImageMemory2,
    calculate stride correctly for rgb(x) surfaces, rename as
    'resolve_ahw_image'

v4: rebase to b43f955037 changes
v5: add some assertions to verify input correctness (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
517103abf1 anv/android: add ahardwarebuffer external memory properties
v2: have separate memory properties for android, set usage
    flags for buffers correctly

v3: code cleanup (Jason)
    + limit maxArrayLayers to 1 for AHardwareBuffer based images

v4: rebase to b43f955037 changes

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
c79a528d2b anv/android: support import/export of AHardwareBuffer objects
v2: add support for non-image buffers (AHARDWAREBUFFER_FORMAT_BLOB)
v3: properly handle usage bits when creating from image
v4: refactor, code cleanup (Jason)
v5: rebase to b43f955037 changes,
    initialize bo flags as ANV_BO_EXTERNAL (Lionel)
v6: add assert that anv_bo_cache_import succeeds, add comment
    about multi-bo support to clarify current implementation (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
5c65c60d6c anv: refactor, remove else block in AllocateMemory
This makes it cleaner to introduce more cases where we import memory
from different types of external memory buffers.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
884fc90fde anv: add anv_ahw_usage_from_vk_usage helper function
v2: rebase to b43f955037 changes

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
1e6a44400a anv/android: add GetAndroidHardwareBufferPropertiesANDROID
Use the anv_format address in formats table as implementation-defined
external format identifier for now. When adding YUV format support this
might need to change.

v2: code cleanup (Jason)
v3: set anv_format address as identifier
v4: setup suggestedYcbcrModel and suggested[X|Y]ChromaOffset
    as expected for HAL_PIXEL_FORMAT_NV12_Y_TILED_INTEL
v5: set linear tiling for GPU_DATA_BUFFER usage, add comment
    about multi-bo support to clarify current implementation (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
aa94e01bfe anv: add from/to helpers with android and vulkan formats
v2: handle R8G8B8X8 as R8G8B8_UNORM (Jason)
v3: add HAL_PIXEL_FORMAT_NV12_Y_TILED_INTEL, we make it define
    for now to avoid direct dependency to minigbm headers

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
c1f15a0a1a anv: make anv_get_image_format_features public
This will be utilized later by GetAndroidHardwareBufferPropertiesANDROID.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
8a469fd335 anv: refactor make_surface to use data from anv_image
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
2a98e5bbb9 anv: add create_flags as part of anv_image
This will make it possible for next patch to rip
anv_image_create_info out from make_surface function.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Ian Romanick
96c4b135e3 nir/algebraic: Don't put quotes around floating point literals
The quotation marks around 1.0 cause it to be treated as a string
instead of a floating point value.  The generator then treats it as an
arbitrary variable replacement, so any iand involving a ('ineg', ('b2i',
a)) matches.

v2: Remove misleading comment about sized literals (suggested by
Timothy).  Add assertion that the name of a varible is entierly
alphabetic (suggested by Jason).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com> [v1]
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> [v1]
Fixes: 6bcd2af086 ("nir/algebraic: Add some optimizations for D3D-style Booleans")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109075
2018-12-18 23:28:31 -08:00
Vinson Lee
0f7ba5758b meson: Fix libsensors detection.
Fixes: 5e71efef44 ("meson: Add lmsensors support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-18 19:24:01 -08:00
Vinson Lee
84f39e5971 meson: Fix typo.
Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-18 19:14:11 -08:00
Sagar Ghuge
933c44bcc4 nir: Add a new lowering option to lower 3D surfaces from txd to txl.
Tested on gen9.

v2: Rename lower_txd_3d_surafaces flag to lower_txd_3d (Jason Ekstrand)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-18 13:44:09 -08:00
Christian Gmeiner
7ea8e54dd6 meson: add etnaviv to the tools option
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-18 21:50:58 +01:00
Adam Jackson
e36d136102 specs: Bump GLX_MESA_query_renderer to version 9
Note that we have an official GL extension number, pick the appropriate
section of the GLX spec to modify, and add changelog.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-12-18 15:46:10 -05:00
Adam Jackson
9e8332ebc2 specs: Remove GLX_RENDERER_ID_MESA from GLX_MESA_query_renderer
This has not even had an attempt at implementation. If you asked for
renderer 0 - which, the spec implies, should always work - then
dri2_convert_glx_attribs would fail, we'd silently fall back to creating
an indirect context, and xserver would also not recognize the attribute
and would throw BadValue at you.

The API would be difficult to use in any case, since there's no way to
enumerate how many renderers the screen has. I'd be tempted to add that
by defining:

    glXQueryRendererIntegerMESA(dpy, screen,
                                /* renderer = */ -1,
                                0, &value);

to return the number of renderers, but a new entrypoint might be
cleaner. Still, better to not specify it at all than to lie about it.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-12-18 15:46:10 -05:00
Adam Jackson
c63c391756 specs: Remove GLES profile interaction text from GLX_MESA_query_renderer
In one place we say, if GLES isn't supported then the profile version
will be 0.0. Then later we say, if the GLES profile extension isn't
supported then GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA is not
mentioned in the spec. A strict reading of the latter would mean that
GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA is not a recognized token,
and the query should instead return False.

The implementation does not check for the GLES profile extensions, and
the additional complexity doesn't seem worth it. Removing the
interaction text makes the spec match the implementation.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-12-18 15:46:10 -05:00
Eduardo Lima Mitev
5820e63418 freedreno/ir3: Make imageStore use num components from image format
emit_intrinsic_store_image() is always using 4 components when
collecting registers for the value. When image has less than
4 components (e.g, r32f, rg32i, etc) this results in extra mov
instructions.

This patch uses the actual number of components from the image format.

For example, in a shader like:

layout (r32f, binding=0) writeonly uniform imageBuffer u_image;
...
void main(void) {
   ...
   imageStore (u_image, some_offset, vec4(1.0));
   ...
}

instruction count is reduced in at least 3 instructions (note image
format is r32f, 1 component only).

This obviously reduces register pressure as well.

v2: - Added support for image formats from NV_image_format extension
    (Ilia Mirkin).
    - Return 4 components by default instead of asserting. (Rob Clark).

v3: Added more missing formats (Ilia Mirkin).

v4: Added a debug message for unknown image formats (Rob Clark).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-18 21:15:20 +01:00
Jason Ekstrand
5dad1abfdc nir/dead_write_vars: Get modes directly from derefs
Instead of going all the way back to the variable, just look at the
deref.  The modes are guaranteed to be the same by nir_validate whenever
the variable can be found.  This fixes clear_unused_for_modes for
derefs that don't have an accessible variable.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
fa40a58fd9 nir/copy_prop_vars: Get modes directly from derefs
Instead of going all the way back to the variable, just look at the
deref.  The modes are guaranteed to be the same by nir_validate whenever
the variable can be found.  This fixes apply_barrier_for_modes for
derefs that don't have an accessible variable.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
cf7fb39805 nir/lower_wpos_center: Look at derefs for modes
This is instead of looking all the way back to the variable which may
not exist for all derefs.  This makes this code properly ignore casts
with modes other than the mode[s] we care about (where casts aren't
allowed).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
867fe35a16 nir/lower_io_to_scalar: Look at derefs for modes
This is instead of looking all the way back to the variable which may
not exist for all derefs.  This makes this code properly ignore casts
with modes other than the mode[s] we care about (where casts aren't
allowed).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
3fe0363dda nir/lower_io_arrays_to_elements: Look at derefs for modes
This is instead of looking all the way back to the variable which may
not exist for all derefs.  This makes this code properly ignore casts
with modes other than the mode[s] we care about (where casts aren't
allowed).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
8cc0f92492 nir/linking_helpers: Look at derefs for modes
This is instead of looking all the way back to the variable which may
not exist for all derefs.  This makes this code properly ignore casts
with modes other than the mode[s] we care about (where casts aren't
allowed).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Jason Ekstrand
8410cf66d7 nir/propagate_invariant: Skip unknown vars
If we can't find the variable from the deref, just assume it isn't
invariant and continue on.  This can happen if, for instance, we're
writing to a deref that points into an SSBO.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-18 13:13:28 -06:00
Ian Romanick
29e4b949b4 Revert "nir/lower_indirect: Bail early if modes == 0"
"There's no point in walking the program if we're never going to
    actually lower anything."

Except we might lower compacted local arrays.  In that case, modes will
be 0, but there is still lowering to be done.

This reverts commit 7f75cf2a94.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109081
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
2018-12-18 10:47:54 -08:00
Lucas Stach
433ca3127a st/dri: replace format conversion functions with single mapping table
Each time I have to touch the buffer import/export functions in the dri
state tracker I get lost in the maze of functions converting between
DRI_IMAGE_FOURCC, DRI_IMAGE_FORMAT, DRI_IMAGE_COMPONENTS and pipe format.

Rip it out and replace by a single table, which defines the correspondence
between the different representations.

Also this now stores all the known representations in the __DRIimageRec,
to avoid the loss of information we currently have when importing a buffer
with a fourcc, which doesn't have a corresponding dri format.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-18 19:19:45 +01:00
Lucas Stach
67174d40f1 st/dri: allow both render and sampler compatible dma-buf formats
Currently all the EGL APIs are missing a way to specify how an imported
dma-buf is intended to be used. Demanding the format to be both usable
for sampling and rendering artificially restricts the list of formats a
driver is able to import.

Looking at how the Intel driver implements those DRI2 image APIs it
doesn't distinguish between render or sampler compatible formats. So
this patch aligns behavior between Intel and Gallium based drivers.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-18 19:19:40 +01:00
Lucas Stach
a3e592e839 etnaviv: use surface format directly
There is no need to do the detour over the resource behind the
surface to get the format. Use the surface format directly.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-12-18 19:07:10 +01:00
Dylan Baker
7a90886921 meson: Add toggle for glx-direct
GNU Hurd needs to turn off glx-direct, rather than special case it,
we'll just add a toggle.

CC: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-18 09:20:53 -08:00
Dylan Baker
8c77f4c76d meson: Add support for gnu hurd
CC: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-18 09:20:49 -08:00
Dylan Baker
6cf5f25bc5 meson: remove duplicate definition
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-18 09:18:12 -08:00
Dylan Baker
e430a034b9 meson: Fix ppc64 little endian detection
Old versions of meson returned ppc64le as the cpu_family for little
endian power8 cpus, versions >=0.48 don't do this, so the check wouldn't
work in that case. This generalizes the check to work for both old and
new versions of meson.

Fixes: 34bbb24ce7
       ("meson: Add support for ppc assembly/optimizations")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-18 09:17:54 -08:00
Jason Ekstrand
3feda3cf35 anv: Bump the patch version to 96
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-18 09:40:46 -06:00
Kenneth Graunke
3c71ba3baa i965: Don't override subslice count to 4 on Gen11.
Gen9-10 have fewer than 4 subslices per slice, so they need this to be
rounded up.  Gen11 isn't documented as needing this hack, and it can
also have more than 4 subslices, so the hack actually can break things.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-12-17 14:03:45 -08:00
Ian Romanick
af07141b33 intel/compiler: More peephole_select for pre-Gen6
No shader-db changes on any Gen6+ platform.

All of the shaders with cycles hurt by more than ~2% are from Master of
Orion.  All of the shaders have instructions helped.  It looks like the
pass enables some control flow to be converted to bcsels, then the
scheduler does dumb things.  These are new shaders (just added before
doing this shader-db run), so there's probably some low-hanging fruit.

Iron Lake
total instructions in shared programs: 8214327 -> 8213684 (<.01%)
instructions in affected programs: 84469 -> 83826 (-0.76%)
helped: 114
HURT: 26
helped stats (abs) min: 2 max: 18 x̄: 7.75 x̃: 9
helped stats (rel) min: 0.17% max: 13.73% x̄: 2.52% x̃: 1.05%
HURT stats (abs)   min: 2 max: 20 x̄: 9.23 x̃: 8
HURT stats (rel)   min: 0.70% max: 2.48% x̄: 1.66% x̃: 1.61%
95% mean confidence interval for instructions value: -5.87 -3.32
95% mean confidence interval for instructions %-change: -2.32% -1.17%
Instructions are helped.

total cycles in shared programs: 187736850 -> 187749314 (<.01%)
cycles in affected programs: 506750 -> 519214 (2.46%)
helped: 104
HURT: 36
helped stats (abs) min: 2 max: 72 x̄: 21.96 x̃: 16
helped stats (rel) min: 0.02% max: 6.16% x̄: 0.97% x̃: 0.63%
HURT stats (abs)   min: 4 max: 1402 x̄: 409.67 x̃: 40
HURT stats (rel)   min: 0.33% max: 23.12% x̄: 5.79% x̃: 1.39%
95% mean confidence interval for cycles value: 28.32 149.74
95% mean confidence interval for cycles %-change: -0.07% 1.61%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 5044014 -> 5043652 (<.01%)
instructions in affected programs: 46751 -> 46389 (-0.77%)
helped: 63
HURT: 13
helped stats (abs) min: 2 max: 29 x̄: 7.65 x̃: 9
helped stats (rel) min: 0.17% max: 13.73% x̄: 2.93% x̃: 1.04%
HURT stats (abs)   min: 2 max: 20 x̄: 9.23 x̃: 8
HURT stats (rel)   min: 0.66% max: 2.35% x̄: 1.58% x̃: 1.52%
95% mean confidence interval for instructions value: -6.54 -2.99
95% mean confidence interval for instructions %-change: -3.04% -1.28%
Instructions are helped.

total cycles in shared programs: 128143042 -> 128150188 (<.01%)
cycles in affected programs: 324564 -> 331710 (2.20%)
helped: 57
HURT: 19
helped stats (abs) min: 6 max: 74 x̄: 30.70 x̃: 32
helped stats (rel) min: 0.08% max: 4.74% x̄: 1.22% x̃: 0.81%
HURT stats (abs)   min: 10 max: 1400 x̄: 468.21 x̃: 60
HURT stats (rel)   min: 0.56% max: 19.94% x̄: 5.80% x̃: 1.70%
95% mean confidence interval for cycles value: 6.90 181.15
95% mean confidence interval for cycles %-change: -0.52% 1.59%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
378f996771 nir/opt_peephole_select: Don't peephole_select expensive math instructions
On some GPUs, especially older Intel GPUs, some math instructions are
very expensive.  On those architectures, don't reduce flow control to a
csel if one of the branches contains one of these expensive math
instructions.

This prevents a bunch of cycle count regressions on pre-Gen6 platforms
with a later patch (intel/compiler: More peephole select for pre-Gen6).

v2: Remove stray #if block.  Noticed by Thomas.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
8fb8ebfbb0 intel/compiler: More peephole select
Shader-db results:

The one shader hurt for instructions is a compute shader that had both
spills and fills hurt.

v2: Fix typo in comment noticed by Caio.

v3: Fix inverted condition in brw_nir.c.  Noticed by Lionel.

Skylake, Broadwell, and Haswell had similar results. (Skylake shown)
total instructions in shared programs: 15072761 -> 15047884 (-0.17%)
instructions in affected programs: 895539 -> 870662 (-2.78%)
helped: 3623
HURT: 1
helped stats (abs) min: 1 max: 181 x̄: 6.89 x̃: 4
helped stats (rel) min: 0.10% max: 25.00% x̄: 3.93% x̃: 3.20%
HURT stats (abs)   min: 92 max: 92 x̄: 92.00 x̃: 92
HURT stats (rel)   min: 1.92% max: 1.92% x̄: 1.92% x̃: 1.92%
95% mean confidence interval for instructions value: -7.10 -6.63
95% mean confidence interval for instructions %-change: -4.03% -3.82%
Instructions are helped.

total cycles in shared programs: 369738930 -> 369535732 (-0.05%)
cycles in affected programs: 68027851 -> 67824653 (-0.30%)
helped: 2609
HURT: 1035
helped stats (abs) min: 1 max: 4508 x̄: 181.44 x̃: 77
helped stats (rel) min: <.01% max: 71.31% x̄: 9.14% x̃: 5.47%
HURT stats (abs)   min: 1 max: 33336 x̄: 261.04 x̃: 20
HURT stats (rel)   min: <.01% max: 47.61% x̄: 2.93% x̃: 1.47%
95% mean confidence interval for cycles value: -96.43 -15.09
95% mean confidence interval for cycles %-change: -6.07% -5.36%
Cycles are helped.

total spills in shared programs: 10158 -> 10159 (<.01%)
spills in affected programs: 166 -> 167 (0.60%)
helped: 1
HURT: 1

total fills in shared programs: 22105 -> 22116 (0.05%)
fills in affected programs: 837 -> 848 (1.31%)
helped: 4
HURT: 1

Ivy Bridge
total instructions in shared programs: 12021190 -> 11990256 (-0.26%)
instructions in affected programs: 910561 -> 879627 (-3.40%)
helped: 3344
HURT: 18
helped stats (abs) min: 1 max: 99 x̄: 9.29 x̃: 6
helped stats (rel) min: 0.11% max: 31.18% x̄: 5.19% x̃: 3.31%
HURT stats (abs)   min: 2 max: 20 x̄: 7.89 x̃: 6
HURT stats (rel)   min: 0.70% max: 2.59% x̄: 1.63% x̃: 1.70%
95% mean confidence interval for instructions value: -9.49 -8.91
95% mean confidence interval for instructions %-change: -5.32% -4.98%
Instructions are helped.

total cycles in shared programs: 179077826 -> 178570196 (-0.28%)
cycles in affected programs: 63205667 -> 62698037 (-0.80%)
helped: 2767
HURT: 620
helped stats (abs) min: 1 max: 7531 x̄: 217.58 x̃: 88
helped stats (rel) min: <.01% max: 75.86% x̄: 9.59% x̃: 6.09%
HURT stats (abs)   min: 1 max: 31255 x̄: 152.27 x̃: 11
HURT stats (rel)   min: <.01% max: 36.36% x̄: 2.77% x̃: 0.58%
95% mean confidence interval for cycles value: -173.94 -125.81
95% mean confidence interval for cycles %-change: -7.68% -6.97%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10852569 -> 10843758 (-0.08%)
instructions in affected programs: 235803 -> 226992 (-3.74%)
helped: 800
HURT: 0
helped stats (abs) min: 1 max: 88 x̄: 11.01 x̃: 8
helped stats (rel) min: 0.11% max: 23.08% x̄: 4.69% x̃: 3.36%
95% mean confidence interval for instructions value: -11.93 -10.10
95% mean confidence interval for instructions %-change: -4.99% -4.39%
Instructions are helped.

total cycles in shared programs: 154732047 -> 154608941 (-0.08%)
cycles in affected programs: 4063110 -> 3940004 (-3.03%)
helped: 606
HURT: 253
helped stats (abs) min: 1 max: 2524 x̄: 227.93 x̃: 62
helped stats (rel) min: 0.02% max: 39.24% x̄: 4.36% x̃: 1.81%
HURT stats (abs)   min: 1 max: 1966 x̄: 59.36 x̃: 11
HURT stats (rel)   min: 0.02% max: 67.10% x̄: 3.22% x̃: 0.67%
95% mean confidence interval for cycles value: -170.49 -116.13
95% mean confidence interval for cycles %-change: -2.61% -1.65%
Cycles are helped.

No change on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
09b7e1d8e4 nir/opt_peephole_select: Don't try to remove flow control around indirect loads
That flow control may be trying to avoid invalid loads.  On at least
some platforms, those loads can also be expensive.

No shader-db changes on any Intel platform (even with the later patch
"intel/compiler: More peephole select").

v2: Add a 'indirect_load_ok' flag to nir_opt_peephole_select.  Suggested
by Rob.  See also the big comment in src/intel/compiler/brw_nir.c.

v3: Use nir_deref_instr_has_indirect instead of deref_has_indirect (from
nir_lower_io_arrays_to_elements.c).

v4: Fix inverted condition in brw_nir.c.  Noticed by Lionel.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
4cd1a0be76 i965/vec4: Propagate conditional modifiers from more compares to other compares
If there is a CMP.NZ that compares a single component (via a .zzzz
swizzle, for example) with 0, it can propagate its conditional modifier
back to a previous CMP that writes only that component.  The specific
case that I saw was:

    cmp.l.f0(8)     g42<1>.xF       g61<4>.xF       (abs)g18<4>.zF
    ...
    cmp.nz.f0(8)    null<1>D        g42<4>.xD       0D

In this case we can just delete the second CMP.

No changes on Broadwell or Skylake because they do not use the vec4
backend.  Also no changes on GM45 or Iron Lake.

Sandy Bridge, Ivy Bridge, and Haswell had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10856676 -> 10852569 (-0.04%)
instructions in affected programs: 228322 -> 224215 (-1.80%)
helped: 1331
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 3.09 x̃: 4
helped stats (rel) min: 0.11% max: 6.67% x̄: 1.88% x̃: 1.83%
95% mean confidence interval for instructions value: -3.19 -2.99
95% mean confidence interval for instructions %-change: -1.93% -1.83%
Instructions are helped.

total cycles in shared programs: 154788865 -> 154732047 (-0.04%)
cycles in affected programs: 2485892 -> 2429074 (-2.29%)
helped: 1097
HURT: 59
helped stats (abs) min: 2 max: 168 x̄: 51.96 x̃: 64
helped stats (rel) min: 0.12% max: 12.70% x̄: 3.44% x̃: 2.22%
HURT stats (abs)   min: 2 max: 16 x̄: 3.02 x̃: 2
HURT stats (rel)   min: 0.18% max: 0.83% x̄: 0.64% x̃: 0.71%
95% mean confidence interval for cycles value: -51.04 -47.26
95% mean confidence interval for cycles %-change: -3.40% -3.07%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
9a83c3d3b3 i965/fs: Eliminate unary op on operand of compare-with-zero
The (-abs(x) >= 0) => (x == 0) optimization is removed from the vec4 and
scalar parts. In the VS part, adding the new pattern was not
helpful. The pattern that is removed is really old, and it has been
handled by NIR for ages.

All Gen7+ platforms had similar results. (Broadwell shown)
total instructions in shared programs: 14715715 -> 14715709 (<.01%)
instructions in affected programs: 474 -> 468 (-1.27%)
helped: 6
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.12% max: 1.35% x̄: 1.28% x̃: 1.35%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.40% -1.15%
Instructions are helped.

total cycles in shared programs: 559569911 -> 559569809 (<.01%)
cycles in affected programs: 5963 -> 5861 (-1.71%)
helped: 6
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 1.45% max: 1.88% x̄: 1.73% x̃: 1.85%
95% mean confidence interval for cycles value: -18.15 -15.85
95% mean confidence interval for cycles %-change: -1.95% -1.51%
Cycles are helped.

Iron Lake and Sandy Bridge had similar results. (Iron Lake shown)
total instructions in shared programs: 7780915 -> 7780913 (<.01%)
instructions in affected programs: 246 -> 244 (-0.81%)
helped: 2
HURT: 0

total cycles in shared programs: 177876108 -> 177876106 (<.01%)
cycles in affected programs: 3636 -> 3634 (-0.06%)
helped: 1
HURT: 0

GM45
total instructions in shared programs: 4799152 -> 4799151 (<.01%)
instructions in affected programs: 126 -> 125 (-0.79%)
helped: 1
HURT: 0

total cycles in shared programs: 122052654 -> 122052652 (<.01%)
cycles in affected programs: 3640 -> 3638 (-0.05%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick
440c051340 i965/vec4/dce: Don't narrow the write mask if the flags are used
In an instruction sequence like

            cmp(8).ge.f0.0 vgrf17:D, vgrf2.xxxx:D, vgrf9.xxxx:D
    (+f0.0) sel(8) vgrf1:UD, vgrf8.xyzw:UD, vgrf1.xyzw:UD

The other fields of vgrf17 may be unused, but the CMP still needs to
generate the other flag bits.

To my surprise, nothing in shader-db or any test suite appears to hit
this.  However, I have a change to brw_vec4_cmod_propagation that
creates cases where this can happen.  This fix prevents a couple dozen
regressions in that patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5df88c20 ("i965/vec4: Rewrite dead code elimination to use live in/out.")
2018-12-17 13:47:06 -08:00
Ian Romanick
111bcc8d02 i965/vec4: Silence unused parameter warnings in vec4 compiler tests
src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual brw::dst_reg* copy_propagation_vec4_visitor::make_reg_for_system_value(int)’:
src/intel/compiler/test_vec4_copy_propagation.cpp:57:51: warning: unused parameter ‘location’ [-Wunused-parameter]
    virtual dst_reg *make_reg_for_system_value(int location)
                                                   ^~~~~~~~
src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual void copy_propagation_vec4_visitor::emit_urb_write_header(int)’:
src/intel/compiler/test_vec4_copy_propagation.cpp:77:43: warning: unused parameter ‘mrf’ [-Wunused-parameter]
    virtual void emit_urb_write_header(int mrf)
                                           ^~~
src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual brw::vec4_instruction* copy_propagation_vec4_visitor::emit_urb_write_opcode(bool)’:
src/intel/compiler/test_vec4_copy_propagation.cpp:82:57: warning: unused parameter ‘complete’ [-Wunused-parameter]
    virtual vec4_instruction *emit_urb_write_opcode(bool complete)
                                                         ^~~~~~~~
src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual brw::dst_reg* register_coalesce_vec4_visitor::make_reg_for_system_value(int)’:
src/intel/compiler/test_vec4_register_coalesce.cpp:60:51: warning: unused parameter ‘location’ [-Wunused-parameter]
    virtual dst_reg *make_reg_for_system_value(int location)
                                                   ^~~~~~~~
src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual void register_coalesce_vec4_visitor::emit_urb_write_header(int)’:
src/intel/compiler/test_vec4_register_coalesce.cpp:80:43: warning: unused parameter ‘mrf’ [-Wunused-parameter]
    virtual void emit_urb_write_header(int mrf)
                                           ^~~
src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual brw::vec4_instruction* register_coalesce_vec4_visitor::emit_urb_write_opcode(bool)’:
src/intel/compiler/test_vec4_register_coalesce.cpp:85:57: warning: unused parameter ‘complete’ [-Wunused-parameter]
    virtual vec4_instruction *emit_urb_write_opcode(bool complete)
                                                         ^~~~~~~~
src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual brw::dst_reg* cmod_propagation_vec4_visitor::make_reg_for_system_value(int)’:
src/intel/compiler/test_vec4_cmod_propagation.cpp:60:51: warning: unused parameter ‘location’ [-Wunused-parameter]
    virtual dst_reg *make_reg_for_system_value(int location)
                                                   ^~~~~~~~
src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual void cmod_propagation_vec4_visitor::emit_urb_write_header(int)’:
src/intel/compiler/test_vec4_cmod_propagation.cpp:85:43: warning: unused parameter ‘mrf’ [-Wunused-parameter]
    virtual void emit_urb_write_header(int mrf)
                                           ^~~
src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual brw::vec4_instruction* cmod_propagation_vec4_visitor::emit_urb_write_opcode(bool)’:
src/intel/compiler/test_vec4_cmod_propagation.cpp:90:57: warning: unused parameter ‘complete’ [-Wunused-parameter]
    virtual vec4_instruction *emit_urb_write_opcode(bool complete)
                                                         ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Bas Nieuwenhuizen
f67dea5e19 radv: Fix multiview depth clears
We were not using the view mask for depth clears, causing only the
first view to be cleared.

Fixes: 2e86f6b259 "radv: Add multiview clears."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 20:16:26 +00:00
Bas Nieuwenhuizen
9add63a3a5 radv: Remove redundant format check.
The switch directly after the check has a default case that returns
NULL too, so the effective return value is not changed. Also this
check is wrong once we start dealing with formats introduced by an
extension (e.g. YUV formats).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 20:09:38 +00:00
Eric Anholt
708d8f4d0a nir: Fix clamping of uints for image store lowering.
I botched some copy-and-paste and clamped to signed int max instead of
uint max.  Fixes KHR-GL46.shader_image_load_store.multiple-uniforms on
skl.

Fixes: d3e046e76c ("nir: Pull some of intel's image load/store format
conversion to nir_format.h")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-17 20:02:22 +00:00
Eric Anholt
00e2cbc049 v3d: Fix the argument type for vir_BRANCH().
Apparently this has been spewing warnings for Jason's clang, but not my
gcc.
2018-12-17 09:52:23 -08:00
Eric Anholt
376054fff3 vc4: Reuse nir_format_convert.h in our blend lowering.
These helpers came along after and have effectively the same
implementation.
2018-12-17 09:52:23 -08:00
Samuel Pitoiset
445867c80d radv: report Vulkan version 1.1.90 for real
I thought the value was correctly propagated, but actually not.

Fixes: 2ac6d55f38 ("radv: bump reported version to 1.1.90")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-17 17:51:48 +01:00
Jason Ekstrand
cae373117c anv,radv: Re-enable VK_EXT_pci_bus_info
Now at version 2 with the fixed header.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 10:42:35 -06:00
Jason Ekstrand
e5b59fe6f5 vulkan: Update the XML and headers to 1.1.96
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 10:41:56 -06:00
Rhys Perry
ef198e8c6a radv: switch from nir_bcsel to nir_b32csel
Fixes: 191a1dce92 ('nir: Add 1-bit Boolean opcodes')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 14:52:39 +00:00
Rhys Perry
bba94a3d85 radv: don't set surf_index for stencil-only images
Fixes: f8d5b377c8 ('radv: set cb base tile swizzles for MRT speedups (v4)')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108116
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 14:52:10 +00:00
Ian Romanick
9dc135efa1 nir: Release per-block metadata in nir_sweep
nir_sweep already marks all metadata invalid, so it is safe to release
the memory here too.

mean soft fp64 using uint64:   1,342,759,331 => 1,010,670,475
gfxbench5 aztec ruins high 11:    63,555,571 =>    61,889,811
deus ex mankind divided 148:      62,845,304 =>    62,829,640
deus ex mankind divided 2890:     71,922,686 =>    71,922,686
dirt showdown 676:                69,238,607 =>    69,238,607
dolphin ubershaders 210:          77,822,072 =>    77,822,072

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 14:39:56 -08:00
Ian Romanick
7adafd6e1c nir: Fix holes in nir_instr
Found using pahole.

Changes in peak memory usage according to Valgrind massif:

mean soft fp64 using uint64:   1,343,991,403 => 1,342,759,331
gfxbench5 aztec ruins high 11:    63,619,971 =>    63,555,571
deus ex mankind divided 148:      62,887,728 =>    62,845,304
deus ex mankind divided 2890:     72,399,750 =>    71,922,686
dirt showdown 676:                69,464,023 =>    69,238,607
dolphin ubershaders 210:          78,359,728 =>    77,822,072

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 14:39:56 -08:00
Ian Romanick
8161a87b24 nir/phi_builder: Use per-value hash table to store [block] -> def mapping
Replace the old array in each value with a hash table in each value.

Changes in peak memory usage according to Valgrind massif:

mean soft fp64 using uint64:   5,499,875,082 => 1,343,991,403
gfxbench5 aztec ruins high 11:    63,619,971 =>    63,619,971
deus ex mankind divided 148:      62,887,728 =>    62,887,728
deus ex mankind divided 2890:     72,402,222 =>    72,399,750
dirt showdown 676:                74,466,431 =>    69,464,023
dolphin ubershaders 210:         109,630,376 =>    78,359,728

Run-time change for a full run on shader-db on my Haswell desktop (with
-march=native) is 1.22245% +/- 0.463879% (n=11).  This is about +2.9
seconds on a 237 second run.  The first time I sent this version of this
patch out, the run-time data was quite different.  I had misconfigured
the script that ran the test, and none of the tests from higher GLSL
versions were run.  These are generally more complex shaders, and they
are more affected by this change.

The previous version of this patch used a single hash table for the
whole phi builder.  The mapping was from [value, block] -> def, so a
separate allocation was needed for each [value, block] tuple.  There was
quite a bit of per-allocation overhead (due to ralloc), so the patch was
followed by a patch that added the use of the slab allocator.  The
results of those two patches was not quite as good:

mean soft fp64 using uint64:   5,499,875,082 => 1,343,991,403
gfxbench5 aztec ruins high 11:    63,619,971 =>    63,619,971
deus ex mankind divided 148:      62,887,728 =>    62,887,728
deus ex mankind divided 2890:     72,402,222 =>    72,402,222 *
dirt showdown 676:                74,466,431 =>    72,443,591 *
dolphin ubershaders 210:         109,630,376 =>    81,034,320 *

The * denote tests that are better now.  In the tests that are the same
in both patches, the "after" peak memory usage was at a different
location.  I did not check the local peaks.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 14:39:56 -08:00
Ian Romanick
e3043e1276 util/hash_table: Add _mesa_hash_table_init function
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 14:39:56 -08:00
Jason Ekstrand
db197fdb6c st/nir: Use nir_src_as_uint for tokens
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-16 15:07:28 -06:00
Jason Ekstrand
47e1e0692c radv: Fix a stupid if in gather_intrinsic_info
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 15:06:07 -06:00
Jason Ekstrand
6bcd2af086 nir/algebraic: Add some optimizations for D3D-style Booleans
D3D Booleans use a 32-bit 0/-1 representation.  Because this previously
matched NIR exactly, we didn't have to really optimize for it.  Now that
we have 1-bit Booleans, we need some specific optimizations to chew
through the D3D12-style Booleans.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15136811 -> 14967944 (-1.12%)
    instructions in affected programs: 2457021 -> 2288154 (-6.87%)
    helped: 8318
    HURT: 10

    total cycles in shared programs: 373544524 -> 359701825 (-3.71%)
    cycles in affected programs: 151029683 -> 137186984 (-9.17%)
    helped: 7749
    HURT: 682

    total loops in shared programs: 4431 -> 4399 (-0.72%)
    loops in affected programs: 32 -> 0
    helped: 21
    HURT: 0

    total spills in shared programs: 10290 -> 10051 (-2.32%)
    spills in affected programs: 2532 -> 2293 (-9.44%)
    helped: 18
    HURT: 18

    total fills in shared programs: 22203 -> 21732 (-2.12%)
    fills in affected programs: 3319 -> 2848 (-14.19%)
    helped: 18
    HURT: 18

Note that a large chunk of the improvement fixing regressions caused by
switching to 1-bit Booleans.  Previously, our ability to optimize D3D
booleans was improved by using the D3D representation directly in NIR.
Now that NIR does 1-bit bools, we need a few more optimizations.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
3b30814791 nir/algebraic: Optimize 1-bit Booleans
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
44227453ec nir: Switch to using 1-bit Booleans for almost everything
This is a squash of a few distinct changes:

    glsl,spirv: Generate 1-bit Booleans

    Revert "Use 32-bit opcodes in the NIR producers and optimizations"

    Revert "nir/builder: Generate 32-bit bool opcodes transparently"

    nir/builder: Generate 1-bit Booleans in nir_build_imm_bool

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
11dc130779 nir: Add a bool to int32 lowering pass
We also enable it in all of the NIR drivers.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
191a1dce92 nir: Add 1-bit Boolean opcodes
We also have to add support for 1-bit integers while we're here so we
get 1-bit variants of iand, ior, and inot.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
615cc26b97 nir/algebraic: Generalize an optimization
This just makes it nicely scale across bit sizes.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
487514ae61 nir/large_constants: Properly handle 1-bit bools
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
3191a82372 nir: Add support for 1-bit data types
This commit adds support for 1-bit Booleans and integers.  Booleans
obviously take a value of true or false.  Because we have to define the
semantics of 1-bit signed and unsigned integers, we define uint1_t to
take values of 0 and 1 and int1_t to take values of 0 and -1.  1-bit
arithmetic is then well-defined in the usual way, just with fewer bits.
The definition of int1_t and uint1_t doesn't usually matter but we do
need something for purposes of constant folding.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
2fe8708ffd nir/constant_expressions: Rework Boolean handling
This commit contains three related changes.  First, we define boolN_t
for N = 8, 16, and 64 and move the definition of boolN_vec to the loop
with the other vec definitions.  Second, there's no reason why we need
the != 0 on the source because that happens implicitly when it's
converted to bool.  Third, for destinations, we use a signed integer
type and just do -(int)bool_val which will give us the 0/-1 behavior we
want and neatly scales to all bit widths.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
80e8dfe9de nir: Rename Boolean-related opcodes to include 32 in the name
This is a squash of a bunch of individual changes:

    nir/builder: Generate 32-bit bool opcodes transparently

    nir/algebraic: Remap Boolean opcodes to the 32-bit variant

    Use 32-bit opcodes in the NIR producers and optimizations

        Generated with a little hand-editing and the following sed commands:

        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c

     Use 32-bit opcodes in the NIR back-ends

        Generated with a little hand-editing and the following sed commands:

        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
b569093566 nir/algebraic: Make an optimization more specific
Later in this series, bool is not going to imply 32-bit.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
517099809a nir: Drop support for lower_b2f
This was originally added for the out-of-tree Mali driver but I think
we've all agreed it's easy enough for them to just do in their back-end.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
4bb1a34727 nir/algebraic: Optimize x2b(xneg(a)) -> a
Shader-db results on Kaby Lake:

    total instructions in shared programs: 15072525 -> 15072525 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

This helps prevent regressions in later commits.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
3595a0abf4 nir/constant_folding: Fix source bit size logic
Instead of looking at input_sizes[i] which contains the number of
components for each source, we look at the bit size of input_types[i].
This fixes a regression in the 1-bit boolean series though I have no
idea how we haven't seen it before now.

Fixes: 35baee5dce "nir/constant_folding: fix incorrect bit-size check"
Fixes: 9076c4e289 "nir: update opcode definitions for different bit sizes"
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
9f7bd843af nir/tgsi: Use nir_bany in ttn_kill_if
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-16 21:03:02 +00:00
Jason Ekstrand
e17426058c nir/lower_idiv: Use ilt instead of bit twiddling
The previous code was creating a boolean by doing an arithmetic right-
shift by 31 which produces a boolean which is true if the argument is
negative.  This is the same as the expression r < 0 which is much
simpler and doesn't depend on NIR's representation of booleans.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-16 21:03:02 +00:00
Eric Anholt
2977c77758 v3d: Use the original bit size when scalarizing uniform loads.
Prevents a regression in jekstrand's 1-bit series.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 21:03:01 +00:00
Eric Anholt
91a0251dbc vc4: Use the original bit size when scalarizing uniform loads.
Prevents a regression in jekstrand's 1-bit series.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 21:03:01 +00:00
Rhys Perry
bde9f482de ac: split 16-bit ssbo loads that may not be dword aligned
Fixes: 7e7ee82698 ('ac: add support for 16bit buffer loads')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108114
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-16 14:56:10 +00:00
Rhys Perry
12dc7cb202 ac: refactor visit_load_buffer
This is so that we can split different types of loads more easily.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-16 14:56:10 +00:00
Rhys Perry
ed4020fabe nir: fix constness in nir_intrinsic_align()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-16 14:56:10 +00:00
Jan Vesely
e4f9a37ace clover: Fix build after clang r348827
CodeGenOptions were moved to Basic.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
CC: mesa-stable@lists.freedesktop.org
2018-12-16 06:38:10 -05:00
Jon Turney
d512b35b62 glx: Fix compilation with GLX_USE_WINDOWSGL
Sadly, the GLX_USE_APPLEGL and GLX_USE_WINDOWSGL cases are not identical
(because GLX_USE_WINDOWSGL uses vtables rather than a maze of ifdefs)

Include <sys/time.h> again, as functions prototyped by it are used in
the GLX_USE_WINDOWSGL path.

Make the include guard around the __glxGetMscRate() definition match the
one at it's declaration again, as it's referenced from dri_common.c
which is built for GLX_USE_WINDOWSGL.

Fixes: a95ec138 ("glx: mandate xf86vidmode only for "drm" dri platforms")
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-15 13:49:24 +00:00
Eric Anholt
29927e7524 v3d: Drop in a bunch of notes about performance improvement opportunities.
These have all been floating in my head, and while I've thought about
encoding them in issues on gitlab once they're enabled, they also make
sense to just have in the area of the code you'll need to work in.
2018-12-14 17:48:01 -08:00
Eric Anholt
248a7fb392 v3d: Do uniform pretty-printing in the QPU dump.
If you're trying to trace what's going on in a QPU dump, this will
definitely help you find your way.
2018-12-14 17:48:01 -08:00
Eric Anholt
a370ed76ab v3d: Use the uniform pretty-printer in v3d_write_uniforms()'s debug code.
This will be a lot easier than my usual "38400.000000?  that looks like a
viewport scale" decoding strategy.
2018-12-14 17:48:01 -08:00
Eric Anholt
532b6c5671 v3d: Move uniform pretty-printing to its own helper function.
I want to reuse it in the QPU dump.
2018-12-14 17:48:01 -08:00
Eric Anholt
78ef05bde4 v3d: Move uinfo->data[] dereference to the top of v3d_write_uniforms().
Follows 3954331aff ("vc4: Pull uinfo->data[i] dereference out to the top
of the loop.") which showed a large performance win for vc4, but also
cleans up the code a decent bit.
2018-12-14 17:48:01 -08:00
Eric Anholt
a7e15a5086 v3d: Avoid assertion failures when removing end-of-shader instructions.
After generating VIR, we leave c->cursor pointing at the end of the
shader.  If the shader had dead code at the end (for example from preamble
instructions in a shader with no side effects), we would assertion fail
that we were leaving the cursor pointing at freed memory.  Since anything
following DCE should be setting up a new cursor anyway, just clear the
cursor at the start.
2018-12-14 17:48:01 -08:00
Eric Anholt
5b2cc03852 v3d: Add support for draw indirect for GLES3.1.
In trying to enable compute shaders, I found that a bunch of deqp-gles31's
compute stuff wanted to interact with indirect dispatch.  This was easy to
do on its own.
2018-12-14 17:48:01 -08:00
Eric Anholt
ff80e58b38 v3d: Add missing flagging of SYNCB as a TSY op.
Fixes: f2e41daac5 ("broadcom/vc5: Update QPU instruction pack/unpack for v4.2.")
2018-12-14 17:48:01 -08:00
Eric Anholt
3f9bcf9136 v3d: Make sure that a thrsw doesn't split a multop from its umul24.
The thrsw will invalidate rtop, just like accumulators and flags.  Caught
by simulator assertions in CS imulextended/umulextended tests.

Fixes: 90269ba353 ("broadcom/vc5: Use THRSW to enable multi-threaded shaders.")
2018-12-14 17:48:01 -08:00
Eric Anholt
332a5cf6a5 v3d: Add safety checks for resource_create().
This should ease my debugging next time I screw it up.
2018-12-14 17:48:01 -08:00
Eric Anholt
6ad9e8690d v3d: Add support for texturing from linear.
Just like vc4, we have to support linear shared BOs for X11 on arbitrary
displays.  When we're faced with a request to texture from one of those,
make a shadow image that we copy using the TFU at the start of the draw
call.
2018-12-14 17:48:01 -08:00
Eric Anholt
976ea90bdc v3d: Add support for using the TFU to do some blits.
This will be useful in particular for blits from raster to UIF for X11.
2018-12-14 17:48:01 -08:00
Eric Anholt
e5b4d1f55f v3d: Don't forget to bump the number of writes when doing TFU ops.
generatemipmap is just filling out the rest of the mipmap that's already
been written (by a mapping or a draw call), so it didn't matter.  As I
reuse the TFU code for linear-to-UIF conversions, it'll start mattering.
2018-12-14 17:48:01 -08:00
Eric Anholt
485df2574e v3d: Set up the right stride for raster TFU.
I didn't have any raster images in the generatemipmap path, so the
pixels-vs-bytes mixup didn't matter here.
2018-12-14 17:48:01 -08:00
Eric Anholt
e731d53716 v3d: Don't forget to wait for our TFU job before rendering from it.
Otherwise we may race to read old contents.  This didn't show up in the
CTS and piglit for me, but it did once I started using the TFU to do
linear->UIF blits for X11.

Fixes: 2ebca177dc ("v3d: Use the TFU to do generatemipmap.")
2018-12-14 17:48:01 -08:00
Ilia Mirkin
153d3fc5f9 nvc0: always keep TSC slot 0 bound to fix TXF
Same as on nv50, the TXF op always uses the TSC bound to slot 0,
returning blank values if nothing is bound.

An earlier change arranges for the TSC entries list to always have valid
data at entry 0, so here we just make use of it.

Fixes arb_texture_buffer_object-subdata-sync among others.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-14 20:01:31 -05:00
Ilia Mirkin
4aeaf89aa7 nvc0: replace use of explicit default_tsc with entry 0
This was used for implementing FBFETCH. However that uses TXF, which
doesn't do much with a TSC. The only important bit is that sRGB-decoding
works as expected, which we can achieve since all samplers we ever
generate enable sRGB-decoding. Always point to entry 0 in the TSC table,
and ensure that even before it ever gets initialized, the sRGB-decoding
enable bit is set.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-14 20:01:31 -05:00
Rob Clark
5f9085638a freedreno/a6xx: fix corrupted uniforms
For older gen's fd_wfi() is used to conditionally insert a WFI if there
hasn't already been one since last draw.  But this doesn't work out well
with stateobj since the order the stateobj is evaluated might not be
what you expect.  (Ie. stateobj might not be evaluated until a later
draw if there is no geometry from the current draw in a given tile.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-14 15:01:30 -05:00
Alex Deucher
4db4b3447d pci_ids: add new vega20 pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-12-14 14:48:39 -05:00
Alex Deucher
56cf25a114 pci_ids: add new vega10 pci ids
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-12-14 14:48:18 -05:00
Rafael Antognolli
5c454661c6 i965/gen9: Add workarounds for object preemption.
Gen9 hardware requires some workarounds to disable preemption depending
on the type of primitive being emitted.

We implement this by adding a function that checks the primitive type
and number of instances right before the 3DPRIMITIVE.

For now, we just ignore blorp.  The only primitive it emits is
3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
safely leave preemption enabled when it happens. Or it will be disabled
by a previous 3DPRIMITIVE, which should be fine too.

v3:
 - Apply missing workarounds for instanced rendering and line loop (Ken)
 - Move workaround code to brw_draw_single_prim()

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Rafael Antognolli
d8b50e152a i965/gen10+: Enable object level preemption.
Set bit when initializing context.

v3:
 - Always toggle preemption bool to false before enabling it for the
 first time, so the state gets emitted (Chris Wilson).
 - Emit end of pipe sync with PIPE_CONTROL_RENDER_TARGET_FLUSH (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Rafael Antognolli
019a92ffa4 intel/genxml: Add register for object preemption.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Ian Romanick
a6b7d1151c util/slab: Rename slab_mempool typed parameters to mempool
Now everything with type 'struct slab_child_pool *' is name pool, and
everything with type 'struct slab_mempool *' is named mempool.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-14 07:36:05 -08:00
Ian Romanick
ba5402ec9a nir/phi_builder: Internal users should use nir_phi_builder_value_set_block_def too
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-14 07:36:05 -08:00
Christian Gmeiner
489ffaf0c1 etnaviv: drop redundant ctx function parameter
There is no need to have an extra ctx paramter as all the other
parameters carry all the needed information.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-12-14 11:23:00 +01:00
Kenneth Graunke
0b44644ca6 genxml: Consistently use a numeric "MOCS" field
When we first started using genxml, we decided to represent MOCS as an
actual structure, and pack values.  However, in many places, it was more
convenient to use a numeric value rather than treating it as a struct,
so we added secondary setters in a bunch of places as well.

We were not entirely consistent, either.  Some places only had one.
Gen6 had both kinds of setters for STATE_BASE_ADDRESS, but newer gens
only had the struct-based setters.  The names were sometimes "Constant
Buffer Object Control State" instead of "Memory", making it harder to
find.  Many had prefixes like "Vertex Buffer MOCS"...in a vertex buffer
packet...which is a bit redundant.

On modern hardware, MOCS is simply an index into a table, but we were
still carrying around the structure with an "Index to MOCS Table" field,
in addition to the direct numeric setters.  This is clunky - we really
just want a number on new hardware.

This patch eliminates the struct-based setters, and makes the numeric
setters be consistently called "MOCS".  We leave the struct definition
around on Gen7-8 for reference purposes, but it is unused.

v2: Drop bonus "Depth Buffer MOCS" fields on Gen7.5 and Gen9

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2018-12-14 00:44:54 -08:00
Timothy Arceri
a2ec78883f nir: fix opt_if_loop_last_continue()
The pass did not correctly handle loops ending in:

	if ssa_7 {
		block block_8:
		/* preds: block_7 */
		continue
		/* succs: block_1 */
	} else {
		block block_9:
		/* preds: block_7 */
		break
		/* succs: block_11 */
	}

The break will get eliminated by another opt but if this pass gets
called first (as it does on RADV) we ended up inserting
instructions after the break.

Fixes: 5921a19d4b ("nir: add if opt opt_if_loop_last_continue()")
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-12-14 17:21:35 +11:00
Rob Clark
0ac5acaeaa freedreno/a6xx: fix resource_copy_region()
pctx->resource_copy_region() needs to fall back to sw copy for
non-renderable formats.  But previously for things that we could
not use the blitter for, would fall back to 3d.  Which won't work
if 3d can't render to the dst format either.

Instead rework things to fallback to fd_resource_copy_region(),
which will try 3d core and then fall back to memcpy().

Fixes (for example) dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
4ec2f6129b freedreno: move fd_resource_copy_region()
Code-motion prep for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
57b76ee2a8 freedreno/a6xx: more blitter fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
d15fc787bc freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
532f8c0043 gallium/aux: add is_unorm() helper
We already had one for is_snorm() but not unorm.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
85cd4df47f freedreno/a6xx: fix blitter crash
Fixes a crash with unsupported formats in dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot

Also fixes gpu hangs with some formats that are supported, but which we
don't know what internal-format to use for the blitter, for ex
dEQP-GLES3.functional.texture.format.sized.2d_array.rgb10_a2_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
cca1e9606c freedreno/ir3: don't remove unused input components
Fixes: 0d240c2214 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
c19c4bf488 freedreno/ir3: fix crash
Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z

Fixes: 0d240c2214 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
3e8e033f4c freedreno: also set DUMP flag on shaders
If we emit shader as a pointer to a GEM object, also set the RELOC_DUMP
flag as a hint to kernel that this is a useful buffer to snapshot for
debug dumps.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
4cd016b5d6 freedreno: debug GEM obj names
With a recent enough kernel, set debug names for GEM BOs, which will
show up in $debugfs/gem

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
7ef722861b freedreno/drm: sync uapi and enable softpin
Pull in updated UAPI and use kernel API version to enable softpin.
Since MSM_SUBMIT_BO_DUMP flag was added at same time, use that to
signal to kernel that cmdstream buffers are useful to dump for
debugging/cmdstream-traces.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Eric Anholt
4407e688cd nir: Move intel's half-float image store lowering to to nir_format.h.
I needed the same function for v3d.  This was originally in d3e046e76c
("nir: Pull some of intel's image load/store format conversion to
nir_format.h") before we made am istake about simplifying the function.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:26 -08:00
Eric Anholt
3a417a044e Revert "intel: Simplify the half-float packing in image load/store lowering."
This reverts commit 06fbcd2cd5.
nir_pack_half_2x16_split *isn't* vectorizable, it's 1-component only, thus
why we had this split-scalar code in the first place.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:24 -08:00
Eric Anholt
c2c44dba7a nir: Print the format of image variables.
This helps a lot when debugging image load/store lowering on large
testcases.  Unfortunately the Mesa enum name stuff is under src/mesa and
we can't get at it from the compiler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:12 -08:00
Eric Anholt
19ffcba161 mesa/st: Expose compute shaders when NIR support is advertised.
We have a NIR path, and V3D doesn't have TGSI input for compute (only what
TTN can handle for the various gallium-internal shaders).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-12-13 11:44:47 -08:00
Dave Airlie
b3f2b03ece radv/xfb: fix counter buffer bounds checks.
If we gave this function 0 counter buffers, we'd still try and
access pCounterBuffers[0] as this check was incorrect.

Fixes crash with ext_transform_feedback-pipeline-basic-primgen
on zink on radv.

Fixes: 677b496b6 (radv: fix begin/end transform feedback with 0 counter buffers.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-13 19:27:05 +00:00
Jason Ekstrand
9ebc00f32e i965: Enable nir_opt_idiv_const for 32 and 64-bit integers
The pass should work for all bit sizes but it's less clear that the
extra instructions are worth it on small integers.  Also, the hardware
doesn't do mul_high on anything other than 32-bit integers and, absent
any decent mechanism for testing the pass on 8 and 16-bit types, it's
probably best to just leave it disabled for now.

Shader-db results on Sky Lake:

    total instructions in shared programs: 15105795 -> 15111403 (0.04%)
    instructions in affected programs: 72774 -> 78382 (7.71%)
    helped: 0
    HURT: 265

Note that hurt here actually means helped because we're getting rid of
integer quotient operations (which are a send on some platforms!) and
replacing them with fairly cheap ALU ops.

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Jason Ekstrand
455ec7327d i965/vec4: Implement nir_op_uadd_sat
Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Ian Romanick
e639d39faf i965/fs: Implement nir_op_uadd_sat
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 17:49:48 +00:00
Jason Ekstrand
74492ebad9 nir: Add a pass for lowering integer division by constants
It's a reasonably well-known fact in the world of compilers that integer
divisions by constants can be replaced by a multiply, an add, and some
shifts.  This commit adds such an optimization to NIR for easiest case
of udiv.  Other division operations will be added in following commits.
In order to provide some additional driver control, the pass takes a
minimum bit size to optimize.

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Ian Romanick
090e282407 nir: Add a saturated unsigned integer add opcode
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 17:49:48 +00:00
Jason Ekstrand
39198a1238 nir/lower_int64: Add support for [iu]mul_high
Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Jason Ekstrand
9525971e2b nir: Allow [iu]mul_high on non-32-bit types
Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Emil Velikov
a95ec13879 glx: mandate xf86vidmode only for "drm" dri platforms
Currently we have the three dri "platforms" - drm, apple and windows.

Since xf86vidmode is a thing only for the drm one, adjust the
preprocessor guards and correctly check for the dependency.

v2: terminate the GLX_USE_WINDOWSGL hunk

Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Fixes: 5bc509363b ("glx: make xf86vidmode mandatory for direct rendering")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-13 17:38:19 +00:00
Alejandro Piñeiro
c7bdcd67aa nir: remove unused variable
To avoid the following warning:
./src/compiler/nir/nir_loop_analyze.c:807:16: warning: unused variable ‘ns’ [-Wunused-variable]
    nir_shader *ns = impl->function->shader;
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-13 16:35:21 +01:00
Erik Faye-Lund
e888f28d1f virgl: work around bad assumptions in virglrenderer
Virglrenderer does the wrong thing when given an instance divisor;
it tries to use the element-index rather than the binding-index as
the argument to glVertexBindingDivisor(). This worked fine as long
as there was a 1:1 relationship between elements and bindings,
which was the case util 19a91841c3 "st/mesa: Use Array._DrawVAO in
st_atom_array.c.".

So let's detect instance divisors, and restore a 1:1 relationship in
that case. This will make old versions of virglrenderer behave
correctly. For newer versions, we can consider making a better
interface, where the instance divisor isn't specified per element,
but rather per binding. But let's save that for another day.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 19a91841c3 "st/mesa: Use Array._DrawVAO in st_atom_array.c."
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Erik Faye-Lund
8447b64238 virgl: wrap vertex element state in a struct
This just has one member for now; the handle. But this is about to
change.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Erik Faye-Lund
b702ff5378 virgl: simplify virgl_hw_set_index_buffer
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Erik Faye-Lund
00143a6241 virgl: simplify virgl_hw_set_vertex_buffers
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Juan A. Suarez Romero
0991085f66 docs: update calendar, add news item and link release notes for 18.2.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-12-13 15:45:20 +01:00
Juan A. Suarez Romero
e0b0995dcf docs: add sha256 checksums for 18.2.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit e90429cc6d)
2018-12-13 15:42:49 +01:00
Juan A. Suarez Romero
c8a17b45ea docs: add release notes for 18.2.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 419ee20097)
2018-12-13 15:42:46 +01:00
Samuel Pitoiset
5088ba2aeb radv: don't check if format is depth in radv_image_can_enable_hile()
This is always TRUE if htile_size is not 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:21 +01:00
Samuel Pitoiset
eb0034fe28 radv: check if addrlib enabled HTILE in radv_image_can_enable_htile()
When hile_size is 0, we can't enable HTILE. This doesn't change
anything, except not calling radv_image_alloc_htile().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:19 +01:00
Samuel Pitoiset
d8325f1f07 radv: switch on EOP when primitive restart is enabled with triangle strips
Otherwise, Yakuza hangs the GPU with DXVK. We don't know if
linetrip and pointlist are affected, so my point is to do that
only for triangle strips.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:16 +01:00
Samuel Pitoiset
74cf3b627c radv: allow to skip DCC decompressions with the new predicate
Feral games aren't affected because they don't decompress DCC.
F1 2018 has one DCC decompression per frame, but I don't see
any performance improvements. This new predicate will be
probably more useful for DCC/MSAA.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:14 +01:00
Samuel Pitoiset
3a5adc2879 radv: add a predicate for reflecting DCC decompression state
It's somehow similar to the FCE predicate.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:10 +01:00
Jordan Justen
c506eae53d i965/compute: Emit GPGPU_WALKER in genX_state_upload
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 22:28:06 -08:00
Jordan Justen
1b85c605a6 i965/genX_state: Add register access functions
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 22:28:02 -08:00
Eric Anholt
06fbcd2cd5 intel: Simplify the half-float packing in image load/store lowering.
This was noted by Jason in review when I tried to make a helper for the
old path.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 16:09:48 -08:00
Eric Anholt
d3e046e76c nir: Pull some of intel's image load/store format conversion to nir_format.h
I needed the same functions for v3d.  Note that the color value in the
Intel lowering has already been cut down to image.chans num_components.

v2: Drop the half float one, since it was a 1-liner after cleanup.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 16:09:43 -08:00
Eric Anholt
19c7cba2ab nir: Add some more consts to the nir_format_convert.h helpers.
Most of the bits were constant, but a few were missed.  Avoids warnings
from v3d's upcoming static const bits declarations.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 16:09:37 -08:00
Timothy Arceri
9e6b39e1d5 nir: detect more induction variables
This allows loop analysis to detect inductions variables that
are incremented in both branches of an if rather than in a main
loop block. For example:

   loop {
      block block_1:
      /* preds: block_0 block_7 */
      vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20
      vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4
      vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4
      vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21
      vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22
      vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9
      vec1 32 ssa_14 = ige ssa_8, ssa_5
      /* succs: block_2 block_3 */
      if ssa_14 {
         block block_2:
         /* preds: block_1 */
         break
         /* succs: block_8 */
      } else {
         block block_3:
         /* preds: block_1 */
         /* succs: block_4 */
      }
      block block_4:
      /* preds: block_3 */
      vec1 32 ssa_15 = ilt ssa_6, ssa_8
      /* succs: block_5 block_6 */
      if ssa_15 {
         block block_5:
         /* preds: block_4 */
         vec1 32 ssa_16 = iadd ssa_8, ssa_7
         vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000*/)
         /* succs: block_7 */
      } else {
         block block_6:
         /* preds: block_4 */
         vec1 32 ssa_18 = iadd ssa_8, ssa_7
         vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000*/)
         /* succs: block_7 */
      }
      block block_7:
      /* preds: block_5 block_6 */
      vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18
      vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4
      vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19
      /* succs: block_1 */
   }

Unfortunatly GCM could move the addition out of the if for us
(making this patch unrequired) but we still cannot enable the GCM
pass without regressions.

This unrolls a loop in Rise of The Tomb Raider.

vkpipeline-db results (VEGA):

Totals from affected shaders:
SGPRS: 88 -> 96 (9.09 %)
VGPRS: 56 -> 52 (-7.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2168 -> 4560 (110.33 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 4 -> 4 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
2018-12-13 10:58:35 +11:00
Timothy Arceri
c03d6e61cc nir: reword code comment
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-12-13 10:58:35 +11:00
Timothy Arceri
48b40380e3 nir: in loop analysis track actual control flow type
This will allow us to improve analysis to find more induction
variables.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-12-13 10:58:35 +11:00
Danylo Piliaiev
5921a19d4b nir: add if opt opt_if_loop_last_continue()
Removing the last continue can allow more loops to unroll. Also
inserting code into the if branch can allow the various if opts
to progress further.

The insertion of some loops into the if branch also reduces VGPR
use in some shaders.

vkpipeline-db results (VEGA):

Totals from affected shaders:
SGPRS: 6552 -> 6576 (0.37 %)
VGPRS: 6544 -> 6532 (-0.18 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 481952 -> 478032 (-0.81 %) bytes
LDS: 13 -> 13 (0.00 %) blocks
Max Waves: 241 -> 242 (0.41 %)
Wait states: 0 -> 0 (0.00 %)

Shader-db results radeonsi (VEGA):

Totals from affected shaders:
SGPRS: 168 -> 168 (0.00 %)
VGPRS: 144 -> 140 (-2.78 %)
Spilled SGPRs: 157 -> 157 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 8524 -> 8488 (-0.42 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 7 -> 7 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

v2: (Timothy Arceri):
- allow for continues in either branch
- move any trailing loops inside the if as well as blocks.
- leave nir_opt_trivial_continues() to actually remove the
  continue.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
2018-12-13 10:58:35 +11:00
Timothy Arceri
721566bddb nir: rework force_unroll_array_access()
Here we rework force_unroll_array_access() so that we can reuse
the induction variable detection in a following patch.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-12-13 10:39:51 +11:00
Timothy Arceri
48135f175c nir: factor out some of the complex loop unroll code to a helper
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-12-13 10:34:48 +11:00
Jordan Justen
7fe4e0ad5d docs: Document GitLab merge request process (email alternative)
This documents a process for using GitLab Merge Requests as an second
way to submit code changes for Mesa. Only one of the two methods is
allowed for each patch series.

We will *not* require all patches to be emailed. Some code changes may
be reviewed and merged without any discussion on the mesa-dev email
list.

v2:
 * No longer require email. Allow submitter to choose email or a
   GitLab merge request.
 * Various feedback from Brian, Daniel, Dylan, Eric, Erik, Jason,
   Matt, Michel and Rob.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Rob Clark <robdclark@gmail.com>
2018-12-12 10:05:29 -08:00
Rhys Kidd
ff6f1dd0d3 meson: libfreedreno depends upon libdrm (for fence support)
Error message building freedreno Gallium driver with meson:

  ../src/gallium/drivers/freedreno/freedreno_fence.c:27:21: fatal error: libsync.h: No such file or directory
   \#include <libsync.h>

Fixes: 4aa69cc425 ("meson: build freedreno")
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-12 09:01:06 -08:00
Jason Ekstrand
ca98902d09 nir: Document the function inlining process
This has thrown a few people off recently and it's good to have the
process and all the rational for it documented somewhere.  A comment at
the top of nir_inline_functions seems as good a place as any.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-12-12 08:32:32 -06:00
Jason Ekstrand
5749c0ebc4 intel/blorp: Assert that we don't re-layout a compressed surface
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-12 08:32:32 -06:00
Jason Ekstrand
e4fdc650f1 anv/pipeline: Set the correct binding count for compute shaders
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-12 08:32:25 -06:00
Samuel Pitoiset
2ac6d55f38 radv: bump reported version to 1.1.90
After going through the spec changelog, it looks like RADV
is up to date. Note that ANV also reports 1.1.90.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-12 13:51:16 +01:00
Erik Faye-Lund
f856f50194 virgl: force linear texturing support
When I made sure that half-float texture-filtering was required for ES3,
I didn't realize that virgl doesn't report support for this correctly.
This regressed the GLES version available on top of several drivers,
including i965 from 3.2 to 2.0.

This is going to need protocol changes to fix properly, so let's just
restore the previous behavior by enabling floating-point filtering
unconditionally for now.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: fcf9fcee3c "mesa/main: do not require float-texture filtering for es3"
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-12-12 11:44:47 +01:00
Iago Toral Quiroga
3918943211 intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments
The implementation of these opcodes in the generator assumes that their
arguments are packed, and it generates register regions based on that
assumption.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-12 08:09:45 +01:00
Jason Ekstrand
a10a450db2 anv: Advertise support for MinLod on Skylake+
These are usually used for dealing with sparse resources but there's no
reason why we can't hook them up before we have sparse.  We have the
hardware; let's light it up.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
cb98e0755f intel/fs: Support min_lod parameters on texture instructions
We have to lower some shadow instructions because they don't exist in
hardware and we have to lower txb+offset+clamp because the message gets
too big and we run into the sampler message length limit of 11 regs.

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
4ef8f46fd1 nir/lower_tex: Add lowering for some min_lod cases
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
4a691cfa7e nir/lower_tex: Modify txd instructions instead of replacing them
I don't know if one is better than the other or not but this approach
has the advantage that we never forget to copy information over and
we're not hard-coding quite as many assumptions.  It's also a lot
simpler and much less code.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
5a968ae473 nir/lower_tex: Simplify lower_gradient logic
Instead of having to call two different lower_gradient functions based
on whether or not it's a cube, just make lower_gradient handle cubes.
This significantly simplifies some of the logic.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
caeffe7549 spirv: Add support for MinLod
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Jason Ekstrand
e1ef6c3c29 intel/ir: Don't allow allocating zero registers
This simple check helps catch bugs early that can end up propagating
into later stages of the compile and triggering strange asserts.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-11 21:26:23 -06:00
Roland Scheidegger
86c45fe960 gallivm: remove unused float coord wrapping for aos sampling
AoS sampling tries to use integers for coord wrapping when possible,
as it should be faster. However, for AVX, this was suboptimal, because
only floats can use 8x32bit vectors, whereas integers have to be split
into 4x32bit vectors. (I believe part of why it was slower was also
that at least earlier llvm versions had trouble optimizing it properly,
since you can still do simple bit ops with 8x32bit vectors, so a
sequence of int add / and / int add / and with such vectors would
actually end up doing 128bit inserts/extracts between the operations
instead of just doing the cheap 128bit ands.)
Hence, a special float coord wrapping path was added to AoS sampling.
But this path was actually disabled for a long time already, since we
found that just splitting everything before entering the AoS path was
still sligthly faster usually, so none of this float coord wrapping
code was used anymore (AoS sampling code, when avx2 isn't supported,
never sees vectors with length > 4). I thought it might be useful some
day again, but I'm not interested anymore in optimizing for very weird
instruction sets which have support for 256bit vectors for floats but
not for ints, so just drop it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-12-12 03:50:03 +01:00
Emil Velikov
721c296bdc docs: update calendar, add news item and link release notes for 18.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 21:25:18 +00:00
Emil Velikov
5391b65ed1 docs: add sha256 checksums for 18.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 21:21:42 +00:00
Emil Velikov
512bd8d3dd docs: add release notes for 18.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-11 21:21:41 +00:00
Neil Roberts
8600aa35bd freedreno: Add .dir-locals to the common directory
The commit aa0fed10d3 moved a bunch of Freedreno code to a common
directory. The previous directory had a .dir-locals file for Emacs.
This patch copies it to the new directory as well.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-11 13:14:08 -08:00
Rob Clark
cfe8220904 mesa/st/nir: fix missing nir_compact_varyings
LinkedTransformFeedback is normally populated, which had nerf'd varying
packing since the check was introduced.

Fixes: dbd52585fa st/nir: Disable varying packing when doing transform feedback.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-11 15:51:34 -05:00
Rob Clark
9e3fc0c1e0 nir: fix spelling typo
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-12-11 15:51:34 -05:00
Jason Ekstrand
8f401b0ce6 anv,radv: Disable VK_EXT_pci_bus_info
The Vulkan working group recently discovered that we made a mistake in
assuming that PCI domains are 16-bit even though they can potentially be
32-bit values.  To fix this, the next spec update will change the types
in the VK_EXT_pci_bus_info struct to be 32 bits which will be a
backwards-incompatible change.  Normally, Khronos tries very hard to
never make backwards incompatible changes to specs.  Hopefully, the
extension is new enough (2 months) that there are no shipping apps which
use the extension so this should be safe.

This commit disables the extension for both anv and radv in mesa and
should be back-ported to 18.3 ASAP so we avoid any potential issues with
new apps running on old drivers.  I'll send out a commit (which we can
also back-port to 18.3 if we really care) to re-enable the extension in
both drivers once this week's spec update ships.  The one known use of
this extension is internal to mesa and will continue working with the
extension disabled and will naturally update when we get a new header.

Cc: "18.3" <mesa-stable@lists.freedesktop.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-11 11:30:05 -06:00
Juan A. Suarez Romero
fb88dcf5ca docs: extends 18.2 lifecycle
As 18.3 was published with some delay, let's extend 18.2 life for
another extra release.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-11 15:20:10 +01:00
Kristian H. Kristensen
c0de7c21a3 glapi: fixup EXT_multisampled_render_to_texture dispatch
There's a few missing and convoluted bits:

 - FramebufferTexture2DMultisampleEXT
Missing sanity check, should be desktop="false"

 - RenderbufferStorageMultisampleEXT
Missing sanity check, is aliased to RenderbufferStorageMultisample.
Thus it's set only when desktop GL or GLES2 v3.0+, while the extension
is GLES2 2.0+.

If we flip the aliasing we'll break indirect GLX, so loosen the version
to 2.0. Not perfect, yet this is the most sane thing I could think of.

v2: [Emil] Fixup RenderbufferStorageMultisampleEXT, commmit message

Cc: Kristian H. Kristensen <hoegsberg@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108974
Fixes: 1b331ae505 ("mesa: Add core support for EXT_multisampled_render_to_texture{,2}")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 15:09:07 -08:00
Kristian H. Kristensen
9578dde1c8 freedreno: Fix the Makefile.am fix
Commit b028ce29f0 fixed a typo in
src/freedreno/Makefile.am, but ended up breaking the build for
freedreno.  The typo inadvertently made things work, as we were not
supposed to link with libnir or libmesautil to begin with.  Those come
in through libmesagallium and the typo prevented the duplicated
linkage.

Fixes: b028ce29f ("freedreno: add the missing _la in libfreedreno_ir3_la")
Cc: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 14:28:09 -08:00
Matt Turner
f447a13032 i965/fs: Handle V/UV immediates in dump_instructions() 2018-12-10 10:46:56 -08:00
Sagar Ghuge
694eb342a2 intel/compiler: Always print flag subregister number
While disassembling the predicate always print flag subregister number
to keep grammar same across the generation for assembler tool.

v2: Combine consecutive format calls (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-12-10 10:07:11 -08:00
Sagar Ghuge
e7598c5a62 intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region
When RepCtrl is set, the swizzle field is ignored by the hardware. In
order to ensure a 1-to-1 correspondence between the human-readable
disassembly and the binary instruction encoding always set the swizzle
to XXXX (all zeros) when it is unused due to RepCtrl

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-12-10 10:06:55 -08:00
Dylan Baker
6d3cbbbe15 meson: Add nir_algebraic_parser_test to suites
Just to make it easier to run a nir tests together.

Fixes: a0ae12ca91
       ("nir/algebraic: Add unit tests for bitsize validation")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-10 09:14:44 -08:00
Emil Velikov
27c4fdfdf8 amd/addrlib: drop si_ci_vi_merged_enum.h from the list
Fixes: 776b911365 ("amd/addrlib: update Mesa's copy of addrlib")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:35:01 +00:00
Emil Velikov
b028ce29f0 freedreno: add the missing _la in libfreedreno_ir3_la
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:35:01 +00:00
Emil Velikov
b30e37ec64 freedreno: drop duplicate MKDIR_GEN declaration
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:35:01 +00:00
Rhys Kidd
05c7e726f7 travis: radeonsi and radv require LLVM 7.0
Fixes: 3fbdcd942f ("amd: remove support for LLVM 6.0")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Cc: Andres Gomez <agomez@igalia.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:20:12 +00:00
Kirill Burtsev
a539316485 loader: free error state, when checking the drawable type
Currently we distinguish if the drawable is a window or pixmap by
checking xcb_present_select_input throws an error or not.

Yet, we don't always free the error state returned by xcb.

Cc: Kirill Burtsev <kirill.burtsev@qt.io>
Cc: Boyan Ding <boyan.j.ding@gmail.com>
Fixes: 6bd9ba7d07 ("loader: Add dri3 helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil: add commit message, fixes tag]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:19:55 +00:00
Timothy Arceri
032f247921 nir: make use of new nir_cf_list_clone_and_reinsert() helper
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-10 13:59:50 +11:00
Timothy Arceri
6b961eb534 nir: add a new nir_cf_list_clone_and_reinsert() helper
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-10 13:59:50 +11:00
Timothy Arceri
03d7c65ad8 nir: clarify some nit_loop_info member names
Following commits will introduce additional fields such as
guessed_trip_count. Renaming these will help avoid confusion
as our unrolling feature set grows.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-10 13:59:50 +11:00
Timothy Arceri
de0aee7638 nir: small tidy ups for nir_loop_analyze()
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-10 13:59:50 +11:00
Kenneth Graunke
41a4a6ba6f i965: Flip arguments to load_register_reg helpers.
load_register_imm and load_register_mem take the destination as the
first argument, so I'd like load_register_reg to do the same the sake
of consistency.  Otherwise, reading sequences of mixed LRI/LRM/LRR is
needlessly confusing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-09 18:39:16 -08:00
Kenneth Graunke
34c9dc2537 i965: Delete dead brw_meta_resolve_color prototype.
Dead since commit 09e041d61d (May 2016).
2018-12-09 18:39:16 -08:00
Karol Herbst
77944fb2b7 nv50/ir: fix use-after-free in ConstantFolding::visit
opnd() might delete the passed in instruction, but it's used through
i->srcExists() later in visit

v2: use continue instead return
v3: use brackets for the outer if/else chain

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-09 18:19:59 +01:00
Karol Herbst
d63a133082 nouveau: use atomic operations for driver statistics
multiple threads can write to those at the same time

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-09 04:43:20 +01:00
Karol Herbst
a28ff22295 nv50/ir: initialize relDegree staticly
this race condition is pretty harmless, but also pretty trivial to fix

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-09 04:43:17 +01:00
Eric Anholt
cc6a5e937b shader-packing 2018-12-07 16:51:12 -08:00
Eric Anholt
09ad0d870c tfu 2018-12-07 16:49:41 -08:00
Eric Anholt
f1d98204c3 v3d: Fix a leak of the disassembled instruction string during debug dumps.
Fixes: ade416d023 ("broadcom: Add VC5 NIR compiler.")
2018-12-07 16:48:23 -08:00
Eric Anholt
7f8d8b7d27 vc4: Fix a leak of the transfer helper on screen destroy.
Fixes: d009463a65 ("vc4: Switch to using u_transfer_helper for MSAA maps.")
2018-12-07 16:48:23 -08:00
Eric Anholt
3bd73d31a8 v3d: Fix a leak of the transfer helper on screen destroy.
Fixes: 7a30517cce ("broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT.")
2018-12-07 16:48:23 -08:00
Eric Anholt
bad95bb13c v3d: Add VIR dumping of TMU config p0/p1.
I had a bit of it for V3D 3.x, but didn't update it for 4.x.
2018-12-07 16:48:23 -08:00
Eric Anholt
1fc78ff3f1 v3d: Simplify VIR uniform dumping using a temporary. 2018-12-07 16:48:23 -08:00
Eric Anholt
5932575299 v3d: Garbage collect unused uniforms code. 2018-12-07 16:48:23 -08:00
Eric Anholt
62a3192112 v3d: Split most of TEXTURE_SHADER_STATE setup out of sampler views.
For shader image load/store, we want most of this logic to be shared.
2018-12-07 16:48:23 -08:00
Eric Anholt
8cb1f3bab7 v3d: Avoid confusing auto-indenting in TEXTURE_SHADER_STATE packing
Having "v3dx_pack() {" under each #if branch would confuse emacs's
indenter.
2018-12-07 16:48:23 -08:00
Eric Anholt
ee9b758053 v3d: Fix handling of texture first_layer offsets for 3D textures.
I think this bug predated adding v3d_layer_offset().  Noticed during an
unrelated refactor.
2018-12-07 16:48:23 -08:00
Eric Anholt
acecee4c2d v3d: Return the right gl_SampleMaskIn[] value.
It's supposed to be the dispatched sample mask for this pixel, not the GL
state's sample mask.
2018-12-07 16:48:23 -08:00
Eric Anholt
6870111051 v3d: Fix a comment typo 2018-12-07 16:48:23 -08:00
Eric Anholt
ca0e4ae4bc v3d: Convert to using nir_src_as_uint() from const_value derefs.
Follows 16870de8a0 ("nir: Use nir_src_is_const and nir_src_as_* in core
code") to clean up v3d.
2018-12-07 16:48:23 -08:00
Eric Anholt
503b55c622 v3d: Don't forget to flush writes to UBOs.
If someone did TF into a UBO, we might have left the TF job un-flushed at
the point of reading.
2018-12-07 16:48:23 -08:00
Eric Anholt
504d06e4c1 v3d: Make an array for frag/vert texture state in the context.
This simplifies a bunch of our texture handling, while introducing the
slots necessary for adding new shader stages.
2018-12-07 16:48:23 -08:00
Eric Anholt
d1965344ac v3d: Re-use the wrap mode uniform on V3D 3.3. 2018-12-07 16:48:23 -08:00
Eric Anholt
e94d034a38 v3d: Put default vertex attribute values into the state uploader as well.
The default attributes are long-lived (the state struct is cached), and
only 256 bytes each.
2018-12-07 16:48:23 -08:00
Eric Anholt
b38e4d313f v3d: Create a state uploader for packing our shaders together.
Shaders are usually quite short, and are private to the context.  We can
save memory and reduce the work the kernel needs to do at exec time by
packing them together in a stream uploader for long-lived state.
2018-12-07 16:48:23 -08:00
Eric Anholt
1911888760 v3d: Update simulator cache flushing code to match the kernel better.
We were missing the invalidate between bin and render (possibly relevant
for SSBOs), and still trying to flush the nonexistent L2C on 3.3+.
2018-12-07 16:48:23 -08:00
Eric Anholt
2ebca177dc v3d: Use the TFU to do generatemipmap.
This is a separate, dedicated hardware unit for texture layout conversions
and mipmap generation.
2018-12-07 16:48:23 -08:00
Eric Anholt
ee0549ff9a v3d: Add the V3D TFU submit interface to the simulator.
The TFU lets us format raster and SAND images into formats that can be
read by the texture engine, and do mipmap generation.

The UAPI comes from drm-next e69aa5f9b97f ("Merge tag
'drm-misc-next-2018-12-06' of git://anongit.freedesktop.org/drm/drm-misc
into drm-next")
2018-12-07 16:48:23 -08:00
Eric Anholt
42652ea51e v3d: Use combined input/output segments.
The HW apparently has some issues (or at least a much more complicated VCM
calculation) with non-combined segments, and the closed source driver also
uses combined I/O.  Until I get the last CTS failure resolved (which does
look plausibly like some VPM stomping), let's use combined I/O too.
2018-12-07 16:48:23 -08:00
Eric Anholt
fb9bcf5602 v3d: Add missing OES_half_float_linear support.
We were exposing ARB_texture_float, but apparently not the OES subset
flag.  Fixes regression from GLES3 support to GLES2.

Fixes: fcf9fcee3c ("mesa/main: do not require float-texture filtering
for es3")
2018-12-07 16:48:23 -08:00
Eric Anholt
90e98295a4 v3d: Add support for RGBA_SRGB along with BGRA_SRGB.
This is the actual native format for the hardware, without swizzling.
Noticed while debugging why GLES3 disappeared.
2018-12-07 16:48:23 -08:00
Kenneth Graunke
f0d51e81c9 intel/blorp: Expand blorp_address::offset to be 64 bits.
In the softpin world, surface state base address may be a fixed 64-bit
address (with no associated BO).  It makes sense to store this in the
offset field.  But it needs to be the full size.

We also update the clear color address to be consistently uint64_t
everywhere so we can continue passing intel_miptree_get_clear_color
a pointer to the blorp_address's offset field without type mismatches.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-12-07 16:35:51 -08:00
Rob Clark
d014af98b7 freedreno/drm: fix memory leak
Fix an emberrasing memory leak with the non-softpin submit/rb
implementation.

Fixes: f3cc0d2747 freedreno: import libdrm_freedreno + redesign submit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 14:12:12 -05:00
Rob Clark
5c2c1f0a2d freedreno/ir3: track max flow control depth for a5xx/a6xx
Rather than just hard-coding BRANCHSTACK size.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
9517037bdc freedreno/ir3: code-motion
Split up ir3_compiler_nir.c a bit before starting to add new stuff for
a6xx SSBO/image instructions.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
e37351fa57 freedreno/ir3: sync instr/disasm
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
0d240c2214 freedreno/ir3: don't fetch unused tex components
Detect when a component of an (for example) texture fetch is unused and
propagate the updated wrmask back to the parent instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
b971afd19e freedreno/a6xx: blitter fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
237ae7daf2 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
e779725f0b freedreno/drm: fix relocs in nested stateobjs
If we have an reloc from stateobjA to stateobjB, we would previously
leave stateobjB's bos out of the submit's bos table.  Handle this case
by copying into stateobjA's reloc_bos table.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
9f7c6c78bc freedreno/a5xx+a6xx: remove unused fs/vs pvt mem
copy/pasta from older gens

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
c500e7b747 gallium: fix typo
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark
f6ad286c80 freedreno: remove unused fd_surface fields
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Nicolai Hähnle
4275cae95c meson: link LLVM 'native' component when LLVM is available
Linking against LLVM built with BUILD_SHARED_LIBS fails otherwise,
as the component is required for the draw module.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-07 16:26:14 +01:00
Connor Abbott
2845c49218 nir: Fixup algebraic test for variable-sized conversions
b2i can now take any size boolean in preparation for 1-bit booleans, so
the error message printed is slightly different.

Fixes: dca6cd9ce6 ("nir: Make boolean conversions sized just like the others")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108961
Cc: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-07 16:07:51 +01:00
Samuel Pitoiset
e8a383ce67 gallium: add missing PIPE_CAP_SURFACE_SAMPLE_COUNT default value
Fixes: 2710c40e3c ("gallium: Add new PIPE_CAP_SURFACE_SAMPLE_COUNT")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-12-07 15:06:29 +01:00
Emil Velikov
96d4ecbb11 docs: update calendar, add news item and link release notes for 18.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-07 11:50:12 +00:00
Emil Velikov
0144bbdb98 docs: add sha256 checksums for 18.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d81beab96a)
2018-12-07 11:44:33 +00:00
Emil Velikov
b1e0336497 docs: update 18.3.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d603cd9d84)
2018-12-07 11:44:31 +00:00
Kristian H. Kristensen
3e55df4f83 freedreno: Add support for EXT_multisampled_render_to_texture
There is not much to do in freedreno - tile layout and multisample
state for gmem renderings is programmed based on the pfb sample count,
while resolve blits take the destination sample count from the resource.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-06 16:56:37 -08:00
Rob Clark
913eb7fa58 freedreno/a6xx: MSAA
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-06 16:55:59 -08:00
Kristian H. Kristensen
14ea811c67 st/mesa: Add support for EXT_multisampled_render_to_texture
In gallium, we model the attachment sample count as a new nr_samples
field in pipe_surface. A driver can indicate support for the extension
using the new pipe cap, PIPE_CAP_MULTISAMPLED_RENDER_TO_TEXTURE.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-06 16:55:46 -08:00
Kristian H. Kristensen
2710c40e3c gallium: Add new PIPE_CAP_SURFACE_SAMPLE_COUNT
This new pipe cap and the new nr_samples field in pipe_surface lets a
state tracker bind a render target with a different sample count than
the resource. This allows for implementing
EXT_multisampled_render_to_texture and
EXT_multisampled_render_to_texture2.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-06 16:55:43 -08:00
Kristian H. Kristensen
1b331ae505 mesa: Add core support for EXT_multisampled_render_to_texture{,2}
This also turns on EXT_multisampled_render_to_texture which is a
subset of EXT_multisampled_render_to_texture2, allowing only
COLOR_ATTACHMENT0.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-06 16:55:30 -08:00
Vinson Lee
b4fd59075b nir/algebraic: Make algebraic_parser_test.sh executable.
Fixes make check permission error.

../../bin/test-driver: line 107: ./nir/tests/algebraic_parser_test.sh: Permission denied
FAIL nir/tests/algebraic_parser_test.sh (exit status: 126)

Fixes: a0ae12ca91 ("nir/algebraic: Add unit tests for bitsize validation")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2018-12-06 11:48:20 -08:00
Samuel Pitoiset
3fbdcd942f amd: remove support for LLVM 6.0
User are encouraged to switch to LLVM 7.0 released in September 2018.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-06 14:02:56 +01:00
Kristian H. Kristensen
3b2ad8b290 gallium: Android build fixes
A couple of simple fixes for building on Android with autotools.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-05 13:56:07 -08:00
Jason Ekstrand
dca6cd9ce6 nir: Make boolean conversions sized just like the others
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one if 8, 16, 32, or 64.  This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:07 -06:00
Jason Ekstrand
be98b1db38 nir/opt_algebraic: Add 32-bit specifiers to a bunch of booleans
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:03 -06:00
Jason Ekstrand
2715080d65 nir/opt_algebraic: Drop bit-size suffixes from conversions
Suffixes are dropped from a bunch of conversion opcodes when it makes
sense to do so.  Others are kept if we really do want the bit-size
restriction.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:01 -06:00
Jason Ekstrand
ff8e3d3b7b nir/opt_algebraic: Simplify an optimization using the new search ops
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:58 -06:00
Jason Ekstrand
05af952a11 nir/algebraic: Add support for unsized conversion opcodes
All conversion opcodes require a destination size but this makes
constructing certain algebraic expressions rather cumbersome.  This
commit adds support to nir_search and nir_algebraic for writing
conversion opcodes without a size.  These meta-opcodes match any
conversion of that type regardless of destination size and the size gets
inferred from the sizes of the things being matched or from other
opcodes in the expression.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:56 -06:00
Jason Ekstrand
4925290ab1 nir/algebraic: Refactor codegen a bit
Instead of using an OrderedDict, just have a (necessarily sorted) array
of transforms and a set of opcodes.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:54 -06:00
Jason Ekstrand
d6aac618fb nir/algebraic: Clean up some __str__ cruft
Both of these things are already handled in the Value base class so we
don't need to handle them explicitly in Constant.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:52 -06:00
Jason Ekstrand
85f0ea9d8f nir/opcodes: Rename tbool to tbool32
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:49 -06:00
Jason Ekstrand
03571a7a6c nir/opcodes: Pull in the type helpers from constant_expressions
While we're at it, we rework them a bit to all use regular expressions
and assert more.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:06 -06:00
Connor Abbott
a0ae12ca91 nir/algebraic: Add unit tests for bitsize validation
The non-failure path can be tested by just compiling mesa and then
testing it, but the failure paths won't be hit unless you make a mistake,
so it's best to test them with some unit tests.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-05 17:57:40 +01:00
Connor Abbott
29a1450e28 nir/algebraic: Rewrite bit-size inference
Before this commit, there were two copies of the algorithm: one in C,
that we would use to figure out what bit-size to give the replacement
expression, and one in Python, that emulated the C one and tried to
prove that the C algorithm would never fail to correctly assign
bit-sizes. That seemed pretty fragile, and likely to fall over if we
make any changes. Furthermore, the C code was really just recomputing
more-or-less the same thing as the Python code every time. Instead, we
can just store the results of the Python algorithm in the C
datastructure, and consult it to compute the bitsize of each value,
moving the "brains" entirely into Python. Since the Python algorithm no
longer has to match C, it's also a lot easier to change it to something
more closely approximating an actual type-inference algorithm. The
algorithm used is based on Hindley-Milner, although deliberately
weakened a little. It's a few more lines than the old one, judging by
the diffstat, but I think it's easier to verify that it's correct while
being as general as possible.

We could split this up into two changes, first making the C code use the
results of the Python code and then rewriting the Python algorithm, but
since the old algorithm never tracked which variable each equivalence
class, it would mean we'd have to add some non-trivial code which would
then get thrown away. I think it's better to see the final state all at
once, although I could also try splitting it up.

v2:
- Replace instances of "== None" and "!= None" with "is None" and
"is not None".
- Rename first_src to first_unsized_src
- Only merge the destination with the first unsized source, since the
sources have already been merged.
- Add a comment explaining what nir_search_value::bit_size now means.
v3:
- Fix one last instance to use "is not" instead of !=
- Don't try to be so clever when choosing which error message to print
based on whether we're in the search or replace expression.
- Fix trailing whitespace.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-05 17:57:40 +01:00
Samuel Pitoiset
49ef890733 radv: expose VK_EXT_scalar_block_layout
Nothing to do, the compiler already handles that.

All new dEQP.VK.ubo.* and dEQP.VK.ssbo.* pass, except some
16-bit tests that are quite related to fdo bug #108114.

Only enable the extension on CIK+ because it might not work on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 17:38:20 +01:00
Samuel Pitoiset
c6465fec0c spirv: add SpvCapabilityInt64Atomics
Required for VK_KHR_shader_atomic_int64.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-05 14:39:55 +01:00
Michal Srb
63c0916ada drisw: Use separate drisw_loader_funcs for shm
The original code was modifying the global drisw_lf variable, which is bad
when there are multiple contexts in single process, each initialized with
different loader. One may support put_image_shm and the other not.

Since there are currently only two possible combinations, lets create two
global tables, one for each. Lets make them const, since we won't change them
and they can be shared.

This fixes crash in VLC. It used two GL contexts (each in different thread), one
was initialized by its Qt GUI, the other by its video output plugin. The first
one set the put_image_shm=drisw_put_image_shm, the second did not, but
since the same structure was used, the drisw_put_image_shm was used too. Then
it crashed because the second loader did not have putImageShm set.

Downstream bug:
https://bugzilla.opensuse.org/show_bug.cgi?id=1113533

v2: Added Fixes and described the VLC bug.

Fixes: 63c427fa71 ("drisw: use putImageShm if available")
Signed-off-by: Michal Srb <msrb@suse.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-05 13:16:09 +00:00
Michal Srb
c0ac038c97 gallium: Constify drisw_loader_funcs struct
The content is not expected to change.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Michal Srb <msrb@suse.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-05 13:16:09 +00:00
Samuel Pitoiset
c7ada4901a radv: wait on the high 32 bits of timestamp queries
In case we are unlucky if the low part is 0xffffffff.

Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp queries")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 13:05:58 +01:00
Samuel Pitoiset
e899728769 radv: reset pending_reset_query when flushing caches
If the driver used a compute shader for resetting a query pool,
it should be completed when caches are flushed.

This might reduce the number of stalls if operations are done
between vkCmdResetQueryPool() and vkCmdBeginQuery()
(or vkCmdWriteTimestamp()).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
2018-12-05 13:05:55 +01:00
Lionel Landwerlin
9a7b319903 anv/query: flush render target before copying results
This change tracks render target writes in the pipeline and applies a
render target flush before copying the query results to make sure the
preceding operations have landed in memory before the command streamer
initiates the copy.

v2: Simplify logic in CopyQueryResults (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909
Fixes: 37f9788e9a ("anv: flush pipeline before query result copies")
Cc: mesa-stable@lists.freedesktop.org
2018-12-05 11:43:34 +00:00
Alex Smith
c1b6cb068c radv: Flush before vkCmdWriteTimestamp() if needed
As done for vkCmdBeginQuery() already. Prevents timestamps from being
overwritten by previous vkCmdResetQueryPool() calls if the shader path
was used to do the reset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-05 10:52:48 +00:00
Samuel Pitoiset
824cfc1ee5 radv: rework the TC-compat HTILE hardware bug with COND_EXEC
After investigating on this, it appears that COND_WRITE doesn't
work correctly in some situations. I don't know exactly why does
it fail to update DB_Z_INFO.ZRANGE_PRECISION, but as AMDVLK
also uses COND_EXEC I think there is a reason.

Now the driver stores a new metadata value in order to reflect
the last fast depth clear state. If a TC-compat HTILE is fast cleared
with 0.0f, we have to update ZRANGE_PRECISION to 0 in order to
work around that hardware bug.

This fixes rendering issues with The Forest and DXVK and doesn't
seem to introduce any regressions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108914
Fixes: 68dead112e ("radv: update the ZRANGE_PRECISION value for the TC-compat bug")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 09:26:31 +01:00
Dieter Nützel
2669dbf881 docs/features: Delete double nv50 entry and wrong enumeration
trivial

Fix commit d9b2234042

Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-12-04 18:51:18 -05:00
Marek Olšák
5907412d04 st/mesa: expose EXT_render_snorm on GLES
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-04 15:33:29 -05:00
Marek Olšák
1660f3aa05 mesa: expose AMD_texture_texture4
because the closed driver exposes it. Tested by piglit.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-04 15:33:29 -05:00
Marek Olšák
908f817918 mesa: expose EXT_texture_compression_bptc in GLES
tested by piglit.

v2: rebase

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-12-04 15:33:29 -05:00
Marek Olšák
34f07ddebb mesa: expose EXT_texture_compression_rgtc on GLES
The spec was modified to support GLES. Tested by piglit.

v2: rebase

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-12-04 15:33:29 -05:00
Erik Faye-Lund
91af56e383 mesa/main: fix up _mesa_has_rg_textures for gles2
rg-textures are supported in GLES 2.0 if EXT_texture_rg, so let's make
sure the enums are accepted.

Fixes: 510b642460 "mesa/main: do not allow rg-textures enums before gles3"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108936
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-04 21:14:26 +01:00
Erik Faye-Lund
5bf38bfb64 mesa/main: correct validation for GL_RGB565
Technically speaking, this validation was incorrect, because GL_RGB565
is only supported in OpenGL ES 1.x if OES_framebuffer_object is
supported. This couldn't lead to any real incorrect behavior, because
all drivers support OES_framebuffer_object. But let's keep the code
self-documenting, by correcting the check as per the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-04 21:14:16 +01:00
Marek Olšák
4b218984d8 mesa: expose GL_EXT_texture_view as an alias of GL_OES_texture_view
There are no spec changes.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-04 12:50:36 -05:00
Marek Olšák
d9b2234042 st/mesa: expose GL_OES_texture_view
For format fallbacks like ETC and ASTC, switching between sRGB and linear
decoding is undefined, or at least is not bit-exact. Same as
EXT_texture_sRGB_decode on GLES.

There are no piglit or dEQP regresssions.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-04 12:50:36 -05:00
Eric Engestrom
95d62baac5 loader: deduplicate logger function declaration
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-04 16:29:32 +00:00
Eric Engestrom
eade6ffeee mesa: drop unused & deprecated lib
DeprecationWarning: the imp module is deprecated in favour of importlib

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-04 16:26:21 +00:00
Eric Engestrom
919bec1c47 anv: add unreachable() for VK_EXT_fragment_density_map
This silences the -Wswitch compiler warning.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-04 16:22:55 +00:00
Eric Engestrom
a0b14c1b02 meson: skip asm check when asm is disabled
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-04 16:22:51 +00:00
Andrii Simiklit
6ae873b97d intel/tools: make sure the binary file is properly read
1. tools/i965_disasm.c:58:4: warning:
     ignoring return value of ‘fread’,
     declared with attribute warn_unused_result
     fread(assembly, *end, 1, fp);

v2: Fixed incorrect return value check.
       ( Eric Engestrom <eric.engestrom@intel.com> )

v3: Zero size file check placed before fread with exit()
       ( Eric Engestrom <eric.engestrom@intel.com> )

v4: - Title is changed.
    - The 'size' variable was moved to top of a function scope.
    - The assertion was replaced by the proper error handling.
    - The error message on a caller side was fixed.
       ( Eric Engestrom <eric.engestrom@intel.com> )

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-04 16:19:26 +00:00
Toni Lönnberg
d7b99ab947 intel/aubinator_error_decode: Get rid of warning for missing switch case
../src/intel/tools/aubinator_error_decode.c: In function ‘instdone_register_for_ring’:
../src/intel/tools/aubinator_error_decode.c:177:4: warning: enumeration value ‘I915_ENGINE_CLASS_INVALID’ not handled in switch [-Wswitch]
    switch (class) {
    ^~~~~~
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-04 12:47:49 +00:00
Ilia Mirkin
bacf8471dc nouveau: set texture upload budget
It doesn't seem like the exact number has too much effect on the
performaince in "teximage". However setting it to just about anything
prevents some OOMs from getting hit. These values are not well-tuned,
but don't seem too bad.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03 23:11:29 -05:00
Ilia Mirkin
08c64fe7a1 nv50,nvc0: add explicit handling of PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET
Since the max attrib stride is 2048, the max src offset makes sense as
2047.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03 23:11:29 -05:00
Ilia Mirkin
de49e06507 nv50: always keep TSC slot 0 bound
All TXF operations implicitly use sampler 0, and fail if it's not bound
to anything. This does not happen in LINKED_TSC mode, but we don't
currently use this.

We ensure that TSC entry at id 0 has the SRGB conversion bit enabled
(and all samplers we normally generate will too). Then when the TSC at
*slot* 0 (not to be confused with entry 0 in the global TSC table) is
unbound, we bind it to entry 0. This way, TXF operations are not
dependent on there being a regular sampler bound there.

Fixes arb_texture_buffer_object-subdata-sync among others. (TBO's are
particularly susceptible to this as they don't bind a sampler.)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03 23:11:29 -05:00
Dave Airlie
1363a47c9c radv: use 3d shader for gfx9 copies if dst is 3d
This fixes some crucible 3d miptree tests I've been working on
when executed using the compute shader path.

Fixes: d08f267814 (radv/gfx9: fix 3d image to image transfers on compute queues.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 10:42:31 +10:00
Bas Nieuwenhuizen
12e35a64c0 radv: Check for shareable images in central place.
One place to put the logic makes things easier to change.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Bas Nieuwenhuizen
3bf48741e1 radv/android: Use buffer metadata to determine scanout compat.
These days we don't always allocate scanout compatible textures anymore.
That does mean we have to fix the radv android WSI though.

Fixes: b1444c9ccb "radv: Implement VK_ANDROID_native_buffer."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Bas Nieuwenhuizen
51091b3e1f radv/android: Mark android WSI image as shareable.
Fixes: b1444c9ccb "radv: Implement VK_ANDROID_native_buffer."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Matt Turner
dd53bb7e1f Revert "st/mesa: silenced unhanded enum warning in st_glsl_to_tgsi.cpp"
This reverts commit 198c50f487.

This needs to be reverted after commit 017199d2d2 ("mesa: Revert
INTEL_fragment_shader_ordering support")
2018-12-03 16:20:43 -08:00
Matt Turner
017199d2d2 mesa: Revert INTEL_fragment_shader_ordering support
This extension is not properly tested (testing for
GL_ARB_fragment_shader_interlock is not sufficient), and since this was
noted in review on August 28th no tests have been sent.

Revert "i965: Add INTEL_fragment_shader_ordering support."
Revert "mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering"

This reverts commit 03ecec9ed2.
This reverts commit 119435c877.

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Anholt <eric@anholt.net>
2018-12-03 15:37:37 -08:00
Dave Airlie
e3f075439c virgl: fix const warning on debug flags.
Fixes: 8d4bb6e5c (virgl: Add command and flags to initiate debugging on the host (v2))
2018-12-04 08:11:13 +10:00
Jason Ekstrand
71271e167b vulkan: Update the XML and headers to 1.1.95
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-03 14:27:10 -06:00
Tobias Klausmann
9401a2f2e6 amd/vulkan: meson build - use radv_deps for libvulkan_radeon
Without this the build breaks with:

FAILED: src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o
cc -Isrc/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha -Isrc/amd/vulkan
-I../src/amd/vulkan -Isrc/../include -I../src/../include -Isrc -I../src
-Isrc/mapi -I../src/mapi -Isrc/mesa -I../src/mesa -I../src/gallium/include
-Isrc/gallium/auxiliary -I../src/gallium/auxiliary -Isrc/amd -I../src/amd
-Isrc/amd/common -I../src/amd/common -Isrc/compiler -I../src/compiler
-Isrc/vulkan/util -I../src/vulkan/util -Isrc/vulkan/wsi -I../src/vulkan/wsi
-Isrc/compiler/nir -I../src/compiler/nir -I/usr/include -I/usr/include/libdrm
-fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch
-std=c99 -O2 -g '-DVERSION="18.3.0-rc5"' -DPACKAGE_VERSION=VERSION
'-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"'
-DGLX_USE_TLS -DHAVE_ST_VDPAU -DENABLE_ST_OMX_BELLAGIO=0
-DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING
-DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DENABLE_SHADER_CACHE
-DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ
-DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT
-DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT
-DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE
-DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN
-DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE
-DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT
-DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT
-DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
-DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS
-DHAVE_FUNC_ATTRIBUTE_NORETURN -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS
-DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H
-DHAVE_ENDIAN_H -DHAVE_DLFCN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP
-DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L
-DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_PTHREAD
-DHAVE_PTHREAD_SETAFFINITY -DHAVE_LIBDRM -DHAVE_LLVM=0x0600
-DMESA_LLVM_VERSION_PATCH=1 -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED
-DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -Werror=implicit-function-declaration
-Werror=missing-prototypes -Werror=return-type -fno-math-errno
-fno-trapping-math -Wno-missing-field-initializers -Wno-format-truncation -O2
-Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables
-fasynchronous-unwind-tables -fstack-clash-protection -DNDEBUG -fPIC -pthread
-D__STDC_FORMAT_MACROS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS
-D__STDC_LIMIT_MACROS -fvisibility=hidden -Wno-override-init
-DVK_USE_PLATFORM_XCB_KHR -DVK_USE_PLATFORM_XLIB_KHR
-DVK_USE_PLATFORM_WAYLAND_KHR -DVK_USE_PLATFORM_DISPLAY_KHR
-DVK_USE_PLATFORM_XLIB_XRANDR_EXT  -MD -MQ
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -MF
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o.d' -o
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -c
../src/amd/vulkan/radv_pipeline.c
In file included from ../src/vulkan/util/vk_alloc.h:29,
                 from ../src/amd/vulkan/radv_private.h:52,
                 from ../src/amd/vulkan/radv_debug.h:27,
                 from ../src/amd/vulkan/radv_pipeline.c:30:
../src/../include/vulkan/vulkan.h:54:10: fatal error: wayland-client.h: Datei
oder Verzeichnis nicht gefunden
 #include <wayland-client.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.

The above command misses the include directory for wayland:
    -I/usr/include/wayland

The missing include is contained in the (until now) unused radv_deps:

if with_platform_wayland
  radv_deps += dep_wayland_client
  radv_flags += '-DVK_USE_PLATFORM_WAYLAND_KHR'
  libradv_files += files('radv_wsi_wayland.c')
endif

Fixes: 673dda8330 "meson: build "radv" vulkan driver for radeon hardware"
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-03 09:18:48 -08:00
Erik Faye-Lund
fcf9fcee3c mesa/main: do not require float-texture filtering for es3
The OpenGL ES 3.0 specification, table 3.13 lists half-float textures as
filterable, but not float textures. So we shouldn't depend on
ARB_float_texture, which requires full filtering support for both.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
43015b2a89 mesa/st: do not probe for the same texture-formats twice
This should be equalent of what we did before.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
212d270b4e mesa/main: require EXT_texture_sRGB for gles3
sRGB textures is a requirement for OpenGL ES 3.0, so let's make sure
we don't incorrectly enable a too high version.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
487010a099 mesa/main: require EXT_texture_type_2_10_10_10_REV for gles3
OpenGL ES 3.0 require this functionality, so we should also test for it
to avoid incorrectly exposing a too high GLES version.

On desktop, this has been required since all the way back in OpenGL 1.2
anyway.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
74eab1c62f mesa/main: split float-texture support checking in two
On OpenGL ES 2.0, there's separate extensions adding support for
half-float and float textures. So we need to validate the enums
separately as well.

This also prevents these enums from incorrectly being allowed on
OpenGL ES 1.x, where there's no extension that enables this in the
first place.

While we're at it, remove the pointless default-case, and the seemingly
stale fallthrough comment.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
c4136ed5cc mesa/main: do not allow EXT_texture_sRGB_R8 enums before gles3
ctx->Extensions.EXT_texture_sRGB_R8 is set regardless of the API
that's used, so checking for those direcly will always allow the
enums from this extensions when they are supported by the driver.

There's no extension adding support for this on OpenGL ES before
version 3.0, so let's tighten the check.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
d972939986 mesa/main: do not allow sRGB texture enums before gles3
ctx->Extensions.EXT_texture_sRGB is set regardless of the API that's
used, so checking for those direcly will always allow the enums from
this extensions when they are supported by the driver.

There's no extension adding support for this on OpenGL ES before
version 3.0, so let's tighten the check.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
3629ee025c mesa/main: do not allow snorm-texture enums before gles3
ctx->Extensions.EXT_texture_snorm is set regardless of the API
that's used, so checking for those direcly will always allow the
enums from this extensions when they are supported by the driver.

There's no extension adding support for this on OpenGL ES before
version 3.0, so let's tighten the check.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
52dc8b4f7b mesa/main: do not allow floating-point texture enums on gles1
ctx->Extensions.OES_texture_float is set regardless of the API
that's used, so checking for those direcly will always allow the
enums from this extensions when they are supported by the driver.

There's no extension enabling floating-point textures for OpenGL
ES 1.x, so we shouldn't allow those enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
167dcd59ae mesa/main: do not allow type_2_10_10_10_REV enums before gles3
ctx->Extensions.EXT_texture_type_2_10_10_10_REV is set regardless of
the API that's used, so checking for those direcly will always enable
extensions when they are supported by the driver.

There's no corresponding extension for OpenGL ES 1.x/2.0, so we
shouldn't allow these enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
b112e62ba4 mesa/main: do not allow MESA_ycbcr_texture enums on gles
This extension requies OpenGL, and shouldn't be available on OpenGL ES.
So let's not allow the enums from it either.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
1b2e9aca77 mesa/main: do not allow EXT_texture_shared_exponent enums before gles3
ctx->Extensions.EXT_texture_shared_exponent is set regardless of the
API that's used, so checking for those direcly will always allow the
enums from this extensions when they are supported by the driver.

We also need to make sure this is enabled on OpenGL ES 3. Because the
check is repeated, let's introduce a helper.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
510b642460 mesa/main: do not allow rg-textures enums before gles3
EXT_packed_float isn't supported on OpenGL ES, we shouldn't allow
these enums there, before OpenGL ES 3.0 which also introduce support
for these enums.

Since this check is repeated a lot, let's make a helper for this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
59690bf0a3 mesa/main: do not allow EXT_packed_float enums before gles3
EXT_packed_float isn't supported on OpenGL ES, we shouldn't allow
these enums there, before OpenGL ES 3.0 which also introduce support
for these enums.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
83db9d3e3a mesa/main: do not allow ARB_depth_buffer_float enums before gles3
Floating-point depth buffers are only supported on OpenGL 3.0, OpenGL ES
3.0, or if ARB_depth_buffer_float is supported. Because we checked a
driver capability rather than using an extension-check helper, we ended
up incorrectly allowing this on OpenGL ES 1.x and 2.x.

Since this logic is repeated, let's make a helper for it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
3bbd543b6e mesa/main: do not allow integer-texture enums before gles3
Integer textures shouldn't be implicitly exposed on OpenGL ES 1.x and
2.x, but because the code checked against a driver-capability rather
than using an extension-check helper, we ended up accidentally allowing
these enums on older versions when the driver supports it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
b5a370dc25 mesa/main: do not allow ARB_texture_rgb10_a2ui enums before gles3
ARB_texture_rgb10_a2ui isn't supported on OpenGL ES, we shouldn't expose
it there even if the driver supports it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
76b038bee7 mesa/main: do not allow stencil-texture enums on gles1
ctx->Extensions.ARB_texture_stencil8 is set regardless of the API
that's used, so checking for those direcly will always allow the
enums from this extensions when they are supported by the driver.

So let's instead check for both ARB_texture_stencil8 and
OES_texture_stencil8, so we support depth textures on OpenGL and
OpenGL ES 2.0+. There's no extension enabling stencil-textures for
OpenGL ES 1.x, so we shouldn't allow those enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
19eb0bf28f mesa/main: do not allow depth-texture enums on gles1
ctx->Extensions.ARB_depth_texture is set regardless of the API that's
used, so checking for those direcly will always allow the enums from
this extensions when they are supported by the driver.

So let's instead check for both ARB_depth_texture and OES_depth_texture,
so we support depth textures on OpenGL and OpenGL ES 2.0+. There's no
extension enabling depth-textures for OpenGL ES 1.x, so we shouldn't
allow those enums there.

This fixes oes_packed_depth_stencil-depth-stencil-texture_gles1 on i965

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
2dfcaf7554 mesa/main: do not allow astc enums on gles1
ctx->Extensions.KHR_texture_compression_astc_ldr is set regardless of
the API that's used, so checking for those direcly will always enable
extensions when they are supported by the driver.

But there's no extension enabling ASTC for OpenGL ES 1.x, so we
shouldn't allow those enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
1aa134038c mesa/main: do not allow etc2 enums on gles1
ctx->Extensions.ARB_ES3_compatibility is set regardless of the API
that's used, so checking for those direcly will always enable
extensions when they are supported by the driver.

But there's no extension enabling ETC2 for OpenGL ES 1.x, so we
shouldn't allow those enums there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
27ca87ccca mesa/main: do not allow s3tc enums on gles1
There's no extension enabling S3TC formats on OpenGL ES 1.x, so we
shouldn't allow these even if the driver can support it. So let's check
for EXT_texture_compression_s3tc instead of ANGLE_texture_compression_dxt,
which is supported on all other OpenGL variations.

We also need to use _mesa_has_EXT_texture_compression_s3tc() instead of
checking the driver cap directly, otherwise we end up enabling this on
OpenGL ES 1.x, as the API isn't checked.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
d70cfb322a mesa/main: use _mesa_has_FOO_bar for compressed format checks
_mesa_has_FOO_bar() knows about the APIs these extensions should be
supported under, so let's use that to simplify these checks a bit.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
70bfd31287 mesa/main: clean up integer texture check
This makes the logic a little bit easier to follow, and reduce a bit of
repetition.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
5109742e7b mesa/main: clean up ES2_compatibility check
This makes the logic a little bit easier to follow; this is *either*
about ES2 compatibility *or* about gles. GL_RGB565 was added already in
OpenGL ES 1.0.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
2e753b77dd mesa/main: clean up OES_texture_float_linear check
Using the _mesa_has_FOO_bar helpers is generally more safe and should
generally be prefered over checking driver-caps like this code did,
because the _mesa_has_FOO_bar helpers also verify the API type and
version.

This shouldn't have any practical effect here, as this function only
gets called for OpenGL ES 3.x right now. But if this was to change in
the future, this makes the function behave a lot more predictable.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
1373d117c2 mesa/main: clean up S3_s3tc check
S3_s3tc is the extension that enables this functionality on desktop, so
let's check for that one. The _mesa_has_S3_s3tc() helper already
verifies the API according to the extension-table.

As for the second hunk, we currently already only expose
EXT_texture_compression_s3tc on desktop so by using the helper instead,
we get rid of this detail here, and once we enable it for GLES we'll
automaticall get the interaction right.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
e8b331ae13 mesa/main: rename format-check function
_mesa_es3_error_check_format_and_type isn't specific to OpenGL ES 3.x,
it applies to all versions of OpenGL ES. So let's rename it to reflect
this.

While we're at it, let's also rename a helper function it uses similarly.
As the helper is static, we can also remove the namespacing-prefix from
the name.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:44 +01:00
Erik Faye-Lund
ca8e2a5277 mesa/main: make _mesa_has_tessellation return bool
All other _mesa_has_foo functions return bool rather than GLboolean, so
let's follow that style here as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-03 18:16:43 +01:00
Chad Versace
3ef0ca65c9 i965: Fix -Wswitch on INTEL_COPY_STREAMING_LOAD
The warning is emitted when building without INLINE_SSE41.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-03 13:07:56 +02:00
Karol Herbst
fc0139d283 nv50,nvc0: Fix gallium nine regression regarding sampler bindings
The new approach is that samplers don't get unbound even if they won't be used
in a draw and we should just leave them be as well.

Fixes a regression in multiple windows games using gallium nine and nouveau.

v2: adjust num_samplers to keep track of the highest sampler bound
v3: rework how to set the new value of num_samplers

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106577
Fixes: 4d6fab245e
       "cso: don't track the number of sampler states bound"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-02 00:05:04 +01:00
Andre Heider
b6f095f7ce d3dadapter9: use snprintf(..., "%s", ...) instead of strncpy
Fixes -Wstringop-truncation compiler warnings.
See f836d799f9 "intel/decoder: use snprintf(..., "%s", ...) instead of strncpy"

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-12-01 21:32:53 +01:00
Mauro Rossi
37a2072e97 android: st/mesa: fix building error due to sched_getcpu()
Android has cpufeatures library but pinning of threads is not supported
PIPE_OS_LINUX code path causes build error due to sched_getcpu() unavailable
thus we need to avoid setting HAVE_SCHED_GETCPU for Android

Fixes: 48f2160 ("st/mesa: regularly re-pin driver threads to the CCX where the app thread is")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-01 10:15:58 +01:00
Vinson Lee
4f74580d30 st/xvmc: Add X11 include path.
This patch fixes this build error.

  CC       tests/xvmc_bench.o
In file included from tests/xvmc_bench.c:35:
tests/testlib.h:38:10: fatal error: 'X11/Xlib.h' file not found
         ^~~~~~~~~~~~

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-30 22:09:43 -08:00
Mauro Rossi
eed3f1121c android: amd/addrlib: update Mesa's copy of addrlib
Needed to fix build error in addrlib in mesa for Android

Fixes: 776b911 ("amd/addrlib: update Mesa's copy of addrlib")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-12-01 01:13:53 +01:00
Gurchetan Singh
89b4798c06 virgl: don't mark buffers as unclean after a write
We can mark the buffer unclean if it's ever bound as a TBO,
SSBO, ABO, or image.

This improves

dEQP-GLES3.performance.buffer.data_upload.function_call.map_buffer_range.new_specified_buffer.flag_write_full.stream_draw

from 9.58 MB/s to 451.17 MB/s.

v2: Track buffer cleanliness as a function of bindings (Ilia).
v3: virgl_modify_clean --> virgl_dirty_res (Erik)

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-11-30 12:21:01 +01:00
Gurchetan Singh
d18492c64f virgl: avoid large inline transfers
We flush everytime the command buffer (16 kB) is full, which is
quite costly.

This improves

dEQP-GLES3.performance.buffer.data_upload.function_call.buffer_data.new_buffer.usage_stream_draw

from 111.16 MB/s to 1930.36 MB/s.

In addition, I made the benchmark produce buffers from 0 --> VIRGL_MAX_CMDBUF_DWORDS * 4,
and tried ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 2), ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 4), etc.

I didn't notice any clear differences, so let's just go with the most obvious
heuristic.

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-11-30 12:20:41 +01:00
Gurchetan Singh
c0773315af virgl: quadruple command buffer size
Tested running WebGL aquarium on Nvidia host (10,000 fishes)

This moves us from 7 fps to 9 fps.  After quadrupling, performance
gains diminish.

v2: Remove change ID (Erik)

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-11-30 12:20:06 +01:00
Lionel Landwerlin
37f9788e9a anv: flush pipeline before query result copies
Pipeline state pending bits should be taken into account when copying
results.

In the particular bug below, the results of the
vkCmdCopyQueryPoolResults() command was being overwritten by the
preceding vkCmdCopyBuffer() with a same destination buffer. This is
because we copy the buffers using the 3D pipeline whereas we copy the
query results using the command streamer. Those pieces of HW work in
parallel and the results are somewhat undefined.

v2: Unconditionally flush the pipeline before copying the results
    (Jason)

v3: Wrap & expressions (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108894
Cc: mesa-stable@lists.freedesktop.org
2018-11-29 22:07:31 +00:00
Marek Olšák
39b20b7d4f Revert "winsys/amdgpu: overallocate buffers for faster address translation on Gfx9"
I didn't mean to push this. I don't think it makes any difference.

This reverts commit f737fe00a0.
2018-11-29 14:46:06 -05:00
Roland Scheidegger
fbf95ce074 draw: fix infinite loop in line stippling
The calculated length of a line may be infinite, if the coords we
get are bogus. This leads to an infinite loop in line stippling.
To prevent this test for this explicitly (although technically
on at least x86 sse it would actually work without the explicit
test, as long as we use the int-converted length value).
While here also get rid of some always-true condition.

Note this does not actually solve the root cause, which is that
the coords we receive are bogus after clipping. This seems a difficult
problem to solve. One issue is that due to float arithmetic, clip w
may become 0 after clipping if the incoming geometry is
"sufficiently degenerate", hence x/y/z ndc (and window) coords will
be all inf (or nan). Even with w not quite 0, I believe it's possible
we produce values which are actually outside the view volume.
(Also, x=y=z=w=0 coords in clipspace would be not considered subject
to clipping, and similarly result in all NaN coords.) We just hope for
now other draw stages (and rasterizers) can handle those relatively
safely (llvmpipe itself should be sort of robust against this, certainly
converstion to fixed point will produce garbage, it might fail a couple
assertions but should neither hang nor crash otherwise).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-11-29 18:39:40 +01:00
Józef Kucia
94bfb8bf38 nir: Fix assert in print_intrinsic_instr().
Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-29 16:29:37 +00:00
Nicolai Hähnle
776b911365 amd/addrlib: update Mesa's copy of addrlib
Update to the internal master as of 2018-11-15.

This has a lot of gratuitous whitespace change, but on the plus
side it's built using the same tooling that's used for AMDVLK,
which should help going forward.
2018-11-29 13:18:24 +01:00
Nicolai Hähnle
621c107760 ac/surface/gfx9: let addrlib choose the preferred swizzle kind
Our choices here are simply redundant as long as sin.flags is set
correctly.

(v2:
- remove unused function parameter)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-29 13:18:23 +01:00
Nicolai Hähnle
729ebdf07e radv: remove dependency on addrlib gfx9_enum.h
v2:
- use SI_CONTEXT_REG_OFFSET

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-29 13:18:23 +01:00
Thomas Hellstrom
058f85d41c winsys/svga: Fix a memory leak
The ioctl.cap_3d member was never freed.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-29 10:42:06 +01:00
Thomas Hellstrom
7fce3ca375 st/xa: Fix a memory leak
Free the context after destruction.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-29 10:42:06 +01:00
Samuel Pitoiset
cc7deb749c radv: drop few useless state changes when doing color/depth decompressions
Viewport/scissor don't need to be updated for array textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:55 +01:00
Samuel Pitoiset
6d4f65deea radv: remove unused pending_clears param in the transition path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:53 +01:00
Samuel Pitoiset
4b9df824f7 radv: optimize CmdClear{Color,DepthStencil}Image() for layered textures
If all layers are bound we can perform a fast color or depth clear
instead of iterating over all layers. This has the advantage
to avoid trashing the framebuffer for nothing if you we end up by
doing a fast clear when calling radv_clear_image_layer(), and
clearing all layers in one shot is obviously faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
7484bc894b radv: refactor the fast clear path for better re-use
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
f78ee19702 radv: simplify a check in emit_fast_color_clear()
Currently only true if RADV_PERFTEST=dccmsaa is set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
eca931a726 radv: add radv_can_fast_clear_{color,depth}() helpers
For further optimisations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
93f5ce8fa7 radv: add radv_image_view_can_fast_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
aeaf8dbd09 radv: add radv_image_can_fast_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
3e718db1ff radv: remove useless check in emit_fast_color_clear()
The driver doesn't support DCC/CMASK for mipmapped textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Vinson Lee
d0c7b079d0 freedreno: Fix autotools build.
Fix build error.

  CXXLD    pipe_msm.la
../../../../src/gallium/drivers/freedreno/.libs/libfreedreno.a(freedreno_batch.o): In function `batch_init':
src/gallium/drivers/freedreno/freedreno_batch.c:54: undefined reference to `fd_device_version'
src/gallium/drivers/freedreno/freedreno_batch.c:59: undefined reference to `fd_submit_new'
src/gallium/drivers/freedreno/freedreno_batch.c:61: undefined reference to `fd_submit_new_ringbuffer'
src/gallium/drivers/freedreno/freedreno_batch.c:64: undefined reference to `fd_submit_new_ringbuffer'
src/gallium/drivers/freedreno/freedreno_batch.c:66: undefined reference to `fd_submit_new_ringbuffer'
src/gallium/drivers/freedreno/freedreno_batch.c:70: undefined reference to `fd_submit_new_ringbuffer'

Fixes: b4476138d5 ("freedreno: move drm to common location")
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-11-28 22:23:52 -08:00
Marek Olšák
075fd5d8f2 radeonsi: add memory management stress tests for GDS
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
c1d3c08699 winsys/amdgpu: add support for allocating GDS and OA resources
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
d7a4fa91f0 radeonsi: allow si_cp_dma_clear_buffer to clear GDS from any IB
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
72b2b61d8c winsys/amdgpu: use optimal VM alignment for CPU allocations
Acked-by: Christian König <christian.koenig@amd.com>
2018-11-28 20:20:27 -05:00
Marek Olšák
27f9935075 winsys/amdgpu: use optimal VM alignment for imported buffers
Window system buffers didn't use the optimal alignment.

Acked-by: Christian König <christian.koenig@amd.com>
2018-11-28 20:20:27 -05:00
Marek Olšák
6b554d863f winsys/amdgpu,radeon: pass vm_alignment to buffer_from_handle
Acked-by: Christian König <christian.koenig@amd.com>
2018-11-28 20:20:27 -05:00
Marek Olšák
f737fe00a0 winsys/amdgpu: overallocate buffers for faster address translation on Gfx9
Sadly, the 3 games I tested (DeusEx:MD, DiRT Rally, DOTA 2) are unaffected
by the overallocation, because I guess their buffers don't fall into
the small range below a power-of-two size.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
8c00f778fc winsys/amdgpu: increase the VM alignment to the MSB of the size for Gfx9
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
a2a6b06d48 winsys/amdgpu: use >= instead of > for VM address alignment
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
98f2312b4f winsys/amdgpu: clean up code around BO VM alignment
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Marek Olšák
5f9ccf827e winsys/amdgpu: optimize slab allocation for 2 MB amdgpu page tables
- the slab buffer size increased from 128 KB to 2 MB (PTE fragment size)
- the max suballocated buffer size increased from 64 KB to 256 KB,
  this increases memory usage because it wastes memory
- the number of suballocators increased from 1 to 3 and they are layered
  on top of each other to minimize unused space in slabs

The final increase in memory usage is:
  DeusEx:MD:  1.8%
  DOTA 2:     1.75%
  DiRT Rally: 0.2%

The kernel driver will also receive fewer buffers.
2018-11-28 20:20:27 -05:00
Marek Olšák
cf6835485c radeonsi: generalize the slab allocator code to allow layered slab allocators
There is no change in behavior. It just makes it easier to change the number
of slab allocators.
2018-11-28 20:20:27 -05:00
Marek Olšák
9576266a37 winsys/amdgpu: always reclaim/release slabs if there is not enough memory 2018-11-28 20:20:27 -05:00
Marek Olšák
015061beb3 radeonsi: fix is_oneway_access_only for bindless images 2018-11-28 20:20:27 -05:00
Marek Olšák
8c25ab1a23 radeonsi/nir: parse more information about bindless usage
fill more tgsi_shader_info fields.
2018-11-28 20:20:27 -05:00
Marek Olšák
2a936f8afa tgsi/scan: add more information about bindless usage
radeonsi will use this.
2018-11-28 20:20:27 -05:00
Marek Olšák
fba91b5173 radeonsi: small cleanup for memory opcodes 2018-11-28 20:20:27 -05:00
Marek Olšák
709905cbb6 radeonsi: fix is_oneway_access_only for image stores
We need to look at the Dst for image stores.
2018-11-28 20:20:27 -05:00
Marek Olšák
648dc52367 radeonsi: use structured buffer intrinsics for image views
to stop using the workaround in si_make_buffer_descriptor.
2018-11-28 20:20:27 -05:00
Marek Olšák
442dae2693 radeonsi: clean up primitive binning enablement
no change in behavior.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-28 20:20:27 -05:00
Dave Airlie
8eb8be3f54 virgl: fix undefined shift to use unsigned.
Ported from virglrenderer.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-11-29 09:09:31 +10:00
Dave Airlie
2ddd44d941 r600: make suballocator 256-bytes align
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108311
Cc: <mesa-stable@lists.freedesktop.org>
2018-11-29 09:09:02 +10:00
Kenneth Graunke
f11780779f intel/compiler: Use nir's info when checking uses_streams.
Vulkan and Gallium don't use Mesa's gl_program data structure, so they
can't poke at 'prog'.  But we can simply use the copy of the shader info
stored with the NIR shader, which is guaranteed to exist.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-28 13:35:29 -08:00
Jason Ekstrand
199a0353d6 nir/derefs: Add a nir_derefs_do_not_alias enum value
This makes some of the code more clear.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-11-28 14:29:25 -06:00
Gurchetan Singh
eb44c36cf1 egl: add missing #include <stddef.h> in egldevice.h
Otherwise, I get this error:

main/egldevice.h:54:13: error: ‘NULL’ undeclared (first use in this function)
       dev = NULL;
             ^~~~
with this config:

./autogen.sh --enable-gles1 --enable-gles2 --with-platforms='surfaceless' --disable-glx
             --with-dri-drivers="i965" --with-gallium-drivers="" --enable-gbm

v3: Use stddef.h (Matt)
v4: Modify commit message (Eric)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-28 11:22:47 -08:00
Matt Turner
2d48d5116b gallivm: Use nextafterf(0.5, 0.0) as rounding constant
The common truncf(x + 0.5) fails for the floating-point value just less
than 0.5 (nextafterf(0.5, 0.0)). nextafterf(0.5, 0.0) + 0.5, after
rounding is 1.0, thus truncf does not produce the desired value.

The solution is to add nextafterf(0.5, 0.0) instead of 0.5 before
truncating. This works for all values.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-11-28 11:22:47 -08:00
Juan A. Suarez Romero
e2ad94d928 docs: update calendar, add news item and link release notes for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-11-28 19:20:09 +01:00
Juan A. Suarez Romero
a53a280479 docs: add sha256 checksums for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cfd1f8b92c)
2018-11-28 19:20:09 +01:00
Juan A. Suarez Romero
f6ab6e2867 docs: add release notes for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3e741344d7)
2018-11-28 19:20:09 +01:00
Nicolai Hähnle
c02390f8fc egl/wayland: rather obvious build fix
Fixes: ce74a7bb8d ("egl/wayland: plug memory leak in drm_handle_device()")
Fixes: c59d3aa4b9 ("egl/wayland: bail out when drmGetMagic fails")
2018-11-28 18:30:36 +01:00
Nicolai Hähnle
eb94b6bd5c winsys/amdgpu: explicitly declare whether buffer_map is permanent or not
Introduce a new driver-private transfer flag RADEON_TRANSFER_TEMPORARY
that specifies whether the caller will use buffer_unmap or not. The
default behavior is set to permanent maps, because that's what drivers
do for Gallium buffer maps.

This should eliminate the need for hacks in libdrm. Assertions are added
to catch when the buffer_unmap calls don't match the (temporary)
buffer_map calls.

I did my best to update r600 for consistency (r300 needs no changes
because it never calls buffer_unmap), even though the radeon winsys
ignores the new flag.

As an added bonus, this should actually improve the performance of
the normal fast path, because we no longer call into libdrm at all
after the first map, and there's one less atomic in the winsys itself
(there are now no atomics left in the UNSYNCHRONIZED fast path).

Cc: Leo Liu <leo.liu@amd.com>
v2:
- remove comment about visible VRAM (Marek)
- don't rely on amdgpu_bo_cpu_map doing an atomic write
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-28 18:24:14 +01:00
Nicolai Hähnle
35eb81987c winsys/amdgpu: add amdgpu_winsys_bo::lock
We'll use it in the upcoming mapping change. Sparse buffers have always
had one.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-28 18:23:29 +01:00
Eric Engestrom
e0f1f74eda vulkan/wsi: fix s/,/;/ typo
Fixes: 59e58c348e "vulkan/wsi: Only wait on semaphores on the first swapchain"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-28 16:44:01 +00:00
Emil Velikov
ce74a7bb8d egl/wayland: plug memory leak in drm_handle_device()
As we fail to open the node, we leak the node/device name.

v2: Log and then free() (Eric)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-28 16:12:12 +00:00
Emil Velikov
c59d3aa4b9 egl/wayland: bail out when drmGetMagic fails
Currently as the function fails, we pass uninitialized data to the
authentication function. Stop doing that and print an warning when
the function fails.

v2: Plug memory leak in error path (Eric)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-28 16:11:22 +00:00
Eric Engestrom
9575cd2893 wsi/display: fix mem leak when freeing swapchains
Fixes: da997ebec9 "vulkan: Add KHR_display extension using DRM [v10]"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2018-11-28 12:09:54 +00:00
Gert Wollny
f08d107054 i965: Set the FBO error state INCOMPLETE_ATTACHMENT only for SRGB_R8
Originally the driver reported GL_FRAMEBUFFER_UNSUPPORTED in all cases,
adding more specific error messages was not correct and broke many tests.
Mostly revert this and only report GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT
for MESA_FORMAT_R_SRGB8.

Fixes: ebcde34545
  i965: be more specific about FBO completeness errors

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108805

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-28 10:12:47 +01:00
Gert Wollny
d8bb88d0b4 i965: Explicitely handle swizzles for MESA_FORMAT_R_SRGB8
The format is emulated by using ISL_FORMAT_L8_SRGB, therefore we need to
force swizzles for the GBA channels. However, doing this only based on the
data type GL_RED breaks other formats, therefore, test specifically for the
format.

Fixes: c5363869d4
  i965: Force zero swizzles for unused components in GL_RED and GL_RG
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-28 10:07:02 +01:00
Gert Wollny
091295d7cb virgl: Don't try handling server fences when they are not supported
vtest doesn't implement the according API and would segfault:

Program received signal SIGSEGV, Segmentation fault.
  #0  0x0000000000000000 in ?? ()
  #1  in virgl_fence_server_sync  at
       src/gallium/drivers/virgl/virgl_context.c:1049
  #2  in st_server_wait_sync  at
       src/mesa/state_tracker/st_cb_syncobj.c:155

so just don't do the call when the function pointers are not set.

Fixes dEQP:
  dEQP-GLES3.functional.fence_sync.wait_sync_smalldraw
  dEQP-GLES3.functional.fence_sync.wait_sync_largedraw

Fixes: d1a1c21e76
  virgl: native fence fd support

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
2018-11-28 10:02:31 +01:00
Gert Wollny
073fdd7382 virgl,vtest: Initialize return value
Avoids:
Conditional jump or move depends on uninitialised value(s)
  at 0x9E2B39F: virgl_vtest_winsys_resource_cache_create (virgl_vtest_winsys.c:379)
  by 0x9E2725F: virgl_buffer_create (virgl_buffer.c:169)
  by 0x9E246D5: virgl_resource_create (virgl_resource.c:60)
  by 0xA0C1B9F: bufferobj_data (st_cb_bufferobjects.c:344)
  by 0xA0C1B9F: st_bufferobj_data (st_cb_bufferobjects.c:390)
  by 0x9F4ACE3: vbo_use_buffer_objects (vbo_exec_api.c:1136)
  by 0xA0C68C3: st_create_context_priv (st_context.c:416)
  by 0xA0C707A: st_create_context (st_context.c:598)
  by 0x9F81C6B: st_api_create_context (st_manager.c:918)
  by 0x9BBE591: dri_create_context (dri_context.c:161)
  by 0x9BB6931: driCreateContextAttribs (dri_util.c:473)
  by 0x4E97A44: drisw_create_context_attribs (drisw_glx.c:630)
  by 0x4E7C591: glXCreateContextAttribsARB (create_context.c:78)
Uninitialised value was created by a stack allocation
  at 0x9E2B249: virgl_vtest_winsys_resource_cache_create (virgl_vtest_winsys.c:342)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
2018-11-28 10:02:31 +01:00
Iago Toral Quiroga
e55cbf26ea intel/compiler: fix register allocation in opt_peephole_sel
This wasn't handling 64-bit cases properly. Found by inspection.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-11-28 08:28:27 +01:00
Matt Turner
6f737b9207 glsl: Remove unused member variable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-27 22:29:53 -08:00
Matt Turner
1a210268b8 nir: Call fflush() at the end of nir_print_shader()
We normally call with stderr which is unbuffered, so this won't affect
that, but it does let me call nir_print_shader(nir, fopen("log", "w+"))
from gdb and actually get the whole shader in my file.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-27 22:29:53 -08:00
Eric Anholt
e113b21cb7 v3d: Add renderonly support.
I've been using this with the kmsro series to test v3d on VKMS without my
old KMS hack in the v3d kernel driver.  KMSRO still needs some cleanup,
but v3d RO support seems reasonable.
2018-11-27 15:03:02 -08:00
Eric Anholt
55edafa73e gallium: Remove unused variable in u_tests.
Fixes: 0d17b685b1 ("gallium/u_tests: add a compute shader test that clears an image")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-27 15:02:57 -08:00
Bas Nieuwenhuizen
6569644bb6 radv: Align large buffers to the fragment size.
Improves performance in Talos by about 15% (and significant improvements
in RotR and possibly other but did not bench with final patch) on
kernel 4.19 and earlier.

On 4.20+ a similar effect comes from

433ca054949a "drm/amdgpu: try allocating VRAM as power of two"

v2: Do not impact the alignment of the physical memory.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
2018-11-27 22:17:42 +01:00
Hyunjun Ko
76945e4140 freedreno: implements get_sample_position
Since 1285f71d3e landed, it needs to provide apps with proper sample
position for MSAA.

Currently no way to query this to hw, these are taken from blob driver.

Fixes: dEQP-GLES31.functional.texture.multisample.samples_#.sample_position
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Rob Clark
5973a4d0b7 freedreno/a3xx: also set FSSUPERTHREADENABLE
We set equiv bit in SP_FS_CTRL_REG0.  Somehow the hw doesn't hang with
this mismatched config, but does run slower.  It is faster with either
neither bit set, or both bits set, but both is the fastest of the three
configurations.  Worth a bit over 10% gain in glmark2.

Spotted-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
e68cd91251 freedreno: use MSM_BO_SCANOUT with scanout buffers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2018-11-27 15:44:03 -05:00
Jonathan Marek
3ed4aad524 freedreno: use GENERIC instead of TEXCOORD for blit program
blip_fp uses GENERIC as input, so blit_vp should match for linking

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
3a273a4abc freedreno: a2xx texture update
Adds all missing texture related logic. For everything to work it also
needs changes to ir2/fd2_program, which are part of the ir2 update patch.

Note: it needs rnndb update

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
[remove stray patch]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
4887aba638 freedreno/a2xx: Compute depth base in gmem correctly
Note: it needs rnndb update

Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
e7114575f7 freedreno/a2xx: set VIZ_QUERY_ID on a20x
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
a50b8a0152 freedreno: add missing a20x ids
200: 256KiB GMEM A200 (imx53)
201: 128KiB GMEM A200 (imx51)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
4e6ee033ff freedreno/a2xx: fix POINT_MINMAX_MAX overflow
As it stands, it overflows to zero.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:03 -05:00
Jonathan Marek
78fede86d9 freedreno: a2xx: fd2_draw update
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Jonathan Marek
3e7186d472 nir: add fceil lowering
lowers ceil(x) as -floor(-x)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
11593f9041 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
d47d77d49d freedreno/a6xx: set guardband clip
On older gens, the CLIP_ADJ bitfields were actually 3.6 fixed point.
Which might make more sense.  Although this formula comes up with values
pretty close to what blob does for various viewport sizes (for at least
a5xx and a6xx), and seems to work.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
2773919f06 freedreno/a6xx: disable LRZ for z32
f6131d4ec7 had the side effect of enabling LRZ w/ 32b depth buffers.
But there are some bugs with this, which aren't fully understood yet,
so for now just skip LRZ w/ z32..

Fixes: f6131d4ec7 freedreno/a6xx: Clear z32 and separate stencil with blitter
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Kristian H. Kristensen
9595be67a9 freedreno/a6xx: Clear gmem buffers at flush time
We generate an IB to clear the gmem at flush time and jump to it
before rendering each tile. This lets us get rid of the command stream
patching for gmem offsets.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Kristian H. Kristensen
b5a9bb28c6 freedreno/a6xx: Move resolve blits to an IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Kristian H. Kristensen
5f068cf3b0 freedreno/a6xx: Move restore blits to IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
09300bbe03 mesa/st: better colormask check for clear fallback
For RGB surfaces (for example) we don't really care that the colormask
is 0x7 instead of 0xf.  This should not trigger clear_with_quad()
slowpath.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-27 15:44:02 -05:00
Rob Clark
65cee01430 mesa/st: swap order of clear() and clear_with_quad()
If we can't clear all the buffers with pctx->clear() (say, for example,
because of ColorMask), push the buffers we *can* clear with pctx->clear()
first.  Tilers want to see clears coming before draws to enable fast-
paths, and clearing one of the attachments with a quad-draw first
confuses that logic.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-27 15:44:02 -05:00
Rob Clark
aa0fed10d3 freedreno: move ir3 to common location
Move (most of) the ir3 compiler to src/freedreno/ir3 so that it can be
re-used by some future vulkan driver.  The parts that are gallium
specific have been refactored out and remain in the gallium driver.

Getting the move done now so that it can happen before further
refactoring to support a6xx specific instructions.

NOTE also removes ir3_cmdline compiler tool from autotools build since
that was easier than fixing it and I normally use meson build.  Waiting
patiently for the day that we can remove *everything* from the autotools
build.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
556eec249d freedreno/ir3: remove u_inlines usage
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
312eae45a3 freedreno/ir3: split up ir3_shader
Split the parts that are gallium specific into ir3_gallium so the rest
can move to a common location outside of gallium.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
ea4cbf601d freedreno/ir3: remove pipe_stream_output_info dependency
A bit annoying to have to copy into our own struct.  But this is
something the compiler really needs to know, at least on earlier
generations where streamout is implemented in shader.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
030e98630d freedreno/ir3: some header file cleanup
Clean up some of the low-hanging-fruit usages of freedreno_util.h

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
2482153d52 freedreno/ir3: use env_var_as_unsigned()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
a321f939f6 util: env_var_as_unsigned() helper
So I can drop env2u() helper from freedreno_util.h and get rid of one
small ir3 dependency on gallium/freedreno

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
bfd8d26372 freedreno/ir3: move disasm and optmsgs debug flags
Move them to IR3_SHADER_DEBUG so we can remove ir3's dependency on
fd_mesa_debug.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
424d75656f freedreno: FD_SHADER_DEBUG -> IR3_SHADER_DEBUG
Only used by ir3, so move it into ir3 to be more self contained.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
8a654f092e freedreno: remove shader_stage_name()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
c635703c50 freedreno: shader_t -> gl_shader_stage
Just massive search/replace for the most part.

Step towards removing ir3 dependency on disasm.h which is shared by
a2xx.  One step closer to being able to move ir3 out of gallium.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
388aac32ed freedreno/ir3: standalone compiler updates
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
b4476138d5 freedreno: move drm to common location
So that we can re-use at least parts of it for vulkan driver, and so
that we can move ir3 to a common location (which uses fd_bo to allocate
storage for shaders)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark
6cb74eb4f1 freedreno/drm: remove dependency on gallium driver
Prep work to move drm to a common location.

Slightly hacky, but the softpin debug flag is only temporary.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Dylan Baker
88c4680b5a util: promote u_memory to src/util
as well as os_memory*
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Eric Anholt
bade179153 gallium: Fix uninitialized variable warning in compute test.
The compiler doesn't know that ny != 0, so x might be uninitialized for
the printf at the end.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-11-27 11:23:22 -08:00
Bas Nieuwenhuizen
08ea6b9d9b radv: Clamp gfx9 image view extents to the allocated image extents.
Mirrors AMDVLK. Looks like if we go over the alignment of height
we actually start to change the addressing. Seems like the extra
miplevels actually work with this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108245
Fixes: f6cc15dccd "radv/gfx9: fix block compression texture views. (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-27 10:19:52 +01:00
Iago Toral Quiroga
453570cd8c intel/compiler: fix indentation style in opt_algebraic() 2018-11-27 09:53:09 +01:00
Anuj Phogat
16e4911972 anv/icl: Set use full ways in L3CNTLREG
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Anuj Phogat
3f55fd3814 intel/icl: Set way_size_per_bank to 4
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Anuj Phogat
3ce04da5b4 i965/icl: Set use full ways in L3CNTLREG
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Anuj Phogat
3282c7be89 i965/icl: Fix L3 configurations
Use L3 configuration specified in h/w specification.

V2: Drop configs which do under allocation of l3 cache.
    Bump up the comment above table.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Eric Engestrom
c0c533767e build: stop defining unused VERSION
Scons and autotools don't define it, and as of last commit nothing
uses it.

`VERSION` is also a generic enough name that something somewhere will
eventually clash, and we don't want to repeat the LLVM `DEBUG` fiasco.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-26 22:05:02 +00:00
Eric Engestrom
bd12e02530 vulkan/utils: s/VERSION/PACKAGE_VERSION/
Everything else uses PACKAGE_VERSION, so let's be consistent, and
VERSION and PACKAGE_VERSION are currently defined to be the same in
meson and android, while VERSION is undefined in autotools and scons.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-26 22:05:02 +00:00
Eric Engestrom
56d126f8fd anv: correctly use vulkan 1.0 by default
Per chapter 3.2 "Instances":
> Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing
> an apiVersion of 0 is equivalent to providing an apiVersion of
> VK_MAKE_VERSION(1,0,0).

Reported-by: Niklas Haas <git@haasn.xyz>
Fixes: 8c048af589 "anv: Copy the appliation info into the instance"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-26 22:05:02 +00:00
Erik Faye-Lund
d6d35d87f1 mesa/main: fixup requirements for GL_PRIMITIVES_GENERATED
This enum is also allowed by EXT_tessellation_shader, which is supported
on older i965 HW (as opposed to OES_geometry_shader). This was missed
when narrowing this code-path, leading to dEQP regressions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108868
Fixes: f09d94fbd1 "mesa/main: fix validation of transform-feedback queries"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-11-26 22:12:07 +01:00
Erik Faye-Lund
c120dbfe4d mesa/main: fix incorrect depth-error
If glGetTexImage or glGetnTexImage is called with a level that doesn't
exist, we get an error message on this form:

Mesa: User error: GL_INVALID_VALUE in glGetTexImage(depth = 0)

This is clearly nonsensical, because these APIs don't even have a
depth-parameter. The reason is that get_texture_image_dims() return
all-zero dimensions for non-existent texture-images, and we go on to
validate these dimensions as if they were user-input, because
glGetTextureSubImage requires checking.

So let's split this logic in two, so glGetTextureSubImage can have
stricter input-validation. All arguments that are no longer validated
are generated internally by mesa, so there's no use in validating them.

Fixes: 42891dbaa1 "gettextsubimage: verify zoffset and depth are correct"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
38af69adfa mesa/main: check cube-completeness in common code
This check is the only part of dimensions_error_check that isn't about
error-checking the offset and size arguments of
glGet[Compressed]TextureSubImage(), so it doesn't really belong in here.

This doesn't make a difference right now, apart for changing the
presedence of this error. But it will make a difference  for the next
patch, where we no longer call this method from the non-sub tex-image
getters.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
42820c5727 mesa/main: factor out common error-checking
This error checking is the same for teximage and texsubimage getters, so
let's factor it out to its own function.

This will be useful when getteximage and gettexsubimage gets their own
error checking routines a bit later.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
5e0a84f31c mesa/main: factor out tex-image error-checking
This will be useful when we split error-checking for getteximage and
gettexsubimage later.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
38bbb61252 mesa/main: remove bogus error for zero-sized images
The explanation quotes the spec on the following wording to justify the
error:

"An INVALID_VALUE error is generated if xoffset + width is greater than
 the texture’s width, yoffset + height is greater than the  texture’s
 height, or zoffset + depth is greater than the texture’s depth."

However, this shouldn't generate an error in the case where *all three*
of width, xoffset and the texture's width are zero. In this case, we end
up generating an unspecified error.

So let's remove this check, and instead make sure that we consider this
as an empty texture.

So let's not generate an error, there's non mandated in the spec in
xoffset/yoffset/zoffset = 0 case. We already avoid doing any work in
this case, because of the final, non-error generating check in this
function.

Fixes: b37b35a5d2 "getteximage: assume texture image is empty for non defined levels"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Erik Faye-Lund
f1998e15ff mesa/main: remove ARB suffix from glGetnTexImage
This function has been core since OpenGL 4.3, so naming the
implementation and reporting erros using an ARB-suffix can be
confusing.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-26 12:29:54 +01:00
Gert Wollny
f5d053702f glsl: free or reuse memory allocated for TF varying
When a shader program is de-serialized the gl_shader_program passed in
may actually still hold memory allocations for the transform feedback
varyings. If that is the case, free the varying names and reallocate
the new storage for the names array.

This fixes a memory leak:
Direct leak of 48 byte(s) in 6 object(s) allocated from:
 in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
 in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:875
 in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985
 ...
Indirect leak of 42 byte(s) in 6 object(s) allocated from:
  in __interceptor_strdup (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0x761c8)
  in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:887
  in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985

Fixes: ab2643e4b0
   glsl: serialize data from glTransformFeedbackVaryings

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-26 09:58:25 +01:00
Bas Nieuwenhuizen
3c96a1e3a9 radv: Fix opaque metadata descriptor last layer.
We used the layer count which results in an off by one error.

Not sure this really affects anything.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-26 09:29:39 +01:00
Mathias Fröhlich
ff466c2d48 mesa/st: Make st_pipe_vertex_format static.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:10 +01:00
Mathias Fröhlich
2a3eae82a1 mesa/st: Use binding information from the VAO in feedback rendering.
Use VAO binding information in feedback rendering. In theory
it should reduce the amount of buffer objects scheduled for rendering.
Feedback rendering is implemented in a crude way anyhow, so I do not
expect much gain here. But for the sake of code reuse we should
use the same code for the same task. And finally if feeback rendering
may get improved the array setup is already well done there.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:10 +01:00
Mathias Fröhlich
a00a8fb8d1 mesa/st: Avoid extra references in the feedback draw function scope.
The change removes the reference that is held on the entries of the
vbuffers[] array. The new code does not do that anymore as following
the code into draw_set_vertex_buffers() the draw context holds an
other reference as long as it is reset down the function again.
So it should be already by that argument save to remove that
additional reference count.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:10 +01:00
Mathias Fröhlich
6705188cc5 mesa/st: Factor out array and buffer setup from st_atom_array.c.
Factor out vertex array setup routines from the array state atom.
The factored functions will be used in feedback rendering in the
next change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:09 +01:00
Mathias Fröhlich
774d585d49 mesa/st: Only unmap the uploader that was actually used.
In st_atom_array, we only need to unmap the upload buffer that
was actually used.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:09 +01:00
Mathias Fröhlich
65332aff29 mesa/st: Only care about the uploader if it was used.
In st_atom_array, we only need to care for unmapping the upload buffer
if we actually used it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-26 07:57:09 +01:00
Ilia Mirkin
927ce66b39 nv50/ir: remove dnz flag when converting MAD to ADD due to optimizations
dnz flag only applies for multiplications (e.g. to make 0 * Infinity
becomes 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz
flag no longer makes sense, and upsets the GM107 emitter (since it looks
at the ftz and dnz flags together).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-24 22:15:53 -05:00
Marek Olšák
d4e7d8b7f0 winsys/amdgpu: fix a device handle leak in amdgpu_winsys_create
Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 17:08:44 -05:00
Marek Olšák
82aa07f81f winsys/amdgpu: fix a buffer leak in amdgpu_bo_from_handle
Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 17:08:42 -05:00
Samuel Pitoiset
9fc1ce258c radv: ignore subpass self-dependencies for CreateRenderPass() too
We really need to refactor this...

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 11:59:11 +01:00
Samuel Pitoiset
2951a766bd radv: remove useless sync before CmdClear{Color,DepthStencil}Image()
We don't need to flush anything before these two commands as well.
This is because they have to be externally synchronized, so the
app should have called CmdPipelineBarrier() prior to that and the
driver should have flushed the caches.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 11:59:08 +01:00
Erik Faye-Lund
a652842982 mesa/main: remove overly strict query-validation
The rules encoded in this code also applies to OpenGL ES 3.0 and up,
but the per-enum validation has already been taught about these rules.
So let's get rid of this duplicate, narrow version of the validation.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
d52be6dd29 mesa/main: fix validation of GL_TIMESTAMP
ctx->Extensions.ARB_timer_query is set based on the driver-
capabilities, not based on the context type. We need to check
against _mesa_has_ARB_timer_query(ctx) instead to figure out
if the extension is really supported. We also need to check for
EXT_disjoint_timer_query for GLES-support.

This shouln't have any functional effect, as this entry-point is only
valid on desktop GL, or on GLES with EXT_disjoint_timer_query in the
first place. But if this gets added to the core of a future version
of ES, this should be a step in the right direction.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
7a4d74c35a mesa/main: fix validation of ARB_query_buffer_object
ctx->Extensions.ARB_query_buffer_object is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_ARB_query_buffer_object(ctx) instead to figure out if the
extension is really supported.

This turns attempts to read queries into buffer objects on ES 3 into
errors, as required by the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
75e39b59dc mesa/main: fix validation of transform-feedback overflow queries
ctx->Extensions.ARB_transform_feedback_overflow_query is set based on
the driver-capabilities, not based on the context type. We need to
check against _mesa_has_RB_transform_feedback_overflow_query(ctx)
instead to figure out if the extension is really supported.

This turns usage of GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW and
GL_TRANSFORM_FEEDBACK_OVERFLOW into errors on ES 3, as required by the
spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
f09d94fbd1 mesa/main: fix validation of transform-feedback queries
ctx->Extensions.EXT_transform_feedback is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_EXT_transform_feedback(ctx) instead to figure out if the
extension is really supported. We also need to check for
OES_geometry_shader.

This turns usage of GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN into an
error on ES 2, as well as usage of GL_PRIMITIVES_GENERATED on ES 3, both
as required by the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
b551fe5fa7 mesa/main: fix validation of GL_TIME_ELAPSED
ctx->Extensions.EXT_timer_query is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_EXT_timer_query(ctx) instead to figure out if the extension
is really supported. We also need to check for
EXT_disjoint_timer_query, which enables the same functionality for ES.

This turns usage of GL_TIME_ELAPSED into an error on ES 3, as is
required by the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:36 +01:00
Erik Faye-Lund
059928e114 mesa/main: fix validation of GL_ANY_SAMPLES_PASSED_CONSERVATIVE
ctx->Extensions.ARB_ES3_compatibility is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_ARB_ES3_compatibility(ctx) instead to figure out if the
extension is really supported.

In addition, EXT_occlusion_query_boolean should also allow this
behavior.

This shouldn't cause any functional change, as all drivers that support
ES3_compatibility should in practice enable either ES3_compatibility or
EXT_occlusion_query_boolean under all APIs that export this symbol.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
8ea819dd60 mesa/main: fix validation of GL_ANY_SAMPLES_PASSED
ctx->Extensions.ARB_occlusion_query2 is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_ARB_occlusion_query2(ctx) instead to figure out if the
extension is really supported.

In addition, EXT_occlusion_query_boolean should also allow this
behavior.

This shouldn't cause any functional change, as all drivers that support
ARB_occlusion_query2 should in practice enable either
ARB_occlusion_query2 or EXT_occlusion_query_boolean under all APIs that
export this symbol.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
fff1738d57 mesa/main: fix validation of GL_SAMPLES_PASSED
ctx->Extensions.ARB_occlusion_query is set based on the driver-
capabilities, not based on the context type. We need to check against
_mesa_has_ARB_occlusion_query(ctx) instead to figure out if the
extension is really supported. We also need to check for
ARB_occlusion_query2, as ARB_occlusion_query isn't available in core
contexts.

This turns usage of GL_SAMPLES_PASSED into an error on ES 3, as is
required by the spec.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
9c13ad0ea4 mesa/main: simplify pipeline-statistics query validation
The _mesa_has_ARB_pipeline_statistics_query(ctx)-helper will already
check the GLES-version according to the extension-table, so if this
extension would ever be back-ported to ES, we only need to update the
table to support this.

This shouln't have any functional effect.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
dd4241b34f mesa/main: use non-prefixed enums for consistency
These enums all have the same values as their non-prefixed versions, and
there's several aliases for some of them. So let's switch to the
non-prefixed versions for simplicity.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
ba4e8d3754 mesa/main: correct year for EXT_occlusion_query_boolean
According to the extension spec, this was initially released in 2011,
so let's set this to the correct value.

The value of 2001 could be a copy-paste mistake, as ARB_occlusion_query
which this is based on was released then.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Erik Faye-Lund
35555b08d7 mesa/main: correct requirement for EXT_occlusion_query_boolean
EXT_occlusion_query_boolean require support for GL_ANY_SAMPLES_PASSED,
which ARB_occlusion_query doesn't supply. We need ARB_occlusion_query2
for this instead.

This is still not 100% accurate, as we also require support for the
GL_SAMPLES_PASSED_CONSERVATIVE target, which isn't guaranteed by either
ARB_occlusion_query nor ARB_occlusion_query2. But it should be trivial
to implement for any driver supporting ARB_occlusion_query2, as it can
simply be implemented as GL_ANY_SAMPLES_PASSED.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-23 10:48:35 +01:00
Tapani Pälli
09adaa4b89 anv: allow exporting an imported SYNC_FD semaphore type
Fixes issues with following SkQP tests:

   unitTest_VulkanHardwareBuffer_Vulkan_EGL_Syncs
   unitTest_VulkanHardwareBuffer_Vulkan_Vulkan_Syncs

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-23 07:49:46 +02:00
Eric Engestrom
896c59d690 glapi: add missing visibility args
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108829
Fixes: 3218056e0e "meson: Build i965 and dri stack"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-22 18:21:05 +00:00
Jason Ekstrand
a24654b49d anv/nir: Rework arguments to apply_pipeline_layout
Instead of taking a whole pipeline (which could be anything!), just take
a physical device and robust_buffer_access boolean.  This makes it
easier to verify that only the things in the hash actually affect
pipeline compilation.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-22 09:17:28 -06:00
Jason Ekstrand
617e402b3d anv: Put robust buffer access in the pipeline hash
It affects apply_pipeline_layout.  Shaders compiled with the wrong value
will work but they may not be robust as requested by the app.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-22 09:17:10 -06:00
Jason Ekstrand
a845c2bc10 anv: Expose VK_EXT_scalar_block_layout
Our compile already splits UBO loads into scalars and the untyped
surface read messages we use for SSBO reads and writes only require
dword alignment.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-22 08:16:47 -06:00
Jason Ekstrand
2ca9a4417d vulkan: Update the XML and headers to 1.1.93
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-22 08:16:40 -06:00
Samuel Pitoiset
4ff4af3d91 radv: remove useless sync after CmdClear{Color,DepthStencil}Image()
'post_flush' is only set to NULL for the normal clear path
(ie. only vkCmdClearColorImage() and vkCmdClearDepthStencilImage()
are affected commands).

Because these two operations have to be externally synchronized
with VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT,
it's useless to set those flags internallY.

VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle,
while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector
caches and L2. RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2 will be superseded
by RADV_CMD_FLAG_INV_GLOBAL_L2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-22 08:56:36 +01:00
Bas Nieuwenhuizen
33b2f74e77 vulkan: Allow storage images in the WSI.
Since apps also have to follow the ImageFormatProperties query,
we can disallow formats that don't allow image stores (for AMD
that would be SRGB formats).

Note that this only affects anything if the app actually decides
to use the flag.

Had someone ask for this on IRC and at least on the AMD side we
can support it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-21 21:36:55 +01:00
Axel Davy
1f1d4d571a st/nine: Remove thread_submit warning
thread_submit can be useful even without DRI_PRIME,
as it can help avoid missed pageflips.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2018-11-21 19:55:28 +01:00
Axel Davy
d304f0aa31 st/nine: Allow 'triple buffering' with thread_submit
The path allowing triple buffering behaviour wasn't implemented
yet for thread_submit

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2018-11-21 19:55:28 +01:00
Robert Foss
19af208c7d virgl: add assert and missing function parameter
Verify the pipe_fd_type to be of PIPE_FD_TYPE_NATIVE_SYNC.

Fixes: d1a1c21e76 "virgl: native fence fd support"

Suggested-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-21 15:59:00 +01:00
Gert Wollny
61b535437e r600: clean up the GS ring buffers when the context is destroyed
This fixes two memory leaks reported by ASAN:

Direct leak of 248 byte(s) in 1 object(s) allocated from:
   in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
   in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578
   in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600
   in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265
   in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:725
   in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291
   in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1482

Direct leak of 248 byte(s) in 1 object(s) allocated from:
   in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
   in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578
   in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600
   in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265
   in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:722
   in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291
   in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1489

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Fixes: 1371d65a7f
  r600g: initial support for geometry shaders on evergreen (v2)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-11-21 10:34:17 +01:00
Samuel Pitoiset
4b9bc4791b radv: only sync CP DMA for transfer operations or bottom pipe
CP DMA can only be busy when the driver copies buffers. The
only affected Vulkan commands are vkCmdCopyBuffer() and
vkCmdUpdateBuffer() (because we fallback to a copy depending on
a threshold). Clear operations are currently not concerned
because the driver always syncs after the last DMA operation.

Per the spec, these two operations have to be externally
synchronized with VK_PIPELINE_STAGE_TRANSFER_BIT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 10:03:01 +01:00
Samuel Pitoiset
457ac6ce1e radv: ignore subpass self-dependencies
Unnecessary as they allow the app to call vkCmdPipelineBarrier()
inside the render pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 10:02:59 +01:00
Iago Toral Quiroga
8e73b57634 Revert "nir/builder: Assert that intN_t immediates fit"
This reverts commit 1f29f4db1e.

For this to work the compiler must ensure that it never puts
the values that arrive to this helper into unsigned variables
at any point in its processing, since that would not apply sign
extension to the value and it would break the expectations here.
Unfortunately, we use uint64_t extensively to pass and copy
things around, so some times we get to this helper with values
that are not properly sign extended to 64-bit. Here is an example
for an 8-bit value that comes from a switch case:

(gdb) p /x x
$1 = 0xffffffd6

The value seems to have been sign extended to 32-bit at some point
getting proper sign extension, but then copied into a uint64_t
which wont' apply sign extension, breaking the expectations of
the assertion.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-21 08:12:50 +01:00
Iago Toral Quiroga
387888e3b7 nir/from_ssa: fix bit-size of temporary register
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-21 08:07:22 +01:00
Mathias Fröhlich
2d3c466add mesa: Remove unneeded bitfield widths from the VAO.
With the current VAO layout we do not need to make these
fields a bitfield. We get a tight struct layout with this change
for VAO attributes.

v2: Change unsigned char -> GLubyte.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
0a7020b4e6 mesa: Factor out struct gl_vertex_format.
Factor out struct gl_vertex_format from array attributes.
The data type is supposed to describe the type of a vertex
element. At this current stage the data type is only used
with the VAO, but actually is useful in various other places.
Due to the bitfields being used, special care needs to be
taken for the glGet code paths.

v2: Change unsigned char -> GLubyte.
    Use struct assignment for struct gl_vertex_format.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
2da7b0a2fb tnl: Use gl_array_attribute::_ElementSize.
Instead of open coding the size computation, use the
already available gl_array_attribute::_ElementSize value.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
a4c01839c2 nouveau: Use gl_array_attribute::_ElementSize.
Instead of open coding the size computation, use the
already available gl_array_attribute::_ElementSize value.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
182ed6de8c mesa: Unify glEdgeFlagPointer data type.
Use GL_UNSIGNED_BYTE as initialization data type
for the edge flag vertex attribute array. The same datatype
is used in the glEdgeFlagPointer function when setting the
array pointer.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
1b743e2966 mesa: Work with bitmasks when en/dis-abling VAO arrays.
For enabling or disabling VAO arrays it is now possible to
change a set of arrays with a single call without the need to
iterate the attributes.
Make use of this technique in the vao module.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
3c46fa5988 mesa: Remove gl_array_attributes::Enabled.
Now that all users go via the VAO Enabled bitfield,
get rid of the Enabled boolean.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
093aeb3565 mesa: Use gl_vertex_array_object::Enabled for glGet.
Instead of using gl_array_attributes::Enabled use the
much more compact representation stored in
gl_vertex_array_object::Enabled using the corresponding bits.
Keep the glGet changes in a seperate patch at least for review.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
1217a8448c mesa: Use the gl_vertex_array_object::Enabled bitfield.
Instead of using gl_array_attributes::Enabled use the
much more compact representation stored in
gl_vertex_array_object::Enabled using the corresponding bits.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Mathias Fröhlich
73d2d313e9 mesa: Rename gl_vertex_array_object::_Enabled -> Enabled.
Mark the up to now derived bitfield value now as primary
value by removing the underscore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-21 06:27:19 +01:00
Marek Olšák
ea9f95e2a6 radeonsi: go back to using bottom-of-pipe for beginning of TIME_ELAPSED
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102597

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-20 21:18:48 -05:00
Marek Olšák
6c1a34d2e7 radeonsi: don't send data after write-confirm with BOTTOM_OF_PIPE_TS
There are no writes.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-20 21:18:46 -05:00
Marek Olšák
bc5adc27b5 st/mesa: pin driver threads to a fixed CCX when glthread is enabled
radeonsi has 3 driver threads (glthread, gallium, winsys), other drivers
may have 2 (glthread, gallium), so it makes sense to pin them to a random
CCX and keep that irrespective of the app thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-20 21:18:43 -05:00
Marek Olšák
48f2160936 st/mesa: regularly re-pin driver threads to the CCX where the app thread is
This is used when glthread is disabled.

Mesa pretty much chases the app thread on the CPU.
The performance is the same as pinning the app thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-20 21:18:30 -05:00
Marek Olšák
ce7f84eb77 drirc: enable glthread for Talos Principle
Ryzen 1700X, Vega 56, 1600x900, 4xAA: improvement +4.4%

Immediate mode was needed.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-20 21:17:42 -05:00
Marek Olšák
7f1cac7ba6 mesa/glthread: enable immediate mode
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-20 21:17:41 -05:00
Marek Olšák
247d5a8e94 mesa/glthread: pass the function name to _mesa_glthread_restore_dispatch
If you insert printf there, you'll know why glthread was disabled.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-20 21:17:38 -05:00
Marek Olšák
25d95ed535 gallium/u_tests: fix MSVC build by using old-style zero initializers 2018-11-20 19:06:40 -05:00
Kenneth Graunke
562448b75a i965: Do NIR shader cloning in the caller.
This moves nir_shader_clone() to the driver-specific compile function,
rather than the shared src/intel/compiler code.  This allows i965 to do
key-specific passes before calling brw_compile_*.  Vulkan should not
need this cloning as it doesn't compile multiple variants.

We do need to continue cloning in the compute shader code because we
lower various things in NIR based on the SIMD width.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-11-20 15:53:46 -08:00
Kenneth Graunke
6a10dd08f4 i965: Use a 'nir' temporary rather than poking at brw_program
It's shorter and will also be useful when I adjust cloning soon.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-11-20 15:53:46 -08:00
Marek Olšák
0d17b685b1 gallium/u_tests: add a compute shader test that clears an image 2018-11-20 18:50:48 -05:00
Dave Airlie
3486fe655a ac: handle cast derefs
Just give back the same value for now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:46 +10:00
Dave Airlie
baa4bdd3a6 radv: handle loading from shared pointers
We won't have a var to load from, so don't try to the processing
required if we don't need it.

This avoids crashes in:
dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.workgroup_two_buffers

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:42 +10:00
Dave Airlie
ec9fe8abc7 ac: avoid casting pointers on bcsel and stores
For variable pointers we really don't want to case the pointers to int
without a good reason, just add a wrapper for bcsel loading and result
storing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:25 +10:00
Dylan Baker
a999798daa meson: Add tests to suites
Meson test has a concepts of suites, which allow tests to be grouped
together. This allows for a subtest of tests to be run only (say only
the tests for nir). A test can be added to more than one suite, but for
the most part I've only added a test to a single suite, though I've
added a compiler group that includes nir, glsl, and glcpp tests.

To use this you'll need to invoke meson test directly, instead of ninja
test (which always runs all targets). it can be invoked as:
`meson test -C builddir --suite $suitename` (meson test has addition
options that are pretty useful).

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-20 09:09:22 -08:00
Andrii Simiklit
b787dcf57b i965/batch: avoid reverting batch buffer if saved state is an empty
There's no point reverting to the last saved point if that save point is
the empty batch, we will just repeat ourselves.

v2: Merge with new commits, changes was minimized, added the 'fixes' tag
v3: Added in to patch series
v4: Fixed the regression which was introduced by this patch
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630
    Reported-by:  Mark Janes <mark.a.janes@intel.com>
    The solution provided by: Jordan Justen <jordan.l.justen@intel.com>

CC: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 3faf56ffbd "intel: Add an interface for saving/restoring
                     the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630 (fixed in v4)
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-20 06:33:43 -08:00
Emil Velikov
982e012b3a travis: adding missing x11-xcb for meson+vulkan
Required by the x11 WSI

Fixes: df82012b2c ("travis: add meson build for vulkan drivers.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-20 11:16:46 +00:00
Emil Velikov
5bc509363b glx: make xf86vidmode mandatory for direct rendering
Currently we detect the module and if missing, the glXGetMsc* API is
effectively a stub, always returning false.

This is what effectively has been happening with our meson build :-(

Thus users have no chance of using it - they cannot even distinguish
if the failure is due to a misconfigured build.

There's no reason for keeping xf86vidmode optional - it has been
available in all distributions for years.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: a47c525f32 "meson: build glx"
2018-11-20 11:13:20 +00:00
Emil Velikov
84445a86d1 travis: drop unneeded x11proto-xf86vidmode-dev
The only place where the package is needed is for building the DRI
based libGL library.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-20 11:13:20 +00:00
Samuel Pitoiset
f4563d8f5b ac/nir: fix intrinsic name string size in visit_image_atomic()
Fixes an assertion in SoTTR.

Fixes: dd0172e865 ("radv: Use structured intrinsics instead of indexing workaround for GFX9.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-20 10:23:45 +01:00
Bas Nieuwenhuizen
dd0172e865 radv: Use structured intrinsics instead of indexing workaround for GFX9.
These force the index to be used in the instruction so we don't need the
workaround.

Totals:
SGPRS: 1321642 -> 1321802 (0.01 %)
VGPRS: 943664 -> 943788 (0.01 %)
Spilled SGPRs: 28468 -> 28480 (0.04 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 52415292 -> 52338932 (-0.15 %) bytes
LDS: 400 -> 400 (0.00 %) blocks
Max Waves: 233903 -> 233803 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 238344 -> 238504 (0.07 %)
VGPRS: 232732 -> 232856 (0.05 %)
Spilled SGPRs: 13125 -> 13137 (0.09 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 15752712 -> 15676352 (-0.48 %) bytes
LDS: 139 -> 139 (0.00 %) blocks
Max Waves: 31680 -> 31580 (-0.32 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-19 23:36:00 +01:00
Kenneth Graunke
0990168642 i965: Allow only one slot of clip distances to be set on Gen4-5.
The existing backend code assumed that if VARYING_SLOT_CLIP_DIST0
was written, then VARYING_SLOT_CLIP_DIST1 would be as well.  That's
true with the current lowering, but not necessary if there are 4 or
fewer clip distances.  Separate out the checks to allow this.

The new NIR-based lowering will trigger this case, which would have
caused backend validation errors (src is null) without this patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:33:16 -08:00
Kenneth Graunke
5b682143da nir: Make nir_lower_clip_vs optionally work with variables.
The way nir_lower_clip_vs() works with store_output intrinsics makes a
ton of assumptions about the driver_location field.

In i965 and iris, I'd rather do this lowering early and work with
variables.  v3d may want to switch to that as well, and ir3 could too,
but I'm not sure exactly what would need updating.  For now, handle
both methods.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:33:16 -08:00
Kenneth Graunke
d0f746b645 nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.
I'll want the variables in the next patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:33:16 -08:00
Kenneth Graunke
63c8696874 nir: Inline lower_clip_vs() into nir_lower_clip_vs().
It's now called exactly once, and there's not really any distinction.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:33:14 -08:00
Kenneth Graunke
bfa789aceb nir: Use nir_shader_get_entrypoint in nir_lower_clip_vs().
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 14:31:20 -08:00
Dave Airlie
c8a35285f0 nir: handle shared pointers in lowering indirect derefs.
Check if the base ends up with no variable, and continue
if we see that case outside the loop.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-20 05:36:52 +10:00
Dave Airlie
760859cac2 nir: move getting deref from var after we check deref type.
I posted a load of hacks before to do this, Jason suggested this,
just check the deref mode, not the variable mode and delay getting
the variable until we know the type.

avoids crashes when derefing shared memory pointers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-20 05:36:38 +10:00
Dave Airlie
2f4f5a5055 spirv/vtn: handle variable pointers without offset lowering
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-20 05:36:16 +10:00
Jason Ekstrand
dca35c598d intel/fs,vec4: Fix a compiler warning
../src/intel/compiler/brw_fs_nir.cpp:3534:46: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
       assert(nir_intrinsic_write_mask(instr) ==
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
              (1 << instr->num_components) - 1);
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This was caused by 6339aba775 which added these completely valid
checks.  However clang likes to complain about signedness mismatches.

Fixes: 6339aba775 "intel/compiler: Lower SSBO and shared..."
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-11-19 09:57:41 -06:00
Jason Ekstrand
060817b2fa intel,nir: Move gl_LocalInvocationID lowering to nir_lower_system_values
It's not at all intel-specific; the formula is dictated by OpenGL and
Vulkan.  The only intel-specific thing is that we need the lowering.  As
a nice side-effect, the new version is variable-group-size ready.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-11-19 09:57:41 -06:00
Eric Engestrom
486091bc00 gbm: add missing comma between strings
Fixes: d971a4230d "loader: Factor out the common driver
                              opening logic from each loader."
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-19 15:50:56 +00:00
Samuel Pitoiset
724107553c radv: implement fast HTILE clears for depth or stencil only on GFX9
This allows to fast clear the depth part (or the stencil part)
of a depth+stencil surface when HTILE is enabled. I didn't test
on GFX8, so it's disabled currently.

This gives a very nice boost, for example when clearing the depth
aspect of a 4096x4096 D32_SFLOAT_S8_UINT image (18x faster).

BEFORE: 235 us
AFTER: 13 us

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:18 +01:00
Samuel Pitoiset
7dcddbe54d radv: rewrite the condition that checks allowed depth/stencil values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:16 +01:00
Samuel Pitoiset
9133bbf186 radv: check allowed fast HTILE clears a bit earlier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:14 +01:00
Samuel Pitoiset
193ad4748b radv: add radv_is_fast_clear_{depth,stencil}_allowed() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:12 +01:00
Samuel Pitoiset
c7e142ed78 radv: add radv_get_htile_fast_clear_value() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:10 +01:00
Samuel Pitoiset
6f3fbcc041 radv: remove unnecessary goto in the fast clear paths
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:08 +01:00
Samuel Pitoiset
36006e3cec radv/winsys: remove the max IBs per submit limit for the sysmem path
This path will be eventually improved later but as it's only
used on SI (or with RADV_DEBUG=noibs), I'm not sure if that
matters much.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:06 +01:00
Samuel Pitoiset
4d30f2c6f4 radv/winsys: remove the max IBs per submit limit for the fallback path
The chained submission is the fastest path and it should now
be used more often than before. This removes some EOP events.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:04 +01:00
Lucas Stach
8ca8a6a7b1 etnaviv: use dummy RT buffer when rendering without color buffer
At least GC2000 seems to push some dirt from the PE color cache into
the last bound render target when drawing depth only. Newer cores
seem to behave properly and don't do this, but I have found no way
to fix it on GC2000. Flushes and stalls don't seem to make any
difference.

In order to stop the core from pushing the dirt into a precious real
render target, plug in dummy buffer when rendering without a color
buffer.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-11-19 15:48:10 +01:00
Dave Airlie
8706204074 virgl: fix vtest regression since fencing changes.
The in_fence_fd needs to be initialised to -1.

Fixes: d1a1c21e7 (virgl: native fence fd support)

Reviewed-by: Robert Foss <robert.foss@collabora.com>
2018-11-19 15:33:19 +01:00
Samuel Pitoiset
55c75d2b49 radv: always clear the FCE predicate after DCC/FMASK/CMASK decompressions
DCC and FMASK also imply a fast-clear eliminate, so it should be
safe to reset the predicate unconditionally. We still only skip
FMASK or CMASK decompressions for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 14:05:35 +01:00
Samuel Pitoiset
483a28bfd4 radv: tidy up radv_set_dcc_need_cmask_elim_pred()
This is just a small cleanup.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 14:05:33 +01:00
Nicolai Hähnle
46a59ce026 radeonsi: fix an out-of-bounds read reported by ASAN
We read 4 values out of sample_locs_8x, so make sure the array is
big enough.

Fixes: ac76aeef20 ("radeonsi: switch back to standard DX sample positions")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-19 11:16:35 +01:00
Gert Wollny
d174cbccfa r600: Only set context streamout strides info from the shader that has outputs
With 5d517a streamout info is only attached to the shader for which the
transform feedback is actually recorded, but the driver set the context info
with each state submitted, thereby always using the info data that was
attached to the vertex shader.

Pass the streamout stride info to the context only from the shader that
actually has outputs. (Thanks to Marek Olšák for pointing me in the right
direction)

Fixes regresion with: dEQP-GLES31.functional.tessellation.invariance.*
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108734
Fixes: 5d517a599b
  st/mesa: Don't record garbage streamout information in the non-SSO case.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-19 11:06:56 +01:00
Gert Wollny
18a8e11aea i965:use FRAMEBUFFER_UNSUPPORTED instead of FRAMEBUFFER_INCOMPLETE_DIMENSIONS
FRAMEBUFFER_INCOMPLETE_DIMENSIONS is not supported for GLES 3.0 and later and
not defined for Desktop OpenGL. Instead use FRAMEBUFFER_UNSUPPORTED like it
was done before.

Thanks to Iago Toral and Andrey Simiklit for pointing out the problem and the
details.

Fixes:  ebcde34545
   i965: be more specific about FBO completeness errors
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-19 11:06:52 +01:00
Gert Wollny
40eca7d3e1 virgl: Use file descriptor instead of un-allocated object
The structure qdws is not allocated at this point, nor is the
file descriptor set to it's member. Use the fd directly instead.

Fixes:  d1a1c21e76
    virgl: native fence fd support

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-11-19 11:03:56 +01:00
Gert Wollny
78fdc507a3 i965: Add support for and expose EXT_texture_sRGB_R8
Emulate MESA_FORMAT_R_SRGB8 by using L8_UNORM_SRGB. This is possible
because component swizzling is handled based on the mesa format and,
hence, the a r001 swizzling can be used to correct the components.

Enables and makes pass (tested on Kabylake)

  dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
  dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-19 08:05:44 +01:00
Gert Wollny
c5363869d4 i965: Force zero swizzles for unused components in GL_RED and GL_RG
This makes it possible to use a hardware luminance format as RED format.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-19 08:05:44 +01:00
Gert Wollny
ebcde34545 i965: be more specific about FBO completeness errors
The driver was returning GL_FRAMEBUFFER_UNSUPPORTED for all cases of an
incomplete fbo, be a bit more specific about this following the description
of glCheckFramebufferStatus.

This helps to keeps dEQP happy when adding EXT_texture_sRGB_R8 support.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-19 08:05:44 +01:00
Gert Wollny
24a02157dd i965: Correct L8_UNORM_SRGB table entry
As the name says, the format is an sRGB format.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-19 08:05:44 +01:00
Robert Foss
70692adf48 virgl: Clean up fences commit
Remove a dead variable, a int->bool conversion and some
whitespace changes.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-18 12:14:55 +01:00
Kenneth Graunke
c2e3d0f163 i915: Delete swizzling detection logic.
This is all leftover from the i965 split.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-17 10:26:31 -08:00
Ilia Mirkin
beb66d3747 nv50/ir/ra: enforce max register requirement, and change spill order
On nv50, certain operations must happen on regs below 64, due to
encoding requirements. First of all, we add infrastructure to enforce
this. Secondly we change the spill order to first spill RIG nodes that
are unconstrained, followed by ones that are.

This makes the gamecube logo shadertoy compile properly. Curiously, if
we adjust the spill order so that we first spill the constrained RIG
nodes instead, the RA also succeeds. However it seems more logical to
first spill the unconstrained ones.

While we're at it, drop the nv50 max register to reserve r127 as the
zero register of last resort (r63 is preferred).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Karol Herbst <kherbst@redhat.com>
2018-11-16 22:43:52 -05:00
Ilia Mirkin
799e021894 nv50/ir/ra: improve condition for short regs, unify with cond for 16-bit
Instead of the size restriction existing in two places, and potentially
being applied twice, we move this together. Ops with 16-bit register
addresses can only take a short reg, and ops with immediates can only
take a short reg.

Of course we leave the immediate 0 in place since we know that it will
be replaced by r63/r127 down the line, so don't treat zeroes as an
immediate.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-16 20:53:33 -05:00
Ilia Mirkin
955d943c33 nv50/ir: delete MINMAX instruction that is no longer in the BB
We removed the op from the BB, but it was still listed in its sources'
uses. This could trip up some logic down the line which analyzes all the
uses of an l-value, e.g. spilling.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-16 20:53:09 -05:00
Eric Anholt
7e9fc11ff8 egl: Print the actual message to the console from _eglError().
Previously we would print errors on the console like:

   libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize

When we had everything we needed for:

   libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize: DRI2: failed to find EGLDevice

(for a gbm error in my case)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-16 17:49:31 -08:00
Eric Anholt
d971a4230d loader: Factor out the common driver opening logic from each loader.
I copied the code from egl_dri2.c, but the functionality was equivalent
between all the loaders other than their particular environment variables.

v2: Drop the logging function equivalent to loader_default_logger()
    (requested by Eric, Emil).  Move the SCons workaround across.  Drop
    the now-unused driGetDriverExtensions() declaration that was lost in a
    rebase.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-11-16 17:49:17 -08:00
Eric Anholt
cc19815738 loader: Stop using a local definition for an in-tree header
I need other types from the header now, and "gl.h is big" is not a good
reason to duplicate definitions.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-16 15:38:18 -08:00
Eric Anholt
2bc1f5c2e7 egl: Move loader_set_logger() up to egl_dri2.c.
Everyone needs to call it, and platform_x11 forgot to.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-16 15:38:18 -08:00
Eric Anholt
c2b515379b glx: Move DRI extensions pointer loading to driOpenDriver().
The only thing you do with a dri driver handle is get the extensions
pointer, so just fold it in to simplify the callers.

v2: Add the declaration of driGetDriverExtensions() that got lost in a
    rebase.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-11-16 15:38:18 -08:00
Eric Anholt
7076e9f116 glx: Remove an old DEFAULT_DRIVER_DIR default.
You can tell by "Mesa/configs/default" how old this is.  Your build system
really has to provide the DEFAULT_DRIVER_DIR, or other loaders will break.

v2: Move the bad (non-prefix-dependent) define to the SConscript to avoid
    breaking it.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-11-16 15:37:47 -08:00
Samuel Pitoiset
d031d5c999 radv: enable primitive binning by default
After doing a bunch of benchmarks, primitive binning helps
some games like The Talos Principle (+5%) or Serious Sam 2017
(+3%). For other titles, either it doesn't change anything or
it hurts very few (less than 1%).

This only affects GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-16 17:51:15 +01:00
Samuel Pitoiset
afd834b62e radv: add a debug option for disabling primitive binning
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-16 17:51:12 +01:00
Robert Foss
d1a1c21e76 virgl: native fence fd support
Following the support for fences on the virtio driver add support
for native fence on virgl. This was somewhat based on the freedeno one.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-16 14:41:57 +01:00
Lionel Landwerlin
0db898cef2 intel/aub_viewer: Print blend states properly
Identical fix to :

commit 70de31d0c1
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Fri Aug 24 16:05:08 2018 -0500

    intel/batch_decoder: Print blend states properly

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
2018-11-16 11:40:38 +00:00
Lionel Landwerlin
ac324a6809 intel/aub_viewer: fix dynamic state printing
Identical fix to :

commit cbd4bc1346
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Fri Aug 24 16:04:03 2018 -0500

    intel/batch_decoder: Fix dynamic state printing

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
2018-11-16 11:40:14 +00:00
Lionel Landwerlin
59c1059528 intel/aubinator: fix ring buffer pointer
We can only start parsing commands from the head pointer. This was
working fine up to now because we only dealt with a "made up" ring
buffer (generated by aub_write) which always had its head at 0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
2018-11-16 11:39:54 +00:00
Lionel Landwerlin
25443cbb72 intel/decoders: read ring buffer length
Use this value to limit reading the ring buffer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
2018-11-16 11:37:08 +00:00
Lionel Landwerlin
1c56d21156 egl/dri: fix error value with unknown drm format
According to the EGL_EXT_image_dma_buf_import spec, creating an EGL
image with a DRM format not supported should yield the BAD_MATCH
error :

"
       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
         attribute is set to a format not supported by the EGL, EGL_BAD_MATCH
         is generated.
"

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 20de7f9f22 ("egl/dri2: support for creating images out of dma buffers")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-11-16 10:28:06 +00:00
Daniel Stone
5e1fe240c4 gbm: Clarify acceptable formats for gbm_bo
gbm_bo_create() was presumably meant to originally accept gbm_bo_format
enums, but it's accepted GBM_FORMAT_* tokens since the dawn of time.
This is good, since gbm_bo_format is rarely used and covers a lot less
ground than GBM_FORMAT_*.

Change the documentation to refer to both; this involves removing a 'see
also' for gbm_bo_format, since we can't also use \sa to refer to a
family of anonymous #defines.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-16 09:40:46 +00:00
Connor Abbott
ba94a00c7c Revert "radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT"
This reverts commit 647c2b90e9. There was
one recently-introduced bug in ac for dvec3 loads, but the other test
failures were actually bugs in the tests. See
9429e621c4

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-16 10:32:03 +01:00
Eric Anholt
cc71bf529c vc4: Don't return a vc4 BO handle on a renderonly screen.
The handles exported need to be on the KMS device's fd, anything else is
failure.  Also, this code is assuming that the scanout resource has been
created already, so assert it.
2018-11-15 21:11:44 -08:00
Eric Anholt
cc0bc76a38 vc4: Make sure we make ro scanout resources for create_with_modifiers.
The DRI3 create_with_modifiers paths don't set tmpl.bind to SCANOUT or
SHARED, with the theory that given that you've got modifiers, that's all
you need.  However, we were looking at the tmpl.bind for setting up the
KMS handle in the renderonly case, so we'd end up trying to use vc4's
handle on the hx8357d fd.

Fixes: 84ed8b67c5 ("vc4: Set shareable BOs as T tiled if possible")
2018-11-15 21:11:44 -08:00
Danylo Piliaiev
f9fd0cf479 i965: Fix calculation of layers array length for isl_view
Handle all cases in calculation of layers count for isl_view
taking into account texture view and image unit.
st_convert_image was taken as a reference.

When u->Layered is true the whole level is taken with respect to
image view. In other case only one layer is taken.

v3: (Józef Kucia and Ilia Mirkin)
    - Rewrote patch by taking st_convert_image as a reference
    - Removed now unused get_image_num_layers function
    - Changed commit message

v4: (Jason Ekstrand)
    - Added assert

Fixes: 5a8c8903
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107856

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-15 19:59:54 -06:00
Jason Ekstrand
6339aba775 intel/compiler: Lower SSBO and shared loads/stores in NIR
We have a bunch of code to do this in the back-end compiler but it's
fairly specific to typed surface messages and the way we emit them.
This breaks it out into NIR were it's easier to do things a bit more
generally.  It also means we can easily share the code between the vec4
and FS back-ends if we wish.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:49 -06:00
Jason Ekstrand
d34fd81e76 nir: Add alignment parameters to SSBO, UBO, and shared access
This also changes spirv_to_nir and glsl_to_nir to set them.  The one
place that doesn't set them is shared memory access lowering in
nir_lower_io.  That will have to be updated before any consumers of it
can effectively use these new alignments.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
2018-11-15 19:59:42 -06:00
Jason Ekstrand
fb127f7729 nir/lower_io: Add shared to get_io_offset_src
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:31 -06:00
Jason Ekstrand
b5c48271d4 nir/glsl: Force 32-bit for UBO and SSBO Booleans
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:30 -06:00
Jason Ekstrand
44b7005581 nir/spirv: Force 32-bit for UBO and SSBO Booleans
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:29 -06:00
Jason Ekstrand
f16bd8a9fe nir/builder: Add a nir_pack/unpack/bitcast helpers
The new helpers can generate any pack/unpack operation including those
for which we do not have specific opcodes and they express a bitcast in
terms of these pack/unpack operations.  In particular, the new helpers
properly handle 8-bit types.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:28 -06:00
Jason Ekstrand
b77d68b78e nir/builder: Add iadd_imm and imul_imm helpers
The pattern of adding or multiplying an integer by an immediate is
fairly common especially in deref chain handling.  This adds a helper
for it and uses it a few places.  The advantage to the helper is that
it automatically handles bit sizes for you.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-15 19:59:27 -06:00
Jason Ekstrand
1f29f4db1e nir/builder: Assert that intN_t immediates fit
This assert won't catch all mistakes with this helper but it will at
least ensure that the top bits are all zero or all one which should help
catch bugs.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:26 -06:00
Jason Ekstrand
4266932c0b nir/lower_alu_to_scalar: Don't try to lower unpack_32_2x16
It messes up when trying to lower.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-15 19:59:09 -06:00
Ian Romanick
425c133ab9 glsl: Refactor type checking for redeclarations
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-15 14:27:32 -08:00
Ian Romanick
61e003ce7e glsl: Omit redundant qualifier checks on redeclarations
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-15 14:27:29 -08:00
Ian Romanick
9b9f3218db glsl: prevent qualifiers modification of predeclared variables
Section 3.7 (Identifiers) of the GLSL spec says:

    However, as noted in the specification, there are some cases where
    previously declared variables can be redeclared to change or add
    some property, and predeclared "gl_" names are allowed to be
    redeclared in a shader only for these specific purposes.  More
    generally, it is an error to redeclare a variable, including those
    starting "gl_".

This patch should fix piglit tests:
clip-distance-redeclare-without-inout.frag
clip-distance-redeclare-without-inout.vert

However, this causes a regression in
clip-distance-out-values.shader_test.  A fix for that test has been sent
to the piglit list for review:

    https://patchwork.freedesktop.org/patch/255201/

As far as I understood following mailing thread:
https://lists.freedesktop.org/archives/piglit/2013-October/007935.html
looks like we have accepted to remove an ability to change qualifiers
but have not done it yet. Unless I missed something)

v2 (idr): Move 'earlier->data.mode != var->data.mode' test much earlier
in the function.  Add special handling for gl_LastFragData.

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-15 14:27:26 -08:00
Eric Anholt
538bca78e2 v3d: Don't try to set PF flags on a LDTMU operation
We need an ALU op in order to set PF.  Fixes a recent assertion failure in
dEQP-GLES3.functional.ubo.single_basic_type.shared.bool_vertex
2018-11-15 11:12:54 -08:00
Eric Anholt
03928dd682 v3d: Fix double-swapping of R/B on V3D 4.1
Fixes: 4018eb04e8 ("v3d: Use the TLB R/B swapping instead of recompiles when available.")
2018-11-15 11:12:54 -08:00
Eric Engestrom
2b2f790e59 egl: fix bad rebase
I screwed up a rebase over a refactor and didn't notice locally because
the uncommitted refactor hid the issue.

Fixes: c973364967 "egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-15 17:51:40 +00:00
Sagar Ghuge
6e60ff1ea9 intel/compiler: Disassemble GEN6_SFID_DATAPORT_SAMPLER_CACHE as dp_sampler
Both BRW_SFID_SAMPLER and GEN6_SFID_DATAPORT_SAMPLER_CACHE are getting
disassembled as "sampler", which is misleading for assembler tool.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2018-11-15 09:36:55 -08:00
Eric Engestrom
c973364967 egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache
Fixes dEQP-EGL.functional.get_proc_address.extension.egl_android_blob_cache
on builds with glvnd enabled.

Fixes: 6f5b57093b "egl: add support for EGL_ANDROID_blob_cache"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 16:27:27 +00:00
Eric Engestrom
2640854399 gbm: add new entrypoint to symbols check
Fixes: 6328536ff2 "gbm: Introduce a helper function for
                              printing GBM format names."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-15 16:25:42 +00:00
Emil Velikov
adbdfc6666 bin/get-pick-list.sh: handle reverts prior to the branchpoint
Currently we detect when a breaking commit:
 - has landed in stable, and
 - is referenced by a untagged fix in master

Yet we did not consider the case of breaking commit:
 - prior to the branchpoint, and
 - is referenced by a untagged fix in master

Addressing the latter is extremely slow, due to the size of the lookup.

That said, we can trivially use the existing is_sha_nomination() helper
to catch reverts.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 16:15:15 +00:00
Emil Velikov
c0012a0708 bin/get-pick-list.sh: use test instead of [ ]
Latter is rather picky wrt surrounding white space. The explicit `test`
doesn't have that problem, plus the statements read a bit easier.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:51 +00:00
Emil Velikov
77ff0bfb5f bin/get-pick-list.sh: handle unofficial "broken by" tag
We have a number of cases were devs will use a tag "broken by".
While it's not something officially documented or recommended, checking
for it is trivial enough.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:47 +00:00
Emil Velikov
209525aafb bin/get-pick-list.sh: handle fixes tag with missing colon
Every so often, we forget to add the colon after "fixes". Trivially
tweak the script to catch it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:44 +00:00
Emil Velikov
b7418d1f3f bin/get-pick-list.sh: flesh out is_sha_nomination
Refactor is_fixes_nomination into a is_sha_nomination helper. This way
we can reuse it for more than the usual "Fixes:" tag.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:40 +00:00
Emil Velikov
533fead423 bin/get-pick-list.sh: tweak the commit sha matching pattern
Currently we match on:
 - any arbitrary length of,
 - any a-z A-Z and 0-9 characters

At the same time, a commit sha consists of lowercase hexadecimal
numbers. Any sha shorter than 8 characters is ambiguous - in some cases
even 11+ are required.

So change the pattern to a-f0-9 and adjust the length to 8-40.

As we're here we could use a single grep, instead of the grep/sed combo.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:36 +00:00
Emil Velikov
181203f3c5 bin/get-pick-list.sh: handle the fixes tag
Having a separate script to handle the fixes tag, brings a number of
issues, so let's fold it in get-pick-list.sh.

v2:
 - pass the sha as argument to the function
 - Keep original sed pattern

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:31 +00:00
Emil Velikov
e6b3a3b201 bin/get-pick-list.sh: handle "typod" usecase.
As the comment in get-typod-pick-list.sh says, there's little point in
having a duplicate file.

Add the new pattern + tag to get-pick-list.sh and nuke this file.

v2:
 - pass the sha as argument to the function
 - grep -q instead of using a variable (Eric)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:55:24 +00:00
Emil Velikov
fac10169bb bin/get-pick-list.sh: prefix output with "[stable] "
With later commits we'll fold all the different scripts into one.
Add the explicit prefix, so that we know the origin of the nomination

v2:
 - pass the sha as argument to the function
 - swap $tag = none for an else statment (Juan)
 - grep -q instead of using a variable (Eric)
 - print the tag and commit oneline separately (Eric)

v3:
 - drop unused "tag=none" assignment (Juan)
 - typo nomination

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:54:48 +00:00
Emil Velikov
559c32d241 bin/get-pick-list.sh: simplify git oneline printing
Currently we force disable the pager via "|cat" where --no-pager
exists. Additionally we could use git show instead of git log -n1.

Use those for a slightly more understandable code.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-15 15:51:24 +00:00
Emil Velikov
7d9556681d docs: document the staging branch and add reference to it
A while back we agreed that having a live/staging branch is beneficial.
Sadly we forgot to document that, so here is my first attempt.

Document the caveat that the branch history is not stable.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-15 15:48:15 +00:00
Emil Velikov
4ae749acf1 docs/submittingpatches.html: correctly handle the <p> tag
As pointed out by the w3c validator.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-15 15:48:13 +00:00
Emil Velikov
19a081473f docs/releasing.html: polish cherry-picking/testing text
Reword slightly and highlight the important parts of the text.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-15 15:48:08 +00:00
Guido Günther
ab5653680e etnaviv: Make sure rs alignment checks match
etna_resource_alloc and etna_resource_from_handle currently use different checks.
This leads to

   etna_resource_from_handle:492: target=2, format=PIPE_FORMAT_B8G8R8X8_UNORM, 1080x1920x1, array_size=1, last_level=0, nr_samples=0, usage=0, bind=8000a, flags=0
   etna_resource_from_handle:541: BO stride 4320 is too small for RS engine width padding (4352, format PIPE_FORMAT_B8G8R8X8_UNORM)

since etna_resource_from_handle wants to be aligned to a 16 byte
boundary while the etna_resource_alloc does not.

Adjust the two checks by using a common function.

Broken by baff59ebf0

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-11-15 16:38:35 +01:00
Juan A. Suarez Romero
52368ef83a docs: update calendar, add news item and link release notes for 18.2.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-11-15 13:08:58 +00:00
Juan A. Suarez Romero
aa7a419b8b docs: add sha256 checksums for 18.2.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 79be754f9a74a43b5748dc0934241e7701cb9581)
2018-11-15 13:06:12 +00:00
Juan A. Suarez Romero
e53ec08931 docs: add release notes for 18.2.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit f34bddc325)
2018-11-15 13:06:10 +00:00
Marek Olšák
9367514524 radeonsi: fix video APIs on Raven2
This was missed when I added the new enum.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-11-14 17:08:34 -05:00
Andrii Simiklit
e13dd70581 i965: avoid 'unused variable' warnings
1. brw_pipe_control.c:311:34: warning:
    unused variable ‘devinfo’
2. brw_program_binary.c:209:19: warning:
    unused variable ‘gen_size’
3. brw_program_binary.c:216:19: warning:
    unused variable ‘nir_size’

v2: Changes for unreproducible issues were removed

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-14 14:41:58 +00:00
Andrii Simiklit
7aca650122 compiler: avoid 'unused variable' warnings
1. nir/nir_lower_vars_to_ssa.c:691:21: warning:
       unused variable ‘var’
       nir_variable *var = path->path[0]->var;

v2: Changes for some part of 'may be used uninitialized'
    warnings were removed, seems like it is a compiler issue.
        ( Eric Engestrom <eric.engestrom@intel.com> )
    Possible like this one:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46684
    This issue is flagged as duplicate but an
    original one is not closed yet.

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-14 13:35:38 +00:00
Andrii Simiklit
69ee49ac46 intel/tools: avoid 'unused variable' warnings
1. tools/aub_read.c:271:31: warning: unused variable ‘end’
    const uint32_t *p = data, *end = data + data_len, *next;

2. tools/aub_mem.c:292:13: warning: unused variable ‘res’
       void *res = mmap((uint8_t *)bo.map + map_offset, 4096, PROT_READ,
   tools/aub_mem.c:357:13: warning: unused variable ‘res’
       void *res = mmap((uint8_t *)bo.map + (page - bo.addr), 4096, PROT_READ,

v2: The i965_disasm.c changes was moved into a separate patch
    The 'end' variable declared separately with MAYBE_UNUSED
    to avoid effect of it to other variables.
       ( Eric Engestrom <eric.engestrom@intel.com> )

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-14 13:35:28 +00:00
Thomas Hellstrom
25b48e3df9 st/xa: Bump minor
Bump minor to signal support for new formats and higher precision
solid pictures.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
c9085f6d3b st/xa: Support Component Alpha with trivial blending
Support Component Alpha for those composite operations that do not require
per-channel alpha blending.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
0477d17f51 st/xa: Minor renderer cleanups
constify function arguments to clean up the code a bit.

Reported-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
56aa23b146 st/xa: Fix transformations when we have both source and mask samplers
In the case when we had both source and mask samplers, transformations were
typically not applied correctly.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
e1298def9f st/xa: Support a couple of new formats
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-11-14 13:12:09 +01:00
Thomas Hellstrom
258d20152a st/xa: Support higher color precision for solid pictures
The only solid fill picture type we supported only had 8 bit color
channels. Add a new solid picture type that supports float channels.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-11-14 13:11:51 +01:00
Thomas Hellstrom
d86ad38205 st/xa: Render update. Better support for solid pictures
Remove unused and obsolete code for gradients and component-alpha
Support solid source- and mask pictures using a variable number
of samplers in the composite pipeline rather than the fixed number
we used before.

Tested using rendercheck for XA.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-11-14 13:07:00 +01:00
Gert Wollny
4bba280937 nir: Allow to skip integer ops in nir_lower_to_source_mods
Some hardware supports source mods only for float operations. Make it
possible to skip lowering to source mods in these cases.

v2: use option flags instead of a boolean (Jason Ekstrand)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-14 08:59:26 +01:00
Karol Herbst
b4380cb070 nir/spirv: cast shift operand to u32
v2: fix for specialization constants as well

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-11-14 02:09:11 +01:00
Karol Herbst
099728b115 nir: replace nir_load_system_value calls with appropiate builder functions
this helps reduce the overall code changes when a bit_size parameter is
added to nir_load_system_value

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-11-14 02:09:11 +01:00
Karol Herbst
80db331c2d nir: add const_index parameters to system value builder function
this allows to replace some nir_load_system_value calls with the specific
system value constructor

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-11-14 02:09:11 +01:00
Timothy Arceri
95b513c937 radv: make use of nir_move_out_const_to_consumer()
vkpipeline-db results:

Totals from affected shaders:
SGPRS: 28400 -> 28576 (0.62 %)
VGPRS: 27916 -> 27692 (-0.80 %)
Spilled SGPRs: 140 -> 138 (-1.43 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1534456 -> 1520560 (-0.91 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3541 -> 3582 (1.16 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-14 09:41:50 +11:00
Lionel Landwerlin
ea53f76d7b anv: move helper function internally
It's only used in anv_image.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:31 +00:00
Lionel Landwerlin
8b00d3d6eb anv: use image aspects rather than computed ones
This shouldn't make any difference but I feel uneasy to use the
expanded aspects that do not represent the image in its entirety. If
we ever change the implementation of the anv_image_aspect_to_plane()
helper, this is safer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:27 +00:00
Lionel Landwerlin
465de47bad anv: associate vulkan formats with aspects
This will make it easier to associate an aspect with a plane number.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:24 +00:00
Lionel Landwerlin
fe3b7fe982 anv/lower_ycbcr: make sure to set 0s on all components
To play around with debugging, we might want to disable one or the
other component. Having 0s as default values makes this work.
Otherwise we might have NULL components, leading to crashes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:21 +00:00
Lionel Landwerlin
ee8d65c25a anv/image: remove unused parameter
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:13 +00:00
Lionel Landwerlin
352e297091 anv: simplify internal address offset
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 18:56:10 +00:00
Eric Engestrom
4fa2fb3524 meson: fix wayland-less builds
Those empty variables in the !wayland case are useless and running that
meson.build with them breaks the build:

  [287/850] Generating wayland-drm-client-protocol.h with a custom command.
  FAILED: src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h
  client-header ../src/egl/wayland/wayland-drm/wayland-drm.xml src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h
  /bin/sh: client-header: command not found
  ninja: build stopped: subcommand failed.

Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
7df80de6e6 gbm: remove unnecessary meson include
`inc_wayland_drm` is only used if wayland is built, and it's already
added in that case a few lines below.

Fixes: a29869e872 "gbm: Don't traverse backwards for includes"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
3832db275e meson: only run vulkan's meson.build when building vulkan
Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
4f1ae271e1 xmlpool: update translation po files
These files are close to 4 years out of date; a lot's changed since.
Let's just check in a recently-regenerated version.

Changes generated by running `ninja xmlpool-{pot,update-po,gmo}`.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
1e918e5bef REVIEWERS: add Vulkan reviewer group
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
59b3335496 REVIEWERS: add Emil as EGL reviewer
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-11-13 17:25:02 +00:00
Eric Engestrom
923aca84b2 REVIEWERS: add include path for EGL
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-11-13 17:25:02 +00:00
Toni Lönnberg
2af4e3345f intel/genxml: Add engine definition to render engine instructions (gen11)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added missing engine definition to MI_TOPOLOGY_FILTER.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
1921982d3e intel/genxml: Add engine definition to render engine instructions (gen10)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added missing engine definition to MI_TOPOLOGY_FILTER.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
030fe0f981 intel/genxml: Add engine definition to render engine instructions (gen9)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added more missing engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
12e34fc7ba intel/genxml: Add engine definition to render engine instructions (gen8)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

v4: Added missing engine tag for MI_TOPOLOGY_FILTER and MI_LOAD_URB_MEM.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
a883fd2277 intel/genxml: Add engine definition to render engine instructions (gen75)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
27cf6252d3 intel/genxml: Add engine definition to render engine instructions (gen7)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
ecf62a967e intel/genxml: Add engine definition to render engine instructions (gen6)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions

v4: Added missing engine to MEDIA_GATEWAY_STATE

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
571d6447d8 intel/genxml: Add engine definition to render engine instructions (gen5)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
6463ceca69 intel/genxml: Add engine definition to render engine instructions (gen45)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added addition engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
a4ca710c96 intel/genxml: Add engine definition to render engine instructions (gen4)
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.

v2: Divided into individual patches for each gen

v3: Added additional engine definitions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
102dadec81 intel/decoder: tools: Use engine for decoding batch instructions
The engine to which the batch was sent to is now set to the decoder context when
decoding the batch. This is needed so that we can distinguish between
instructions as the render and video pipe share some of the instruction opcodes.

v2: The engine is now in the decoder context and the batch decoder uses a local
function for finding the instruction for an engine.

v3: Spec uses engine_mask now instead of engine, replaced engine class enums
with the definitions from UAPI.

v4: Fix up aubinator_viewer (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
a6aab7e436 intel/decoder: tools: gen_engine to drm_i915_gem_engine_class
Removed the gen_engine enum and changed the involved functions to use the
drm_i915_gem_engine_class enum from UAPI instead.

v3: Wrong engine was being used for blocks in video ring

v4: Fixed aubinator_viewer.cpp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Toni Lönnberg
b00bccd012 intel/decoder: Engine parameter for instructions
Preliminary work for adding handling of different pipes to gen_decoder. Each
instruction needs to have a definition describing which engine it is meant for.
If left undefined, by default, the instruction is defined for all engines.

v2: Changed to use the engine class definitions from UAPI

v3: Changed I915_ENGINE_CLASS_TO_MASK to use BITSET_BIT, change engine to
engine_mask, added check for incorrect engine and added the possibility to
define an instruction to multiple engines using the "|" as a delimiter in the
engine attribute.

v4: Fixed the memory leak.

v5: Removed an unnecessary ralloc_free().

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-13 15:10:12 +00:00
Gert Wollny
8d4bb6e5cd virgl: Add command and flags to initiate debugging on the host (v2)
On the host VREND_DEBUG=guestallow must be set to let the guest override
the debug flags.

v2: Send flag string instead of flags, this avoids the need to keep
    the flags in sync.
v3: Only request host logging if the host actually understands the command

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2018-11-13 14:42:22 +01:00
Gert Wollny
caa964b422 mesa: Reference count shaders that are used by transform feedback objects
Transform feedback objects may hold a pointer to a shader program, and
at least in Gallium, this must be a valid pointer until
ctx->Driver.EndTransformFeedback in glEndTransformFeedback has been called
- which is conform with the spec that any program that is part of a
current rendering state should only be flagged for deletion by glDeleteProgram.
This was not handled properly for the transform feedback objects so that
a call sequence

  glUseProgram(x)
  glBeginTransformFreedback(...)
  glPauseTransformFeedback(...)
  glDeleteProgram(x)
  glEndTransformFeedback(...)

would result in a use after free bug. With this patch the transform
feedback object also updates the reference count to the used program
thereby keeping the program valid as long as the transform feedback
objects links to it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108713
Fixes: 654587696b
       mesa: add end_transform_feedback() helper

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-13 10:57:25 +01:00
Samuel Pitoiset
90d68858ed radv: set optimal OVERWRITE_COMBINER_WATERMARK on GFX9
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-13 10:24:36 +01:00
Samuel Pitoiset
f70c5d31cd radv: set PA.SC_CONSERVATIVE_RASTERIZATION.NULL_SQUAD_AA_MASK_ENABLE
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-13 10:24:33 +01:00
Samuel Pitoiset
b5f213bb1d radv: binding streamout buffers doesn't change context regs
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-13 10:24:31 +01:00
Plamena Manolova
c5f3013cba nir: Don't lower the local work group size if it's variable.
If the local work group size is variable it won't be available
at compile time so we can't lower it in nir_lower_system_values().

Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com>

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-13 10:57:04 +02:00
Matt Turner
efb1ccadca util/ralloc: Make sizeof(linear_header) a multiple of 8
Prior to this patch sizeof(linear_header) was 20 bytes in a
non-debug build on 32-bit platforms. We do some pointer arithmetic to
calculate the next available location with

   ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset);

in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation
would only be 4-byte aligned.

On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of
4-byte registers to memory) requires an 8-byte aligned address. Such an
instruction is used to store to an 8-byte integer type, like intmax_t
which is used in glcpp's expression_value_t struct.

As a result of the 4-byte alignment returned by linear_alloc_child() we
would generate a SIGBUS (unaligned exception) on SPARC.

According to the GNU libc manual malloc() always returns memory that has
at least an alignment of 8-bytes [1]. I think our allocator should do
the same.

So, simple fix with two parts:

   (1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally.
   (2) Mark linear_header with an aligned attribute, which will cause
       its sizeof to be rounded up to that alignment. (We already do
       this for ralloc_header)

With this done, all Mesa's unit tests now pass on SPARC.

[1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html

Fixes: 47e1758692 ("glcpp: use the linear allocator for most objects")
Bug: https://bugs.gentoo.org/636326
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-12 20:54:49 -08:00
Matt Turner
7e3748c268 util/ralloc: Switch from DEBUG to NDEBUG
The debug code is all asserts, so protect it with the same thing that
controls assert.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-12 20:54:49 -08:00
Timothy Arceri
34dffcf913 nir: add support for removing redundant stores to copy prop var
For example the following type of thing is seen in TCS from
a number of Vulkan and DXVK games:

	vec1 32 ssa_557 = deref_var &oPatch (shader_out float)
	vec1 32 ssa_558 = intrinsic load_deref (ssa_557) ()
	vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float)
	vec1 32 ssa_560 = intrinsic load_deref (ssa_559) ()
	vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float)
	vec1 32 ssa_562 = intrinsic load_deref (ssa_561) ()
	intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */
	intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */
	intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */

No shader-db changes on i965 (SKL).

vkpipeline-db results RADV (VEGA):

Totals from affected shaders:
SGPRS: 7832 -> 7728 (-1.33 %)
VGPRS: 6476 -> 6740 (4.08 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 469572 -> 456596 (-2.76 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 989 -> 960 (-2.93 %)
Wait states: 0 -> 0 (0.00 %)

The Max Waves and VGPRS changes here are misleading. What is
happening is a bunch of TCS outputs are being optimised away as
they are now recognised as unused. This results in more varyings
being compacted via nir_compact_varyings() which can result in
more register pressure when they are not packed in an optimal way.
This is an existing problem independent of this patch. I've run
some benchmarks and haven't noticed any performance regressions
in affected games.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 15:19:36 +11:00
Timothy Arceri
3561108de0 anv/i965: make use of nir_link_constant_varyings()
shader-db results for SLK:

total instructions in shared programs: 13106498 -> 13091573 (-0.11%)
instructions in affected programs: 1186244 -> 1171319 (-1.26%)
helped: 6186
HURT: 0

total cycles in shared programs: 332062633 -> 331961653 (-0.03%)
cycles in affected programs: 8537165 -> 8436185 (-1.18%)
helped: 5371
HURT: 862

LOST:   6
GAINED: 14

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-13 14:06:32 +11:00
Eric Anholt
621b0fa892 egl: Improve the debugging of gbm format matching in DRI configs.
Previously the debug would be:

libEGL debug: No DRI config supports native format 0x20203852
libEGL debug: No DRI config supports native format 0x38385247

but

libEGL debug: No DRI config supports native format R8
libEGL debug: No DRI config supports native format GR88

is a lot easier to understand.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-11-12 15:20:23 -08:00
Eric Anholt
6328536ff2 gbm: Introduce a helper function for printing GBM format names.
This requires that the caller make a little (stack) allocation to store
the string.

v2: Use gbm_format_canonicalize (suggested by Daniel)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-11-12 15:20:23 -08:00
Eric Anholt
ee7f848c00 gbm: Move gbm_format_canonicalize() to the core.
I want it for the format name debugging code.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-11-12 15:20:23 -08:00
Dylan Baker
4eab98b66e meson: fix libatomic tests
There are two problems:
1) the extra underscore in MISSING_64BIT_ATOMICS
2) we should link with libatomic if the previous test decided we needed
   it

Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
2018-11-12 13:29:00 -08:00
Marek Olšák
32a334777c mesa: mark GL_SR8_EXT non-renderable on GLES
Fixes: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.sr8_ext

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-12 16:19:43 -05:00
Marek Olšák
e0c7114eb3 st/mesa: disable L3 thread pinning
This implementation can have massive drawbacks.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2018-11-12 16:18:15 -05:00
Christian Gmeiner
c6aaafa3a1 nir: add lowering for ffloor
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-12 21:57:25 +01:00
Alyssa Rosenzweig
41c8f99137 util: Fix warning in u_cpu_detect on non-x86
regs is only set and used on x86; on other platforms (like ARM), this
code causes a trivial warning, solved by moving the regs declaration to
the architecture-dependent usage.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2018-11-12 10:28:04 -08:00
Dylan Baker
9c2a95b298 meson: Don't set -Wall
meson does this for you with its warn levels, so we don't need to set
it ourselves.

Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-12 08:55:55 -08:00
Rob Clark
4a0c2cfdd6 freedreno/drm: fix unused 'entry' warnings
Looks like importing libdrm_freedreno into mesa crossed paths with
e27902a261.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-12 10:45:48 -05:00
Lionel Landwerlin
89785e2d56 i965: add support for sampling from AYUV
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-12 13:22:54 +00:00
Lionel Landwerlin
252ca7b43f dri: add AYUV format
v2: Add a AYUV entry android in the android backend (Tapani)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-12 13:22:54 +00:00
Lionel Landwerlin
8a15f06d19 nir/lower_tex: Add AYUV lowering support
Byte ordering is :

0: V
1: U
2: Y
3: A

v2: Split refactoring of alpha channel (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
2018-11-12 13:22:54 +00:00
Lionel Landwerlin
0a30c33e83 nir/lower_tex: add alpha channel parameter for yuv lowering
We're about to introduce AYUV support which provides its own alpha
channel. So give alpha as a parameter and set it to 1 on exising
formats.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-12 13:22:54 +00:00
Samuel Pitoiset
97fb1a02fd radv: make use of num_good_cu_per_sh in si_emit_graphics() too
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-12 09:35:46 +01:00
Samuel Pitoiset
d9d14346c2 radv: clean up setting partial_es_wave for distributed tess on VI
Only needed when the pipeline actually uses tessellation. I don't
think that changes anything, except improving readability.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-12 09:35:44 +01:00
Samuel Pitoiset
cc4569b733 radv: cleanup and document a Hawaii bug with offchip buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-12 09:35:42 +01:00
Hanno Böck
8dc2085baf glsl/test: Fix use after free in test_optpass.
The variable state is free'd and afterwards state->error is used
as the return value, resulting in a use after free bug detected
by memory safety tools like address sanitizer.

Signed-off-by: Hanno Böck <hanno@hboeck.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108636
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-12 07:42:58 +02:00
Timothy Arceri
a068958692 nir: don't pack varyings ints with floats unless flat
Fixes: 1c9c42d16b ("nir: add varying component packing helpers")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-12 15:38:56 +11:00
Timothy Arceri
9dd737bb02 nir: add glsl_type_is_integer() helper
Fixes: 1c9c42d16b ("nir: add varying component packing helpers")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-12 15:38:56 +11:00
Francisco Jerez
552642066f intel/fs: Prevent emission of IR instructions not aligned to their own execution size.
This can occur during payload setup of SIMD-split send message
instructions, which can lead to the emission of header setup
instructions with a non-zero channel group and fixed SIMD width.  Such
instructions could end up using undefined channel enable signals
except they don't care since they're always marked force_writemask_all.

Not known to affect correctness of any workload at this point, but it
would be trivial to back-port to stable if something comes up.

Reported-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>
2018-11-09 19:39:22 -08:00
Timothy Arceri
590fcb50e7 st/mesa: make use of nir_link_constant_varyings()
Shader-db results radeonsi (VEGA):

Totals from affected shaders:
SGPRS: 161464 -> 161368 (-0.06 %)
VGPRS: 86904 -> 86292 (-0.70 %)
Spilled SGPRs: 296 -> 314 (6.08 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3618596 -> 3573852 (-1.24 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 26189 -> 26276 (0.33 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-10 11:41:00 +11:00
Timothy Arceri
d40dd05553 nir: add new linking opt nir_link_constant_varyings()
This pass moves constant outputs to the consuming shader stage
where possible.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-10 11:41:00 +11:00
Andre Heider
414470854d st/nine: clean up thead shutdown sequence a bit
Just break out of the loop instead, it does the same thing.

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:27 +01:00
Andre Heider
123bf9cbe7 st/nine: plug thread related leaks
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:27 +01:00
Andre Heider
10598c9667 st/nine: fix stack corruption due to ABI mismatch
This fixes various crashes and hangs when using nine's 'thread_submit'
feature.

On 64bit, the thread function's data argument would just be NULL.
On 32bit, the data argument would be garbage depending on the compiler
flags (in my case -march>=core2).

Fixes: f3fa7e3068 ("st/nine: Use WINE thread for threadpool")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:26 +01:00
Marek Olšák
d2b2364313 radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET only
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
4bec5025ac gallium: add PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
9dc776f3f2 radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2
and add has_dcc_constant_encode.
2018-11-09 14:55:04 -05:00
Marek Olšák
832ab883e2 radeonsi: use better DCC clear codes
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
d059eae269 ac/surface: remove the overallocation workaround for Vega12
not needed anymore (probably since the tile_swizzle fix)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-09 14:55:04 -05:00
Lionel Landwerlin
959e2a5aeb intel/aub_read: remove useless breaks
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-09 18:17:30 +00:00
Erik Faye-Lund
b55af392d9 Revert "mesa: expose NV_conditional_render on GLES"
This reverts commit 5213be9fab.
2018-11-09 17:39:25 +01:00
Erik Faye-Lund
cf8b271cbe Revert "mesa/main: fixup make check after NV_conditional_render for gles"
This reverts commit cccd7a253f.
2018-11-09 17:39:22 +01:00
Erik Faye-Lund
cccd7a253f mesa/main: fixup make check after NV_conditional_render for gles
It seems I missed some details when exposing NV_conditional_render
on GLES; this fixes up "make check".

Fixes: 5213be9fab ("mesa: expose NV_conditional_render on GLES")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-09 16:47:34 +01:00
Nicolai Hähnle
8c97abc066 radv: include LLVM IR in the VK_AMD_shader_info "disassembly"
Helpful for debugging compiler backend problems: this allows us to
easily retrieve the LLVM IR from RenderDoc.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-09 14:54:37 +01:00
Erik Faye-Lund
5213be9fab mesa: expose NV_conditional_render on GLES
The extension spec has been updated to include GLES 2 support, so let's
enable it there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-09 13:03:00 +01:00
Iago Toral Quiroga
35baee5dce nir/constant_folding: fix incorrect bit-size check
nir_alu_type_get_type_size takes a type as parameter and we were
passing a bit-size instead, which did what we wanted by accident,
since a bit-size of zero matches nir_type_invalid, which has a
size of 0 too.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-09 08:22:15 +01:00
Iago Toral Quiroga
6c418dfa42 intel/compiler: fix node interference of simd16 instructions
SIMD16 instructions need to have additional interferences to prevent
source / destination hazards when the source and destination registers
are off by one register.

While we already have code to handle this, it was only running for SIMD16
dispatches, however, we can have SIDM16 instructions in a SIMD8 dispatch.
An example of this are pull constant loads since commit b56fa830c6,
but there are more cases.

This fixes a number of CTS test failures found in work-in-progress
tests that were hitting this situation for 16-wide pull constants
in a SIMD8 program.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-09 08:22:08 +01:00
Roland Scheidegger
a3c898dc97 gallivm: fix improper clamping of vertex index when fetching gs inputs
Because we only have one file_max for the (2d) gs input file, the value
actually represents the max of attrib and vertex index (although I'm
not entirely sure if we really want the max, since the max valid value
of the vertex dimension can be easily deduced from the input primitive).

Thus in cases where the number of inputs is higher than the number of
vertices per prim, we did not properly clamp the vertex index, which
would result in out-of-bound fetches, potentially causing segfaults
(the segfaults seemed actually difficult to trigger, but valgrind
certainly wasn't happy). This might have happened even if the shader
did not actually try to fetch bogus vertices, if the fetching happened
in non-active conditional clauses.

To fix simply use the correct max vertex index value (derived from
the input prim type) instead when clamping for this case.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-11-09 00:53:03 +01:00
Aditya Swarup
a5c39ed974 i965: Lift restriction in external textures for EGLImage support
Fixes Skqp's unitTest_EGLImageTest test.

For Intel platforms, we support external textures only for EGLImages
created with EGL_EXT_image_dma_buf_import. This restriction seems to
be Intel specific and not present for other platforms.

While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent
to the test because of this restriction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-11-08 12:33:06 -08:00
Ian Romanick
c5a4c26450 glsl: Add pragma to disable all warnings
Use #pragma warning(off) and #pragma warning(on) to disable or enable
all warnings.  This is a big hammer.  If we ever need a smaller hammer,
we can enhance this functionality.

There is one lame thing about this.  Because we parse everything, create
an AST, then convert the AST to GLSL IR, we have to treat the #pragma
like a statment.  This means that you can't do something like

'    void
'    #pragma warning(off)
'    __foo
'    #pragma warning(on)
'    (float param0);

Fixing that would, as far as I can tell, require a huge amount of work.

I did try just handling the #pragma during parsing (like we do for
state for the whole shader.

v2: Fix the #pragma lines in the commit message that git-commit ate.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-11-08 11:00:00 -08:00
Ian Romanick
011abfc963 glsl: Add warning tests for identifiers with __
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-11-08 10:59:53 -08:00
Jason Ekstrand
d28bc35ece intel/fs: Add an assert to optimize_frontfacing_ternary
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
bcc6aab065 anv: Use nir_src_is_const and friends in lowering code
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
52145070c0 intel/analyze_ubo_ranges: Use nir_src_is_const and friends
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
1413512b4c intel/vec4: Use the new nir_src_is_const and friends
As of this commit, all uses of const sources either go through a
nir_src_as_<type> helper which handles bit sizes correctly or else are
accompanied by a nir_src_bit_size() == 32 assertion to assert that we
have the size we think we have.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
61e15348c4 nir: Add a read_mask helper for ALU instructions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:22 -06:00
Jason Ekstrand
344cfe6980 intel/fs: Use the new nir_src_is_const and friends
As of this commit, all uses of const sources either go through a
nir_src_as_<type> helper which handles bit sizes correctly or else are
accompanied by a nir_src_bit_size() == 32 assertion to assert that we
have the size we think we have.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:20 -06:00
Jason Ekstrand
6b2918709a intel/fs,vec4: Clean up a repeated pattern with SSBOs
Everywhere we handle SSBO intrinsics, we have exactly the same pattern
for computing the index so we may as well make a helper for it.  We also
add a get_nir_src_imm to vec4 and use it for SSBO offsets.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:06 -06:00
Samuel Pitoiset
c472ad82e4 radv: fix GPU hangs when loading depth/stencil clear values on SI/CIK
HTILE is supported on these chips, not sure how I missed that.
This restores using PFP_SYNC_ME when LOAD_CONTEXT_REG is not used.

Fixes: f425d9ee74 ("radv: use LOAD_CONTEXT_REG when loading fast clear values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-08 11:20:03 +01:00
Samuel Pitoiset
f425d9ee74 radv: use LOAD_CONTEXT_REG when loading fast clear values
This avoids syncing the Micro Engine. This is only supported
for VI+ currently. There is probably a way for using
LOAD_CONTEXT_REG on previous chips but that could be done later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-08 10:41:45 +01:00
Samuel Pitoiset
0dcd99c687 radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+
Inclusive and exclusives scan are missing because older chips
don't have llvm.amdgcn.update.dpp.

This fixes crashes with dEQP-VK.subgroups.arithmetic.*.

CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-08 10:41:41 +01:00
Adam Jackson
16f1023037 glx: Demand success from CreateContext requests (v2)
GLXCreate{,New}Context, like most X resource creation requests, does not
emit a reply and therefore is emitted into the X stream asynchronously.
However, unlike most resource creation requests, the GLXContext we
return is a handle to library state instead of an XID. So if context
creation fails for any reason - say, the server doesn't support indirect
contexts - then we will fail in strange places for strange reasons.

We could make every GLX entrypoint robust against half-created contexts,
or we could just verify that context creation worked. Reuse the
__glXIsDirect code to do this, as a cheap way of verifying that the
XID is real.

glXCreateContextAttribsARB solves this by using the _checked version of
the xcb command, so effectively this change makes the classic context
creation paths as robust as CreateContextAttribs.

v2: Better use of Bool, check that error != NULL first (Olivier Fourdan)

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-11-07 12:38:05 -05:00
Karol Herbst
f7fae7f64e gm107/ir: fix compile time warning in getTEXSMask
In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)':
warning: control reaches end of non-void function [-Wreturn-type]

Reported-by: Moiman@freenode
Fixes: f821e80213
       "gm107/ir: use scalar tex instructions where possible"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-07 17:48:58 +01:00
Michel Dänzer
32b0eb51a3 winsys/amdgpu: Stop using amdgpu_bo_handle_type_kms_noimport
It only behaves any different from amdgpu_bo_handle_type_kms with
libdrm 2.4.93, and it breaks if an older version is picked up.

Bugzilla: https://bugs.freedesktop.org/108096
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-07 17:37:47 +01:00
Lionel Landwerlin
792dde66f2 intel/dump_gpu: add platform option
Got tired of remembering the PCI ids.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-07 11:27:41 +00:00
Lionel Landwerlin
e262cc0353 intel/dump_gpu: move output option together
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-07 11:27:38 +00:00
Samuel Pitoiset
0a0aa2ba6c radv: disable conditional rendering for vkCmdCopyQueryPoolResults()
VK_EXT_conditional_rendering says that copy commands should not be
affected by conditional rendering.

Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 11:31:36 +01:00
Samuel Pitoiset
1e7c3379e1 radv: allocate enough space in CS when copying query results with compute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 11:31:34 +01:00
Timothy Arceri
9aa3c1915e ac/nir_to_llvm: fix b2f for f64
Fixes: d7e0d47b9d ("nir: Add a bunch of b2[if] optimizations")

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 16:35:07 +11:00
Karol Herbst
f821e80213 gm107/ir: use scalar tex instructions where possible
TEXS, TLD4 and TLD4S are variants of tex instructions which are more
scalar, which gives RA more freedom and is less likely to insert silly
MOVs to satisfy quad registers.

shader-db changes:
total instructions in shared programs : 7687265 -> 7614782 (-0.94%)
total gprs used in shared programs    : 803620 -> 798045 (-0.69%)
total shared used in shared programs  : 639636 -> 639636 (0.00%)
total local used in shared programs   : 24648 -> 24648 (0.00%)
total bytes used in shared programs   : 82103400 -> 81330696 (-0.94%)

                local     shared        gpr       inst      bytes
    helped           0           0        3648       10647       10647
      hurt           0           0         464         205         205

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
edd6c41751 nv50/ir: add scalar field to TexInstructions
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
8d825f78fc nv50/ra: add condenseDef overloads for partial condenses
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
a4550de434 nv50/ir: print color masks of tex instructions
v2: print the mask for TXG as well
    make the mask to be printed more mask like

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Jason Ekstrand
610061838a vulkan: Update the XML and headers to 1.1.91
The biggest change here is the rename of VK_NVX_ray_tracing to
VK_NV_ray_tracing and the total removal of VK_KHR_mir_surface.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-06 12:21:19 -06:00
Gert Wollny
c171d76b94 r600: Add support for EXT_texture_sRGB_R8
Enables on R600 and makes pass:
  dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
  dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*

v2: remove chunk for dri/radeon (Emil)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-11-06 18:49:02 +01:00
Lionel Landwerlin
421fa01d64 anv/android: mark gralloc allocated BOs as external
Allocating through Gralloc implies buffers are going to be used
outside the driver. We have special MOCS settings for external BOs and
we probably want to use them here too.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a1220e7311 ("anv/android: Set the BO flags in bo_cache_import (v2)")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-06 15:28:07 +00:00
Lionel Landwerlin
b43f955037 anv: stub internal android code
This reduces the amount of #ifdef ANDROID we'll have to have inside
the driver. Potentially offering better coverage of the android
extensions.

v2: Move anv_android.h include before anv_entrypoints.h (Tapani)
    Fix autotools android build (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-06 15:28:07 +00:00
Kristian H. Kristensen
f6131d4ec7 freedreno/a6xx: Clear z32 and separate stencil with blitter
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-11-06 08:56:38 -05:00
Rob Clark
3bbad81c80 freedreno/a6xx: fix VSC bug with larger # of tiles
At higher resolutions with the addition of MSAA, the number of tiles
can increase to the point where we use more than one VSC pipe per
tile.  Which would cause us to calculate an out-of-bounds offset for
VSC_SIZE_ADDRESS.  So don't try to be clever, just always put it at
a fixed offset assuming the max 32 VSC pipes in use.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-06 08:56:21 -05:00
Rob Clark
2d9c3a5db2 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-06 08:43:27 -05:00
Olivier Fourdan
55af17ffed wayland/egl: Resize EGL surface on update buffer for swrast
After commit a9fb331ea ("wayland/egl: update surface size on window
resize"), the surface size is updated as soon as the resize is done, and
`update_buffers()` would resize only if the surface size differs from
the attached size.

However, in the case of swrast, there is no resize callback and the
attached size is updated in `dri2_wl_swrast_commit_backbuffer()` prior
to the `swrast_update_buffers()` so the attached size is always up to
date when it reaches `swrast_update_buffers()` and the surface is never
resized.

This can be observed with "totem" using the GDK backend on Wayland (the
default) when running on software rendering:

  $ LIBGL_ALWAYS_SOFTWARE=true CLUTTER_BACKEND=gdk totem

Resizing the window would leave the EGL surface size unchanged.

To avoid the issue, partially revert the part of commit a9fb331ea for
`swrast_update_buffers()` and resize on the win size and not the
attached size.

Fixes: a9fb331ea - wayland/egl: update surface size on window resize
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
CC: Daniel Stone <daniel@fooishbar.org>
CC: Juan A. Suarez Romero <jasuarez@igalia.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-11-06 13:59:38 +01:00
Lionel Landwerlin
b47a69ed4c intel/decoders: fix instruction base address parsing
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 00103db04a ("intel: Fix decoding for partial STATE_BASE_ADDRESS updates.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-05 13:22:35 -08:00
Emil Velikov
b3ade65387 egl/glvnd: correctly report errors when vendor cannot be found
If the user provides an invalid display or device the ToVendor lookup
will fail.

In this case, the local [Mesa vendor] error code will be set. Thus on
sequential eglGetError(), the error will be EGL_SUCCESS.

To be more specific, GLVND remembers the last vendor and calls back
into it's eglGetError, although there's no guarantee to ever have had
one.

v2:
 - Add _eglError call, so the debug callback is executed (Kyle)
 - Drop XXX comment.

Piglit: tests/egl/spec/egl_ext_device_query
Fixes: ce562f9e3f ("EGL: Implement the libglvnd interface for EGL (v3)")
Cc: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kyle Brenneman <kbrenneman@nvidia.com>
2018-11-05 20:53:05 +00:00
Emil Velikov
2a8fefdeb0 egl: add EGL_EXT_device_base entrypoints
eglQueryDevicesEXT (unlike the other three functions) does not depend
on the display. It is implemented in GLVND, which calls into each
driver collecting the list of devices and presenting it to the user.

For the other entrypoints, GLVND acts as pass through stub calling into
the vendor library. The vendor implementation calls back into GLVND to
get the vendor dispatch. Then the driver proceeds to call itself via
the said dispatch.

This design makes is possible to keep using "old" GLVND with newer
vendor drivers. Since effectively all the extension code is within the
latter itself.

Without said entrypoints, any user will outright crash - as reported in
the bug report.

Note: there's a follow-up fix needed to our GLVND code, to make piglit
happy.

v2: add some beefy documentation in the commit message.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108635
Fixes: 7552fcb7b9 ("egl: add base EGL_EXT_device_base implementation")
Reported-by: kyle.devir@mykolab.com
Cc: kyle.devir@mykolab.com
Acked-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-05 20:53:05 +00:00
Emil Velikov
7e169cf2a0 docs: mention EXT_shader_implicit_conversions
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-05 20:53:05 +00:00
Marek Olšák
04298a2f24 st/va: fix incorrect use of resource_destroy
Fixes: 4373dd3215 ("st/va: Support YUV formats in vaCreateSurfaces")
Cc: Drew Davenport <ddavenport@chromium.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-11-05 15:47:50 -05:00
Sergii Romantsov
5aeee1ab15 i965/batch/debug: Allow log be dumped before assert
Message that may show the culprit of assert now will
be dumped before that for debug purposes.

Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel G Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-05 09:24:55 -08:00
Lionel Landwerlin
4fd0ff75f3 intel/sanitize_gpu: add debug message on mmap fail
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-05 15:45:08 +00:00
Lionel Landwerlin
e400ac52e4 intel/sanitize_gpu: deal with non page multiple buffer sizes
We can only map at page aligned offsets. We got that wrong with buffer
size where (size % 4096) != 0 (anv has a WA buffer of 1024).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-05 15:45:07 +00:00
Lionel Landwerlin
c5fca35af1 intel/sanitize_gpu: add help/gdb options to wrapper
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-05 15:45:07 +00:00
Lionel Landwerlin
9ab5089150 intel/dump_gpu: add missing gdb option
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-05 15:43:34 +00:00
Eric Engestrom
d515ded4d9 wsi/wayland: only finish() a successfully init()ed display
Fixes: 4369102498 "vulkan/wsi/wayland: Stop caching Wayland displays"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-11-05 15:29:21 +00:00
Eric Engestrom
dcee22afed wsi/wayland: use proper VkResult type
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 14:55:05 +00:00
Sergii Romantsov
ce837a5372 autotools: library-dependency when no sse and 32-bit
Building of 32bit Mesa may fail if __SSE__ is not specified.
Added missed dependency from libm.

v2: avoided dependecy on any flag, just link

v3: meson doesn't fail, but have added dependency on libm

CC: Dylan Baker <dylan@pnwbakers.com>
CC: Lionel G Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108560
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-05 13:21:49 +01:00
Samuel Pitoiset
f7fd0d86a9 radv: more use of radv_cp_wait_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 09:48:50 +01:00
Samuel Pitoiset
c571ca7a08 radv: replace si_emit_wait_fence() with radv_cp_wait_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 09:48:50 +01:00
Samuel Pitoiset
b1b2dd06a7 radv: add missing TFB queries support to CmdCopyQueryPoolsResults()
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Fixes: b4eb029062 ("radv: implement VK_EXT_transform_feedback")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 09:48:43 +01:00
Samuel Pitoiset
dc3419195c radv: remove useless sync after copying query results with compute
The spec says:
   "vkCmdCopyQueryPoolResults is considered to be a transfer
    operation, and its writes to buffer memory must be synchronized
    using VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT
    before using the results."

VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle,
while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector
caches and L2. So, it's useless to set those flags internally.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-05 09:47:55 +01:00
Vinson Lee
64a9ed8848 r600/sb: Fix constant logical operand in assert.
Fixes: da977ad907 ("r600/sb: start adding GDS support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-11-04 21:09:55 -08:00
Kenneth Graunke
5d517a599b st/mesa: Don't record garbage streamout information in the non-SSO case.
In the non-SSO case, where multiple shader stages are linked together,
we were recording garbage pipe_stream_output_info structures for all
but the last enabled geometry-processing stage.

Specifically, we were using the gl_transform_feedback_info from
shader_program->last_vert_prog (the stage whose outputs will be
recorded)...but were pairing it with the output varying mappings
from the current shader stage.  For example, a program with a VS and
GS, the VS's pipe_shader_state would have a pipe_stream_output_info
based on the GS transform feedback info, but the VS output mapping.

This generally worked out okay because only the pipe_stream_output_info
for the last stage really matters - the others can be ignored.  However,
we'd like to avoid confusing the pipe driver.  In particular, my new
driver translates the stream out information to hardware packets at
bind_{vs,tes,gs}_state() time...and was hitting asserts about garbage
varyings that didn't exist.

This patch changes st/mesa to record a blank pipe_stream_output_info
with num_outputs = 0 for all stages prior to last_vert_prog.  The last
one is captured as normal.

(In the fully-SSO case, nothing should change - each program contains
a single shader stage, so last_vert_prog *is* the current shader.)

Tested with llvmpipe (piglit's gpu profile), and freedreno (a3xx,
gpu profile with -t transform.feedback).  Fixes several hundred CTS
tests on my new driver.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-11-03 23:34:36 -07:00
Kenneth Graunke
b6410a2d22 st/nir: Drop unused parameter from st_nir_assign_uniform_locations().
ARB programs won't have one of these, and we don't use it anyway.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-11-03 23:34:36 -07:00
Kenneth Graunke
5294d65011 st/mesa: Pull nir_lower_wpos_ytransform work into a helper function.
This will let me use it in the ARB program code as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-11-03 23:34:34 -07:00
Kenneth Graunke
424a6052df intel: Use a URB start offset of 0 for disabled stages.
There are some cases where the VS is the only stage enabled, it uses the
entire URB, and the URB is large enough that placing later stages after
the VS exceeds the number of bits for "URB Starting Address".

For example, on Icelake GT2, "varying-packing-simple mat2x4 array" from
Piglit is getting a starting offset of 128 for the GS/HS/DS.  But the
field is only large enough to hold an offset of 127.

i965 doesn't hit any genxml assertions because it's still using the old
OUT_BATCH mechanism.  128 << GEN7_URB_STARTING_ADDRESS_SHIFT (57) == 0,
with the extra bit falling off the end.  So we place the disabled stage
at the beginning of the URB (overlapping with push constants).  This is
likely okay since it's a zero size region (0 entries).

It seems like the Vulkan driver might hit this assertion, however, and
the situation seems harmless.  To work around this, always place
disabled stages at the start of the URB, so the last enabled stage can
fill the remaining space without overflowing the field.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-11-03 23:25:57 -07:00
Mauro Rossi
5c0cff868a android: radv: add libmesa_git_sha1 static dependency
libmesa_git_sha1 whole static dependency is added to get git_sha1.h header
and avoid following building error:

external/mesa/src/amd/vulkan/radv_device.c:46:10:
fatal error: 'git_sha1.h' file not found
         ^
1 error generated.

Fixes: 9d40ec2cf6 ("radv: Add support for VK_KHR_driver_properties.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-03 10:48:45 +01:00
Eric Anholt
0d78c6af0d vc4: Use the normal simulator ioctl path for CL submit as well.
The simulator no longer needs to look back into the gallium structs.
2018-11-02 14:26:38 -07:00
Eric Anholt
c80e267a0a vc4: Maintain a separate GEM mapping of BOs in the simulator.
This will let us avoid looking back into the gallium driver's vc4_bo.
2018-11-02 14:26:38 -07:00
Eric Anholt
645ca269d2 vc4: Take advantage of _mesa_hash_table_remove_key() in the simulator. 2018-11-02 14:26:38 -07:00
Eric Anholt
f32ba7abd7 v3d: Remove the special path for simulaton of the submit ioctl.
Now that it doesn't need to find the struct v3d_bos, it can just take the
normal v3d_ioctl() path.
2018-11-02 14:26:38 -07:00
Eric Anholt
df9f574c13 v3d: Maintain a mapping of the GEM buffer in the simulator.
This way we don't need to reach back into the gallium driver code to get
the mapping.
2018-11-02 14:26:38 -07:00
Dylan Baker
7652931d33 meson: link gallium nine with pthreads
In some cases (not building with llvm, which automatically pulls in
pthreads) nine needs to be directly linked with pthreads. Fixes building
on x86 (32 bit) without llvm.

Distro bug: https://bugs.gentoo.org/670094
Fixes: 6b4c7047d5
       ("meson: build gallium nine state_tracker")
Tested-by: Rafal Lalik <rafallalik@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-11-02 13:10:33 -07:00
Anuj Phogat
1c140470ef anv/icl: Disable prefetching of sampler state entries
WA_1606682166:
Incorrect TDL's SSP address shift in SARB for 16:6 & 18:8 modes.
Disable the Sampler state prefetch functionality in the SARB by
programming 0xB000[30] to '1'. This is to be done at boot time and
the feature must remain disabled permanently.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-02 08:34:33 -07:00
Topi Pohjolainen
9a41a10f8a i965/icl: Disable prefetching of sampler state entries
In the same spirit as commit a5889d70f2
"i965/icl: Disable binding table prefetching". Fixes some 110+
intermittent piglit failures with tex-miplevel-selection variants.

WA_1606682166:
Incorrect TDL's SSP address shift in SARB for 16:6 & 18:8 modes.
Disable the Sampler state prefetch functionality in the SARB by
programming 0xB000[30] to '1'. This is to be done at boot time and
the feature must remain disabled permanently.

Anuj: Set SamplerCount = 0 for vs, gs, hs, ds and wm units as well.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-02 08:34:33 -07:00
Jan Vesely
9cab8ccd6c amd: Make vgpr-spilling depend on llvm version
The option was removed in LLVM r345763

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-02 10:32:47 -04:00
Timothy Arceri
769ae9fb7f nir: fix condition propagation when src has a swizzle
We cannot use nir_build_alu() to create the new alu as it has no
way to know how many components of the src we will use. This
results in it guessing the max number of components from one of
its inputs.

Fixes the following CTS tests:

dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_frag
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_geom
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_tessc
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_vert

Fixes: 2975422ceb ("nir: propagates if condition evaluation down some alu chains")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-03 00:44:01 +11:00
Mauro Rossi
b9dec214f5 android: gallium/auxiliary: add include to get u_debug.h header
To avoid build error in u_debug_stack_android.cpp
due to now missing u_debug.h header:

external/mesa/src/gallium/auxiliary/util/u_debug_stack_android.cpp:26:10:
fatal error: 'u_debug.h' file not found
#include "u_debug.h"
         ^
1 error generated.

Fixes: 37db383abb ("util: Move u_debug to utils")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-11-02 13:31:37 +01:00
Gert Wollny
b710680093 virgl/vtest-winsys: Use virgl version of bind flags
The bind flags defined by mesa/gallium might not always be in sync
with the ones copied to virglrenderer/gallium. Therefore, use the
flags defined in virgl like it is done for all the other calls to
create resources.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-02 11:53:09 +01:00
Gert Wollny
acd2968005 mesa/st: Add support for EXT_texture_sRGB_R8
This only adds support on the Gallium core level, for the drivers
it is likely that additional changes are needed to support the
new texture format and thereby enabling the extension.

Enables on softpipe and makes pass:
  dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*

v2: - add include for getting GL_SR8_EXT
v4: - since the extension is not required don't bother providing
      a fallback (Ilia Mirkin)
    - split patch (2/2) to separate Gallium and mesa/st parts
      (Roland Scheidegger)
    - trim commit message to only contain the history of the patch
      relevant to this part
v5: - don't include GLES headers (required enum has been added to glheader.h)
      (Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-02 11:52:44 +01:00
Gert Wollny
29f0ab2c30 Gallium: Add format PIPE_FORMAT_R8_SRGB
This format is needed to support EXT_texture_sRGB_R8. THe patch adds a new
format enum, the format entries in Gallium and and svga, the mapping between
sRGB and linear formats, and tests.

  v2: - add mapping to linear format for PIPE_FORMATR_R8_SRGB
  v3: - Add texture format to svga format table since otherwise building
        mesa will fail when this driver is enabled. It was not tested
        whether the extension actually works.
  v4: - svga: remove the SVGA specific format definitions and table entries
        and only add correct the location of PIPE_FORMAT_R8_SRGB in the
        format_conversion_table (Ilia Mirkin)
      - Split patch (1/2) to separate Gallium part and mesa/st part.
        (Roland Scheidegger)
      - Trim the commit message to only contain the relevant parts from the
        split.
  v5: - svga: correct location of PIPE_FORMAT_SRGB_R8 (Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-02 11:52:44 +01:00
Gert Wollny
b8e9c6522d mesa/core: Add definitions and translations for EXT_texture_sRGB_R8
v2: - fix format definition line
    - disable  for desktop GL
    - don't add GL_R8_EXT to glext.h since it is already in
      GLES2/gl2ext.h in glext.h and include this header  where needed
      (all Emil)
v3: - swrast: Fill the function table for sRGB_R8
      The size of the function table is checked at compile time and must
      correspond to the number of mesa texture formats.
      dri/swrast being gles-2.0 doesn't support the extension though
v4: - correct format layout comment (Ilia Mirkin)
    - correct logic for accepting GL_RED only textures (in part Ilia Mirkin)
      EXT_texture_sRGB_R8 requires OpenGL ES 3.0 which includes
      ARB_texture_rg/EXT_texture_rg, so one only must check for the first
      when SR8_EXT is really requested.
v5: - add define for GL_ES8_XT to glheader.h and don't include GLES
      headers  (Ilia Mirkin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-02 11:52:44 +01:00
Erik Faye-Lund
742dace825 glsl: do not allow implicit casts of unsized array initializers
The GLSL 4.6 specification (section 4.1.14. "Implicit Conversions")
says:

  "There are no implicit array or structure conversions. For
   example, an array of int cannot be implicitly converted to an
   array of float."

So let's add a check in place when assigning array initializers to
implicitly sized arrays, to avoid incorrectly allowing code on the
form:

int[] foo = float[](1.0, 2.0, 3.0)

This fixes the following dEQP test-cases:
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_uint_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_uint_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.uint_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.uint_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_uint_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_uint_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.uint_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.uint_to_float_fragment

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Erik Faye-Lund
6df922f438 mesa/glsl: add support for EXT_shader_implicit_conversions
EXT_shader_implicit_conversions adds support for implicit conversions
for GLES 3.1 and above.

This is essentially a subset of ARB_gpu_shader5, and augments
OES_gpu_shader5.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Erik Faye-Lund
ecab2d6f14 glsl: fall back to inexact function-match
In GLES, we currently either need an exact match with a local function,
or an exact match with a builtin.

However, if we add support for implicit conversions for GLES shaders,
we also need to fall back to a non-exact match in the case where there
were no builtin match either.

Luckily, we already have a variable ready with this, so let's just
return it if the builtin-search failed.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Erik Faye-Lund
e975c5b785 glsl: add has_implicit_uint_to_int_conversion()-helper
This makes the code a bit easier to read, as well as reduces repetition,
especially when we add support for EXT_shader_implicit_conversions.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Erik Faye-Lund
12f001f013 glsl: add has_implicit_conversions()-helper
This makes the code a bit easier to read, as well as will reduce
repetition when we add support for EXT_shader_implicit_conversions.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-02 11:10:36 +01:00
Mathias Fröhlich
9f009c1a8f mesa: Remove needless indirection in some draw functions.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-11-02 08:42:03 +01:00
Timothy Arceri
c7bdda8aa5 nir: allow propagation of if evaluation for bcsel
Shader-db results Skylake:

total instructions in shared programs: 13109035 -> 13109024 (<.01%)
instructions in affected programs: 4777 -> 4766 (-0.23%)
helped: 11
HURT: 0

total cycles in shared programs: 332090418 -> 332090443 (<.01%)
cycles in affected programs: 19474 -> 19499 (0.13%)
helped: 6
HURT: 4

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-02 15:56:34 +11:00
Dave Airlie
677b496b6b radv: fix begin/end transform feedback with 0 counter buffers.
If the user gives 0 counterBuffers then the driver should still
enable transform feedback on all targets. This changes the
driver to always enable xfb, and use counter buffers where
one is defined for the target in question.

Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-02 04:15:07 +00:00
Dave Airlie
7f37a52a21 radv: apply xfb buffer offset at buffer binding time not later. (v2)
In order to handle pause/resume properly, the offset should
be added to the buffer binding not to the begin/end paths.

v2: don't add offset to size
Fixes ext_transform_feedback-alignment* under zink

Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-02 04:13:31 +00:00
Mark Janes
5f312e95f8 Revert "i965/batch: avoid reverting batch buffer if saved state is an empty"
This reverts commit a9031bf9b5.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630
2018-11-01 16:28:05 -07:00
Eric Anholt
43a397c580 vc4: Drop the winsys_stride relayout in the simluator
Since 0c1dd9dee0 ("broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride."), we have the vc4-side BO properly laid out
(assuming it's linear) in the winsys BO so that we can skip this extra
copy.
2018-11-01 14:34:02 -07:00
Eric Anholt
4e1b163eed v3d: Update the TLB config for depth writes on V3D 4.2.
Fixes 311 piglit cases on the simulator.
2018-11-01 13:56:30 -07:00
Eric Anholt
4018eb04e8 v3d: Use the TLB R/B swapping instead of recompiles when available.
The recompile reduction is nice, but this also makes it so that a straight
texture copy could get optimized some day to not unpack/repack the f16
values.
2018-11-01 13:56:30 -07:00
Eric Anholt
3923cf626d v3d: Take advantage of _mesa_hash_table_remove_key() in the simulator. 2018-11-01 13:54:36 -07:00
Eric Anholt
47586ab569 v3d: Respect user-passed strides for BO imports.
If the caller has passed in a stride for (linear) BO import, we should use
that stride when rendering to the BO (or, if we some day support texturing
from linear-imported BOs, when doing the linear-to-UIF shadow copy).  This
lets us remove the extra stride-changing relayout in the simulator.
2018-11-01 13:54:36 -07:00
Eric Anholt
5313fb8abd v3d: Drop #if 0-ed out v3d_dump_to_file().
This came from vc4, where we had a file format for GPU hangs.  I don't
have one of those for V3D, and I probably won't ever have the simulator
side produce dumps even if I do.
2018-11-01 13:54:36 -07:00
Eric Anholt
d3f66c385b v3d: Fix a typo in a comment in job handling. 2018-11-01 13:54:36 -07:00
Eric Anholt
b93fc160f4 v3d: Fix a copy-and-paste comment in the simulator code. 2018-11-01 13:54:36 -07:00
Anuj Phogat
13c955182f anv/icl: Set Error Detection Behavior Control Bit in L3CNTLREG
The default setting of this bit is not the desirable behavior.
WA_1406697149

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-01 12:00:23 -07:00
Anuj Phogat
b3d6937fb0 i965/icl: Set Error Detection Behavior Control Bit in L3CNTLREG
The default setting of this bit is not the desirable behavior.
WA_1406697149

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-01 12:00:23 -07:00
Emil Velikov
ac95a0e024 docs: add 19.0.0-devel release notes template
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-01 18:56:54 +00:00
Emil Velikov
97c73c9174 mesa: bump version to 19.1.0-devel
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-01 18:54:02 +00:00
1597 changed files with 154990 additions and 58869 deletions

View File

@@ -11,6 +11,7 @@ tab_width = 8
[*.{c,h,cpp,hpp,cc,hh}]
indent_style = space
indent_size = 3
max_line_length = 78
[{Makefile*,*.mk}]
indent_style = tab

499
.gitlab-ci.yml Normal file
View File

@@ -0,0 +1,499 @@
# This is the tag of the docker image used for the build jobs. If the
# image doesn't exist yet, the containers-build stage generates it.
#
# In order to generate a new image, one should generally change the tag.
# While removing the image from the registry would also work, that's not
# recommended except for ephemeral images during development: Replacing
# an image after a significant amount of time might pull in newer
# versions of gcc/clang or other packages, which might break the build
# with older commits using the same tag.
#
# After merging a change resulting in generating a new image to the
# main repository, it's recommended to remove the image from the source
# repository's container registry, so that the image from the main
# repository's registry will be used there as well.
#
# The format of the tag is "%Y-%m-%d-${counter}" where ${counter} stays
# at "01" unless you have multiple updates on the same day :)
variables:
UBUNTU_TAG: 2019-02-12-01
UBUNTU_IMAGE: "$CI_REGISTRY_IMAGE/ubuntu:$UBUNTU_TAG"
UBUNTU_IMAGE_MAIN: "registry.freedesktop.org/mesa/mesa/ubuntu:$UBUNTU_TAG"
cache:
paths:
- ccache
stages:
- containers-build
- build+test
# When to automatically run the CI
.ci-run-policy:
only:
- master
- merge_requests
- /^ci([-/].*)?$/
# CONTAINERS
containers:ubuntu:
extends: .ci-run-policy
stage: containers-build
image: docker:stable
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
DOCKER_DRIVER: overlay2
script:
# Enable experimental features such as `docker manifest inspect`
- mkdir -p ~/.docker
- "echo '{\"experimental\": \"enabled\"}' > ~/.docker/config.json"
- docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
# Check if the image (with the specific tag) already exists
- docker manifest inspect $UBUNTU_IMAGE && exit || true
# Try to re-use the image from the main repository's registry
- docker image pull $UBUNTU_IMAGE_MAIN &&
docker image tag $UBUNTU_IMAGE_MAIN $UBUNTU_IMAGE &&
docker image push $UBUNTU_IMAGE && exit || true
- docker build -t $UBUNTU_IMAGE -f .gitlab-ci/Dockerfile.ubuntu .
- docker push $UBUNTU_IMAGE
# BUILD
.build:
extends: .ci-run-policy
image: $UBUNTU_IMAGE
stage: build+test
artifacts:
when: on_failure
untracked: true
# Use ccache transparently, and print stats before/after
before_script:
- export PATH="/usr/lib/ccache:$PATH"
- export CCACHE_BASEDIR="$PWD"
- export CCACHE_DIR="$PWD/ccache"
- export CCACHE_COMPILERCHECK=content
- ccache --zero-stats || true
- ccache --show-stats || true
after_script:
- export CCACHE_DIR="$PWD/ccache"
- ccache --show-stats
.meson-build:
extends: .build
script:
# We need to control the version of llvm-config we're using, so we'll
# generate a native file to do so. This requires meson >=0.49
- if test -n "$LLVM_VERSION"; then
LLVM_CONFIG="llvm-config-${LLVM_VERSION}";
echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file;
$LLVM_CONFIG --version;
else
touch native.file;
fi
- meson --version
- meson _build
--native-file=native.file
-D build-tests=true
-D libunwind=${UNWIND}
${DRI_LOADERS}
-D dri-drivers=${DRI_DRIVERS:-[]}
${GALLIUM_ST}
-D gallium-drivers=${GALLIUM_DRIVERS:-[]}
-D vulkan-drivers=${VULKAN_DRIVERS:-[]}
- cd _build
- meson configure
- ninja -j4
- ninja test
.make-build:
extends: .build
variables:
MAKEFLAGS: "-j4"
script:
- if test -n "$LLVM_VERSION"; then
export LLVM_CONFIG="llvm-config-${LLVM_VERSION}";
fi
- mkdir build
- cd build
- ../autogen.sh
--enable-autotools
--enable-debug
$LIBUNWIND_FLAGS
$DRI_LOADERS
--with-dri-drivers=$DRI_DRIVERS
$GALLIUM_ST
--with-gallium-drivers=$GALLIUM_DRIVERS
--with-vulkan-drivers=$VULKAN_DRIVERS
--disable-llvm-shared-libs
- make
- eval $MAKE_CHECK_COMMAND
.scons-build:
extends: .build
variables:
SCONSFLAGS: "-j4"
script:
- if test -n "$LLVM_VERSION"; then
export LLVM_CONFIG="llvm-config-${LLVM_VERSION}";
fi
- scons $SCONS_TARGET
- eval $SCONS_CHECK_COMMAND
build:meson-vulkan:
extends: .meson-build
variables:
UNWIND: "false"
DRI_LOADERS: >
-D glx=disabled
-D gbm=false
-D egl=false
-D platforms=x11,wayland,drm
-D osmesa=none
GALLIUM_ST: >
-D dri3=true
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=disabled
VULKAN_DRIVERS: intel,amd
LLVM_VERSION: "7"
build:meson-loader-classic-dri:
extends: .meson-build
variables:
UNWIND: "false"
DRI_LOADERS: >
-D glx=dri
-D gbm=true
-D egl=true
-D platforms=x11,wayland,drm,surfaceless
-D osmesa=classic
DRI_DRIVERS: "i915,i965,r100,r200,swrast,nouveau"
GALLIUM_ST: >
-D dri3=true
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=disabled
build:meson-glvnd:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glvnd=true
-D egl=true
-D gbm=true
-D glx=dri
DRI_DRIVERS: "i965"
GALLIUM_ST: >
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=disabled
# NOTE: Building SWR is 2x (yes two) times slower than all the other
# gallium drivers combined.
# Start this early so that it doesn't hunder the run time.
build:meson-gallium-swr:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glx=disabled
-D egl=false
-D gbm=false
GALLIUM_ST: >
-D dri3=false
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=disabled
GALLIUM_DRIVERS: "swr"
LLVM_VERSION: "6.0"
build:meson-gallium-radeonsi:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glx=disabled
-D egl=false
-D gbm=false
GALLIUM_ST: >
-D dri3=false
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=disabled
GALLIUM_DRIVERS: "radeonsi"
LLVM_VERSION: "7"
build:meson-gallium-drivers-other:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glx=disabled
-D egl=false
-D gbm=false
GALLIUM_ST: >
-D dri3=false
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=disabled
GALLIUM_DRIVERS: "i915,iris,nouveau,kmsro,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv"
LLVM_VERSION: "5.0"
build:meson-gallium-clover-llvm5:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glx=disabled
-D egl=false
-D gbm=false
GALLIUM_ST: >
-D dri3=false
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=icd
GALLIUM_DRIVERS: "r600"
LLVM_VERSION: "5.0"
build:meson-gallium-clover-llvm6:
extends: build:meson-gallium-clover-llvm5
variables:
LLVM_VERSION: "6.0"
build:meson-gallium-clover-llvm7:
extends: build:meson-gallium-clover-llvm5
variables:
GALLIUM_DRIVERS: "r600,radeonsi"
LLVM_VERSION: "7"
build:meson-gallium-st-other:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glx=disabled
-D egl=false
-D gbm=false
GALLIUM_ST: >
-D dri3=true
-D gallium-vdpau=true
-D gallium-xvmc=true
-D gallium-omx=bellagio
-D gallium-va=true
-D gallium-xa=true
-D gallium-nine=true
-D gallium-opencl=disabled
-D osmesa=gallium
GALLIUM_DRIVERS: "nouveau,swrast"
LLVM_VERSION: "5.0"
build:make-vulkan:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "make -C src/gtest check && make -C src/intel check"
LLVM_VERSION: "7"
DRI_LOADERS: >
--disable-glx
--disable-gbm
--disable-egl
--with-platforms=x11,wayland,drm
DRI_DRIVERS: ""
GALLIUM_ST: >
--enable-dri
--enable-dri3
--disable-opencl
--disable-xa
--disable-nine
--disable-xvmc
--disable-vdpau
--disable-va
--disable-omx-bellagio
--disable-gallium-osmesa
VULKAN_DRIVERS: intel,radeon
LIBUNWIND_FLAGS: --disable-libunwind
build:make-loader-classic-dri:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "make check"
DRI_LOADERS: >
--enable-glx
--enable-gbm
--enable-egl
--with-platforms=x11,wayland,drm,surfaceless
--enable-osmesa
DRI_DRIVERS: "i915,i965,radeon,r200,swrast,nouveau"
GALLIUM_ST: >
--enable-dri
--disable-opencl
--disable-xa
--disable-nine
--disable-xvmc
--disable-vdpau
--disable-va
--disable-omx-bellagio
--disable-gallium-osmesa
LIBUNWIND_FLAGS: --disable-libunwind
# NOTE: Building SWR is 2x (yes two) times slower than all the other
# gallium drivers combined.
# Start this early so that it doesn't hunder the run time.
build:make-gallium-drivers-swr:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "true"
LLVM_VERSION: "6.0"
DRI_LOADERS: >
--disable-glx
--disable-gbm
--disable-egl
GALLIUM_ST: >
--enable-dri
--disable-opencl
--disable-xa
--disable-nine
--disable-xvmc
--disable-vdpau
--disable-va
--disable-omx-bellagio
--disable-gallium-osmesa
GALLIUM_DRIVERS: "swr"
LIBUNWIND_FLAGS: --enable-libunwind
build:make-gallium-drivers-radeonsi:
extends: build:make-gallium-drivers-swr
variables:
LLVM_VERSION: "7"
GALLIUM_DRIVERS: "radeonsi"
build:make-gallium-drivers-other:
extends: build:make-gallium-drivers-swr
variables:
LLVM_VERSION: "3.9"
GALLIUM_DRIVERS: "i915,nouveau,kmsro,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv"
build:make-gallium-st-clover-llvm-39:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "true"
LLVM_VERSION: "3.9"
DRI_LOADERS: >
--disable-glx
--disable-gbm
--disable-egl
GALLIUM_ST: >
--disable-dri
--enable-opencl
--enable-opencl-icd
--enable-llvm
--disable-xa
--disable-nine
--disable-xvmc
--disable-vdpau
--disable-va
--disable-omx-bellagio
--disable-gallium-osmesa
GALLIUM_DRIVERS: "r600"
LIBUNWIND_FLAGS: --enable-libunwind
build:make-gallium-st-clover-llvm-4:
extends: build:make-gallium-st-clover-llvm-39
variables:
LLVM_VERSION: "4.0"
build:make-gallium-st-clover-llvm-5:
extends: build:make-gallium-st-clover-llvm-39
variables:
LLVM_VERSION: "5.0"
build:make-gallium-st-clover-llvm-6:
extends: build:make-gallium-st-clover-llvm-39
variables:
LLVM_VERSION: "6.0"
build:make-gallium-st-clover-llvm-7:
extends: build:make-gallium-st-clover-llvm-39
variables:
LLVM_VERSION: "7"
GALLIUM_DRIVERS: "r600,radeonsi"
build:make-gallium-st-other:
extends: .make-build
variables:
MAKE_CHECK_COMMAND: "true"
# We should be testing 3.3, but 3.9 is the oldest that still exists in ubuntu
LLVM_VERSION: "3.9"
DRI_LOADERS: >
--disable-glx
--disable-gbm
--disable-egl
GALLIUM_ST: >
--enable-dri
--disable-opencl
--enable-xa
--enable-nine
--enable-xvmc
--enable-vdpau
--enable-va
--enable-omx-bellagio
--enable-gallium-osmesa
# We need swrast for osmesa and nine.
# i915 most likely doesn't work with most ST.
# Regardless - we're doing a quick build test here.
GALLIUM_DRIVERS: "i915,swrast"
LIBUNWIND_FLAGS: --enable-libunwind
build:scons-nollvm:
extends: .scons-build
variables:
SCONS_TARGET: "llvm=0"
SCONS_CHECK_COMMAND: "scons llvm=0 check"
build:scons-llvm:
extends: .scons-build
variables:
SCONS_TARGET: "llvm=1"
SCONS_CHECK_COMMAND: "scons llvm=1 check"
LLVM_VERSION: "3.9"
build:scons-swr:
extends: .scons-build
variables:
SCONS_TARGET: "swr=1"
SCONS_CHECK_COMMAND: "true"
LLVM_VERSION: "6.0"

View File

@@ -0,0 +1,165 @@
FROM ubuntu:bionic
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y \
curl \
wget \
gnupg \
software-properties-common
RUN curl -fsSL https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
RUN add-apt-repository "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-7 main"
RUN apt-get update
RUN apt-get install -y \
pkg-config \
libdrm-dev \
libpciaccess-dev \
libxrandr-dev \
libxdamage-dev \
libxfixes-dev \
libxshmfence-dev \
libxxf86vm-dev \
libvdpau-dev \
libva-dev \
llvm-3.9-dev \
libclang-3.9-dev \
llvm-4.0-dev \
libclang-4.0-dev \
llvm-5.0-dev \
llvm-6.0-dev \
llvm-7-dev \
clang-5.0 \
libclang-5.0-dev \
clang-6.0 \
libclang-6.0-dev \
clang-7 \
libclang-7-dev \
libclc-dev \
libxvmc-dev \
libomxil-bellagio-dev \
xz-utils \
libexpat1-dev \
libx11-xcb-dev \
x11proto-xf86vidmode-dev \
libelf-dev \
libunwind8-dev \
libglvnd-dev \
python2.7 \
python-pip \
python-setuptools \
python3.5 \
python3-pip \
python3-setuptools
RUN apt-get install -y \
libxcb-randr0
# autotools build deps
RUN apt-get install -y \
autoconf \
automake \
xutils-dev \
libtool \
bison \
flex \
gettext \
make
# dependencies where we want a specific version
ENV XORG_RELEASES https://xorg.freedesktop.org/releases/individual
ENV XCB_RELEASES https://xcb.freedesktop.org/dist
ENV WAYLAND_RELEASES https://wayland.freedesktop.org/releases
ENV XORGMACROS_VERSION util-macros-1.19.0
ENV GLPROTO_VERSION glproto-1.4.17
ENV DRI2PROTO_VERSION dri2proto-2.8
ENV LIBPCIACCESS_VERSION libpciaccess-0.13.4
ENV LIBDRM_VERSION libdrm-2.4.97
ENV XCBPROTO_VERSION xcb-proto-1.13
ENV RANDRPROTO_VERSION randrproto-1.3.0
ENV LIBXRANDR_VERSION libXrandr-1.3.0
ENV LIBXCB_VERSION libxcb-1.13
ENV LIBXSHMFENCE_VERSION libxshmfence-1.3
ENV LIBVDPAU_VERSION libvdpau-1.1
ENV LIBVA_VERSION libva-1.7.0
ENV LIBWAYLAND_VERSION wayland-1.15.0
ENV WAYLAND_PROTOCOLS_VERSION wayland-protocols-1.8
RUN wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
RUN tar -xvf $XORGMACROS_VERSION.tar.bz2 && rm $XORGMACROS_VERSION.tar.bz2
RUN (cd $XORGMACROS_VERSION && ./configure && make install) && rm -rf $XORGMACROS_VERSION
RUN wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
RUN tar -xvf $GLPROTO_VERSION.tar.bz2 && rm $GLPROTO_VERSION.tar.bz2
RUN (cd $GLPROTO_VERSION && ./configure && make install) && rm -rf $GLPROTO_VERSION
RUN wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
RUN tar -xvf $DRI2PROTO_VERSION.tar.bz2 && rm $DRI2PROTO_VERSION.tar.bz2
RUN (cd $DRI2PROTO_VERSION && ./configure && make install) && rm -rf $DRI2PROTO_VERSION
RUN wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
RUN tar -xvf $XCBPROTO_VERSION.tar.bz2 && rm $XCBPROTO_VERSION.tar.bz2
RUN (cd $XCBPROTO_VERSION && ./configure && make install) && rm -rf $XCBPROTO_VERSION
RUN wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
RUN tar -xvf $LIBXCB_VERSION.tar.bz2 && rm $LIBXCB_VERSION.tar.bz2
RUN (cd $LIBXCB_VERSION && ./configure && make install) && rm -rf $LIBXCB_VERSION
RUN wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
RUN tar -xvf $LIBPCIACCESS_VERSION.tar.bz2 && rm $LIBPCIACCESS_VERSION.tar.bz2
RUN (cd $LIBPCIACCESS_VERSION && ./configure && make install) && rm -rf $LIBPCIACCESS_VERSION
RUN wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
RUN tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2
RUN (cd $LIBDRM_VERSION && ./configure --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install) && rm -rf $LIBDRM_VERSION
RUN wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2
RUN tar -xvf $RANDRPROTO_VERSION.tar.bz2 && rm $RANDRPROTO_VERSION.tar.bz2
RUN (cd $RANDRPROTO_VERSION && ./configure && make install) && rm -rf $RANDRPROTO_VERSION
RUN wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2
RUN tar -xvf $LIBXRANDR_VERSION.tar.bz2 && rm $LIBXRANDR_VERSION.tar.bz2
RUN (cd $LIBXRANDR_VERSION && ./configure && make install) && rm -rf $LIBXRANDR_VERSION
RUN wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
RUN tar -xvf $LIBXSHMFENCE_VERSION.tar.bz2 && rm $LIBXSHMFENCE_VERSION.tar.bz2
RUN (cd $LIBXSHMFENCE_VERSION && ./configure && make install) && rm -rf $LIBXSHMFENCE_VERSION
RUN wget https://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
RUN tar -xvf $LIBVDPAU_VERSION.tar.bz2 && rm $LIBVDPAU_VERSION.tar.bz2
RUN (cd $LIBVDPAU_VERSION && ./configure && make install) && rm -rf $LIBVDPAU_VERSION
RUN wget https://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
RUN tar -xvf $LIBVA_VERSION.tar.bz2 && rm $LIBVA_VERSION.tar.bz2
RUN (cd $LIBVA_VERSION && ./configure --disable-wayland --disable-dummy-driver && make install) && rm -rf $LIBVA_VERSION
RUN wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
RUN tar -xvf $LIBWAYLAND_VERSION.tar.xz && rm $LIBWAYLAND_VERSION.tar.xz
RUN (cd $LIBWAYLAND_VERSION && ./configure --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install) && rm -rf $LIBWAYLAND_VERSION
RUN wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
RUN tar -xvf $WAYLAND_PROTOCOLS_VERSION.tar.xz && rm $WAYLAND_PROTOCOLS_VERSION.tar.xz
RUN (cd $WAYLAND_PROTOCOLS_VERSION && ./configure && make install) && rm -rf $WAYLAND_PROTOCOLS_VERSION
RUN apt-get install -y unzip
# Meson requires ninja >= 1.6, but xenial has 1.3.x
RUN wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip
RUN unzip ninja-linux.zip && rm ninja-linux.zip
RUN mv ninja /usr/bin/
RUN pip3 install 'meson>=0.49'
RUN pip2 install 'scons>=2.4'
RUN pip2 install mako
RUN pip3 install mako
# Use ccache to speed up builds
RUN apt-get install -y ccache
# Cleanup workdir
WORKDIR /

View File

@@ -265,6 +265,9 @@ Kristian Høgsberg <krh@bitplanet.net> <krh@hinata.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <krh@sasori.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <krh@temari.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <kristian.h.kristensen@intel.com>
Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@chromium.org>
Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@google.com>
Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@gmail.com>
Krzesimir Nowak <qdlacz@gmail.com> <krzesimir@kinvolk.io>

View File

@@ -1,490 +1,16 @@
language: c
sudo: false
dist: trusty
dist: xenial
cache:
apt: true
ccache: true
env:
global:
- XORG_RELEASES=https://xorg.freedesktop.org/releases/individual
- XCB_RELEASES=https://xcb.freedesktop.org/dist
- WAYLAND_RELEASES=https://wayland.freedesktop.org/releases
- XORGMACROS_VERSION=util-macros-1.19.0
- GLPROTO_VERSION=glproto-1.4.17
- DRI2PROTO_VERSION=dri2proto-2.8
- LIBPCIACCESS_VERSION=libpciaccess-0.13.4
- LIBDRM_VERSION=libdrm-2.4.74
- XCBPROTO_VERSION=xcb-proto-1.13
- RANDRPROTO_VERSION=randrproto-1.3.0
- LIBXRANDR_VERSION=libXrandr-1.3.0
- LIBXCB_VERSION=libxcb-1.13
- LIBXSHMFENCE_VERSION=libxshmfence-1.2
- LIBVDPAU_VERSION=libvdpau-1.1
- LIBVA_VERSION=libva-1.7.0
- LIBWAYLAND_VERSION=wayland-1.15.0
- WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
- PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig
- LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
- PATH="$HOME/prefix/bin:$PATH"
- PKG_CONFIG_PATH="$PKG_CONFIG_PATH"
matrix:
include:
- env:
- LABEL="meson Vulkan"
- BUILD=meson
- DRI_DRIVERS=""
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS="intel,amd"
- LLVM_VERSION=6.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
addons:
apt:
sources:
- llvm-toolchain-trusty-6.0
# llvm-6 requires libstdc++4.9 which is not in main repo
- ubuntu-toolchain-r-test
packages:
# From sources above
- llvm-6.0-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- python3.5
- python3-pip
- env:
- LABEL="meson loaders/classic DRI"
- BUILD=meson
- DRI_DRIVERS="i915,i965,r100,r200,swrast,nouveau"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS=""
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- python3.5
- python3-pip
- env:
- LABEL="make loaders/classic DRI"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make check"
- DRI_LOADERS="--enable-glx --enable-gbm --enable-egl --with-platforms=x11,drm,surfaceless,wayland --enable-osmesa"
- DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--disable-libunwind"
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- python3-pip
- env:
# NOTE: Building SWR is 2x (yes two) times slower than all the other
# gallium drivers combined.
# Start this early so that it doesn't hunder the run time.
- LABEL="make Gallium Drivers SWR"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=6.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="swr"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-6.0
# llvm-6 requires libstdc++4.9 which is not in main repo
- ubuntu-toolchain-r-test
packages:
# From sources above
- llvm-6.0-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- python3-pip
- env:
- LABEL="make Gallium Drivers RadeonSI"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=6.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-6.0
# llvm-6 requires libstdc++4.9 which is not in main repo
- ubuntu-toolchain-r-test
packages:
# From sources above
- llvm-6.0-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- python3-pip
- env:
- LABEL="make Gallium Drivers Other"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv,imx"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- python3-pip
- env:
- LABEL="make Gallium ST Clover LLVM-3.9"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.7
- OVERRIDE_CXX=g++-4.7
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.7
# From sources above
- llvm-3.9-dev
- clang-3.9
- libclang-3.9-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- python3-pip
- env:
- LABEL="make Gallium ST Clover LLVM-4.0"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.8
- OVERRIDE_CXX=g++-4.8
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-4.0
packages:
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.8
# From sources above
- llvm-4.0-dev
- clang-4.0
- libclang-4.0-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- python3-pip
- env:
- LABEL="make Gallium ST Clover LLVM-5.0"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=5.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.8
- OVERRIDE_CXX=g++-4.8
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-5.0
packages:
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.8
# From sources above
- llvm-5.0-dev
- clang-5.0
- libclang-5.0-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- python3-pip
- env:
- LABEL="make Gallium ST Clover LLVM-6.0"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=6.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-6.0
# llvm-6 requires libstdc++4.9 which is not in main repo
- ubuntu-toolchain-r-test
packages:
- libclc-dev
# From sources above
- llvm-6.0-dev
- clang-6.0
- libclang-6.0-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- python3-pip
- env:
- LABEL="make Gallium ST Clover LLVM-7"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=7
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- sourceline: 'deb http://apt.llvm.org/trusty/ llvm-toolchain-trusty-7 main'
key_url: https://apt.llvm.org/llvm-snapshot.gpg.key
# llvm-7 requires libstdc++4.9 which is not in main repo
- ubuntu-toolchain-r-test
packages:
- libclc-dev
# From sources above
- llvm-7-dev
- clang-7
- libclang-7-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium ST Other"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.3
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --enable-xa --enable-nine --enable-xvmc --enable-vdpau --enable-va --enable-omx-bellagio --enable-gallium-osmesa"
# We need swrast for osmesa and nine.
# i915 most likely doesn't work with most ST.
# Regardless - we're doing a quick build test here.
- GALLIUM_DRIVERS="i915,swrast"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
packages:
# We actually want to test against llvm-3.3
- llvm-3.3-dev
# Nine requires gcc 4.6... which is the one we have right ?
- libxvmc-dev
# Build locally, for now.
#- libvdpau-dev
#- libva-dev
- libomxil-bellagio-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- python3-pip
- env:
- LABEL="make Vulkan"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel check"
- LLVM_VERSION=6.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl --with-platforms=x11,wayland"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --enable-dri3 --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS="intel,radeon"
- LIBUNWIND_FLAGS="--disable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-6.0
# llvm-6 requires libstdc++4.9 which is not in main repo
- ubuntu-toolchain-r-test
packages:
# From sources above
- llvm-6.0-dev
# Common
- xz-utils
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- python3-pip
- env:
- LABEL="scons"
- BUILD=scons
- SCONSFLAGS="-j4"
# Explicitly disable.
- SCONS_TARGET="llvm=0"
# Keep it symmetrical to the make build.
- SCONS_CHECK_COMMAND="scons llvm=0 check"
addons:
apt:
packages:
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="scons LLVM"
- BUILD=scons
- SCONSFLAGS="-j4"
- SCONS_TARGET="llvm=1"
# Keep it symmetrical to the make build.
- SCONS_CHECK_COMMAND="scons llvm=1 check"
- LLVM_VERSION=3.3
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
addons:
apt:
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- llvm-3.3-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="scons SWR"
- BUILD=scons
- SCONSFLAGS="-j4"
- SCONS_TARGET="swr=1"
- LLVM_VERSION=6.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# Keep it symmetrical to the make build. There's no actual SWR, yet.
- SCONS_CHECK_COMMAND="true"
addons:
apt:
sources:
- llvm-toolchain-trusty-6.0
# llvm-6 requires libstdc++4.9 which is not in main repo
- ubuntu-toolchain-r-test
packages:
# From sources above
- llvm-6.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="macOS make"
- BUILD=make
@@ -495,6 +21,9 @@ matrix:
- env:
- LABEL="macOS meson"
- BUILD=meson
- UNWIND="false"
- DRI_LOADERS="-Dglx=dri -Dgbm=false -Degl=false -Dplatforms=x11 -Dosmesa=none"
- GALLIUM_ST="-Ddri3=true -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled"
os: osx
before_install:
@@ -522,10 +51,8 @@ before_install:
install:
# Install a more modern meson from pip, since the version in the
# ubuntu repos is often quite old. This requires python>=3.5, so
# let's make it default
# ubuntu repos is often quite old.
- if test "x$BUILD" = xmeson; then
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.5 10;
pip3 install --user meson;
pip3 install --user mako;
fi
@@ -535,135 +62,18 @@ install:
pip2 install --user mako;
fi
# Install a more modern scons from pip.
- if test "x$BUILD" = xscons; then
pip2 install --user "scons>=2.4";
pip2 install --user mako;
fi
# Since libdrm gets updated in configure.ac regularly, try to pick up the
# latest version from there.
- for line in `grep "^LIBDRM.*_REQUIRED=" configure.ac`; do
old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`;
new_ver=`echo $line | sed 's/.*REQUIRED=//'`;
if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then
export LIBDRM_VERSION="libdrm-$new_ver";
fi;
done
# Install dependencies where we require specific versions (or where
# disallowed by Travis CI's package whitelisting).
- |
if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then
wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
tar -jxvf $XORGMACROS_VERSION.tar.bz2
(cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
tar -jxvf $GLPROTO_VERSION.tar.bz2
(cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
tar -jxvf $DRI2PROTO_VERSION.tar.bz2
(cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
tar -jxvf $XCBPROTO_VERSION.tar.bz2
(cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
tar -jxvf $LIBXCB_VERSION.tar.bz2
(cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2
(cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
tar -jxvf $LIBDRM_VERSION.tar.bz2
(cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)
wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2
tar -jxvf $RANDRPROTO_VERSION.tar.bz2
(cd $RANDRPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2
tar -jxvf $LIBXRANDR_VERSION.tar.bz2
(cd $LIBXRANDR_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
(cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget https://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
tar -jxvf $LIBVDPAU_VERSION.tar.bz2
(cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget https://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
tar -jxvf $LIBVA_VERSION.tar.bz2
(cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
tar -axvf $LIBWAYLAND_VERSION.tar.xz
(cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz
(cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)
# Meson requires ninja >= 1.6, but trusty has 1.3.x
wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip
unzip ninja-linux.zip
mv ninja $HOME/prefix/bin/
# Generate this header since one is missing on the Travis instance
mkdir -p linux
printf "%s\n" \
"#ifndef _LINUX_MEMFD_H" \
"#define _LINUX_MEMFD_H" \
"" \
"#define MFD_CLOEXEC 0x0001U" \
"#define MFD_ALLOW_SEALING 0x0002U" \
"" \
"#endif /* _LINUX_MEMFD_H */" > linux/memfd.h
# Generate this header, including the missing SYS_memfd_create
# macro, which is not provided by the header in the Travis
# instance
mkdir -p sys
printf "%s\n" \
"#ifndef _SYSCALL_H" \
"#define _SYSCALL_H 1" \
"" \
"#include <asm/unistd.h>" \
"" \
"#ifndef _LIBC" \
"# include <bits/syscall.h>" \
"#endif" \
"" \
"#ifndef __NR_memfd_create" \
"# define __NR_memfd_create 319 /* Taken from <asm/unistd_64.h> */" \
"#endif" \
"" \
"#ifndef SYS_memfd_create" \
"# define SYS_memfd_create __NR_memfd_create" \
"#endif" \
"" \
"#endif" > sys/syscall.h
fi
script:
- if test "x$BUILD" = xmake; then
test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
test -n "$OVERRIDE_PATH" && export PATH="$OVERRIDE_PATH:$PATH";
export CFLAGS="$CFLAGS -isystem`pwd`";
mkdir build &&
cd build &&
../autogen.sh --enable-debug
../autogen.sh
--enable-autotools
--enable-debug
$LIBUNWIND_FLAGS
$DRI_LOADERS
--with-dri-drivers=$DRI_DRIVERS
@@ -675,42 +85,30 @@ script:
make && eval $MAKE_CHECK_COMMAND;
fi
- if test "x$BUILD" = xscons; then
test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
fi
- |
if test "x$BUILD" = xmeson; then
if test -n "$LLVM_CONFIG"; then
# We need to control the version of llvm-config we're using, so we'll
# generate a native file to do so. This requires meson >=0.49
#
echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file
if test "x$TRAVIS_OS_NAME" == xosx; then
MESON_OPTIONS="-Degl=false"
$LLVM_CONFIG --version
else
: > native.file
fi
if test "x$TRAVIS_OS_NAME" == xlinux; then
MESON_OPTIONS="-Ddri-drivers=${DRI_DRIVERS:-[]} -Dgallium-drivers=${GALLIUM_DRIVERS:-[]} -Dvulkan-drivers=${VULKAN_DRIVERS:-[]}"
fi
# Travis CI has moved to LLVM 5.0, and meson is detecting
# automatically the available version in /usr/local/bin based on
# the PATH env variable order preference.
#
# As for 0.44.x, Meson cannot receive the path to the
# llvm-config binary as a configuration parameter. See
# https://github.com/mesonbuild/meson/issues/2887 and
# https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
#
# We want to use the custom (APT) installed version. Therefore,
# let's make Meson find our wanted version sooner than the one
# at /usr/local/bin
#
# Once this is corrected, we would still need a patch similar
# to:
# https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
test -f /usr/bin/$LLVM_CONFIG && ln -s /usr/bin/$LLVM_CONFIG $HOME/prefix/bin/llvm-config
export CFLAGS="$CFLAGS -isystem`pwd`"
meson _build $MESON_OPTIONS
meson _build \
--native-file=native.file \
-Dbuild-tests=true \
-Dlibunwind=${UNWIND} \
${DRI_LOADERS} \
-Ddri-drivers=${DRI_DRIVERS:-[]} \
${GALLIUM_ST} \
-Dgallium-drivers=${GALLIUM_DRIVERS:-[]} \
-Dvulkan-drivers=${VULKAN_DRIVERS:-[]}
meson configure _build
ninja -C _build
ninja -C _build test
fi

View File

@@ -37,7 +37,6 @@ LOCAL_CFLAGS += \
-Wno-missing-field-initializers \
-Wno-initializer-overrides \
-Wno-mismatched-tags \
-DVERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"

View File

@@ -24,7 +24,7 @@
# BOARD_GPU_DRIVERS should be defined. The valid values are
#
# classic drivers: i915 i965
# gallium drivers: swrast freedreno i915g nouveau pl111 r300g r600g radeonsi vc4 virgl vmwgfx etnaviv imx
# gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv iris
#
# The main target is libGLES_mesa. For each classic driver enabled, a DRI
# module will also be built. DRI modules will be loaded by libGLES_mesa.
@@ -52,7 +52,7 @@ gallium_drivers := \
freedreno.HAVE_GALLIUM_FREEDRENO \
i915g.HAVE_GALLIUM_I915 \
nouveau.HAVE_GALLIUM_NOUVEAU \
pl111.HAVE_GALLIUM_PL111 \
kmsro.HAVE_GALLIUM_KMSRO \
r300g.HAVE_GALLIUM_R300 \
r600g.HAVE_GALLIUM_R600 \
radeonsi.HAVE_GALLIUM_RADEONSI \
@@ -60,7 +60,7 @@ gallium_drivers := \
vc4.HAVE_GALLIUM_VC4 \
virgl.HAVE_GALLIUM_VIRGL \
etnaviv.HAVE_GALLIUM_ETNAVIV \
imx.HAVE_GALLIUM_IMX
iris.HAVE_GALLIUM_IRIS
ifeq ($(BOARD_GPU_DRIVERS),all)
MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))

View File

@@ -22,6 +22,7 @@
SUBDIRS = src
AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-autotools \
--enable-dri \
--enable-dri3 \
--enable-egl \
@@ -45,7 +46,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-libunwind \
--with-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv,imx \
--with-gallium-drivers=i915,nouveau,r300,kmsro,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv \
--with-vulkan-drivers=intel,radeon
ACLOCAL_AMFLAGS = -I m4

View File

@@ -9,25 +9,6 @@ This repository lives at https://gitlab.freedesktop.org/mesa/mesa.
Other repositories are likely forks, and code found there is not supported.
Build status
------------
Travis:
.. image:: https://travis-ci.org/mesa3d/mesa.svg?branch=master
:target: https://travis-ci.org/mesa3d/mesa
Appveyor:
.. image:: https://img.shields.io/appveyor/ci/mesa3d/mesa.svg
:target: https://ci.appveyor.com/project/mesa3d/mesa
Coverity:
.. image:: https://scan.coverity.com/projects/139/badge.svg?flat=1
:target: https://scan.coverity.com/projects/mesa
Build & install
---------------

View File

@@ -72,7 +72,9 @@ F: src/loader/
EGL
R: Eric Engestrom <eric@engestrom.ch>
R: Emil Velikov <emil.l.velikov@gmail.com>
F: src/egl/
F: include/EGL/
HAIKU
R: Alexander von Gluck IV <kallisti5@unixzen.com>
@@ -136,3 +138,8 @@ F: src/gallium/drivers/freedreno/
GLX
R: Adam Jackson <ajax@redhat.com>
F: src/glx/
VULKAN
R: Eric Engestrom <eric@engestrom.ch>
F: src/vulkan/
F: include/vulkan/

View File

@@ -1 +1 @@
18.3.0-rc6
19.1.0-devel

View File

@@ -1,2 +0,0 @@
# fixes: Commit was squashed into the respective offenders
c02390f8fcd367c7350db568feabb2f062efca14 egl/wayland: rather obvious build fix

View File

@@ -13,39 +13,54 @@
is_stable_nomination()
{
git show --summary "$1" | grep -q -i -o "CC:.*mesa-stable"
git show --pretty=medium --summary "$1" | grep -q -i -o "CC:.*mesa-stable"
}
is_typod_nomination()
{
git show --summary "$1" | grep -q -i -o "CC:.*mesa-dev"
git show --pretty=medium --summary "$1" | grep -q -i -o "CC:.*mesa-dev"
}
fixes=
# Helper to handle various mistypos of the fixes tag.
# The tag string itself is passed as argument and normalised within.
#
# Resulting string in the global variable "fixes" and contains entries
# in the form "fixes:$sha"
is_sha_nomination()
{
fixes=`git show --pretty=medium -s $1 | tr -d "\n" | \
sed -e 's/'"$2"'/\nfixes:/Ig' | \
grep -Eo 'fixes:[a-f0-9]{8,40}'`
fixes_count=`echo "$fixes" | wc -l`
fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`
if test $fixes_count -eq 0; then
return 0
return 1
fi
# Throw a warning for each invalid sha
while test $fixes_count -gt 0; do
# Treat only the current line
id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`
fixes_count=$(($fixes_count-1))
# Bail out if we cannot find suitable id.
# Any specific validation the $id is valid and not some junk, is
# implied with the follow up code
if test "x$id" = x; then
continue
if ! git show $id >/dev/null 2>&1; then
echo WARNING: Commit $1 lists invalid sha $id
fi
done
#Check if the offending commit is in branch.
return 0
}
# Checks if at least one of offending commits, listed in the global
# "fixes", is in branch.
sha_in_range()
{
fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`
while test $fixes_count -gt 0; do
# Treat only the current line
id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`
fixes_count=$(($fixes_count-1))
# Be that cherry-picked ...
# ... or landed before the branchpoint.
@@ -103,22 +118,32 @@ do
continue
fi
if is_stable_nomination "$sha"; then
tag=stable
elif is_typod_nomination "$sha"; then
tag=typod
elif is_fixes_nomination "$sha"; then
if is_fixes_nomination "$sha"; then
tag=fixes
elif is_brokenby_nomination "$sha"; then
tag=brokenby
elif is_revert_nomination "$sha"; then
tag=revert
elif is_stable_nomination "$sha"; then
tag=stable
elif is_typod_nomination "$sha"; then
tag=typod
else
continue
fi
case "$tag" in
fixes | brokenby | revert )
if ! sha_in_range; then
continue
fi
;;
* )
;;
esac
printf "[ %8s ] " "$tag"
git --no-pager show --summary --oneline $sha
git --no-pager show --no-patch --oneline $sha
done
rm -f already_picked

88
bin/meson-cmd-extract.py Executable file
View File

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
# Copyright © 2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""This script reads a meson build directory and gives back the command line it
was configured with.
This only works for meson 0.49.0 and newer.
"""
import argparse
import ast
import configparser
import pathlib
import sys
def parse_args() -> argparse.Namespace:
"""Parse arguments."""
parser = argparse.ArgumentParser()
parser.add_argument(
'build_dir',
help='Path the meson build directory')
args = parser.parse_args()
return args
def load_config(path: pathlib.Path) -> configparser.ConfigParser:
"""Load config file."""
conf = configparser.ConfigParser()
with path.open() as f:
conf.read_file(f)
return conf
def build_cmd(conf: configparser.ConfigParser) -> str:
"""Rebuild the command line."""
args = []
for k, v in conf['options'].items():
if ' ' in v:
args.append(f'-D{k}="{v}"')
else:
args.append(f'-D{k}={v}')
cf = conf['properties'].get('cross_file')
if cf:
args.append('--cross-file={}'.format(cf))
nf = conf['properties'].get('native_file')
if nf:
# this will be in the form "['str', 'str']", so use ast.literal_eval to
# convert it to a list of strings.
nf = ast.literal_eval(nf)
args.extend(['--native-file={}'.format(f) for f in nf])
return ' '.join(args)
def main():
args = parse_args()
path = pathlib.Path(args.build_dir, 'meson-private', 'cmd_line.txt')
if not path.exists():
print('Cannot find the necessary file to rebuild command line. '
'Is your meson version >= 0.49.0?', file=sys.stderr)
sys.exit(1)
conf = load_config(path)
cmd = build_cmd(conf)
print(cmd)
if __name__ == '__main__':
main()

63
bin/meson-options.py Executable file
View File

@@ -0,0 +1,63 @@
#!/usr/bin/env python3
from os import get_terminal_size
from textwrap import wrap
from mesonbuild import coredata
from mesonbuild import optinterpreter
(COLUMNS, _) = get_terminal_size()
def describe_option(option_name: str, option_default_value: str,
option_type: str, option_message: str) -> None:
print('name: ' + option_name)
print('default: ' + option_default_value)
print('type: ' + option_type)
for line in wrap(option_message, width=COLUMNS - 9):
print(' ' + line)
print('---')
oi = optinterpreter.OptionInterpreter('')
oi.process('meson_options.txt')
for (name, value) in oi.options.items():
if isinstance(value, coredata.UserStringOption):
describe_option(name,
value.value,
'string',
"You can type what you want, but make sure it makes sense")
elif isinstance(value, coredata.UserBooleanOption):
describe_option(name,
'true' if value.value else 'false',
'boolean',
"You can set it to 'true' or 'false'")
elif isinstance(value, coredata.UserIntegerOption):
describe_option(name,
str(value.value),
'integer',
"You can set it to any integer value between '{}' and '{}'".format(value.min_value, value.max_value))
elif isinstance(value, coredata.UserUmaskOption):
describe_option(name,
str(value.value),
'umask',
"You can set it to 'preserve' or a value between '0000' and '0777'")
elif isinstance(value, coredata.UserComboOption):
choices = '[' + ', '.join(["'" + v + "'" for v in value.choices]) + ']'
describe_option(name,
value.value,
'combo',
"You can set it to any one of those values: " + choices)
elif isinstance(value, coredata.UserArrayOption):
choices = '[' + ', '.join(["'" + v + "'" for v in value.choices]) + ']'
value = '[' + ', '.join(["'" + v + "'" for v in value.value]) + ']'
describe_option(name,
value,
'array',
"You can set it to one or more of those values: " + choices)
elif isinstance(value, coredata.UserFeatureOption):
describe_option(name,
value.value,
'feature',
"You can set it to 'auto', 'enabled', or 'disabled'")
else:
print(name + ' is an option of a type unknown to this script')
print('---')

View File

@@ -52,6 +52,19 @@ mingw*)
;;
esac
AC_ARG_ENABLE(autotools,
[AS_HELP_STRING([--enable-autotools],
[Enable the use of this autotools based build configuration])],
[enable_autotools=$enableval], [enable_autotools=no])
if test "x$enable_autotools" != "xyes" ; then
AC_MSG_ERROR([the autotools build system has been deprecated in favour of
meson and will be removed eventually. For instructions on how to use meson
see https://www.mesa3d.org/meson.html.
If you still want to use the autotools build, then add --enable-autotools
to the configure command line.])
fi
# Support silent build rules, requires at least automake-1.11. Disable
# by either passing --disable-silent-rules to configure or passing V=1
# to make
@@ -74,7 +87,7 @@ AC_SUBST([OPENCL_VERSION])
# in the first entry.
LIBDRM_REQUIRED=2.4.75
LIBDRM_RADEON_REQUIRED=2.4.71
LIBDRM_AMDGPU_REQUIRED=2.4.95
LIBDRM_AMDGPU_REQUIRED=2.4.97
LIBDRM_INTEL_REQUIRED=2.4.75
LIBDRM_NVVIEUX_REQUIRED=2.4.66
LIBDRM_NOUVEAU_REQUIRED=2.4.66
@@ -107,8 +120,8 @@ dnl LLVM versions
LLVM_REQUIRED_GALLIUM=3.3.0
LLVM_REQUIRED_OPENCL=3.9.0
LLVM_REQUIRED_R600=3.9.0
LLVM_REQUIRED_RADEONSI=6.0.0
LLVM_REQUIRED_RADV=6.0.0
LLVM_REQUIRED_RADEONSI=7.0.0
LLVM_REQUIRED_RADV=7.0.0
LLVM_REQUIRED_SWR=6.0.0
dnl Check for progs
@@ -1395,7 +1408,7 @@ GALLIUM_DRIVERS_DEFAULT="r300,r600,svga,swrast"
AC_ARG_WITH([gallium-drivers],
[AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@],
[comma delimited Gallium drivers list, e.g.
"i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,tegra,v3d,vc4,virgl,etnaviv,imx"
"i915,nouveau,r300,r600,radeonsi,freedreno,kmsro,svga,swrast,swr,tegra,v3d,vc4,virgl,etnaviv"
@<:@default=r300,r600,svga,swrast@:>@])],
[with_gallium_drivers="$withval"],
[with_gallium_drivers="$GALLIUM_DRIVERS_DEFAULT"])
@@ -1716,6 +1729,8 @@ xdri)
if test x"$enable_dri" = xyes; then
dri_modules="$dri_modules xcb-dri2 >= $XCBDRI2_REQUIRED"
fi
dri_modules="$dri_modules xxf86vm"
fi
if test x"$dri_platform" = xapple ; then
DEFINES="$DEFINES -DGLX_USE_APPLEGL"
@@ -1725,8 +1740,6 @@ xdri)
fi
fi
dri_modules="$dri_modules xxf86vm"
PKG_CHECK_MODULES([DRIGL], [$dri_modules])
GL_PC_REQ_PRIV="$GL_PC_REQ_PRIV $dri_modules"
X11_INCLUDES="$X11_INCLUDES $DRIGL_CFLAGS"
@@ -1864,6 +1877,7 @@ for plat in $platforms; do
;;
drm)
test "x$enable_egl" = "xyes" &&
test "x$enable_gbm" = "xno" &&
AC_MSG_ERROR([EGL platform drm needs gbm])
DEFINES="$DEFINES -DHAVE_DRM_PLATFORM"
@@ -1908,7 +1922,7 @@ if test x"$enable_dri3" = xyes; then
dri3_modifier_modules="xcb-dri3 >= $XCBDRI3_MODIFIERS_REQUIRED xcb-present >= $XCBPRESENT_MODIFIERS_REQUIRED"
PKG_CHECK_MODULES([XCB_DRI3_MODIFIERS], [$dri3_modifier_modules], [have_dri3_modifiers=yes], [have_dri3_modifiers=no])
if test "x$have_dri3_modifiers" == xyes; then
if test "x$have_dri3_modifiers" = xyes; then
DEFINES="$DEFINES -DHAVE_DRI3_MODIFIERS"
fi
fi
@@ -2727,9 +2741,6 @@ if test -n "$with_gallium_drivers"; then
PKG_CHECK_MODULES([ETNAVIV], [libdrm >= $LIBDRM_ETNAVIV_REQUIRED libdrm_etnaviv >= $LIBDRM_ETNAVIV_REQUIRED])
require_libdrm "etnaviv"
;;
ximx)
HAVE_GALLIUM_IMX=yes
;;
xtegra)
HAVE_GALLIUM_TEGRA=yes
require_libdrm "tegra"
@@ -2816,8 +2827,8 @@ if test -n "$with_gallium_drivers"; then
DEFINES="$DEFINES -DUSE_V3D_SIMULATOR"],
[USE_V3D_SIMULATOR=no])
;;
xpl111)
HAVE_GALLIUM_PL111=yes
xkmsro)
HAVE_GALLIUM_KMSRO=yes
;;
xvirgl)
HAVE_GALLIUM_VIRGL=yes
@@ -2850,12 +2861,8 @@ AM_CONDITIONAL(HAVE_SWR_BUILTIN, test "x$HAVE_SWR_BUILTIN" = xyes)
dnl We need to validate some needed dependencies for renderonly drivers.
if test "x$HAVE_GALLIUM_ETNAVIV" != xyes -a "x$HAVE_GALLIUM_IMX" = xyes ; then
AC_MSG_ERROR([Building with imx requires etnaviv])
fi
if test "x$HAVE_GALLIUM_VC4" != xyes -a "x$HAVE_GALLIUM_PL111" = xyes ; then
AC_MSG_ERROR([Building with pl111 requires vc4])
if test "x$HAVE_GALLIUM_VC4" != xyes -a "x$HAVE_GALLIUM_KMSRO" = xyes ; then
AC_MSG_ERROR([Building with kmsro requires vc4])
fi
if test "x$HAVE_GALLIUM_NOUVEAU" != xyes -a "x$HAVE_GALLIUM_TEGRA" = xyes; then
@@ -2903,6 +2910,7 @@ if test "x$enable_llvm" = xyes; then
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_CFLAGS=$LLVM_CPPFLAGS # CPPFLAGS seem to be sufficient
LLVM_CXXFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cxxflags"`
LLVM_CXXFLAGS="$CXX11_CXXFLAGS $LLVM_CXXFLAGS"
dnl Set LLVM_LIBS - This is done after the driver configuration so
dnl that drivers can add additional components to LLVM_COMPONENTS.
@@ -2937,11 +2945,11 @@ if test "x$enable_llvm" = xyes; then
fi
dnl The gallium-xlib GLX and gallium OSMesa targets directly embed the
dnl swr/llvmpipe driver into the final binary. Adding LLVM_LIBS results in
dnl swr/llvmpipe driver into the final binary. Adding LLVM_LIBS results in
dnl the LLVM library propagated in the Libs.private of the respective .pc
dnl file which ensures complete dependency information when statically
dnl linking.
if test "x$enable_glx" == xgallium-xlib; then
if test "x$enable_glx" = xgallium-xlib; then
GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
fi
if test "x$enable_gallium_osmesa" = xyes; then
@@ -2951,14 +2959,13 @@ fi
AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_I915, test "x$HAVE_GALLIUM_I915" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_PL111, test "x$HAVE_GALLIUM_PL111" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_KMSRO, test "x$HAVE_GALLIUM_KMSRO" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_R300, test "x$HAVE_GALLIUM_R300" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_R600, test "x$HAVE_GALLIUM_R600" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_RADEONSI, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_ETNAVIV, test "x$HAVE_GALLIUM_ETNAVIV" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_IMX, test "x$HAVE_GALLIUM_IMX" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_TEGRA, test "x$HAVE_GALLIUM_TEGRA" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
@@ -2997,6 +3004,7 @@ AM_CONDITIONAL(HAVE_AMD_DRIVERS, test "x$HAVE_GALLIUM_RADEONSI" = xyes -o \
AM_CONDITIONAL(HAVE_BROADCOM_DRIVERS, test "x$HAVE_GALLIUM_VC4" = xyes -o \
"x$HAVE_GALLIUM_V3D" = xyes)
AM_CONDITIONAL(HAVE_FREEDRENO_DRIVERS, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
"x$HAVE_I965_DRI" = xyes)
@@ -3043,7 +3051,7 @@ AC_SUBST([XVMC_MAJOR], 1)
AC_SUBST([XVMC_MINOR], 0)
AC_SUBST([XA_MAJOR], 2)
AC_SUBST([XA_MINOR], 4)
AC_SUBST([XA_MINOR], 5)
AC_SUBST([XA_PATCH], 0)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_PATCH")
@@ -3089,6 +3097,7 @@ AC_CONFIG_FILES([Makefile
src/amd/vulkan/Makefile
src/broadcom/Makefile
src/compiler/Makefile
src/freedreno/Makefile
src/egl/Makefile
src/egl/main/egl.pc
src/egl/wayland/wayland-drm/Makefile
@@ -3099,7 +3108,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/drivers/i915/Makefile
src/gallium/drivers/llvmpipe/Makefile
src/gallium/drivers/nouveau/Makefile
src/gallium/drivers/pl111/Makefile
src/gallium/drivers/kmsro/Makefile
src/gallium/drivers/r300/Makefile
src/gallium/drivers/r600/Makefile
src/gallium/drivers/radeonsi/Makefile
@@ -3108,7 +3117,6 @@ AC_CONFIG_FILES([Makefile
src/gallium/drivers/swr/Makefile
src/gallium/drivers/tegra/Makefile
src/gallium/drivers/etnaviv/Makefile
src/gallium/drivers/imx/Makefile
src/gallium/drivers/v3d/Makefile
src/gallium/drivers/vc4/Makefile
src/gallium/drivers/virgl/Makefile
@@ -3143,11 +3151,10 @@ AC_CONFIG_FILES([Makefile
src/gallium/tests/trivial/Makefile
src/gallium/tests/unit/Makefile
src/gallium/winsys/etnaviv/drm/Makefile
src/gallium/winsys/imx/drm/Makefile
src/gallium/winsys/freedreno/drm/Makefile
src/gallium/winsys/i915/drm/Makefile
src/gallium/winsys/nouveau/drm/Makefile
src/gallium/winsys/pl111/drm/Makefile
src/gallium/winsys/kmsro/drm/Makefile
src/gallium/winsys/radeon/drm/Makefile
src/gallium/winsys/amdgpu/drm/Makefile
src/gallium/winsys/svga/drm/Makefile

View File

@@ -26,6 +26,12 @@
</ul>
</ol>
<h2>ATTENTION:</h2>
<p>
The autotools build is being replaced by the <a href="meson.html">meson</a>
build system. If you haven't yet now is a good time to try using meson and
report any issues you run into.
</p>
<h2 id="basic">1. Basic Usage</h2>

View File

@@ -14,7 +14,7 @@
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Bug Database</h1>
<h1>Report a bug</h1>
<p>
The Mesa bug database is hosted on

View File

@@ -49,10 +49,10 @@
<li><a href="precompiled.html" target="_parent">Precompiled Libraries</a>
</ul>
<b>Resources</b>
<b>Need help?</b>
<ul>
<li><a href="lists.html" target="_parent">Mailing Lists</a>
<li><a href="bugs.html" target="_parent">Bug Database</a>
<li><a href="bugs.html" target="_parent">Report a bug</a>
<li><a href="webmaster.html" target="_parent">Webmaster</a>
<li><a href="https://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a>
</ul>

View File

@@ -204,7 +204,7 @@ GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
- specified transform/feedback layout DONE
- input/output block locations DONE
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object DONE (i965/hsw+)
GL_ARB_query_buffer_object DONE (i965/hsw+, virgl)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_texture_stencil8 DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
@@ -319,7 +319,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_EXT_memory_object DONE (radeonsi)
GL_EXT_memory_object_fd DONE (radeonsi)
GL_EXT_memory_object_win32 not started
GL_EXT_render_snorm DONE (i965)
GL_EXT_render_snorm DONE (i965, radeonsi)
GL_EXT_semaphore DONE (radeonsi)
GL_EXT_semaphore_fd DONE (radeonsi)
GL_EXT_semaphore_win32 not started
@@ -338,7 +338,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_OES_texture_float_linear DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float_linear DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_view DONE (i965/gen8+)
GL_OES_texture_view DONE (freedreno, i965/gen8+, r600, radeonsi, nv50, nvc0, softpipe, llvmpipe, swr)
GL_OES_viewport_array DONE (i965, nvc0, radeonsi)
GLX_ARB_context_flush_control not started
GLX_ARB_robustness_application_isolation not started

View File

@@ -15,6 +15,65 @@
<div class="content">
<h1>News</h1>
<h2>February 18, 2019</h2>
<p>
<a href="relnotes/18.3.4.html">Mesa 18.3.4</a> is released.
This is a bug-fix release.
</p>
<h2>January 31, 2019</h2>
<p>
<a href="relnotes/18.3.3.html">Mesa 18.3.3</a> is released.
This is a bug-fix release.
</p>
<h2>January 17, 2019</h2>
<p>
<a href="relnotes/18.3.2.html">Mesa 18.3.2</a> is released.
This is a bug-fix release.
</p>
<h2>December 27, 2018</h2>
<p>
<a href="relnotes/18.2.8.html">Mesa 18.2.8</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 18.2.8 will be the final release in the
18.2 series. Users of 18.2 are encouraged to migrate to the 18.3
series in order to obtain future fixes.
</p>
<h2>December 13, 2018</h2>
<p>
<a href="relnotes/18.2.7.html">Mesa 18.2.7</a> is released.
This is a bug-fix release.
</p>
<h2>December 11, 2018</h2>
<p>
<a href="relnotes/18.3.1.html">Mesa 18.3.1</a> is released.
This is a bug-fix release.
</p>
<h2>December 7, 2018</h2>
<p>
<a href="relnotes/18.3.0.html">Mesa 18.3.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>November 28, 2018</h2>
<p>
<a href="relnotes/18.2.6.html">Mesa 18.2.6</a> is released.
This is a bug-fix release.
</p>
<h2>November 15, 2018</h2>
<p>
<a href="relnotes/18.2.5.html">Mesa 18.2.5</a> is released.
This is a bug-fix release.
</p>
<h2>October 31, 2018</h2>
<p>
<a href="relnotes/18.2.4.html">Mesa 18.2.4</a> is released.

View File

@@ -22,6 +22,7 @@
<li><a href="#prereq-general">General prerequisites</a>
<li><a href="#prereq-dri">For DRI and hardware acceleration</a>
</ul>
<li><a href="#meson">Building with meson</a>
<li><a href="#autoconf">Building with autoconf (Linux/Unix/X11)</a>
<li><a href="#scons">Building with SCons (Windows/Linux)</a>
<li><a href="#android">Building with AOSP (Android)</a>
@@ -39,9 +40,10 @@ Build system.
</p>
<ul>
<li>Autoconf is required when building on *nix platforms.
<li><a href="https://mesonbuild.com">meson</a> is recommended when building on *nix platforms.
<li>Autoconf is another option when building on *nix platforms.
<li><a href="http://www.scons.org/">SCons</a> is required for building on
Windows and optional for Linux (it's an alternative to autoconf/automake.)
Windows and optional for Linux (it's an alternative to autoconf/automake or meson.)
</li>
<li>Android Build system when building as native Android component. Autoconf
is used when when building ARC.
@@ -72,7 +74,9 @@ you think you've spotted a bug let developers know by filing a
<ul>
<li><a href="https://www.python.org/">Python</a> - Python is required.
Version 2.7 or later should work.
When building with scons 2.7 is required.
When building with meson 3.5 or newer is required.
When building with autotools 2.7, or 3.5 or later are required.
</li>
<li><a href="http://www.makotemplates.org/">Python Mako module</a> -
Python Mako module is required. Version 0.8.0 or later should work.
@@ -111,11 +115,31 @@ the packaging tool used by your distro.
... # others
</pre>
<h1 id="autoconf">2. Building with autoconf (Linux/Unix/X11)</h1>
<h1 id="meson">2. Building with meson</h1>
<p>
The primary method to build Mesa on Unix systems is with autoconf.
Meson is the latest build system in mesa, it is currently able to build for
*nix systems like Linux and BSD, and will be able to build for windows as well.
</p>
<p>
The general approach is:
</p>
<pre>
meson builddir/
ninja -C builddir/
sudo ninja -C builddir/ install
</pre>
<p>
Please read the <a href="meson.html">detailed meson instructions</a>
for more information
</p>
<h1 id="autoconf">3. Building with autoconf (Linux/Unix/X11)</h1>
<p>
Although meson is recommended, another supported way to build on *nix systems
is with autoconf.
</p>
<p>
@@ -133,7 +157,7 @@ for more details.
<h1 id="scons">3. Building with SCons (Windows/Linux)</h1>
<h1 id="scons">4. Building with SCons (Windows/Linux)</h1>
<p>
To build Mesa with SCons on Linux or Windows do
@@ -169,7 +193,7 @@ Additional information is available in <a href="README.WIN32">README.WIN32</a>.
<h1 id="android">4. Building with AOSP (Android)</h1>
<h1 id="android">5. Building with AOSP (Android)</h1>
<p>
Currently one can build Mesa for Android as part of the AOSP project, yet
@@ -188,7 +212,7 @@ Android-x86 and/or other resources.
</p>
<h1 id="libs">5. Library Information</h1>
<h1 id="libs">6. Library Information</h1>
<p>
When compilation has finished, look in the top-level <code>lib/</code>
@@ -226,7 +250,7 @@ versions of libGL and device drivers.
</p>
<h1 id="pkg-config">6. Building OpenGL programs with pkg-config</h1>
<h1 id="pkg-config">7. Building OpenGL programs with pkg-config</h1>
<p>
Running <code>make install</code> will install package configuration files

View File

@@ -29,6 +29,9 @@ pre {
/*font-family: monospace;*/
font-size: 10pt;
/*color: black;*/
background-color: #eee;
margin-left: 2em;
padding: .5em;
}
iframe {

View File

@@ -16,6 +16,11 @@
<h1>Compilation and Installation using Meson</h1>
<ul>
<li><a href="#basic">Basic Usage</a></li>
<li><a href="#cross-compilation">Cross-compilation and 32-bit builds</a></li>
</ul>
<h2 id="basic">1. Basic Usage</h2>
<p><strong>The Meson build system is generally considered stable and ready
@@ -48,9 +53,15 @@ To see a description of your options you can run <code>meson configure</code>
along with a build directory to view the selected options for. This will show
your meson global arguments and project arguments, along with their defaults
and your local settings.
</p>
<p>
Meson does not currently support listing options before configure a build
directory, but this feature is being discussed upstream.
For now, we have a <code>bin/meson-options.py</code> script that prints
the options for you.
If that script doesn't work for some reason, you can always look in the
<code>meson_options.txt</code> file at the root of the project.
</p>
<pre>
@@ -105,14 +116,14 @@ to invoke non-default targets for ninja to update them:
<dl>
<dt><code>Environment Variables</code></dt>
<dd><p>Meson supports the standard CC and CXX environment variables for
changing the default compiler, and CFLAGS, CXXFLAGS, and LDFLAGS for setting
options to the compiler and linker during the initial configuration.
changing the default compiler. Meson does support CFLAGS, CXXFLAGS, etc. But
their use is discouraged because of the many caveats in using them. Instead it
is recomended to use <code>-D${lang}_args</code> and
<code>-D${lang}_link_args</code> instead. Among the benefits of these options
is that they are guaranteed to persist across rebuilds and reconfigurations.
These arguments are consumed and stored by meson when it is initialized. To
change these flags after the build is initialized (or when doing a first
initialization), consider using <code>-D${lang}_args</code> and
<code>-D${lang}_link_args</code> instead. Meson will never change compiler in a
configured build directory.
Meson does not allow changing compiler in a configured builddir, you will need
to create a new build dir for a different compiler.
</p>
<pre>
@@ -135,11 +146,56 @@ the popular compilers, a complete list is available
<dt><code>LLVM</code></dt>
<dd><p>Meson includes upstream logic to wrap llvm-config using its standard
dependency interface. It will search <code>$PATH</code> (or <code>%PATH%</code> on windows) for
llvm-config (and llvm-config$version and llvm-config-$version), so using an
LLVM from a non-standard path is as easy as
<code>PATH=/path/with/llvm-config:$PATH meson build</code>.
dependency interface.
</p></dd>
<dd><p>
As of meson 0.49.0 meson also has the concept of a
<a href="https://mesonbuild.com/Native-environments.html">"native file"</a>,
these files provide information about the native build environment (as opposed
to a cross build environment). They are ini formatted and can override where to
find llvm-config:
custom-llvm.ini
<pre>
[binaries]
llvm-config = '/usr/local/bin/llvm/llvm-config'
</pre>
Then configure meson:
<pre>
meson builddir/ --native-file custom-llvm.ini
</pre>
</p></dd>
<dd><p>
For selecting llvm-config for cross compiling a
<a href="https://mesonbuild.com/Cross-compilation.html#defining-the-environment">"cross file"</a>
should be used. It uses the same format as the native file above:
cross-llvm.ini
<pre>
[binaries]
...
llvm-config = '/usr/lib/llvm-config-32'
</pre>
Then configure meson:
<pre>
meson builddir/ --cross-file cross-llvm.ini
</pre>
See the <a href="#cross-compilation">Cross Compilation</a> section for more information.
</dd></p>
<dd><p>
For older versions of meson <code>$PATH</code> (or <code>%PATH%</code> on
windows) will be searched for llvm-config (and llvm-config$version and
llvm-config-$version), you can override this environment variable to control
the search: <code>PATH=/path/with/llvm-config:$PATH meson build</code>.
</dd></p>
</dl>
<dl>
@@ -190,6 +246,93 @@ is unrelated to the <code>buildtype</code>; setting the latter to
</dd>
</dl>
<h2 id="cross-compilation">2. Cross-compilation and 32-bit builds</h2>
<p><a href="https://mesonbuild.com/Cross-compilation.html">Meson supports
cross-compilation</a> by specifying a number of binary paths and
settings in a file and passing this file to <code>meson</code> or
<code>meson configure</code> with the <code>--cross-file</code>
parameter.</p>
<p>This file can live at any location, but you can use the bare filename
(without the folder path) if you put it in $XDG_DATA_HOME/meson/cross or
~/.local/share/meson/cross</p>
<p>Below are a few example of cross files, but keep in mind that you
will likely have to alter them for your system.</p>
<p>
Those running on ArchLinux can use the AUR-maintained packages for some
of those, as they'll have the right values for your system:
<ul>
<li><a href="https://aur.archlinux.org/packages/meson-cross-x86-linux-gnu">meson-cross-x86-linux-gnu</a></li>
<li><a href="https://aur.archlinux.org/packages/meson-cross-aarch64-linux-gnu">meson-cross-aarch64-linux-gnu</a></li>
</ul>
</p>
<p>
32-bit build on x86 linux:
<pre>
[binaries]
c = '/usr/bin/gcc'
cpp = '/usr/bin/g++'
ar = '/usr/bin/gcc-ar'
strip = '/usr/bin/strip'
pkgconfig = '/usr/bin/pkg-config-32'
llvm-config = '/usr/bin/llvm-config32'
[properties]
c_args = ['-m32']
c_link_args = ['-m32']
cpp_args = ['-m32']
cpp_link_args = ['-m32']
[host_machine]
system = 'linux'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'
</pre>
</p>
<p>
64-bit build on ARM linux:
<pre>
[binaries]
c = '/usr/bin/aarch64-linux-gnu-gcc'
cpp = '/usr/bin/aarch64-linux-gnu-g++'
ar = '/usr/bin/aarch64-linux-gnu-gcc-ar'
strip = '/usr/bin/aarch64-linux-gnu-strip'
pkgconfig = '/usr/bin/aarch64-linux-gnu-pkg-config'
exe_wrapper = '/usr/bin/qemu-aarch64-static'
[host_machine]
system = 'linux'
cpu_family = 'aarch64'
cpu = 'aarch64'
endian = 'little'
</pre>
</p>
<p>
64-bit build on x86 windows:
<pre>
[binaries]
c = '/usr/bin/x86_64-w64-mingw32-gcc'
cpp = '/usr/bin/x86_64-w64-mingw32-g++'
ar = '/usr/bin/x86_64-w64-mingw32-ar'
strip = '/usr/bin/x86_64-w64-mingw32-strip'
pkgconfig = '/usr/bin/x86_64-w64-mingw32-pkg-config'
exe_wrapper = 'wine'
[host_machine]
system = 'windows'
cpu_family = 'x86_64'
cpu = 'i686'
endian = 'little'
</pre>
</p>
</div>
</body>
</html>

View File

@@ -23,6 +23,16 @@ Mesa provides feature/development and stable releases.
The table below lists the date and release manager that is expected to do the
specific release.
<br>
Regular updates will ensure that the schedule for the current and the
next two feature releases are shown in the table.
<br>
In order to keep the whole releasing team up to date with the tools
used, best practices and other details, the member in charge of the
next feature release will be in constant rotation.
<br>
The way the release schedule works is
explained <a href="releasing.html#schedule" target="_parent">here</a>.
<br>
Take a look <a href="submittingpatches.html#criteria" target="_parent">here</a>
if you'd like to nominate a patch in the next stable release.
</p>
@@ -39,47 +49,117 @@ if you'd like to nominate a patch in the next stable release.
<th>Notes</th>
</tr>
<tr>
<td rowspan="3">18.2</td>
<td>2018-11-14</td>
<td>18.2.5</td>
<td rowspan="2">18.3</td>
<td>2019-02-27</td>
<td>18.3.5</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-03-13</td>
<td>18.3.6</td>
<td>Emil Velikov</td>
<td>Last planned 18.3.x release</td>
</tr>
<tr>
<td rowspan="4">19.0</td>
<td>2019-01-29</td>
<td>19.0.0-rc1</td>
<td>Dylan Baker</td>
<td>
</tr>
<tr>
<td>2019-02-05</td>
<td>19.0.0-rc2</td>
<td>Dylan Baker</td>
<td>
</tr>
<tr>
<td>2019-02-12</td>
<td>19.0.0-rc3</td>
<td>Dylan Baker</td>
<td>
</tr>
<tr>
<td>2019-02-19</td>
<td>19.0.0-rc4</td>
<td>Dylan Baker</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td rowspan="4">19.1</td>
<td>2019-04-30</td>
<td>19.1.0-rc1</td>
<td>Andres Gomez</td>
<td>
</tr>
<tr>
<td>2019-05-07</td>
<td>19.1.0-rc2</td>
<td>Andres Gomez</td>
<td>
</tr>
<tr>
<td>2019-05-14</td>
<td>19.1.0-rc3</td>
<td>Andres Gomez</td>
<td>
</tr>
<tr>
<td>2019-05-21</td>
<td>19.1.0-rc4</td>
<td>Andres Gomez</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td rowspan="4">19.2</td>
<td>2019-08-06</td>
<td>19.2.0-rc1</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-13</td>
<td>19.2.0-rc2</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-20</td>
<td>19.2.0-rc3</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-27</td>
<td>19.2.0-rc4</td>
<td>Emil Velikov</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td rowspan="4">19.3</td>
<td>2019-10-15</td>
<td>19.3.0-rc1</td>
<td>Juan A. Suarez</td>
<td/>
<td>
</tr>
<tr>
<td>2018-11-28</td>
<td>18.2.6</td>
<td>2019-10-22</td>
<td>19.3.0-rc2</td>
<td>Juan A. Suarez</td>
<td/>
<td>
</tr>
<tr>
<td>2018-12-12</td>
<td>18.2.7</td>
<td>2019-10-29</td>
<td>19.3.0-rc3</td>
<td>Juan A. Suarez</td>
<td>Last planned 18.2.x release</td>
</tr>
<td rowspan="4">18.3</td>
<td>2018-10-31</td>
<td>18.3.0-rc1</td>
<td>Emil Velikov</td>
<td></td>
<td>
</tr>
<tr>
<td>2018-11-07</td>
<td>18.3.0-rc2</td>
<td>Emil Velikov</td>
<td/>
</tr>
<tr>
<td>2018-11-14</td>
<td>18.3.0-rc3</td>
<td>Emil Velikov</td>
<td/>
</tr>
<tr>
<td>2018-11-21</td>
<td>18.3.0-rc4</td>
<td>Emil Velikov</td>
<td>Last planned RC/final release</td>
<td>2019-11-05</td>
<td>19.3.0-rc4</td>
<td>Juan A. Suarez</td>
<td>Last planned RC/Final release</td>
</tr>
</table>

View File

@@ -56,9 +56,10 @@ For example:
<p>
Releases should happen on Wednesdays. Delays can occur although those
should be keep to a minimum.
should be kept to a minimum.
<br>
See our <a href="release-calendar.html" target="_parent">calendar</a> for the
See our <a href="release-calendar.html" target="_parent">calendar</a>
for information about how the release schedule is planned, and the
date and other details for individual releases.
</p>
@@ -67,6 +68,9 @@ date and other details for individual releases.
<li>Available approximately every three months.
<li>Initial timeplan available 2-4 weeks before the planned branchpoint (rc1)
on the mesa-announce@ mailing list.
<li>Typically, the final release will happen after 4
candidates. Additional ones may be needed in order to resolve blocking
regressions, though.
<li>A <a href="#prerelease">pre-release</a> announcement should be available
approximately 24 hours before the final (non-rc) release.
</ul>
@@ -84,6 +88,12 @@ Note: There is one or two releases overlap when changing branches. For example:
<br>
The final release from the 12.0 series Mesa 12.0.5 will be out around the same
time (or shortly after) 13.0.1 is out.
<br>
This also involves that, as a final release may be delayed due to the
need of additional candidates to solve some blocking regression(s),
the release manager might have to update
the <a href="release-calendar.html" target="_parent">calendar</a> with
additional bug fix releases of the current stable branch.
</p>
@@ -112,18 +122,21 @@ the autoconf and scons build.
<p>Done continuously up-to the <a href="#prerelease">pre-release</a> announcement.</p>
<p>
As an exception, patches can be applied up-to the last ~1h before the actual
release. This is made <strong>only</strong> with explicit permission/request,
and the patch <strong>must</strong> be very well contained. Thus it cannot
affect more than one driver/subsystem.
</p>
<p>
Currently Ilia Mirkin and AMD devs have requested "permanent" exception.
Developers can request, <em>as an exception</em>, patches to be applied up-to
the last one hour before the actual release. This is made <strong>only</strong>
with explicit permission/request, and the patch <strong>must</strong> be very
well contained. Thus it cannot affect more than one driver/subsystem.
</p>
<p>Following developers have requested permanent exception</p>
<ul>
<li>make distcheck, scons and scons check must pass
<li><em>Ilia Mirkin</em>
<li><em>AMD team</em>
</ul>
<p>The following must pass:</p>
<ul>
<li>make distcheck, scons and scons check
<li>Testing with different version of system components - LLVM and others is also
performed where possible.
<li>As a general rule, testing with various combinations of configure
@@ -131,9 +144,9 @@ switches, depending on the specific patchset.
</ul>
<p>
Achieved by combination of local ad-hoc scripts, mingw-w64 cross
compilation and AppVeyor plus Travis-CI, the latter as part of their
Github integration.
These are achieved by combination of <a href="basictesting">local testing</a>,
which includes mingw-w64 cross compilation and AppVeyor plus Travis-CI, the
latter two as part of their Github integration.
</p>
<p>
@@ -225,7 +238,7 @@ in the main repository under <code>staging/X.Y</code>. For example:
Notes:
</p>
<ul>
<li>People are encouraged to test the branch and report regressions.</li>
<li>People are encouraged to test the staging branch and report regressions.</li>
<li>The branch history is not stable and it <strong>will</strong> be rebased,</li>
</ul>
@@ -445,7 +458,7 @@ Ensure the latest code is available - both in your local master and the
relevant branch.
</p>
<h3>Perform basic testing</h3>
<h3 id="basictesting">Perform basic testing</h3>
<p>
Most of the testing should already be done during the

View File

@@ -21,6 +21,15 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/18.3.4.html">18.3.4 release notes</a>
<li><a href="relnotes/18.3.3.html">18.3.3 release notes</a>
<li><a href="relnotes/18.3.2.html">18.3.2 release notes</a>
<li><a href="relnotes/18.2.8.html">18.2.8 release notes</a>
<li><a href="relnotes/18.2.7.html">18.2.7 release notes</a>
<li><a href="relnotes/18.3.1.html">18.3.1 release notes</a>
<li><a href="relnotes/18.3.0.html">18.3.0 release notes</a>
<li><a href="relnotes/18.2.6.html">18.2.6 release notes</a>
<li><a href="relnotes/18.2.5.html">18.2.5 release notes</a>
<li><a href="relnotes/18.2.4.html">18.2.4 release notes</a>
<li><a href="relnotes/18.2.3.html">18.2.3 release notes</a>
<li><a href="relnotes/18.2.2.html">18.2.2 release notes</a>

172
docs/relnotes/18.2.5.html Normal file
View File

@@ -0,0 +1,172 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.5 Release Notes / November 15, 2018</h1>
<p>
Mesa 18.2.5 is a bug fix release which fixes bugs found since the 18.2.4 release.
</p>
<p>
Mesa 18.2.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
dddc28928b6f4083a0d5120b58c1c8e2dc189ab5c14299c08a386607fdbbdce7 mesa-18.2.5.tar.gz
b12c32872832e5353155e1e8026e1f1ab75bba9dc5b178d712045684d26c2b73 mesa-18.2.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105731">Bug 105731</a> - linker error &quot;fragment shader input ... has no matching output in the previous stage&quot; when previous stage's output declaration in a separate shader object</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107511">Bug 107511</a> - KHR/khrplatform.h not always installed when needed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108082">Bug 108082</a> - warning: unknown warning option '-Wno-format-truncation' [-Wunknown-warning-option]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108560">Bug 108560</a> - Mesa 32 is built without sse</li>
</ul>
<h2>Changes</h2>
<p>Andre Heider (1):</p>
<ul>
<li>st/nine: fix stack corruption due to ABI mismatch</li>
</ul>
<p>Andrii Simiklit (1):</p>
<ul>
<li>i965/batch: don't ignore the 'brw_new_batch' call for a 'new batch'</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>meson: link gallium nine with pthreads</li>
<li>meson: fix libatomic tests</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>egl/glvnd: correctly report errors when vendor cannot be found</li>
<li>m4: add Werror when checking for compiler flags</li>
</ul>
<p>Eric Engestrom (6):</p>
<ul>
<li>svga: add missing meson build dependency</li>
<li>clover: add missing meson build dependency</li>
<li>wsi/wayland: use proper VkResult type</li>
<li>wsi/wayland: only finish() a successfully init()ed display</li>
<li>configure: install KHR/khrplatform.h when needed</li>
<li>meson: install KHR/khrplatform.h when needed</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>virgl/vtest-winsys: Use virgl version of bind flags</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>intel/tools: include stdarg.h in error2aub</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.4</li>
<li>cherry-ignore: add explicit 18.3 only nominations</li>
<li>cherry-ignore: i965/batch: avoid reverting batch buffer if saved state is an empty</li>
<li>Update version to 18.2.5</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv/android: mark gralloc allocated BOs as external</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>ac: fix ac_build_fdiv for f64</li>
<li>st/va: fix incorrect use of resource_destroy</li>
<li>include: update GL &amp; GLES headers (v2)</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>util/ralloc: Switch from DEBUG to NDEBUG</li>
<li>util/ralloc: Make sizeof(linear_header) a multiple of 8</li>
</ul>
<p>Olivier Fourdan (1):</p>
<ul>
<li>wayland/egl: Resize EGL surface on update buffer for swrast</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>glsl_to_tgsi: don't create 64-bit integer MAD/FMA</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: disable conditional rendering for vkCmdCopyQueryPoolResults()</li>
<li>radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>autotools: library-dependency when no sse and 32-bit</li>
</ul>
<p>Timothy Arceri (4):</p>
<ul>
<li>st/mesa: calculate buffer size correctly for packed uniforms</li>
<li>st/glsl_to_nir: fix next_stage gathering</li>
<li>nir: add glsl_type_is_integer() helper</li>
<li>nir: don't pack varyings ints with floats unless flat</li>
</ul>
<p>Vadym Shovkoplias (1):</p>
<ul>
<li>glsl/linker: Fix out variables linking during single stage</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>r600/sb: Fix constant logical operand in assert.</li>
</ul>
</div>
</body>
</html>

179
docs/relnotes/18.2.6.html Normal file
View File

@@ -0,0 +1,179 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.6 Release Notes / November 28, 2018</h1>
<p>
Mesa 18.2.6 is a bug fix release which fixes bugs found since the 18.2.5 release.
</p>
<p>
Mesa 18.2.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
e0ea1236dbc6c412b02e1b5d7f838072525971a6630246fa82ae4466a6d8a587 mesa-18.2.6.tar.gz
9ebafa4f8249df0c718e93b9ca155e3593a1239af303aa2a8b0f2056a7efdc12 mesa-18.2.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107856">Bug 107856</a> - i965 incorrectly calculates the number of layers for texture views (assert)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108630">Bug 108630</a> - [G965] piglit.spec.!opengl 1_2.tex3d-maxsize spins forever</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108713">Bug 108713</a> - Gallium: use after free with transform feedback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108829">Bug 108829</a> - [meson] libglapi exports internal API</li>
</ul>
<h2>Changes</h2>
<p>Andrii Simiklit (1):</p>
<ul>
<li>i965/batch: avoid reverting batch buffer if saved state is an empty</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Fix opaque metadata descriptor last layer.</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>scons/svga: remove opt from the list of valid build types</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>i965: Fix calculation of layers array length for isl_view</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>meson: Don't set -Wall</li>
<li>meson: Don't force libva to required from auto</li>
</ul>
<p>Emil Velikov (13):</p>
<ul>
<li>bin/get-pick-list.sh: simplify git oneline printing</li>
<li>bin/get-pick-list.sh: prefix output with "[stable] "</li>
<li>bin/get-pick-list.sh: handle "typod" usecase.</li>
<li>bin/get-pick-list.sh: handle the fixes tag</li>
<li>bin/get-pick-list.sh: tweak the commit sha matching pattern</li>
<li>bin/get-pick-list.sh: flesh out is_sha_nomination</li>
<li>bin/get-pick-list.sh: handle fixes tag with missing colon</li>
<li>bin/get-pick-list.sh: handle unofficial "broken by" tag</li>
<li>bin/get-pick-list.sh: use test instead of [ ]</li>
<li>bin/get-pick-list.sh: handle reverts prior to the branchpoint</li>
<li>travis: drop unneeded x11proto-xf86vidmode-dev</li>
<li>glx: make xf86vidmode mandatory for direct rendering</li>
<li>travis: adding missing x11-xcb for meson+vulkan</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Make sure we make ro scanout resources for create_with_modifiers.</li>
</ul>
<p>Eric Engestrom (5):</p>
<ul>
<li>meson: only run vulkan's meson.build when building vulkan</li>
<li>gbm: remove unnecessary meson include</li>
<li>meson: fix wayland-less builds</li>
<li>egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache</li>
<li>glapi: add missing visibility args</li>
</ul>
<p>Erik Faye-Lund (1):</p>
<ul>
<li>mesa/main: remove bogus error for zero-sized images</li>
</ul>
<p>Gert Wollny (3):</p>
<ul>
<li>mesa: Reference count shaders that are used by transform feedback objects</li>
<li>r600: clean up the GS ring buffers when the context is destroyed</li>
<li>glsl: free or reuse memory allocated for TF varying</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir/lower_alu_to_scalar: Don't try to lower unpack_32_2x16</li>
<li>anv: Put robust buffer access in the pipeline hash</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>cherry-ignore: add explicit 18.3 only nominations</li>
<li>cherry-ignore: intel/aub_viewer: fix dynamic state printing</li>
<li>cherry-ignore: intel/aub_viewer: Print blend states properly</li>
<li>cherry-ignore: mesa/main: fix incorrect depth-error</li>
<li>docs: add sha256 checksums for 18.2.5</li>
<li>Update version to 18.2.6</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nir/spirv: cast shift operand to u32</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Add PCI IDs for new Amberlake parts that are Coffeelake based</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>egl/dri: fix error value with unknown drm format</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>winsys/amdgpu: fix a buffer leak in amdgpu_bo_from_handle</li>
<li>winsys/amdgpu: fix a device handle leak in amdgpu_winsys_create</li>
</ul>
<p>Rodrigo Vivi (4):</p>
<ul>
<li>i965: Add a new CFL PCI ID.</li>
<li>intel: aubinator: Adding missed platforms to the error message.</li>
<li>intel: Introducing Amber Lake platform</li>
<li>intel: Introducing Whiskey Lake platform</li>
</ul>
</div>
</body>
</html>

167
docs/relnotes/18.2.7.html Normal file
View File

@@ -0,0 +1,167 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.7 Release Notes / December 13, 2018</h1>
<p>
Mesa 18.2.7 is a bug fix release which fixes bugs found since the 18.2.6 release.
</p>
<p>
Mesa 18.2.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
092351cfbcd430ec595fbd3a3d8d253fd62c29074e1740d7198b00289ab400f8 mesa-18.2.7.tar.gz
9c7b02560d89d77ca279cd21f36ea9a49e9ffc5611f6fe35099357d744d07ae6 mesa-18.2.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106577">Bug 106577</a> - broken rendering with nine and nouveau (GM107)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108245">Bug 108245</a> - RADV/Vega: Low mip levels of large BCn textures get corrupted by vkCmdCopyBufferToImage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108311">Bug 108311</a> - Query buffer object support is broken on r600.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108894">Bug 108894</a> - [anv] vkCmdCopyBuffer() and vkCmdCopyQueryPoolResults() write-after-write hazard</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108909">Bug 108909</a> - Vkd3d test failure test_resolve_non_issued_query_data()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108914">Bug 108914</a> - blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108925">Bug 108925</a> - vkCmdCopyQueryPoolResults(VK_QUERY_RESULT_WAIT_BIT) for timestamps with large query count hangs</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Flush before vkCmdWriteTimestamp() if needed</li>
</ul>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Align large buffers to the fragment size.</li>
<li>radv: Clamp gfx9 image view extents to the allocated image extents.</li>
<li>radv/android: Mark android WSI image as shareable.</li>
<li>radv/android: Use buffer metadata to determine scanout compat.</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>r600: make suballocator 256-bytes align</li>
<li>radv: use 3d shader for gfx9 copies if dst is 3d</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>egl/wayland: bail out when drmGetMagic fails</li>
<li>egl/wayland: plug memory leak in drm_handle_device()</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>v3d: Fix a leak of the transfer helper on screen destroy.</li>
<li>vc4: Fix a leak of the transfer helper on screen destroy.</li>
<li>v3d: Fix a leak of the disassembled instruction string during debug dumps.</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>anv: correctly use vulkan 1.0 by default</li>
<li>wsi/display: fix mem leak when freeing swapchains</li>
<li>vulkan/wsi: fix s/,/;/ typo</li>
</ul>
<p>Gurchetan Singh (3):</p>
<ul>
<li>virgl: quadruple command buffer size</li>
<li>virgl: avoid large inline transfers</li>
<li>virgl: don't mark buffers as unclean after a write</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.6</li>
<li>cherry-ignore: freedreno: Fix autotools build.</li>
<li>cherry-ignore: mesa: Revert INTEL_fragment_shader_ordering support</li>
<li>Update version to 18.2.7</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nv50,nvc0: Fix gallium nine regression regarding sampler bindings</li>
</ul>
<p>Lionel Landwerlin (2):</p>
<ul>
<li>anv: flush pipeline before query result copies</li>
<li>anv/query: flush render target before copying results</li>
</ul>
<p>Michal Srb (2):</p>
<ul>
<li>gallium: Constify drisw_loader_funcs struct</li>
<li>drisw: Use separate drisw_loader_funcs for shm</li>
</ul>
<p>Nicolai Hähnle (2):</p>
<ul>
<li>egl/wayland: rather obvious build fix</li>
<li>meson: link LLVM 'native' component when LLVM is available</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: rework the TC-compat HTILE hardware bug with COND_EXEC</li>
</ul>
<p>Thomas Hellstrom (2):</p>
<ul>
<li>st/xa: Fix a memory leak</li>
<li>winsys/svga: Fix a memory leak</li>
</ul>
<p>Tobias Klausmann (1):</p>
<ul>
<li>amd/vulkan: meson build - use radv_deps for libvulkan_radeon</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>st/xvmc: Add X11 include path.</li>
</ul>
</div>
</body>
</html>

183
docs/relnotes/18.2.8.html Normal file
View File

@@ -0,0 +1,183 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.8 Release Notes / December 27, 2018</h1>
<p>
Mesa 18.2.8 is a bug fix release which fixes bugs found since the 18.2.7 release.
</p>
<p>
Mesa 18.2.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
77512edc0a84e19c7131a0e2e5ebf1beaf1494dc4b71508fcc92d06d65f9f4f5 mesa-18.2.8.tar.gz
1d2ed9fd435d86d95b7215b287258d3e6b1180293a36f688e5a2efc18298d863 mesa-18.2.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108114">Bug 108114</a> - [vulkancts] new VK_KHR_16bit_storage tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108116">Bug 108116</a> - [vulkancts] stencil partial clear tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108910">Bug 108910</a> - Vkd3d test failure test_multisample_array_texture()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108911">Bug 108911</a> - Vkd3d test failure test_clear_render_target_view()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109081">Bug 109081</a> - [bisected] [HSW] Regression in clipping.user_defined.clip_* vulkancts tests</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (3):</p>
<ul>
<li>pci_ids: add new vega10 pci ids</li>
<li>pci_ids: add new vega20 pci id</li>
<li>pci_ids: add new VegaM pci id</li>
</ul>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Fix volumetexture dtor on ctor failure</li>
<li>st/nine: Bind src not dst in nine_context_box_upload</li>
<li>st/nine: Add src reference to nine_context_range_upload</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>nir: properly clear the entry sources in copy_prop_vars</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>meson: Fix ppc64 little endian detection</li>
</ul>
<p>Emil Velikov (9):</p>
<ul>
<li>glx: mandate xf86vidmode only for "drm" dri platforms</li>
<li>bin/get-pick-list.sh: rework handing of sha nominations</li>
<li>bin/get-pick-list.sh: warn when commit lists invalid sha</li>
<li>meson: don't require glx/egl/gbm with gallium drivers</li>
<li>pipe-loader: meson: reference correct library</li>
<li>TODO: glx: meson: build dri based glx tests, only with -Dglx=dri</li>
<li>glx: meson: drop includes from a link-only library</li>
<li>glx: meson: wire up the dispatch-index-check test</li>
<li>glx/test: meson: assorted include fixes</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>v3d: Make sure that a thrsw doesn't split a multop from its umul24.</li>
<li>v3d: Add missing flagging of SYNCB as a TSY op.</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>virgl: wrap vertex element state in a struct</li>
<li>virgl: work around bad assumptions in virglrenderer</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>i965/vec4/dce: Don't narrow the write mask if the flags are used</li>
<li>Revert "nir/lower_indirect: Bail early if modes == 0"</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>clover: Fix build after clang r348827</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>nir/constant_folding: Fix source bit size logic</li>
</ul>
<p>Jon Turney (1):</p>
<ul>
<li>glx: Fix compilation with GLX_USE_WINDOWSGL</li>
</ul>
<p>Juan A. Suarez Romero (7):</p>
<ul>
<li>docs: add sha256 checksums for 18.2.7</li>
<li>cherry-ignore: add explicit 18.3 only nominations</li>
<li>cherry-ignore: meson: libfreedreno depends upon libdrm (for fence support)</li>
<li>cherry-ignore: radv: Fix multiview depth clears</li>
<li>cherry-ignore: nir: properly find the entry to keep in copy_prop_vars</li>
<li>cherry-ignore: intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs</li>
<li>Update version to 18.2.8</li>
</ul>
<p>Kirill Burtsev (1):</p>
<ul>
<li>loader: free error state, when checking the drawable type</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: don't do partial resolve on layer &gt; 0</li>
</ul>
<p>Rhys Perry (2):</p>
<ul>
<li>radv: don't set surf_index for stencil-only images</li>
<li>ac: split 16-bit ssbo loads that may not be dword aligned</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>mesa/st/nir: fix missing nir_compact_varyings</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: switch on EOP when primitive restart is enabled with triangle strips</li>
</ul>
<p>Vinson Lee (2):</p>
<ul>
<li>meson: Fix typo.</li>
<li>meson: Fix libsensors detection.</li>
</ul>
</div>
</body>
</html>

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.0 Release Notes / TBD</h1>
<h1>Mesa 18.3.0 Release Notes / December 7, 2018</h1>
<p>
Mesa 18.3.0 is a new development release. People who are concerned
@@ -40,7 +40,8 @@ an up-to-date version of Wayland to keep the functionality.
<h2>SHA256 checksums</h2>
<pre>
TBD.
17a124d4dbc712505d22a7815c9b0cee22214c96c8abb91539a2b1351e38a000 mesa-18.3.0.tar.gz
b63f947e735d6ef3dfaa30c789a9adfbae18aea671191eaacde95a18c17fc38a mesa-18.3.0.tar.xz
</pre>
@@ -70,8 +71,206 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
<ul>
<li>TBD</li>
</ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728">Bug 13728</a> - [G965] Some objects in Neverwinter Nights Linux version not displayed correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91433">Bug 91433</a> - piglit.spec.arb_depth_buffer_float.fbo-depth-gl_depth_component32f-copypixels fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93355">Bug 93355</a> - [BXT,SKLGT4e] intermittent ext_framebuffer_multisample.accuracy fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94957">Bug 94957</a> - dEQP failures on llvmpipe</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98699">Bug 98699</a> - &quot;float[a+++4 ? 1:1] f;&quot; crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99507">Bug 99507</a> - Corrupted frame contents with Vulkan version of DOTA2, Talos Principle and Sascha Willems' demos when they're run Vsynched in fullscreen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99730">Bug 99730</a> - Metro Redux game(s) needs override for midshader extension declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100200">Bug 100200</a> - Default Unreal Engine 4 frag shader fails to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101247">Bug 101247</a> - Mesa fails to link GLSL programs with unused output blocks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102597">Bug 102597</a> - [Regression] mpv, high rendering times (two to three times higher)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103241">Bug 103241</a> - Anv crashes when using 64-bit vertex inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104602">Bug 104602</a> - [apitrace] Graphical artifacts in Civilization VI on RX Vega</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104809">Bug 104809</a> - anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to not having depthBoundsTest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104926">Bug 104926</a> - swrast: Mesa 17.3.3 produces: HW cursor for format 875713089 not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105333">Bug 105333</a> - [gallium-nine] missing geometry after commit ac: replace ac_build_kill with ac_build_kill_if_false</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105371">Bug 105371</a> - r600_shader_from_tgsi - GPR limit exceeded - shader requires 360 registers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105731">Bug 105731</a> - linker error &quot;fragment shader input ... has no matching output in the previous stage&quot; when previous stage's output declaration in a separate shader object</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105904">Bug 105904</a> - Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105975">Bug 105975</a> - i965 always reports 0 viewport subpixel bits</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106231">Bug 106231</a> - llvmpipe blends produce bad code after llvm patch https://reviews.llvm.org/D44785</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106283">Bug 106283</a> - Shader replacements works only for limited use cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106577">Bug 106577</a> - broken rendering with nine and nouveau (GM107)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106833">Bug 106833</a> - glLinkProgram is expected to fail when vertex attribute aliasing happens on ES3.0 context or later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106865">Bug 106865</a> - [GLK] piglit.spec.ext_framebuffer_multisample.accuracy stencil tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106980">Bug 106980</a> - Basemark GPU vulkan benchmark hangs on GFX9</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106997">Bug 106997</a> - [Regression]. Dying light game is crashing on latest mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107088">Bug 107088</a> - [GEN8+] Hang when discarding a fragment if dual source blending is enabled but shader doesn't support it</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107098">Bug 107098</a> - Segfault after munmap(kms_sw_dt-&gt;ro_mapped)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212">Bug 107212</a> - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107223">Bug 107223</a> - [GEN9+] 50% perf drop in SynMark Fill* tests (E2E RBC gets disabled?)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107276">Bug 107276</a> - radv: OpBitfieldUExtract returns incorrect result when count is zero</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107280">Bug 107280</a> - [DXVK] Batman: Arkham City with tessellation enabled hangs on SKL GT4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107313">Bug 107313</a> - Meson instructions on web site are non-optimal</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107359">Bug 107359</a> - [Regression] [bisected] [OpenGL CTS] [SKL,BDW] KHR-GL46.texture_barrier*-texels, GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners, and GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners fail with some configuration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107460">Bug 107460</a> - radv: OpControlBarrier does not always work correctly (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107477">Bug 107477</a> - [DXVK] Setting high shader quality in GTA V results in LLVM error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107483">Bug 107483</a> - DispatchSanity_test.GL31_CORE regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107487">Bug 107487</a> - [intel] [tools] intel gpu tools don't honor -D tools=[]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107488">Bug 107488</a> - gl.h:2090: error: redefinition of typedef GLeglImageOES</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107510">Bug 107510</a> - [GEN8+] up to 10% perf drop on several 3D benchmarks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107511">Bug 107511</a> - KHR/khrplatform.h not always installed when needed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107524">Bug 107524</a> - Broken packDouble2x32 at llvmpipe</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107544">Bug 107544</a> - intel/decoder: out of bounds group_iter</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107547">Bug 107547</a> - shader crashing glsl_compiler (uniform block assigned to vec2, then component substraced by 1)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107550">Bug 107550</a> - &quot;0[2]&quot; as function parameter hits assert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107563">Bug 107563</a> - [RADV] Broken rendering in Unity demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107565">Bug 107565</a> - TypeError: __init__() got an unexpected keyword argument 'future_imports'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107579">Bug 107579</a> - [SNB] The graphic corruption when we reuse the GS compiled and used for TFB when statebuffer contain magic trash in the unused space</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107601">Bug 107601</a> - Rise of the Tomb Raider Segmentation Fault when the game starts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107610">Bug 107610</a> - Dolphin emulator mis-renders shadow overlay in Super Mario Sunshine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107658">Bug 107658</a> - [Regression] [bisected] [OpenGLES CTS] KHR-GLES3.packed_pixels.*rectangle.r*8_snorm</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107734">Bug 107734</a> - [GLSL] glsl-fface-invariant, glsl-fcoord-invariant and glsl-pcoord-invariant should fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107745">Bug 107745</a> - [bisected] [bdw bsw] piglit.­spec.­arb_fragment_shader_interlock.­arb_fragment_shader_interlock-image-load-store failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107760">Bug 107760</a> - GPU Hang when Playing DiRT 3 Complete Edition using Steam Play with DXVK</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107765">Bug 107765</a> - [regression] Batman Arkham City crashes with DXVK under wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107772">Bug 107772</a> - Mesa preprocessor matches if(def)s &amp; endifs incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107779">Bug 107779</a> - Access violation with some games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107786">Bug 107786</a> - [DXVK] MSAA reflections are broken in GTA V</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107806">Bug 107806</a> - glsl_get_natural_size_align_bytes() ABORT with GfxBench Vulkan AztecRuins</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107810">Bug 107810</a> - The 'va_end' call is missed after 'va_copy' in 'util_vsnprintf' function under windows</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107832">Bug 107832</a> - Gallium picking A16L16 formats when emulating INTENSITY16 conflicts with mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107843">Bug 107843</a> - 32bit Mesa build failes with meson.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107856">Bug 107856</a> - i965 incorrectly calculates the number of layers for texture views (assert)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107857">Bug 107857</a> - GPU hang - GS_EMIT without shader outputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107865">Bug 107865</a> - swr fail to build with llvm-libs 6.0.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107869">Bug 107869</a> - u_thread.h:87:4: error: use of undeclared identifier 'cpu_set_t'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107870">Bug 107870</a> - Undefined symbols for architecture x86_64: &quot;_util_cpu_caps&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107879">Bug 107879</a> - crash happens when link program</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107891">Bug 107891</a> - [wine, regression, bisected] RAGE, Wolfenstein The New Order hangs in menu</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107923">Bug 107923</a> - build_id.c:126: multiple definition of `build_id_length'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107926">Bug 107926</a> - [anv] Rise of the Tomb Raider always misrendering, segfault and gpu hang.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107941">Bug 107941</a> - GPU hang and system crash with Dota 2 using Vulkan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107971">Bug 107971</a> - SPV_GOOGLE_hlsl_functionality1 / SPV_GOOGLE_decorate_string</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108012">Bug 108012</a> - Compiler crashes on access of non-existent member incremental operations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108024">Bug 108024</a> - [Debian Stretch]Fail to build because &quot;xcb_randr_lease_t&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108082">Bug 108082</a> - warning: unknown warning option '-Wno-format-truncation' [-Wunknown-warning-option]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108109">Bug 108109</a> - [GLSL] no-overloads.vert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108112">Bug 108112</a> - [vulkancts] some of the coherent memory tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108113">Bug 108113</a> - [vulkancts] r32g32b32 transfer operations not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108115">Bug 108115</a> - [vulkancts] dEQP-VK.subgroups.vote.graphics.subgroupallequal.* fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108164">Bug 108164</a> - [radv] VM faults since 5d6a560a2986c9ab421b3c7904d29bb7bc35e36f</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108245">Bug 108245</a> - RADV/Vega: Low mip levels of large BCn textures get corrupted by vkCmdCopyBufferToImage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108272">Bug 108272</a> - [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108311">Bug 108311</a> - Query buffer object support is broken on r600.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108319">Bug 108319</a> - [GLK BXT BSW] Assertion in piglit.spec.arb_gpu_shader_fp64.execution.built-in-functions.vs-sign-sat-neg-abs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108491">Bug 108491</a> - Commit baa38c14 causes output issues on my VEGA with RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108524">Bug 108524</a> - [RADV] GPU lockup on event synchronization</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108530">Bug 108530</a> - (mesa-18.3) [Tracker] Mesa 18.3 Release Tracker</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108532">Bug 108532</a> - make check nir_copy_prop_vars_test.store_store_load_different_components regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108560">Bug 108560</a> - Mesa 32 is built without sse</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108595">Bug 108595</a> - ir3_compiler valgrind build error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108617">Bug 108617</a> - [deqp] Mesa fails conformance for egl_ext_device</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108630">Bug 108630</a> - [G965] piglit.spec.!opengl 1_2.tex3d-maxsize spins forever</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108635">Bug 108635</a> - Mesa master commit 68dc591af16ebb36814e4c187e4998948103c99c causes XWayland to segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108713">Bug 108713</a> - Gallium: use after free with transform feedback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108829">Bug 108829</a> - [meson] libglapi exports internal API</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108894">Bug 108894</a> - [anv] vkCmdCopyBuffer() and vkCmdCopyQueryPoolResults() write-after-write hazard</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108909">Bug 108909</a> - Vkd3d test failure test_resolve_non_issued_query_data()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108914">Bug 108914</a> - blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this</li>
<h2>Changes</h2>

63
docs/relnotes/18.3.1.html Normal file
View File

@@ -0,0 +1,63 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.1 Release Notes / December 11, 2018</h1>
<p>
Mesa 18.3.1 is a bug fix release which fixes bugs found since the 18.3.0 release.
</p>
<p>
Mesa 18.3.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
256d0c3d88e380c1b8e3fc5c6ac34001e3b7c30458b8b852407ec68b8ccd9fda mesa-18.3.1.tar.gz
5b1f827d28684a25f6657289f8b7d47ac56395988c7ac23e0ec9a62b644bdc63 mesa-18.3.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>None</p>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 18.3.0</li>
<li>Update version to 18.3.1</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv,radv: Disable VK_EXT_pci_bus_info</li>
</ul>
</div>
</body>
</html>

265
docs/relnotes/18.3.2.html Normal file
View File

@@ -0,0 +1,265 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.2 Release Notes / January 17, 2019</h1>
<p>
Mesa 18.3.2 is a bug fix release which fixes bugs found since the 18.3.1 release.
</p>
<p>
Mesa 18.3.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
1cde4fafd40cd1ad4ee3a13b364b7a0175a08b7afdd127fb46f918c1e1dfd4b0 mesa-18.3.2.tar.gz
f7ce7181c07b6d8e0132da879af1729523a6c8aa87f79a9d59dfd064024cfb35 mesa-18.3.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106595">Bug 106595</a> - [RADV] Rendering distortions only when MSAA is enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107728">Bug 107728</a> - Wrong background in Sascha Willem's Multisampling Demo</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108114">Bug 108114</a> - [vulkancts] new VK_KHR_16bit_storage tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108116">Bug 108116</a> - [vulkancts] stencil partial clear tests fail.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108624">Bug 108624</a> - [regression][bisected] &quot;nir: Copy propagation between blocks&quot; regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108910">Bug 108910</a> - Vkd3d test failure test_multisample_array_texture()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108911">Bug 108911</a> - Vkd3d test failure test_clear_render_target_view()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108943">Bug 108943</a> - Build fails on ppc64le with meson</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109072">Bug 109072</a> - GPU hang in blender 2.80</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109081">Bug 109081</a> - [bisected] [HSW] Regression in clipping.user_defined.clip_* vulkancts tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109151">Bug 109151</a> - [KBL-G][vulkan] dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat failed verification.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109202">Bug 109202</a> - nv50_ir.cpp:749:19: error: cannot use typeid with -fno-rtti</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109204">Bug 109204</a> - [regression, bisected] retroarch's crt-royale shader crash radv</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (3):</p>
<ul>
<li>pci_ids: add new vega10 pci ids</li>
<li>pci_ids: add new vega20 pci id</li>
<li>pci_ids: add new VegaM pci id</li>
</ul>
<p>Alexander von Gluck IV (1):</p>
<ul>
<li>egl/haiku: Fix reference to disp vs dpy</li>
</ul>
<p>Andres Gomez (2):</p>
<ul>
<li>glsl: correct typo in GLSL compilation error message</li>
<li>glsl/linker: specify proper direction in location aliasing error</li>
</ul>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Fix volumetexture dtor on ctor failure</li>
<li>st/nine: Bind src not dst in nine_context_box_upload</li>
<li>st/nine: Add src reference to nine_context_range_upload</li>
</ul>
<p>Bas Nieuwenhuizen (5):</p>
<ul>
<li>radv: Do a cache flush if needed before reading predicates.</li>
<li>radv: Implement buffer stores with less than 4 components.</li>
<li>anv/android: Do not reject storage images.</li>
<li>radv: Fix rasterization precision bits.</li>
<li>spirv: Fix matrix parameters in function calls.</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (3):</p>
<ul>
<li>nir: properly clear the entry sources in copy_prop_vars</li>
<li>nir: properly find the entry to keep in copy_prop_vars</li>
<li>nir: remove dead code from copy_prop_vars</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>radv/xfb: fix counter buffer bounds checks.</li>
<li>virgl/vtest: fix front buffer flush with protocol version 0.</li>
</ul>
<p>Dylan Baker (6):</p>
<ul>
<li>meson: Fix ppc64 little endian detection</li>
<li>meson: Add support for gnu hurd</li>
<li>meson: Add toggle for glx-direct</li>
<li>meson: Override C++ standard to gnu++11 when building with altivec on ppc64</li>
<li>meson: Error out if building nouveau and using LLVM without rtti</li>
<li>autotools: Remove tegra vdpau driver</li>
</ul>
<p>Emil Velikov (12):</p>
<ul>
<li>docs: add sha256 checksums for 18.3.1</li>
<li>bin/get-pick-list.sh: rework handing of sha nominations</li>
<li>bin/get-pick-list.sh: warn when commit lists invalid sha</li>
<li>cherry-ignore: meson: libfreedreno depends upon libdrm (for fence support)</li>
<li>glx: mandate xf86vidmode only for "drm" dri platforms</li>
<li>meson: don't require glx/egl/gbm with gallium drivers</li>
<li>pipe-loader: meson: reference correct library</li>
<li>TODO: glx: meson: build dri based glx tests, only with -Dglx=dri</li>
<li>glx: meson: drop includes from a link-only library</li>
<li>glx: meson: wire up the dispatch-index-check test</li>
<li>glx/test: meson: assorted include fixes</li>
<li>Update version to 18.3.2</li>
</ul>
<p>Eric Anholt (6):</p>
<ul>
<li>v3d: Fix a leak of the transfer helper on screen destroy.</li>
<li>vc4: Fix a leak of the transfer helper on screen destroy.</li>
<li>v3d: Fix a leak of the disassembled instruction string during debug dumps.</li>
<li>v3d: Make sure that a thrsw doesn't split a multop from its umul24.</li>
<li>v3d: Add missing flagging of SYNCB as a TSY op.</li>
<li>gallium/ttn: Fix setup of outputs_written.</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>virgl: wrap vertex element state in a struct</li>
<li>virgl: work around bad assumptions in virglrenderer</li>
</ul>
<p>Francisco Jerez (5):</p>
<ul>
<li>intel/fs: Handle source modifiers in lower_integer_multiplication().</li>
<li>intel/fs: Implement quad swizzles on ICL+.</li>
<li>intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.</li>
<li>intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.</li>
<li>intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass.</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>i965/vec4/dce: Don't narrow the write mask if the flags are used</li>
<li>Revert "nir/lower_indirect: Bail early if modes == 0"</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>clover: Fix build after clang r348827</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>nir/constant_folding: Fix source bit size logic</li>
<li>intel/blorp: Be more conservative about copying clear colors</li>
<li>spirv: Handle any bit size in vector_insert/extract</li>
<li>anv/apply_pipeline_layout: Set the cursor in lower_res_reindex_intrinsic</li>
<li>spirv: Sign-extend array indices</li>
<li>intel/peephole_ffma: Fix swizzle propagation</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nv50/ir: fix use-after-free in ConstantFolding::visit</li>
</ul>
<p>Kirill Burtsev (1):</p>
<ul>
<li>loader: free error state, when checking the drawable type</li>
</ul>
<p>Lionel Landwerlin (5):</p>
<ul>
<li>anv: don't do partial resolve on layer &gt; 0</li>
<li>i965: include draw_params/derived_draw_params for VF cache workaround</li>
<li>i965: add CS stall on VF invalidation workaround</li>
<li>anv: explictly specify format for blorp ccs/mcs op</li>
<li>anv: flush fast clear colors into compressed surfaces</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/mesa: don't leak pipe_surface if pipe_context is not current</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>radeonsi: Fix use of 1- or 2- component GL_DOUBLE vbo's.</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>meson: link LLVM 'native' component when LLVM is available</li>
</ul>
<p>Rhys Perry (3):</p>
<ul>
<li>radv: don't set surf_index for stencil-only images</li>
<li>ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics</li>
<li>ac: split 16-bit ssbo loads that may not be dword aligned</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/drm: fix memory leak</li>
<li>mesa/st/nir: fix missing nir_compact_varyings</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: switch on EOP when primitive restart is enabled with triangle strips</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>tgsi/scan: fix loop exit point in tgsi_scan_tess_ctrl()</li>
<li>tgsi/scan: correctly walk instructions in tgsi_scan_tess_ctrl()</li>
</ul>
<p>Vinson Lee (2):</p>
<ul>
<li>meson: Fix typo.</li>
<li>meson: Fix libsensors detection.</li>
</ul>
</div>
</body>
</html>

208
docs/relnotes/18.3.3.html Normal file
View File

@@ -0,0 +1,208 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.3 Release Notes / January 31, 2019</h1>
<p>
Mesa 18.3.3 is a bug fix release which fixes bugs found since the 18.3.2 release.
</p>
<p>
Mesa 18.3.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
6b9893942fe8011c7736d51448deb6ef80ece2257e0fac27b02e997a6605d5e4 mesa-18.3.3.tar.gz
2ab6886a6966c532ccbcc3b240925e681464b658244f0cbed752615af3936299 mesa-18.3.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108877">Bug 108877</a> - OpenGL CTS gl43 test cases were interrupted due to segment fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109023">Bug 109023</a> - error: inlining failed in call to always_inline __m512 _mm512_and_ps(__m512, __m512): target specific option mismatch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109129">Bug 109129</a> - format_types.h:1220: undefined reference to `_mm256_cvtps_ph'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109229">Bug 109229</a> - glLinkProgram locks up for ~30 seconds</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109242">Bug 109242</a> - [RADV] The Witcher 3 system freeze</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109488">Bug 109488</a> - Mesa 18.3.2 crash on a specific fragment shader (assert triggered) / already fixed on the master branch.</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>bin/get-pick-list.sh: fix the oneline printing</li>
<li>bin/get-pick-list.sh: fix redirection in sh</li>
</ul>
<p>Axel Davy (1):</p>
<ul>
<li>st/nine: Immediately upload user provided textures</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Only use 32 KiB per threadgroup on Stoney.</li>
<li>radv: Set partial_vs_wave for pipelines with just GS, not tess.</li>
<li>nir: Account for atomics in copy propagation.</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>gallium/swr: Fix multi-context sync fence deadlock.</li>
</ul>
<p>Carsten Haitzler (Rasterman) (2):</p>
<ul>
<li>vc4: Use named parameters for the NEON inline asm.</li>
<li>vc4: Declare the cpu pointers as being modified in NEON asm.</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>glsl: Fix copying function's out to temp if dereferenced by array</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>dri_interface: add put shm image2 (v2)</li>
<li>glx: add support for putimageshm2 path (v2)</li>
<li>gallium: use put image shm2 path (v2)</li>
</ul>
<p>Dylan Baker (4):</p>
<ul>
<li>meson: allow building dri driver without window system if osmesa is classic</li>
<li>meson: fix swr KNL build</li>
<li>meson: Fix compiler checks for SWR with ICC</li>
<li>meson: Add warnings and errors when using ICC</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.3.2</li>
<li>cherry-ignore: radv: Fix multiview depth clears</li>
<li>cherry-ignore: spirv: Handle arbitrary bit sizes for deref array indices</li>
<li>cherry-ignore: WARNING: Commit XXX lists invalid sha</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>vc4: Don't leak the GPU fd for renderonly usage.</li>
<li>vc4: Enable NEON asm on meson cross-builds.</li>
</ul>
<p>Eric Engestrom (2):</p>
<ul>
<li>configure: EGL requirements only apply if EGL is built</li>
<li>meson/vdpau: add missing soversion</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>anv/device: fix maximum number of images supported</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>anv/nir: Rework arguments to apply_pipeline_layout</li>
<li>anv: Only parse pImmutableSamplers if the descriptor has samplers</li>
<li>nir/xfb: Fix offset accounting for dvec3/4</li>
</ul>
<p>Karol Herbst (2):</p>
<ul>
<li>nv50/ir: disable tryCollapseChainedMULs in ConstantFolding for precise instructions</li>
<li>glsl/lower_output_reads: set invariant and precise flags on temporaries</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix invalid binding table index computation</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>radeonsi: also apply the GS hang workaround to draws without tessellation</li>
<li>radeonsi: fix a u_blitter crash after a shader with FBFETCH</li>
<li>radeonsi: fix rendering to tiny viewports where the viewport center is &gt; 8K</li>
<li>st/mesa: purge framebuffers when unbinding a context</li>
</ul>
<p>Niklas Haas (1):</p>
<ul>
<li>radv: correctly use vulkan 1.0 by default</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>meson: Fix with_gallium_icd to with_opencl_icd</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>loader: fix the no-modifiers case</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: clean up setting partial_es_wave for distributed tess on VI</li>
</ul>
<p>Timothy Arceri (5):</p>
<ul>
<li>ac/nir_to_llvm: fix interpolateAt* for arrays</li>
<li>ac/nir_to_llvm: fix clamp shadow reference for more hardware</li>
<li>radv/ac: fix some fp16 handling</li>
<li>glsl: use remap location when serialising uniform program resource data</li>
<li>glsl: Copy function out to temp if we don't directly ref a variable</li>
</ul>
<p>Tomeu Vizoso (1):</p>
<ul>
<li>etnaviv: Consolidate buffer references from framebuffers</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>meson: Fix typo.</li>
</ul>
</div>
</body>
</html>

180
docs/relnotes/18.3.4.html Normal file
View File

@@ -0,0 +1,180 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.3.4 Release Notes / February 18, 2019</h1>
<p>
Mesa 18.3.4 is a bug fix release which fixes bugs found since the 18.3.3 release.
</p>
<p>
Mesa 18.3.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
e22e6fe4c3aca80fe872a0a7285b6c5523e0cfc0bfb57ffcc3b3d66d292593e4 mesa-18.3.4.tar.gz
32314da4365d37f80d84f599bd9625b00161c273c39600ba63b45002d500bb07 mesa-18.3.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109107">Bug 109107</a> - gallium/st/va: change va max_profiles when using Radeon VCN Hardware</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109401">Bug 109401</a> - [DXVK] Project Cars rendering problems</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109543">Bug 109543</a> - After upgrade mesa to 19.0.0~rc1 all vulkan based application stop working [&quot;vulkan-cube&quot; received SIGSEGV in radv_pipeline_init_blend_state at ../src/amd/vulkan/radv_pipeline.c:699]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109603">Bug 109603</a> - nir_instr_as_deref: Assertion `parent &amp;&amp; parent-&gt;type == nir_instr_type_deref' failed.</li>
</ul>
<h2>Changes</h2>
<p>Bart Oldeman (1):</p>
<ul>
<li>gallium-xlib: query MIT-SHM before using it.</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>radv: Only look at pImmutableSamples if the descriptor has a sampler.</li>
<li>amd/common: Use correct writemask for shared memory stores.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>get-pick-list: Add --pretty=medium to the arguments for Cc patches</li>
<li>meson: Add dependency on genxml to anvil</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 18.3.3</li>
<li>cherry-ignore: nv50,nvc0: add explicit settings for recent caps</li>
<li>cherry-ignore: add more 19.0 only nominations from Ilia</li>
<li>cherry-ignore: radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8</li>
<li>Update version to 18.3.4</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Fix copy-and-paste fail in backport of NEON asm fixes.</li>
</ul>
<p>Eric Engestrom (2):</p>
<ul>
<li>xvmc: fix string comparison</li>
<li>xvmc: fix string comparison</li>
</ul>
<p>Ernestas Kulik (2):</p>
<ul>
<li>vc4: Fix leak in HW queries error path</li>
<li>v3d: Fix leak in resource setup error path</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nvc0: we have 16k-sized framebuffers, fix default scissors</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf()</li>
<li>intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode</li>
<li>nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>anv/cmd_buffer: check for NULL framebuffer</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048</li>
</ul>
<p>Kristian H. Kristensen (1):</p>
<ul>
<li>freedreno/a6xx: Emit blitter dst with OUT_RELOCW</li>
</ul>
<p>Leo Liu (2):</p>
<ul>
<li>st/va: fix the incorrect max profiles report</li>
<li>st/va/vp9: set max reference as default of VP9 reference number</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>meson: drop the xcb-xrandr version requirement</li>
<li>gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets &gt; 0</li>
<li>radeonsi: fix EXPLICIT_FLUSH for flush offsets &gt; 0</li>
<li>winsys/amdgpu: don't drop manually added fence dependencies</li>
</ul>
<p>Mario Kleiner (2):</p>
<ul>
<li>egl/wayland: Allow client-&gt;server format conversion for PRIME offload. (v2)</li>
<li>egl/wayland-drm: Only announce formats via wl_drm which the driver supports.</li>
</ul>
<p>Oscar Blumberg (1):</p>
<ul>
<li>radeonsi: Fix guardband computation for large render targets</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: stop frob'ing pipe_resource::nr_samples</li>
</ul>
<p>Rodrigo Vivi (1):</p>
<ul>
<li>intel: Add more PCI Device IDs for Coffee Lake and Ice Lake.</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: fix compiler issues with GCC 9</li>
<li>radv: always export gl_SampleMask when the fragment shader uses it</li>
</ul>
</div>
</body>
</html>

74
docs/relnotes/19.0.0.html Normal file
View File

@@ -0,0 +1,74 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.0.0 Release Notes / TBD</h1>
<p>
Mesa 19.0.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 19.0.1.
</p>
<p>
Mesa 19.0.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>GL_AMD_texture_texture4 on all GL 4.0 drivers.</li>
<li>GL_EXT_shader_implicit_conversions on all drivers (ES extension).</li>
<li>GL_EXT_texture_compression_bptc on all GL 4.0 drivers (ES extension).</li>
<li>GL_EXT_texture_compression_rgtc on all GL 3.0 drivers (ES extension).</li>
<li>GL_EXT_render_snorm on gallium drivers (ES extension).</li>
<li>GL_EXT_texture_view on drivers supporting texture views (ES extension).</li>
<li>GL_OES_texture_view on drivers supporting texture views (ES extension).</li>
<li>GL_NV_shader_atomic_float on nvc0 (Fermi/Kepler only).</li>
<li>Shader-based software implementations of GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, GL_ARB_vertex_attrib_64bit, and GL_ARB_shader_ballot on i965.</li>
<li>VK_ANDROID_external_memory_android_hardware_buffer on Intel</li>
<li>Fixed and re-exposed VK_EXT_pci_bus_info on Intel and RADV</li>
<li>VK_EXT_scalar_block_layout on Intel and RADV</li>
<li>VK_KHR_depth_stencil_resolve on Intel</li>
<li>VK_KHR_draw_indirect_count on Intel</li>
<li>VK_EXT_conditional_rendering on Intel</li>
<li>VK_EXT_memory_budget on RADV</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>TBD</li>
</ul>
<h2>Changes</h2>
<ul>
<li>TBD</li>
</ul>
</div>
</body>
</html>

60
docs/relnotes/19.1.0.html Normal file
View File

@@ -0,0 +1,60 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.0 Release Notes / TBD</h1>
<p>
Mesa 19.1.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 19.1.1.
</p>
<p>
Mesa 19.1.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>GL_EXT_texture_compression_s3tc_srgb on Gallium drivers and i965 (ES extension).</li>
<li>VK_EXT_buffer_device_address on Intel and RADV.</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>TBD</li>
</ul>
<h2>Changes</h2>
<ul>
<li>TBD</li>
</ul>
</div>
</body>
</html>

View File

@@ -0,0 +1,95 @@
Name
MESA_query_driver
Name Strings
EGL_MESA_query_driver
Contact
Rob Clark <robdclark 'at' gmail.com>
Nicolai Hähnle <Nicolai.Haehnle 'at' amd.com>
Contibutors
Veluri Mithun <velurimithun38 'at' gmail.com>
Status
Complete
Version
Version 3, 2019-01-24
Number
EGL Extension 131
Dependencies
EGL 1.0 is required.
Overview
When an application has to query the name of a driver and for
obtaining driver's option list (UTF-8 encoded XML) of a driver
the below functions are useful.
XML file formally describes all available options and also
includes verbal descriptions in multiple languages. Its main purpose
is to be automatically processed by configuration GUIs.
The XML shall respect the following DTD:
<!ELEMENT driinfo (section*)>
<!ELEMENT section (description+, option+)>
<!ELEMENT description (enum*)>
<!ATTLIST description lang CDATA #REQUIRED
text CDATA #REQUIRED>
<!ELEMENT option (description+)>
<!ATTLIST option name CDATA #REQUIRED
type (bool|enum|int|float) #REQUIRED
default CDATA #REQUIRED
valid CDATA #IMPLIED>
<!ELEMENT enum EMPTY>
<!ATTLIST enum value CDATA #REQUIRED
text CDATA #REQUIRED>
New Procedures and Functions
char* eglGetDisplayDriverConfig(EGLDisplay dpy);
const char* eglGetDisplayDriverName(EGLDisplay dpy);
Description
By passing EGLDisplay as parameter to `eglGetDisplayDriverName` one can retrieve
driverName. Similarly passing EGLDisplay to `eglGetDisplayDriverConfig` we can retrieve
driverConfig options of the driver in XML format.
The string returned by `eglGetDisplayDriverConfig` is heap-allocated and caller
is responsible for freeing it.
EGL_BAD_DISPLAY is generated if `disp` is not an EGL display connection.
EGL_NOT_INITIALIZED is generated if `disp` has not been initialized.
If the implementation does not have enough resources to allocate the XML then an
EGL_BAD_ALLOC error is generated.
New Tokens
No new tokens
Issues
None
Revision History
Version 1, 2018-11-05 - First draft (Veluri Mithun)
Version 2, 2019-01-23 - Final version (Veluri Mithun)
Version 3, 2019-01-24 - Mark as complete, add Khronos extension
number, fix parameter name in prototypes,
write revision history (Eric Engestrom)

View File

@@ -20,11 +20,11 @@ Status
Version
Version 8, 14-February-2014
Version 9, 09 November 2018
Number
TBD.
OpenGL Extension #446
Dependencies
@@ -32,9 +32,6 @@ Dependencies
GLX_ARB_create_context and GLX_ARB_create_context_profile are required.
This extension interacts with GLX_EXT_create_context_es2_profile and
GLX_EXT_create_context_es_profile.
Overview
In many situations, applications want to detect characteristics of a
@@ -95,18 +92,13 @@ New Tokens
GLX_RENDERER_VENDOR_ID_MESA
GLX_RENDERER_DEVICE_ID_MESA
Accepted as an attribute name in <*attrib_list> in
glXCreateContextAttribsARB:
GLX_RENDERER_ID_MESA 0x818E
Additions to the OpenGL / WGL Specifications
None. This specification is written for GLX.
Additions to the GLX 1.4 Specification
[Add the following to Section X.Y.Z of the GLX Specification]
[Add to Section 3.3.2 "GLX Versioning" of the GLX Specification]
To obtain information about the available renderers for a particular
display and screen,
@@ -206,29 +198,6 @@ Additions to the GLX 1.4 Specification
format as the string that would be returned by glGetString of GL_RENDERER.
It may, however, have a different value.
[Add to section section 3.3.7 "Rendering Contexts"]
The attribute name GLX_RENDERER_ID_MESA specified the index of the render
against which the context should be created. The default value of
GLX_RENDERER_ID_MESA is 0.
[Add to list of errors for glXCreateContextAttribsARB in section section
3.3.7 "Rendering Contexts"]
* If the value of GLX_RENDERER_ID_MESA specifies a non-existent
renderer, BadMatch is generated.
Dependencies on GLX_EXT_create_context_es_profile and
GLX_EXT_create_context_es2_profile
If neither extension is supported, remove all mention of
GLX_RENDERER_OPENGL_ES2_PROFILE_VERSION_MESA from the spec.
If GLX_EXT_create_context_es_profile is not supported, remove all mention of
GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA from the spec.
Issues
1) How should the difference between on-card and GART memory be exposed?
@@ -408,3 +377,9 @@ Revision History
read GLX_RENDERER_ID_MESA. The VENDOR/DEVICE_ID
example given in issue #17 should be 0x5143 and
0xFFFFFFFF respectively.
Version 9, 2018/11/09 - Remove GLX_RENDERER_ID_MESA, which has never been
implemented. Remove the unnecessary interactions
with the GLX GLES profile extensions. Note the
official GL extension number. Specify the section
of the GLX spec to modify.

View File

@@ -21,7 +21,7 @@
<li><a href="#guidelines">Basic guidelines</a>
<li><a href="#formatting">Patch formatting</a>
<li><a href="#testing">Testing Patches</a>
<li><a href="#mailing">Mailing Patches</a>
<li><a href="#submit">Submitting Patches</a>
<li><a href="#reviewing">Reviewing Patches</a>
<li><a href="#nominations">Nominating a commit for a stable branch</a>
<li><a href="#criteria">Criteria for accepting patches to the stable branch</a>
@@ -42,8 +42,10 @@ components.
<code>git bisect</code>.)
<li>Patches should be properly <a href="#formatting">formatted</a>.
<li>Patches should be sufficiently <a href="#testing">tested</a> before submitting.
<li>Patches should be submitted to <a href="#mailing">mesa-dev</a>
for <a href="#reviewing">review</a> using <code>git send-email</code>.
<li>Patches should be <a href="#submit">submitted</a>
to <a href="#mailing">mesa-dev</a> or with
a <a href="#merge-request">merge request</a>
for <a href="#reviewing">review</a>.
</ul>
@@ -156,18 +158,29 @@ As mentioned at the begining, patches should be bisectable.
A good way to test this is to make use of the `git rebase` command,
to run your tests on each commit. Assuming your branch is based off
<code>origin/master</code>, you can run:
</p>
<pre>
$ git rebase --interactive --exec "make check" origin/master
</pre>
<p>
replacing <code>"make check"</code> with whatever other test you want to
run.
</p>
<h2 id="mailing">Mailing Patches</h2>
<h2 id="submit">Submitting Patches</h2>
<p>
Patches should be sent to the mesa-dev mailing list for review:
Patches may be submitted to the Mesa project by
<a href="#mailing">email</a> or with a
GitLab <a href="#merge-request">merge request</a>. To prevent
duplicate code review, only use one method to submit your changes.
</p>
<h3 id="mailing">Mailing Patches</h3>
<p>
Patches may be sent to the mesa-dev mailing list for review:
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">
mesa-dev@lists.freedesktop.org</a>.
When submitting a patch make sure to use
@@ -201,8 +214,70 @@ disabled before sending your patches. (Note that you may need to contact
your email administrator for this.)
</p>
<h3 id="merge-request">GitLab Merge Requests</h3>
<p>
<a href="https://gitlab.freedesktop.org/mesa/mesa">GitLab</a> Merge
Requests (MR) can also be used to submit patches for Mesa.
</p>
<p>
If the MR may have interest for most of the Mesa community, you can
send an email to the mesa-dev email list including a link to the MR.
Don't send the patch to mesa-dev, just the MR link.
</p>
<p>
Add labels to your MR to help reviewers find it. For example:
<ul>
<li>Mesa changes affecting all drivers: mesa
<li>Hardware vendor specific code: amd, intel, nvidia, ...
<li>Driver specific code: anvil, freedreno, i965, iris, radeonsi,
radv, vc4, ...
<li>Other tag examples: gallium, util
</ul>
</p>
<p>
Tick the following when creating the MR. It allows developers to
rebase your work on top of master.
<pre>Allow commits from members who can merge to the target branch</pre>
</p>
<p>
If you revise your patches based on code review and push an update
to your branch, you should maintain a <strong>clean</strong> history
in your patches. There should not be "fixup" patches in the history.
The series should be buildable and functional after every commit
whenever you push the branch.
</p>
<p>
It is your responsibility to keep the MR alive and making progress,
as there are no guarantees that a Mesa dev will independently take
interest in it.
</p>
<p>
Some other notes:
<ul>
<li>Make changes and update your branch based on feedback
<li>Old, stale MR may be closed, but you can reopen it if you
still want to pursue the changes
<li>You should periodically check to see if your MR needs to be
rebased
<li>Make sure your MR is closed if your patches get pushed outside
of GitLab
<li>Please send MRs from a personal fork rather than from the main
Mesa repository, as it clutters it unnecessarily.
</ul>
</p>
<h2 id="reviewing">Reviewing Patches</h2>
<p>
To participate in code review, you should monitor the
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">
mesa-dev</a> email list and the GitLab
Mesa <a href="https://gitlab.freedesktop.org/mesa/mesa/merge_requests">Merge
Requests</a> page.
</p>
<p>
When you've reviewed a patch on the mailing list, please be unambiguous
about your review. That is, state either
@@ -229,6 +304,29 @@ which tells the patch author that the patch can be committed, as long
as the issues are resolved first.
</p>
<p>
These Reviewed-by, Acked-by, and Tested-by tags should also be amended
into commits in a MR before it is merged.
</p>
<p>
When providing a Reviewed-by, Acked-by, or Tested-by tag in a gitlab MR,
enclose the tag in backticks:
</p>
<pre>
`Reviewed-by: Joe Hacker &lt;jhacker@example.com&gt;`</pre>
<p>
This is the markdown format for literal, and will prevent gitlab from hiding
the &lt; and &gt; symbols.
</p>
<p>
Review by non-experts is encouraged. Understanding how someone else
goes about solving a problem is a great way to learn your way around
the project. The submitter is expected to evaluate whether they have
an appropriate amount of review feedback from people who also
understand the code before merging their patches.
</p>
<h2 id="nominations">Nominating a commit for a stable branch</h2>

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

9690
include/CL/cl2.hpp Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,5 +1,5 @@
/**********************************************************************************
* Copyright (c) 2008-2012 The Khronos Group Inc.
* Copyright (c) 2008-2015 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
@@ -12,6 +12,11 @@
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

View File

@@ -1,5 +1,5 @@
/**********************************************************************************
* Copyright (c) 2008-2012 The Khronos Group Inc.
* Copyright (c) 2008-2015 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
@@ -12,6 +12,11 @@
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

View File

@@ -1,5 +1,5 @@
/**********************************************************************************
* Copyright (c) 2008-2012 The Khronos Group Inc.
* Copyright (c) 2008-2015 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
@@ -12,6 +12,11 @@
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -33,7 +38,7 @@
extern "C" {
#endif
/******************************************************************************
/******************************************************************************/
/* cl_khr_dx9_media_sharing */
#define cl_khr_dx9_media_sharing 1

View File

@@ -0,0 +1,182 @@
/**********************************************************************************
* Copyright (c) 2008-2019 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
* "Materials"), to deal in the Materials without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Materials, and to
* permit persons to whom the Materials are furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
* CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
**********************************************************************************/
/*****************************************************************************\
Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.
THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE
MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
File Name: cl_dx9_media_sharing_intel.h
Abstract:
Notes:
\*****************************************************************************/
#ifndef __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H
#define __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H
#include <CL/cl.h>
#include <CL/cl_platform.h>
#include <d3d9.h>
#include <dxvahd.h>
#include <wtypes.h>
#include <d3d9types.h>
#ifdef __cplusplus
extern "C" {
#endif
/***************************************
* cl_intel_dx9_media_sharing extension *
****************************************/
#define cl_intel_dx9_media_sharing 1
typedef cl_uint cl_dx9_device_source_intel;
typedef cl_uint cl_dx9_device_set_intel;
/* error codes */
#define CL_INVALID_DX9_DEVICE_INTEL -1010
#define CL_INVALID_DX9_RESOURCE_INTEL -1011
#define CL_DX9_RESOURCE_ALREADY_ACQUIRED_INTEL -1012
#define CL_DX9_RESOURCE_NOT_ACQUIRED_INTEL -1013
/* cl_dx9_device_source_intel */
#define CL_D3D9_DEVICE_INTEL 0x4022
#define CL_D3D9EX_DEVICE_INTEL 0x4070
#define CL_DXVA_DEVICE_INTEL 0x4071
/* cl_dx9_device_set_intel */
#define CL_PREFERRED_DEVICES_FOR_DX9_INTEL 0x4024
#define CL_ALL_DEVICES_FOR_DX9_INTEL 0x4025
/* cl_context_info */
#define CL_CONTEXT_D3D9_DEVICE_INTEL 0x4026
#define CL_CONTEXT_D3D9EX_DEVICE_INTEL 0x4072
#define CL_CONTEXT_DXVA_DEVICE_INTEL 0x4073
/* cl_mem_info */
#define CL_MEM_DX9_RESOURCE_INTEL 0x4027
#define CL_MEM_DX9_SHARED_HANDLE_INTEL 0x4074
/* cl_image_info */
#define CL_IMAGE_DX9_PLANE_INTEL 0x4075
/* cl_command_type */
#define CL_COMMAND_ACQUIRE_DX9_OBJECTS_INTEL 0x402A
#define CL_COMMAND_RELEASE_DX9_OBJECTS_INTEL 0x402B
/******************************************************************************/
extern CL_API_ENTRY cl_int CL_API_CALL
clGetDeviceIDsFromDX9INTEL(
cl_platform_id platform,
cl_dx9_device_source_intel dx9_device_source,
void* dx9_object,
cl_dx9_device_set_intel dx9_device_set,
cl_uint num_entries,
cl_device_id* devices,
cl_uint* num_devices) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int (CL_API_CALL* clGetDeviceIDsFromDX9INTEL_fn)(
cl_platform_id platform,
cl_dx9_device_source_intel dx9_device_source,
void* dx9_object,
cl_dx9_device_set_intel dx9_device_set,
cl_uint num_entries,
cl_device_id* devices,
cl_uint* num_devices) CL_EXT_SUFFIX__VERSION_1_1;
extern CL_API_ENTRY cl_mem CL_API_CALL
clCreateFromDX9MediaSurfaceINTEL(
cl_context context,
cl_mem_flags flags,
IDirect3DSurface9* resource,
HANDLE sharedHandle,
UINT plane,
cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromDX9MediaSurfaceINTEL_fn)(
cl_context context,
cl_mem_flags flags,
IDirect3DSurface9* resource,
HANDLE sharedHandle,
UINT plane,
cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueAcquireDX9ObjectsINTEL(
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem* mem_objects,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireDX9ObjectsINTEL_fn)(
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem* mem_objects,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event) CL_EXT_SUFFIX__VERSION_1_1;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueReleaseDX9ObjectsINTEL(
cl_command_queue command_queue,
cl_uint num_objects,
cl_mem* mem_objects,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseDX9ObjectsINTEL_fn)(
cl_command_queue command_queue,
cl_uint num_objects,
cl_mem* mem_objects,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event) CL_EXT_SUFFIX__VERSION_1_1;
#ifdef __cplusplus
}
#endif
#endif /* __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H */

View File

@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright (c) 2008-2010 The Khronos Group Inc.
* Copyright (c) 2008-2019 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
@@ -12,6 +12,11 @@
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -24,13 +29,7 @@
#ifndef __OPENCL_CL_EGL_H
#define __OPENCL_CL_EGL_H
#ifdef __APPLE__
#else
#include <CL/cl.h>
#include <EGL/egl.h>
#include <EGL/eglext.h>
#endif
#ifdef __cplusplus
extern "C" {
@@ -62,69 +61,69 @@ typedef intptr_t cl_egl_image_properties_khr;
#define cl_khr_egl_image 1
extern CL_API_ENTRY cl_mem CL_API_CALL
clCreateFromEGLImageKHR(cl_context /* context */,
CLeglDisplayKHR /* egldisplay */,
CLeglImageKHR /* eglimage */,
cl_mem_flags /* flags */,
const cl_egl_image_properties_khr * /* properties */,
cl_int * /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;
clCreateFromEGLImageKHR(cl_context context,
CLeglDisplayKHR egldisplay,
CLeglImageKHR eglimage,
cl_mem_flags flags,
const cl_egl_image_properties_khr * properties,
cl_int * errcode_ret) CL_API_SUFFIX__VERSION_1_0;
typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromEGLImageKHR_fn)(
cl_context context,
CLeglDisplayKHR egldisplay,
CLeglImageKHR eglimage,
cl_mem_flags flags,
const cl_egl_image_properties_khr * properties,
cl_int * errcode_ret);
cl_context context,
CLeglDisplayKHR egldisplay,
CLeglImageKHR eglimage,
cl_mem_flags flags,
const cl_egl_image_properties_khr * properties,
cl_int * errcode_ret);
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueAcquireEGLObjectsKHR(cl_command_queue /* command_queue */,
cl_uint /* num_objects */,
const cl_mem * /* mem_objects */,
cl_uint /* num_events_in_wait_list */,
const cl_event * /* event_wait_list */,
cl_event * /* event */) CL_API_SUFFIX__VERSION_1_0;
clEnqueueAcquireEGLObjectsKHR(cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_API_SUFFIX__VERSION_1_0;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireEGLObjectsKHR_fn)(
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event);
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event);
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueReleaseEGLObjectsKHR(cl_command_queue /* command_queue */,
cl_uint /* num_objects */,
const cl_mem * /* mem_objects */,
cl_uint /* num_events_in_wait_list */,
const cl_event * /* event_wait_list */,
cl_event * /* event */) CL_API_SUFFIX__VERSION_1_0;
clEnqueueReleaseEGLObjectsKHR(cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_API_SUFFIX__VERSION_1_0;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseEGLObjectsKHR_fn)(
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event);
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event);
#define cl_khr_egl_event 1
extern CL_API_ENTRY cl_event CL_API_CALL
clCreateEventFromEGLSyncKHR(cl_context /* context */,
CLeglSyncKHR /* sync */,
CLeglDisplayKHR /* display */,
cl_int * /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;
clCreateEventFromEGLSyncKHR(cl_context context,
CLeglSyncKHR sync,
CLeglDisplayKHR display,
cl_int * errcode_ret) CL_API_SUFFIX__VERSION_1_0;
typedef CL_API_ENTRY cl_event (CL_API_CALL *clCreateEventFromEGLSyncKHR_fn)(
cl_context context,
CLeglSyncKHR sync,
CLeglDisplayKHR display,
cl_int * errcode_ret);
cl_context context,
CLeglSyncKHR sync,
CLeglDisplayKHR display,
cl_int * errcode_ret);
#ifdef __cplusplus
}

View File

@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright (c) 2008-2013 The Khronos Group Inc.
* Copyright (c) 2008-2019 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
@@ -12,6 +12,11 @@
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -21,8 +26,6 @@
* MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
******************************************************************************/
/* $Revision: 11928 $ on $Date: 2010-07-13 09:04:56 -0700 (Tue, 13 Jul 2010) $ */
/* cl_ext.h contains OpenCL extensions which don't have external */
/* (OpenGL, D3D) dependencies. */
@@ -33,11 +36,13 @@
extern "C" {
#endif
#ifdef __APPLE__
#include <OpenCL/cl.h>
#include <AvailabilityMacros.h>
#else
#include <CL/cl.h>
#include <CL/cl.h>
/* cl_khr_fp64 extension - no extension #define since it has no functions */
/* CL_DEVICE_DOUBLE_FP_CONFIG is defined in CL.h for OpenCL >= 120 */
#if CL_TARGET_OPENCL_VERSION <= 110
#define CL_DEVICE_DOUBLE_FP_CONFIG 0x1032
#endif
/* cl_khr_fp16 extension - no extension #define since it has no functions */
@@ -47,12 +52,12 @@ extern "C" {
*
* Apple extension for use to manage externally allocated buffers used with cl_mem objects with CL_MEM_USE_HOST_PTR
*
* Registers a user callback function that will be called when the memory object is deleted and its resources
* freed. Each call to clSetMemObjectCallbackFn registers the specified user callback function on a callback
* stack associated with memobj. The registered user callback functions are called in the reverse order in
* which they were registered. The user callback functions are called and then the memory object is deleted
* and its resources freed. This provides a mechanism for the application (and libraries) using memobj to be
* notified when the memory referenced by host_ptr, specified when the memory object is created and used as
* Registers a user callback function that will be called when the memory object is deleted and its resources
* freed. Each call to clSetMemObjectCallbackFn registers the specified user callback function on a callback
* stack associated with memobj. The registered user callback functions are called in the reverse order in
* which they were registered. The user callback functions are called and then the memory object is deleted
* and its resources freed. This provides a mechanism for the application (and libraries) using memobj to be
* notified when the memory referenced by host_ptr, specified when the memory object is created and used as
* the storage bits for the memory object, can be reused or freed.
*
* The application may not call CL api's with the cl_mem object passed to the pfn_notify.
@@ -61,9 +66,9 @@ extern "C" {
* before using.
*/
#define cl_APPLE_SetMemObjectDestructor 1
cl_int CL_API_ENTRY clSetMemObjectDestructorAPPLE( cl_mem /* memobj */,
void (* /*pfn_notify*/)( cl_mem /* memobj */, void* /*user_data*/),
void * /*user_data */ ) CL_EXT_SUFFIX__VERSION_1_0;
cl_int CL_API_ENTRY clSetMemObjectDestructorAPPLE( cl_mem memobj,
void (* pfn_notify)(cl_mem memobj, void * user_data),
void * user_data) CL_EXT_SUFFIX__VERSION_1_0;
/* Context Logging Functions
@@ -72,29 +77,29 @@ cl_int CL_API_ENTRY clSetMemObjectDestructorAPPLE( cl_mem /* memobj */,
* Please check for the "cl_APPLE_ContextLoggingFunctions" extension using clGetDeviceInfo(CL_DEVICE_EXTENSIONS)
* before using.
*
* clLogMessagesToSystemLog fowards on all log messages to the Apple System Logger
* clLogMessagesToSystemLog forwards on all log messages to the Apple System Logger
*/
#define cl_APPLE_ContextLoggingFunctions 1
extern void CL_API_ENTRY clLogMessagesToSystemLogAPPLE( const char * /* errstr */,
const void * /* private_info */,
size_t /* cb */,
void * /* user_data */ ) CL_EXT_SUFFIX__VERSION_1_0;
extern void CL_API_ENTRY clLogMessagesToSystemLogAPPLE( const char * errstr,
const void * private_info,
size_t cb,
void * user_data) CL_EXT_SUFFIX__VERSION_1_0;
/* clLogMessagesToStdout sends all log messages to the file descriptor stdout */
extern void CL_API_ENTRY clLogMessagesToStdoutAPPLE( const char * /* errstr */,
const void * /* private_info */,
size_t /* cb */,
void * /* user_data */ ) CL_EXT_SUFFIX__VERSION_1_0;
extern void CL_API_ENTRY clLogMessagesToStdoutAPPLE( const char * errstr,
const void * private_info,
size_t cb,
void * user_data) CL_EXT_SUFFIX__VERSION_1_0;
/* clLogMessagesToStderr sends all log messages to the file descriptor stderr */
extern void CL_API_ENTRY clLogMessagesToStderrAPPLE( const char * /* errstr */,
const void * /* private_info */,
size_t /* cb */,
void * /* user_data */ ) CL_EXT_SUFFIX__VERSION_1_0;
extern void CL_API_ENTRY clLogMessagesToStderrAPPLE( const char * errstr,
const void * private_info,
size_t cb,
void * user_data) CL_EXT_SUFFIX__VERSION_1_0;
/************************
* cl_khr_icd extension *
/************************
* cl_khr_icd extension *
************************/
#define cl_khr_icd 1
@@ -105,16 +110,43 @@ extern void CL_API_ENTRY clLogMessagesToStderrAPPLE( const char * /* errstr */
#define CL_PLATFORM_NOT_FOUND_KHR -1001
extern CL_API_ENTRY cl_int CL_API_CALL
clIcdGetPlatformIDsKHR(cl_uint /* num_entries */,
cl_platform_id * /* platforms */,
cl_uint * /* num_platforms */);
clIcdGetPlatformIDsKHR(cl_uint num_entries,
cl_platform_id * platforms,
cl_uint * num_platforms);
typedef CL_API_ENTRY cl_int (CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(
cl_uint /* num_entries */,
cl_platform_id * /* platforms */,
cl_uint * /* num_platforms */);
typedef CL_API_ENTRY cl_int
(CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(cl_uint num_entries,
cl_platform_id * platforms,
cl_uint * num_platforms);
/*******************************
* cl_khr_il_program extension *
*******************************/
#define cl_khr_il_program 1
/* New property to clGetDeviceInfo for retrieving supported intermediate
* languages
*/
#define CL_DEVICE_IL_VERSION_KHR 0x105B
/* New property to clGetProgramInfo for retrieving for retrieving the IL of a
* program
*/
#define CL_PROGRAM_IL_KHR 0x1169
extern CL_API_ENTRY cl_program CL_API_CALL
clCreateProgramWithILKHR(cl_context context,
const void * il,
size_t length,
cl_int * errcode_ret);
typedef CL_API_ENTRY cl_program
(CL_API_CALL *clCreateProgramWithILKHR_fn)(cl_context context,
const void * il,
size_t length,
cl_int * errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
/* Extension: cl_khr_image2D_buffer
*
* This extension allows a 2D image to be created from a cl_mem buffer without a copy.
@@ -129,31 +161,33 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(
* The pitch specified must be a multiple of CL_DEVICE_IMAGE_PITCH_ALIGNMENT pixels.
* The base address of the buffer must be aligned to CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT pixels.
*/
/*************************************
* cl_khr_initalize_memory extension *
*************************************/
#define CL_CONTEXT_MEMORY_INITIALIZE_KHR 0x200E
/**************************************
* cl_khr_initialize_memory extension *
**************************************/
#define CL_CONTEXT_MEMORY_INITIALIZE_KHR 0x2030
/**************************************
* cl_khr_terminate_context extension *
**************************************/
#define CL_DEVICE_TERMINATE_CAPABILITY_KHR 0x200F
#define CL_CONTEXT_TERMINATE_KHR 0x2010
#define CL_DEVICE_TERMINATE_CAPABILITY_KHR 0x2031
#define CL_CONTEXT_TERMINATE_KHR 0x2032
#define cl_khr_terminate_context 1
extern CL_API_ENTRY cl_int CL_API_CALL clTerminateContextKHR(cl_context /* context */) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clTerminateContextKHR(cl_context context) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_int
(CL_API_CALL *clTerminateContextKHR_fn)(cl_context context) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clTerminateContextKHR_fn)(cl_context /* context */) CL_EXT_SUFFIX__VERSION_1_2;
/*
* Extension: cl_khr_spir
*
* This extension adds support to create an OpenCL program object from a
* This extension adds support to create an OpenCL program object from a
* Standard Portable Intermediate Representation (SPIR) instance
*/
@@ -161,9 +195,30 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clTerminateContextKHR_fn)(cl_context /
#define CL_PROGRAM_BINARY_TYPE_INTERMEDIATE 0x40E1
/*****************************************
* cl_khr_create_command_queue extension *
*****************************************/
#define cl_khr_create_command_queue 1
typedef cl_bitfield cl_queue_properties_khr;
extern CL_API_ENTRY cl_command_queue CL_API_CALL
clCreateCommandQueueWithPropertiesKHR(cl_context context,
cl_device_id device,
const cl_queue_properties_khr* properties,
cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_command_queue
(CL_API_CALL *clCreateCommandQueueWithPropertiesKHR_fn)(cl_context context,
cl_device_id device,
const cl_queue_properties_khr* properties,
cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
/******************************************
* cl_nv_device_attribute_query extension *
******************************************/
/* cl_nv_device_attribute_query extension - no extension #define since it has no functions */
#define CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV 0x4000
#define CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV 0x4001
@@ -173,88 +228,124 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clTerminateContextKHR_fn)(cl_context /
#define CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV 0x4005
#define CL_DEVICE_INTEGRATED_MEMORY_NV 0x4006
/*********************************
* cl_amd_device_attribute_query *
*********************************/
#define CL_DEVICE_PROFILING_TIMER_OFFSET_AMD 0x4036
/*********************************
* cl_arm_printf extension
*********************************/
#define CL_PRINTF_CALLBACK_ARM 0x40B0
#define CL_PRINTF_BUFFERSIZE_ARM 0x40B1
#ifdef CL_VERSION_1_1
/***********************************
* cl_ext_device_fission extension *
***********************************/
#define cl_ext_device_fission 1
extern CL_API_ENTRY cl_int CL_API_CALL
clReleaseDeviceEXT( cl_device_id /*device*/ ) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int
(CL_API_CALL *clReleaseDeviceEXT_fn)( cl_device_id /*device*/ ) CL_EXT_SUFFIX__VERSION_1_1;
extern CL_API_ENTRY cl_int CL_API_CALL
clRetainDeviceEXT( cl_device_id /*device*/ ) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int
(CL_API_CALL *clRetainDeviceEXT_fn)( cl_device_id /*device*/ ) CL_EXT_SUFFIX__VERSION_1_1;
/***********************************
* cl_ext_device_fission extension
***********************************/
#define cl_ext_device_fission 1
typedef cl_ulong cl_device_partition_property_ext;
extern CL_API_ENTRY cl_int CL_API_CALL
clCreateSubDevicesEXT( cl_device_id /*in_device*/,
const cl_device_partition_property_ext * /* properties */,
cl_uint /*num_entries*/,
cl_device_id * /*out_devices*/,
cl_uint * /*num_devices*/ ) CL_EXT_SUFFIX__VERSION_1_1;
extern CL_API_ENTRY cl_int CL_API_CALL
clReleaseDeviceEXT(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int
( CL_API_CALL * clCreateSubDevicesEXT_fn)( cl_device_id /*in_device*/,
const cl_device_partition_property_ext * /* properties */,
cl_uint /*num_entries*/,
cl_device_id * /*out_devices*/,
cl_uint * /*num_devices*/ ) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int
(CL_API_CALL *clReleaseDeviceEXT_fn)(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;
extern CL_API_ENTRY cl_int CL_API_CALL
clRetainDeviceEXT(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int
(CL_API_CALL *clRetainDeviceEXT_fn)(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;
typedef cl_ulong cl_device_partition_property_ext;
extern CL_API_ENTRY cl_int CL_API_CALL
clCreateSubDevicesEXT(cl_device_id in_device,
const cl_device_partition_property_ext * properties,
cl_uint num_entries,
cl_device_id * out_devices,
cl_uint * num_devices) CL_EXT_SUFFIX__VERSION_1_1;
typedef CL_API_ENTRY cl_int
(CL_API_CALL * clCreateSubDevicesEXT_fn)(cl_device_id in_device,
const cl_device_partition_property_ext * properties,
cl_uint num_entries,
cl_device_id * out_devices,
cl_uint * num_devices) CL_EXT_SUFFIX__VERSION_1_1;
/* cl_device_partition_property_ext */
#define CL_DEVICE_PARTITION_EQUALLY_EXT 0x4050
#define CL_DEVICE_PARTITION_BY_COUNTS_EXT 0x4051
#define CL_DEVICE_PARTITION_BY_NAMES_EXT 0x4052
#define CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN_EXT 0x4053
/* clDeviceGetInfo selectors */
#define CL_DEVICE_PARENT_DEVICE_EXT 0x4054
#define CL_DEVICE_PARTITION_TYPES_EXT 0x4055
#define CL_DEVICE_AFFINITY_DOMAINS_EXT 0x4056
#define CL_DEVICE_REFERENCE_COUNT_EXT 0x4057
#define CL_DEVICE_PARTITION_STYLE_EXT 0x4058
/* error codes */
#define CL_DEVICE_PARTITION_FAILED_EXT -1057
#define CL_INVALID_PARTITION_COUNT_EXT -1058
#define CL_INVALID_PARTITION_NAME_EXT -1059
/* CL_AFFINITY_DOMAINs */
#define CL_AFFINITY_DOMAIN_L1_CACHE_EXT 0x1
#define CL_AFFINITY_DOMAIN_L2_CACHE_EXT 0x2
#define CL_AFFINITY_DOMAIN_L3_CACHE_EXT 0x3
#define CL_AFFINITY_DOMAIN_L4_CACHE_EXT 0x4
#define CL_AFFINITY_DOMAIN_NUMA_EXT 0x10
#define CL_AFFINITY_DOMAIN_NEXT_FISSIONABLE_EXT 0x100
/* cl_device_partition_property_ext list terminators */
#define CL_PROPERTIES_LIST_END_EXT ((cl_device_partition_property_ext) 0)
#define CL_PARTITION_BY_COUNTS_LIST_END_EXT ((cl_device_partition_property_ext) 0)
#define CL_PARTITION_BY_NAMES_LIST_END_EXT ((cl_device_partition_property_ext) 0 - 1)
/***********************************
* cl_ext_migrate_memobject extension definitions
***********************************/
#define cl_ext_migrate_memobject 1
typedef cl_bitfield cl_mem_migration_flags_ext;
#define CL_MIGRATE_MEM_OBJECT_HOST_EXT 0x1
#define CL_COMMAND_MIGRATE_MEM_OBJECT_EXT 0x4040
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueMigrateMemObjectEXT(cl_command_queue command_queue,
cl_uint num_mem_objects,
const cl_mem * mem_objects,
cl_mem_migration_flags_ext flags,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event);
typedef CL_API_ENTRY cl_int
(CL_API_CALL *clEnqueueMigrateMemObjectEXT_fn)(cl_command_queue command_queue,
cl_uint num_mem_objects,
const cl_mem * mem_objects,
cl_mem_migration_flags_ext flags,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event);
/* cl_device_partition_property_ext */
#define CL_DEVICE_PARTITION_EQUALLY_EXT 0x4050
#define CL_DEVICE_PARTITION_BY_COUNTS_EXT 0x4051
#define CL_DEVICE_PARTITION_BY_NAMES_EXT 0x4052
#define CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN_EXT 0x4053
/* clDeviceGetInfo selectors */
#define CL_DEVICE_PARENT_DEVICE_EXT 0x4054
#define CL_DEVICE_PARTITION_TYPES_EXT 0x4055
#define CL_DEVICE_AFFINITY_DOMAINS_EXT 0x4056
#define CL_DEVICE_REFERENCE_COUNT_EXT 0x4057
#define CL_DEVICE_PARTITION_STYLE_EXT 0x4058
/* error codes */
#define CL_DEVICE_PARTITION_FAILED_EXT -1057
#define CL_INVALID_PARTITION_COUNT_EXT -1058
#define CL_INVALID_PARTITION_NAME_EXT -1059
/* CL_AFFINITY_DOMAINs */
#define CL_AFFINITY_DOMAIN_L1_CACHE_EXT 0x1
#define CL_AFFINITY_DOMAIN_L2_CACHE_EXT 0x2
#define CL_AFFINITY_DOMAIN_L3_CACHE_EXT 0x3
#define CL_AFFINITY_DOMAIN_L4_CACHE_EXT 0x4
#define CL_AFFINITY_DOMAIN_NUMA_EXT 0x10
#define CL_AFFINITY_DOMAIN_NEXT_FISSIONABLE_EXT 0x100
/* cl_device_partition_property_ext list terminators */
#define CL_PROPERTIES_LIST_END_EXT ((cl_device_partition_property_ext) 0)
#define CL_PARTITION_BY_COUNTS_LIST_END_EXT ((cl_device_partition_property_ext) 0)
#define CL_PARTITION_BY_NAMES_LIST_END_EXT ((cl_device_partition_property_ext) 0 - 1)
/*********************************
* cl_qcom_ext_host_ptr extension
*********************************/
#define cl_qcom_ext_host_ptr 1
#define CL_MEM_EXT_HOST_PTR_QCOM (1 << 29)
#define CL_DEVICE_EXT_MEM_PADDING_IN_BYTES_QCOM 0x40A0
#define CL_DEVICE_EXT_MEM_PADDING_IN_BYTES_QCOM 0x40A0
#define CL_DEVICE_PAGE_SIZE_QCOM 0x40A1
#define CL_IMAGE_ROW_ALIGNMENT_QCOM 0x40A2
#define CL_IMAGE_SLICE_ALIGNMENT_QCOM 0x40A3
@@ -280,12 +371,21 @@ typedef struct _cl_mem_ext_host_ptr
/* Type of external memory allocation. */
/* Legal values will be defined in layered extensions. */
cl_uint allocation_type;
/* Host cache policy for this external memory allocation. */
/* Host cache policy for this external memory allocation. */
cl_uint host_cache_policy;
} cl_mem_ext_host_ptr;
/*******************************************
* cl_qcom_ext_host_ptr_iocoherent extension
********************************************/
/* Cache policy specifying io-coherence */
#define CL_MEM_HOST_IOCOHERENT_QCOM 0x40A9
/*********************************
* cl_qcom_ion_host_ptr extension
*********************************/
@@ -300,13 +400,339 @@ typedef struct _cl_mem_ion_host_ptr
/* ION file descriptor */
int ion_filedesc;
/* Host pointer to the ION allocated memory */
void* ion_hostptr;
} cl_mem_ion_host_ptr;
#endif /* CL_VERSION_1_1 */
/*********************************
* cl_qcom_android_native_buffer_host_ptr extension
*********************************/
#define CL_MEM_ANDROID_NATIVE_BUFFER_HOST_PTR_QCOM 0x40C6
typedef struct _cl_mem_android_native_buffer_host_ptr
{
/* Type of external memory allocation. */
/* Must be CL_MEM_ANDROID_NATIVE_BUFFER_HOST_PTR_QCOM for Android native buffers. */
cl_mem_ext_host_ptr ext_host_ptr;
/* Virtual pointer to the android native buffer */
void* anb_ptr;
} cl_mem_android_native_buffer_host_ptr;
/******************************************
* cl_img_yuv_image extension *
******************************************/
/* Image formats used in clCreateImage */
#define CL_NV21_IMG 0x40D0
#define CL_YV12_IMG 0x40D1
/******************************************
* cl_img_cached_allocations extension *
******************************************/
/* Flag values used by clCreateBuffer */
#define CL_MEM_USE_UNCACHED_CPU_MEMORY_IMG (1 << 26)
#define CL_MEM_USE_CACHED_CPU_MEMORY_IMG (1 << 27)
/******************************************
* cl_img_use_gralloc_ptr extension *
******************************************/
#define cl_img_use_gralloc_ptr 1
/* Flag values used by clCreateBuffer */
#define CL_MEM_USE_GRALLOC_PTR_IMG (1 << 28)
/* To be used by clGetEventInfo: */
#define CL_COMMAND_ACQUIRE_GRALLOC_OBJECTS_IMG 0x40D2
#define CL_COMMAND_RELEASE_GRALLOC_OBJECTS_IMG 0x40D3
/* Error code from clEnqueueReleaseGrallocObjectsIMG */
#define CL_GRALLOC_RESOURCE_NOT_ACQUIRED_IMG 0x40D4
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueAcquireGrallocObjectsIMG(cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueReleaseGrallocObjectsIMG(cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_EXT_SUFFIX__VERSION_1_2;
/*********************************
* cl_khr_subgroups extension
*********************************/
#define cl_khr_subgroups 1
#if !defined(CL_VERSION_2_1)
/* For OpenCL 2.1 and newer, cl_kernel_sub_group_info is declared in CL.h.
In hindsight, there should have been a khr suffix on this type for
the extension, but keeping it un-suffixed to maintain backwards
compatibility. */
typedef cl_uint cl_kernel_sub_group_info;
#endif
/* cl_kernel_sub_group_info */
#define CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE_KHR 0x2033
#define CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE_KHR 0x2034
extern CL_API_ENTRY cl_int CL_API_CALL
clGetKernelSubGroupInfoKHR(cl_kernel in_kernel,
cl_device_id in_device,
cl_kernel_sub_group_info param_name,
size_t input_value_size,
const void * input_value,
size_t param_value_size,
void * param_value,
size_t * param_value_size_ret) CL_EXT_SUFFIX__VERSION_2_0_DEPRECATED;
typedef CL_API_ENTRY cl_int
(CL_API_CALL * clGetKernelSubGroupInfoKHR_fn)(cl_kernel in_kernel,
cl_device_id in_device,
cl_kernel_sub_group_info param_name,
size_t input_value_size,
const void * input_value,
size_t param_value_size,
void * param_value,
size_t * param_value_size_ret) CL_EXT_SUFFIX__VERSION_2_0_DEPRECATED;
/*********************************
* cl_khr_mipmap_image extension
*********************************/
/* cl_sampler_properties */
#define CL_SAMPLER_MIP_FILTER_MODE_KHR 0x1155
#define CL_SAMPLER_LOD_MIN_KHR 0x1156
#define CL_SAMPLER_LOD_MAX_KHR 0x1157
/*********************************
* cl_khr_priority_hints extension
*********************************/
/* This extension define is for backwards compatibility.
It shouldn't be required since this extension has no new functions. */
#define cl_khr_priority_hints 1
typedef cl_uint cl_queue_priority_khr;
/* cl_command_queue_properties */
#define CL_QUEUE_PRIORITY_KHR 0x1096
/* cl_queue_priority_khr */
#define CL_QUEUE_PRIORITY_HIGH_KHR (1<<0)
#define CL_QUEUE_PRIORITY_MED_KHR (1<<1)
#define CL_QUEUE_PRIORITY_LOW_KHR (1<<2)
/*********************************
* cl_khr_throttle_hints extension
*********************************/
/* This extension define is for backwards compatibility.
It shouldn't be required since this extension has no new functions. */
#define cl_khr_throttle_hints 1
typedef cl_uint cl_queue_throttle_khr;
/* cl_command_queue_properties */
#define CL_QUEUE_THROTTLE_KHR 0x1097
/* cl_queue_throttle_khr */
#define CL_QUEUE_THROTTLE_HIGH_KHR (1<<0)
#define CL_QUEUE_THROTTLE_MED_KHR (1<<1)
#define CL_QUEUE_THROTTLE_LOW_KHR (1<<2)
/*********************************
* cl_khr_subgroup_named_barrier
*********************************/
/* This extension define is for backwards compatibility.
It shouldn't be required since this extension has no new functions. */
#define cl_khr_subgroup_named_barrier 1
/* cl_device_info */
#define CL_DEVICE_MAX_NAMED_BARRIER_COUNT_KHR 0x2035
/**********************************
* cl_arm_import_memory extension *
**********************************/
#define cl_arm_import_memory 1
typedef intptr_t cl_import_properties_arm;
/* Default and valid proporties name for cl_arm_import_memory */
#define CL_IMPORT_TYPE_ARM 0x40B2
/* Host process memory type default value for CL_IMPORT_TYPE_ARM property */
#define CL_IMPORT_TYPE_HOST_ARM 0x40B3
/* DMA BUF memory type value for CL_IMPORT_TYPE_ARM property */
#define CL_IMPORT_TYPE_DMA_BUF_ARM 0x40B4
/* Protected DMA BUF memory type value for CL_IMPORT_TYPE_ARM property */
#define CL_IMPORT_TYPE_PROTECTED_ARM 0x40B5
/* This extension adds a new function that allows for direct memory import into
* OpenCL via the clImportMemoryARM function.
*
* Memory imported through this interface will be mapped into the device's page
* tables directly, providing zero copy access. It will never fall back to copy
* operations and aliased buffers.
*
* Types of memory supported for import are specified as additional extension
* strings.
*
* This extension produces cl_mem allocations which are compatible with all other
* users of cl_mem in the standard API.
*
* This extension maps pages with the same properties as the normal buffer creation
* function clCreateBuffer.
*/
extern CL_API_ENTRY cl_mem CL_API_CALL
clImportMemoryARM( cl_context context,
cl_mem_flags flags,
const cl_import_properties_arm *properties,
void *memory,
size_t size,
cl_int *errcode_ret) CL_EXT_SUFFIX__VERSION_1_0;
/******************************************
* cl_arm_shared_virtual_memory extension *
******************************************/
#define cl_arm_shared_virtual_memory 1
/* Used by clGetDeviceInfo */
#define CL_DEVICE_SVM_CAPABILITIES_ARM 0x40B6
/* Used by clGetMemObjectInfo */
#define CL_MEM_USES_SVM_POINTER_ARM 0x40B7
/* Used by clSetKernelExecInfoARM: */
#define CL_KERNEL_EXEC_INFO_SVM_PTRS_ARM 0x40B8
#define CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM_ARM 0x40B9
/* To be used by clGetEventInfo: */
#define CL_COMMAND_SVM_FREE_ARM 0x40BA
#define CL_COMMAND_SVM_MEMCPY_ARM 0x40BB
#define CL_COMMAND_SVM_MEMFILL_ARM 0x40BC
#define CL_COMMAND_SVM_MAP_ARM 0x40BD
#define CL_COMMAND_SVM_UNMAP_ARM 0x40BE
/* Flag values returned by clGetDeviceInfo with CL_DEVICE_SVM_CAPABILITIES_ARM as the param_name. */
#define CL_DEVICE_SVM_COARSE_GRAIN_BUFFER_ARM (1 << 0)
#define CL_DEVICE_SVM_FINE_GRAIN_BUFFER_ARM (1 << 1)
#define CL_DEVICE_SVM_FINE_GRAIN_SYSTEM_ARM (1 << 2)
#define CL_DEVICE_SVM_ATOMICS_ARM (1 << 3)
/* Flag values used by clSVMAllocARM: */
#define CL_MEM_SVM_FINE_GRAIN_BUFFER_ARM (1 << 10)
#define CL_MEM_SVM_ATOMICS_ARM (1 << 11)
typedef cl_bitfield cl_svm_mem_flags_arm;
typedef cl_uint cl_kernel_exec_info_arm;
typedef cl_bitfield cl_device_svm_capabilities_arm;
extern CL_API_ENTRY void * CL_API_CALL
clSVMAllocARM(cl_context context,
cl_svm_mem_flags_arm flags,
size_t size,
cl_uint alignment) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY void CL_API_CALL
clSVMFreeARM(cl_context context,
void * svm_pointer) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueSVMFreeARM(cl_command_queue command_queue,
cl_uint num_svm_pointers,
void * svm_pointers[],
void (CL_CALLBACK * pfn_free_func)(cl_command_queue queue,
cl_uint num_svm_pointers,
void * svm_pointers[],
void * user_data),
void * user_data,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueSVMMemcpyARM(cl_command_queue command_queue,
cl_bool blocking_copy,
void * dst_ptr,
const void * src_ptr,
size_t size,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueSVMMemFillARM(cl_command_queue command_queue,
void * svm_ptr,
const void * pattern,
size_t pattern_size,
size_t size,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueSVMMapARM(cl_command_queue command_queue,
cl_bool blocking_map,
cl_map_flags flags,
void * svm_ptr,
size_t size,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueSVMUnmapARM(cl_command_queue command_queue,
void * svm_ptr,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clSetKernelArgSVMPointerARM(cl_kernel kernel,
cl_uint arg_index,
const void * arg_value) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clSetKernelExecInfoARM(cl_kernel kernel,
cl_kernel_exec_info_arm param_name,
size_t param_value_size,
const void * param_value) CL_EXT_SUFFIX__VERSION_1_2;
/********************************
* cl_arm_get_core_id extension *
********************************/
#ifdef CL_VERSION_1_2
#define cl_arm_get_core_id 1
/* Device info property for bitfield of cores present */
#define CL_DEVICE_COMPUTE_UNITS_BITFIELD_ARM 0x40BF
#endif /* CL_VERSION_1_2 */
#ifdef __cplusplus
}

423
include/CL/cl_ext_intel.h Normal file
View File

@@ -0,0 +1,423 @@
/*******************************************************************************
* Copyright (c) 2008-2019 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
* "Materials"), to deal in the Materials without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Materials, and to
* permit persons to whom the Materials are furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
* CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
******************************************************************************/
/*****************************************************************************\
Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.
THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE
MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
File Name: cl_ext_intel.h
Abstract:
Notes:
\*****************************************************************************/
#ifndef __CL_EXT_INTEL_H
#define __CL_EXT_INTEL_H
#include <CL/cl.h>
#include <CL/cl_platform.h>
#ifdef __cplusplus
extern "C" {
#endif
/***************************************
* cl_intel_thread_local_exec extension *
****************************************/
#define cl_intel_thread_local_exec 1
#define CL_QUEUE_THREAD_LOCAL_EXEC_ENABLE_INTEL (((cl_bitfield)1) << 31)
/***********************************************
* cl_intel_device_partition_by_names extension *
************************************************/
#define cl_intel_device_partition_by_names 1
#define CL_DEVICE_PARTITION_BY_NAMES_INTEL 0x4052
#define CL_PARTITION_BY_NAMES_LIST_END_INTEL -1
/************************************************
* cl_intel_accelerator extension *
* cl_intel_motion_estimation extension *
* cl_intel_advanced_motion_estimation extension *
*************************************************/
#define cl_intel_accelerator 1
#define cl_intel_motion_estimation 1
#define cl_intel_advanced_motion_estimation 1
typedef struct _cl_accelerator_intel* cl_accelerator_intel;
typedef cl_uint cl_accelerator_type_intel;
typedef cl_uint cl_accelerator_info_intel;
typedef struct _cl_motion_estimation_desc_intel {
cl_uint mb_block_type;
cl_uint subpixel_mode;
cl_uint sad_adjust_mode;
cl_uint search_path_type;
} cl_motion_estimation_desc_intel;
/* error codes */
#define CL_INVALID_ACCELERATOR_INTEL -1094
#define CL_INVALID_ACCELERATOR_TYPE_INTEL -1095
#define CL_INVALID_ACCELERATOR_DESCRIPTOR_INTEL -1096
#define CL_ACCELERATOR_TYPE_NOT_SUPPORTED_INTEL -1097
/* cl_accelerator_type_intel */
#define CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL 0x0
/* cl_accelerator_info_intel */
#define CL_ACCELERATOR_DESCRIPTOR_INTEL 0x4090
#define CL_ACCELERATOR_REFERENCE_COUNT_INTEL 0x4091
#define CL_ACCELERATOR_CONTEXT_INTEL 0x4092
#define CL_ACCELERATOR_TYPE_INTEL 0x4093
/* cl_motion_detect_desc_intel flags */
#define CL_ME_MB_TYPE_16x16_INTEL 0x0
#define CL_ME_MB_TYPE_8x8_INTEL 0x1
#define CL_ME_MB_TYPE_4x4_INTEL 0x2
#define CL_ME_SUBPIXEL_MODE_INTEGER_INTEL 0x0
#define CL_ME_SUBPIXEL_MODE_HPEL_INTEL 0x1
#define CL_ME_SUBPIXEL_MODE_QPEL_INTEL 0x2
#define CL_ME_SAD_ADJUST_MODE_NONE_INTEL 0x0
#define CL_ME_SAD_ADJUST_MODE_HAAR_INTEL 0x1
#define CL_ME_SEARCH_PATH_RADIUS_2_2_INTEL 0x0
#define CL_ME_SEARCH_PATH_RADIUS_4_4_INTEL 0x1
#define CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL 0x5
#define CL_ME_SKIP_BLOCK_TYPE_16x16_INTEL 0x0
#define CL_ME_CHROMA_INTRA_PREDICT_ENABLED_INTEL 0x1
#define CL_ME_LUMA_INTRA_PREDICT_ENABLED_INTEL 0x2
#define CL_ME_SKIP_BLOCK_TYPE_8x8_INTEL 0x4
#define CL_ME_FORWARD_INPUT_MODE_INTEL 0x1
#define CL_ME_BACKWARD_INPUT_MODE_INTEL 0x2
#define CL_ME_BIDIRECTION_INPUT_MODE_INTEL 0x3
#define CL_ME_BIDIR_WEIGHT_QUARTER_INTEL 16
#define CL_ME_BIDIR_WEIGHT_THIRD_INTEL 21
#define CL_ME_BIDIR_WEIGHT_HALF_INTEL 32
#define CL_ME_BIDIR_WEIGHT_TWO_THIRD_INTEL 43
#define CL_ME_BIDIR_WEIGHT_THREE_QUARTER_INTEL 48
#define CL_ME_COST_PENALTY_NONE_INTEL 0x0
#define CL_ME_COST_PENALTY_LOW_INTEL 0x1
#define CL_ME_COST_PENALTY_NORMAL_INTEL 0x2
#define CL_ME_COST_PENALTY_HIGH_INTEL 0x3
#define CL_ME_COST_PRECISION_QPEL_INTEL 0x0
#define CL_ME_COST_PRECISION_HPEL_INTEL 0x1
#define CL_ME_COST_PRECISION_PEL_INTEL 0x2
#define CL_ME_COST_PRECISION_DPEL_INTEL 0x3
#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_INTEL 0x0
#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_INTEL 0x1
#define CL_ME_LUMA_PREDICTOR_MODE_DC_INTEL 0x2
#define CL_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_LEFT_INTEL 0x3
#define CL_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_RIGHT_INTEL 0x4
#define CL_ME_LUMA_PREDICTOR_MODE_PLANE_INTEL 0x4
#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_RIGHT_INTEL 0x5
#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_DOWN_INTEL 0x6
#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_LEFT_INTEL 0x7
#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_UP_INTEL 0x8
#define CL_ME_CHROMA_PREDICTOR_MODE_DC_INTEL 0x0
#define CL_ME_CHROMA_PREDICTOR_MODE_HORIZONTAL_INTEL 0x1
#define CL_ME_CHROMA_PREDICTOR_MODE_VERTICAL_INTEL 0x2
#define CL_ME_CHROMA_PREDICTOR_MODE_PLANE_INTEL 0x3
/* cl_device_info */
#define CL_DEVICE_ME_VERSION_INTEL 0x407E
#define CL_ME_VERSION_LEGACY_INTEL 0x0
#define CL_ME_VERSION_ADVANCED_VER_1_INTEL 0x1
#define CL_ME_VERSION_ADVANCED_VER_2_INTEL 0x2
extern CL_API_ENTRY cl_accelerator_intel CL_API_CALL
clCreateAcceleratorINTEL(
cl_context context,
cl_accelerator_type_intel accelerator_type,
size_t descriptor_size,
const void* descriptor,
cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_accelerator_intel (CL_API_CALL *clCreateAcceleratorINTEL_fn)(
cl_context context,
cl_accelerator_type_intel accelerator_type,
size_t descriptor_size,
const void* descriptor,
cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clGetAcceleratorInfoINTEL(
cl_accelerator_intel accelerator,
cl_accelerator_info_intel param_name,
size_t param_value_size,
void* param_value,
size_t* param_value_size_ret) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetAcceleratorInfoINTEL_fn)(
cl_accelerator_intel accelerator,
cl_accelerator_info_intel param_name,
size_t param_value_size,
void* param_value,
size_t* param_value_size_ret) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clRetainAcceleratorINTEL(
cl_accelerator_intel accelerator) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clRetainAcceleratorINTEL_fn)(
cl_accelerator_intel accelerator) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clReleaseAcceleratorINTEL(
cl_accelerator_intel accelerator) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clReleaseAcceleratorINTEL_fn)(
cl_accelerator_intel accelerator) CL_EXT_SUFFIX__VERSION_1_2;
/******************************************
* cl_intel_simultaneous_sharing extension *
*******************************************/
#define cl_intel_simultaneous_sharing 1
#define CL_DEVICE_SIMULTANEOUS_INTEROPS_INTEL 0x4104
#define CL_DEVICE_NUM_SIMULTANEOUS_INTEROPS_INTEL 0x4105
/***********************************
* cl_intel_egl_image_yuv extension *
************************************/
#define cl_intel_egl_image_yuv 1
#define CL_EGL_YUV_PLANE_INTEL 0x4107
/********************************
* cl_intel_packed_yuv extension *
*********************************/
#define cl_intel_packed_yuv 1
#define CL_YUYV_INTEL 0x4076
#define CL_UYVY_INTEL 0x4077
#define CL_YVYU_INTEL 0x4078
#define CL_VYUY_INTEL 0x4079
/********************************************
* cl_intel_required_subgroup_size extension *
*********************************************/
#define cl_intel_required_subgroup_size 1
#define CL_DEVICE_SUB_GROUP_SIZES_INTEL 0x4108
#define CL_KERNEL_SPILL_MEM_SIZE_INTEL 0x4109
#define CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL 0x410A
/****************************************
* cl_intel_driver_diagnostics extension *
*****************************************/
#define cl_intel_driver_diagnostics 1
typedef cl_uint cl_diagnostics_verbose_level;
#define CL_CONTEXT_SHOW_DIAGNOSTICS_INTEL 0x4106
#define CL_CONTEXT_DIAGNOSTICS_LEVEL_ALL_INTEL ( 0xff )
#define CL_CONTEXT_DIAGNOSTICS_LEVEL_GOOD_INTEL ( 1 )
#define CL_CONTEXT_DIAGNOSTICS_LEVEL_BAD_INTEL ( 1 << 1 )
#define CL_CONTEXT_DIAGNOSTICS_LEVEL_NEUTRAL_INTEL ( 1 << 2 )
/********************************
* cl_intel_planar_yuv extension *
*********************************/
#define CL_NV12_INTEL 0x410E
#define CL_MEM_NO_ACCESS_INTEL ( 1 << 24 )
#define CL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTEL ( 1 << 25 )
#define CL_DEVICE_PLANAR_YUV_MAX_WIDTH_INTEL 0x417E
#define CL_DEVICE_PLANAR_YUV_MAX_HEIGHT_INTEL 0x417F
/*******************************************************
* cl_intel_device_side_avc_motion_estimation extension *
********************************************************/
#define CL_DEVICE_AVC_ME_VERSION_INTEL 0x410B
#define CL_DEVICE_AVC_ME_SUPPORTS_TEXTURE_SAMPLER_USE_INTEL 0x410C
#define CL_DEVICE_AVC_ME_SUPPORTS_PREEMPTION_INTEL 0x410D
#define CL_AVC_ME_VERSION_0_INTEL 0x0; // No support.
#define CL_AVC_ME_VERSION_1_INTEL 0x1; // First supported version.
#define CL_AVC_ME_MAJOR_16x16_INTEL 0x0
#define CL_AVC_ME_MAJOR_16x8_INTEL 0x1
#define CL_AVC_ME_MAJOR_8x16_INTEL 0x2
#define CL_AVC_ME_MAJOR_8x8_INTEL 0x3
#define CL_AVC_ME_MINOR_8x8_INTEL 0x0
#define CL_AVC_ME_MINOR_8x4_INTEL 0x1
#define CL_AVC_ME_MINOR_4x8_INTEL 0x2
#define CL_AVC_ME_MINOR_4x4_INTEL 0x3
#define CL_AVC_ME_MAJOR_FORWARD_INTEL 0x0
#define CL_AVC_ME_MAJOR_BACKWARD_INTEL 0x1
#define CL_AVC_ME_MAJOR_BIDIRECTIONAL_INTEL 0x2
#define CL_AVC_ME_PARTITION_MASK_ALL_INTEL 0x0
#define CL_AVC_ME_PARTITION_MASK_16x16_INTEL 0x7E
#define CL_AVC_ME_PARTITION_MASK_16x8_INTEL 0x7D
#define CL_AVC_ME_PARTITION_MASK_8x16_INTEL 0x7B
#define CL_AVC_ME_PARTITION_MASK_8x8_INTEL 0x77
#define CL_AVC_ME_PARTITION_MASK_8x4_INTEL 0x6F
#define CL_AVC_ME_PARTITION_MASK_4x8_INTEL 0x5F
#define CL_AVC_ME_PARTITION_MASK_4x4_INTEL 0x3F
#define CL_AVC_ME_SEARCH_WINDOW_EXHAUSTIVE_INTEL 0x0
#define CL_AVC_ME_SEARCH_WINDOW_SMALL_INTEL 0x1
#define CL_AVC_ME_SEARCH_WINDOW_TINY_INTEL 0x2
#define CL_AVC_ME_SEARCH_WINDOW_EXTRA_TINY_INTEL 0x3
#define CL_AVC_ME_SEARCH_WINDOW_DIAMOND_INTEL 0x4
#define CL_AVC_ME_SEARCH_WINDOW_LARGE_DIAMOND_INTEL 0x5
#define CL_AVC_ME_SEARCH_WINDOW_RESERVED0_INTEL 0x6
#define CL_AVC_ME_SEARCH_WINDOW_RESERVED1_INTEL 0x7
#define CL_AVC_ME_SEARCH_WINDOW_CUSTOM_INTEL 0x8
#define CL_AVC_ME_SEARCH_WINDOW_16x12_RADIUS_INTEL 0x9
#define CL_AVC_ME_SEARCH_WINDOW_4x4_RADIUS_INTEL 0x2
#define CL_AVC_ME_SEARCH_WINDOW_2x2_RADIUS_INTEL 0xa
#define CL_AVC_ME_SAD_ADJUST_MODE_NONE_INTEL 0x0
#define CL_AVC_ME_SAD_ADJUST_MODE_HAAR_INTEL 0x2
#define CL_AVC_ME_SUBPIXEL_MODE_INTEGER_INTEL 0x0
#define CL_AVC_ME_SUBPIXEL_MODE_HPEL_INTEL 0x1
#define CL_AVC_ME_SUBPIXEL_MODE_QPEL_INTEL 0x3
#define CL_AVC_ME_COST_PRECISION_QPEL_INTEL 0x0
#define CL_AVC_ME_COST_PRECISION_HPEL_INTEL 0x1
#define CL_AVC_ME_COST_PRECISION_PEL_INTEL 0x2
#define CL_AVC_ME_COST_PRECISION_DPEL_INTEL 0x3
#define CL_AVC_ME_BIDIR_WEIGHT_QUARTER_INTEL 0x10
#define CL_AVC_ME_BIDIR_WEIGHT_THIRD_INTEL 0x15
#define CL_AVC_ME_BIDIR_WEIGHT_HALF_INTEL 0x20
#define CL_AVC_ME_BIDIR_WEIGHT_TWO_THIRD_INTEL 0x2B
#define CL_AVC_ME_BIDIR_WEIGHT_THREE_QUARTER_INTEL 0x30
#define CL_AVC_ME_BORDER_REACHED_LEFT_INTEL 0x0
#define CL_AVC_ME_BORDER_REACHED_RIGHT_INTEL 0x2
#define CL_AVC_ME_BORDER_REACHED_TOP_INTEL 0x4
#define CL_AVC_ME_BORDER_REACHED_BOTTOM_INTEL 0x8
#define CL_AVC_ME_SKIP_BLOCK_PARTITION_16x16_INTEL 0x0
#define CL_AVC_ME_SKIP_BLOCK_PARTITION_8x8_INTEL 0x4000
#define CL_AVC_ME_SKIP_BLOCK_16x16_FORWARD_ENABLE_INTEL ( 0x1 << 24 )
#define CL_AVC_ME_SKIP_BLOCK_16x16_BACKWARD_ENABLE_INTEL ( 0x2 << 24 )
#define CL_AVC_ME_SKIP_BLOCK_16x16_DUAL_ENABLE_INTEL ( 0x3 << 24 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_FORWARD_ENABLE_INTEL ( 0x55 << 24 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_BACKWARD_ENABLE_INTEL ( 0xAA << 24 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_DUAL_ENABLE_INTEL ( 0xFF << 24 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_0_FORWARD_ENABLE_INTEL ( 0x1 << 24 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_0_BACKWARD_ENABLE_INTEL ( 0x2 << 24 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_1_FORWARD_ENABLE_INTEL ( 0x1 << 26 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_1_BACKWARD_ENABLE_INTEL ( 0x2 << 26 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_2_FORWARD_ENABLE_INTEL ( 0x1 << 28 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_2_BACKWARD_ENABLE_INTEL ( 0x2 << 28 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_3_FORWARD_ENABLE_INTEL ( 0x1 << 30 )
#define CL_AVC_ME_SKIP_BLOCK_8x8_3_BACKWARD_ENABLE_INTEL ( 0x2 << 30 )
#define CL_AVC_ME_BLOCK_BASED_SKIP_4x4_INTEL 0x00
#define CL_AVC_ME_BLOCK_BASED_SKIP_8x8_INTEL 0x80
#define CL_AVC_ME_INTRA_16x16_INTEL 0x0
#define CL_AVC_ME_INTRA_8x8_INTEL 0x1
#define CL_AVC_ME_INTRA_4x4_INTEL 0x2
#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_16x16_INTEL 0x6
#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_8x8_INTEL 0x5
#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_4x4_INTEL 0x3
#define CL_AVC_ME_INTRA_NEIGHBOR_LEFT_MASK_ENABLE_INTEL 0x60
#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_MASK_ENABLE_INTEL 0x10
#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_RIGHT_MASK_ENABLE_INTEL 0x8
#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_LEFT_MASK_ENABLE_INTEL 0x4
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_INTEL 0x0
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_INTEL 0x1
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DC_INTEL 0x2
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_LEFT_INTEL 0x3
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_RIGHT_INTEL 0x4
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_PLANE_INTEL 0x4
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_RIGHT_INTEL 0x5
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_DOWN_INTEL 0x6
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_LEFT_INTEL 0x7
#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_UP_INTEL 0x8
#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_DC_INTEL 0x0
#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_HORIZONTAL_INTEL 0x1
#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_VERTICAL_INTEL 0x2
#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_PLANE_INTEL 0x3
#define CL_AVC_ME_FRAME_FORWARD_INTEL 0x1
#define CL_AVC_ME_FRAME_BACKWARD_INTEL 0x2
#define CL_AVC_ME_FRAME_DUAL_INTEL 0x3
#define CL_AVC_ME_SLICE_TYPE_PRED_INTEL 0x0
#define CL_AVC_ME_SLICE_TYPE_BPRED_INTEL 0x1
#define CL_AVC_ME_SLICE_TYPE_INTRA_INTEL 0x2
#define CL_AVC_ME_INTERLACED_SCAN_TOP_FIELD_INTEL 0x0
#define CL_AVC_ME_INTERLACED_SCAN_BOTTOM_FIELD_INTEL 0x1
#ifdef __cplusplus
}
#endif
#endif /* __CL_EXT_INTEL_H */

View File

@@ -1,5 +1,5 @@
/**********************************************************************************
* Copyright (c) 2008 - 2012 The Khronos Group Inc.
* Copyright (c) 2008-2019 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
@@ -12,6 +12,11 @@
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -24,11 +29,7 @@
#ifndef __OPENCL_CL_GL_H
#define __OPENCL_CL_GL_H
#ifdef __APPLE__
#include <OpenCL/cl.h>
#else
#include <CL/cl.h>
#endif
#ifdef __cplusplus
extern "C" {
@@ -44,110 +45,118 @@ typedef struct __GLsync *cl_GLsync;
#define CL_GL_OBJECT_TEXTURE2D 0x2001
#define CL_GL_OBJECT_TEXTURE3D 0x2002
#define CL_GL_OBJECT_RENDERBUFFER 0x2003
#ifdef CL_VERSION_1_2
#define CL_GL_OBJECT_TEXTURE2D_ARRAY 0x200E
#define CL_GL_OBJECT_TEXTURE1D 0x200F
#define CL_GL_OBJECT_TEXTURE1D_ARRAY 0x2010
#define CL_GL_OBJECT_TEXTURE_BUFFER 0x2011
#endif
/* cl_gl_texture_info */
#define CL_GL_TEXTURE_TARGET 0x2004
#define CL_GL_MIPMAP_LEVEL 0x2005
#ifdef CL_VERSION_1_2
#define CL_GL_NUM_SAMPLES 0x2012
#endif
extern CL_API_ENTRY cl_mem CL_API_CALL
clCreateFromGLBuffer(cl_context /* context */,
cl_mem_flags /* flags */,
cl_GLuint /* bufobj */,
int * /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;
clCreateFromGLBuffer(cl_context context,
cl_mem_flags flags,
cl_GLuint bufobj,
cl_int * errcode_ret) CL_API_SUFFIX__VERSION_1_0;
#ifdef CL_VERSION_1_2
extern CL_API_ENTRY cl_mem CL_API_CALL
clCreateFromGLTexture(cl_context /* context */,
cl_mem_flags /* flags */,
cl_GLenum /* target */,
cl_GLint /* miplevel */,
cl_GLuint /* texture */,
cl_int * /* errcode_ret */) CL_API_SUFFIX__VERSION_1_2;
clCreateFromGLTexture(cl_context context,
cl_mem_flags flags,
cl_GLenum target,
cl_GLint miplevel,
cl_GLuint texture,
cl_int * errcode_ret) CL_API_SUFFIX__VERSION_1_2;
#endif
extern CL_API_ENTRY cl_mem CL_API_CALL
clCreateFromGLRenderbuffer(cl_context /* context */,
cl_mem_flags /* flags */,
cl_GLuint /* renderbuffer */,
cl_int * /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;
clCreateFromGLRenderbuffer(cl_context context,
cl_mem_flags flags,
cl_GLuint renderbuffer,
cl_int * errcode_ret) CL_API_SUFFIX__VERSION_1_0;
extern CL_API_ENTRY cl_int CL_API_CALL
clGetGLObjectInfo(cl_mem /* memobj */,
cl_gl_object_type * /* gl_object_type */,
cl_GLuint * /* gl_object_name */) CL_API_SUFFIX__VERSION_1_0;
extern CL_API_ENTRY cl_int CL_API_CALL
clGetGLTextureInfo(cl_mem /* memobj */,
cl_gl_texture_info /* param_name */,
size_t /* param_value_size */,
void * /* param_value */,
size_t * /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_0;
clGetGLObjectInfo(cl_mem memobj,
cl_gl_object_type * gl_object_type,
cl_GLuint * gl_object_name) CL_API_SUFFIX__VERSION_1_0;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueAcquireGLObjects(cl_command_queue /* command_queue */,
cl_uint /* num_objects */,
const cl_mem * /* mem_objects */,
cl_uint /* num_events_in_wait_list */,
const cl_event * /* event_wait_list */,
cl_event * /* event */) CL_API_SUFFIX__VERSION_1_0;
clGetGLTextureInfo(cl_mem memobj,
cl_gl_texture_info param_name,
size_t param_value_size,
void * param_value,
size_t * param_value_size_ret) CL_API_SUFFIX__VERSION_1_0;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueReleaseGLObjects(cl_command_queue /* command_queue */,
cl_uint /* num_objects */,
const cl_mem * /* mem_objects */,
cl_uint /* num_events_in_wait_list */,
const cl_event * /* event_wait_list */,
cl_event * /* event */) CL_API_SUFFIX__VERSION_1_0;
clEnqueueAcquireGLObjects(cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_API_SUFFIX__VERSION_1_0;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueReleaseGLObjects(cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem * mem_objects,
cl_uint num_events_in_wait_list,
const cl_event * event_wait_list,
cl_event * event) CL_API_SUFFIX__VERSION_1_0;
/* Deprecated OpenCL 1.1 APIs */
extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_mem CL_API_CALL
clCreateFromGLTexture2D(cl_context /* context */,
cl_mem_flags /* flags */,
cl_GLenum /* target */,
cl_GLint /* miplevel */,
cl_GLuint /* texture */,
cl_int * /* errcode_ret */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;
clCreateFromGLTexture2D(cl_context context,
cl_mem_flags flags,
cl_GLenum target,
cl_GLint miplevel,
cl_GLuint texture,
cl_int * errcode_ret) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;
extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_mem CL_API_CALL
clCreateFromGLTexture3D(cl_context /* context */,
cl_mem_flags /* flags */,
cl_GLenum /* target */,
cl_GLint /* miplevel */,
cl_GLuint /* texture */,
cl_int * /* errcode_ret */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;
clCreateFromGLTexture3D(cl_context context,
cl_mem_flags flags,
cl_GLenum target,
cl_GLint miplevel,
cl_GLuint texture,
cl_int * errcode_ret) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;
/* cl_khr_gl_sharing extension */
#define cl_khr_gl_sharing 1
typedef cl_uint cl_gl_context_info;
/* Additional Error Codes */
#define CL_INVALID_GL_SHAREGROUP_REFERENCE_KHR -1000
/* cl_gl_context_info */
#define CL_CURRENT_DEVICE_FOR_GL_CONTEXT_KHR 0x2006
#define CL_DEVICES_FOR_GL_CONTEXT_KHR 0x2007
/* Additional cl_context_properties */
#define CL_GL_CONTEXT_KHR 0x2008
#define CL_EGL_DISPLAY_KHR 0x2009
#define CL_GLX_DISPLAY_KHR 0x200A
#define CL_WGL_HDC_KHR 0x200B
#define CL_CGL_SHAREGROUP_KHR 0x200C
extern CL_API_ENTRY cl_int CL_API_CALL
clGetGLContextInfoKHR(const cl_context_properties * /* properties */,
cl_gl_context_info /* param_name */,
size_t /* param_value_size */,
void * /* param_value */,
size_t * /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_0;
clGetGLContextInfoKHR(const cl_context_properties * properties,
cl_gl_context_info param_name,
size_t param_value_size,
void * param_value,
size_t * param_value_size_ret) CL_API_SUFFIX__VERSION_1_0;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetGLContextInfoKHR_fn)(
const cl_context_properties * properties,
cl_gl_context_info param_name,

View File

@@ -1,5 +1,5 @@
/**********************************************************************************
* Copyright (c) 2008-2012 The Khronos Group Inc.
* Copyright (c) 2008-2019 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
@@ -12,6 +12,11 @@
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -21,11 +26,6 @@
* MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
**********************************************************************************/
/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */
/* cl_gl_ext.h contains vendor (non-KHR) OpenCL extensions which have */
/* OpenGL dependencies. */
#ifndef __OPENCL_CL_GL_EXT_H
#define __OPENCL_CL_GL_EXT_H
@@ -33,34 +33,17 @@
extern "C" {
#endif
#ifdef __APPLE__
#include <OpenCL/cl_gl.h>
#else
#include <CL/cl_gl.h>
#endif
/*
* For each extension, follow this template
* cl_VEN_extname extension */
/* #define cl_VEN_extname 1
* ... define new types, if any
* ... define new tokens, if any
* ... define new APIs, if any
*
* If you need GLtypes here, mirror them with a cl_GLtype, rather than including a GL header
* This allows us to avoid having to decide whether to include GL headers or GLES here.
*/
#include <CL/cl_gl.h>
/*
* cl_khr_gl_event extension
* See section 9.9 in the OpenCL 1.1 spec for more information
* cl_khr_gl_event extension
*/
#define CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR 0x200D
extern CL_API_ENTRY cl_event CL_API_CALL
clCreateEventFromGLsyncKHR(cl_context /* context */,
cl_GLsync /* cl_GLsync */,
cl_int * /* errcode_ret */) CL_EXT_SUFFIX__VERSION_1_1;
clCreateEventFromGLsyncKHR(cl_context context,
cl_GLsync cl_GLsync,
cl_int * errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;
#ifdef __cplusplus
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,172 @@
/**********************************************************************************
* Copyright (c) 2008-2019 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
* "Materials"), to deal in the Materials without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Materials, and to
* permit persons to whom the Materials are furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
* CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
**********************************************************************************/
/*****************************************************************************\
Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.
THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE
MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
File Name: cl_va_api_media_sharing_intel.h
Abstract:
Notes:
\*****************************************************************************/
#ifndef __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H
#define __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H
#include <CL/cl.h>
#include <CL/cl_platform.h>
#include <va/va.h>
#ifdef __cplusplus
extern "C" {
#endif
/******************************************
* cl_intel_va_api_media_sharing extension *
*******************************************/
#define cl_intel_va_api_media_sharing 1
/* error codes */
#define CL_INVALID_VA_API_MEDIA_ADAPTER_INTEL -1098
#define CL_INVALID_VA_API_MEDIA_SURFACE_INTEL -1099
#define CL_VA_API_MEDIA_SURFACE_ALREADY_ACQUIRED_INTEL -1100
#define CL_VA_API_MEDIA_SURFACE_NOT_ACQUIRED_INTEL -1101
/* cl_va_api_device_source_intel */
#define CL_VA_API_DISPLAY_INTEL 0x4094
/* cl_va_api_device_set_intel */
#define CL_PREFERRED_DEVICES_FOR_VA_API_INTEL 0x4095
#define CL_ALL_DEVICES_FOR_VA_API_INTEL 0x4096
/* cl_context_info */
#define CL_CONTEXT_VA_API_DISPLAY_INTEL 0x4097
/* cl_mem_info */
#define CL_MEM_VA_API_MEDIA_SURFACE_INTEL 0x4098
/* cl_image_info */
#define CL_IMAGE_VA_API_PLANE_INTEL 0x4099
/* cl_command_type */
#define CL_COMMAND_ACQUIRE_VA_API_MEDIA_SURFACES_INTEL 0x409A
#define CL_COMMAND_RELEASE_VA_API_MEDIA_SURFACES_INTEL 0x409B
typedef cl_uint cl_va_api_device_source_intel;
typedef cl_uint cl_va_api_device_set_intel;
extern CL_API_ENTRY cl_int CL_API_CALL
clGetDeviceIDsFromVA_APIMediaAdapterINTEL(
cl_platform_id platform,
cl_va_api_device_source_intel media_adapter_type,
void* media_adapter,
cl_va_api_device_set_intel media_adapter_set,
cl_uint num_entries,
cl_device_id* devices,
cl_uint* num_devices) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_int (CL_API_CALL * clGetDeviceIDsFromVA_APIMediaAdapterINTEL_fn)(
cl_platform_id platform,
cl_va_api_device_source_intel media_adapter_type,
void* media_adapter,
cl_va_api_device_set_intel media_adapter_set,
cl_uint num_entries,
cl_device_id* devices,
cl_uint* num_devices) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_mem CL_API_CALL
clCreateFromVA_APIMediaSurfaceINTEL(
cl_context context,
cl_mem_flags flags,
VASurfaceID* surface,
cl_uint plane,
cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_mem (CL_API_CALL * clCreateFromVA_APIMediaSurfaceINTEL_fn)(
cl_context context,
cl_mem_flags flags,
VASurfaceID* surface,
cl_uint plane,
cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueAcquireVA_APIMediaSurfacesINTEL(
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem* mem_objects,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireVA_APIMediaSurfacesINTEL_fn)(
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem* mem_objects,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event) CL_EXT_SUFFIX__VERSION_1_2;
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueReleaseVA_APIMediaSurfacesINTEL(
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem* mem_objects,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event) CL_EXT_SUFFIX__VERSION_1_2;
typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseVA_APIMediaSurfacesINTEL_fn)(
cl_command_queue command_queue,
cl_uint num_objects,
const cl_mem* mem_objects,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event) CL_EXT_SUFFIX__VERSION_1_2;
#ifdef __cplusplus
}
#endif
#endif /* __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H */

86
include/CL/cl_version.h Normal file
View File

@@ -0,0 +1,86 @@
/*******************************************************************************
* Copyright (c) 2018 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
* "Materials"), to deal in the Materials without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Materials, and to
* permit persons to whom the Materials are furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
* CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
******************************************************************************/
#ifndef __CL_VERSION_H
#define __CL_VERSION_H
/* Detect which version to target */
#if !defined(CL_TARGET_OPENCL_VERSION)
#pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)")
#define CL_TARGET_OPENCL_VERSION 220
#endif
#if CL_TARGET_OPENCL_VERSION != 100 && \
CL_TARGET_OPENCL_VERSION != 110 && \
CL_TARGET_OPENCL_VERSION != 120 && \
CL_TARGET_OPENCL_VERSION != 200 && \
CL_TARGET_OPENCL_VERSION != 210 && \
CL_TARGET_OPENCL_VERSION != 220
#pragma message("cl_version: CL_TARGET_OPENCL_VERSION is not a valid value (100, 110, 120, 200, 210, 220). Defaulting to 220 (OpenCL 2.2)")
#undef CL_TARGET_OPENCL_VERSION
#define CL_TARGET_OPENCL_VERSION 220
#endif
/* OpenCL Version */
#if CL_TARGET_OPENCL_VERSION >= 220 && !defined(CL_VERSION_2_2)
#define CL_VERSION_2_2 1
#endif
#if CL_TARGET_OPENCL_VERSION >= 210 && !defined(CL_VERSION_2_1)
#define CL_VERSION_2_1 1
#endif
#if CL_TARGET_OPENCL_VERSION >= 200 && !defined(CL_VERSION_2_0)
#define CL_VERSION_2_0 1
#endif
#if CL_TARGET_OPENCL_VERSION >= 120 && !defined(CL_VERSION_1_2)
#define CL_VERSION_1_2 1
#endif
#if CL_TARGET_OPENCL_VERSION >= 110 && !defined(CL_VERSION_1_1)
#define CL_VERSION_1_1 1
#endif
#if CL_TARGET_OPENCL_VERSION >= 100 && !defined(CL_VERSION_1_0)
#define CL_VERSION_1_0 1
#endif
/* Allow deprecated APIs for older OpenCL versions. */
#if CL_TARGET_OPENCL_VERSION <= 210 && !defined(CL_USE_DEPRECATED_OPENCL_2_1_APIS)
#define CL_USE_DEPRECATED_OPENCL_2_1_APIS
#endif
#if CL_TARGET_OPENCL_VERSION <= 200 && !defined(CL_USE_DEPRECATED_OPENCL_2_0_APIS)
#define CL_USE_DEPRECATED_OPENCL_2_0_APIS
#endif
#if CL_TARGET_OPENCL_VERSION <= 120 && !defined(CL_USE_DEPRECATED_OPENCL_1_2_APIS)
#define CL_USE_DEPRECATED_OPENCL_1_2_APIS
#endif
#if CL_TARGET_OPENCL_VERSION <= 110 && !defined(CL_USE_DEPRECATED_OPENCL_1_1_APIS)
#define CL_USE_DEPRECATED_OPENCL_1_1_APIS
#endif
#if CL_TARGET_OPENCL_VERSION <= 100 && !defined(CL_USE_DEPRECATED_OPENCL_1_0_APIS)
#define CL_USE_DEPRECATED_OPENCL_1_0_APIS
#endif
#endif /* __CL_VERSION_H */

View File

@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright (c) 2008-2012 The Khronos Group Inc.
* Copyright (c) 2008-2015 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and/or associated documentation files (the
@@ -12,6 +12,11 @@
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
* KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
* SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
* https://www.khronos.org/registry/
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -30,25 +35,13 @@
extern "C" {
#endif
#ifdef __APPLE__
#include <OpenCL/cl.h>
#include <OpenCL/cl_gl.h>
#include <OpenCL/cl_gl_ext.h>
#include <OpenCL/cl_ext.h>
#else
#include <CL/cl.h>
#include <CL/cl_gl.h>
#include <CL/cl_gl_ext.h>
#include <CL/cl_ext.h>
#endif
#ifdef __cplusplus
}
#endif
#endif /* __OPENCL_H */

View File

@@ -28,17 +28,17 @@ extern "C" {
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
/*
** This header is generated from the Khronos OpenGL / OpenGL ES XML
** API Registry. The current version of the Registry, generator scripts
** This header is generated from the Khronos EGL XML API Registry.
** The current version of the Registry, generator scripts
** used to make the header, and the header can be found at
** http://www.khronos.org/registry/egl
**
** Khronos $Git commit SHA1: a732b061e7 $ on $Git commit date: 2017-06-17 23:27:53 +0100 $
** Khronos $Git commit SHA1: 9ed2ec4c67 $ on $Git commit date: 2019-01-09 17:54:35 -0800 $
*/
#include <EGL/eglplatform.h>
/* Generated on date 20170627 */
/* Generated on date 20190124 */
/* Generated C header for:
* API: egl

View File

@@ -28,17 +28,17 @@ extern "C" {
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
/*
** This header is generated from the Khronos OpenGL / OpenGL ES XML
** API Registry. The current version of the Registry, generator scripts
** This header is generated from the Khronos EGL XML API Registry.
** The current version of the Registry, generator scripts
** used to make the header, and the header can be found at
** http://www.khronos.org/registry/egl
**
** Khronos $Git commit SHA1: bae3518c48 $ on $Git commit date: 2018-05-17 10:56:57 -0700 $
** Khronos $Git commit SHA1: 9ed2ec4c67 $ on $Git commit date: 2019-01-09 17:54:35 -0800 $
*/
#include <EGL/eglplatform.h>
#define EGL_EGLEXT_VERSION 20180517
#define EGL_EGLEXT_VERSION 20190124
/* Generated C header for:
* API: egl
@@ -681,6 +681,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribEXT (EGLDisplay dpy, EGLint a
#ifndef EGL_EXT_device_drm
#define EGL_EXT_device_drm 1
#define EGL_DRM_DEVICE_FILE_EXT 0x3233
#define EGL_DRM_MASTER_FD_EXT 0x333C
#endif /* EGL_EXT_device_drm */
#ifndef EGL_EXT_device_enumeration
@@ -716,6 +717,11 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribEXT (EGLDisplay dpy, EGLint a
#define EGL_GL_COLORSPACE_DISPLAY_P3_LINEAR_EXT 0x3362
#endif /* EGL_EXT_gl_colorspace_display_p3_linear */
#ifndef EGL_EXT_gl_colorspace_display_p3_passthrough
#define EGL_EXT_gl_colorspace_display_p3_passthrough 1
#define EGL_GL_COLORSPACE_DISPLAY_P3_PASSTHROUGH_EXT 0x3490
#endif /* EGL_EXT_gl_colorspace_display_p3_passthrough */
#ifndef EGL_EXT_gl_colorspace_scrgb
#define EGL_EXT_gl_colorspace_scrgb 1
#define EGL_GL_COLORSPACE_SCRGB_EXT 0x3351
@@ -1025,6 +1031,16 @@ EGLAPI EGLBoolean EGLAPIENTRY eglExportDMABUFImageMESA (EGLDisplay dpy, EGLImage
#define EGL_PLATFORM_SURFACELESS_MESA 0x31DD
#endif /* EGL_MESA_platform_surfaceless */
#ifndef EGL_MESA_query_driver
#define EGL_MESA_query_driver 1
typedef char *(EGLAPIENTRYP PFNEGLGETDISPLAYDRIVERCONFIGPROC) (EGLDisplay dpy);
typedef const char *(EGLAPIENTRYP PFNEGLGETDISPLAYDRIVERNAMEPROC) (EGLDisplay dpy);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI char *EGLAPIENTRY eglGetDisplayDriverConfig (EGLDisplay dpy);
EGLAPI const char *EGLAPIENTRY eglGetDisplayDriverName (EGLDisplay dpy);
#endif
#endif /* EGL_MESA_query_driver */
#ifndef EGL_NOK_swap_region
#define EGL_NOK_swap_region 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSREGIONNOKPROC) (EGLDisplay dpy, EGLSurface surface, EGLint numRects, const EGLint *rects);

View File

@@ -48,6 +48,8 @@ typedef unsigned int drm_drawable_t;
typedef struct drm_clip_rect drm_clip_rect_t;
#endif
#include <GL/gl.h>
#include <stdint.h>
/**
@@ -589,7 +591,7 @@ struct __DRIdamageExtensionRec {
* SWRast Loader extension.
*/
#define __DRI_SWRAST_LOADER "DRI_SWRastLoader"
#define __DRI_SWRAST_LOADER_VERSION 4
#define __DRI_SWRAST_LOADER_VERSION 5
struct __DRIswrastLoaderExtensionRec {
__DRIextension base;
@@ -649,6 +651,23 @@ struct __DRIswrastLoaderExtensionRec {
void (*getImageShm)(__DRIdrawable *readable,
int x, int y, int width, int height,
int shmid, void *loaderPrivate);
/**
* Put shm image to drawable (v2)
*
* The original version fixes srcx/y to 0, and expected
* the offset to be adjusted. This version allows src x,y
* to not be included in the offset. This is needed to
* avoid certain overflow checks in the X server, that
* result in lost rendering.
*
* \since 5
*/
void (*putImageShm2)(__DRIdrawable *drawable, int op,
int x, int y,
int width, int height, int stride,
int shmid, char *shmaddr, unsigned offset,
void *loaderPrivate);
};
/**
@@ -1327,6 +1346,8 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FOURCC_NV16 0x3631564e
#define __DRI_IMAGE_FOURCC_YUYV 0x56595559
#define __DRI_IMAGE_FOURCC_UYVY 0x59565955
#define __DRI_IMAGE_FOURCC_AYUV 0x56555941
#define __DRI_IMAGE_FOURCC_XYUV8888 0x56555958
#define __DRI_IMAGE_FOURCC_YVU410 0x39555659
#define __DRI_IMAGE_FOURCC_YVU411 0x31315659
@@ -1334,6 +1355,10 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FOURCC_YVU422 0x36315659
#define __DRI_IMAGE_FOURCC_YVU444 0x34325659
#define __DRI_IMAGE_FOURCC_P010 0x30313050
#define __DRI_IMAGE_FOURCC_P012 0x32313050
#define __DRI_IMAGE_FOURCC_P016 0x36313050
/**
* Queryable on images created by createImageFromNames.
*
@@ -1353,6 +1378,8 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_COMPONENTS_Y_UV 0x3004
#define __DRI_IMAGE_COMPONENTS_Y_XUXV 0x3005
#define __DRI_IMAGE_COMPONENTS_Y_UXVX 0x3008
#define __DRI_IMAGE_COMPONENTS_AYUV 0x3009
#define __DRI_IMAGE_COMPONENTS_XYUV 0x300A
#define __DRI_IMAGE_COMPONENTS_R 0x3006
#define __DRI_IMAGE_COMPONENTS_RG 0x3007

View File

@@ -1,6 +1,6 @@
This directory contains a copy of the installed kernel headers
required by the anv & i965 drivers to communicate with the kernel.
Whenever either of those driver needs new definitions for new kernel
required by several drivers to communicate with the kernel.
Whenever one of those driver needs new definitions for new kernel
APIs, these files should be updated.
These files in master should only be updated once the changes have landed
@@ -13,9 +13,9 @@ $ make headers_install INSTALL_HDR_PATH=/path/to/install
The last update was done at the following kernel commit :
commit 78230c46ec0a91dd4256c9e54934b3c7095a7ee3
Merge: b65bd4031156 037f03155b7d
commit a5f2fafece141ef3509e686cea576366d55cabb6
Merge: 71f4e45a4ed3 860433ed2a55
Author: Dave Airlie <airlied@redhat.com>
Date: Wed Mar 21 14:07:03 2018 +1000
Date: Wed Feb 20 12:16:30 2019 +1000
Merge tag 'omapdrm-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into drm-next
Merge https://gitlab.freedesktop.org/drm/msm into drm-next

View File

@@ -674,6 +674,22 @@ struct drm_get_cap {
*/
#define DRM_CLIENT_CAP_ATOMIC 3
/**
* DRM_CLIENT_CAP_ASPECT_RATIO
*
* If set to 1, the DRM core will provide aspect ratio information in modes.
*/
#define DRM_CLIENT_CAP_ASPECT_RATIO 4
/**
* DRM_CLIENT_CAP_WRITEBACK_CONNECTORS
*
* If set to 1, the DRM core will expose special connectors to be used for
* writing back to memory the scene setup in the commit. Depends on client
* also supporting DRM_CLIENT_CAP_ATOMIC
*/
#define DRM_CLIENT_CAP_WRITEBACK_CONNECTORS 5
/** DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */
struct drm_set_client_cap {
__u64 capability;

View File

@@ -30,11 +30,50 @@
extern "C" {
#endif
/**
* DOC: overview
*
* In the DRM subsystem, framebuffer pixel formats are described using the
* fourcc codes defined in `include/uapi/drm/drm_fourcc.h`. In addition to the
* fourcc code, a Format Modifier may optionally be provided, in order to
* further describe the buffer's format - for example tiling or compression.
*
* Format Modifiers
* ----------------
*
* Format modifiers are used in conjunction with a fourcc code, forming a
* unique fourcc:modifier pair. This format:modifier pair must fully define the
* format and data layout of the buffer, and should be the only way to describe
* that particular buffer.
*
* Having multiple fourcc:modifier pairs which describe the same layout should
* be avoided, as such aliases run the risk of different drivers exposing
* different names for the same data format, forcing userspace to understand
* that they are aliases.
*
* Format modifiers may change any property of the buffer, including the number
* of planes and/or the required allocation size. Format modifiers are
* vendor-namespaced, and as such the relationship between a fourcc code and a
* modifier is specific to the modifer being used. For example, some modifiers
* may preserve meaning - such as number of planes - from the fourcc code,
* whereas others may not.
*
* Vendors should document their modifier usage in as much detail as
* possible, to ensure maximum compatibility across devices, drivers and
* applications.
*
* The authoritative list of format modifier codes is found in
* `include/uapi/drm/drm_fourcc.h`
*/
#define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \
((__u32)(c) << 16) | ((__u32)(d) << 24))
#define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of little endian */
/* Reserve 0 for the invalid format specifier */
#define DRM_FORMAT_INVALID 0
/* color index */
#define DRM_FORMAT_C8 fourcc_code('C', '8', ' ', ' ') /* [7:0] C */
@@ -112,6 +151,21 @@ extern "C" {
#define DRM_FORMAT_VYUY fourcc_code('V', 'Y', 'U', 'Y') /* [31:0] Y1:Cb0:Y0:Cr0 8:8:8:8 little endian */
#define DRM_FORMAT_AYUV fourcc_code('A', 'Y', 'U', 'V') /* [31:0] A:Y:Cb:Cr 8:8:8:8 little endian */
#define DRM_FORMAT_XYUV8888 fourcc_code('X', 'Y', 'U', 'V') /* [31:0] X:Y:Cb:Cr 8:8:8:8 little endian */
/*
* packed YCbCr420 2x2 tiled formats
* first 64 bits will contain Y,Cb,Cr components for a 2x2 tile
*/
/* [63:0] A3:A2:Y3:0:Cr0:0:Y2:0:A1:A0:Y1:0:Cb0:0:Y0:0 1:1:8:2:8:2:8:2:1:1:8:2:8:2:8:2 little endian */
#define DRM_FORMAT_Y0L0 fourcc_code('Y', '0', 'L', '0')
/* [63:0] X3:X2:Y3:0:Cr0:0:Y2:0:X1:X0:Y1:0:Cb0:0:Y0:0 1:1:8:2:8:2:8:2:1:1:8:2:8:2:8:2 little endian */
#define DRM_FORMAT_X0L0 fourcc_code('X', '0', 'L', '0')
/* [63:0] A3:A2:Y3:Cr0:Y2:A1:A0:Y1:Cb0:Y0 1:1:10:10:10:1:1:10:10:10 little endian */
#define DRM_FORMAT_Y0L2 fourcc_code('Y', '0', 'L', '2')
/* [63:0] X3:X2:Y3:Cr0:Y2:X1:X0:Y1:Cb0:Y0 1:1:10:10:10:1:1:10:10:10 little endian */
#define DRM_FORMAT_X0L2 fourcc_code('X', '0', 'L', '2')
/*
* 2 plane RGB + A
@@ -141,6 +195,27 @@ extern "C" {
#define DRM_FORMAT_NV24 fourcc_code('N', 'V', '2', '4') /* non-subsampled Cr:Cb plane */
#define DRM_FORMAT_NV42 fourcc_code('N', 'V', '4', '2') /* non-subsampled Cb:Cr plane */
/*
* 2 plane YCbCr MSB aligned
* index 0 = Y plane, [15:0] Y:x [10:6] little endian
* index 1 = Cr:Cb plane, [31:0] Cr:x:Cb:x [10:6:10:6] little endian
*/
#define DRM_FORMAT_P010 fourcc_code('P', '0', '1', '0') /* 2x2 subsampled Cr:Cb plane 10 bits per channel */
/*
* 2 plane YCbCr MSB aligned
* index 0 = Y plane, [15:0] Y:x [12:4] little endian
* index 1 = Cr:Cb plane, [31:0] Cr:x:Cb:x [12:4:12:4] little endian
*/
#define DRM_FORMAT_P012 fourcc_code('P', '0', '1', '2') /* 2x2 subsampled Cr:Cb plane 12 bits per channel */
/*
* 2 plane YCbCr MSB aligned
* index 0 = Y plane, [15:0] Y little endian
* index 1 = Cr:Cb plane, [31:0] Cr:Cb [16:16] little endian
*/
#define DRM_FORMAT_P016 fourcc_code('P', '0', '1', '6') /* 2x2 subsampled Cr:Cb plane 16 bits per channel */
/*
* 3 plane YCbCr
* index 0: Y plane, [7:0] Y
@@ -183,6 +258,9 @@ extern "C" {
#define DRM_FORMAT_MOD_VENDOR_QCOM 0x05
#define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06
#define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07
#define DRM_FORMAT_MOD_VENDOR_ARM 0x08
#define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09
/* add more to the end as needed */
#define DRM_FORMAT_RESERVED ((1ULL << 56) - 1)
@@ -298,6 +376,28 @@ extern "C" {
*/
#define DRM_FORMAT_MOD_SAMSUNG_64_32_TILE fourcc_mod_code(SAMSUNG, 1)
/*
* Tiled, 16 (pixels) x 16 (lines) - sized macroblocks
*
* This is a simple tiled layout using tiles of 16x16 pixels in a row-major
* layout. For YCbCr formats Cb/Cr components are taken in such a way that
* they correspond to their 16x16 luma block.
*/
#define DRM_FORMAT_MOD_SAMSUNG_16_16_TILE fourcc_mod_code(SAMSUNG, 2)
/*
* Qualcomm Compressed Format
*
* Refers to a compressed variant of the base format that is compressed.
* Implementation may be platform and base-format specific.
*
* Each macrotile consists of m x n (mostly 4 x 4) tiles.
* Pixel data pitch/stride is aligned with macrotile width.
* Pixel data height is aligned with macrotile height.
* Entire pixel data buffer is aligned with 4k(bytes).
*/
#define DRM_FORMAT_MOD_QCOM_COMPRESSED fourcc_mod_code(QCOM, 1)
/* Vivante framebuffer modifiers */
/*
@@ -485,6 +585,128 @@ extern "C" {
*/
#define DRM_FORMAT_MOD_BROADCOM_UIF fourcc_mod_code(BROADCOM, 6)
/*
* Arm Framebuffer Compression (AFBC) modifiers
*
* AFBC is a proprietary lossless image compression protocol and format.
* It provides fine-grained random access and minimizes the amount of data
* transferred between IP blocks.
*
* AFBC has several features which may be supported and/or used, which are
* represented using bits in the modifier. Not all combinations are valid,
* and different devices or use-cases may support different combinations.
*
* Further information on the use of AFBC modifiers can be found in
* Documentation/gpu/afbc.rst
*/
#define DRM_FORMAT_MOD_ARM_AFBC(__afbc_mode) fourcc_mod_code(ARM, __afbc_mode)
/*
* AFBC superblock size
*
* Indicates the superblock size(s) used for the AFBC buffer. The buffer
* size (in pixels) must be aligned to a multiple of the superblock size.
* Four lowest significant bits(LSBs) are reserved for block size.
*
* Where one superblock size is specified, it applies to all planes of the
* buffer (e.g. 16x16, 32x8). When multiple superblock sizes are specified,
* the first applies to the Luma plane and the second applies to the Chroma
* plane(s). e.g. (32x8_64x4 means 32x8 Luma, with 64x4 Chroma).
* Multiple superblock sizes are only valid for multi-plane YCbCr formats.
*/
#define AFBC_FORMAT_MOD_BLOCK_SIZE_MASK 0xf
#define AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 (1ULL)
#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8 (2ULL)
#define AFBC_FORMAT_MOD_BLOCK_SIZE_64x4 (3ULL)
#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8_64x4 (4ULL)
/*
* AFBC lossless colorspace transform
*
* Indicates that the buffer makes use of the AFBC lossless colorspace
* transform.
*/
#define AFBC_FORMAT_MOD_YTR (1ULL << 4)
/*
* AFBC block-split
*
* Indicates that the payload of each superblock is split. The second
* half of the payload is positioned at a predefined offset from the start
* of the superblock payload.
*/
#define AFBC_FORMAT_MOD_SPLIT (1ULL << 5)
/*
* AFBC sparse layout
*
* This flag indicates that the payload of each superblock must be stored at a
* predefined position relative to the other superblocks in the same AFBC
* buffer. This order is the same order used by the header buffer. In this mode
* each superblock is given the same amount of space as an uncompressed
* superblock of the particular format would require, rounding up to the next
* multiple of 128 bytes in size.
*/
#define AFBC_FORMAT_MOD_SPARSE (1ULL << 6)
/*
* AFBC copy-block restrict
*
* Buffers with this flag must obey the copy-block restriction. The restriction
* is such that there are no copy-blocks referring across the border of 8x8
* blocks. For the subsampled data the 8x8 limitation is also subsampled.
*/
#define AFBC_FORMAT_MOD_CBR (1ULL << 7)
/*
* AFBC tiled layout
*
* The tiled layout groups superblocks in 8x8 or 4x4 tiles, where all
* superblocks inside a tile are stored together in memory. 8x8 tiles are used
* for pixel formats up to and including 32 bpp while 4x4 tiles are used for
* larger bpp formats. The order between the tiles is scan line.
* When the tiled layout is used, the buffer size (in pixels) must be aligned
* to the tile size.
*/
#define AFBC_FORMAT_MOD_TILED (1ULL << 8)
/*
* AFBC solid color blocks
*
* Indicates that the buffer makes use of solid-color blocks, whereby bandwidth
* can be reduced if a whole superblock is a single color.
*/
#define AFBC_FORMAT_MOD_SC (1ULL << 9)
/*
* AFBC double-buffer
*
* Indicates that the buffer is allocated in a layout safe for front-buffer
* rendering.
*/
#define AFBC_FORMAT_MOD_DB (1ULL << 10)
/*
* AFBC buffer content hints
*
* Indicates that the buffer includes per-superblock content hints.
*/
#define AFBC_FORMAT_MOD_BCH (1ULL << 11)
/*
* Allwinner tiled modifier
*
* This tiling mode is implemented by the VPU found on all Allwinner platforms,
* codenamed sunxi. It is associated with a YUV format that uses either 2 or 3
* planes.
*
* With this tiling, the luminance samples are disposed in tiles representing
* 32x32 pixels and the chrominance samples in tiles representing 32x64 pixels.
* The pixel order in each tile is linear and the tiles are disposed linearly,
* both in row-major order.
*/
#define DRM_FORMAT_MOD_ALLWINNER_TILED fourcc_mod_code(ALLWINNER, 1)
#if defined(__cplusplus)
}
#endif

View File

@@ -93,6 +93,15 @@ extern "C" {
#define DRM_MODE_PICTURE_ASPECT_NONE 0
#define DRM_MODE_PICTURE_ASPECT_4_3 1
#define DRM_MODE_PICTURE_ASPECT_16_9 2
#define DRM_MODE_PICTURE_ASPECT_64_27 3
#define DRM_MODE_PICTURE_ASPECT_256_135 4
/* Content type options */
#define DRM_MODE_CONTENT_TYPE_NO_DATA 0
#define DRM_MODE_CONTENT_TYPE_GRAPHICS 1
#define DRM_MODE_CONTENT_TYPE_PHOTO 2
#define DRM_MODE_CONTENT_TYPE_CINEMA 3
#define DRM_MODE_CONTENT_TYPE_GAME 4
/* Aspect ratio flag bitmask (4 bits 22:19) */
#define DRM_MODE_FLAG_PIC_AR_MASK (0x0F<<19)
@@ -102,6 +111,10 @@ extern "C" {
(DRM_MODE_PICTURE_ASPECT_4_3<<19)
#define DRM_MODE_FLAG_PIC_AR_16_9 \
(DRM_MODE_PICTURE_ASPECT_16_9<<19)
#define DRM_MODE_FLAG_PIC_AR_64_27 \
(DRM_MODE_PICTURE_ASPECT_64_27<<19)
#define DRM_MODE_FLAG_PIC_AR_256_135 \
(DRM_MODE_PICTURE_ASPECT_256_135<<19)
#define DRM_MODE_FLAG_ALL (DRM_MODE_FLAG_PHSYNC | \
DRM_MODE_FLAG_NHSYNC | \
@@ -173,8 +186,9 @@ extern "C" {
/*
* DRM_MODE_REFLECT_<axis>
*
* Signals that the contents of a drm plane is reflected in the <axis> axis,
* Signals that the contents of a drm plane is reflected along the <axis> axis,
* in the same way as mirroring.
* See kerneldoc chapter "Plane Composition Properties" for more details.
*
* This define is provided as a convenience, looking up the property id
* using the name->prop id lookup is the preferred method.
@@ -338,6 +352,7 @@ enum drm_mode_subconnector {
#define DRM_MODE_CONNECTOR_VIRTUAL 15
#define DRM_MODE_CONNECTOR_DSI 16
#define DRM_MODE_CONNECTOR_DPI 17
#define DRM_MODE_CONNECTOR_WRITEBACK 18
struct drm_mode_get_connector {
@@ -873,6 +888,25 @@ struct drm_mode_revoke_lease {
__u32 lessee_id;
};
/**
* struct drm_mode_rect - Two dimensional rectangle.
* @x1: Horizontal starting coordinate (inclusive).
* @y1: Vertical starting coordinate (inclusive).
* @x2: Horizontal ending coordinate (exclusive).
* @y2: Vertical ending coordinate (exclusive).
*
* With drm subsystem using struct drm_rect to manage rectangular area this
* export it to user-space.
*
* Currently used by drm_mode_atomic blob property FB_DAMAGE_CLIPS.
*/
struct drm_mode_rect {
__s32 x1;
__s32 y1;
__s32 x2;
__s32 y2;
};
#if defined(__cplusplus)
}
#endif

View File

@@ -412,6 +412,14 @@ typedef struct drm_i915_irq_wait {
int irq_seq;
} drm_i915_irq_wait_t;
/*
* Different modes of per-process Graphics Translation Table,
* see I915_PARAM_HAS_ALIASING_PPGTT
*/
#define I915_GEM_PPGTT_NONE 0
#define I915_GEM_PPGTT_ALIASING 1
#define I915_GEM_PPGTT_FULL 2
/* Ioctl to query kernel params:
*/
#define I915_PARAM_IRQ_ACTIVE 1
@@ -529,6 +537,28 @@ typedef struct drm_i915_irq_wait {
*/
#define I915_PARAM_CS_TIMESTAMP_FREQUENCY 51
/*
* Once upon a time we supposed that writes through the GGTT would be
* immediately in physical memory (once flushed out of the CPU path). However,
* on a few different processors and chipsets, this is not necessarily the case
* as the writes appear to be buffered internally. Thus a read of the backing
* storage (physical memory) via a different path (with different physical tags
* to the indirect write via the GGTT) will see stale values from before
* the GGTT write. Inside the kernel, we can for the most part keep track of
* the different read/write domains in use (e.g. set-domain), but the assumption
* of coherency is baked into the ABI, hence reporting its true state in this
* parameter.
*
* Reports true when writes via mmap_gtt are immediately visible following an
* lfence to flush the WCB.
*
* Reports false when writes via mmap_gtt are indeterminately delayed in an in
* internal buffer and are _not_ immediately visible to third parties accessing
* directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC
* communications channel when reporting false is strongly disadvised.
*/
#define I915_PARAM_MMAP_GTT_COHERENT 52
typedef struct drm_i915_getparam {
__s32 param;
/*
@@ -1456,9 +1486,73 @@ struct drm_i915_gem_context_param {
#define I915_CONTEXT_MAX_USER_PRIORITY 1023 /* inclusive */
#define I915_CONTEXT_DEFAULT_PRIORITY 0
#define I915_CONTEXT_MIN_USER_PRIORITY -1023 /* inclusive */
/*
* When using the following param, value should be a pointer to
* drm_i915_gem_context_param_sseu.
*/
#define I915_CONTEXT_PARAM_SSEU 0x7
__u64 value;
};
/**
* Context SSEU programming
*
* It may be necessary for either functional or performance reason to configure
* a context to run with a reduced number of SSEU (where SSEU stands for Slice/
* Sub-slice/EU).
*
* This is done by configuring SSEU configuration using the below
* @struct drm_i915_gem_context_param_sseu for every supported engine which
* userspace intends to use.
*
* Not all GPUs or engines support this functionality in which case an error
* code -ENODEV will be returned.
*
* Also, flexibility of possible SSEU configuration permutations varies between
* GPU generations and software imposed limitations. Requesting such a
* combination will return an error code of -EINVAL.
*
* NOTE: When perf/OA is active the context's SSEU configuration is ignored in
* favour of a single global setting.
*/
struct drm_i915_gem_context_param_sseu {
/*
* Engine class & instance to be configured or queried.
*/
__u16 engine_class;
__u16 engine_instance;
/*
* Unused for now. Must be cleared to zero.
*/
__u32 flags;
/*
* Mask of slices to enable for the context. Valid values are a subset
* of the bitmask value returned for I915_PARAM_SLICE_MASK.
*/
__u64 slice_mask;
/*
* Mask of subslices to enable for the context. Valid values are a
* subset of the bitmask value return by I915_PARAM_SUBSLICE_MASK.
*/
__u64 subslice_mask;
/*
* Minimum/Maximum number of EUs to enable per subslice for the
* context. min_eus_per_subslice must be inferior or equal to
* max_eus_per_subslice.
*/
__u16 min_eus_per_subslice;
__u16 max_eus_per_subslice;
/*
* Unused for now. Must be cleared to zero.
*/
__u32 rsvd;
};
enum drm_i915_oa_format {
I915_OA_FORMAT_A13 = 1, /* HSW only */
I915_OA_FORMAT_A29, /* HSW only */

View File

@@ -32,143 +32,615 @@ extern "C" {
#define DRM_TEGRA_GEM_CREATE_TILED (1 << 0)
#define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1)
/**
* struct drm_tegra_gem_create - parameters for the GEM object creation IOCTL
*/
struct drm_tegra_gem_create {
/**
* @size:
*
* The size, in bytes, of the buffer object to be created.
*/
__u64 size;
/**
* @flags:
*
* A bitmask of flags that influence the creation of GEM objects:
*
* DRM_TEGRA_GEM_CREATE_TILED
* Use the 16x16 tiling format for this buffer.
*
* DRM_TEGRA_GEM_CREATE_BOTTOM_UP
* The buffer has a bottom-up layout.
*/
__u32 flags;
/**
* @handle:
*
* The handle of the created GEM object. Set by the kernel upon
* successful completion of the IOCTL.
*/
__u32 handle;
};
/**
* struct drm_tegra_gem_mmap - parameters for the GEM mmap IOCTL
*/
struct drm_tegra_gem_mmap {
/**
* @handle:
*
* Handle of the GEM object to obtain an mmap offset for.
*/
__u32 handle;
/**
* @pad:
*
* Structure padding that may be used in the future. Must be 0.
*/
__u32 pad;
/**
* @offset:
*
* The mmap offset for the given GEM object. Set by the kernel upon
* successful completion of the IOCTL.
*/
__u64 offset;
};
/**
* struct drm_tegra_syncpt_read - parameters for the read syncpoint IOCTL
*/
struct drm_tegra_syncpt_read {
/**
* @id:
*
* ID of the syncpoint to read the current value from.
*/
__u32 id;
/**
* @value:
*
* The current syncpoint value. Set by the kernel upon successful
* completion of the IOCTL.
*/
__u32 value;
};
/**
* struct drm_tegra_syncpt_incr - parameters for the increment syncpoint IOCTL
*/
struct drm_tegra_syncpt_incr {
/**
* @id:
*
* ID of the syncpoint to increment.
*/
__u32 id;
/**
* @pad:
*
* Structure padding that may be used in the future. Must be 0.
*/
__u32 pad;
};
/**
* struct drm_tegra_syncpt_wait - parameters for the wait syncpoint IOCTL
*/
struct drm_tegra_syncpt_wait {
/**
* @id:
*
* ID of the syncpoint to wait on.
*/
__u32 id;
/**
* @thresh:
*
* Threshold value for which to wait.
*/
__u32 thresh;
/**
* @timeout:
*
* Timeout, in milliseconds, to wait.
*/
__u32 timeout;
/**
* @value:
*
* The new syncpoint value after the wait. Set by the kernel upon
* successful completion of the IOCTL.
*/
__u32 value;
};
#define DRM_TEGRA_NO_TIMEOUT (0xffffffff)
/**
* struct drm_tegra_open_channel - parameters for the open channel IOCTL
*/
struct drm_tegra_open_channel {
/**
* @client:
*
* The client ID for this channel.
*/
__u32 client;
/**
* @pad:
*
* Structure padding that may be used in the future. Must be 0.
*/
__u32 pad;
/**
* @context:
*
* The application context of this channel. Set by the kernel upon
* successful completion of the IOCTL. This context needs to be passed
* to the DRM_TEGRA_CHANNEL_CLOSE or the DRM_TEGRA_SUBMIT IOCTLs.
*/
__u64 context;
};
/**
* struct drm_tegra_close_channel - parameters for the close channel IOCTL
*/
struct drm_tegra_close_channel {
/**
* @context:
*
* The application context of this channel. This is obtained from the
* DRM_TEGRA_OPEN_CHANNEL IOCTL.
*/
__u64 context;
};
/**
* struct drm_tegra_get_syncpt - parameters for the get syncpoint IOCTL
*/
struct drm_tegra_get_syncpt {
/**
* @context:
*
* The application context identifying the channel for which to obtain
* the syncpoint ID.
*/
__u64 context;
/**
* @index:
*
* Index of the client syncpoint for which to obtain the ID.
*/
__u32 index;
/**
* @id:
*
* The ID of the given syncpoint. Set by the kernel upon successful
* completion of the IOCTL.
*/
__u32 id;
};
/**
* struct drm_tegra_get_syncpt_base - parameters for the get wait base IOCTL
*/
struct drm_tegra_get_syncpt_base {
/**
* @context:
*
* The application context identifying for which channel to obtain the
* wait base.
*/
__u64 context;
/**
* @syncpt:
*
* ID of the syncpoint for which to obtain the wait base.
*/
__u32 syncpt;
/**
* @id:
*
* The ID of the wait base corresponding to the client syncpoint. Set
* by the kernel upon successful completion of the IOCTL.
*/
__u32 id;
};
/**
* struct drm_tegra_syncpt - syncpoint increment operation
*/
struct drm_tegra_syncpt {
/**
* @id:
*
* ID of the syncpoint to operate on.
*/
__u32 id;
/**
* @incrs:
*
* Number of increments to perform for the syncpoint.
*/
__u32 incrs;
};
/**
* struct drm_tegra_cmdbuf - structure describing a command buffer
*/
struct drm_tegra_cmdbuf {
/**
* @handle:
*
* Handle to a GEM object containing the command buffer.
*/
__u32 handle;
/**
* @offset:
*
* Offset, in bytes, into the GEM object identified by @handle at
* which the command buffer starts.
*/
__u32 offset;
/**
* @words:
*
* Number of 32-bit words in this command buffer.
*/
__u32 words;
/**
* @pad:
*
* Structure padding that may be used in the future. Must be 0.
*/
__u32 pad;
};
/**
* struct drm_tegra_reloc - GEM object relocation structure
*/
struct drm_tegra_reloc {
struct {
/**
* @cmdbuf.handle:
*
* Handle to the GEM object containing the command buffer for
* which to perform this GEM object relocation.
*/
__u32 handle;
/**
* @cmdbuf.offset:
*
* Offset, in bytes, into the command buffer at which to
* insert the relocated address.
*/
__u32 offset;
} cmdbuf;
struct {
/**
* @target.handle:
*
* Handle to the GEM object to be relocated.
*/
__u32 handle;
/**
* @target.offset:
*
* Offset, in bytes, into the target GEM object at which the
* relocated data starts.
*/
__u32 offset;
} target;
/**
* @shift:
*
* The number of bits by which to shift relocated addresses.
*/
__u32 shift;
/**
* @pad:
*
* Structure padding that may be used in the future. Must be 0.
*/
__u32 pad;
};
/**
* struct drm_tegra_waitchk - wait check structure
*/
struct drm_tegra_waitchk {
/**
* @handle:
*
* Handle to the GEM object containing a command stream on which to
* perform the wait check.
*/
__u32 handle;
/**
* @offset:
*
* Offset, in bytes, of the location in the command stream to perform
* the wait check on.
*/
__u32 offset;
/**
* @syncpt:
*
* ID of the syncpoint to wait check.
*/
__u32 syncpt;
/**
* @thresh:
*
* Threshold value for which to check.
*/
__u32 thresh;
};
/**
* struct drm_tegra_submit - job submission structure
*/
struct drm_tegra_submit {
/**
* @context:
*
* The application context identifying the channel to use for the
* execution of this job.
*/
__u64 context;
__u32 num_syncpts;
__u32 num_cmdbufs;
__u32 num_relocs;
__u32 num_waitchks;
__u32 waitchk_mask;
__u32 timeout;
__u64 syncpts;
__u64 cmdbufs;
__u64 relocs;
__u64 waitchks;
__u32 fence; /* Return value */
__u32 reserved[5]; /* future expansion */
/**
* @num_syncpts:
*
* The number of syncpoints operated on by this job. This defines the
* length of the array pointed to by @syncpts.
*/
__u32 num_syncpts;
/**
* @num_cmdbufs:
*
* The number of command buffers to execute as part of this job. This
* defines the length of the array pointed to by @cmdbufs.
*/
__u32 num_cmdbufs;
/**
* @num_relocs:
*
* The number of relocations to perform before executing this job.
* This defines the length of the array pointed to by @relocs.
*/
__u32 num_relocs;
/**
* @num_waitchks:
*
* The number of wait checks to perform as part of this job. This
* defines the length of the array pointed to by @waitchks.
*/
__u32 num_waitchks;
/**
* @waitchk_mask:
*
* Bitmask of valid wait checks.
*/
__u32 waitchk_mask;
/**
* @timeout:
*
* Timeout, in milliseconds, before this job is cancelled.
*/
__u32 timeout;
/**
* @syncpts:
*
* A pointer to an array of &struct drm_tegra_syncpt structures that
* specify the syncpoint operations performed as part of this job.
* The number of elements in the array must be equal to the value
* given by @num_syncpts.
*/
__u64 syncpts;
/**
* @cmdbufs:
*
* A pointer to an array of &struct drm_tegra_cmdbuf structures that
* define the command buffers to execute as part of this job. The
* number of elements in the array must be equal to the value given
* by @num_syncpts.
*/
__u64 cmdbufs;
/**
* @relocs:
*
* A pointer to an array of &struct drm_tegra_reloc structures that
* specify the relocations that need to be performed before executing
* this job. The number of elements in the array must be equal to the
* value given by @num_relocs.
*/
__u64 relocs;
/**
* @waitchks:
*
* A pointer to an array of &struct drm_tegra_waitchk structures that
* specify the wait checks to be performed while executing this job.
* The number of elements in the array must be equal to the value
* given by @num_waitchks.
*/
__u64 waitchks;
/**
* @fence:
*
* The threshold of the syncpoint associated with this job after it
* has been completed. Set by the kernel upon successful completion of
* the IOCTL. This can be used with the DRM_TEGRA_SYNCPT_WAIT IOCTL to
* wait for this job to be finished.
*/
__u32 fence;
/**
* @reserved:
*
* This field is reserved for future use. Must be 0.
*/
__u32 reserved[5];
};
#define DRM_TEGRA_GEM_TILING_MODE_PITCH 0
#define DRM_TEGRA_GEM_TILING_MODE_TILED 1
#define DRM_TEGRA_GEM_TILING_MODE_BLOCK 2
/**
* struct drm_tegra_gem_set_tiling - parameters for the set tiling IOCTL
*/
struct drm_tegra_gem_set_tiling {
/* input */
/**
* @handle:
*
* Handle to the GEM object for which to set the tiling parameters.
*/
__u32 handle;
/**
* @mode:
*
* The tiling mode to set. Must be one of:
*
* DRM_TEGRA_GEM_TILING_MODE_PITCH
* pitch linear format
*
* DRM_TEGRA_GEM_TILING_MODE_TILED
* 16x16 tiling format
*
* DRM_TEGRA_GEM_TILING_MODE_BLOCK
* 16Bx2 tiling format
*/
__u32 mode;
/**
* @value:
*
* The value to set for the tiling mode parameter.
*/
__u32 value;
/**
* @pad:
*
* Structure padding that may be used in the future. Must be 0.
*/
__u32 pad;
};
/**
* struct drm_tegra_gem_get_tiling - parameters for the get tiling IOCTL
*/
struct drm_tegra_gem_get_tiling {
/* input */
/**
* @handle:
*
* Handle to the GEM object for which to query the tiling parameters.
*/
__u32 handle;
/* output */
/**
* @mode:
*
* The tiling mode currently associated with the GEM object. Set by
* the kernel upon successful completion of the IOCTL.
*/
__u32 mode;
/**
* @value:
*
* The tiling mode parameter currently associated with the GEM object.
* Set by the kernel upon successful completion of the IOCTL.
*/
__u32 value;
/**
* @pad:
*
* Structure padding that may be used in the future. Must be 0.
*/
__u32 pad;
};
#define DRM_TEGRA_GEM_BOTTOM_UP (1 << 0)
#define DRM_TEGRA_GEM_FLAGS (DRM_TEGRA_GEM_BOTTOM_UP)
/**
* struct drm_tegra_gem_set_flags - parameters for the set flags IOCTL
*/
struct drm_tegra_gem_set_flags {
/* input */
/**
* @handle:
*
* Handle to the GEM object for which to set the flags.
*/
__u32 handle;
/* output */
/**
* @flags:
*
* The flags to set for the GEM object.
*/
__u32 flags;
};
/**
* struct drm_tegra_gem_get_flags - parameters for the get flags IOCTL
*/
struct drm_tegra_gem_get_flags {
/* input */
/**
* @handle:
*
* Handle to the GEM object for which to query the flags.
*/
__u32 handle;
/* output */
/**
* @flags:
*
* The flags currently associated with the GEM object. Set by the
* kernel upon successful completion of the IOCTL.
*/
__u32 flags;
};
@@ -193,7 +665,7 @@ struct drm_tegra_gem_get_flags {
#define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr)
#define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait)
#define DRM_IOCTL_TEGRA_OPEN_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_OPEN_CHANNEL, struct drm_tegra_open_channel)
#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel)
#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_close_channel)
#define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt)
#define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit)
#define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base)

View File

@@ -36,6 +36,7 @@ extern "C" {
#define DRM_V3D_MMAP_BO 0x03
#define DRM_V3D_GET_PARAM 0x04
#define DRM_V3D_GET_BO_OFFSET 0x05
#define DRM_V3D_SUBMIT_TFU 0x06
#define DRM_IOCTL_V3D_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)
#define DRM_IOCTL_V3D_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)
@@ -43,6 +44,7 @@ extern "C" {
#define DRM_IOCTL_V3D_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_MMAP_BO, struct drm_v3d_mmap_bo)
#define DRM_IOCTL_V3D_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
#define DRM_IOCTL_V3D_GET_BO_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)
#define DRM_IOCTL_V3D_SUBMIT_TFU DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_TFU, struct drm_v3d_submit_tfu)
/**
* struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
@@ -50,6 +52,14 @@ extern "C" {
*
* This asks the kernel to have the GPU execute an optional binner
* command list, and a render command list.
*
* The L1T, slice, L2C, L2T, and GCA caches will be flushed before
* each CL executes. The VCD cache should be flushed (if necessary)
* by the submitted CLs. The TLB writes are guaranteed to have been
* flushed by the time the render done IRQ happens, which is the
* trigger for out_sync. Any dirtying of cachelines by the job (only
* possible using TMU writes) must be flushed by the caller using the
* CL's cache flush commands.
*/
struct drm_v3d_submit_cl {
/* Pointer to the binner command list.
@@ -58,10 +68,15 @@ struct drm_v3d_submit_cl {
* coordinate shader to determine where primitives land on the screen,
* then writes out the state updates and draw calls necessary per tile
* to the tile allocation BO.
*
* This BCL will block on any previous BCL submitted on the
* same FD, but not on any RCL or BCLs submitted by other
* clients -- that is left up to the submitter to control
* using in_sync_bcl if necessary.
*/
__u32 bcl_start;
/** End address of the BCL (first byte after the BCL) */
/** End address of the BCL (first byte after the BCL) */
__u32 bcl_end;
/* Offset of the render command list.
@@ -69,10 +84,15 @@ struct drm_v3d_submit_cl {
* This is the second set of commands executed, which will either
* execute the tiles that have been set up by the BCL, or a fixed set
* of tiles (in the case of RCL-only blits).
*
* This RCL will block on this submit's BCL, and any previous
* RCL submitted on the same FD, but not on any RCL or BCLs
* submitted by other clients -- that is left up to the
* submitter to control using in_sync_rcl if necessary.
*/
__u32 rcl_start;
/** End address of the RCL (first byte after the RCL) */
/** End address of the RCL (first byte after the RCL) */
__u32 rcl_end;
/** An optional sync object to wait on before starting the BCL. */
@@ -169,6 +189,7 @@ enum drm_v3d_param {
DRM_V3D_PARAM_V3D_CORE0_IDENT0,
DRM_V3D_PARAM_V3D_CORE0_IDENT1,
DRM_V3D_PARAM_V3D_CORE0_IDENT2,
DRM_V3D_PARAM_SUPPORTS_TFU,
};
struct drm_v3d_get_param {
@@ -187,6 +208,28 @@ struct drm_v3d_get_bo_offset {
__u32 offset;
};
struct drm_v3d_submit_tfu {
__u32 icfg;
__u32 iia;
__u32 iis;
__u32 ica;
__u32 iua;
__u32 ioa;
__u32 ios;
__u32 coef[4];
/* First handle is the output BO, following are other inputs.
* 0 for unused.
*/
__u32 bo_handles[4];
/* sync object to block on before running the TFU job. Each TFU
* job will execute in the order submitted to its FD. Synchronization
* against rendering jobs requires using sync objects.
*/
__u32 in_sync;
/* Sync object to signal when the TFU job is done. */
__u32 out_sync;
};
#if defined(__cplusplus)
}
#endif

View File

@@ -18,7 +18,6 @@
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
inc_drm_uapi = include_directories('drm-uapi')
inc_vulkan = include_directories('vulkan')
inc_d3d9 = include_directories('D3D9')
inc_gl_internal = include_directories('GL/internal')
@@ -94,14 +93,19 @@ if with_gallium_opencl and not with_opencl_icd
install_headers(
'CL/cl.h',
'CL/cl.hpp',
'CL/cl2.hpp',
'CL/cl_d3d10.h',
'CL/cl_d3d11.h',
'CL/cl_dx9_media_sharing.h',
'CL/cl_dx9_media_sharing_intel.h',
'CL/cl_egl.h',
'CL/cl_ext.h',
'CL/cl_ext_intel.h',
'CL/cl_gl.h',
'CL/cl_gl_ext.h',
'CL/cl_platform.h',
'CL/cl_va_api_media_sharing_intel.h',
'CL/cl_version.h',
'CL/opencl.h',
subdir: 'CL'
)

View File

@@ -1,3 +1,4 @@
#ifndef IRIS
CHIPSET(0x29A2, i965, "Intel(R) 965G")
CHIPSET(0x2992, i965, "Intel(R) 965Q")
CHIPSET(0x2982, i965, "Intel(R) 965G")
@@ -91,6 +92,11 @@ CHIPSET(0x0F32, byt, "Intel(R) Bay Trail")
CHIPSET(0x0F33, byt, "Intel(R) Bay Trail")
CHIPSET(0x0157, byt, "Intel(R) Bay Trail")
CHIPSET(0x0155, byt, "Intel(R) Bay Trail")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
#endif
CHIPSET(0x1602, bdw_gt1, "Intel(R) Broadwell GT1")
CHIPSET(0x1606, bdw_gt1, "Intel(R) Broadwell GT1")
CHIPSET(0x160A, bdw_gt1, "Intel(R) Broadwell GT1")
@@ -109,10 +115,6 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 (Broadwell GT3e)")
CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")
CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
@@ -171,6 +173,7 @@ CHIPSET(0x3185, glk_2x6, "Intel(R) UHD Graphics 600 (Geminilake 2x6)")
CHIPSET(0x3E90, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")
CHIPSET(0x3E93, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")
CHIPSET(0x3E99, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3E9C, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3E91, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
CHIPSET(0x3E92, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
CHIPSET(0x3E96, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
@@ -203,6 +206,10 @@ CHIPSET(0x5A54, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x8A50, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A51, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A52, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A56, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
CHIPSET(0x8A57, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
CHIPSET(0x8A58, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
CHIPSET(0x8A59, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")

View File

@@ -219,6 +219,7 @@ CHIPSET(0x699F, POLARIS12)
CHIPSET(0x694C, VEGAM)
CHIPSET(0x694E, VEGAM)
CHIPSET(0x694F, VEGAM)
CHIPSET(0x6860, VEGA10)
CHIPSET(0x6861, VEGA10)
@@ -227,8 +228,14 @@ CHIPSET(0x6863, VEGA10)
CHIPSET(0x6864, VEGA10)
CHIPSET(0x6867, VEGA10)
CHIPSET(0x6868, VEGA10)
CHIPSET(0x687F, VEGA10)
CHIPSET(0x6869, VEGA10)
CHIPSET(0x686A, VEGA10)
CHIPSET(0x686B, VEGA10)
CHIPSET(0x686C, VEGA10)
CHIPSET(0x686D, VEGA10)
CHIPSET(0x686E, VEGA10)
CHIPSET(0x686F, VEGA10)
CHIPSET(0x687F, VEGA10)
CHIPSET(0x69A0, VEGA12)
CHIPSET(0x69A1, VEGA12)
@@ -240,6 +247,7 @@ CHIPSET(0x66A0, VEGA20)
CHIPSET(0x66A1, VEGA20)
CHIPSET(0x66A2, VEGA20)
CHIPSET(0x66A3, VEGA20)
CHIPSET(0x66A4, VEGA20)
CHIPSET(0x66A7, VEGA20)
CHIPSET(0x66AF, VEGA20)

View File

@@ -2,7 +2,7 @@
#define VULKAN_H_ 1
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
@@ -39,12 +39,6 @@
#endif
#ifdef VK_USE_PLATFORM_MIR_KHR
#include <mir_toolkit/client_types.h>
#include "vulkan_mir.h"
#endif
#ifdef VK_USE_PLATFORM_VI_NN
#include "vulkan_vi.h"
#endif

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

File diff suppressed because it is too large Load Diff

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -1,65 +0,0 @@
#ifndef VULKAN_MIR_H_
#define VULKAN_MIR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_mir_surface 1
#define VK_KHR_MIR_SURFACE_SPEC_VERSION 4
#define VK_KHR_MIR_SURFACE_EXTENSION_NAME "VK_KHR_mir_surface"
typedef VkFlags VkMirSurfaceCreateFlagsKHR;
typedef struct VkMirSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkMirSurfaceCreateFlagsKHR flags;
MirConnection* connection;
MirSurface* mirSurface;
} VkMirSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateMirSurfaceKHR)(VkInstance instance, const VkMirSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceMirPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, MirConnection* connection);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateMirSurfaceKHR(
VkInstance instance,
const VkMirSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceMirPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
MirConnection* connection);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
** Copyright (c) 2015-2019 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.

View File

@@ -1,4 +1,4 @@
# Copyright © 2017-2018 Intel Corporation
# Copyright © 2017-2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -34,8 +34,6 @@ cpp = meson.get_compiler('cpp')
null_dep = dependency('', required : false)
system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'dragonfly', 'linux'].contains(host_machine.system())
# Arguments for the preprocessor, put these in a separate array from the C and
# C++ (cpp in meson terminology) arguments since they need to be added to the
# default arguments for both C and C++.
@@ -43,8 +41,7 @@ pre_args = [
'-D__STDC_CONSTANT_MACROS',
'-D__STDC_FORMAT_MACROS',
'-D__STDC_LIMIT_MACROS',
'-DVERSION="@0@"'.format(meson.project_version()),
'-DPACKAGE_VERSION=VERSION',
'-DPACKAGE_VERSION="@0@"'.format(meson.project_version()),
'-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"',
]
@@ -54,20 +51,21 @@ with_valgrind = get_option('valgrind')
with_libunwind = get_option('libunwind')
with_asm = get_option('asm')
with_glx_read_only_text = get_option('glx-read-only-text')
with_glx_direct = get_option('glx-direct')
with_osmesa = get_option('osmesa')
with_swr_arches = get_option('swr-arches')
with_tools = get_option('tools')
if with_tools.contains('all')
with_tools = ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'xvmc']
with_tools = ['etnaviv', 'freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'xvmc']
endif
dri_drivers_path = get_option('dri-drivers-path')
if dri_drivers_path == ''
dri_drivers_path = join_paths(get_option('libdir'), 'dri')
dri_drivers_path = join_paths(get_option('prefix'), get_option('libdir'), 'dri')
endif
dri_search_path = get_option('dri-search-path')
if dri_search_path == ''
dri_search_path = join_paths(get_option('prefix'), dri_drivers_path)
dri_search_path = dri_drivers_path
endif
with_gles1 = get_option('gles1')
@@ -133,8 +131,8 @@ if _drivers.contains('auto')
]
elif ['arm', 'aarch64'].contains(host_machine.cpu_family())
_drivers = [
'pl111', 'v3d', 'vc4', 'freedreno', 'etnaviv', 'imx', 'nouveau',
'tegra', 'virgl', 'swrast',
'kmsro', 'v3d', 'vc4', 'freedreno', 'etnaviv', 'nouveau',
'tegra', 'virgl', 'swrast'
]
else
error('Unknown architecture @0@. Please pass -Dgallium-drivers to set driver options. Patches gladly accepted to fix this.'.format(
@@ -147,7 +145,7 @@ if _drivers.contains('auto')
host_machine.system()))
endif
endif
with_gallium_pl111 = _drivers.contains('pl111')
with_gallium_kmsro = _drivers.contains('kmsro')
with_gallium_radeonsi = _drivers.contains('radeonsi')
with_gallium_r300 = _drivers.contains('r300')
with_gallium_r600 = _drivers.contains('r600')
@@ -156,14 +154,23 @@ with_gallium_freedreno = _drivers.contains('freedreno')
with_gallium_softpipe = _drivers.contains('swrast')
with_gallium_vc4 = _drivers.contains('vc4')
with_gallium_v3d = _drivers.contains('v3d')
with_gallium_panfrost = _drivers.contains('panfrost')
with_gallium_etnaviv = _drivers.contains('etnaviv')
with_gallium_imx = _drivers.contains('imx')
with_gallium_tegra = _drivers.contains('tegra')
with_gallium_iris = _drivers.contains('iris')
with_gallium_i915 = _drivers.contains('i915')
with_gallium_svga = _drivers.contains('svga')
with_gallium_virgl = _drivers.contains('virgl')
with_gallium_swr = _drivers.contains('swr')
if cc.get_id() == 'intel'
if meson.version().version_compare('< 0.49.0')
error('Meson does not have sufficient support of ICC before 0.49.0 to compile mesa')
elif with_gallium_swr and meson.version().version_compare('== 0.49.0')
warning('Meson as of 0.49.0 is sufficient for compiling mesa with ICC, but there are some caveats with SWR. 0.49.1 should resolve all of these')
endif
endif
with_gallium = _drivers.length() != 0 and _drivers != ['']
if with_gallium and system_has_kms_drm
@@ -204,11 +211,8 @@ endif
if with_dri_i915 and with_gallium_i915
error('Only one i915 provider can be built')
endif
if with_gallium_imx and not with_gallium_etnaviv
error('IMX driver requires etnaviv driver')
endif
if with_gallium_pl111 and not with_gallium_vc4
error('pl111 driver requires vc4 driver')
if with_gallium_kmsro and not (with_gallium_vc4 or with_gallium_etnaviv or with_gallium_freedreno or with_gallium_panfrost)
error('kmsro driver requires one or more renderonly drivers (vc4, etnaviv, freedreno, panfrost)')
endif
if with_gallium_tegra and not with_gallium_nouveau
error('tegra driver requires nouveau driver')
@@ -223,8 +227,6 @@ elif system_has_kms_drm
else
# FIXME: haiku doesn't use dri, and xlib doesn't use dri, probably should
# assert here that one of those cases has been met.
# FIXME: GNU (hurd) ends up here as well, but meson doesn't officially
# support Hurd at time of writing (2017/11)
# FIXME: illumos ends up here as well
with_dri_platform = 'none'
endif
@@ -370,9 +372,6 @@ if with_glvnd
endif
endif
# TODO: toggle for this
with_glx_direct = true
if with_vulkan_icd_dir == ''
with_vulkan_icd_dir = join_paths(get_option('datadir'), 'vulkan/icd.d')
endif
@@ -388,9 +387,9 @@ endif
if with_any_vk and (with_platform_x11 and not with_dri3)
error('Vulkan drivers require dri3 for X11 support')
endif
if with_dri or with_gallium
if with_glx == 'disabled' and not with_egl and not with_platform_haiku
error('building dri or gallium drivers require at least one window system')
if with_dri
if with_glx == 'disabled' and not with_egl and not with_gbm and with_osmesa != 'classic'
error('building dri drivers require at least one windowing system or classic osmesa')
endif
endif
@@ -611,7 +610,7 @@ with_gallium_xa = _xa != 'false'
d3d_drivers_path = get_option('d3d-drivers-path')
if d3d_drivers_path == ''
d3d_drivers_path = join_paths(get_option('libdir'), 'd3d')
d3d_drivers_path = join_paths(get_option('prefix'), get_option('libdir'), 'd3d')
endif
with_gallium_st_nine = get_option('gallium-nine')
@@ -619,8 +618,9 @@ if with_gallium_st_nine
if not with_gallium_softpipe
error('The nine state tracker requires gallium softpipe/llvmpipe.')
elif not (with_gallium_radeonsi or with_gallium_nouveau or with_gallium_r600
or with_gallium_r300 or with_gallium_svga or with_gallium_i915)
error('The nine state tracker requires at least on non-swrast gallium driver.')
or with_gallium_r300 or with_gallium_svga or with_gallium_i915
or with_gallium_iris)
error('The nine state tracker requires at least one non-swrast gallium driver.')
endif
if not with_dri3
error('Using nine with wine requires dri3')
@@ -628,7 +628,12 @@ if with_gallium_st_nine
endif
if get_option('power8') != 'false'
if host_machine.cpu_family() == 'ppc64le'
# on old versions of meson the cpu family would return as ppc64le on little
# endian power8, this was changed in 0.48 such that the family would always
# be ppc64 regardless of endianness, and the the machine.endian() value
# should be checked. Since we support versions < 0.48 we need to use
# startswith.
if host_machine.cpu_family().startswith('ppc64') and host_machine.endian() == 'little'
if cc.get_id() == 'gcc' and cc.version().version_compare('< 4.8')
error('Altivec is not supported with gcc version < 4.8.')
endif
@@ -650,6 +655,7 @@ if get_option('power8') != 'false'
endif
_opencl = get_option('gallium-opencl')
clover_cpp_std = []
if _opencl != 'disabled'
if not with_gallium
error('OpenCL Clover implementation requires at least one gallium driver.')
@@ -658,10 +664,18 @@ if _opencl != 'disabled'
dep_clc = dependency('libclc')
with_gallium_opencl = true
with_opencl_icd = _opencl == 'icd'
if host_machine.cpu_family().startswith('ppc') and cpp.compiles('''
#if !defined(__VEC__) || !defined(__ALTIVEC__)
#error "AltiVec not enabled"
#endif''',
name : 'Altivec')
clover_cpp_std += ['cpp_std=gnu++11']
endif
else
dep_clc = null_dep
with_gallium_opencl = false
with_gallium_icd = false
with_opencl_icd = false
endif
gl_pkgconfig_c_flags = []
@@ -781,7 +795,7 @@ if cc.compiles('int foo(void) __attribute__((__noreturn__));',
endif
# TODO: this is very incomplete
if ['linux', 'cygwin'].contains(host_machine.system())
if ['linux', 'cygwin', 'gnu'].contains(host_machine.system())
pre_args += '-D_GNU_SOURCE'
endif
@@ -918,7 +932,7 @@ endif
# case of cross compiling where we can use asm, and that's x86_64 -> x86 when
# host OS == build OS, since in that case the build machine can run the host's
# binaries.
if meson.is_cross_build()
if with_asm and meson.is_cross_build()
if build_machine.system() != host_machine.system()
# TODO: It may be possible to do this with an exe_wrapper (like wine).
message('Cross compiling from one OS to another, disabling assembly.')
@@ -940,7 +954,7 @@ endif
with_asm_arch = ''
if with_asm
if host_machine.cpu_family() == 'x86'
if system_has_kms_drm
if system_has_kms_drm or host_machine.system() == 'gnu'
with_asm_arch = 'x86'
pre_args += ['-DUSE_X86_ASM', '-DUSE_MMX_ASM', '-DUSE_3DNOW_ASM',
'-DUSE_SSE_ASM']
@@ -969,7 +983,7 @@ if with_asm
with_asm_arch = 'sparc'
pre_args += ['-DUSE_SPARC_ASM']
endif
elif host_machine.cpu_family() == 'ppc64le'
elif host_machine.cpu_family().startswith('ppc64') and host_machine.endian() == 'little'
if system_has_kms_drm
with_asm_arch = 'ppc64le'
pre_args += ['-DUSE_PPC64LE_ASM']
@@ -1102,7 +1116,7 @@ dep_libdrm_nouveau = null_dep
dep_libdrm_etnaviv = null_dep
dep_libdrm_intel = null_dep
_drm_amdgpu_ver = '2.4.95'
_drm_amdgpu_ver = '2.4.97'
_drm_radeon_ver = '2.4.71'
_drm_nouveau_ver = '2.4.66'
_drm_etnaviv_ver = '2.4.89'
@@ -1163,7 +1177,7 @@ endif
llvm_modules = ['bitwriter', 'engine', 'mcdisassembler', 'mcjit']
llvm_optional_modules = []
if with_amd_vk or with_gallium_radeonsi or with_gallium_r600
llvm_modules += ['amdgpu', 'bitreader', 'ipo']
llvm_modules += ['amdgpu', 'native', 'bitreader', 'ipo']
if with_gallium_r600
llvm_modules += 'asmparser'
endif
@@ -1177,7 +1191,7 @@ if with_gallium_opencl
endif
if with_amd_vk or with_gallium_radeonsi
_llvm_version = '>= 6.0.0'
_llvm_version = '>= 7.0.0'
elif with_gallium_swr
_llvm_version = '>= 6.0.0'
elif with_gallium_opencl or with_gallium_r600
@@ -1224,6 +1238,9 @@ if with_llvm
# programs, so we need to build all C++ code in mesa without rtti as well to
# ensure that linking works.
if dep_llvm.get_configtool_variable('has-rtti') == 'NO'
if with_gallium_nouveau
error('The Nouveau driver requires rtti. You either need to turn off nouveau or use an LLVM built with LLVM_ENABLE_RTTI.')
endif
cpp_args += '-fno-rtti'
endif
elif with_amd_vk or with_gallium_radeonsi or with_gallium_swr
@@ -1350,9 +1367,8 @@ if with_platform_x11
dep_xdamage = dependency('xdamage', version : '>= 1.1')
dep_xfixes = dependency('xfixes')
dep_xcb_glx = dependency('xcb-glx', version : '>= 1.8.1')
dep_xxf86vm = dependency('xxf86vm')
endif
if (with_any_vk or with_glx == 'dri' or
if (with_any_vk or with_glx == 'dri' or with_egl or
(with_gallium_vdpau or with_gallium_xvmc or with_gallium_va or
with_gallium_omx != 'disabled'))
dep_xcb = dependency('xcb')
@@ -1377,6 +1393,7 @@ if with_platform_x11
if with_glx == 'dri'
if with_dri_platform == 'drm'
dep_dri2proto = dependency('dri2proto', version : '>= 2.8')
dep_xxf86vm = dependency('xxf86vm')
endif
dep_glproto = dependency('glproto', version : '>= 1.4.14')
endif
@@ -1386,7 +1403,7 @@ if with_platform_x11
dep_xcb_xfixes = dependency('xcb-xfixes')
endif
if with_xlib_lease
dep_xcb_xrandr = dependency('xcb-randr', version : '>= 1.12')
dep_xcb_xrandr = dependency('xcb-randr')
dep_xlib_xrandr = dependency('xrandr', version : '>= 1.3')
endif
endif
@@ -1397,7 +1414,7 @@ endif
_sensors = get_option('lmsensors')
if _sensors != 'false'
dep_lmsensors = cc.find_library('libsensors', required : _sensors == 'true')
dep_lmsensors = cc.find_library('sensors', required : _sensors == 'true')
if dep_lmsensors.found()
pre_args += '-DHAVE_LIBSENSORS=1'
endif
@@ -1427,8 +1444,8 @@ elif with_glx == 'dri'
'xcb-glx >= 1.8.1']
if with_dri_platform == 'drm'
gl_priv_reqs += 'xcb-dri2 >= 1.8'
gl_priv_reqs += 'xxf86vm'
endif
gl_priv_reqs += 'xxf86vm'
endif
if dep_libdrm.found()
gl_priv_reqs += 'libdrm >= 2.4.75'
@@ -1450,6 +1467,10 @@ pkg = import('pkgconfig')
env_test = environment()
env_test.set('NM', find_program('nm').path())
# This quirk needs to be applied to sources with functions defined in assembly
# as GCC LTO drops them. See: https://bugs.freedesktop.org/show_bug.cgi?id=109391
gcc_lto_quirk = (cc.get_id() == 'gcc') ? ['-fno-lto'] : []
subdir('include')
subdir('bin')
subdir('src')

View File

@@ -58,9 +58,9 @@ option(
type : 'array',
value : ['auto'],
choices : [
'', 'auto', 'pl111', 'radeonsi', 'r300', 'r600', 'nouveau', 'freedreno',
'swrast', 'v3d', 'vc4', 'etnaviv', 'imx', 'tegra', 'i915', 'svga', 'virgl',
'swr',
'', 'auto', 'kmsro', 'radeonsi', 'r300', 'r600', 'nouveau', 'freedreno',
'swrast', 'v3d', 'vc4', 'etnaviv', 'tegra', 'i915', 'svga', 'virgl',
'swr', 'panfrost', 'iris'
],
description : 'List of gallium drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
)
@@ -167,6 +167,12 @@ option(
value : '',
description : 'Location relative to prefix to put vulkan icds on install. Default: $datadir/vulkan/icd.d'
)
option(
'vulkan-overlay-layer',
type : 'boolean',
value : false,
description : 'Whether to build the vulkan overlay layer'
)
option(
'shared-glapi',
type : 'boolean',
@@ -301,7 +307,7 @@ option(
'tools',
type : 'array',
value : [],
choices : ['freedreno', 'glsl', 'intel', 'intel-ui', 'nir', 'nouveau', 'xvmc', 'all'],
choices : ['etnaviv', 'freedreno', 'glsl', 'intel', 'intel-ui', 'nir', 'nouveau', 'xvmc', 'all'],
description : 'List of tools to build. (Note: `intel-ui` selects `intel`)',
)
option(
@@ -318,3 +324,9 @@ option(
choices : ['auto', 'true', 'false'],
description : 'Enable VK_EXT_acquire_xlib_display.'
)
option(
'glx-direct',
type : 'boolean',
value : true,
description : 'Enable direct rendering in GLX and EGL for DRI',
)

View File

@@ -81,6 +81,10 @@ if HAVE_BROADCOM_DRIVERS
SUBDIRS += broadcom
endif
if HAVE_FREEDRENO_DRIVERS
SUBDIRS += freedreno
endif
if NEED_OPENGL_COMMON
SUBDIRS += mesa
endif
@@ -132,3 +136,18 @@ libglsl_util_la_SOURCES = \
mesa/program/prog_parameter.c \
mesa/program/symbol_table.c \
mesa/program/dummy_errors.c
EXTRA_DIST += \
tools/imgui/imconfig.h \
tools/imgui/imgui.cpp \
tools/imgui/imgui.h \
tools/imgui/imgui_draw.cpp \
tools/imgui/imgui_demo.cpp \
tools/imgui/imgui_internal.h \
tools/imgui/imgui_memory_editor.h \
tools/imgui/stb_rect_pack.h \
tools/imgui/stb_textedit.h \
tools/imgui/stb_truetype.h \
tools/imgui/README \
tools/imgui/LICENSE.txt \
tools/imgui/meson.build

View File

@@ -33,12 +33,11 @@ LOCAL_SRC_FILES := $(ADDRLIB_FILES)
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src \
$(MESA_TOP)/src/amd/common \
$(MESA_TOP)/src/amd/addrlib \
$(MESA_TOP)/src/amd/addrlib/core \
$(MESA_TOP)/src/amd/addrlib/inc/chip/gfx9 \
$(MESA_TOP)/src/amd/addrlib/inc/chip/r800 \
$(MESA_TOP)/src/amd/addrlib/gfx9/chip \
$(MESA_TOP)/src/amd/addrlib/r800/chip
$(MESA_TOP)/src/amd/addrlib/inc \
$(MESA_TOP)/src/amd/addrlib/src \
$(MESA_TOP)/src/amd/addrlib/src/core \
$(MESA_TOP)/src/amd/addrlib/src/chip/gfx9 \
$(MESA_TOP)/src/amd/addrlib/src/chip/r800
LOCAL_EXPORT_C_INCLUDE_DIRS := \
$(LOCAL_PATH) \

View File

@@ -26,12 +26,11 @@ addrlib_libamdgpu_addrlib_la_CPPFLAGS = \
-I$(top_srcdir)/src/ \
-I$(top_srcdir)/include \
-I$(srcdir)/common \
-I$(srcdir)/addrlib \
-I$(srcdir)/addrlib/core \
-I$(srcdir)/addrlib/inc/chip/gfx9 \
-I$(srcdir)/addrlib/inc/chip/r800 \
-I$(srcdir)/addrlib/gfx9/chip \
-I$(srcdir)/addrlib/r800/chip
-I$(srcdir)/addrlib/inc \
-I$(srcdir)/addrlib/src \
-I$(srcdir)/addrlib/src/core \
-I$(srcdir)/addrlib/src/chip/gfx9 \
-I$(srcdir)/addrlib/src/chip/r800
addrlib_libamdgpu_addrlib_la_CXXFLAGS = \
$(VISIBILITY_CXXFLAGS) $(CXX11_CXXFLAGS)

View File

@@ -5,35 +5,33 @@ COMMON_HEADER_FILES = \
common/amd_kernel_code_t.h
ADDRLIB_FILES = \
addrlib/addrinterface.cpp \
addrlib/addrinterface.h \
addrlib/addrtypes.h \
addrlib/amdgpu_asic_addr.h \
addrlib/core/addrcommon.h \
addrlib/core/addrelemlib.cpp \
addrlib/core/addrelemlib.h \
addrlib/core/addrlib.cpp \
addrlib/core/addrlib.h \
addrlib/core/addrlib1.cpp \
addrlib/core/addrlib1.h \
addrlib/core/addrlib2.cpp \
addrlib/core/addrlib2.h \
addrlib/core/addrobject.cpp \
addrlib/core/addrobject.h \
addrlib/gfx9/chip/gfx9_enum.h \
addrlib/gfx9/coord.cpp \
addrlib/gfx9/coord.h \
addrlib/gfx9/gfx9addrlib.cpp \
addrlib/gfx9/gfx9addrlib.h \
addrlib/inc/chip/gfx9/gfx9_gb_reg.h \
addrlib/inc/chip/r800/si_gb_reg.h \
addrlib/r800/chip/si_ci_vi_merged_enum.h \
addrlib/r800/ciaddrlib.cpp \
addrlib/r800/ciaddrlib.h \
addrlib/r800/egbaddrlib.cpp \
addrlib/r800/egbaddrlib.h \
addrlib/r800/siaddrlib.cpp \
addrlib/r800/siaddrlib.h
addrlib/inc/addrinterface.h \
addrlib/inc/addrtypes.h \
addrlib/src/addrinterface.cpp \
addrlib/src/amdgpu_asic_addr.h \
addrlib/src/core/addrcommon.h \
addrlib/src/core/addrelemlib.cpp \
addrlib/src/core/addrelemlib.h \
addrlib/src/core/addrlib.cpp \
addrlib/src/core/addrlib.h \
addrlib/src/core/addrlib1.cpp \
addrlib/src/core/addrlib1.h \
addrlib/src/core/addrlib2.cpp \
addrlib/src/core/addrlib2.h \
addrlib/src/core/addrobject.cpp \
addrlib/src/core/addrobject.h \
addrlib/src/core/coord.cpp \
addrlib/src/core/coord.h \
addrlib/src/gfx9/gfx9addrlib.cpp \
addrlib/src/gfx9/gfx9addrlib.h \
addrlib/src/chip/gfx9/gfx9_gb_reg.h \
addrlib/src/chip/r800/si_gb_reg.h \
addrlib/src/r800/ciaddrlib.cpp \
addrlib/src/r800/ciaddrlib.h \
addrlib/src/r800/egbaddrlib.cpp \
addrlib/src/r800/egbaddrlib.h \
addrlib/src/r800/siaddrlib.cpp \
addrlib/src/r800/siaddrlib.h
AMD_COMPILER_FILES = \
common/ac_binary.c \

File diff suppressed because it is too large Load Diff

View File

@@ -1,5 +1,5 @@
/*
* Copyright © 2014 Advanced Micro Devices, Inc.
* Copyright © 2007-2018 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
@@ -177,7 +177,6 @@ typedef struct _ADDR_EQUATION
///< stacked vertically prior to swizzling
} ADDR_EQUATION;
/**
****************************************************************************************************
* @brief Alloc system memory flags.
@@ -409,8 +408,6 @@ ADDR_E_RETURNCODE ADDR_API AddrCreate(
const ADDR_CREATE_INPUT* pAddrCreateIn,
ADDR_CREATE_OUTPUT* pAddrCreateOut);
/**
****************************************************************************************************
* AddrDestroy
@@ -425,8 +422,6 @@ ADDR_E_RETURNCODE ADDR_API AddrCreate(
ADDR_E_RETURNCODE ADDR_API AddrDestroy(
ADDR_HANDLE hLib);
////////////////////////////////////////////////////////////////////////////////////////////////////
// Surface functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -658,8 +653,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeSurfaceInfo(
const ADDR_COMPUTE_SURFACE_INFO_INPUT* pIn,
ADDR_COMPUTE_SURFACE_INFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_SURFACE_ADDRFROMCOORD_INPUT
@@ -748,8 +741,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeSurfaceAddrFromCoord(
const ADDR_COMPUTE_SURFACE_ADDRFROMCOORD_INPUT* pIn,
ADDR_COMPUTE_SURFACE_ADDRFROMCOORD_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_SURFACE_COORDFROMADDR_INPUT
@@ -931,8 +922,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeHtileInfo(
const ADDR_COMPUTE_HTILE_INFO_INPUT* pIn,
ADDR_COMPUTE_HTILE_INFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_HTILE_ADDRFROMCOORD_INPUT
@@ -995,8 +984,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeHtileAddrFromCoord(
const ADDR_COMPUTE_HTILE_ADDRFROMCOORD_INPUT* pIn,
ADDR_COMPUTE_HTILE_ADDRFROMCOORD_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_HTILE_COORDFROMADDR_INPUT
@@ -1057,8 +1044,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeHtileCoordFromAddr(
const ADDR_COMPUTE_HTILE_COORDFROMADDR_INPUT* pIn,
ADDR_COMPUTE_HTILE_COORDFROMADDR_OUTPUT* pOut);
////////////////////////////////////////////////////////////////////////////////////////////////////
// C-mask functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -1146,8 +1131,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeCmaskInfo(
const ADDR_COMPUTE_CMASK_INFO_INPUT* pIn,
ADDR_COMPUTE_CMASK_INFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_CMASK_ADDRFROMCOORD_INPUT
@@ -1208,8 +1191,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeCmaskAddrFromCoord(
const ADDR_COMPUTE_CMASK_ADDRFROMCOORD_INPUT* pIn,
ADDR_COMPUTE_CMASK_ADDRFROMCOORD_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_CMASK_COORDFROMADDR_INPUT
@@ -1268,8 +1249,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeCmaskCoordFromAddr(
const ADDR_COMPUTE_CMASK_COORDFROMADDR_INPUT* pIn,
ADDR_COMPUTE_CMASK_COORDFROMADDR_OUTPUT* pOut);
////////////////////////////////////////////////////////////////////////////////////////////////////
// F-mask functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -1350,8 +1329,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeFmaskInfo(
const ADDR_COMPUTE_FMASK_INFO_INPUT* pIn,
ADDR_COMPUTE_FMASK_INFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_FMASK_ADDRFROMCOORD_INPUT
@@ -1428,8 +1405,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeFmaskAddrFromCoord(
const ADDR_COMPUTE_FMASK_ADDRFROMCOORD_INPUT* pIn,
ADDR_COMPUTE_FMASK_ADDRFROMCOORD_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_FMASK_COORDFROMADDR_INPUT
@@ -1503,8 +1478,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeFmaskCoordFromAddr(
const ADDR_COMPUTE_FMASK_COORDFROMADDR_INPUT* pIn,
ADDR_COMPUTE_FMASK_COORDFROMADDR_OUTPUT* pOut);
////////////////////////////////////////////////////////////////////////////////////////////////////
// Element/utility functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -1593,7 +1566,6 @@ ADDR_E_RETURNCODE ADDR_API AddrExtractBankPipeSwizzle(
const ADDR_EXTRACT_BANKPIPE_SWIZZLE_INPUT* pIn,
ADDR_EXTRACT_BANKPIPE_SWIZZLE_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMBINE_BANKPIPE_SWIZZLE_INPUT
@@ -1651,8 +1623,6 @@ ADDR_E_RETURNCODE ADDR_API AddrCombineBankPipeSwizzle(
const ADDR_COMBINE_BANKPIPE_SWIZZLE_INPUT* pIn,
ADDR_COMBINE_BANKPIPE_SWIZZLE_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_COMPUTE_SLICESWIZZLE_INPUT
@@ -1679,8 +1649,6 @@ typedef struct _ADDR_COMPUTE_SLICESWIZZLE_INPUT
///< README: When tileIndex is not -1, this must be valid
} ADDR_COMPUTE_SLICESWIZZLE_INPUT;
/**
****************************************************************************************************
* ADDR_COMPUTE_SLICESWIZZLE_OUTPUT
@@ -1711,7 +1679,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeSliceSwizzle(
const ADDR_COMPUTE_SLICESWIZZLE_INPUT* pIn,
ADDR_COMPUTE_SLICESWIZZLE_OUTPUT* pOut);
/**
****************************************************************************************************
* AddrSwizzleGenOption
@@ -1802,8 +1769,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeBaseSwizzle(
const ADDR_COMPUTE_BASE_SWIZZLE_INPUT* pIn,
ADDR_COMPUTE_BASE_SWIZZLE_OUTPUT* pOut);
/**
****************************************************************************************************
* ELEM_GETEXPORTNORM_INPUT
@@ -1844,8 +1809,6 @@ BOOL_32 ADDR_API ElemGetExportNorm(
ADDR_HANDLE hLib,
const ELEM_GETEXPORTNORM_INPUT* pIn);
/**
****************************************************************************************************
* ELEM_FLT32TODEPTHPIXEL_INPUT
@@ -1901,8 +1864,6 @@ ADDR_E_RETURNCODE ADDR_API ElemFlt32ToDepthPixel(
const ELEM_FLT32TODEPTHPIXEL_INPUT* pIn,
ELEM_FLT32TODEPTHPIXEL_OUTPUT* pOut);
/**
****************************************************************************************************
* ELEM_FLT32TOCOLORPIXEL_INPUT
@@ -1956,6 +1917,21 @@ ADDR_E_RETURNCODE ADDR_API ElemFlt32ToColorPixel(
const ELEM_FLT32TOCOLORPIXEL_INPUT* pIn,
ELEM_FLT32TOCOLORPIXEL_OUTPUT* pOut);
/**
****************************************************************************************************
* ElemSize
*
* @brief
* Get bits-per-element for specified format
*
* @return
* Bits-per-element of specified format
*
****************************************************************************************************
*/
UINT_32 ADDR_API ElemSize(
ADDR_HANDLE hLib,
AddrFormat format);
/**
****************************************************************************************************
@@ -2014,8 +1990,6 @@ ADDR_E_RETURNCODE ADDR_API AddrConvertTileInfoToHW(
const ADDR_CONVERT_TILEINFOTOHW_INPUT* pIn,
ADDR_CONVERT_TILEINFOTOHW_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_CONVERT_TILEINDEX_INPUT
@@ -2140,8 +2114,6 @@ ADDR_E_RETURNCODE ADDR_API AddrConvertTileIndex1(
const ADDR_CONVERT_TILEINDEX1_INPUT* pIn,
ADDR_CONVERT_TILEINDEX_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_GET_TILEINDEX_INPUT
@@ -2187,8 +2159,6 @@ ADDR_E_RETURNCODE ADDR_API AddrGetTileIndex(
const ADDR_GET_TILEINDEX_INPUT* pIn,
ADDR_GET_TILEINDEX_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_PRT_INFO_INPUT
@@ -2233,8 +2203,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputePrtInfo(
const ADDR_PRT_INFO_INPUT* pIn,
ADDR_PRT_INFO_OUTPUT* pOut);
////////////////////////////////////////////////////////////////////////////////////////////////////
// DCC key functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -2295,8 +2263,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeDccInfo(
const ADDR_COMPUTE_DCCINFO_INPUT* pIn,
ADDR_COMPUTE_DCCINFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR_GET_MAX_ALINGMENTS_OUTPUT
@@ -2360,7 +2326,6 @@ ADDR_E_RETURNCODE ADDR_API AddrGetMaxMetaAlignments(
*
**/
////////////////////////////////////////////////////////////////////////////////////////////////////
// Surface functions for Gfx9
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -2395,7 +2360,8 @@ typedef union _ADDR2_SURFACE_FLAGS
UINT_32 noMetadata : 1; ///< This resource has no metadata
UINT_32 metaRbUnaligned : 1; ///< This resource has rb unaligned metadata
UINT_32 metaPipeUnaligned : 1; ///< This resource has pipe unaligned metadata
UINT_32 reserved : 14; ///< Reserved bits
UINT_32 view3dAs2dArray : 1; ///< This resource is a 3D resource viewed as 2D array
UINT_32 reserved : 13; ///< Reserved bits
};
UINT_32 value;
@@ -2523,8 +2489,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeSurfaceInfo(
const ADDR2_COMPUTE_SURFACE_INFO_INPUT* pIn,
ADDR2_COMPUTE_SURFACE_INFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_SURFACE_ADDRFROMCOORD_INPUT
@@ -2591,8 +2555,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeSurfaceAddrFromCoord(
const ADDR2_COMPUTE_SURFACE_ADDRFROMCOORD_INPUT* pIn,
ADDR2_COMPUTE_SURFACE_ADDRFROMCOORD_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_SURFACE_COORDFROMADDR_INPUT
@@ -2658,8 +2620,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeSurfaceCoordFromAddr(
const ADDR2_COMPUTE_SURFACE_COORDFROMADDR_INPUT* pIn,
ADDR2_COMPUTE_SURFACE_COORDFROMADDR_OUTPUT* pOut);
////////////////////////////////////////////////////////////////////////////////////////////////////
// HTile functions for Gfx9
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -2710,8 +2670,10 @@ typedef struct _ADDR2_META_MIP_INFO
struct
{
UINT_32 offset;
UINT_32 sliceSize;
UINT_32 offset; ///< Metadata offset within one slice,
/// the thickness of a slice is meta block depth.
UINT_32 sliceSize; ///< Metadata size within one slice,
/// the thickness of a slice is meta block depth.
};
};
} ADDR2_META_MIP_INFO;
@@ -2735,7 +2697,9 @@ typedef struct _ADDR2_COMPUTE_HTILE_INFO_INPUT
UINT_32 unalignedHeight; ///< Depth surface original height (of mip0)
UINT_32 numSlices; ///< Number of slices of depth surface (of mip0)
UINT_32 numMipLevels; ///< Total mipmap levels of color surface
UINT_32 firstMipIdInTail;
UINT_32 firstMipIdInTail; /// Id of the first mip in tail,
/// if no mip is in tail, it should be set to
/// number of mip levels
} ADDR2_COMPUTE_HTILE_INFO_INPUT;
/**
@@ -2777,8 +2741,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeHtileInfo(
const ADDR2_COMPUTE_HTILE_INFO_INPUT* pIn,
ADDR2_COMPUTE_HTILE_INFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_HTILE_ADDRFROMCOORD_INPUT
@@ -2836,8 +2798,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeHtileAddrFromCoord(
const ADDR2_COMPUTE_HTILE_ADDRFROMCOORD_INPUT* pIn,
ADDR2_COMPUTE_HTILE_ADDRFROMCOORD_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_HTILE_COORDFROMADDR_INPUT
@@ -2896,8 +2856,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeHtileCoordFromAddr(
const ADDR2_COMPUTE_HTILE_COORDFROMADDR_INPUT* pIn,
ADDR2_COMPUTE_HTILE_COORDFROMADDR_OUTPUT* pOut);
////////////////////////////////////////////////////////////////////////////////////////////////////
// C-mask functions for Gfx9
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -2963,8 +2921,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeCmaskInfo(
const ADDR2_COMPUTE_CMASK_INFO_INPUT* pIn,
ADDR2_COMPUTE_CMASK_INFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_CMASK_ADDRFROMCOORD_INPUT
@@ -3026,8 +2982,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeCmaskAddrFromCoord(
const ADDR2_COMPUTE_CMASK_ADDRFROMCOORD_INPUT* pIn,
ADDR2_COMPUTE_CMASK_ADDRFROMCOORD_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_CMASK_COORDFROMADDR_INPUT
@@ -3086,8 +3040,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeCmaskCoordFromAddr(
const ADDR2_COMPUTE_CMASK_COORDFROMADDR_INPUT* pIn,
ADDR2_COMPUTE_CMASK_COORDFROMADDR_OUTPUT* pOut);
////////////////////////////////////////////////////////////////////////////////////////////////////
// F-mask functions for Gfx9
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -3170,8 +3122,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeFmaskInfo(
const ADDR2_COMPUTE_FMASK_INFO_INPUT* pIn,
ADDR2_COMPUTE_FMASK_INFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_FMASK_ADDRFROMCOORD_INPUT
@@ -3231,8 +3181,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeFmaskAddrFromCoord(
const ADDR2_COMPUTE_FMASK_ADDRFROMCOORD_INPUT* pIn,
ADDR2_COMPUTE_FMASK_ADDRFROMCOORD_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_FMASK_COORDFROMADDR_INPUT
@@ -3291,8 +3239,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeFmaskCoordFromAddr(
const ADDR2_COMPUTE_FMASK_COORDFROMADDR_INPUT* pIn,
ADDR2_COMPUTE_FMASK_COORDFROMADDR_OUTPUT* pOut);
////////////////////////////////////////////////////////////////////////////////////////////////////
// DCC key functions for Gfx9
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -3321,7 +3267,8 @@ typedef struct _ADDR2_COMPUTE_DCCINFO_INPUT
UINT_32 numMipLevels; ///< Total mipmap levels of color surface
UINT_32 dataSurfaceSize; ///< The padded size of all slices and mip levels
///< useful in meta linear case
UINT_32 firstMipIdInTail;
UINT_32 firstMipIdInTail; ///< The id of first mip in tail, if no mip is in tail,
/// it should be number of mip levels
} ADDR2_COMPUTE_DCCINFO_INPUT;
/**
@@ -3356,7 +3303,9 @@ typedef struct _ADDR2_COMPUTE_DCCINFO_OUTPUT
union
{
UINT_32 fastClearSizePerSlice; ///< Size of DCC within a slice should be fast cleared
UINT_32 dccRamSliceSize;
UINT_32 dccRamSliceSize; ///< DCC ram size per slice. For mipmap, it's
/// the slize size of a mip chain, the thickness of a
/// a slice is meta block depth
};
ADDR2_META_MIP_INFO* pMipInfo; ///< DCC mip information
@@ -3376,7 +3325,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeDccInfo(
const ADDR2_COMPUTE_DCCINFO_INPUT* pIn,
ADDR2_COMPUTE_DCCINFO_OUTPUT* pOut);
/**
****************************************************************************************************
* ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT
@@ -3628,6 +3576,55 @@ typedef union _ADDR2_SWTYPE_SET
UINT_32 value;
} ADDR2_SWTYPE_SET;
/**
****************************************************************************************************
* ADDR2_SWMODE_SET
*
* @brief
* Bit field that defines swizzle type
****************************************************************************************************
*/
typedef union _ADDR2_SWMODE_SET
{
struct
{
UINT_32 swLinear : 1;
UINT_32 sw256B_S : 1;
UINT_32 sw256B_D : 1;
UINT_32 sw256B_R : 1;
UINT_32 sw4KB_Z : 1;
UINT_32 sw4KB_S : 1;
UINT_32 sw4KB_D : 1;
UINT_32 sw4KB_R : 1;
UINT_32 sw64KB_Z : 1;
UINT_32 sw64KB_S : 1;
UINT_32 sw64KB_D : 1;
UINT_32 sw64KB_R : 1;
UINT_32 swVar_Z : 1;
UINT_32 swVar_S : 1;
UINT_32 swVar_D : 1;
UINT_32 swVar_R : 1;
UINT_32 sw64KB_Z_T : 1;
UINT_32 sw64KB_S_T : 1;
UINT_32 sw64KB_D_T : 1;
UINT_32 sw64KB_R_T : 1;
UINT_32 sw4KB_Z_X : 1;
UINT_32 sw4KB_S_X : 1;
UINT_32 sw4KB_D_X : 1;
UINT_32 sw4KB_R_X : 1;
UINT_32 sw64KB_Z_X : 1;
UINT_32 sw64KB_S_X : 1;
UINT_32 sw64KB_D_X : 1;
UINT_32 sw64KB_R_X : 1;
UINT_32 swVar_Z_X : 1;
UINT_32 swVar_S_X : 1;
UINT_32 swVar_D_X : 1;
UINT_32 swVar_R_X : 1;
};
UINT_32 value;
} ADDR2_SWMODE_SET;
/**
****************************************************************************************************
* ADDR2_GET_PREFERRED_SURF_SETTING_INPUT
@@ -3681,6 +3678,7 @@ typedef struct _ADDR2_GET_PREFERRED_SURF_SETTING_OUTPUT
/// type
ADDR2_SWTYPE_SET validSwTypeSet; ///< Valid swizzle type bit combination
ADDR2_SWTYPE_SET clientPreferredSwSet; ///< Client-preferred swizzle type bit combination
ADDR2_SWMODE_SET validSwModeSet; ///< Valid swizzle mode bit combination
} ADDR2_GET_PREFERRED_SURF_SETTING_OUTPUT;
/**

View File

@@ -1,5 +1,5 @@
/*
* Copyright © 2014 Advanced Micro Devices, Inc.
* Copyright © 2007-2018 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
@@ -112,7 +112,6 @@ typedef int INT;
#define GC_FASTCALL ADDR_FASTCALL
#endif
#if defined(__GNUC__)
#define ADDR_INLINE static inline // inline needs to be static to link
#else
@@ -353,7 +352,7 @@ typedef enum _AddrFormat {
ADDR_FMT_3_3_2 = 0x00000003,
ADDR_FMT_RESERVED_4 = 0x00000004,
ADDR_FMT_16 = 0x00000005,
ADDR_FMT_16_FLOAT = 0x00000006,
ADDR_FMT_16_FLOAT = ADDR_FMT_16,
ADDR_FMT_8_8 = 0x00000007,
ADDR_FMT_5_6_5 = 0x00000008,
ADDR_FMT_6_5_5 = 0x00000009,
@@ -361,28 +360,28 @@ typedef enum _AddrFormat {
ADDR_FMT_4_4_4_4 = 0x0000000b,
ADDR_FMT_5_5_5_1 = 0x0000000c,
ADDR_FMT_32 = 0x0000000d,
ADDR_FMT_32_FLOAT = 0x0000000e,
ADDR_FMT_32_FLOAT = ADDR_FMT_32,
ADDR_FMT_16_16 = 0x0000000f,
ADDR_FMT_16_16_FLOAT = 0x00000010,
ADDR_FMT_16_16_FLOAT = ADDR_FMT_16_16,
ADDR_FMT_8_24 = 0x00000011,
ADDR_FMT_8_24_FLOAT = 0x00000012,
ADDR_FMT_8_24_FLOAT = ADDR_FMT_8_24,
ADDR_FMT_24_8 = 0x00000013,
ADDR_FMT_24_8_FLOAT = 0x00000014,
ADDR_FMT_24_8_FLOAT = ADDR_FMT_24_8,
ADDR_FMT_10_11_11 = 0x00000015,
ADDR_FMT_10_11_11_FLOAT = 0x00000016,
ADDR_FMT_10_11_11_FLOAT = ADDR_FMT_10_11_11,
ADDR_FMT_11_11_10 = 0x00000017,
ADDR_FMT_11_11_10_FLOAT = 0x00000018,
ADDR_FMT_11_11_10_FLOAT = ADDR_FMT_11_11_10,
ADDR_FMT_2_10_10_10 = 0x00000019,
ADDR_FMT_8_8_8_8 = 0x0000001a,
ADDR_FMT_10_10_10_2 = 0x0000001b,
ADDR_FMT_X24_8_32_FLOAT = 0x0000001c,
ADDR_FMT_32_32 = 0x0000001d,
ADDR_FMT_32_32_FLOAT = 0x0000001e,
ADDR_FMT_32_32_FLOAT = ADDR_FMT_32_32,
ADDR_FMT_16_16_16_16 = 0x0000001f,
ADDR_FMT_16_16_16_16_FLOAT = 0x00000020,
ADDR_FMT_16_16_16_16_FLOAT = ADDR_FMT_16_16_16_16,
ADDR_FMT_RESERVED_33 = 0x00000021,
ADDR_FMT_32_32_32_32 = 0x00000022,
ADDR_FMT_32_32_32_32_FLOAT = 0x00000023,
ADDR_FMT_32_32_32_32_FLOAT = ADDR_FMT_32_32_32_32,
ADDR_FMT_RESERVED_36 = 0x00000024,
ADDR_FMT_1 = 0x00000025,
ADDR_FMT_1_REVERSED = 0x00000026,
@@ -393,9 +392,9 @@ typedef enum _AddrFormat {
ADDR_FMT_5_9_9_9_SHAREDEXP = 0x0000002b,
ADDR_FMT_8_8_8 = 0x0000002c,
ADDR_FMT_16_16_16 = 0x0000002d,
ADDR_FMT_16_16_16_FLOAT = 0x0000002e,
ADDR_FMT_16_16_16_FLOAT = ADDR_FMT_16_16_16,
ADDR_FMT_32_32_32 = 0x0000002f,
ADDR_FMT_32_32_32_FLOAT = 0x00000030,
ADDR_FMT_32_32_32_FLOAT = ADDR_FMT_32_32_32,
ADDR_FMT_BC1 = 0x00000031,
ADDR_FMT_BC2 = 0x00000032,
ADDR_FMT_BC3 = 0x00000033,
@@ -550,7 +549,6 @@ typedef enum _AddrHtileBlockSize
ADDR_HTILE_BLOCKSIZE_8 = 8,
} AddrHtileBlockSize;
/**
****************************************************************************************************
* AddrPipeCfg
@@ -584,7 +582,8 @@ typedef enum _AddrPipeCfg
ADDR_PIPECFG_P8_32x64_32x32 = 15,
ADDR_PIPECFG_P16_32x32_8x16 = 17, /// 16 pipes
ADDR_PIPECFG_P16_32x32_16x16 = 18,
ADDR_PIPECFG_MAX = 19,
ADDR_PIPECFG_RESERVED = 19, /// reserved for internal use
ADDR_PIPECFG_MAX = 20,
} AddrPipeCfg;
/**
@@ -712,7 +711,6 @@ typedef enum _AddrTileType
#define ADDR64D "lld" OR "I64d"
#endif
/// @brief Union for storing a 32-bit float or 32-bit integer
/// @ingroup type
///
@@ -728,7 +726,6 @@ typedef union {
float f;
} ADDR_FLT_32;
////////////////////////////////////////////////////////////////////////////////////////////////////
//
// Macros for controlling linking and building on multiple systems

View File

@@ -19,35 +19,33 @@
# SOFTWARE.
files_addrlib = files(
'addrinterface.cpp',
'addrinterface.h',
'addrtypes.h',
'core/addrcommon.h',
'core/addrelemlib.cpp',
'core/addrelemlib.h',
'core/addrlib.cpp',
'core/addrlib.h',
'core/addrlib1.cpp',
'core/addrlib1.h',
'core/addrlib2.cpp',
'core/addrlib2.h',
'core/addrobject.cpp',
'core/addrobject.h',
'gfx9/chip/gfx9_enum.h',
'gfx9/coord.cpp',
'gfx9/coord.h',
'gfx9/gfx9addrlib.cpp',
'gfx9/gfx9addrlib.h',
'amdgpu_asic_addr.h',
'inc/chip/gfx9/gfx9_gb_reg.h',
'inc/chip/r800/si_gb_reg.h',
'r800/chip/si_ci_vi_merged_enum.h',
'r800/ciaddrlib.cpp',
'r800/ciaddrlib.h',
'r800/egbaddrlib.cpp',
'r800/egbaddrlib.h',
'r800/siaddrlib.cpp',
'r800/siaddrlib.h',
'inc/addrinterface.h',
'inc/addrtypes.h',
'src/addrinterface.cpp',
'src/core/addrcommon.h',
'src/core/addrelemlib.cpp',
'src/core/addrelemlib.h',
'src/core/addrlib.cpp',
'src/core/addrlib.h',
'src/core/addrlib1.cpp',
'src/core/addrlib1.h',
'src/core/addrlib2.cpp',
'src/core/addrlib2.h',
'src/core/addrobject.cpp',
'src/core/addrobject.h',
'src/core/coord.cpp',
'src/core/coord.h',
'src/gfx9/gfx9addrlib.cpp',
'src/gfx9/gfx9addrlib.h',
'src/amdgpu_asic_addr.h',
'src/chip/gfx9/gfx9_gb_reg.h',
'src/chip/r800/si_gb_reg.h',
'src/r800/ciaddrlib.cpp',
'src/r800/ciaddrlib.h',
'src/r800/egbaddrlib.cpp',
'src/r800/egbaddrlib.h',
'src/r800/siaddrlib.cpp',
'src/r800/siaddrlib.h',
)
libamdgpu_addrlib = static_library(
@@ -55,7 +53,7 @@ libamdgpu_addrlib = static_library(
files_addrlib,
include_directories : [
include_directories(
'core', 'inc/chip/gfx9', 'inc/chip/r800', 'gfx9/chip', 'r800/chip',
'inc', 'src', 'src/core', 'src/chip/gfx9', 'src/chip/r800',
),
inc_amd_common, inc_common, inc_src,
],

View File

@@ -1,40 +0,0 @@
/*
* Copyright © 2014 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS, AUTHORS
* AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*/
#if !defined (SI_CI_VI_MERGED_ENUM_HEADER)
#define SI_CI_VI_MERGED_ENUM_HEADER
typedef enum PipeInterleaveSize {
ADDR_CONFIG_PIPE_INTERLEAVE_256B = 0x00000000,
ADDR_CONFIG_PIPE_INTERLEAVE_512B = 0x00000001,
} PipeInterleaveSize;
typedef enum RowSize {
ADDR_CONFIG_1KB_ROW = 0x00000000,
ADDR_CONFIG_2KB_ROW = 0x00000001,
ADDR_CONFIG_4KB_ROW = 0x00000002,
} RowSize;
#endif

View File

@@ -1,5 +1,5 @@
/*
* Copyright © 2014 Advanced Micro Devices, Inc.
* Copyright © 2007-2018 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
@@ -61,13 +61,13 @@ ADDR_E_RETURNCODE ADDR_API AddrCreate(
{
ADDR_E_RETURNCODE returnCode = ADDR_OK;
returnCode = Lib::Create(pAddrCreateIn, pAddrCreateOut);
{
returnCode = Lib::Create(pAddrCreateIn, pAddrCreateOut);
}
return returnCode;
}
/**
****************************************************************************************************
* AddrDestroy
@@ -97,8 +97,6 @@ ADDR_E_RETURNCODE ADDR_API AddrDestroy(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// Surface functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -135,8 +133,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeSurfaceInfo(
return returnCode;
}
/**
****************************************************************************************************
* AddrComputeSurfaceAddrFromCoord
@@ -201,8 +197,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeSurfaceCoordFromAddr(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// HTile functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -304,8 +298,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeHtileCoordFromAddr(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// C-mask functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -408,8 +400,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeCmaskCoordFromAddr(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// F-mask functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -510,8 +500,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeFmaskCoordFromAddr(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// DCC key functions
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -546,8 +534,6 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeDccInfo(
return returnCode;
}
///////////////////////////////////////////////////////////////////////////////
// Below functions are element related or helper functions
///////////////////////////////////////////////////////////////////////////////
@@ -850,6 +836,34 @@ BOOL_32 ADDR_API ElemGetExportNorm(
return enabled;
}
/**
****************************************************************************************************
* ElemSize
*
* @brief
* Get bits-per-element for specified format
*
* @return
* Bits-per-element of specified format
*
****************************************************************************************************
*/
UINT_32 ADDR_API ElemSize(
ADDR_HANDLE hLib,
AddrFormat format)
{
UINT_32 bpe = 0;
Addr::Lib* pLib = Lib::GetLib(hLib);
if (pLib != NULL)
{
bpe = pLib->GetBpe(format);
}
return bpe;
}
/**
****************************************************************************************************
* AddrConvertTileInfoToHW
@@ -1105,7 +1119,6 @@ ADDR_E_RETURNCODE ADDR_API AddrGetMaxMetaAlignments(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// Surface functions for Addr2
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -1142,7 +1155,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeSurfaceInfo(
return returnCode;
}
/**
****************************************************************************************************
* Addr2ComputeSurfaceAddrFromCoord
@@ -1175,7 +1187,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeSurfaceAddrFromCoord(
return returnCode;
}
/**
****************************************************************************************************
* Addr2ComputeSurfaceCoordFromAddr
@@ -1208,8 +1219,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeSurfaceCoordFromAddr(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// HTile functions for Addr2
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -1246,7 +1255,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeHtileInfo(
return returnCode;
}
/**
****************************************************************************************************
* Addr2ComputeHtileAddrFromCoord
@@ -1279,7 +1287,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeHtileAddrFromCoord(
return returnCode;
}
/**
****************************************************************************************************
* Addr2ComputeHtileCoordFromAddr
@@ -1313,8 +1320,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeHtileCoordFromAddr(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// C-mask functions for Addr2
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -1352,7 +1357,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeCmaskInfo(
return returnCode;
}
/**
****************************************************************************************************
* Addr2ComputeCmaskAddrFromCoord
@@ -1385,7 +1389,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeCmaskAddrFromCoord(
return returnCode;
}
/**
****************************************************************************************************
* Addr2ComputeCmaskCoordFromAddr
@@ -1419,8 +1422,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeCmaskCoordFromAddr(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// F-mask functions for Addr2
////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -1457,7 +1458,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeFmaskInfo(
return returnCode;
}
/**
****************************************************************************************************
* Addr2ComputeFmaskAddrFromCoord
@@ -1490,7 +1490,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeFmaskAddrFromCoord(
return returnCode;
}
/**
****************************************************************************************************
* Addr2ComputeFmaskCoordFromAddr
@@ -1523,8 +1522,6 @@ ADDR_E_RETURNCODE ADDR_API Addr2ComputeFmaskCoordFromAddr(
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// DCC key functions for Addr2
////////////////////////////////////////////////////////////////////////////////////////////////////

View File

@@ -1,5 +1,5 @@
/*
* Copyright © 2017 Advanced Micro Devices, Inc.
* Copyright © 2017-2018 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
@@ -96,7 +96,6 @@
#define AMDGPU_RANGE_HELPER(val, min, max) ((val >= min) && (val < max))
#define AMDGPU_IN_RANGE(val, ...) AMDGPU_EXPAND_FIX(AMDGPU_RANGE_HELPER(val, __VA_ARGS__))
// ASICREV_IS(eRevisionId, revisionName)
#define ASICREV_IS(r, rn) AMDGPU_IN_RANGE(r, AMDGPU_##rn##_RANGE)
#define ASICREV_IS_TAHITI_P(r) ASICREV_IS(r, TAHITI)

View File

@@ -2,7 +2,7 @@
#define __GFX9_GB_REG_H__
/*
* Copyright © 2017 Advanced Micro Devices, Inc.
* Copyright © 2007-2018 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining

View File

@@ -2,7 +2,7 @@
#define __SI_GB_REG_H__
/*
* Copyright © 2014 Advanced Micro Devices, Inc.
* Copyright © 2007-2018 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining

View File

@@ -1,5 +1,5 @@
/*
* Copyright © 2014 Advanced Micro Devices, Inc.
* Copyright © 2007-2018 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
@@ -99,7 +99,6 @@
#define ADDR_INFO(cond, a) \
{ if (!(cond)) { ADDR_PRNT(a); } }
/// @brief Macro for reporting error warning messages
/// @ingroup util
///
@@ -118,7 +117,6 @@
ADDR_PRNT((" WARNING in file %s, line %d\n", __FILE__, __LINE__)); \
} }
/// @brief Macro for reporting fatal error conditions
/// @ingroup util
///
@@ -261,7 +259,8 @@ union ConfigFlags
UINT_32 useHtileSliceAlign : 1; ///< Do htile single slice alignment
UINT_32 allowLargeThickTile : 1; ///< Allow 64*thickness*bytesPerPixel > rowSize
UINT_32 disableLinearOpt : 1; ///< Disallow tile modes to be optimized to linear
UINT_32 reserved : 22; ///< Reserved bits for future use
UINT_32 use32bppFor422Fmt : 1; ///< View 422 formats as 32 bits per pixel element
UINT_32 reserved : 21; ///< Reserved bits for future use
};
UINT_32 value;
@@ -845,7 +844,6 @@ static inline VOID InitChannel(
pChanSet->index = index;
}
/**
****************************************************************************************************
* InitChannel

View File

@@ -1,5 +1,5 @@
/*
* Copyright © 2014 Advanced Micro Devices, Inc.
* Copyright © 2007-2018 Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
@@ -72,6 +72,7 @@ ElemLib::ElemLib(
default:
m_fp16ExportNorm = 1;
m_depthPlanarType = ADDR_DEPTH_PLANAR_R800;
break;
}
m_configFlags.value = 0;
@@ -346,7 +347,6 @@ VOID ElemLib::Int32sToPixel(
UINT_32 elemMask=0;
UINT_32 elementXor = 0; // address xor when reading bytes from elements
// @@ NOTE: assert if called on a compressed format!
if (properties.byteAligned) // Components are all byte-sized
@@ -1387,38 +1387,33 @@ UINT_32 ElemLib::GetBitsPerPixel(
case ADDR_FMT_8_8:
case ADDR_FMT_4_4_4_4:
case ADDR_FMT_16:
case ADDR_FMT_16_FLOAT:
bpp = 16;
break;
case ADDR_FMT_GB_GR: // treat as FMT_8_8
case ADDR_FMT_GB_GR:
elemMode = ADDR_PACKED_GBGR;
bpp = 16;
bpp = m_configFlags.use32bppFor422Fmt ? 32 : 16;
expandX = m_configFlags.use32bppFor422Fmt ? 2 : 1;
break;
case ADDR_FMT_BG_RG: // treat as FMT_8_8
case ADDR_FMT_BG_RG:
elemMode = ADDR_PACKED_BGRG;
bpp = 16;
bpp = m_configFlags.use32bppFor422Fmt ? 32 : 16;
expandX = m_configFlags.use32bppFor422Fmt ? 2 : 1;
break;
case ADDR_FMT_8_8_8_8:
case ADDR_FMT_2_10_10_10:
case ADDR_FMT_10_11_11:
case ADDR_FMT_11_11_10:
case ADDR_FMT_16_16:
case ADDR_FMT_16_16_FLOAT:
case ADDR_FMT_32:
case ADDR_FMT_32_FLOAT:
case ADDR_FMT_24_8:
case ADDR_FMT_24_8_FLOAT:
bpp = 32;
break;
case ADDR_FMT_16_16_16_16:
case ADDR_FMT_16_16_16_16_FLOAT:
case ADDR_FMT_32_32:
case ADDR_FMT_32_32_FLOAT:
case ADDR_FMT_CTX1:
bpp = 64;
break;
case ADDR_FMT_32_32_32_32:
case ADDR_FMT_32_32_32_32_FLOAT:
bpp = 128;
break;
case ADDR_FMT_INVALID:
@@ -1444,10 +1439,7 @@ UINT_32 ElemLib::GetBitsPerPixel(
case ADDR_FMT_32_AS_8:
case ADDR_FMT_32_AS_8_8:
case ADDR_FMT_8_24:
case ADDR_FMT_8_24_FLOAT:
case ADDR_FMT_10_10_10_2:
case ADDR_FMT_10_11_11_FLOAT:
case ADDR_FMT_11_11_10_FLOAT:
case ADDR_FMT_5_9_9_9_SHAREDEXP:
bpp = 32;
break;
@@ -1461,12 +1453,10 @@ UINT_32 ElemLib::GetBitsPerPixel(
expandX = 3;
break;
case ADDR_FMT_16_16_16:
case ADDR_FMT_16_16_16_FLOAT:
elemMode = ADDR_EXPANDED;
bpp = 48;//@@ 16; // read 3 elements per pixel
expandX = 3;
break;
case ADDR_FMT_32_32_32_FLOAT:
case ADDR_FMT_32_32_32:
elemMode = ADDR_EXPANDED;
expandX = 3;
@@ -1755,7 +1745,6 @@ BOOL_32 ElemLib::IsBlockCompressed(
((format >= ADDR_FMT_ASTC_4x4) && (format <= ADDR_FMT_ETC2_128BPP)));
}
/**
****************************************************************************************************
* ElemLib::IsCompressed
@@ -1797,9 +1786,7 @@ BOOL_32 ElemLib::IsExpand3x(
{
case ADDR_FMT_8_8_8:
case ADDR_FMT_16_16_16:
case ADDR_FMT_16_16_16_FLOAT:
case ADDR_FMT_32_32_32:
case ADDR_FMT_32_32_32_FLOAT:
is3x = TRUE;
break;
default:

Some files were not shown because too many files have changed in this diff Show More