Compare commits

...

5248 Commits
17.3 ... 18.1

Author SHA1 Message Date
Dylan Baker
f57f37f3ba docs: Add 18.1.9 release notes 2018-09-24 08:43:25 -07:00
Dylan Baker
634f4f98ac Bump version to 18.1.9 2018-09-21 09:02:55 -07:00
Bas Nieuwenhuizen
a76c43fc19 radv: Fix driver UUID SHA1 init.
Was missing the init, found by Emil.

Fixes: d17443a459 "radv: Use build ID if available for cache UUID."
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 0a77e70d10)
2018-09-20 14:53:59 -07:00
Dylan Baker
6819121e67 cherry-ignore: one final update 2018-09-20 13:30:54 -07:00
Mauro Rossi
309945ec49 android: broadcom/cle: export the broadcom top level path headers
Fixes the following building error in vc4 build:

In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34:
In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56:
out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10:
fatal error: 'cle/v3d_packet_helpers.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2018-09-20 13:29:41 -07:00
Mauro Rossi
9b850150c0 android: broadcom/cle: add gallium include path
Fixes the following building error:

In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38:
In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29:
external/mesa/src/gallium/auxiliary/util/u_math.h:42:10:
fatal error: 'pipe/p_compiler.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2018-09-20 13:29:40 -07:00
Mauro Rossi
80d2139e4a android: broadcom/genxml: fix collision with intel/genxml header-gen macro
Backport to mesa 18.1
Fixes the following building error, happening when building both intel and broadcom:

Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h
FAILED: out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h
/bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \
external/mesa/src/broadcom/cle/v3d_packet_v21.xml \
> out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h"
Traceback (most recent call last):
  File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module>
    p = Parser(sys.argv[2])
IndexError: list index out of range

header-gen macro is already defined by Intel genxml building rules
and the existing header-gen does not have the $(PRIVATE_VER) argument,
infact the bash command line logged in the building error is missing
exactly $(PRIVATE_VER) argument

Renaming the macro as pack-header-gen in src/broadcom/Android.genxml.mk
solves the building error, another possible way is to keep the gen rules
commands expanded and not use the macros.

Fixes: 7f80a9ff13 ("vc4: Introduce XML-based packet header generation like Intel's.")
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2018-09-20 13:29:39 -07:00
Michal Srb
0be61a9f17 st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it
This is equivalent to commit a65db0ad1c, but for dri_kms_init_screen. Without
this gbm_dri_is_format_supported always returns false.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104926
Fixes: e14fe41e0b ("st/dri: implement createImageFromRenderbuffer(2)")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Tested-by: Adam Williamson <adamwill@fedoraproject.org>
(cherry picked from commit 194bf0a2e0)
2018-09-20 08:51:22 -07:00
Bas Nieuwenhuizen
59c708ff8a radv: Set the user SGPR MSB for Vega.
Otherwise using 32 user SGPRs would be broken.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit d97c892584)
Conflicts resolved by Dylan

Conflicts:
	src/amd/vulkan/radv_shader.c
2018-09-18 16:43:55 -07:00
Bas Nieuwenhuizen
b38445b218 radv: Only allow 16 user SGPRs for compute on GFX9+.
Apparently for compute there are only 16 instead of the 32 for the
graphics path.

Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.0

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 0dd8189f15)
Conflicts Resolved by Dylan

Conflicts:
	src/amd/vulkan/radv_nir_to_llvm.c
2018-09-18 16:40:19 -07:00
Dylan Baker
e402a7efa5 cherry-ignore: add a patch that was reverted on master 2018-09-18 16:37:34 -07:00
Jason Ekstrand
551b07abba anv/query: Write both dwords in emit_zero_queries
Each query slot is a uint64_t and we were only zeroing half of it.

Fixes: 7ec6e4e689 "anv/query: implement multiview interactions"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 07e214f1ce)
2018-09-18 16:31:51 -07:00
Kenneth Feng
3366da58f4 amd: Add Picasso device id
No changes here compared to Raven.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4490fce166)
2018-09-18 16:31:32 -07:00
Bas Nieuwenhuizen
da9fc5c189 radv: Use build ID if available for cache UUID.
To get an useful UUID for systems that have a non-useful mtime
for the binaries.

I started using SHA1 to ensure we get reasonable mixing in the
various possibilities and the various build id lengths.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit d17443a459)
2018-09-17 13:32:39 -07:00
Josh Pieper
3a34419cb2 st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)
When using Freecad, I was getting intermittent segfaults inside of
mesa.  I traced it down to this path in st_cb_drawpixels.c where the
result of pipe_transfer_map wasn't being checked.  In my case, it was
returning NULL because nouveau_bo_new returned ENOENT.  I'm by no
means a mesa developer, but this patch solves the problem for me and
seems reasonable enough.

v2: Marek - also unmap the PBO and release the texture, and call
    the make_texture function sooner for less cleanup

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 936e0dcd61)
2018-09-17 13:31:51 -07:00
Pierre Moreau
649aff1a87 nvir: Always split 64-bit IMAD/IMUL operations
Those operations do not map to actual hardware instructions, therefore
those should always be lowered to 32-bit instructions.

Fixes: 009c54aa7a "nv50/ir: Split 64-bit integer MAD/MUL operations"

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 21b92b3464)
Conflicts resolved by Dylan

Conflicts:
	src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
2018-09-14 09:12:08 -07:00
Erik Faye-Lund
cf54d5f47c virgl: adjust strides when mapping temp-resources
When we're mapping temp-resources, we clip the resource to the
transfer-box, which means the stride might not be correct any more.

So let's update the stride from the temp-resource, and recompute the
layer-stride.

This fixes crashes when running dEQP with --deqp-gl-config-name=rgba8888d24s8ms4

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: a8987b88ff "virgl: add driver for virtio-gpu 3D (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit fa5e9f1f73)
2018-09-14 09:09:32 -07:00
Dylan Baker
33f5a21c1f cherry-ignore: add 18.2 patchs 2018-09-14 09:09:14 -07:00
Erik Faye-Lund
1816f280c4 winsys/virgl: avoid unintended behavior
If we end up never taking the loop that writes ret, we can end up with
an uninitialized value, and if we're *really* unlucky, that value can
be -1, causing us to go down an error-path instead of a success path.

This was obviously not intended, so let's just initialize this to zero.

Noticed by Valgrind:

Conditional jump or move depends on uninitialised value(s)
   at 0xBA640A0: virgl_drm_winsys_resource_cache_create (virgl_drm_winsys.c:348)
   by 0xBA62FCF: virgl_buffer_create (virgl_buffer.c:170)
   by 0xBA605AC: virgl_resource_create (virgl_resource.c:60)
   by 0xBCF816F: bufferobj_data (st_cb_bufferobjects.c:344)
   by 0xBCF816F: st_bufferobj_data (st_cb_bufferobjects.c:390)
   by 0xBB7E836: vbo_use_buffer_objects (vbo_exec_api.c:1136)
   by 0xBCFCC6E: st_create_context_priv (st_context.c:414)
   by 0xBCFD3CD: st_create_context (st_context.c:590)
   by 0xBBB30CA: st_api_create_context (st_manager.c:896)
   by 0xB981E76: dri_create_context (dri_context.c:155)
   by 0xB97BDCE: driCreateContextAttribs (dri_util.c:473)
   by 0x5288331: dri3_create_context_attribs (dri3_glx.c:309)
   by 0x5264D64: glXCreateContextAttribsARB (create_context.c:78)

Fixes: a8987b88ff ("virgl: add driver for virtio-gpu 3D (v2)")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit eaa718588e)
2018-09-12 08:46:47 -07:00
Michel Dänzer
e89a4589c0 loader/dri3: Only wait for back buffer fences in dri3_get_buffer
We don't need to wait before drawing to the fake front buffer, as front
buffer rendering by definition is allowed to produce artifacts.

Fixes hangs in some cases when re-using the fake front buffer, due to it
still being busy (i.e. in use for presentation).

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/106404
Bugzilla: https://bugs.freedesktop.org/107757
Tested-by: Olivier Fourdan <ofourdan@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit aefac10fec)
2018-09-12 08:46:26 -07:00
Dylan Baker
eb6abe40f0 cherry-ignore: Add more 18.2 patches 2018-09-11 08:19:39 -07:00
Christopher Egert
3a23ba5acb radeon: fix ColorMask
Since commit af3685d149 various OpenGL applications regressed
on the classic mesa radeon driver.

Signed-off-by: Christopher Egert <cme3000@gmail.com>
CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 51995f6920)
2018-09-11 08:19:02 -07:00
Marek Olšák
09196d4b66 radeonsi: fix printing a BO list into ddebug reports
important for debugging

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 662db03577)
2018-09-11 08:18:28 -07:00
Marek Olšák
517c2f1e0f r600: fix HTILE for NPOT textures with mipmapping
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit da72b6296c)
2018-09-11 08:18:16 -07:00
Marek Olšák
aeb8b00661 radeonsi: fix HTILE for NPOT textures with mipmapping on SI/CI
VI uses addrlib so it's unaffected.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit a1b9a00f82)
Conflicts resolved by Dylan

Conflicts:
	src/gallium/drivers/radeonsi/si_texture.c
2018-09-11 08:17:31 -07:00
Dave Airlie
4670daa12d virgl: don't send a shader create with no data. (v2)
This fixes the situation where we'd send a shader with just the
header and no data.

piglit/glsl-max-varyings test was causing this to happen, and
the renderer fix was breaking it.

v2: drop fprintf

Fixes: a8987b88ff "virgl: add driver for virtio-gpu 3D (v2)"
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 240af61494)
2018-09-10 10:13:34 -07:00
Jason Ekstrand
05985a643d anv: Clamp scissors to the framebuffer boundary
The Vulkan 1.1.81 spec says:

    "It is legal for offset.x + extent.width or offset.y + extent.height
    to exceed the dimensions of the framebuffer - the scissor test still
    applies as defined above. Rasterization does not produce fragments
    outside of the framebuffer, so such fragments never have the scissor
    test performed on them."

Elsewhere, the Vulkan 1.1.81 spec says:

    "The application must ensure (using scissor if necessary) that all
    rendering is contained within the render area, otherwise the pixels
    outside of the render area become undefined and shader side effects
    may occur for fragments outside the render area. The render area
    must be contained within the framebuffer dimensions."

Unfortunately, there's some room for interpretation here as to what the
consequences are of having the render area set to exactly the
framebuffer dimensions and having a scissor that is larger than the
framebuffer.  Given that GL and other APIs provide automatic clipping to
the framebuffer, it makes sense that applications would assume that
Vulkan does this as well.  It costs us very little to play it safe and
just clamp client-provided scissors to the framebuffer dimensions.
Fortunately, the user is required to provide us with at least one
scissor so we don't need to handle the case where they don't.

Fixes: fb2a5ceb32 "anv: Emit DRAWING_RECTANGLE once at driver..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 465e5a868c)
2018-09-10 10:13:27 -07:00
Jason Ekstrand
3bfed07f09 anv: Disable the vertex cache when tessellating on SKL GT4
I have no idea if I'm correct about what's going wrong or if this is the
correct fix.  However, in my multiple weeks of banging my head on this
hang, a VUE reference counting bug seems to match all the symptoms and
it definitely fixes the hang.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107280
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b08b4b2b25)
2018-09-10 10:12:20 -07:00
Jason Ekstrand
9f08ea8c0a anv: Re-emit vertex buffers when the pipeline changes
Some of the bits of VERTEX_BUFFER_STATE such as access type, instance
data step rate, and pitch come from the pipeline.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c643c5e18d)
2018-09-10 10:12:09 -07:00
Jason Ekstrand
08e3909c6c i965: Workaround the gen9 hw astc5x5 sampler bug
gen9 hardware has a bug in the sampler cache that can cause GPU hangs
whenever an texture with aux compression enabled is in the sampler cache
together with an ASTC5x5 texture.  Because we can't control what the
client binds at any given time, we have two options: resolve the CCS or
decompresss the ASTC.  Doing a CCS or HiZ resolve is far less drastic
and will likely have a smaller performance impact.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit f9e630e23d)
2018-09-10 10:12:04 -07:00
Dylan Baker
99ecc5fc20 cherry-ignore: Add patches that don't apply cleanly and are for developer tools 2018-09-07 10:16:55 -07:00
Andrii Simiklit
ea7df5b7c0 mesa/util: add missing va_end() after va_copy()
MSDN:
"va_end must be called on each argument list that's initialized
 with va_start or va_copy before the function returns."

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107810
Fixes: c6267ebd6c "gallium/util: Stop bundling our snprintf implementation."
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
(cherry picked from commit 2930b76cfe)
2018-09-07 10:16:55 -07:00
Andrii Simiklit
a17aed452d mesa/util: don't ignore NULL returned from 'malloc'
We should exit from the function 'util_vasprintf'
with error code -1 for case where 'malloc'
returns NULL

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 864148d69e "util: add util_vasprintf() for Windows (v2)"
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
(cherry picked from commit 65cfe698b0)
2018-09-07 10:16:55 -07:00
Andrii Simiklit
4a44ff8bad mesa/util: don't use the same 'va_list' instance twice
The first usage of the 'va_list' instance could change it.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 864148d69e "util: add util_vasprintf() for Windows (v2)"
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
(cherry picked from commit 570cacba7a)
2018-09-07 10:16:55 -07:00
Andrii Simiklit
019ff6b453 apple/glx/log: added missing va_end() after va_copy()
Each invocation of va_copy() must be matched by a
corresponding invocation of va_end()

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 51691f0767 "darwin: Use ASL for logging"
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
(cherry picked from commit 267ed29288)
2018-09-07 10:16:55 -07:00
Dylan Baker
b14c2b467d cherry-ignore: add another 18.2 patch 2018-09-07 10:16:55 -07:00
Sergii Romantsov
b5e03decbb intel: compiler option msse2 and mstackrealign
Seems in case of 32-bit library, usage of msse2 makes
some stack corruption or incorrect instructions.
Usage with mstackrealign fixes that case.

v2: Fixed meson.

v3: Definition of c_sse2_args moved on the top (L.Landwerlin).
    Added mstackrealign for Android's mks where msee4.1 is used.

v4: Added for Vulkan also.

v5: Commit message correction.

CC: <mesa-stable@lists.freedesktop.org>
Fixes: 6b05c080f2 (i965: Compile with -msse3)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107779
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit d709f12792)
2018-09-07 10:16:55 -07:00
Jason Ekstrand
6f43390dd2 anv/pipeline: Only consider double elements which actually exist
The brw_vs_prog_data::double_inputs_read field comes directly from
shader_info::double_inputs which may contain inputs which are not
actually read.  Instead of using it directly, AND it with inputs_read
which is only things which are read.  Otherwise, we may end up
subtracting too many elements when computing elem_count.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103241
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7b26741806)
2018-09-07 10:16:55 -07:00
Timothy Arceri
67cfeb1686 glsl: fixer lexer for unreachable defines
If we have something like:

   #ifdef NOT_DEFINED
   #define A_MACRO(x) \
	if (x)
   #endif

The # on the #define is not skipped but the define itself is so
this then gets recognised as #if.

Until 28a3731e3f this didn't happen because we ended up in
<HASH>{NONSPACE} where BEGIN INITIAL was called stopping the
problem from happening.

This change makes sure we never call RETURN_TOKEN_NEVER_SKIP for
if/else/endif when processing a define.

Cc: Ian Romanick <idr@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107772
2018-09-07 10:16:55 -07:00
Mathias Fröhlich
06579faa9d tnl: Fix green gun regression in xonotic.
Fix an other regression of
mesa: Make gl_vertex_array contain pointers to first order VAO members.
The regression showed up with drivers using the tnl module and
was reproducible using xonotic-glx -benchmark demos/the-big-keybench.dem.

Fixes: 64d2a20480
    mesa: Make gl_vertex_array contain pointers to first order VAO members.
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
(cherry picked from commit a6232b6932)
Conflicts resolved by Dylan

Conflicts:
	src/mesa/tnl/t_split_copy.c
2018-09-07 10:16:55 -07:00
Dylan Baker
527813b222 meson: Print a message about why a libdrm version was selected
We require a single version of libdrm for all of our libdrm
dependencies (core and driver), but the way this is structured can make
the error message less than helpful, as one driver might be the one
setting the libdrm requirement, while another might be the one that
generates the version failure.

This adds a simple message to the output announcing which libdrm module
set the version, which might be more helpful.

v2: - Use message suggested by Eric Engstrom

Fixes: c445b1d56f
       ("meson: Use the same version for all libdrm checks")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit d25a27ec56)
2018-09-07 10:16:55 -07:00
Gert Wollny
37fd626925 winsys/virgl: correct resource and handle allocation (v2)
Fixes crash with
  piglit/bin/map_buffer_range-invalidate CopyBufferSubData \
                               increment-offset -auto -fbo

* Resize the resource storage already when the count is equal to the
  allocated size, fixes:

  Invalid write of size 8
  at 0xB72E4CF: virgl_drm_add_res (virgl_drm_winsys.c:629)
  by 0xB72E4CF: virgl_drm_emit_res (virgl_drm_winsys.c:663)
  by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776)
  by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585)
  by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940)
  by 0x109A1E: upload (invalidate.c:169)
  by 0x109C2F: piglit_display (invalidate.c:215)
  by 0x4F80FBE: run_test (piglit_fbo_framework.c:52)
  by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229)
  by 0x10949D: main (invalidate.c:47)
  Address 0xbe07d30 is 0 bytes after a block of size 4,096 alloc'd
  at 0x4C31B25: calloc (in
       /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  by 0xB72DAAF: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:567)

* Also resize the space allocated for the handles, fixes:

  Invalid write of size 4
  at 0xB72E4F0: virgl_drm_add_res (virgl_drm_winsys.c:631)
  by 0xB72E4F0: virgl_drm_emit_res (virgl_drm_winsys.c:663)
  by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776)
  by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585)
  by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940)
  by 0x109A1E: upload (invalidate.c:169)
  by 0x109C2F: piglit_display (invalidate.c:215)
  by 0x4F80FBE: run_test (piglit_fbo_framework.c:52)
  by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229)
  by 0x10949D: main (invalidate.c:47)
  Address 0xbe08570 is 0 bytes after a block of size 2,048 alloc'd
  at 0x4C2FB0F: malloc (
    in /usr/lib/valgrind/vgpreload_memcheck-amd64- linux.so)
  by 0xB72DAC8: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:572)

Fixes: 4b15b5e803 ("virgl: resize resource bo allocation if we need to.")

v2: - Use REALLOC macro and avoid memory leak when re-allocation fails
    - add Fixes tag (both Emil Velikov)
    - reorder commit message

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit 9b0e8d8723)
2018-09-07 10:16:55 -07:00
Dylan Baker
2620eb8e06 cherry-ignore: Add additional 18.2 patch 2018-09-07 10:16:55 -07:00
Marek Olšák
a985f18ef4 st/mesa: help fix stencil border color for GL_DEPTH_STENCIL textures
GL_STENCIL_INDEX uses GL_INTENSITY for the border color, which is nicer
to hardware that doesn't read the stencil border value from the X channel.

This fixes a bunch of dEQP tests on Vega & Raven.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 725e8ad559)
2018-09-07 10:16:55 -07:00
Dylan Baker
ad47840009 docs/relnotes: Add sha256 sums for mesa 18.1.8 2018-09-07 10:10:13 -07:00
Dylan Baker
92497d659b docs: Add release notes for 18.1.8 2018-09-07 08:27:26 -07:00
Dylan Baker
9743fd241b Bump version to 18.1.8 2018-09-07 08:22:13 -07:00
Juan A. Suarez Romero
01967a97bf egl/wayland: do not leak wl_buffer when it is locked
If color buffer is locked, do not set its wayland buffer to NULL;
otherwise it can not be freed later.

Rather, flag it in order to destroy it later on the release event.

v2: instruct release event to unlock only or free wl_buffer too (Daniel)

This also fixes dEQP-EGL.functional.swap_buffers_with_damage.* tests.

CC: Daniel Stone <daniel@fooishbar.org>
2018-09-04 14:24:02 -07:00
Dylan Baker
99082b93b6 cherry-ignore: Add patch that needs more significant patches to function
In this case the patch is fine for 18.2 as the required patches are
already present, but they're not in 18.1 and they're too big to be
pulled back.
2018-09-04 09:00:11 -07:00
Dylan Baker
9622c17ec0 cherry-ignore: Add a couple of two fixes warning patches 2018-09-04 08:59:29 -07:00
Christian Gmeiner
88bbbe8503 tegra: fix memory leak
Fixes: 1755f608f5 ("tegra: Initial support")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d0b09e2dfe)
2018-09-04 08:59:29 -07:00
Daniel Stone
104c598bfb st/dri: Don't expose sRGB formats to clients
Though the SARGB8888 format is used internally through its FourCC value,
it is not a real format as defined by drm_fourcc.h; it cannot be used
with KMS or other interfaces expecting drm_fourcc.h format codes.

Ensure we don't advertise it through the dmabuf format/modifier query
interfaces, preventing us from tripping over an assert.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 8c1b9882b2 ("egl/dri2: Guard against invalid fourcc formats")
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 01c0aa9f05)
2018-09-04 08:59:29 -07:00
Samuel Pitoiset
1ec6ba931b radv: fix passing clip/cull distances from VS to PS
CTS doesn't test input clip/cull distances for the fragment
shader stage, which explains why this was totally broken. I
wrote a simple test locally that works now.

This fixes a crash with GTA V and DXVK.

Note that we are exporting unused parameters from the vertex
shader now, but this can't be optimized easily because we don't
keep the fragment shader info...

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107477
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 6f47df3129)
Very minor conflicts resolved by Dylan

Conflicts:
	src/amd/vulkan/radv_nir_to_llvm.c
2018-09-04 08:59:29 -07:00
Nanley Chery
b0838037b4 i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9
According to internal docs, some gen9 platforms have a pixel shader push
constant synchronization issue. Although not listed among said
platforms, this issue seems to be present on the GeminiLake 2x6's we've
tested.

We consider the available workarounds to be too detrimental on
performance. Instead, we mitigate the issue by applying part of one of
the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
(as suggested by Ken).

Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
following options:
* 6 depth_draw small depthstencil
* 8 stencil_draw small depthstencil
* 6 stencil_draw small depthstencil
* 8 depth_resolve small
* 6 stencil_resolve small depthstencil
* 4 stencil_draw small depthstencil
* 16 stencil_draw small depthstencil
* 16 depth_draw small depthstencil
* 2 stencil_resolve small depthstencil
* 6 stencil_draw small
* all_samples stencil_draw small
* 2 depth_draw small depthstencil
* all_samples depth_draw small depthstencil
* all_samples stencil_resolve small
* 4 depth_draw small depthstencil
* all_samples depth_draw small
* all_samples stencil_draw small depthstencil
* 4 stencil_resolve small depthstencil
* 4 depth_resolve small depthstencil
* all_samples stencil_resolve small depthstencil

v2: Include more platforms in WA (Ken).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93355
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 904c2a617d)
2018-09-04 08:59:29 -07:00
Ian Romanick
a144518f34 i965/vec4: Correctly handle uniform sources in generate_tes_add_indirect_urb_offset
Fixes failure in the new piglit test
tes-patch-input-array-vec2-index-invalid-rd.shader_test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 75666605c9)
2018-09-04 08:59:29 -07:00
Ian Romanick
de445b36ba i965/vec4: Clamp indirect tes input array reads with 0x0fffffff
Page 190 of "Volume 7: 3D Media GPGPU Engine (Haswell)" says the valid
range of the offset is [0, 0FFFFFFFh].

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 82530ce1b5)
2018-09-04 08:59:29 -07:00
Jason Ekstrand
e5a2bf8f94 anv/blorp: Do more flushing around HiZ clears
We make the flush after a HiZ clear unconditional and add a flush/stall
before the clear as well.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 62378c5e9e)
2018-09-04 08:59:29 -07:00
Bas Nieuwenhuizen
1063fbe6cc radv: Use a lower max offchip buffer count.
No clue what gets fixed by this but both radeonsi and amdvlk do it.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ab64891f4c)
2018-09-04 08:50:14 -07:00
Bas Nieuwenhuizen
5872b1522a radv: Fix CMASK dimensions.
Mirrors

1e40f69483 "ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VI"

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 233718a199)
2018-09-04 08:50:09 -07:00
Jason Ekstrand
f58f2f6754 egl/dri2: Guard against invalid fourcc formats
We already reject attempts to import images with invalid fourcc formats
but don't really guard the queries all that well.  This makes us error
out in any calls to eglQueryDmaBufModifiersEXT if the given format is
not a valid fourcc format.  We also add an assert to ensure that drivers
don't advertise any non-fourcc formats.

Cc: mesa-stable@lists.freedesktop.org
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 8c1b9882b2)
2018-08-31 08:36:56 -07:00
Jason Ekstrand
74a671d019 egl/dri2: Add a helper for the number of planes for a FOURCC format
This also serves as a convenient "is this a fourcc format" check as well
which we'll take advantage of in the next commit.

Cc: mesa-stable@lists.freedesktop.org
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit b95896f492)
2018-08-31 08:36:56 -07:00
Dave Airlie
6a854c5620 ac/radeonsi: fix CIK copy max size
While adding transfer queues to radv, I started writing some tests,
the first test I wrote fell over copying a buffer larger than this
limit.

Checked AMDVLK and found the correct limit.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2c1f249f2b)
2018-08-31 08:36:07 -07:00
Jason Ekstrand
16d44d1e98 nir/algebraic: Be more careful converting ushr to extract_u8/16
If it's not the right bit-size, it may not actually be the correct
extraction.  For now, we'll only worry about 32-bit versions.

Fixes: 905ff86198 "nir: Recognize open-coded extract_u16"
Fixes: 76289fbfa8 "nir: Recognize open-coded extract_u8"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 116b47fe3c)
2018-08-30 08:32:47 -07:00
Bas Nieuwenhuizen
8c3aee4038 radv: Add missing checks in radv_get_image_format_properties.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4738b6ac81)
2018-08-30 08:31:41 -07:00
Dylan Baker
229d0854e0 cherry-ignore: Add patch that doesn't apply to 18.1 2018-08-30 08:31:12 -07:00
Andrii Simiklit
8aba522f49 i965/gen6/xfb: handle case where transform feedback is not active
When the SVBI Payload Enable is false I guess the register R1.4
which contains the Maximum Streamed Vertex Buffer Index is filled by zero
and GS stops to write transform feedback when the transform feedback
is not active.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107579
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
(cherry picked from commit 1b0df8a460)
2018-08-30 08:28:36 -07:00
Lionel Landwerlin
7a9b95bd7e anv: blorp: support multiple aspect blits
Newer blit tests are enabling depth&stencils blits. We currently don't
support it but can do by iterating over the aspects masks (copy some
logic from the CopyImage function).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9f44745eca ("anv: Use blorp to implement VkBlitImage")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 5a1c23d150)
2018-08-30 08:25:54 -07:00
Dylan Baker
d7065dc401 cherry-ignore: Add additional patch 2018-08-29 08:30:29 -07:00
Dylan Baker
e7eb574075 cherry-ignore: Add more 18.2 patches 2018-08-29 08:23:55 -07:00
vadym.shovkoplias
bdc842a092 glsl/linker: Allow unused in blocks which are not declated on previous stage
>From Section 4.3.4 (Inputs) of the GLSL 1.50 spec:

    "Only the input variables that are actually read need to be written
     by the previous stage; it is allowed to have superfluous
     declarations of input variables."

Fixes:
    * interstage-multiple-shader-objects.shader_test

v2:
  Update comment in ir.h since the usage of "used" field
  has been extended.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 4a8444d5bc)
2018-08-28 09:02:44 -07:00
Dylan Baker
850e1259e1 meson: Actually load translation files
Currently we run the script but don't actually load any files, even in a
tarball where they exist.

Fixes: 3218056e0e
       ("meson: Build i965 and dri stack")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 7c00db9527)
2018-08-28 08:59:23 -07:00
Dylan Baker
cd10b797c4 cherry-ignore: Add more 18.2 only patches 2018-08-28 08:58:41 -07:00
Jason Ekstrand
5163f96905 anv: Fill holes in the VF VUE to zero
This fixes a GPU hang in DOOM 2016 running under wine.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104809
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 76b0e4d8c9)
2018-08-27 08:56:28 -07:00
Marek Olšák
9517387144 glapi: actually implement GL_EXT_robustness for GLES
The extension was exposed but not the functions.

This fixes:
    dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.readn_pixels
    dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformfv
    dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformiv

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 37eee90df7)
2018-08-24 09:33:23 -07:00
Dylan Baker
280be5067d cherry-ignore: add a patch 2018-08-24 09:33:23 -07:00
Emil Velikov
745be98a7d docs: update required mako version
The requirement was bumped a while back, but we forgot to update the
docs.

Fixes: ed871af91c ("configure.ac: raise Mako required version to
0.8.0")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit e39b916d0c)
2018-08-24 09:33:23 -07:00
Gurchetan Singh
d180c8d785 meson: fix egl build for android
Haven't tested this, but we do include loader.h
in platform_android.c

Fixes: c5ec155685 ("meson: wire up egl/android")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit c731508b98)
2018-08-24 09:33:23 -07:00
Gurchetan Singh
529eda0aad meson: fix egl build for surfaceless
Without this, I get:

 > platform_surfaceless.c:38:10: fatal error: 'loader.h' file not found
 > #include "loader.h"
 >      ^~~~~~~~~~
 > 1 error generated.

Fixes: 108d257a16 ("meson: build libEGL")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

v2: Split up patches, modify commit message (Dylan)
(cherry picked from commit ec6cb01e21)
2018-08-24 09:33:23 -07:00
Nanley Chery
fc1a21da40 i965/miptree: Fix can_blit_slice()
Check the destination's row pitch against the BLT engine's row pitch
limitation as well.

Fixes: 0288fe8d04
("i965/miptree: Use the correct BLT pitch")

v2: Fix the Fixes tag (Dylan).
    Check the destination row pitch (Chris).

Reported-by: Dylan Baker <dylan@pnwbakers.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b041fc0649)
2018-08-24 09:33:23 -07:00
Nanley Chery
af3c8a4c01 i965/miptree: Use miptree_map in map_blit functions
This struct contains all the data of interest. can_blit_slice() will use
it in the next patch to calculate the correct pitch.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 030b6efcfd)
2018-08-24 09:33:23 -07:00
Nanley Chery
a9dbc4b3f4 i965/miptree: Use the correct BLT pitch
Retile miptrees to a linear tiling less often. Retiling can cause issues
with imported BOs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106738
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 0288fe8d04)
2018-08-24 09:33:23 -07:00
Nanley Chery
32331a3efc i965/miptree: Drop an if case from retile_as_linear
Drop an if statement whose predicate never evaluates to true. row_pitch
belongs to a surface with non-linear tiling. According to
isl_calc_tiled_min_row_pitch, the pitch is a multiple of the tile width.
By looking at isl_tiling_get_info, we see that non-linear tilings have
widths greater than or equal to 128B.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 3df201e3e8)
2018-08-24 09:33:23 -07:00
Nanley Chery
b9c903b406 i965: Make blt_pitch public
We'd like to reuse this helper.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 0ab2541943)
2018-08-24 09:33:23 -07:00
Nanley Chery
b727e1ff10 intel/isl: Avoid tiling some 16K-wide render targets
Fix rendering issues on BDW and SKL.

Fixes: 0288fe8d04
("i965/miptree: Use the correct BLT pitch")

Fixes the following regressions seen

exclusively on SKL:
* KHR-GL46.texture_barrier_ARB.disjoint-texels
* KHR-GL46.texture_barrier_ARB.overlapping-texels
* KHR-GL46.texture_barrier.disjoint-texels
* KHR-GL46.texture_barrier.overlapping-texels

and both on BDW and SKL:
* GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners
* GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners

v2: Note the fixed tests (Andres).
    Don't cause failures with multisampled buffers (Andres).
    Don't hamper SKL GT4 (Ken).
v3: Fix the Fixes tag (Dylan).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6d80b0b4ba)
2018-08-24 09:33:23 -07:00
Grazvydas Ignotas
e6fdf98a79 radv: place pointer length into cache uuid
Thanks to reproducible builds, binary file timestamps may be identical
for both 32bit and 64bit packages when built from the same source.
This means radv will use the same cache for both 32 and 64 bit
processes, which leads to crashes.

Conveniently there is a spare byte in cache_uuid, let's place the
pointer size there.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107601
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105904
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 356f6673d6)
2018-08-24 09:33:23 -07:00
Dylan Baker
4a844b54d0 docs: Add mesa 18.1.7 notes 2018-08-24 09:29:07 -07:00
Dylan Baker
ab384a10ce docs: Add mesa 18.1.7 docs 2018-08-23 09:39:20 -07:00
Dylan Baker
9c5b4ca2f3 bump version for 18.1.7 release 2018-08-23 09:34:54 -07:00
Dave Airlie
97ecabee2d r600/eg: rework atomic counter emission with flushes
With the current code, we didn't do the space checks prior
to atomic counter setup emission, but we also didn't add
atomic counters to the space check so we could get a flush
later as well.

These flushes would be bad, and lead to problems with
parallel tests. We have to ensure the atomic counter copy in,
draw emits and counter copy out are kept in the same command
submission unit.

This reworks the code to drop some useless masks, make the
counting separate to the emits, and make the space checker
handle atomic counter space.

[airlied: want this in 18.2]

Fixes: 06993e4ee (r600: add support for hw atomic counters. (v3))
(cherry picked from commit 32529e6084)
2018-08-22 08:46:19 -07:00
Dylan Baker
cafb53fe4d cherry-ignore: more 18.2 patches 2018-08-22 08:46:00 -07:00
Danylo Piliaiev
3268ca90f0 i965: Advertise 8 bits subpixel precision for viewport bounds on gen6+
We use floating-points for viewport bounds so VIEWPORT_SUBPIXEL_BITS
should reflect this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105975

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 25ec806eb2)
2018-08-21 08:25:00 -07:00
Dylan Baker
fb927b9a64 cherry-ignore: Add a couple of patches with > 1 fixes tags 2018-08-21 08:23:16 -07:00
Jason Ekstrand
5a90cc132b anv/lower_ycbcr: Use the binding array size for bounds checks
Because lower_ycbcr gets called before apply_pipeline_layout, the
indices are all logical and the binding layout HW size is actually too
big for the bounds check.  We should just use the regular logical array
size instead.

Fixes: f3e91e78a3 "anv: add nir lowering pass for ycbcr textures"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 320dacb0a0)
2018-08-20 09:11:10 -07:00
Ray Strode
33acfb0780 gallium/winsys/kms: don't unmap what wasn't mapped
At the moment, depending on pipe transfer flags, the dumb
buffer map address can end up at either kms_sw_dt->ro_mapped
or kms_sw_dt->mapped.

When it's time to unmap the dumb buffer, both locations get unmapped,
even though one is probably initialized to 0.

That leads to the code segment getting unmapped at runtime and
crashes when trying to call into unrelated code.

This commit addresses the problem by using MAP_FAILED instead of
NULL for ro_mapped and mapped when the dumb buffer is unmapped,
and only unmapping mapped addresses at unmap time.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107098
Signed-off-by: Ray Strode <rstrode@redhat.com>
Fixes: d891f28df9 ("gallium/winsys/kms: Fix possible leak in map/unmap.")
Cc: Lepton Wu <lepton@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9baff597ce)
2018-08-20 08:26:14 -07:00
Dylan Baker
7b7fbee00a cherry-ignore: Add more 18.2 patches 2018-08-20 08:25:50 -07:00
Samuel Pitoiset
fe63541a70 radv/winsys: fix creating the BO list for virtual buffers
When the number of unique BO is 0, we optimize the list creation
by copying all buffers of the current CS directly into it. But
this is only valid if the CS doesn't have virtual buffers,
otherwise they are not added and hw might report VM faults.

This fixes VM faults with:
dEQP-VK.sparse_resources.image_sparse_binding.2d.rgba8ui.1024_128_1

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d27e1584ce)
2018-08-20 08:25:34 -07:00
Dylan Baker
b33a2fa911 cherry-ignore: Add more 18.2 patches 2018-08-16 14:24:37 -07:00
Timothy Arceri
0936b87324 radv: add Doom workaround
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f0a8accb0d)
Conflicts resolved by Dylan

Conflicts:
	src/amd/vulkan/radv_device.c
2018-08-16 14:23:48 -07:00
Alexander Tsoy
8143fefa10 meson: fix build for egl platform_x11 without dri3 and gbm
Compiling EGL's platform_x11 without dri3 and gbm yields this compile
failure:

platform_x11 needs inc_loader:

../mesa-18.2.0-rc2/src/egl/drivers/dri2/platform_x11.c:48:10: fatal
error: loader.h: No such file or directory
 #include "loader.h"
          ^~~~~~~~~~

Fixes: 108d257a16 ("meson: build libEGL")
Bugzilla: https://bugs.gentoo.org/663534
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 9a96bf0ecd)
2018-08-16 14:22:42 -07:00
Dylan Baker
cf7f783f35 cherry-ignore: Add additional 18.2 only patches 2018-08-15 09:06:09 -07:00
Bas Nieuwenhuizen
ee595b275c radv: Fix missing Android platform define.
CC: <mesa-stable@lists.freedesktop.org>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit bf33ca7512)
Conflicts resolved by Dylan

Conflicts:
	src/amd/vulkan/Android.mk

There is no android.mk support in 18.1 for radv, so only apply the parts
for autotools android builds, which can be used by android
implementations not using android.mk files
2018-08-15 09:01:03 -07:00
Jason Ekstrand
1f06f0e850 intel: Switch the order of the 2x MSAA sample positions
The Vulkan 1.1.82 spec flipped the order to better match D3D.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit a9f7bcfdf9)
2018-08-13 10:08:44 -07:00
Dylan Baker
3f2388e21a docs: Add sha256 sums for 18.1.6 2018-08-13 09:56:25 -07:00
Dylan Baker
5343019cfa docs: Add release notes for 18.1.6 2018-08-13 08:36:16 -07:00
Dylan Baker
5950393f4e bump version to 18.1.6 2018-08-13 08:30:23 -07:00
Tapani Pälli
af8076e1c9 glsl: handle error case with ast_post_inc, ast_post_dec
Return ir_rvalue::error_value with ast_post_inc, ast_post_dec if
parser error was emitted previously. This way process_array_size
won't see bogus IR generated like with commit 9c676a6427.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98699
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 03a5acec68)
2018-08-10 08:37:54 -07:00
Kenneth Graunke
8d742a1eb1 intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.
When the SIMD16 Gen4-5 fragment shader payload contains source depth
(g2-3), destination stencil (g4), and destination depth (g5-6), the
single register of stencil makes the destination depth unaligned.

We were generating this instruction in the RT write payload setup:

   mov(16)   m14<1>F   g5<8,8,1>F   { align1 compr };

which is illegal, instructions with a source region spanning more than
one register need to be aligned to even registers.  This is because the
hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the
compressed instruction into two mov(8)'s.

I believe this would cause the hardware to load g5 twice, replicating
subspan 0-1's destination depth to subspan 2-3.  This showed up as 2x2
artifact blocks in both TIS-100 and Reicast.

Normally, we rely on the register allocator to even-align our virtual
GRFs.  But we don't control the payload, so we need to lower SIMD widths
to make it work.  To fix this, we teach lower_simd_width about the
restriction, and then call it again after lower_load_payload (which is
what generates the offending MOV).

Fixes: 8aee87fe4c (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Diego Viola <diego.viola@gmail.com>
(cherry picked from commit 08a5c395ab)
2018-08-10 08:35:15 -07:00
Eric Anholt
de4a4f2b17 egl: Fix leak of X11 pixmaps backing pbuffers in DRI3.
This is basically copied from the DRI2 destroy path.  Without this,
Raspberry Pi would quickly run out of CMA during the EGL tests in the CTS
due to all the pixmaps laying around.

Fixes: f35198bade ("egl/x11: Implement dri3 support with loader's dri3 helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit b618d7ea59)
2018-08-10 08:35:09 -07:00
Adam Jackson
a1bb434461 glx: GLX_MESA_multithread_makecurrent is direct-only
This extension is not defined for indirect contexts. Marking it as
"client only", as the old code did here, would make the extension
available in indirect contexts, even though the server would certainly
not have it in its extension list.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 63a6b719d9)
2018-08-10 08:34:40 -07:00
vadym.shovkoplias
894fdbf058 drirc: Allow extension midshader for Metro Redux
This fixes both Metro 2033 Redux and Metro Last Light Redux

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99730
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit e0de26eacc)
2018-08-09 09:03:41 -07:00
Juan A. Suarez Romero
4395919bd9 wayland/egl: initialize window surface size to window size
When creating a windows surface with eglCreateWindowSurface(), the
width and height returned by eglQuerySurface(EGL_{WIDTH,HEIGHT}) is
invalid until buffers are updated (like calling glClear()).

But according to EGL 1.5 spec, section 3.5.6 ("Surface Attributes"):

  "Querying EGL_WIDTH and EGL_HEIGHT returns respectively the width and
   height, in pixels, of the surface. For a window or pixmap surface,
   these values are initially equal to the width and height of the
   native window or pixmap with respect to which the surface was
   created"

This fixes dEQP-EGL.functional.color_clears.* CTS tests

v2:
- Do not modify attached_{width,height} (Daniel)
- Do not update size on resizing window (Brendan)

CC: Daniel Stone <daniel@fooishbar.org>
CC: Brendan King <brendan.king@imgtec.com>
CC: mesa-stable@lists.freedesktop.org
Tested-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 1fe7cbdf05)
Conflicts resolved by Dylan

Conflicts:
	src/egl/drivers/dri2/platform_wayland.c
2018-08-09 09:00:56 -07:00
Juan A. Suarez Romero
6cffbd963f wayland/egl: update surface size on window resize
According to EGL 1.5 spec, section 3.10.1.1 ("Native Window Resizing"):

  "If the native window corresponding to _surface_ has been resized
   prior to the swap, _surface_ must be resized to match. _surface_ will
   normally be resized by the EGL implementation at the time the native
   window is resized. If the implementation cannot do this transparently
   to the client, then *eglSwapBuffers* must detect the change and
   resize surface prior to copying its pixels to the native window."

So far, resizing a native window in Wayland/EGL was interpreted in Mesa
as a request to resize, which is not executed until the first draw call.
And hence, surface size is not updated until executing it. Thus,
querying the surface size with eglQuerySurface() after a window resize
still returns the old values.

This commit updates the surface size values as soon as the resize is
done, even when the real resize is done in the draw call. This makes the
semantics that any native window resize request take effect inmediately,
and if user calls eglQuerySurface() it will return the new resized
values.

v2: update surface size if there isn't a back surface (Daniel)

CC: Daniel Stone <daniel@fooishbar.org>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit a9fb331ea7)
2018-08-09 08:58:44 -07:00
Emil Velikov
0c927e8da9 autotools: use correct gl.pc LIBS when using glvnd
This is more of a hack, since glvnd itself should be providing the file.
Until that happens, ensure the libs is correctly set to -lGL

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 315c46cfdc)
2018-08-08 09:20:23 -07:00
Emil Velikov
db3ff90efc autotools: error out when building with mangling and glvnd
It's not a thing that can work, nor is a wise idea to attempt.

v2: Tweak error message (Dylan)

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
(cherry picked from commit 25a9450a44)
2018-08-08 09:20:17 -07:00
Emil Velikov
8691317ff7 autotools: error out when using the broken --with-{gl, osmesa}-lib-name
The toggles were broken with the introduction of --enable-mangling.
Fixing that up might be possible, but it's not worth the complexity
since one can rename the libraries at any point.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit d5ac236471)
2018-08-08 09:20:11 -07:00
Emil Velikov
7339855b1c automake: require shared glapi when using DRI based libGL
This has been a requirement for ages, yet it seems like we never
explicitly errored out during configure.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit a7ea7511ba)
2018-08-08 09:20:04 -07:00
Eric Anholt
ff8c3747ac vc4: Ignore samplers for finding uniform offsets.
Fixes:
dEQP-GLES2.shaders.struct.uniform.sampler_array_fragment
dEQP-GLES2.shaders.struct.uniform.sampler_array_vertex
dEQP-GLES2.shaders.struct.uniform.sampler_nested_fragment
dEQP-GLES2.shaders.struct.uniform.sampler_nested_vertex

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 69158c452b)
2018-08-08 09:19:58 -07:00
Eric Anholt
ec7f550e17 vc4: Respect a sampler view's first_layer field.
Fixes texturing from EGL images created from cubemap faces, as in
dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9ab6912a00)
2018-08-08 09:19:52 -07:00
Dylan Baker
f2d3373ecd cherry-ignore: Add some additional patches that are for 18.2 2018-08-07 12:08:22 -07:00
Emil Velikov
695a515975 swr: don't export swr_create_screen_internal
With earlier rework the user and provider of the symbol are within the
same binary. Thus there's no point in exporting the function.

Spotted while reviewing patch from Chuck, that nearly added another
unneeded PUBLIC function.

Cc: Chuck Atkins <chuck.atkins@kitware.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Fixes: f50aa21456 "(swr: build driver proper separate from rasterizer")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com<mailto:george.kyriazis@intel.com>>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com<mailto:chuck.atkins@kitware.com>>
(cherry picked from commit 54d844897f)
2018-08-07 12:05:43 -07:00
Eric Anholt
3daf7ff658 vc4: Fix a leak of the no-vertex-elements workaround BO.
Fixes: bd1925562a ("vc4: Convert the driver to emitting the shader record using pack macros.")
(cherry picked from commit 9507e03699)
2018-08-07 12:05:38 -07:00
Gert Wollny
3e0dbfde91 meson, install_megadrivers: Also remove stale symlinks
os.path.exists doesn't return True for stale symlinks, but they are in
the way later, when a link/file with the same name is to be created.
For instance it is conceivable that the pointed to file is replaced by
a file with a new name, and then the symlink is dead.

To handle this check specifically for all existing symlinks to be
removed. (This bugged me for some time with a link libXvMCr600.so
always being in the way of installing this file)

v2: use only os.lexist and replace all instances of os.exist (Dylan Baker)

v3: handle directory check correctly (Eric Engestrom)

Fixes: f7f1b30f81
       ("meson: extend install_megadrivers script to handle symmlinking")

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>(v2 minus dir check)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit 7a46b2d641)
Conflicts resolved by Dylan

Conflicts:
	bin/install_megadrivers.py
2018-08-07 12:03:54 -07:00
Jon Turney
3ca69de0c4 meson: use correct keyword to fix a meson warning
With a sufficently recent meson, the following warning is produced:

WARNING: Passed invalid keyword argument "extra_args".
WARNING: This will become a hard error in the future.

It seems that compiler.links(args:) is meant here.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit a48c0659e1)
2018-08-06 09:27:05 -07:00
Karol Herbst
7f818fd358 nvc0/ir: return 0 in imageLoad on incomplete textures
We already guarded all OP_SULDP against out of bound accesses, but we
ended up just reusing whatever value was stored in the dest registers.

Fixes CTS test shader_image_load_store.incomplete_textures

v2: fix for loads not ending up with predicates (bindless_texture)
v3: fix replacing the def

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit c3325097be)
Conflicts resolved by Dylan

Conflicts:
	src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h
2018-08-06 09:22:17 -07:00
Dylan Baker
a8d32e8ad7 cherry-ignore: add patches that get-pick-list is finding in error 2018-08-06 09:20:45 -07:00
Jordan Justen
7a1e5f9b45 i965: Disable shader cache with INTEL_DEBUG=shader_time
Shader time hard codes an index of the shader time buffer within the
gen program.

In order to support shader time in the disk shader cache, we'd need to
add the shader time index into the program key. This should work, but
probably is not worth it for this particular debug feature.

Therefore, let's just disable the disk shader cache if the shader time
debug feature is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106382
Fixes: 96fe36f7ac "i965: Enable disk shader cache by default"
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3887700dfd)

Merged conflicts resolved by Dylan

Conflicts:
	src/mesa/drivers/dri/i965/brw_disk_cache.c
2018-08-02 11:23:37 -07:00
Jordan Justen
2888dc653f i965, anv: Use INTEL_DEBUG for disk_cache driver flags
Since various options within INTEL_DEBUG could impact code generation,
we need to set the disk cache driver_flags parameter based on the
INTEL_DEBUG flags in use.

An example that will affect the program generated by i965 is the
INTEL_DEBUG=nocompact option.

The DEBUG_DISK_CACHE_MASK value is added to mask the settings of
INTEL_DEBUG that can affect program generation.

v2:
 * Use driver_flags (Tim)
 * Also update Anvil (Jason)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 2b3064c073)

Merge conflicts resolved by Dylan
Pulled in to apply next patch

Conflicts:
	src/intel/vulkan/anv_device.c
2018-08-02 11:18:00 -07:00
Dylan Baker
a79ed78b22 gallium: fix ddebug on windows
By including the proper headers for getpid and for mkdir.

Fixes: 6ff0c6f4eb
       ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2877b6555c)
2018-08-02 11:02:36 -07:00
Dylan Baker
c1e67f0b8e nir/meson: fix c vs cpp args for nir test
Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 34998aae18)
2018-08-02 11:02:31 -07:00
Jason Ekstrand
2d14906109 i965/fs: Flag all slots of a flat input as flat
Otherwise, only the first vec4 of a matrix or other complex type will
get marked as flat and we'll interpolate the others.  This was caught by
a dEQP test which started failing because it did a SSO vs. non-SSO
comparison.  Previously, we did the interpolation wrong consistently in
both versions.  However, with one of Tim Arceri's NIR linkingpatches, we
started splitting the matrix input into vectors at link time in the
non-SSO version and it started getting correctly interpolated which
didn't match the broken SSO version.  As of this commit, they both get
correctly interpolated.

Fixes: e61cc87c75 "i965/fs: Add a flat_inputs field to prog_data"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 57804efa88)
2018-08-02 11:02:26 -07:00
Andres Gomez
15f579f401 glsl: use util_snprintf()
Instead of plain snprintf(). To fix the MSVC 2013 build.

Fixes: 6ff0c6f4eb ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 9d220fa950)
2018-08-02 11:02:10 -07:00
Andres Gomez
3006373031 gallium/aux/util: use util_snprintf() in test_texture_barrier
Instead of plain snprintf(). To fix the MSVC 2013 build:

  Compiling src\gallium\auxiliary\util\u_tests.c ...
u_tests.c
src\gallium\auxiliary\util\u_tests.c(624) : warning C4013: 'snprintf' undefined; assuming extern returning int

...

gallium.lib(u_tests.obj) : error LNK2019: unresolved external symbol _snprintf referenced in function _test_texture_barrier
build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll : fatal error LNK1120: 1 unresolved externals
scons: *** [build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll] Error 1120
scons: building terminated because of errors.

Fixes: 56342c97ee ("gallium/u_tests: test FBFETCH and shader-based blending with MSAA")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 18d9dc179f)
2018-08-02 11:02:03 -07:00
Andres Gomez
7574119e04 ddebug: use util_snprintf() in dd_get_debug_filename_and_mkdir
Instead of plain snprintf(). To fix the MSVC 2013 build:

  Compiling src\gallium\auxiliary\driver_ddebug\dd_draw.c ...
dd_draw.c
c:\projects\mesa\src\gallium\auxiliary\driver_ddebug\dd_util.h(60) : warning C4013: 'snprintf' undefined; assuming extern returning int

...

gallium.lib(dd_draw.obj) : error LNK2001: unresolved external symbol _snprintf
build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll : fatal error LNK1120: 1 unresolved externals
scons: *** [build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll] Error 1120
scons: building terminated because of errors.

Fixes: 6ff0c6f4eb ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 1090e97e77)
2018-08-02 11:01:54 -07:00
Vlad Golovkin
dbcc6decdc swr: Remove unnecessary memset call
Zeroing memory after calloc is not necessary. This also allows to avoid
possible crash when allocation fails, because memset is called before
checking screen for NULL.

Fixes: a29d63ecf7 "swr: refactor swr_create_screen to allow
                              for proper cleanup on error"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 9d3a2394e4)
2018-08-02 11:01:49 -07:00
Olivier Fourdan
88b6d6443e dri3: For 1.2, use root window instead of pixmap drawable
get_supported_modifiers() and pixmap_from_buffers() requests both
expect a window as drawable, passing a pixmap will fail as the Xserver
will fail to match the given drawable to a window.

That leads to dri3_alloc_render_buffer() to return NULL and breaks
rendering when using GLX_DOUBLEBUFFER on pixmaps.

Query the root window of the pixmap on first init, and use the root
window instead of the pixmap drawable for get_supported_modifiers()
and pixmap_from_buffers().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107117
Fixes: 069fdd5 ("egl/x11: Support DRI3 v1.1")
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 03a61b977e)
2018-08-01 09:38:18 -07:00
Marek Olšák
948e06a434 ac/surface: fix MSAA corruption on Vega due to FMASK tile swizzle
a needle in the haystack?

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit c5c6e0187f)
2018-08-01 09:37:47 -07:00
Christian Gmeiner
0a19e01038 etnaviv: fix typo in query names
Fixes: d0bed0b494 ("etnaviv: support HI performance counters")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Chris Healy <cphealy@gmail.com>
(cherry picked from commit e1d4882d05)
2018-07-31 08:57:29 -07:00
Dave Airlie
8e49c2ebc1 r600: reduce num compute threads to 1024.
I copied this value from radeonsi, but it was wrong, 1024
seems to be correct answer from looking at gpuinfo.

This should fix a few compute shader related hangs. (at least in CTS)

Cc: <mesa-stable@lists.freedesktop.org>
(airlied: pushed because it avoids hangs)

(cherry picked from commit 9039cf70fa)
2018-07-31 08:57:19 -07:00
Karol Herbst
7c3f43cba3 nir/lower_int64: mark all metadata as dirty
v2: use nir_metadata_preserve
    preserve metadata in case of !progress

Fixes: 074f5ba0b5
       "nir: Add a simple int64 lowering pass"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit bc0e0c2818)
2018-07-30 10:35:58 -07:00
Jason Ekstrand
be9dfba546 nir: Take if uses into account in ssa_def_components_read
Fixes: d800b7daa5 "nir: Add a helper for figuring out what..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 9a4ab4c120)
2018-07-30 10:35:51 -07:00
Mauro Rossi
b4eeb2dff2 radv: move vk_format_table.c to generated sources
Android build system will try to compile vk_format_table.c
as a shipped source, but at compile time it will be missing,
we move it to generated source, where it belongs

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c67b36c8a1)
2018-07-28 07:07:18 -07:00
Mauro Rossi
7699951adb radv: generate entrypoints for VK_ANDROID_native_buffer
Patch changes radv entrypoints generator to not skip this extension even
though it is set as disabled in the vk.xml

Reference: 63525ba730 ("android: enable VK_ANDROID_native_buffer")
Fixes: 69f447553c ("vulkan: Drop vk_android_native_buffer.xml")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1eb65c51ad)
2018-07-28 07:07:11 -07:00
Jan Vesely
b563370ee3 clover: Don't extend illegal integer types.
It's OK to pass them in memory, which is what kernel invocation needs.
Fixes regressions since llvm r337535 ("Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"):
	scalar-arithmetic-char
	scalar-arithmetic-uchar
	scalar-arithemtic-short
	scalar-arithmetic-ushort
	scalar-comparison-char
	scalar-comparison-uchar
	scalar-comparison-short
	scalar-comparison-ushort

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit c2942141ae)
Merge conflicts resolved by Dylan

Conflicts:
	src/gallium/state_trackers/clover/llvm/compat.hpp
2018-07-27 07:10:50 -07:00
Jan Vesely
1be3293087 clover: Reduce wait_count in abort path.
Trigger waiter condition variable.
Passes 'events' CTS on carrizo and turks.
v2: reduce to 0

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 1e8b8e0878)
2018-07-27 07:10:50 -07:00
Dylan Baker
2a3446af28 docs: Add sha-256 sums for 18.1.5 2018-07-27 07:06:08 -07:00
Dylan Baker
711035e355 docs: add 18.1.5 release notes 2018-07-26 10:48:51 -07:00
Dylan Baker
c5d658a991 bump version to 18.1.5 2018-07-26 10:42:58 -07:00
Jason Ekstrand
2cb3b5b319 spirv: Fix a couple of image atomic load/store bugs
For one thing, the NIR opcodes for image load/store always take and
return a vec4 value regardless of the image type.  We need to fix up
both the source and destination to handle it.  For another thing, we
weren't actually setting up a destination in the OpAtomicLoad case.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 5e030deaf2)
Conflicts resolved by Dylan

Conflicts:
	src/compiler/spirv/spirv_to_nir.c
2018-07-26 10:42:06 -07:00
Jason Ekstrand
7c6ff66eba intel/compiler: Account for built-in uniforms in analyze_ubo_ranges
The original pass only looked for load_uniform intrinsics but there are
a number of other places that could end up loading a push constant.  One
obvious omission was images which always implicitly use a push constant.
Legacy VS clip planes also get pushed into the shader.  This fixes some
new Vulkan CTS tests that test random combinations of bindings and, in
particular, test lots of UBOs and images together.

Cc: mesa-stable@lists.freedesktop.org
Cc: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 820d5e51b7)
Applied cleanly, but required backport performed by Dylan, as suggested
by Jason
2018-07-26 10:42:06 -07:00
Jason Ekstrand
700ff56edc intel/blorp: Handle 3-component formats in clears
This fixes a nasty hang in Batman: Arkham City which apparently calls
vkCmdClearColorImage on a linear RGB image.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
(cherry picked from commit daa78f30b6)

Conflicts resolved by Dylan

Conflicts:
	src/intel/blorp/blorp_blit.c
2018-07-26 10:42:06 -07:00
Jason Ekstrand
1100a52143 blorp: Handle the RGB workaround more like other workarounds
The previous version was sort-of strapped on in that it just adjusted
the blit rectangle and trusted in the fact that we would use texelFetch
and round to the nearest integer to ensure that the component positions
matched.  This new version, while slightly more complicated, is more
accurate because all three components end up with exactly the same
dst_pos and so they will get interpolated and sampled at the same
texture coordinate.  This makes the workaround suitable for using with
scaled blits.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 293b8de161)
2018-07-26 10:42:06 -07:00
Bas Nieuwenhuizen
f017a20636 radv: Still enable inmemory & API level caching if disk cache is not enabled.
That we don't have a background disk cache does not mean we should
prevent the app caching anything.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 28b8c18d84)
2018-07-26 10:42:06 -07:00
Jason Ekstrand
604e930e8f nir/serialize: Alloc constants off the variable
nir_sweep assumes that constants area always allocated off the variable
to which they belong.  Violating this assumption causes them to get
freed early and leads to use-after-free bugs.

Fixes: 120da00975 "nir: add serialization and deserialization"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107366
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit f214baf72f)
2018-07-26 10:42:06 -07:00
Danylo Piliaiev
4066610ad4 i965: Sweep NIR after linking phase to free held memory
After optimization passes and many trasfromations most of memory
NIR holds is a garbage which was being freed only after shader deletion.
Freeing it at the end of linking will save memory which would be useful
in case there are a lot of complex shaders being compiled.
The common case for this issue is 32bit game running under Wine.

The cost of the optimization is around ~3-5% of compilation speed
with complex shaders.

V2: by Jason Ekstrand
    - Move nir_sweep up, right after the last change of NIR

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103274
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d219521379)
2018-07-26 10:42:06 -07:00
Harish Krupo
f54a831786 egl: Fix missing clamping in eglSetDamageRegionKHR
Clamp the x and y co-ordinates of the rectangles.

v2: Clamp width/height after converting to co-ordinates
    (Ilia Merkin)

Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit fd734608c3)
2018-07-26 10:42:06 -07:00
Roland Scheidegger
2f1a2aff49 draw: force draw pipeline if there's more than 65535 vertices
The pt emit path can only handle 65535 - the number of vertices is
truncated to a ushort, resulting in a too small buffer allocation, which
will crash.

Forcing the pipeline path looks suboptimal, then again this bug is
probably there ever since GS is supported, so it seems it's not
happening often. (Note that the vertex_id in the vertex header is 16
bit too, however this is only used by the draw pipeline, and it denotes
the emit vertex nr, and that uses vbuf code, which will only emit smaller
chunks, so should be fine I think.)
Other solutions would be to simply allow 32bit counts for vertex
allocation, however 65535 is already larger than this was intended for
(the idea being it should be more cache friendly). Or could try to teach
the pt emit path to split the emit in smaller chunks (only the non-index
path can be affected, since gs output is always linear), but it's a bit
tricky (we don't know the primitive boundaries up-front).

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=107295

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 09828feab0)
2018-07-26 10:42:06 -07:00
Dave Airlie
3843ceb43d r600: enable tess_input_info for TES
There might be a nicer way to do this, but this is at least correct.

This fixes:
KHR-GL44.tessellation_shader.single.max_patch_vertices
KHR-GL44.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d73f1026b4)
2018-07-26 10:42:06 -07:00
Chih-Wei Huang
e67e983392 Android: fix a missing nir_intrinsics.h error
The commit 76dfed8ae2 changed nir_intrinsics.h to be a generated
header, but the corresponding dependency was not updated for Android.
It causes the error:

[  0% 19/4336] target  C: libmesa_pipe_radeonsi <= external/mesa/src/gallium/drivers/radeonsi/si_debug.c
...
In file included from external/mesa/src/gallium/drivers/radeonsi/si_debug.c:25:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:28:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_shader.h:140:
In file included from external/mesa/src/amd/common/ac_llvm_build.h:30:
external/mesa/src/compiler/nir/nir.h:966:10: fatal error: 'nir_intrinsics.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 76dfed8ae2 ("nir: mako all the intrinsics")
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>
(cherry picked from commit e7ffd3fb08)
2018-07-26 10:42:06 -07:00
Mauro Rossi
f8060b1ec0 android: util/disk_cache: fix building errors in gallium drivers
This patch applies the necessary changes in Android.common.mk
as per automake rules, to avoid following building error:

external/mesa/src/gallium/drivers/nouveau/nouveau_screen.c:159:8:
error: implicit declaration of function 'disk_cache_get_function_timestamp'
is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   if (disk_cache_get_function_timestamp(nouveau_disk_cache_create,
       ^
1 error generated.

(v2) -DENABLE_SHADER_CACHE Android cflag is kept, to leave the AS-IS capability enabled

Fixes: cc10b34 ("util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 6cbbd5b4f8)
2018-07-26 10:42:06 -07:00
Jason Ekstrand
79270d2140 anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV
We've had several broadwell hangs that have come down to this bit just
not working correctly.  Most recently, we've had a pile of hangs
reported with apps running under DXVK:

https://github.com/doitsujin/dxvk/issues/469

Instead, use the bit that doesn't try to imply weird D3D coherency
things and just force-enables the PS like we want.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit abd629eb3d)
2018-07-26 10:42:06 -07:00
Samuel Pitoiset
65f62d2977 radv: fix a memleak for merged shaders on GFX9
modules[i] can be NULL for merged shaders but we have to
free the NIR code. radv_can_dump_shader_stats() already handles
if modules[i] is NULL, no need to check it twice.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 6e32d9e7b0)
2018-07-26 10:42:06 -07:00
Alex Smith
996a8eb412 anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BIT
According to the spec, these should apply to all read/write access
types (so would be equivalent to specifying all other access types
individually). Currently, they were doing nothing.

v2: Handle VK_ACCESS_MEMORY_WRITE_BIT in dstAccessMask.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 54f8f1545f)
2018-07-26 10:42:06 -07:00
Bas Nieuwenhuizen
32c9b01027 nir: Fix end of function without return warning/error.
There always is a continue block, so let us just do unreachable.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 8cacf38f52 "nir: Do not use continue block after removing it."
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107312
(cherry picked from commit e1febbefe8)
2018-07-26 10:42:06 -07:00
Dylan Baker
4b776c159d cherry-ignore: add 11712b9ca1 2018-07-26 10:42:06 -07:00
Bas Nieuwenhuizen
24e3aa2d78 util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.
radv always needs it, so just check the header instead. Also
do not declare the function if the variable is not set, so we
get a nice compile error instead of failing to open a device
at runtime.

Fixes: b87ef9e606 "util: fix MSVC build issue in disk_cache.h"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit cc10b34e9e)
2018-07-26 10:42:06 -07:00
Bas Nieuwenhuizen
a6feff4a75 nir: Do not use continue block after removing it.
Reinserting code directly before a jump means the block gets split
and merged, removing the original block and replacing it in the
process.

Hence keeping a pointer to the continue block over a reinsert
causes issues.

This code changes nir_opt_if to simply look for the new continue
block.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107275
CC: 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8cacf38f52)
2018-07-26 10:42:06 -07:00
Dylan Baker
2f1a3dbf58 cherry-ignore: Add 1f616a840e
Which was manually backported by Samuel
2018-07-26 10:42:06 -07:00
Samuel Pitoiset
63ac3b50e7 radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9
A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion
counters) must immediately precede every timestamp event to
prevent a GPU hang on GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-26 10:42:06 -07:00
Eric Anholt
a0dd7bd498 meson: Move xvmc test tools from unit tests to installed tools.
These are not unit tests, as they rely on the host's XVMC and some user
configuration.  Switch them over to being general installed tools, to fix
unit testing.

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 162fcdad6a)
Minor merge conflict resolved by Dylan

Conflicts:
	meson_options.txt
2018-07-26 10:42:06 -07:00
Karol Herbst
d7463d7f34 nir: fix printing of vec16 type
Fixes: 2f181c8c18
       "glsl_types: vec8/vec16 support"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 87c8af2836)
2018-07-26 10:42:06 -07:00
Lepton Wu
47ac26f858 virgl: Fix flush in virgl_encoder_inline_write.
The current code is buggy: if there are only 12 dwords left in cbuf,
we emit a zero data length command which will be rejected by virglrenderer.
Fix it by calling flush in this case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 04e278f793)
2018-07-26 10:42:06 -07:00
Jan Vesely
92e544d96e clover: Catch errors from executing event action
Abort all dependent events.
v2: Abort the current event as well.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 866b25fd01)
2018-07-26 10:42:06 -07:00
Jan Vesely
409894e616 clover: Report error when pipe driver fails to create compute state
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 154fbd03cc)
2018-07-26 10:42:06 -07:00
Samuel Iglesias Gonsálvez
b97c46f2ce anv: fix assert in anv_CmdBindDescriptorSets()
The assert is checking that we are not binding more descriptor sets
than the supported by the driver. When binding the descriptor set
number MAX_SETS-1, it was breaking the assert because
descriptorSetCount = 1.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 0f29006256)
2018-07-26 10:42:06 -07:00
Jan Vesely
b9d1ef5035 radeonsi: Refuse to accept code with unhandled relocations
They might lead to unrecoverable GPU hang.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9baacf3fa7)
2018-07-26 10:42:06 -07:00
Bas Nieuwenhuizen
1a0b75aa16 radv: Disable disabled color buffers in rbplus opts.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit c0144e915a)
2018-07-26 10:42:06 -07:00
Bas Nieuwenhuizen
3a8ac0b54f radv: Fix number of samples used for binning.
Used the wrong register ...

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 760211b77c)
2018-07-26 10:42:06 -07:00
Bas Nieuwenhuizen
79a247b233 radv: Select correct entries for binning.
Overshot it by one every time.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 82664af6cf)
2018-07-26 10:42:06 -07:00
Mauro Rossi
630eacdaa5 radv: winsys/amdgpu: include missing pthread.h header
pthread types are used in some files without explicitely including pthread.h.
This leads to compile errors on Android 7.x nougat-x86
e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h

In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31:
In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32:
external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t'
        pthread_mutex_t global_bo_list_lock;
        ^
1 error generated.

Including pthread.h explicitely solves the building error

Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 1a1f2b134c)
2018-07-26 10:42:06 -07:00
Eric Anholt
b10229038c vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.
I had mistakenly used the COHERENT flag, which can only be set when
PERSISTENT is mapped, but isn't always.

Fixes: a2014c2eb9 ("vc4: Simplify the DISCARD_RANGE handling")
(cherry picked from commit 50a3a283d0)
2018-07-26 10:42:06 -07:00
Michel Dänzer
01d1f4713f gallium: Check pipe_screen::resource_changed before dereferencing it
It's optional, only implemented by the etnaviv driver so far.

Fixes: 501d0edeca "st/mesa: call resource_changed when binding a
                     EGLImage to a texture"
Fixes: a37cf630b4 "gallium: add pipe_screen::resource_changed callback
                     wrappers"
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 34e89e4d38)
2018-07-26 10:42:06 -07:00
Lucas Stach
c94cde9544 st/mesa: call resource_changed when binding a EGLImage to a texture
When a EGLImage is newly bound to a texture, we need to make sure the
driver is informed that the resource might have changed. Fixes stale
texture content on Etnaviv when binding an existing EGLImage to an
existing texture object.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 501d0edeca)
2018-07-26 10:42:06 -07:00
Dylan Baker
dac8e6bbe0 cherry-ignore: add 4a67ce886a 2018-07-26 10:42:06 -07:00
Samuel Pitoiset
27aaa64c55 radv: make sure to wait for CP DMA when needed
This might fix some synchronization issues. I don't know if
that will affect performance but it's required for correctness.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-26 10:42:06 -07:00
Chad Versace
0ecf5dcedc anv/android: Fix Autotools build for VK_ANDROID_native_buffer
Changes to vk.xml and anv_entrypoints_gen.py broke the Autotools build
on Android. The changes undef'd the VK_ANDROID_native_buffer entrypoints
in anv_entrypoints.h.

Fix it with CPPFLAGS += -DVK_USE_PLATFORM_ANDROID_KHR.

CC: <mesa-stable@lists.freedesktop.org>
See-Also: 63525ba7 "android: enable VK_ANDROID_native_buffer"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 8e403bc959)
2018-07-26 10:42:06 -07:00
Chad Versace
e536024346 anv/android: Fix type error in call to vk_errorf()
In a single call to vk_errorf() in the Android code, the arguments were
swapped. The bug has existed since day one. Chrome OS used to forgive
the warning, but it is now a compilation error.

CC: <mesa-stable@lists.freedesktop.org>
Fixes: 053d4c32 "anv: Implement VK_ANDROID_native_buffer (v9)"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit be5fc0d7f1)
2018-07-26 10:42:06 -07:00
Dylan Baker
b956e98cd6 docs: Add sha256 sums for 18.1.4 tarballs 2018-07-26 10:42:06 -07:00
Jose Fonseca
38d328c391 gallium/tests: Don't ignore S3TC errors.
Now we do full S3TC decompression they should no longer fail.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Fixes: 34cf3c43be ("mesa: Call DXTn functions directly")
2018-07-24 18:22:27 +01:00
Dylan Baker
7f76bfcc7e docs: Add release notes for 18.1.4 2018-07-13 11:34:55 -07:00
Dylan Baker
4078bff608 Bump version for release 2018-07-13 11:15:47 -07:00
Jose Maria Casanova Crespo
251cf0dc36 i965/fs: unspills shoudn't use grf127 as dest since Gen8+
At 232ed89802 "i965/fs: Register allocator
shoudn't use grf127 for sends dest" we didn't take into account the case
of SEND instructions that are not send_from_grf. But since Gen7+ although
the backend still uses MRFs internally for sends they are finally
assigned to a GRFs.

In the case of unspills the backend assigns directly as source its
destination because it is suppose to be available. So we always have a
source-destination overlap. If the reg_allocator assigns registers that
include the grf127 we fail the validation rule that affects Gen8+
"r127 must not be used for return address when there is a src and dest
overlap in send instruction."

So this patch activates the grf127_send_hack_node for Gen8+ and if we
have any register spilled we add interferences to the destination of
the unspill operations.

We also need to avoid that opt_bank_conflicts() optimization, that runs
after the register allocation, doesn't move things around, causing the
grf127 to be used in the condition we were avoiding.

Fixes piglit test tests/spec/arb_compute_shader/linker/bug-93840.shader_test
and some shader-db crashed because of the grf127 validation rule..

v2: make sure that opt_bank_conflicts() optimization doesn't change
the use of grf127. (Caio)

Found by Caio Marcelo de Oliveira Filho

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107193
Fixes: 232ed89802 "i965/fs: Register allocator shoudn't use grf127 for sends dest"
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 62f37ee53d)
2018-07-13 10:00:41 -07:00
Adam Jackson
658d4e8e00 glx: Don't allow glXMakeContextCurrent() with only one valid drawable
Drawable and readable need to either both be None or both be non-None.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d257ec0136)
2018-07-10 09:30:44 -07:00
Jose Maria Casanova Crespo
e681c0eb9d intel/compiler: grf127 can not be dest when src and dest overlap in send
Implement at brw_eu_validate the restriction from Intel Broadwell PRM,
vol 07, section "Instruction Set Reference", subsection "EUISA
Instructions", Send Message (page 990):

"r127 must not be used for return address when there is a src and
dest overlap in send instruction."

v2: Style fixes (Matt Turner)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0e47ecb29a)
2018-07-10 09:30:39 -07:00
Jose Maria Casanova Crespo
9ce4c7b414 i965/fs: Register allocator shoudn't use grf127 for sends dest
Since Gen8+ Intel PRM states that "r127 must not be used for return
address when there is a src and dest overlap in send instruction."

This patch implements this restriction creating new grf127_send_hack_node
at the register allocator. This node has a fixed assignation to grf127.

For vgrf that are used as destination of send messages we create node
interfereces with the grf127_send_hack_node. So the register allocator
will never assign to these vgrf a register that involves grf127.

If dispatch_width > 8 we don't create these interferences to the because
all instructions have node interferences between sources and destination.
That is enough to avoid the r127 restriction.

This fixes CTS tests that raised this issue as they were executed as SIMD8:

dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom

Shader-db results on Skylake:
   total instructions in shared programs: 7686798 -> 7686797 (<.01%)
   instructions in affected programs: 301 -> 300 (-0.33%)
   helped: 1
   HURT: 0

   total cycles in shared programs: 337092322 -> 337091919 (<.01%)
   cycles in affected programs: 22420415 -> 22420012 (<.01%)
   helped: 712
   HURT: 588

Shader-db results on Broadwell:

   total instructions in shared programs: 7658574 -> 7658625 (<.01%)
   instructions in affected programs: 19610 -> 19661 (0.26%)
   helped: 3
   HURT: 4

   total cycles in shared programs: 340694553 -> 340676378 (<.01%)
   cycles in affected programs: 24724915 -> 24706740 (-0.07%)
   helped: 998
   HURT: 916

   total spills in shared programs: 4300 -> 4311 (0.26%)
   spills in affected programs: 333 -> 344 (3.30%)
   helped: 1
   HURT: 3

   total fills in shared programs: 5370 -> 5378 (0.15%)
   fills in affected programs: 274 -> 282 (2.92%)
   helped: 1
   HURT: 3

v2: Avoid duplicating register classes without grf127. Let's use a node
    with a fixed assignation to grf127 and create interferences to send
    message vgrf destinations. (Eric Anholt)
v3: Update reference to CTS VK_KHR_8bit_storage failing tests.
    (Jose Maria Casanova)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 232ed89802)
2018-07-10 09:30:32 -07:00
zhaowei yuan
a0b97410a0 glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2
"sampler2DRect" and "sampler2DRectShadow" are specified as
reserved from GLSL 1.1 and GLSL ES 1.0

Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106906
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 34f7e761bc ("glsl/parser: Track built-in types using the glsl_type directly")
(cherry picked from commit 73ec437627)
2018-07-10 09:30:09 -07:00
Jason Ekstrand
6fa04b1767 intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN
When we don't have PLN (gen4 and gen11+), we implement LINTERP as either
LINE+MAC or a pair of MADs.  In both cases, the accumulator is written
by the first of the two instructions and read by the second.  Even
though the accumulator value isn't actually ever used from a logical
instruction perspective, it is trashed so we need to make the scheduler
aware.  Otherwise, the scheduler could end up re-ordering instructions
and putting a LINTERP between another an instruction which writes the
accumulator and another which tries to use that result.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 566e6abd6d)
Rebased version provided by Jason
2018-07-09 13:21:07 -07:00
Lionel Landwerlin
9401dcdb3c i965: fix clear color bo address relocation
Fixes: 7987d041fd ("i965/surface_state: Emit the clear color address instead of value.")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 420bf14e12)
2018-07-09 09:25:09 -07:00
Marek Olšák
a47e6da27b st/dri: fix a crash in server_wait_sync
Ported from i965 including the comment.

This fixes:
    dEQP-EGL.functional.reusable_sync.valid.wait_server

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 0eaf069679)
2018-07-09 09:24:56 -07:00
Ian Romanick
e7d4549ac0 i965/fs: Properly handle sign(-abs(x))
Fixes new piglit tests:

 - glsl-1.10/execution/fs-sign-neg-abs.shader_test
 - glsl-1.10/execution/fs-sign-sat-neg-abs.shader_test
 - glsl-1.10/execution/vs-sign-neg-abs.shader_test
 - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 88bd37c010)
2018-07-09 09:24:51 -07:00
Ian Romanick
7dd72c1c8a i965/vec4: Properly handle sign(-abs(x))
This is achived by copying the sign(abs(x)) optimization from the FS
backend.

On Gen7 an earlier platforms, this fixes new piglit tests:

 - glsl-1.10/execution/vs-sign-neg-abs.shader_test
 - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 9626ea497d)
2018-07-09 09:24:45 -07:00
Ian Romanick
c1027505f9 intel/compiler: Relax mixed type restriction for saturating immediates
At the time of commit 7bc6e455e2 (i965: Add support for saturating
immediates.) we thought mixed type saturates would be impossible.  We
were only thinking about type converting moves from D to F, for
example.  However, type converting moves w/saturate from F to DF are
definitely possible.  This change minimally relaxes the restriction to
allow cases that I have been able trigger via piglit tests.

Fixes new piglit tests:
 - arb_gpu_shader_fp64/execution/built-in-functions/fs-sign-sat-neg-abs.shader_test
 - arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-sat-neg-abs.shader_test

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit f8e54d02f7)
2018-07-09 09:24:36 -07:00
Roland Scheidegger
d68f2d7ede r600/sb: fix crash in fold_alu_op3
fold_assoc() called from fold_alu_op3() can lower the number of src to 2,
which then leads to an invalid access to n.src[2]->gvalue().
This didn't seem to have caused much harm in the past, but on Fedora 28
it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although
with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was
needed to show the issue).

An alternative fix would be to instead call fold_alu_op2() from within
fold_assoc() when the number of src is reduced and return always TRUE
from fold_assoc() in this case, with the only actual difference being
the return value from fold_alu_op3() then. I'm not sure what the return
value actually should be in this case (or whether it even can make a
difference).

https://bugs.freedesktop.org/show_bug.cgi?id=106928
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 817efd8968)
2018-07-09 09:24:30 -07:00
Samuel Pitoiset
3ddbe5d4d7 radv: fix emitting the view index on GFX9
For merged shaders, VS as HS for example.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 85865dbe0d)
2018-07-06 09:12:27 -07:00
Neil Roberts
7333112ed0 i965: Fix output register sizes when variable ranges are interleaved
In 6f5abf3146 this code was fixed to calculate the maximum size of
an attribute in a seperate pass and then allocate the registers to
that size. However this wasn’t taking into account ranges that overlap
but don’t have the same starting location. For example:

layout(location = 0, component = 0) out float a[4];
layout(location = 2, component = 1) out float b[4];

Previously, if ‘a’ was processed first then it would allocate a
register of size 4 for location 0 and it wouldn’t allocate another
register for location 2 because it would already be covered by the
range of 0. Then if something tries to write to b[2] it would try to
write past the end of the register allocated for ‘a’ and it would hit
an assert.

This patch changes it to scan for any overlapping ranges that start
within each range to calculate the maximum extent and allocate that
instead.

Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/
vs-fs-array-interleave-range.shader_test

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 6f5abf3146 "i965: Fix output register sizes when multiple variables
       share a slot."
(cherry picked from commit 2d5ddbe960)
2018-07-05 10:00:13 -07:00
Dave Airlie
83716106b9 r600/sb: cleanup if_conversion iterator to be legal C++
The current code causes:
/usr/include/c++/8/debug/safe_iterator.h:207:
Error: attempt to copy from a singular iterator.

This is due to the iterators getting invalidated, fix the
reverse iterator to use the return value from erase, and
cast it properly.

(used Mathias suggestion)
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>

(cherry picked from commit 8c51caab24)
2018-07-05 10:00:02 -07:00
Timothy Arceri
81af1a0ae2 nir: fix selection of loop terminator when two or more have the same limit
We need to add loop terminators to the list in the order we come
across them otherwise if two or more have the same exit condition
we will select that last one rather than the first one even though
its unreachable.

This fix is for simple unrolls where we only have a single exit
point. When unrolling these type of loops the unreachable
terminators and their unreachable branch are removed prior to
unrolling. Because of the logic change we also switch some
list access in the complex unrolling logic to avoid breakage.

Fixes: 6772a17acc ("nir: Add a loop analysis pass")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 463f849097)
2018-07-03 10:26:16 -07:00
Ian Romanick
4cd70c4cdf i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible
Otherwise we can incorrectly cmod propagate in situations like

    add(8)          g10<1>.xD       g2<0>.xD        -16D
    ...
    cmp.ge.f0(8)    null<1>D        g2<0>.xD        16D
    ...
    (+f0) sel(8)    g21<1>.xyUD     g14<4>.xyyyUD   g18<4>.xyyyUD

Sadly, this change hurts quite a few shaders.

v2: Refactor writemask compatibility check into a separate function.
Suggested by Caio.

Ivy Bridge and Haswell had similar results. (Haswell shown)
total instructions in shared programs: 12968489 -> 12968738 (<.01%)
instructions in affected programs: 60679 -> 60928 (0.41%)
helped: 0
HURT: 249
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.22% max: 0.81% x̄: 0.46% x̃: 0.44%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.44% 0.48%
Instructions are HURT.

total cycles in shared programs: 409171965 -> 409172317 (<.01%)
cycles in affected programs: 260056 -> 260408 (0.14%)
helped: 0
HURT: 176
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.04% max: 0.34% x̄: 0.17% x̃: 0.17%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: 0.16% 0.18%
Cycles are HURT.

Sandy Bridge
total instructions in shared programs: 10423577 -> 10423753 (<.01%)
instructions in affected programs: 40667 -> 40843 (0.43%)
helped: 0
HURT: 176
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.29% max: 0.79% x̄: 0.48% x̃: 0.42%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.46% 0.51%
Instructions are HURT.

total cycles in shared programs: 146097503 -> 146097855 (<.01%)
cycles in affected programs: 503990 -> 504342 (0.07%)
helped: 0
HURT: 176
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.02% max: 0.36% x̄: 0.12% x̃: 0.11%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: 0.11% 0.13%
Cycles are HURT.

No changes on any other platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: cd635d149b i965/vec4: Propagate conditional modifiers from compares to adds
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 995d993710)
2018-07-03 10:26:11 -07:00
Marek Olšák
a14f1d2110 glsl/cache: save and restore ExternalSamplersUsed
Shaders that need special code for external samplers were broken if
they were loaded from the cache.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 99c6cae227)
2018-07-03 10:25:43 -07:00
Iago Toral Quiroga
a4aec345e3 anv/cmd_buffer: never shrink the push constant buffer size
If we have to re-emit push constant data, we need to re-emit all
of it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 198a72220b)
2018-07-03 10:25:38 -07:00
Iago Toral Quiroga
ebaa43bebc anv/cmd_buffer: clean dirty push constants flag after emitting push constants
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6a1d8350c9)
2018-07-03 10:25:32 -07:00
Iago Toral Quiroga
52b78ae7da anv/cmd_buffer: make descriptors dirty when emitting base state address
Every time we emit a new state base address we will need to re-emit our
binding tables, since they might have been emitted with a different base
state adress.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1b54824687)
2018-07-03 10:25:26 -07:00
Jason Ekstrand
fed76b3269 anv: Be more careful about hashing pipeline layouts
Previously, we just hashed the entire descriptor set layout verbatim.
This meant that a bunch of extra stuff such as pointers and reference
counts made its way into the cache.  It also meant that we weren't
properly hashing in the Y'CbCr conversion information information from
bound immutable samplers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit d1c778b362)
2018-07-03 10:25:21 -07:00
Marek Olšák
fde83d5f2b radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 41f80373b4)
Conflicts fixed by Dylan

Conflicts:
	src/gallium/drivers/radeonsi/si_blit.c
2018-07-03 10:24:58 -07:00
Timothy Arceri
45bea64814 glsl: skip comparison opt when adding vars of different size
The spec allows adding scalars with a vector or matrix. In this case
the opt was losing swizzle and size information.

This fixes a bug with Doom (2016) shaders.

Fixes: 34ec1a24d6 ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2a5121bf35)
2018-07-03 10:24:58 -07:00
Rhys Perry
fb39f5d2e8 nvc0/ir: fix TargetNVC0::insnCanLoadOffset()
Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset
could be set to a specific value. The IndirectPropagation pass expected
it to return whether the offset could be increased by a specific value,
which is what TargetNV50::insnCanLoadOffset() does.

Fixes: 37b67db6ae
	("nvc0/ir: be careful about propagating very large offsets into const load")

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 6bb0f87c60)
2018-07-03 10:24:58 -07:00
Jason Ekstrand
2fa4b4dfc0 intel/fs: Split instructions low to high in lower_simd_width
Commit 0d905597f fixed an issue with the placement of the zip and unzip
instructions.  However, as a side-effect, it reversed the order in which
we were emitting the split instructions so that they went from high
group to low instead of low to high.  This is fine for most things like
texture instructions and the like but certain render target writes
really want to be emitted low to high.  This commit just switches the
order back around to be low to high.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 0d905597f "intel/fs: Be more explicit about our placement of [un]zip"
(cherry picked from commit d5b617a28e)
2018-07-03 10:24:58 -07:00
Ross Burton
f102eada9b egl: fix build race in automake
There is a parallel make build issue in src/egl/drivers/dri2/
for wayland builds. Can be reproduced with:

$ rm src/egl/drivers/dri2/*.h src/egl/drivers/dri2/platform_wayland.lo
$ make -C src/egl/ drivers/dri2/platform_wayland.lo
../../../mesa-18.1.2/src/egl/drivers/dri2/platform_wayland.c:50:10: fatal error: linux-dmabuf-unstable-v1-client-protocol.h: No such file or directory

This patch adds the missing dependency.

Fixes: 02cc359372 "egl/wayland: Use linux-dmabuf interface for buffers"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

[Eric: fixed up the commit title]
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit d7c4ce1d1d)
2018-07-03 10:24:58 -07:00
Dylan Baker
f815839df0 docs: Add SHA256 sums to notes for 18.1.3 2018-06-29 11:00:48 -07:00
Dylan Baker
f7e89b2f48 docs: Add release notes for 18.1.3 2018-06-29 10:36:11 -07:00
Dylan Baker
765fdbe2f8 VERSION: bump version to 18.1.3 2018-06-29 10:29:48 -07:00
Dylan Baker
ed54f93d91 cherry-ignore: add a2f5292c82 2018-06-27 08:31:24 -07:00
Samuel Pitoiset
07049c0b67 radv: use separate bind points for the dynamic buffers
The Vulkan spec says:

   "pipelineBindPoint is a VkPipelineBindPoint indicating whether
    the descriptors will be used by graphics pipelines or compute
    pipelines. There is a separate set of bind points for each of
    graphics and compute, so binding one does not disturb the other."

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 7a57c82767)
2018-06-27 08:21:19 -07:00
Dylan Baker
696be22905 cherry-ignore: Ignore cac7ab1192 2018-06-26 11:15:11 -07:00
Jason Ekstrand
bc67499beb nir/validate: Use the type from the tail of call parameter derefs
Otherwise, if what gets passed into the function call is a deref chain
longer than just a variable deref, we would use the type of the entire
variable rather than the type of the thing being dereferenced.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106980
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(Unique to 18.1)
2018-06-26 08:40:30 -07:00
Jason Ekstrand
d96eecbdd7 nir: Handle call instructions in foreach_src
Even though they don't have regular sources, they do have derefs and
those may have implied sources that should be handled.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106980
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(Unique to 18.1)
2018-06-26 08:40:14 -07:00
Tapani Pälli
915d9166bf glsl: serialize data from glTransformFeedbackVaryings
While XFB has been enabled for cache, we did not serialize enough
data for the whole API to work (such as glGetProgramiv).

Fixes: 6d830940f7 "Allow shader cache usage with transform feedback"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit ab2643e4b0)
2018-06-26 08:39:04 -07:00
Andrii Simiklit
f0f66ee4ba i965/gen6/gs: Handle case where a GS doesn't allocate VUE
We can not use the VUE Dereference flags combination for EOT
message under ILK and SNB because the threads are not initialized
there with initial VUE handle unlike Pre-IL.
So to avoid GPU hangs on SNB and ILK we need
to avoid usage of the VUE Dereference flags combination.
(Was tested only on SNB but according to the specification
SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6
the ILK must behave itself in the similar way)

v2: Approach to fix this issue was changed.
Instead of different EOT flags in the program end
we will create VUE every time even if GS produces no output.

v3: Clean up the patch.
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 232c5d75ea)
2018-06-26 08:38:36 -07:00
Samuel Pitoiset
0747f76b85 radv: ignore pInheritanceInfo for primary command buffers
From the Vulkan spec:
"If this is a primary command buffer, then this value is ignored."

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit ba5e25ed29)
2018-06-26 08:27:27 -07:00
Dylan Baker
f77cae2c59 meson: Correct behavior of vdpau=auto
Currently if vdpau is set to auto, it will be disabled only in cases
where gallium is disabled or the host OS is not supported (mac, haiku,
windows). However on (for example) Linux if libvdpau is not installed
then the build will error because of the unmet dependency. This corrects
auto to do the right thing, and not error if libvdpau is not installed.

Fixes: 992af0a4b8
       ("meson: dedup gallium-vdpau logic")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit d9a8008a93)
2018-06-25 08:43:39 -07:00
Dylan Baker
abe65eb58f meson: Fix auto option for xvmc
This fixes the same problem as the previous patch did for vdpau, but for
xvmc.

Fixes: 724916c8a8
       ("meson: dedup gallium-xvmc logic")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit a6943bb4ce)

Squashed with:

meson: Fix typo that breaks -Dgalium-xvmc=false

_xmvc -> _xvmc. Sigh

Fixes: a6943bb4ce
       ("meson: Fix auto option for xvmc")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
(cherry picked from commit ced3df5623)
2018-06-25 08:43:28 -07:00
Dylan Baker
6a1ef7ccb8 meson: Fix auto option for va
The same as the previous two patches, but for the libva state tracker.

Fixes: 724916c8a8
       ("meson: dedup gallium-xvmc logic")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 94cf397092)
2018-06-25 08:40:18 -07:00
Samuel Pitoiset
b5f154a860 radv: fix HTILE metadata initialization in presence of subpass clears
If the driver ends up by performing a slow depthstencil clear,
the HTILE metadata won't be initialized correctly.

This fixes random VM faults on Polaris while running CTS
with Bas's runner. This doesn't seem to regress performance.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 07cb1373a2)
2018-06-25 08:39:18 -07:00
Dylan Baker
fc0d0ad019 cherry-ignore: Add 587e712eda 2018-06-22 14:43:22 -07:00
Marek Olšák
dd14a0f3e1 mesa: fix glGetInteger64v for arrays of integers
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit a2790b134a)
2018-06-22 09:42:31 -07:00
Emil Velikov
2a0015de3f glsl/tests/glcpp: reinstate "error out if no tests found"
With the recent rework of converting the shell script to a python one
the check for actual tests was dropped.

Bring that back, since it was explicitly added considering we had a ~2
year period, during which the tests were not run.

v2: use raise Exception() over  print() & return false (Dylan)

Fixes: db8cd8e367 ("glcpp/tests: Convert shell scripts to a python
script")
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d589eddc8b)
2018-06-21 08:13:05 -07:00
Emil Velikov
3c1eaa596f configure: use compliant grep regex checks
The current `grep "foo\|bar"' trips on some grep implementations, like
the FreeBSD one. Instead use `egrep "foo|bar"' as suggested by Stefan.

Cc: Stefan Esser <se@FreeBSD.org>
Reported-by: Stefan Esser <se@FreeBSD.org>
Bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228673
Fixes: 1914c814a6 ("configure: error out if building OMX w/o supported platform")
Fixes: 63e11ac2b5 ("configure: error out if building VA w/o supported platform")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit dfb1f2759c)
2018-06-21 08:12:39 -07:00
Rob Clark
564c882021 freedreno/ir3: fix base_vertex
Fixes: c366f422f0 nir: Offset vertex_id by first_vertex instead of base_vertex
Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit e1e40935b4)
2018-06-21 08:12:33 -07:00
Marek Olšák
decb031bd8 radeonsi: always put persistent buffers into GTT on radeon
This improves performance for certain games.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 9322974ec7)
2018-06-20 09:44:17 -07:00
Marek Olšák
e979b79cec ac/gpu_info: add kernel_flushes_hdp_before_ib
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b81149e258)

Conflicts:
	src/amd/common/ac_gpu_info.c

Conflicts resolved by Dylan
2018-06-20 09:42:15 -07:00
Bas Nieuwenhuizen
19655023a9 ac/surface: Set compressZ for stencil-only surfaces.
We HTILE compress stencil-only surfaces too.

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1a8501a9dd)
2018-06-19 08:26:53 -07:00
Tomeu Vizoso
4c07e44ae5 virgl: Remove debugging left-overs
Some fprintfs were probably left unintentionally a few years ago and are
a bit of a nuisance.

Fixes: 2d3301e4d5 ("virgl: fix reference counting of prime handles")
       Cc: Rob Herring <robh@kernel.org>

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9b1cb50ba4)
2018-06-19 08:26:40 -07:00
Eric Engestrom
7540acb137 meson: fix i965/anv/isl genX static lib names
Shouldn't make any functional difference, just that `liblibanv_gen90.a`
will now be called `libanv_gen90.a`.

Fixes: 3218056e0e "meson: Build i965 and dri stack"
Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit e8eb84826e)

Trivial merge conflicts resolved by Dylan.
2018-06-18 13:30:18 -07:00
Eric Engestrom
6bc8fcbc94 radv: fix bitwise check
Fixes: 922cd38172 "radv: implement out-of-order rasterization when it's safe on VI+"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4d08c1e7d1)
2018-06-18 10:44:03 -07:00
Eric Engestrom
9b8c90fc67 radv: fix reported number of available VGPRs
It's a bit late to round up after an integer division.

Fixes: de88979413 "radv: Implement VK_AMD_shader_info"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
(cherry picked from commit d85fef1e34)
2018-06-18 10:43:57 -07:00
Samuel Pitoiset
c86a99adef radv: fix emitting the TCS regs on GFX9
The primitive ID is NULL and this generates an invalid
select instruction which crashes because one operand is NULL.

This fixes crashes in The Long Journey Home, Quantum Break
and Just Cause 3 with DXVK.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106756
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 5917761e3d)
2018-06-18 10:43:19 -07:00
Ian Romanick
a4609fe84f glsl: Don't copy propagate elements from SSBO or shared variables either
Since SSBOs can be written by a different GPU thread, copy propagating a
read can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

The same shader was helped by this patch and the previous.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399119 -> 14399113 (<.01%)
instructions in affected programs: 683 -> 677 (-0.88%)
helped: 1
HURT: 0

total cycles in shared programs: 532973113 -> 532971865 (<.01%)
cycles in affected programs: 524666 -> 523418 (-0.24%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
(cherry picked from commit 37bd9ccd21)
2018-06-15 13:55:04 -07:00
Ian Romanick
f3ec346ab1 glsl: Don't copy propagate from SSBO or shared variables either
Since SSBOs can be written by other GPU threads, copy propagating a read
can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399120 -> 14399119 (<.01%)
instructions in affected programs: 684 -> 683 (-0.15%)
helped: 1
HURT: 0

total cycles in shared programs: 532978931 -> 532973113 (<.01%)
cycles in affected programs: 530484 -> 524666 (-1.10%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
(cherry picked from commit 461a5c899c)
2018-06-15 13:55:04 -07:00
Christian Gmeiner
ae39496831 util/bitset: include util/macro.h
BITSET_FFS(x) macro makes use of ARRAY_SIZE(x) macro which is
defined in util/macro.h. Include it directy to make usage more
straightforward.

Fixes: 692bd4a1ab ("util: replace Elements() with ARRAY_SIZE()")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit efae127993)
2018-06-15 13:55:04 -07:00
Lukas Rusak
2008ca24d7 meson: fix private libs when building without glx
I noticed that the generated pkg-config files will include
glx and x11 dependencies even when x11 isn't a selected platform.

This fixes the private libs and was tested by building kmscube

V2:
  - check if gallium-xlib is being used for glx

Fixes: 108d257a16 "meson: build libEGL"
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 4cfc4cef80)
2018-06-15 13:55:04 -07:00
Lukas Rusak
87453e9fe1 meson: only build vl_winsys_dri.c when x11 platform is used
This seems to have been missed in the move from autotools

This fixes the following build issue:

../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory
 #include <X11/Xlib-xcb.h>
          ^~~~~~~~~~~~~~~~

Fixes: b1b65397d0
       ("meson: Build gallium auxiliary")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 1d92d6486a)
2018-06-15 13:55:04 -07:00
Dave Airlie
156d0230a5 glsl: allow standalone semicolons outside main()
GLSL 4.60 offically added this but games and older CTS suites actually
had shaders that did this, we may as well enable it everywhere.

Adding stable because it appears apps in the wild do this.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit babd1d526b)
2018-06-15 13:55:04 -07:00
Marek Olšák
49f43bdcbf ac/gpu_info: report real total memory sizes
The change from MIN2 to MAX2 is intentional.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 95ecde42eb)
2018-06-15 13:55:04 -07:00
Marek Olšák
66bc41a3f7 radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers
This fixes:
GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 6d671078a8)
2018-06-15 13:55:04 -07:00
Samuel Pitoiset
b274062906 radv: update the ZRANGE_PRECISION value for the TC-compat bug
On GFX8+, there is a bug that affects TC-compatible depth surfaces
when the ZRange is not reset after LateZ kills pixels.

The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match
the last fast clear value. Because the value is set to 1 by default,
we only need to update it when clearing Z to 0.0.

We also need to set the depth clear regs and to update
ZRANGE_PRECISION when initializing a TC-compat depth image to 0.

Original patch from James Legg.

This fixes random CTS fails with
dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 68dead112e)
2018-06-15 13:55:04 -07:00
Samuel Pitoiset
888b7fcaf4 radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8
This causes rendering issues in Shadow Warrior 2 with DXVK.

Cc: mesa-stable@lists.freedesktop.org
Fixes: ccc64f3133 ("radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 51e23d3419)
2018-06-15 13:55:04 -07:00
Bas Nieuwenhuizen
c41a74622d radv: Fix output for sparse MRTs.
We need to init the cb_shader_format correctly with the changed
col_format, so this moves the col_format adjustment to before the
adjustment to before the cb_shader_mask gets generated.

Fixes: 06d3c65098 "radv: fix a GPU hang when MRTs are sparse"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 41dabdc475)
2018-06-15 13:55:04 -07:00
Dylan Baker
dd355ee90e docs: Add release notes for 18.1.2 2018-06-15 13:45:10 -07:00
Dylan Baker
aa1f6934f8 version: bump version for 18.1.2 release 2018-06-15 13:36:06 -07:00
Andrew Galante
87fb8bf6a8 configure.ac: Test for __atomic_add_fetch in atomic checks
Some platforms have 64-bit __atomic_load_n but not 64-bit
__atomic_add_fetch, so test for both of them.

Bug: https://bugs.gentoo.org/655616
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit baf16b2ea3)
2018-06-13 11:07:06 -07:00
Andrew Galante
b9725e3aa4 meson: Test for __atomic_add_fetch in atomic checks
Some platforms have 64-bit __atomic_load_n but not 64-bit
__atomic_add_fetch, so test for both of them.

Bug: https://bugs.gentoo.org/655616
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 9d547a7617)
2018-06-13 11:07:00 -07:00
Matt Turner
54b38ea495 meson: Fix -latomic check
Commit 54ba73ef10 (configure.ac/meson.build: Fix -latomic test) fixed
some checks for -latomic, and then commit 54bbe600ec (configure.ac:
rework -latomic check) further extended the fixes in configure.ac but
not in Meson. This commit extends those fixes to the Meson tests.

Fixes: 54bbe600ec (configure.ac: rework -latomic check)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit b29b5a82a1)
2018-06-13 11:06:55 -07:00
Thomas Petazzoni
f9500edb96 configure.ac: rework -latomic check
The configure.ac logic added in commit
2ef7f23820 ("configure: check if
-latomic is needed for __atomic_*") makes the assumption that if a
64-bit atomic intrinsic test program fails to link without -latomic,
it is because we must use -latomic.

Unfortunately, this is not completely correct: libatomic only appeared
in gcc 4.8, and therefore gcc versions before that will not have
libatomic, and therefore don't provide atomic intrinsics for all
architectures. This issue was for example encountered on PowerPC with
a gcc 4.7 toolchain, where the build fails with:

powerpc-ctng_e500v2-linux-gnuspe/bin/ld: cannot find -latomic

This commit aims at fixing that, by not assuming -latomic is
available. The commit re-organizes the atomic intrinsics detection as
follows:

 (1) Test if a program using 64-bit atomic intrinsics links properly,
     without -latomic. If this is the case, we have atomic intrinsics,
     and we're good to go.

 (2) If (1) has failed, then test to link the same program, but this
     time with -latomic in LDFLAGS. If this is the case, then we have
     atomic intrinsics, provided we link with -latomic.

This has been tested in three situations:

 - On x86-64, where atomic instrinsics are all built-in, with no need
   for libatomic. In this case, config.log contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#'
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE=''
   LIBATOMIC_LIBS=''

   This means: atomic intrinsics are available, and we don't need to
   link with libatomic.

 - On NIOS2, where atomic intrinsics are available, but some of them
   (64-bit ones) require using libatomic. In this case, config.log
   contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#'
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE=''
   LIBATOMIC_LIBS='-latomic'

   This means: atomic intrinsics are available, and we need to link
   with libatomic.

 - On PowerPC with an old gcc 4.7 toolchain, where 32-bit atomic
   instrinsics are available, but not 64-bit atomic instrinsics, and
   there is no libatomic. In this case, config.log contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE=''
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='#'

   With means that atomic intrinsics are not usable.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
(cherry picked from commit 54bbe600ec)
2018-06-13 11:06:50 -07:00
Nicolas Boichat
c001ecebe6 configure.ac/meson.build: Fix -latomic test
When compiling with LLVM 6.0 on x86 (32-bit) for Android, the test
fails to detect that -latomic is actually required, as the atomic
call is inlined.

In the code itself (src/util/disk_cache.c), we see this pattern:
p_atomic_add(cache->size, - (uint64_t)size);
where cache->size is an uint64_t *, and results in the following
link time error without -latomic:
src/util/disk_cache.c:628: error: undefined reference to '__atomic_fetch_add_8'

Fix the configure/meson test to replicate this pattern, which then
correctly realizes the need for -latomic.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
(cherry picked from commit 54ba73ef10)
2018-06-13 11:06:43 -07:00
Dylan Baker
d9c527942f cherry-ignore: Add another patch 2018-06-12 08:45:14 -07:00
Kenneth Graunke
31c70b88a7 anv: Disable __gen_validate_value if NDEBUG is set.
We were enabling undefined memory checking for genxml values based on
Valgrind being installed at build time, even for release builds.  This
generates piles and piles of assembly whenever you touch genxml.

With gcc 7.3.1 and -O3 and -march=native on a Kabylake with Valgrind
installed at build time:

      text    data    bss     dec    hex filename
   5978385  262884  13488 6254757 5f70a5 libvulkan_intel.so
   3799377  262884  13488 4075749 3e30e5 libvulkan_intel.so

That's a 36% reduction in text size.

Fixes: 047ed02723 (vk/emit: Use valgrind to validate every packed field)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0d5329d626)
2018-06-12 08:43:35 -07:00
Jason Ekstrand
5a3107b7ec i965/screen: Return false for unsupported formats in query_modifiers
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7d55d7d54d6855b2b6cb183d0aa87fce1c7b9e5e)

v2: - Remove __DRI_IMAGE_FOURCC_SABGR8888 which doesn't exist on 18.1

V2 by Dylan, changes suggested by Jason
2018-06-11 10:45:54 -07:00
Samuel Pitoiset
5bf748045c radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold
Workaround for bug in llvm that causes the GPU to hang in presence
of nested loops because there is an exec mask issue. The proper
solution is to fix LLVM but this might require a bunch of work.

This fixes a bunch of GPU hangs that happen with DXVK.

Vega10:
Totals from affected shaders:
SGPRS: 110456 -> 110456 (0.00 %)
VGPRS: 122800 -> 122800 (0.00 %)
Spilled SGPRs: 7478 -> 7478 (0.00 %)
Spilled VGPRs: 36 -> 36 (0.00 %)
Code Size: 9901104 -> 9922928 (0.22 %) bytes
Max Waves: 7143 -> 7143 (0.00 %)

Code size slightly increases because it inserts more branch
instructions but that's expected. I don't see any real performance
changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105613
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 135e4d434f)

Conflicts:
	src/amd/vulkan/radv_shader.c

There was a minor conflict in the last hunk of this patch that was
manually resolved.
2018-06-11 10:08:30 -07:00
Samuel Pitoiset
70e56e8ea3 radv: fix missing ZRANGE_PRECISION(1) for GFX9+
ZRANGE_PRECISION(1) seems to be the default optimal value, but
it was only set for VI and older chips.

This fixes a rendering issue with Banished through DXVK, and
might fix more than that.

There is still the ZRANGE_PRECISION bug that we need to handle
but that can be fixed later.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 94706f0de4)
2018-06-11 10:08:30 -07:00
Eric Engestrom
0de0b5c406 i965: fix resource leak
v2: intel_miptree_release() already takes care of the planes, no need
    to hand-code the loop (Lionel)

Coverity ID: 1436909
Fixes: 3352f2d746 "i965: Create multiple miptrees for planar YUV images"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit e43c012433)
2018-06-11 10:08:30 -07:00
Jordan Justen
98dfe161b0 mesa/program_binary: add implicit UseProgram after successful ProgramBinary
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106810
Fixes: b4c37ce214 "i965: Add ARB_get_program_binary support using nir_serialization"
Ref: 3fe8d04a6d "mesa: don't always set _NEW_PROGRAM when linking"
Ref: c505d6d852 "mesa: use gl_program for CurrentProgram rather than gl_shader_program"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit e266b32059)
2018-06-11 10:08:30 -07:00
Timothy Arceri
bd60b6ef54 radeonsi: fix possible truncation on renderer string
Fixes truncation warning in gcc 8.1

Fixes: 8539c9bf31 ("gallium/radeon: add the kernel version into the renderer string")
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 03c370d2f1)
2018-06-11 10:08:30 -07:00
Timothy Arceri
0376e1a79a ac: fix possible truncation of intrinsic name
Fixes the gcc warning:
snprintf’ output between 26 and 33 bytes into a destination of size 32

Fixes: d5f7ebda3e ("ac: add LLVM build functions for subgroup instrinsics")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit fae3b38770)
2018-06-11 10:08:30 -07:00
Dylan Baker
dd173da7a2 meson: work around gentoo applying -m32 to host compiler in cross builds
Gentoo's ebuild system always adds -m32 to the compiler for doing x86_64
-> x86 cross builds, while meson expects it not to do that. This
results in an x86 -> x86 cross build, and assembly gets disabled.

Fixes: 2d62fc0646
       ("meson: disable x86 asm in fewer cases.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 8f2421d73b)

Conflicts:
	meson.build

I manually resolved the conflicts between this patch and 18.1 to avoid
pulling in another patch.
2018-06-11 10:08:30 -07:00
Jason Ekstrand
d6533c02d0 anv: Set fence/semaphore types to NONE in impl_cleanup
There were some places that were calling anv_semaphore_impl_cleanup and
neither deleting the semaphore nor setting the type back to NONE.  Just
set it to NONE in impl_cleanup to avoid these issues.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106643
Fixes: 031f57eba "anv: Add a basic implementation of VK_KHX_external..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 237c5ac4f9)
2018-06-11 10:08:30 -07:00
Cameron Kumar
3737e4df7c vulkan/wsi: Destroy swapchain images after terminating FIFO queues
The queue_manager thread can access the images from x11_present_to_x11,
hence this reorder prevents dereferencing of dangling pointers.

Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Fixes: e73d136a02 ("vulkan/wsi/x11: Implement FIFO mode.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit cb03803253)
2018-06-11 10:08:30 -07:00
Dylan Baker
ab9e4f9733 cherry-ignore: Add patches from Jason that he rebased on 18.1
These 5 commits don't apply cleanly, so a rebased version was supplied
by Jason.
2018-06-11 10:08:30 -07:00
Jason Ekstrand
5fb59f4686 i965/screen: Use RGBA non-sRGB formats for images
Not all of the MESA_FORMAT and ISL_FORMAT helpers we use can properly
handle RGBX formats.  Also, we don't want to make decisions based on
those in the first place because we can't render to RGBA and we use the
non-sRGB version to determine whether or not to allow CCS_E.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7ed8c125ad3df29ce1ae9fcb82762a589cd8817c)
2018-06-11 10:08:30 -07:00
Jason Ekstrand
47e22895eb i965/screen: Refactor query_dma_buf_formats
This reworks it to work like query_dma_buf_modifiers and, in particular,
makes it more flexible so that we can disallow a non-static set of
formats.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 8c4c5a09c076832328b6f926c043dbbf8708d0b8)
2018-06-08 10:19:55 -07:00
Jason Ekstrand
c2e44a3540 intel/isl: Add bounds-checking assertions for the format_info table
We follow the same convention as isl_format_get_layout in having two
assertions to ensure that only valid formats are passed in.  We also
check against the array size of the table because some valid formats
such as CCS formats will may be past the end of the table.  This fixes
some potential out-of-bounds array access even in valid cases.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 45c05a7cee32990feda89685f1080b7251fa1a59)
2018-06-08 10:19:54 -07:00
Jason Ekstrand
0046ee83d7 intel/isl: Add bounds-checking assertions in isl_format_get_layout
We add two assertions instead of one because the first assertion that
format != ISL_FORMAT_UNSUPPORTED is more descriptive and checks for a
real but unsupported enumerant while the second ensures that they don't
pass in garbage values.  We also update some other helpers to use
isl_format_get_layout instead of using the table directly so that they
get bounds checking too.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 4bbe4cf4270b84052d672fe5fe99d3e594c53124)
2018-06-08 10:18:29 -07:00
Jason Ekstrand
cc4743a518 intel/blorp: Don't vertex fetch directly from clear values
On gen8+, we have to VF cache flush whenever a vertex binding aliases a
previous binding at the same index modulo 4GiB.  We deal with this in
Vulkan by ensuring that vertex buffers and the dynamic state (from which
BLORP pulls its vertex buffers) are in the same 4GiB region of the
address space.  That doesn't work if we're reading clear colors with the
VF unit.  In order to work around this we switch to using MI commands to
copy the clear value into the vertex buffer we allocate for the normal
constant data.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44c614843c)
2018-06-07 08:30:48 -07:00
Scott D Phillips
8b76c603dd intel/tools: add intel_sanitize_gpu to EXTRA_DIST
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106778
Fixes: cc41603d6d ("intel/tools: new intel_sanitize_gpu tool")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 6fb22114a0)
2018-06-06 08:28:10 -07:00
Marek Olšák
3a85a6180b r300g/swtcl: make pipe_context uploaders use malloc'd memory as before
Discovered by Roland Scheidegger.

The resource_create code uses GPU memory for PIPE_BIND_CUSTOM, but
malloc'd memory otherwise. Vertex and index buffers should use malloc'd
memory.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 17a42062cc)
2018-06-06 08:27:58 -07:00
Philip Rebohle
f5865f1f58 radv: Use correct color format for fast clears
Using the image format is incorrect when the view has a different
format than the image. Instead, the view format needs to be used.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106687
(cherry picked from commit cc21e96d5f)
2018-06-06 08:27:50 -07:00
Eric Engestrom
b7c27dc7a3 configure: radv depends on mako
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106784
Fixes: 17201a2eb0 "radv: port to using updated anv
                              entrypoint/extension generator."
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit c765c39ea7)
2018-06-05 10:04:23 -07:00
Michel Dänzer
d6aacb54b5 glx: Fix number of property values to read in glXImportContextEXT
We were trying to read twice as many as the X server sent us, which
upset XCB:

[xcb] Too much data requested from _XRead
[xcb] This is most likely caused by a broken X extension library
[xcb] Aborting, sorry about that.
glx-free-context: ../../src/xcb_io.c:732: _XRead: Assertion `!xcb_xlib_too_much_data_requested' failed.

Fixing this takes 3 GLX piglit tests from crash to pass.

Fixes: 0852162950 "glx: Be more tolerant in glXImportContext (v2)"
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 6b8f3724c8)
2018-06-05 10:04:17 -07:00
Eric Engestrom
f9bff89350 autotools: add missing android file to package
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106779
Fixes: ff904978a1 "gallium/util: Android backtrace support"
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 66c61797ad)
2018-06-05 08:41:12 -07:00
Jason Ekstrand
4599cebfb3 intel/eu: Set flag [sub]register number differently for 3src
Prior to gen8, the flag [sub]register number is in a different spot on
3src instructions than on other instructions.  Starting with Broadwell,
they made it consistent.  This commit fixes bugs that occur when a
conditional modifier gets propagated into a 3src instruction such as a
MAD.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit db9675f5a4)
2018-06-05 08:40:59 -07:00
Jason Ekstrand
371e5bd93b intel/eu: Copy fields manually in brw_next_insn
Instead of doing a memcpy, this moves us to start with a blank
instruction (memset to zero) and copy the fields over one at a time.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2d20303e18)
2018-06-05 08:40:53 -07:00
Jason Ekstrand
c317fef670 intel/eu: Add some brw_get_default_ helpers
This is much cleaner than everything that wants a default value poking
at the bits of p->current directly.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 381fac2740)
2018-06-05 08:40:39 -07:00
Kenneth Graunke
96cefc2cbc i965: Fix batch-last mode to properly swap BOs.
On pre-4.13 kernels, which don't support I915_EXEC_BATCH_FIRST, we move
the validation list entry to the end...but incorrectly left the exec_bo
array alone, causing a mismatch where exec_bos[0] no longer corresponded
with validation_list[0] (and similarly for the last entry).

One example of resulting breakage is that we'd update bo->gtt_offset
based on the wrong buffer.  This wreaked total havoc when trying to use
softpin, and likely caused unnecessary relocations in the normal case.

Fixes: 29ba502a4e (i965: Use I915_EXEC_BATCH_FIRST when available.)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit b3ba47c592)
2018-06-04 11:05:27 -07:00
Samuel Pitoiset
8952c4370d radv: fix a GPU hang when MRTs are sparse
When the i-th target format is set, all previous target formats
must be non-zero to avoid hangs. In other words, without this
if a fragment shader exports mrt0, mrt2 and mrt3, the GPU hangs
because the target format of mrt1 is zero.

This fixes DXVK GPU hangs with "Seven: The Days Long Gone",
"GTA V" and probably more games.

Cc: "18.0" 18.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 06d3c65098)
2018-06-04 11:05:11 -07:00
Bas Nieuwenhuizen
1a2205636c radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.
Otherwise on pre-GFX9, if the constant layout allows both TESS_EVAL and
GEOMETRY shaders, but the PIPELINE has only GEOMETRY, it would return the
GEOMETRY shader for the TESS_EVAL shader.

This would cause the flush_constants code to emit the GEOMETRY constants
to the TESS_EVAL registers and then conclude that it did not need to set
the GEOMETRY shader registers.

Fixes: dfff9fb6f8 "radv: Handle GFX9 merged shaders in radv_flush_constants()"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2835b6baf4)
2018-06-04 11:05:04 -07:00
Juan A. Suarez Romero
44d164537f glsl: Add ir_binop_vector_extract in NIR
Implement ir_binop_vector_extract using NIR operations. Based on SPIR-V
to NIR approach.

This fixes:
dEQP-GLES3.functional.shaders.indexing.moredynamic.with_value_from_indexing_expression_fragment
Piglit's glsl-fs-vec4-indexing-8.shader_test

CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral <itoral@igalia.com>
(cherry picked from commit cbe4baed1f)
2018-06-04 11:04:58 -07:00
Alex Smith
f0ad21dfce radv: Set active_stages the same whether or not shaders were cached
With GFX9 merged shaders, active_stages would be set to the original
stages specified if shaders were not cached, but to the stages still
present after merging if they were.

Be consistent and use the original stages.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 0fa51bfdbe)
[Dylan Baker: resolve trivial conflict]

Conflicts:
    src/amd/vulkan/radv_pipeline.c
2018-06-01 08:43:25 -07:00
Alex Smith
0dd373e841 radeonsi: Fix crash on shaders using MSAA image load/store
The value returned by tgsi_util_get_texture_coord_dim() does not
account for the sample index. This means image_fetch_coords() will not
fetch it, leading to a null deref in ac_build_image_opcode() which
expects it to be present (the return value of ac_num_coords() *does*
include the sample index).

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 01a2414045)
2018-06-01 08:41:09 -07:00
Alex Smith
d6245b0a38 radv: Handle GFX9 merged shaders in radv_flush_constants()
This was not previously handled correctly. For example,
push_constant_stages might only contain MESA_SHADER_VERTEX because
only that stage was changed by CmdPushConstants or
CmdBindDescriptorSets.

In that case, if vertex has been merged with tess control, then the
push constant address wouldn't be updated since
pipeline->shaders[MESA_SHADER_VERTEX] would be NULL.

Use radv_get_shader() instead of getting the shader directly so that
we get the right shader if merged. Also, skip emitting the address
redundantly - if two merged stages are set in push_constant_stages
this change would have made the address get emitted twice.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit dfff9fb6f8)
2018-06-01 08:41:08 -07:00
Alex Smith
0d0135276f radv: Consolidate GFX9 merged shader lookup logic
This was being handled in a few different places, consolidate it into a
single radv_get_shader() function.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 7ca0167ae9)
2018-06-01 08:41:08 -07:00
Dylan Baker
675495b575 cherry-ignore: add commits not to pull 2018-06-01 08:41:08 -07:00
Dylan Baker
85d02bcf76 docs/relnotes: Add sha256 sums for mesa 18.1.1 2018-06-01 08:34:21 -07:00
Dylan Baker
1f3536b80b docs: Add release notes for 18.1.1 2018-05-31 15:22:49 -07:00
Dylan Baker
32dbb286c5 VERSION: bump to 18.1.1 for next release 2018-05-31 13:08:36 -07:00
Marek Olšák
e3897b667a mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)
Bindless texture handles can be passed via vertex attribs using this type.
This fixes a bunch of bindless piglit tests on radeonsi.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit a8e1413876)
2018-05-30 16:56:47 -07:00
Eric Engestrom
df303adc29 vulkan: don't free uninitialised memory
The modifiers array hasn't been initialised by then, much less with data
that would need freeing.
Move the label after the loop to fix this.

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit e4fe2fd3bb)
2018-05-30 16:56:47 -07:00
Ilia Mirkin
b6167e9640 nv30: ensure that displayable formats are marked accordingly
Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 30918b77ac)
2018-05-30 16:56:47 -07:00
Bas Nieuwenhuizen
55de9df50e radv: Only expose subgroup shuffles on VI+.
The current implementation depends on bpermute, which
is VI+.

Fixes: f2c6a55061 "radv: enable subgroup capabilities"
Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit c2799574eb)
2018-05-30 16:56:47 -07:00
Thierry Reding
5015436eaf tegra: Remove usage of non-stable UAPI
This code path is no longer required with framebuffer modifier support.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
(cherry picked from commit bd3e97e5aa)
2018-05-30 16:56:47 -07:00
Thierry Reding
92c633881e tegra: Fix scanout resources without modifiers
Resources created for scanout but without modifiers need to be treated
as pitch-linear. This is because applications that don't use modifiers
to create resources must be assumed to not understand modifiers and in
turn won't be able to create a DRM framebuffer and passing along which
modifiers were picked by the implementation.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
(cherry picked from commit 9603d81df0)
2018-05-30 16:56:47 -07:00
Thierry Reding
2b8f87cefe tegra: Treat resources with modifiers as scanout
Resources created with modifiers are treated as scanout because there is
no way for applications to specify the usage (though that capability may
be useful to have in the future). Currently all the resources created by
applications with modifiers are for scanout, so make sure they have bind
flags set accordingly.

This is necessary in order to properly export buffers for such resources
so that they can be shared with scanout hardware.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
(cherry picked from commit 9e539012df)
2018-05-30 16:56:47 -07:00
Jason Ekstrand
1dd9532d15 intel/blorp: Support blits and clears on surfaces with offsets
For certain EGLImage cases, we represent a single slice or LOD of an
image with a byte offset to a tile and X/Y intratile offsets to the
given slice.  Most of i965 is fine with this but it breaks blorp.  This
is a terrible way to represent slices of a surface in EGL and we should
stop some day but that's a very scary and thorny path.  This gets blorp
to start working with those surfaces and fixes some dEQP EGL test bugs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ae514ca695)
2018-05-30 16:56:47 -07:00
Marek Olšák
36e65cc47f radeonsi: fix incorrect parentheses around VS-PS varying elimination
I don't know if it caused issues.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 92ea9329e5)
2018-05-30 16:56:47 -07:00
Marek Olšák
3cb16ce7e9 st/mesa: simplify lastLevel determination in st_finalize_texture
This fixes shader images where we always bind stObj->pt and not individual
gl_texture_images.

Roughly based on i965 commit 845ad2667a
which does a similar thing but for a different reason.

This fixes GL CTS assertion failures introduced by Ilia.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a4ba7cd6a2)
2018-05-30 16:56:47 -07:00
Jose Dapena Paz
7b588bf361 mesa: do not leak ctx->Shader.ReferencedProgram references
When glUseProgram is used, references to the included shaders are
added in ctx->Shader.ReferencedProgram. But those references are not
decreased when the shader data is deallocated. Thus, those shaders
are leaked.

Explicitely remove the pending references to these shaders.

Fixes: e6506b3cd2 ("mesa: retain gl_shader_programs after glDeleteProgram if they are in use")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 6c61c31dc2)
2018-05-30 16:56:47 -07:00
Francisco Jerez
54ff500546 i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.
Instead of directly using intel_obj->buffer.  Among other things
intel_bufferobj_buffer() will update intel_buffer_object::
gpu_active_start/end, which are used by glBufferSubData() to decide
which path to take.  Fixes a failure in the Piglit
ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests,
which could be reproduced with a non-standard glGetTexSubImage
implementation (see bug report).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351
Reported-by: Nanley Chery <nanleychery@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 936cd3c87a)
2018-05-30 16:56:47 -07:00
Francisco Jerez
9c1ab9609c i965: Handle non-zero texture buffer offsets in buffer object range calculation.
Otherwise the specified surface state will allow the GPU to access
memory up to BufferOffset bytes past the end of the buffer.  Found by
inspection.

v2: Protect against out-of-range BufferOffset (Nanley).
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit e989acb03b)
2018-05-30 16:56:47 -07:00
Francisco Jerez
1b5f4fca95 i965: Move buffer texture size calculation into a common helper function.
The buffer texture size calculations (should be easy enough, right?)
are repeated in three different places, each of them subtly broken in
a different way.  E.g. the image load/store path was never fixed to
clamp to MaxTextureBufferSize, and none of them are taking into
account the buffer offset correctly.  It's easier to fix it all in one
place.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 156d2c6e62)
2018-05-30 16:56:47 -07:00
Francisco Jerez
d795a1a691 Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"
This reverts commit c0ed52f614.  It was
preventing the image format validation from being done on buffer
textures, which is required to ensure that the application doesn't
attempt to bind a buffer texture with an internal format incompatible
with the image unit format (e.g. of different texel size), which is
not allowed by the spec (it's not allowed for *any* texture target,
whether or not there is spec wording restricting this behavior
specifically for buffer textures) and will cause the driver to
calculate texel bounds incorrectly and potentially crash instead of
the expected behavior.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 5a68147803)
2018-05-30 16:56:47 -07:00
Dave Airlie
7366a91340 tgsi/scan: add hw atomic to the list of memory accessing files
This fixes 4 out of 5 cases in:
arb_framebuffer_no_attachments-atomic on cayman.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f2f464de57)
2018-05-30 16:56:47 -07:00
Michel Dänzer
e4b86e96e8 dri3: Stricter SBC wraparound handling
Prevents corrupting the upper 32 bits of draw->recv_sbc when
draw->send_sbc resets to 0 (which currently happens when the window is
unbound from a context and bound to one again), which in turn caused
loader_dri3_swap_buffers_msc to calculate target_msc with corrupted
upper 32 bits. This resulted in hangs with the Xorg modesetting driver
as of xserver 1.20 (older versions and other drivers ignored the upper
32 bits of the target MSC, which is why this wasn't noticed earlier).

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/106351
Tested-by: Mike Lothian <mike@fireburn.co.uk>
(cherry picked from commit fe2edb25dd)
2018-05-30 16:56:47 -07:00
Jason Ekstrand
14689dccbf intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0
Fixes: d6cd14f213 "i965/fs: Define new shader opcode to..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
(cherry picked from commit 417b9e5770)
2018-05-30 16:56:47 -07:00
Anuj Phogat
66604a7518 i965/glk: Add l3 banks count for 2x6 configuration
2x6 configuration with pci-id 0x3185 has same number of
banks (2) as 3x6 configuration (pci-id 0x3184).

Reported-by: Clayton Craft <clayton.a.craft@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: eb23be1d97 "i965: Add and initialize l3_banks field for gen7+"
Cc: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0748383a60)
2018-05-30 16:56:47 -07:00
Samuel Pitoiset
fd18f81a0c radv: fix centroid interpolation
It's legal to set the centroid and sample interpolation modes
when MSAA disabled. So, we have to initialize the centroid
inputs because the hardware doesn't.

This fixes rendering issues with DXVK and The Witness, World of
Warcraft, Trackmania and probably more games.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106315
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102390
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 73df16dcee)
2018-05-30 16:56:47 -07:00
Bas Nieuwenhuizen
b49daeac88 radv: Fix SRGB compute copies.
SRGB stores are broken. We had compensation code in the
resolve path but none in the copy path. Since we don't
want any conversion and it does not matter for DCC,
just make everything UNORM instead.

This happened to cause wrong colors for the PRIME path, as
that uses image->buffer copies which always use the compute
path.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106587
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a63a0960e3)
2018-05-30 16:56:47 -07:00
Bas Nieuwenhuizen
f969afdf2d amd/addrlib: Use defines in autotools build.
Otherwise stuff like NDEBUG would not be passed through.

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106479
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 62e0e089d7)
2018-05-30 16:56:47 -07:00
Dave Airlie
5d4b5b0745 virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.
The host side hasn't got support for this feature yet, so don't enable it
unless we get the caps from the host.

This makes the texture buffer range piglit tests skip now.

Fixes: fe0647df5a (virgl: add offset alignment values to to v2 caps struct)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
(cherry picked from commit bfa74bb44d)
2018-05-30 16:56:47 -07:00
Christoph Haag
5b79bbe27d radv: fix VK_EXT_descriptor_indexing
GetPhysicalDeviceProperties2KHR() was crashing because features was null

Fixes: 0e10790558 "radv: Enable VK_EXT_descriptor_indexing."
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 549e54270b)
2018-05-30 16:56:47 -07:00
Nanley Chery
6fe826ee06 i965/miptree: Zero-initialize CCS_D buffers
Before this patch, the aux_state was actually AUX_INVALID because the BO
was never defined. This was fine on single slice miptrees because we
would fast-clear the resource right after creation. For multi-slice
miptrees on SKL+ however, this results in undefined behavior when
accessing a non-base slice. Here's a specific example:

1) Fast clear level 0
   * Undefined CCS_D buffer allocated in "PASS_THROUGH" state.
   * Level 0 transitions to the CLEAR state.
2) Render to level 1
   * Level 1 may have a 2-bit pattern of 2's.
   * Rendering with a 2 in the CCS is undefined.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8a9491058d)
2018-05-30 16:56:47 -07:00
Nanley Chery
66fa6eceb9 i965/miptree: Fix handling of uninitialized MCS buffers
Before this patch, if we failed to initialize an MCS buffer, we'd
end up in a state in which the miptree thinks it has an MCS buffer,
but doesn't. We also leaked the clear_color_bo if it existed.

With this patch, we now free the miptree aux buffer resources and let
intel_miptree_alloc_mcs() know that the MCS buffer no longer exists.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 816f2dc67d)
2018-05-30 16:56:47 -07:00
Nanley Chery
21d51b4cea i965: Add and use a single miptree aux_buf field
We want to add and use a function that accesses the auxiliary buffer's
clear_color_bo and doesn't care if it has an MCS or HiZ buffer
specifically.

v2 (Jason Ekstrand):
* Drop intel_miptree_get_aux_buffer().
* Mention CCS in the aux_buf field.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit af4e9295fe)
2018-05-30 16:56:47 -07:00
Nanley Chery
db4e345047 i965: Add and use a getter for the miptree aux buffer
Make the next patch easier to read by eliminating most of the would-be
duplicate field accesses now.

v2: Update the HiZ comment instead of deleting it (Rafael).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 5503b65103)
2018-05-30 16:56:47 -07:00
Timothy Arceri
c9074b3c52 mesa: add glUniform*ui{v} support to display lists
Fixes: a017c7ecb7 "mesa: display list support for uint uniforms"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78097
(cherry picked from commit f71714022b)
2018-05-30 16:56:47 -07:00
Stuart Young
4978181fdf etnaviv: Fix missing rnndb file in tarballs
Seems that when the rnndb files for etniviv were updated/included back
in Nov 2017, hw/texdesc_3d.xml.h was missed from Makefile.sources and
meson.build. This was all during the conversion to meson, so it apears
to have slipped through the cracks. As such, this file has been missing
from the official tarballs since inclusion in Mesa, so the git trees
and tarballs differ.

Found due to lintian errors in the Debian packages.

Fixes: f1e1c60ff6 ("etnaviv: Update from rnndb")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit f806cc9eb6)
2018-05-30 16:56:47 -07:00
Jan Vesely
ac152cb1f9 eg/compute: Use reference counting to handle compute memory pool.
Use pipe_reference to release old RAT surfaces.
RAT surface adds a reference to pool bo, so use reference counting for pool->bo
as well.

v2: Use the same pattern for both defrag paths
    Drop confusing comment

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f3521ce2c4)
2018-05-18 16:47:02 -07:00
Samuel Pitoiset
16591feedd spirv: fix visiting inner loops with same break/continue block
We should stop walking through the CFG when the inner loop's
break block ends up as the same block as the outer loop's
continue block because we are already going to visit it.

This fixes the following assertion which ends up by crashing
in RADV or ANV:

SPIR-V parsing FAILED:
In file ../src/compiler/spirv/vtn_cfg.c:381
block->node.link.next == NULL
0 bytes into the SPIR-V binary

This also fixes a crash with a camera shader from SteamVR.

v2: make use of vtn_get_branch_type() and add an assertion

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106090
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106504
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6bde8c5608)
2018-05-18 16:47:02 -07:00
Kai Wasserbäch
30e5b42fb9 opencl: autotools: Fix linking order for OpenCL target
Otherwise the build fails with an undefined reference to
clang::FrontendTimesIsEnabled.

Bugzilla: https://bugs.freedesktop.org/106209
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Acked-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: Aaron Watry <awatry@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit b691d9192c)
2018-05-18 16:47:02 -07:00
Bas Nieuwenhuizen
287b286931 radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.
The hardware always interprets the alpha as unsigned and fixing it
in the shader is going to add unacceptable overheads.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f944a59996)
2018-05-18 16:47:02 -07:00
Bas Nieuwenhuizen
0d321533b2 radv: Fix up 2_10_10_10 alpha sign.
Pre-Vega HW always interprets the alpha for this format as unsigned,
so we have to implement a fixup to do the sign correctly for signed
formats.

v2: Improve indexing mess.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3d4d388e39)
2018-05-18 16:47:02 -07:00
Bas Nieuwenhuizen
15e7f80cf9 radv: Translate logic ops.
radeonsi could pass them through but the enum changed between
Gallium and Vulkan, so we have to translate.

In progress I made the register defines a bit more readable.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100430
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit dd102405de)
2018-05-18 16:47:02 -07:00
Bas Nieuwenhuizen
f8a82e13f5 radv: Fix multiview queries.
This moves the extra queries to after the main query ended, instead
of doing it after the begin and hence doing nesting.

We also emit only (view count - 1) extra queries, as the main query
is already there for the first view.

This fixes the CTS occasionally getting stuck in
dEQP-VK.multiview.queries* waiting on results.

Fixes: 32b4f3c38d "radv/query: handle multiview queries properly. (v3)"
CC: 18.1 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 62f50df7b7)
2018-05-18 16:47:02 -07:00
Dave Airlie
0ea288d584 radv: use compute path for multi-layer images.
I don't think the hw resolve path can't handle multi-layer images.

This fixes all the:
dEQP-VK.renderpass.multisample_resolve.layers_*
tests on my VI card.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5978d54a09)
2018-05-18 16:47:02 -07:00
Dave Airlie
73b7828683 radv: resolve all layers in compute resolve path.
This path should iterate across all layers, I've some ideas
for doing this in a single pass, but this is simpler for now.

This passes the tests because we don't use the fragment path
unless we have DCC, and we don't have DCC on layered images.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 98dbaa445a)
2018-05-18 16:47:02 -07:00
Dave Airlie
c3ac3fea32 radv/resolve: do fmask decompress on all layers.
For a multi-layer subpass resolve we want to make sure we flush all
the layers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b16fc6cda1)
2018-05-18 16:47:02 -07:00
Dylan Baker
309509bbc0 docs: Add sha sums for release 2018-05-18 16:35:59 -07:00
Dylan Baker
204159b481 docs: Add release notes for 18.1.0 2018-05-18 11:29:45 -07:00
Dylan Baker
92b3efd37a Update version to 18.1.0 2018-05-18 11:00:51 -07:00
Dylan Baker
630559e054 Bump version to rc4 2018-05-11 08:41:18 -07:00
Jan Vesely
36c24a608e winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.
Fixes memory leak on module unload.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 58272c1ad7)
2018-05-11 08:35:13 -07:00
Marek Olšák
1def4eaa5c radeonsi/gfx9: work around a GPU hang due to broken indirect indexing in LLVM
Fixes: 6d19120da8 "radeonsi/gfx9: workaround for INTERP with indirect indexing"
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 597b9e8810)
2018-05-11 08:35:06 -07:00
Brian Paul
421aedcc59 mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT
Since size can be 3, 4 or GL_BGRA we need to keep these glGet types
as TYPE_INT, not TYPE_UBYTE.

Fixes: d07466fe18 ("mesa: fix glGetInteger/Float/etc queries for
vertex arrays attribs")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106462
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>

(cherry picked from commit e4211b36bb)
2018-05-11 08:34:58 -07:00
Brian Paul
5391c4f5fd mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs
The vertex array Size and Stride attributes are now ubyte and short,
respectively.  The glGet code needed to be updated to handle those
types, but wasn't.

Fixes the new piglit test gl-1.5-get-array-attribs test.

v2: fix inadvertant whitespace change, change COLOR_ARRAY_SIZE to UBYTE,
misc fixes suggested by Justin

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106450
Fixes: d5f42f96e1 ("mesa: shrink size of gl_array_attributes (v2)")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit d07466fe18)
2018-05-11 08:34:52 -07:00
Jason Ekstrand
e4c54e1031 i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL
From the bspec docs for "Indirect State Pointers Disable":

    "At the completion of the post-sync operation associated with this
    pipe control packet, the indirect state pointers in the hardware are
    considered invalid"

So the ISP disable is a post-sync type of operation which means that it
should be combined with a CS stall.  Without this, the simulator throws
an error.

Fixes: 766d801ca "anv: emit pixel scoreboard stall before ISP disable"
Fixes: f536097f6 "i965: require pixel scoreboard stall prior to ISP disable"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a8a740f272)
2018-05-10 08:31:54 -07:00
Matt Turner
3b9b1175aa gallium/tests: Fix assignment of EXTRA_DIST
Fixes: 6754c2e83d ("autotools: Include new meson files")
(cherry picked from commit 0f959215c3)
2018-05-10 08:31:54 -07:00
Lionel Landwerlin
7da026cb98 anv: emit pixel scoreboard stall before ISP disable
We want to make sure that all indirect state data has been loaded into
the EUs before disable the pointers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: 78c125af39 ("anv/gen10: Ignore push constant packets during context restore.")
(cherry picked from commit 766d801ca3)
2018-05-10 08:31:54 -07:00
Lionel Landwerlin
2c140183d4 i965: require pixel scoreboard stall prior to ISP disable
Invalidating the indirect state pointers might affect a previously
scheduled & still running 3DPRIMITIVE (causing page fault). So stall
on pixel scoreboard before that.

v2: Fix compile issue :(

v3: Stall on pixel scoreboard

v4: Drop the post sync operation (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: ca19ee33d7 ("i965/gen10: Ignore push constant packets during context restore.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106243
(cherry picked from commit f536097f67)
2018-05-10 08:31:54 -07:00
Lionel Landwerlin
e3106b460e i965: silence unused variable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2dc29e095f ("i965: Don't leak blorp on Gen4-5.")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 3853f1c6f4)
2018-05-10 08:31:54 -07:00
Jan Vesely
5519bb3c65 winsys/radeon: Destroy fd_hash table when the last winsys is removed.
Fixes memory leak on module unload.
v2: Use util_hash_table helper function

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
(cherry picked from commit 45dfa6f4e7)
2018-05-10 08:30:35 -07:00
Jan Vesely
5fc5b1dab8 gallium/auxiliary: Add helper function to count the number of entries in hash table
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
(cherry picked from commit d146768d13)
2018-05-10 08:30:29 -07:00
Dave Airlie
e90f0d711c r600: fix constant buffer bounds.
If you have an indirect access to a constant buffer on r600/eg
use a vertex fetch in the shader. However apps have expected
behaviour on those out of bounds accessess (even if illegal).

If the constants were being uploaded as part of a larger
upload buffer, we'd set the range of allowed access to a lot
larger than required so apps would get values back from
other parts of the upload buffer instead of the expected out
of bounds access.

This fixes rendering bugs in Trine and Witcher 1, thanks
to iive for nagging me effectively until I figured it out :-)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91808
Cc: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit ce027ac5c7)
2018-05-10 08:30:15 -07:00
Ross Burton
0d2d5d2ffa src/intel/Makefile.vulkan.am: add missing MKDIR_GEN
Out of tree builds can try to write into a directory that doesn't exist yet:

| Traceback (most recent call last):
|   File "../../../mesa-18.0.2/src/intel/vulkan/anv_icd.py", line 46, in <module>
|     with open(args.out, 'w') as f:
| IOError: [Errno 2] No such file or directory: 'vulkan/intel_icd.x86_64.json'
| Makefile:4882: recipe for target 'vulkan/intel_icd.x86_64.json' failed

Add missing MKDIR_GEN calls to solve this.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 1755654d9f)
2018-05-10 08:30:07 -07:00
Rhys Perry
8c384f62a2 mesa: fix error handling in get_framebuffer_parameteriv
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5ac16ed047)
2018-05-10 08:30:01 -07:00
Michel Dänzer
312c2f047e dri3: Only update number of back buffers in loader_dri3_get_buffers
And only free no longer needed back buffers there as well.

We want to stick to the same back buffer throughout a frame, otherwise
we can run into various issues.

Bugzilla: https://bugs.freedesktop.org/105906
Bugzilla: https://bugs.freedesktop.org/106399
Fixes: 3160cb86aa "egl/x11: Re-allocate buffers if format is suboptimal"
Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 6f81e07ecb)
2018-05-09 08:03:25 -07:00
Jan Vesely
eea5ec86d5 pipe-loader: Free driver_name in error path
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 0783399d79)
2018-05-09 08:03:06 -07:00
Brian Paul
2136602167 glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug
Change the size of the bitset from 128 bits to 96.  This works around an
apparent GCC 5.4 bug in which bad SSE code is generated, leading to a
crash in ast_type_qualifier::validate_in_qualifier() (ast_type.cpp:654).

This can be repro'd with the Piglit test tests/spec/glsl-1.50/execution/
varying-struct-basic-gs-fs.shader_test

Bugzilla:https://bugs.freedesktop.org/show_bug.cgi?id=105497
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 901db25d5b)
2018-05-09 08:03:00 -07:00
Jan Vesely
3cdbd79dcc eg/compute: Drop reference to kernel_param bo in destructor
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a9e4be9212)
2018-05-08 10:57:05 -07:00
Jan Vesely
795fffc2df eg/compute: Drop reference on code_bo in destructor.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ea1fff4416)
2018-05-08 10:55:33 -07:00
Lionel Landwerlin
817414b00b intel: devinfo: fix assertion on devices with odd number of EUs
I forgot to change the assert in the second helper function in a
previous change.

This hit the assert() on a Broadwell platform with 1 slice, 3
subslices but all EUs disabled in subslice 1 & 2.

Fixes: c1900f5b0f ("intel: devinfo: add helper functions to fill fusing masks values")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3cdf1bf97d)
2018-05-08 10:10:24 -07:00
Kenneth Graunke
bf24436c6d i965: Don't leak blorp on Gen4-5.
We used to only initialize BLORP on Gen6+.  When we added it on Gen4-5,
we forgot to destroy it unconditionally.

Fixes: 752d7af77a (i965: Add blorp support for gen4-5)
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2dc29e095f)
2018-05-08 10:10:17 -07:00
Bas Nieuwenhuizen
95474b31c4 vulkan/wsi: Only use LINEAR modifier for prime if supported.
This was setting the LINEAR modifier if neither the
X server nor the driver supported modifiers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106180
Fixes: c80c08e226 "vulkan/wsi/x11: Add support for DRI3 v1.2"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Abel Garcia Dorta <mercuriete@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b17cfb08a3)
2018-05-08 10:07:50 -07:00
Jan Vesely
f366968e86 r600: Cleanup constant buffers on context destruction
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a1e8fcce3e)
2018-05-08 10:07:17 -07:00
Ian Romanick
227fbe2799 mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)
Found by inspection, so I made a piglit test too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f2db3be620)
2018-05-07 10:25:55 -07:00
Dylan Baker
ea1d5faa96 bump version to 18.1.0-rc3 2018-05-04 10:57:29 -07:00
Neil Roberts
d90d4e61e2 spirv: Apply OriginUpperLeft to FragCoord
This behaviour was changed in 1e5b09f42f. The commit message
for that says it is just a “tidy up” so my assumption is that the
behaviour change was a mistake. It’s a little hard to decipher looking
at the diff, but the previous code before that patch was:

  if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
     nir_var->data.origin_upper_left = b->origin_upper_left;

  if (builtin == SpvBuiltInFragCoord)
     nir_var->data.pixel_center_integer = b->pixel_center_integer;

After the patch the code was:

  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     /* fallthrough */
  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     break;

Before the patch origin_upper_left affected both builtins and
pixel_center_integer only affected FragCoord. After the patch
origin_upper_left only affects SamplePosition and pixel_center_integer
affects both variables.

This patch tries to restore the previous behaviour by changing the
code to:

  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     /* fallthrough */
  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     break;

This change will be important for ARB_gl_spirv which is meant to
support OriginLowerLeft.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: 1e5b09f42f "spirv: Tidy some repeated if checks..."
(cherry picked from commit e17d0ccbbd)
2018-05-03 10:56:08 -07:00
Deepak Rawat
cd1435aa9d egl/x11: Send invalidate to driver on copy_region path in swap_buffer
Similar to swap_available path send invalidate to the driver because
egl/X11 is not watching for for server's invalidate events. The
dri2_copy_region path is trigerred when server supports DRI2 version
minor 1.

Tested with piglit egl tests for regression.

V2: Move invalidate from dri2_copy_region to swap_buffer common.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 9a21c96126)
2018-05-03 10:55:51 -07:00
Jose Maria Casanova Crespo
0d15a443fa intel/compiler: fix brw_imm_w for negative 16-bit integers
16-bit immediates need to replicate the 16-bit immediate value
in both words of the 32-bit value. This needs to be careful
to avoid sign-extension, which the previous implementation was
not handling properly.

For example, with the previous implementation, storing the value
-3 would generate imm.d = 0xfffffffd due to signed integer sign
extension, which is not correct. Instead, we should cast to
uint16_t, which gives us the correct result: imm.ud = 0xfffdfffd.

We only had a couple of cases hitting this path in the driver
until now, one with value -1, which would work since all bits are
one in this case, and another with value -2 in brw_clip_tri(),
which would hit the aforementioned issue (this case only affects
gen4 although we are not aware of whether this was causing an
actual bug somewhere).

v2: Make explicit uint32_t casting for left shift (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f0e6dacee5)
2018-05-03 10:55:43 -07:00
Jose Maria Casanova Crespo
1e5c3fa29b intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate
From Intel Skylake PRM, vol 07, "Immediate" section (page 768):

"For a word, unsigned word, or half-float immediate data,
software must replicate the same 16-bit immediate value to both
the lower word and the high word of the 32-bit immediate field
in a GEN instruction."

This fixes the int16/uint16 negate and abs immediates that weren't
taking into account the replication in lower and upper words.

v2: Integer cases are different to Float cases. (Jason Ekstrand)
    Included reference to PRM (Jose Maria Casanova)
v3: Make explicit uint32_t casting for left shift (Jason Ekstrand)
    Split half float implementation. (Jason Ekstrand)
    Fix brw_abs_immediate (Jose Maria Casanova)

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 2a76f03c90)
2018-05-03 10:55:34 -07:00
Bas Nieuwenhuizen
57aebd4283 radv: Don't check the incoming apiVersion on CreateInstance.
This fixes

dEQP-VK.api.device_init.create_instance_invalid_api_version

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 467c562a29)
2018-05-03 10:55:26 -07:00
Bas Nieuwenhuizen
e334caa4be radv: Allow vkEnumerateInstanceVersion ProcAddr without instance.
Apparently the somewhere between 1.1.70 and 1.1.73 the loader started
depending on this. The loader then creates a 1.0 instance, which gets
into funny situation because we have a 1.1 device.

No idea how to do line wrapping in Mako though, my random guesses
did not work.

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 9267ff9883)
2018-05-03 10:55:19 -07:00
Nanley Chery
b3f3d605c8 i965/tex_image: Avoid the ASTC LDR workaround on gen9lp
Both the internal documentation and the results of testing this in the
CI suggest that this is unnecessary. Add the fixes tag because this
reduces an internal benchmark's startup time by about 17 seconds
(reported by Eero).

Fixes: 710b1d2e66 "i965/tex_image: Flush certain subnormal ASTC channel values"
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3e56e4642f)
2018-05-02 11:23:54 -07:00
Jason Ekstrand
c760bbff20 anv: Allow lookup of vkEnumerateInstanceVersion without an instance
Fixes: cbab2d1da5
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d216ffc604)
2018-05-02 11:23:49 -07:00
Matthew Nicholls
63953cc0fb radv: fix multisample image copies
Previously before fb077b0728, the LOD parameter was being used in place of the
sample index, which would only copy the first sample to all samples in the
destination image. After that multisample image copies wouldn't copy anything
from my observations.

This fixes some copy_and_blit CTS tests.

v3.1: - set lod to 0 for nir_txf_ms (Samuel)
v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel)
    - updated commit description (Samuel)

Fix this properly by copying each sample in a separate radv_CmdDraw and using a
pipeline with the correct rasterizationSamples for the destination image.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 97d57ef917)
2018-05-02 11:23:32 -07:00
Samuel Pitoiset
4cf3a2b064 radv: compute the number of subpass attachments correctly
Only count color attachments twice if resolves are used, also
account for the depth stencil attachment if present.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit d8db5986ce)
2018-05-01 14:19:55 -07:00
Andres Rodriguez
2fe5a43995 radv/winsys: fix leaking resources from bo's imported by fd
A bo's ref_count was not being initialized when imported from an fd.
Therefore, we would fail to free the resource during VkFreeMemory().

This patch fixes applications like hifi VR in threaded mode, which
perform frequent imports/releases of IPC shared memory.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f56e22e496)
2018-05-01 14:19:49 -07:00
Leo Liu
7a1f220b26 st/omx/enc: fix blit setup for YUV LoadImage
The blit here involves scaling since it's copying from I8 format to R8G8 format.
Half of source will be filtered out with PIPE_TEX_FILTER_NEAREST instruction, it
looks that GPU always uses the second half as source. Currently we use "1" as
the start point of x for R, then causing 1 source pixel of U component shift to
right. So "-1" should be the start point for U component.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1c5f4f4e17)
2018-04-30 09:22:20 -07:00
Juan A. Suarez Romero
171753ff5d autotools, meson: bump up required VA version
Due using a new VP9 config we use, required VA API 0.39

Fixes: 413c5ca372 ("travis: update libva required version")
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4d449c94e4)
2018-04-30 09:22:13 -07:00
Marek Olšák
7d6ed8d0dd radeonsi/gfx9: workaround for INTERP with indirect indexing
and clean up the conditions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6d19120da8)
2018-04-30 09:22:08 -07:00
Marek Olšák
66f64177b2 util/u_queue: fix a deadlock in util_queue_finish
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 7083ac7290)
2018-04-30 09:22:03 -07:00
Dylan Baker
c2768b8a51 bump version for rc2 2018-04-27 11:58:11 -07:00
Jason Ekstrand
b049689a1d anv/allocator: Don't shrink either end of the block pool
Previously, we only tried to ensure that we didn't shrink either end
below what was already handed out.  However, due to the way we handle
relocations with block pools, we can't shrink the back end at all.  It's
probably best to not shrink in either direction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105374
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106147
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3db93f9128)
2018-04-27 10:11:58 -07:00
Samuel Pitoiset
7ef0933099 ac: fix texture query LOD for 1D textures on GFX9
1D textures are allocated as 2D which means we only need
one coordinate for texture query LOD.

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d38425ce87)
2018-04-27 10:11:58 -07:00
Ian Romanick
3f1847459b intel/compiler: Add scheduler deps for instructions that implicitly read g0
Otherwise the scheduler can move the writes after the reads.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95009
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95012
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Cc: Clayton A Craft <clayton.a.craft@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0d5ce25c1c)
2018-04-27 10:11:58 -07:00
Marek Olšák
0ec061bbea Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"
This reverts commit dab02dea34.

It causes crashes of qtcreator and firefox.

Fixes: dab02de "st/dri: Fix dangling pointer to a destroyed dri_drawable"

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4559aefb5c)
2018-04-27 10:11:58 -07:00
Jason Ekstrand
9ec980063c i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*
They are send messages and this makes size_read() and mlen agree.  For
both of these opcodes, the payload is just a dummy so mlen == 1 and this
should decrease register pressure a bit.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit de1f22d595)
2018-04-27 10:11:58 -07:00
Johan Klokkhammer Helsing
a85b00e278 st/dri: Fix dangling pointer to a destroyed dri_drawable
If an EGLSurface is created, made current and destroyed, and then a second
EGLSurface is created. Then the second malloc in driCreateNewDrawable may
return the same pointer address the first surface's drawable had.
Consequently, when dri_make_current later tries to determine if it should
update the texture_stamp it compares the surface's drawable pointer against
the drawable in the last call to dri_make_current and assumes it's the same
surface (which it isn't).

When texture_stamp is left unset, then dri_st_framebuffer_validate thinks
it has already called update_drawable_info for that drawable, leaving it
unvalidated and this is when bad things starts to happen. In my case it
manifested itself by the width and height of the surface being unset.

This is fixed this by setting the pointer to NULL before freeing the
surface.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106126
Signed-off-by: Johan Klokkhammer Helsing <johan.helsing@qt.io>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dab02dea34)
2018-04-27 10:11:58 -07:00
Samuel Pitoiset
b74d10534e radv: set ac_surf_info::num_channels correctly
num_channels has been introduced since "ac/surface: don't set
the display flag for obviously unsupported cases".

Based on RadeonSI.

Fixes: e29facff31 ("ac/surface: don't set the display flag for obviously unsupported cases (v2)")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d7ffe3b384)
2018-04-27 10:11:58 -07:00
Samuel Pitoiset
6278c88395 radv: fix DCC enablement since partial MSAA implementation
dcc_msaa_allowed is always false on GFX9+ and only true on VI
if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled
in some situations where it should not.

This is likely going to fix a performance regression.

Fixes: 2f63b3dd09 ("radv: enable DCC for MSAA 2x textures on VI under an option")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a6fbefa67b)
2018-04-27 10:11:58 -07:00
Eric Anholt
f2b72b6fdc gallium/util: Fix incorrect refcounting of separate stencil.
The driver may have a reference on the separate stencil buffer for some
reason (like an unflushed job using it), so we can't directly free the
resource and should instead just decrement the refcount that we own.
Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
on vc5.

Fixes: e94eb5e600 ("gallium/util: add u_transfer_helper")
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 069c409f43)
2018-04-27 10:11:58 -07:00
Juan A. Suarez Romero
d76507298f travis: update libva required version
Commit fa328456e8 added VP9 config support, but this needs a newer
libva version, 1.7.0 or above.

Fixes: fa328456e8 ("st/va: add VP9 config to enable profile2")
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 413c5ca372)
2018-04-27 10:11:58 -07:00
Dylan Baker
297caed81d meson: fix graw-xlib after auxiliary consolidation
This one's completely my fault, I didn't do good enough testing after
rebasing and this got missed.

Fixes: d28c246501
       ("meson: build graw tests")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1546f76a39)
2018-04-27 10:11:58 -07:00
Dylan Baker
dde47f6d19 meson: only build mesa_st tests when build-tests is true
Since we have an option to turn test building on and off, we should
honor that.

Fixes: 34cb4d0ebc
       ("meson: build tests for gallium mesa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit c73abb4f82)
2018-04-27 10:11:58 -07:00
Dylan Baker
7a6e084b41 meson: don't build classic mesa tests without dri_drivers
Since mesa_classic is build-on-demand the tests will create a demand and
add a bunch of extra compilation.

Fixes: 43a6e84927
       ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit aaab624245)
2018-04-27 10:11:58 -07:00
Samuel Pitoiset
762c07b5a0 ac: fix the number of coordinates for ac_image_get_lod and arrays
This fixes crashes for the following CTS:
dEQP-VK.glsl.texture_functions.query.texturequerylod.*

Cubemaps are the same as 2D arrays.

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d136a5fad9)
2018-04-27 10:11:58 -07:00
Samuel Pitoiset
8ac80de5f2 ac/nir: add missing round_slice for 1D arrays
This fixes a bunch of CTS fails with 1D arrays:

dEQP-VK.glsl.texture_functions.texture*.sampler1darray_*

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 84fef802fb)
2018-04-27 10:11:58 -07:00
Dylan Baker
08e8556f54 bin/install_megadrivers: fix DESTDIR and -D*-path
This fixes -Ddri-drivers-path, -Dvdpau-libs-path, etc. with DESTDIR when
those paths are absolute. Currently due to the way python's os.path.join
handles absolute paths these will ignore DESTDIR, which is bad. This
fixes them to be relative to DESTDIR if that is set.

Fixes: 3218056e0e
       ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
(cherry picked from commit ae3f45c11e)
2018-04-27 10:11:58 -07:00
Dylan Baker
d6a312e288 compiler/glsl: close fd's in glcpp_test.py
I would have thought falling out of scope would allow the gc to collect
these, but apparently it doesn't, and this hits an fd limit on macos.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106133
Fixes: db8cd8e367
       ("glcpp/tests: Convert shell scripts to a python script")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit dbf5b772b3)
2018-04-27 10:11:58 -07:00
Bas Nieuwenhuizen
d844b2a656 nir: Do not use progress for unreachable code in return lowering.
We seem to use progress for two cases:
1) When we lowered some returns.
2) When we remove unreachable code.

If just case 2 happens we assert as state->return_flag has not
been allocated yet, but we are still trying to do insert all
predicates based on it.

This splits the concerns. We only use progress internally for case 1
and then keep track of 2 in a separate variable to indicate progress
in the return value of the pass.

This is slightly better than transforming the assert into
if (!state->return_flag) return, as the solution in this patch avoids
inserting predicates even if some other part of the might need them.

Fixes: 6e22ad6edc "nir: return early when lowering a return at the end of a function"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106174
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 0e945fdf23)
2018-04-27 10:11:58 -07:00
Dylan Baker
a829747276 bin: force git show to use default pretty setting
I have pretty default to short, which breaks this script.

cc: Emil Velikov <emil.velikov@collabora.com>
cc: Andres Gomez <agomez@igalia.com>
cc: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-27 10:11:58 -07:00
Dylan Baker
ff1ecebb1d Bump release version 2018-04-20 22:42:20 -07:00
Dylan Baker
6754c2e83d autotools: Include new meson files
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:56 -07:00
Dylan Baker
5c8e2501a6 autotools: Add passes.h to sources so it will be included in the tarball
This was introduced in commit 8f848ada8a
but not added to the sources list, which is necessary for it to be
included in release tarballs.

Fixes: 8f848ada8a
       ("swr/rast: Start refactoring of builder/packetizer.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:54 -07:00
Dylan Baker
cfd7d2ba0d autotools: include include/vulkan headers
This is needed to provide vk_android_native_buffer.h for vk_enum_to_str.

v2: - remove accidentally included changes

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:49 -07:00
Rhys Perry
a0e57432b7 nvc0: fix line width on GM20x+
This has the side-effect of fixing polygon-offset piglit test failures.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-20 20:43:59 -04:00
Nanley Chery
7b20329107 i965/miptree: Delete an unused function
We're going to combine ::mcs_buf and ::hiz_buf in later commits. Once
that happens, this function no longer make sense.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-20 17:14:37 -07:00
Nanley Chery
010abacc95 i965/miptree: Don't leak the clear_color_bo
Free the clear_color_bo in addition to freeing the
intel_miptree_aux_buffer which holds the reference to it.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-20 17:14:37 -07:00
Jason Ekstrand
9d2ef3c9ec i965/blorp: Do the gen11 BTI flush
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 16:30:14 -07:00
Jason Ekstrand
185630c6bc anv/blorp: Do the gen11 BTI flush
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 16:30:14 -07:00
Lucas Stach
52e93e309f etnaviv: fix texture_format_needs_swiz
memcmp returns 0 when both swizzles are the same, which means we don't
need any hardware swizzling. texture_format_needs_swiz should return
true when the return value of the memcmp is non-zero.

Fixes: 751ae6afbe ("etnaviv: add support for swizzled texture formats")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2018-04-20 18:54:10 +02:00
Samuel Pitoiset
8f13975713 ac/nir: fix image dimension for subpass attachments
For subpass attachments we need one more coordinate with
the layer, so make them array types.

This fixes a bunch of CTS fails with RADV.

Fixes: 24fb3e6aa1 ("ac/nir: use ac_build_image_opcode for image intrinsics")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 18:44:51 +02:00
Bas Nieuwenhuizen
e1df849c3c radv: Mark GTT memory as device local for APUs.
Otherwise a lot of games complain about not having enough memory,
and it is sort of local so this seems reasonable to me.

CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-20 18:16:16 +02:00
Samuel Pitoiset
fedd0a4215 radv/winsys: allow to submit up to 4 IBs for chips without chaining
The SI family doesn't support chaining which means the maximum
size in dwords per CS is limited. When that limit was reached
we failed to submit the CS and the application crashed.

This patch allows to submit up to 4 IBs which is currently the
limit, but recent amdgpu supports more than that.

Please note that we can reach the limit of 4 IBs per submit
but currently we can't improve that. The only solution is to
upgrade libdrm. That will be improved later but for now this
should fix crashes on SI or when using RADV_DEBUG=noibs.

Fixes: 36cb5508e8 ("radv/winsys: Fail early on overgrown cs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 18:12:26 +02:00
Stefan Schake
ff904978a1 gallium/util: Android backtrace support
We can't use any of the existing implementations in u_debug_stack.
Android technically has libunwind, but it's been modified to the point
where it no longer compiles with the Mesa usage. The library is also
not meant to be referenced by vendor libraries. The officially sanctioned
way of obtaining backtraces is through the Android own libbacktrace, a
C++ library. Access it through a separate C++ source file on Android only.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-20 18:49:49 +03:00
Stefan Schake
2abd4f4b49 gallium/util: Don't stub u_debug_stack on Android
The fallback path for no libunwind ends up being stubs for Android.
Don't compile them in so we can provide our own implementation.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-20 18:49:37 +03:00
Samuel Pitoiset
dd069e9b41 ac/nir: handle nir_intrinsic_load_first_vertex like base_vertex
This fixes a ton of CTS crashes.

Fixes: c366f422f0 ("nir: Offset vertex_id by first_vertex instead of base_vertex")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 17:07:38 +02:00
Samuel Pitoiset
b21a4efb55 radv/winsys: allow local BOs on APUs
Ported from RadeonSI.

Local BOs ignore BO priorities, and we don't need those on APUs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:24 +02:00
Samuel Pitoiset
5c1233ed62 radv: use a global BO list only for VK_EXT_descriptor_indexing
Maintaining two different paths is annoying but this gets
rid of the performance regression introduced by the global
BO list.

We might find a better solution in the future, but for now
just keeps two paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:18 +02:00
Samuel Pitoiset
7bd5367546 Revert "radv: Don't store buffer references in the descriptor set."
In order to reduce a performance regression introduced by
4b13fe55a4 ("radv: Keep a global BO list for VkMemory."),
we are going to maintain two different paths.

One when VK_EXT_descriptor_indexing is enabled by the
application because we need to have a global BO list, and
one (the old one) when it's not enabled.

With Talos on Polaris, the global BO list reduces performance
by 10% which is too much for me.

This reverts commit ab6cadd3ec.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:13 +02:00
Jose Maria Casanova Crespo
eb96bd57c7 i965/fs: retype offset_reg to UD at load_ssbo
All operations with offset_reg at do_vector_read are done
with UD type. So copy propagation was not working through
the generated MOVs:

mov(8) vgrf9:UD, vgrf7:D

This change allows removing the MOV generated for reading the
first components for 16-bit and 64-bit ssbo reads with
non-constant offsets.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-20 13:30:12 +02:00
Nicolai Hähnle
24fb3e6aa1 ac/nir: use ac_build_image_opcode for image intrinsics
So that we'll use the dimension-aware intrinsics in the future.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:30:07 +02:00
Nicolai Hähnle
74063431f1 radeonsi: generate image load/store/atomic ops using ac_build_image_opcode
In preparation of dimension-aware LLVM image intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:29:57 +02:00
Nicolai Hähnle
625dcbbc45 amd/common: pass address components individually to ac_build_image_intrinsic
This is in preparation for the new image intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:52 +02:00
Nicolai Hähnle
f931583828 amd/common: pass new enum ac_image_dim to ac_build_image_opcode
This is in preparation for the new, dimension-aware LLVM image
intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:40 +02:00
Nicolai Hähnle
9cb52d470a radeonsi/nir: fix crash in test involving the sample mask
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:50 +02:00
Nicolai Hähnle
552bc37c6f radeonsi/nir: set FS properties only when scanning a fragment shader
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:47 +02:00
Nicolai Hähnle
a807a9b215 ac/nir: fix atomic compare-and-swap
The LLVM instruction returns { i32, i1 }, where the i1 indicates success.
We're only interested in the first part, which is the loaded value.

Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.*

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:40 +02:00
Nicolai Hähnle
e788b987d8 radeonsi: fix error paths of si_texture_transfer_map
trans is zero-initialized, but trans->resource is setup immediately so
needs to be dereferenced.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:33 +02:00
Nicolai Hähnle
68ee1d5796 glsl: prevent spurious Valgrind errors when serializing NIR
It looks as if the structure fields array is fully initialized below,
but in fact at least gcc in debug builds will not actually overwrite
the unused bits of bit fields.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:23 +02:00
Aaron Watry
354b12681b clover: Fix host access validation for sub-buffer creation
From CL 1.2 Section 5.2.1:
    CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and
    flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with
    CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if
    buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify
    CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY .

Fixes CL 1.2 CTS test/api get_buffer_info

v2: Correct host_access_flags check (Francisco)

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-04-19 20:57:37 -05:00
Neil Roberts
c366f422f0 nir: Offset vertex_id by first_vertex instead of base_vertex
base_vertex will be zero for non-indexed calls and in that case we
need vertex_id to be offset by the ‘first’ parameter instead. That is
what we get with first_vertex. This is true for both GL and Vulkan.

The freedreno driver is also setting vertex_id_zero_based on
nir_options. In order to avoid breakage this patch switches the
relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can
retain the same behavior.

v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from
SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-19 15:57:45 -07:00
Neil Roberts
c4f30a9100 spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX
The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.

v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used
for SpvBuiltInBaseVertex.  Suggested by Jason.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-19 15:57:45 -07:00
Antia Puentes
c32e1035cb intel: Handle firstvertex in an identical way to BaseVertex
Until we set gl_BaseVertex to zero for non-indexed draw calls
both have an identical value.

The Vertex Elements are kept like that:
* VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID>
* VE 2: <Draw ID, 0, 0, 0>

v2 (idr): Mark nir_intrinsic_load_first_vertex as "unreachable" in
emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic.
2018-04-19 15:57:45 -07:00
Neil Roberts
0c8395e15d intel/compiler: Add a uses_firstvertex flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-19 15:57:45 -07:00
Antia Puentes
5ff848df7b compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics
This VS system value will contain the value passed as <basevertex> for
indexed draw calls or the value passed as <first> for non-indexed draw
calls. It can be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.

From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays":

-  Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance
is referred to as its vertex ID, and may be read by a vertex shader as
gl_VertexID.  The vertex ID of the ith element transferred is first +
i."

- Page 355:
"The index of any element transferred to the GL by
DrawElementsOneInstance is referred to as its vertex ID, and may be read
by a vertex shader as gl_VertexID.  The vertex ID of the ith element
transferred is the sum of basevertex and the value stored in the
currently bound element array buffer at offset indices + i."

Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but
this will have to change when the value of gl_BaseVertex is
fixed. Currently its value is broken for non-indexed draw calls because
it must be zero but we are setting it to <first>.

v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).

v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be
generated.  Reformat commit message to 72 columns.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-19 15:57:45 -07:00
Mike Lothian
051fddb4a9 meson: Build st_tests_common with gtest
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106131
Fixes: 34cb4d0ebc ("meson: build tests for gallium mesa state tracker")
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-19 09:04:51 -07:00
Bas Nieuwenhuizen
dffdef6737 radv: Add Vega M support.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-19 16:36:21 +02:00
Bas Nieuwenhuizen
d1ce31d36c radv: Add bound checking workaround for dynamic buffers.
I have seen a few applications and games do the dynamic buffer bounds incorrectly, this
make it easier to work around, e.g. for debugging.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-19 16:13:25 +02:00
Thomas Hellstrom
e0c08183fb svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace
When advertizing this extension, egl_dri2 uses the DRI2_RENDERER_QUERY
extension to query whether an sRGB format is supported. That extension will
query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than
PIPE_BIND_DISPLAY_TARGET which is used when building the configs.
We only return the correct value for PIPE_BIND_DISPLAY_TARGET.

The inconsistency causes EGL to crash at surface initialization if sRGB is
not supported. Fix this by supporting both bind flags.

Testing done:
piglit egl_gl_colorspace srgb

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-19 13:42:51 +02:00
Mike Lothian
79487c427e swr: Fix include for createPromoteMemoryToRegisterPass
Include llvm/Transforms/Utils.h with the newest LLVM 7

v2: Include with " " rather than < > (Vinson Lee)

v3: Use LLVM_VERSION_MAJOR rather than HAVE_LLVM (George Kyriazis)

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-04-19 00:39:04 -07:00
Samuel Pitoiset
2f63b3dd09 radv: enable DCC for MSAA 2x textures on VI under an option
This can be enabled with RADV_PERFTEST=dccmsaa.

DCC for MSAA textures is actually not as easy to implement. It
looks like there is some corner cases. I will improve support
incrementally.

Vega support, as well as Polaris improvements, will be added later.

No CTS changes on Polaris using RADV_DEBUG=zerovram and
RADV_PERFTEST=dccmsaa.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:55 +02:00
Samuel Pitoiset
dc3d39771f radv: decompress DCC for multisampled source images before resolving
Multisampled source images (ie. color attachments) can be now
DCC compressed, so the driver needs to perform a DCC decompression
pass before resolving

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:52 +02:00
Samuel Pitoiset
1aefb62f1e radv: add a workaround for fast clears with DCC and MSAA textures
This should be fixed at some point in order to improve
performance.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:50 +02:00
Samuel Pitoiset
373fa0b599 radv: allocate CMASK for DCC fast clear with MSAA
CMASK is required because it should be cleared to
0xCCCCCCCC for MSAA textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:48 +02:00
Samuel Pitoiset
255506c4e0 radv: implement fast color clear for DCC with MSAA
When DCC is enabled with MSAA textures, CMASK should be
cleared to 0xCCCCCCCC.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:45 +02:00
Samuel Pitoiset
796b6f4aab radv: make sure to sync after resolving using the compute path
This fixes some random CTS failures:

dEQP-VK.renderpass.multisample.*.

Performing a fast-clear eliminate is still useless, but it
seems that we need to sync.

Found while running CTS with RADV_DEBUG=zerovram.

Fixes: 56a171a499 ("radv: don't fast-clear eliminate after resolving a subpass with compute")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:09:55 +02:00
Samuel Pitoiset
4a698660ae radv: dump the SHA1 of SPIRV in the hang report
Might be useful for debugging purposes, especially when we
want to replace a shader on the fly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:09:52 +02:00
Bas Nieuwenhuizen
0e10790558 radv: Enable VK_EXT_descriptor_indexing.
This adds everything except non-uniform indexing, which needs a bit
more work and testing.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
5f7ebb5206 spirv: Add support for runtime descriptor array cap.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
c48feaf2d1 spirv: Add support for VK_EXT_descriptor_indexing uniform indexing caps.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
b5e04e9217 radv: Support allocating variable size descriptor sets.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
78c54acbe8 radv: Add support for variable descriptor set layouts.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
082c11e8a5 radv: Fix GetDescriptorSetLayoutSupport.
The continue means we do alignment differently than during creation,
making the buffer smaller than expected.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
d02bbde1a8 radv: Use sorted bindings for set layout creation.
Previously we did not care about havin the set storage in order,
but for variable descriptor count we want the highest binding
at the end of the storage.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
ab6cadd3ec radv: Don't store buffer references in the descriptor set.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
4b13fe55a4 radv: Keep a global BO list for VkMemory.
With update after bind we can't attach bo's to the command buffer
from the descriptor set anymore, so we have to have a global BO
list.

I am somewhat surprised this works really well even though we have
implicit synchronization in the WSI based on the bo list associations
and with the new behavior every command buffer is associated with
every swapchain image. But I could not find slowdowns in games because
of it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
22d6b89e39 spirv: Update spirv.h to 12f8de9f04327336b699b1b80aa390ae7f9ddbf4
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Kenneth Graunke
da25ae92be i965: Fix shadow batches to be the same size as the real BO.
brw_bo_alloc may round up our allocation size to the next bucket size.
In this case, we would malloc a shadow buffer that was the original
intended size, but use bo->size (the larger size) for all of our checks.

This could cause us to run off the end of the shadow buffer.

v2: Actually use the new BO size (caught by Lionel)

Reported-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c7dcee58b5 (i965: Avoid problems from referencing orphaned BOs after growing.)
2018-04-18 13:55:08 -07:00
Marek Olšák
7bd24d951a glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract
This fixes some piglits.

Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 15:34:52 -04:00
Leo Liu
90de03708f radeon/vce: disable vce dual pipe on VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:35 -04:00
Marek Olšák
c6f1d36019 radeonsi: add support for VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:33 -04:00
Marek Olšák
d6a66bc8db amd/addrlib: add support for VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:32 -04:00
Marek Olšák
d15fb766aa radeonsi/gfx9: fix a hang with an empty first IB
This packet causes the no-op IB detection to fail, so the IB is always
submitted. Also fix the no-op IB detection by moving the begin call.

Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:42:06 -04:00
Dylan Baker
d28c246501 meson: build graw tests
This only enables the null and xlib target, so no windows support yet.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
34cb4d0ebc meson: build tests for gallium mesa state tracker
v2: - Fix typo

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
de01018293 meson: build gallium unit tests
v2: - gate unit tests on swrast being enabled (Eric A)
v3: - rebase on libtrace being merged with gallium auxiliary

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
2018-04-18 09:03:57 -07:00
Dylan Baker
4c794c7834 meson: Build gallium trivial tests
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
7fee8fed16 meson: Remove TODO about mesa/main tests
They're already done.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
5d16c86add meson: enable glcpp test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
db8cd8e367 glcpp/tests: Convert shell scripts to a python script
This ports glcpp-test.sh and glcpp-test-cr-lf.sh to a python script that
accepts arguments for each line ending type. This should allow for
better reporting to users.

v2: - Use $PYTHON2 to be consistent with other tests in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
8cb96c4031 glsl/tests: Remove unused compare_ir.py script
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
Dylan Baker
877d250ea1 meson: enable optimization-test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
Dylan Baker
97c28cb082 glsl/tests: Convert optimization-test.sh to pure python
This patch converts optimization-test.sh to python, in this process it
removes external shell dependencies including diff. It replaces the
python script that generates shell scripts with a python library that
generates test cases and runs them using subprocess.

v2: - use $PYTHON2 to be consistent with other tests in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
Dylan Baker
ad9c2f2018 meson: run glsl compiler warnings test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
3b52d29227 glsl/tests: reimplement warnings-test in python
This reimplements the test in python with a shell script wrapper that
allows autotools to continue to run the test without realizing that
anything has changed.

Using python has two advantages, first it's portable so this test can be
run on windows as well as Linux since it just requires python, no more
diff, pwd or sh. It's also no longer tied to autotools implementation
details, like the environment variables $srcdir and $abs_builddir,
though the autotools shell wrapper still uses those, which makes it
possible to run the test in meson.

v2: - Use $PYTHON2 in script to be consistent with other scripts in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
George Kyriazis
12a002a3a1 swr/rast: Fix VGATHERPD lowering
Also Implement VHSUBPS in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
99fe90722d swr/rast: Replace x86 VMOVMSK with llvm-only implementation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
0899122c03 swr/rast: Optimize late/bindless JIT of samplers
Add per-worker thread private data to all shader calls
Add per-worker sampler cache and jit context
Add late LoadTexel JIT support
Add per-worker-thread Sampler / LoadTexel JIT

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
ec7154abc0 swr/rast: Implement VROUND intrinsic in x86 lowering pass
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
bb02da3c1b swr/rast: Refactor to improve code sharing.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
94ca1c018f swr/rast: minimize codegen redundant work
Move filtering of redundant codegen operations into gen scripts themselves

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
7f34860125 swr/rast: double-pump in x86 lowering pass
Add support for double-pumping a smaller SIMD width intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
96ad8f5a23 swr/rast: Fix 64bit float loads in x86 lowering pass
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
1ffbbbee97 swr/rast: Add shader stats infrastructure (WIP)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a81c625cb7 swr/rast: Type-check TemplateArgUnroller
Allows direct use of enum values in conversion to template args.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
2966ee1028 swr/rast: Add vgather to x86 lowering pass.
Add support for generic VGATHERPD intrinsic in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
e4929b5d26 swr/rast: fix comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
670a99c233 swr/rast: add cvt instructions in x86 lowering pass
Support generic VCVTPD2PS and VCVTPH2PS in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
aa482014e5 swr/rast: Fix alloca usage in jitter
Fix issue where temporary allocas were getting hoisted to function entry
unnecessarily. We now explicitly mark temporary allocas and skip hoisting
during the hoist pass. Shuold reduce stack usage.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
81371a5909 swr/rast: Change gfx pointers to gfxptr_t
Changing type to gfxptr for indices and related changes to fetch and mem
builder code.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
71239478d3 swr/rast: Fix byte offset for non-indexed draws
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c57b594317 swr/rast: Add support for setting optimization level
for JIT compilation

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
4f0df5e2f7 swr/rast: Adding translate call to builder_gfx_mem.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f135f54b18 swr/rast: Fix codegen for typedef types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c5d7b37fe7 swr: add x86 lowering pass to fragment shader
Needed because some FP paths (namely stipple) use gather intrinsics
that now need to be lowered to x86.

v2: fix typo in commit message
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
9161c40d14 swr/rast: Enable generalized fetch jit
Enable generalized fetch jit with 8 or 16 wide SIMD target. Still some
work needed to remove some simd8 double pumping for 16-wide target.

Also removed unused non-gather load vertices path.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d73082b98b swr/rast: Add builder_gfx_mem.{h|cpp}
Abstract usage scenarios for memory accesses into builder_gfx_mem.
Builder_gfx_mem will convert gfxptr_t from 64-bit int to regular pointer
types for use by builder_mem.

v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
1eb72673fc swr/rast: Lower VGATHERPS and VGATHERPS_16 to x86.
Some more work to do before we can support simultaneous 8-wide and
16-wide and remove the VGATHERPS_16 version.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
b15fb78df5 swr/rast: Cleanup of JitManager convenience types
Small cleanup. Remove convenience types from JitManager and standardize
on the Builder's convenience types.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d68694016c swr/rast: Lower PERMD and PERMPS to x86.
Add support for providing an emulation callback function for arch/width
combinations that don't map cleanly to an x86 intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
8f848ada8a swr/rast: Start refactoring of builder/packetizer.
Move x86 intrinsic lowering to a separate pass. Builder now instantiates
generic intrinsics for features not supported by llvm. The separate x86
lowering pass is responsible for lowering to valid x86 for the target
SIMD architecture. Currently it's a port of existing code to get it
up and running quickly. Will eventually support optimized x86 for AVX,
AVX2 and AVX512.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
ffc0aeb4ec swr/rast: Simplify #define usage in gen source file
Removed preprocessor defines from structures passed to LLVM jitted code.

The python scripts do not understand the preprocessor defines and ignores
them. So for fields that are compiled out due to a preprocessor define
the LLVM script accounts for them anyway because it doesn't know what
the defines are set to. The sanitize defines for open source are fine
in that they're safely used.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f36026ce2e swr/rast: Move CallPrint() to a separate file
Needed work for jit code debug.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
67c8bb4db7 swr/rast: Fix name mangling for LLVM pow intrinsic
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
7a5054aa1c swr/rast: Add some archrast counters
Hook up archrast counters for shader stats: instructions executed.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f52a501716 swr/rast: Code cleanup
Removing some code that doesn't seem to do anything meaningful.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
093c1aee88 swr/rast: Add "Num Instructions Executed" stats intrinsic.
Added a SWR_SHADER_STATS structure which is passed to each shader. The
stats pass will instrument the shader to populate this.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
5fbee5e4ef swr/rast: Add MEM_ADD helper function to Builder.
mem[offset] += value

This function will be heavily used by all stats intrinsics.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
9103119cb3 swr/rast: Permute work for simd16
Fix slow permutes in PA tri lists under SIMD16 emulation on AVX

Added missing permute (interlane, immediate) to SIMDLIB

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
4c69823d15 swr/rast: WIP builder rewrite (2)
Finish up the remaining explicit intrinsic uses. At this point all
explicit Intrinsic::getDeclaration() usage has been replaced with auto
generated macros generated with gen_llvm_ir_macros.py. Going forward,
make sure to only use the intrinsics here, adding new ones as needed.

Next step is to remove all references to x86 intrinsics to keep the
builder target-independent. Any x86 lowering will be handled by a
separate pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c2163dc56a swr/rast: Add autogen of helper llvm intrinsics.
Replace sqrt, maskload, fp min/max, cttz, ctlz with llvm equivalent.
Replace AVX maskedstore intrinsic with LLVM intrinsic. Add helper llvm
macros for stacksave, stackrestore, popcnt.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
6427315e43 swr/rast: WIP builder rewrite.
Start removing avx2 macros for functionality that exists in llvm.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a16f8e0554 swr/rast: LLVM 6 fix
for getting masked gather intrinsic (also compatible with LLVM 4)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a92cc09c7a swr/rast: Changes to allow jitter to compile with LLVM5
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
0f6fef9632 swr/rast: Add some archrast stats
Add stats for degenerate and backfacing primitive counts

Wire archrast stats for alpha blend and alpha test.
pass value to jitter, upon return have archrast event increment a value

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
b488028854 swr/rast: Silence some unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
e84bfec4ab swr/rast: Add debug type info for i128
Help support debug info in 16 wide shaders.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a3edcfe1fb swr/rast: Use blend context struct to pass params
Stuff parameters into a blend context struct before passing down through
the PFN_BLEND_JIT_FUNC function pointer. Needed for stat changes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
be6cf0fd7c swr/rast: Introduce JIT_MEM_CLIENT
Add assert for correct usage of memory accesses

v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d34edffe48 swr/rast: Add some instructions to jitter
VPHADDD, PMAXUD, PMINUD

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
Juan A. Suarez Romero
4aa03581b5 docs: update calendar, add news and link release notes to 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-18 15:29:12 +00:00
Juan A. Suarez Romero
ad51d8871e docs: add sha256 checksums for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit a1c421c638)
2018-04-18 15:25:32 +00:00
Juan A. Suarez Romero
76cadaa1de docs: add release notes for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 8bd719e3fa)
2018-04-18 15:25:30 +00:00
Juan A. Suarez Romero
193d615917 docs: update calendar, add news and link release notes to 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-18 09:45:11 +00:00
Juan A. Suarez Romero
6372227209 docs: add sha256 checksums for 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cf0864dc63)
2018-04-18 09:40:44 +00:00
Juan A. Suarez Romero
6a1261bd09 docs: add release notes for 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 6d88ea9dd4)
2018-04-18 09:40:42 +00:00
Dylan Baker
b9ad5282ba Revert "meson: add wrap for libdrm"
This reverts commit 6217eedc9b.

I was using this for testing and accidentally put it on master

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-17 13:48:55 -07:00
Dylan Baker
efcbcfa7c8 Revert "Add subprojects directory and git ignore"
This reverts commit 21e2e73f71.

I was using this for testing and accidentally put it on master

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-17 13:48:43 -07:00
Jan Alexander Steffens (heftig)
5cf752b18b meson: Version libMesaOpenCL like autotools does
This is for parity with autotools. It names the library
libMesaOpenCL.so.1.0.0 and points mesa.icd to the .1 symlink.

opencl_version now matches configure.ac's OPENCL_VERSION.

Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-17 13:46:15 -07:00
Jan Alexander Steffens (heftig)
5bb98cfd92 meson: Add library versions to swr drivers
This is for parity with autotools.

Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-17 13:46:15 -07:00
Dylan Baker
6217eedc9b meson: add wrap for libdrm
Currently this requires libdrm from git, since the version reported by
meson is wrong.
2018-04-17 13:46:15 -07:00
Dylan Baker
21e2e73f71 Add subprojects directory and git ignore
For meson wraps.
2018-04-17 13:46:15 -07:00
Samuel Pitoiset
893e19efb7 radv: fix scissor computation when using half-pixel viewport offset
'scale[i]' can be non-integer.

Original patch by Philip Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074
Fixes: 0f3de89a56 ("radv: Use the guard band.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-17 22:12:14 +02:00
Neil Roberts
608d70bc02 spirv: Accept doubles in FaceForward, Reflect and Refract
The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32- or 16- bit
float regardless of the other operands. However in practice it seems
there is a bug in glslang that makes it convert the argument to 64-bit
if you actually try to pass it a 32-bit value while the other
arguments are 64-bit. This adds an optional conversion of the final
argument in order to support any type.

These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.

The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279

v2: Convert the eta operand of Refract from any size in order to make
    it eventually cope with 16-bit floats.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:11 +02:00
Neil Roberts
6e499572b9 spirv: Add a 64-bit implementation of OpIsInf
The only change neccessary is to change the type of the constant used
to compare against.

This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.

v2: Use nir_imm_floatN_t for the constant.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:06 +02:00
Neil Roberts
696f4abcbc spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
There is an existing macro that is used to choose between either a
float or a double immediate constant based on the bit size of the
first operand to the builtin. This is now changed to use the new
nir_imm_floatN_t helper function to reduce the number of places that
make this decision.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:03 +02:00
Neil Roberts
e7b2c125c3 nir/builder: Add a nir_imm_floatN_t helper
This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:57:36 +02:00
Timothy Arceri
6e22ad6edc nir: return early when lowering a return at the end of a function
Otherwise we create unused conditional return flags and things
get unnecessarily ugly fast when lowering nested functions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 14:17:56 +10:00
Timothy Arceri
d3cafc18fc mesa: merge the driver functions DrawBuffers and DrawBuffer
The extra params we unused by the drivers that used DrawBuffers.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-17 14:17:48 +10:00
Marc Dietrich
268d8f244b glsl: fix gcc 8 parenthesis warning
fixes warnings like this:
[184/1137] Compiling C++ object 'src/compiler/glsl/glsl@sta/lower_jumps.cpp.o'.
In file included from ../src/mesa/main/mtypes.h:48,
                 from ../src/compiler/glsl_types.h:149,
                 from ../src/compiler/glsl/lower_jumps.cpp:59:
../src/compiler/glsl/lower_jumps.cpp: In member function '{anonymous}::block_record {anonymous}::ir_lower_jumps_visitor::visit_block(exec_list*)':
../src/compiler/glsl/list.h:650:17: warning: unnecessary parentheses in declaration of 'node' [-Wparentheses]
    for (__type *(__inst) = (__type *)(__list)->head_sentinel.next; \
                 ^
../src/compiler/glsl/lower_jumps.cpp:510:7: note: in expansion of macro 'foreach_in_list'
       foreach_in_list(ir_instruction, node, list) {
       ^~~~~~~~~~~~~~~

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-17 11:53:59 +10:00
Rob Clark
2a55344e7d compiler: int8/uint8 fixes
A couple spots were missed for handling of the new INT8/UINT8 base type.

Also de-duplicate get_base_type().. get_scalar_type() had nearly the
same switch statement, with the exception that anything with base_type
that was not scalar would return error_type.  So just handle that one
special case in get_scalar_type().

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-16 20:41:18 -04:00
Marek Olšák
60299e9abe radeonsi: don't emit partial flushes for internal CS flushes only
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Marek Olšák
692f550740 winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE
There is a kernel patch that adds the new flag.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Marek Olšák
1b3199d14d radeonsi: implement mechanism for IBs without partial flushes at the end (v6)
(This patch doesn't enable the behavior. It will be enabled in a later
commit.)

Draw calls from multiple IBs can be executed in parallel.

v2: do emit partial flushes on SI
v3: invalidate all shader caches at the beginning of IBs
v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed,
    only do this for flushes invoked internally
v5: empty IBs should wait for idle if the flush requires it
v6: split the commit

If we artificially limit the number of draw calls per IB to 5, we'll get
a lot more IBs, leading to a lot more partial flushes. Let's see how
the removal of partial flushes changes GPU utilization in that scenario:

With partial flushes (time busy):
    CP: 99%
    SPI: 86%
    CB: 73:

Without partial flushes (time busy):
    CP: 99%
    SPI: 93%
    CB: 81%

Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Erico Nunes
d19b488339 nir: fix ir_binop_gequal glsl_to_nir conversion
ir_binop_gequal needs to be converted to nir_op_sge when native integers
are not supported in the driver.
Otherwise it becomes no different than ir_binop_less after the
conversion.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-16 07:59:25 -07:00
Jason Ekstrand
72ab499c9f anv,radv: Drop XML workarounds for VK_ANDROID_native_buffer
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-16 07:59:25 -07:00
Jason Ekstrand
35ef0f767e vulkan: Update the XML and headers to 1.1.73
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-16 07:59:25 -07:00
Samuel Pitoiset
62510846b6 radv: clean up radv_decompress_resolve_subpass_src()
To handle the source color image transitions in the same place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:21:05 +02:00
Samuel Pitoiset
56a171a499 radv: don't fast-clear eliminate after resolving a subpass with compute
That looks useless, and I think radv_handle_image_transition()
will do a fast-clear eliminate because it's called after the
resolve.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:21:02 +02:00
Samuel Pitoiset
7e84d69861 radv: handle CMASK/FMASK transitions only if DCC is disabled
DCC implies a fast-clear eliminate, so I think this sounds
reasonable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:59 +02:00
Samuel Pitoiset
584d1f2711 radv: merge radv_handle_{dcc,cmask}_image_transition() functions
Into radv_handle_color_image_transition().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:56 +02:00
Samuel Pitoiset
d5812b900b radv: add radv_init_color_image_metadata() helper
In order to separate initialization from decompression. In the
future, that will allow us to init DCC/FMASK/CMASK in one shot.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:54 +02:00
Samuel Pitoiset
fde7b90ecf radv: make radv_initialise_cmask() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:51 +02:00
Samuel Pitoiset
790f6e4718 radv: clean up radv_handle_image_transition() a bit
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:49 +02:00
Samuel Pitoiset
6967d32beb radv: add radv_handle_color_image_transition() helper
To handle CMASK, FMASK and DCC transitions in the same place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:45 +02:00
Samuel Pitoiset
c6b1f1c97a radv: handle DCC image transitions before CMASK/FMASK transitions
Mostly because DCC implies a fast-clear eliminate and we
should be able to skip some DCC decompressions by setting
a predicate like for CMASK and FMASK.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:42 +02:00
Samuel Pitoiset
79c87a45b6 radv: disable prediction only if it has been enabled
When decompressing DCC we don't enable it, so it's useless
to disable it. This reduces the number of prediction packets
sent to the GPU when performing color decompression passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:39 +02:00
Bas Nieuwenhuizen
b0e3a9b19f ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.
No clue how I missed those ...

Fixes: 4503ff760c "ac/nir: Add workaround for GFX9 buffer views."
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 11:55:48 +02:00
Brian Paul
6a519a157b gallium/osmesa: link with winsock2 library on Windows
To fix the MSVC build.  The build broke because we started to compile
the ddebug code on Windows after the mtypes.h changes.  Building ddebug
caused us to also use the u_network.c code for the first time.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
201c08c463 gallium/util: put (void) in a few function signatures
To match the header file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
65d1040435 ddebug: add PIPE_OS_UNIX/LINUX checks to fix MSVC build
Don't include Unix headers or use Unix functions when building with MSVC.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
6d41edbf8a mesa: protect #include of unistd.h with _MSV_VER check
unistd.h is unix only.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
bf67fec235 mesa: remove unused 'i' in dimensions_error_check()
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-13 19:06:55 -06:00
Marek Olšák
976db661ff radeonsi: restore si_emit_cache_flush call at the end of IBs
Fixes: 918b798668 "radeonsi: make sure CP DMA is idle at the end of IBs"
2018-04-13 20:05:53 -04:00
Daniel Schürmann
f2c6a55061 radv: enable subgroup capabilities
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:15 +02:00
Daniel Schürmann
4b0616e533 ac: handle subgroup intrinsics
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:15 +02:00
Daniel Schürmann
d5f7ebda3e ac: add LLVM build functions for subgroup instrinsics
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:09 +02:00
Daniel Schürmann
d19f20e793 ac: make ballot and umsb capable of 64bit inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
79701b414c nir: lower 64bit subgroup shuffle intrinsics
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
fd5b0e0a64 nir/spirv: Fix warning and add missing breaks.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
54937d820d nir: use ballot_bit_size when lowering ballot_bitfield_extract
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
4d802df3aa nir: subgroups instructions for 64bit ballot sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Brian Paul
1098c18af3 glsl: #undef THIS macro to fix MSVC build
THIS is a macro in one of the MSVC header files.  It's also a token
in the GLSL lexer.  This causes a compilation failure with MSVC.
This issue seems to be newly exposed after the recent mtypes.h removal
patches.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:53:12 -06:00
Brian Paul
5dc7233f44 glsl: rename 'interface' var to 'iface' to fix MSVC build
The recent mtypes.h removal patches seems to have exposed a MSVC
issue where 'interface' is defined as a macro in an MSVC header file.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:53:08 -06:00
Brian Paul
73f1e33d34 mesa: remove snprintf macro in imports.h to fix MSVC build
snprintf is a macro in the MSVC stdio.h header and we needed to
include that header before imports.h where we also defined an
snprintf macro.  Otherwise, the MSVC build would fail.  The recent
mtypes.h removal patches seems to have exposed this issue.

This patch simply removes our snprintf macro and replaces one use
of it in teximage.c with _mesa_snprintf().  There are other calls
to snprintf() in DRI drivers, but none of them are built on Windows.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:52:57 -06:00
Lionel Landwerlin
0a6547014f anv: fix number of planes for depth & stencil
We're not counting correctly with depth & stencil images.

Additionally we need to move an assert that is meant just for color
attachments.

v2: Move an assert() (Reported by Craig)
    Change aspect mask checks (Francesco)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a62a979335 ("anv: enable multiple planes per image/imageView")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105994
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-13 11:44:53 -07:00
Marek Olšák
6ff0c6f4eb gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times
which also simplifies the build scripts.
2018-04-13 14:08:14 -04:00
Marek Olšák
918b798668 radeonsi: make sure CP DMA is idle at the end of IBs 2018-04-13 14:07:20 -04:00
Marek Olšák
b6ad7075b9 gallium/hud: add a simple HUD view that only draws text
Add this prefix to the env var: "simple," For example:
    GALLIUM_HUD=simple,fps

The X coordinates are the same, but the Y coordinates are different, because
there is only text.

'+' happens to behave the same as "\n".
',' happens to behave the same as "\n\n".
2018-04-13 14:07:20 -04:00
Dylan Baker
506671594a mesa: Include unistd.h in program_lexer
Which was previously provided implicitly by mtypes.h

CC: Marek Olšák <marek.olsak@amd.com>
CC: Mark Janes <mark.a.janes@intel.com>
Fixes: 43d66c8c2d
       ("mesa: include mtypes.h less")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-13 11:03:37 -07:00
Marek Olšák
9a1363427e radeonsi: always prefetch later shaders after the draw packet
so that the draw is started as soon as possible.

v2: only prefetch the API VS and VBO descriptors

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
e4b7974ec7 radeonsi: emit shader pointers before cache flushes & waits
This code was written with the constant engine in mind.
We can simplify it now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
82799c5035 radeonsi/gfx9: don't use the workaround for gather4 + stencil
it doesn't seem to be needed.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
1372ccfe6f radeonsi: disable TC-compat HTILE on Tonga and Iceland
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
afe0bd2c55 radeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabled
just pass the flag that indicates it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
29a09e1d38 radeonsi: don't flush HTILE if there is no HTILE clear
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
5fb31a1734 radeonsi: merge 2 identical if statements in si_clear
and other cleanups

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
8a28679987 radeonsi: don't do GFX-specific texture decompression for compute
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
307bccc6df radeonsi: simplify generating the renderer string
HAVE_LLVM > 0 is a tautology.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
a3b785be4d winsys/amdgpu: allow local BOs on APUs
Local BOs ignore BO priorities, and we don't need those on APUs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Juan A. Suarez Romero
b37b35a5d2 getteximage: assume texture image is empty for non defined levels
Current code is returning an INVALID_OPERATION when trying to use
getTextureImage() on a level that has not been explicitly defined.

That is, we define a mipmapped Texture2D with 3 levels, and try to use
GetTextureImage() for the 4th levels, and INVALID_OPERATION is returned.

Nevertheless, such case is not listed as an error in OpenGL 4.6 spec,
section 8.11.4 ("Texture Image Queries"), where all the case errors for
this function are defined. So it seems this is a valid operation.

On the other hand, in section 8.22 ("Texture State and Proxy State") it
states:

  "Each initial texture image is null. It has zero width, height, and
   depth, internal format RGBA, or R8 for buffer textures, component
   sizes set to zero and component types set to NONE, the compressed
   flag set to FALSE, a zero compressed size, and the bound buffer
   object name is zero."

We can assume that we are reading this initialized empty image when
calling GetTextureImage() with a non defined level.

With this assumption, we will reach one of the other error cases defined
for the functions. In the end this means that we would end up returning
INVALID_VALUE to the caller.

This fixes arb_get_texture_sub_image piglit tests.

v2: just return INVALID_VALUE if there is no defined level (Iago)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:37 +02:00
Juan A. Suarez Romero
8d411eb6b3 gettextureimage: verify cube map is complete
According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"),
relative to errors for GetTexImage, GetTextureImage, and GetnTexImage:

  "An INVALID_OPERATION error is generated by GetTextureImage if the
   effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY, and
   the texture object is not cube complete or cube array complete,
   respectively."

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:27 +02:00
Juan A. Suarez Romero
42891dbaa1 gettextsubimage: verify zoffset and depth are correct
According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"),
relative to errors for GetTextureSubImage() function:

  "An INVALID_VALUE error is generated if the effective target is
   TEXTURE_1D and either yoffset is not zero, or height is not one.

   An INVALID_VALUE error is generated if the effective target is
   TEXTURE_1D, TEXTURE_1D_ARRAY, TEXTURE_2D or TEXTURE_RECTANGLE, and
   either zoffset is not zero, or depth is not one."

The commit fixes the check for height and depth.

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:27 +02:00
Timothy Arceri
a63e69f5f0 mesa: free debug messages when destroying the debug state
Fixes: 04a8baad37 "mesa: refactor _mesa_PopDebugGroup and _mesa_free_errors_data"

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98281
2018-04-13 22:20:48 +10:00
Timothy Arceri
c500ab2735 mesa: fix x86 builds
Fixes: 43d66c8c2d "mesa: include mtypes.h less"
2018-04-13 22:13:46 +10:00
Marek Olšák
e961824ba8 Fix make check 2018-04-12 20:03:13 -04:00
Marek Olšák
6d6b1b3890 Fix scons build 2018-04-12 19:55:01 -04:00
Marek Olšák
43d66c8c2d mesa: include mtypes.h less
- remove mtypes.h from most header files
- add main/menums.h for often used definitions
- remove main/core.h

v2: fix radv build

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-12 19:31:30 -04:00
Marek Olšák
57f4268da4 mesa: include dispatch.h less
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-12 19:31:28 -04:00
Bas Nieuwenhuizen
6ff98dbf7c radv: Implement VK_EXT_vertex_attribute_divisor.
Pretty straight forward, just pass the divisors through the shader
key and then do a LLVM divide.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-12 22:57:23 +02:00
Bas Nieuwenhuizen
7eff8d7d35 ac/surface: Allow S swizzle for displayable surfaces.
For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts
S swizzles.

At the same time addrlib prefers D swizzles is allowed, so we can
just allow S swizzles as fallback.

Fixes: b64b712558 "ac/surface/gfx9: request desired micro tile mode explicitly"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-12 21:24:55 +02:00
Eric Anholt
7bc77dbb00 broadcom/vc5: Fix a stray '`' in a comment. 2018-04-12 11:20:50 -07:00
Eric Anholt
b225cdcecc broadcom/vc5: Update the UABI for in/out syncobjs
This is the ABI I'm hoping to stabilize for merging the driver.  seqnos
are eliminated, which allows for the GPU scheduler to task-switch between
DRM fds even after submission to the kernel.  In/out sync objects are
introduced, to allow the Android fencing extension (not yet implemented,
but should be trivial), and to also allow the driver to tell the kernel to
not start a bin until a previous render is complete.
2018-04-12 11:20:50 -07:00
Eric Anholt
d9c525ed22 broadcom/vc5: Drop the finished_seqno optimization.
With the DRM scheduler changes, I'm about to remove all seqnos from the
UABI.
2018-04-12 11:20:50 -07:00
Eric Anholt
aedfd8ede4 broadcom/vc5: Drop the throttling code.
Since I'll be using the DRM scheduler, we won't run into the problem of a
runaway client starving other clients of GPU time.
2018-04-12 11:20:50 -07:00
Eric Anholt
dd9c476165 broadcom/vc5: Move flush_last_load into load_general, like for stores.
This should avoid mistakes with not flushing as we change the series of
loads.  Already, it fixes a hopefully unreachable case where we were
emitting just the TILE_COORDINATES and not the dummy store that needs to
go with it.
2018-04-12 11:20:50 -07:00
Eric Anholt
6a21a582fb broadcom/vc5: Rename read_but_not_cleared to loads_pending.
This is a more obvious name for what the variable means, and matches what
it's called for stores.
2018-04-12 11:20:50 -07:00
Eric Anholt
b946218c48 broadcom/vc5: Refactor the implicit coords/stores_pending logic.
Since I just fixed a bug due to forgetting to do these right, do it once
in the helper func.
2018-04-12 11:20:50 -07:00
Eric Anholt
ec60559f97 broadcom/vc5: Emit missing TILE_COORDINATES_IMPLICIT in separate z/s stores.
Fixes a simulator assertion failure in
KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
2018-04-12 11:20:50 -07:00
Eric Anholt
8f2999120d broadcom/vc5: Add checks that we don't try to do raw Z+S load/stores.
This was dying in the simulator on
GTF-GLES3.gtf.GL3Tests.packed_depth_stencil.packed_depth_stencil_blit.
We'll need to do basically the same thing as Z32F/S8 does in the MSAA
Z24S8 case.
2018-04-12 11:20:50 -07:00
Eric Anholt
7553cbfc9d broadcom/vc5: Fix MSAA depth/stencil size setup.
The v3dX(get_internal_type_bpp_for_output_format)() call only handles
color output formats (which overlap in enum numbers with depth output
formats), so for depth we just need to take the normal cpp times the
number of samples.
2018-04-12 11:20:50 -07:00
Leo Liu
fa328456e8 st/va: add VP9 config to enable profile2
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
dac0024b58 radeonsi: use PIPE_FORMAT_P016 format for VP9 profile2
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
f1277dabbc radeon/vcn: add VP9 profile2 support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
e8724bd1e3 vl: add VP9 profile2 support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
d9a31341ec st/va: add VP9 config to enable profile0
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
ef52ba8aa0 st/va: parse VP9 uncompressed frame header
To get some of UVD required parameters.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
bf0f5fe929 st/va: add slice parameter handling for VP9
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
05176fe65e st/va: add picture parameter handling for VP9
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
9ff83d13e5 st/va: add handles for VP9 buffers
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
30438fbf46 st/va: add VP9 picture to context
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
0f373a65e5 radeonsi: cap VP9 support to progressive buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
6adaf6de6d radeonsi: cap VP9 support to Raven
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
905368669d radeon/vcn: add VP9 context buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
e2ce7c0a62 radeon/vcn: get VP9 msg buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
6000bdb75b radeon/vcn: fill probability table to prob buffers
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
93c0f3cc13 radeon/vcn: add VP9 message buffer interface
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
caaecf3d3b radeon/vcn: add VP9 prob table buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
b628ea039f vl: add VP9 probability tables
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
eb22785bd8 radeon/vcn: add VP9 dpb buffer size
The current FW has restricted the size to the worse case,
and the new dynamic dpb buffer support is on the way from
firmware side, we will change accordingly.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
f73befdd9b radeon/vcn: add VP9 stream type for decoder
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
ca1646db89 vl: add VP9 picture description
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
29bc354684 vl: add VP9 profile0 and format
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Samuel Pitoiset
9eac49246c radv: fix radv_layout_dcc_compressed() when image doesn't have DCC
num_dcc_levels means that DCC is supported, but this doesn't
mean that it's enabled by the driver. Instead, we should rely
on radv_image_has_dcc().

This fixes some multisample regressions since 0babc8e5d6
("radv: fix picking the method for resolve subpass") on Vega.
This is because the resolve method changed from HW to FS, but
those fails are totally unexpected, so there might some
differences between Polaris and Vega here.

Fixes: 44fcf58744 ("radv: Disable DCC for GENERAL layout and compute transfer dest.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:46 +02:00
Samuel Pitoiset
ab0e625a67 radv: add radv_decompress_resolve_{subpass}_src() helpers
This helper shares common code before resolving using either
a fragment or a compute shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:44 +02:00
Samuel Pitoiset
ed93d90a67 radv: add radv_init_dcc_control_reg() helper
And add some comments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:41 +02:00
Timothy Arceri
c7e3d31b0b glsl: fix compat shaders in GLSL 1.40
The compatibility and core tokens were not added until GLSL 1.50,
for GLSL 1.40 just assume all shaders built with a compat profile
are compat shaders.

Fixes rendering issues in Dawn of War II on radeonsi which has
enabled OpenGL 3.1 compat support.

Fixes: a0c8b49284 "mesa: enable OpenGL 3.1 with ARB_compatibility"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
2018-04-12 11:51:08 +10:00
Ian Romanick
f3b14ca2e1 mesa: Silence remaining unused parameter warnings in teximage.c
src/mesa/main/teximage.c: In function ‘_mesa_test_proxy_teximage’:
src/mesa/main/teximage.c:1301:51: warning: unused parameter ‘level’ [-Wunused-parameter]
                           GLuint numLevels, GLint level,
                                                   ^~~~~
src/mesa/main/teximage.c: In function ‘texsubimage_error_check’:
src/mesa/main/teximage.c:2186:30: warning: unused parameter ‘dsa’ [-Wunused-parameter]
                         bool dsa, const char *callerName)
                              ^~~
src/mesa/main/teximage.c: In function ‘copytexture_error_check’:
src/mesa/main/teximage.c:2297:32: warning: unused parameter ‘width’ [-Wunused-parameter]
                          GLint width, GLint height, GLint border )
                                ^~~~~
src/mesa/main/teximage.c:2297:45: warning: unused parameter ‘height’ [-Wunused-parameter]
                          GLint width, GLint height, GLint border )
                                             ^~~~~~
src/mesa/main/teximage.c: In function ‘check_rtt_cb’:
src/mesa/main/teximage.c:2679:21: warning: unused parameter ‘key’ [-Wunused-parameter]
 check_rtt_cb(GLuint key, void *data, void *userData)
                     ^~~
src/mesa/main/teximage.c: In function ‘override_internal_format’:
src/mesa/main/teximage.c:2756:55: warning: unused parameter ‘width’ [-Wunused-parameter]
 override_internal_format(GLenum internalFormat, GLint width, GLint height)
                                                       ^~~~~
src/mesa/main/teximage.c:2756:68: warning: unused parameter ‘height’ [-Wunused-parameter]
 override_internal_format(GLenum internalFormat, GLint width, GLint height)
                                                                    ^~~~~~
src/mesa/main/teximage.c: In function ‘texture_sub_image’:
src/mesa/main/teximage.c:3293:24: warning: unused parameter ‘dsa’ [-Wunused-parameter]
                   bool dsa)
                        ^~~
src/mesa/main/teximage.c: In function ‘can_avoid_reallocation’:
src/mesa/main/teximage.c:3788:53: warning: unused parameter ‘x’ [-Wunused-parameter]
                        mesa_format texFormat, GLint x, GLint y, GLsizei width,
                                                     ^
src/mesa/main/teximage.c:3788:62: warning: unused parameter ‘y’ [-Wunused-parameter]
                        mesa_format texFormat, GLint x, GLint y, GLsizei width,
                                                              ^
src/mesa/main/teximage.c: In function ‘valid_texstorage_ms_parameters’:
src/mesa/main/teximage.c:5987:40: warning: unused parameter ‘samples’ [-Wunused-parameter]
                                GLsizei samples, unsigned dims)
                                        ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-11 16:20:56 -07:00
Ian Romanick
fa44941072 mesa: Silence unused parameter warning in compressedteximage_only_format
Passing ctx to compressedteximage_only_format was the only use of the
ctx parameter in _mesa_format_no_online_compression, so that parameter
had to go too.

../../SOURCE/master/src/mesa/main/teximage.c: In function ‘compressedteximage_only_format’:
../../SOURCE/master/src/mesa/main/teximage.c:1355:57: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 compressedteximage_only_format(const struct gl_context *ctx, GLenum format)
                                                         ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-11 16:20:42 -07:00
Nanley Chery
377da9eb78 blorp: Silence unused function warnings
vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from vulkan/genX_blorp_exec.c:35:0:
./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^~~~~~~~~~~~~~~~~
genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from genX_blorp_exec.c:33:0:
../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^~~~~~~~~~~~~~~~~

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-11 13:04:49 -07:00
Caio Marcelo de Oliveira Filho
89542c9ce6 nir/vars_to_ssa: Simplify node matching code
The matching code doesn't make real use of the return value. The main
function return value is ignored, and while the worker function
propagate its return value, the actual callback never returns false.

v2: Style fixes. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho
fac9dd1b93 nir/vars_to_ssa: Remove an unnecessary deref_arry_type check
Only fully-qualified direct derefs, collected in direct_deref_nodes,
are checked for aliasing, so it is already known up front that they
have only array derefs of type direct.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho
1c9bccdeb8 nir/vars_to_ssa: Rework register_variable_uses()
The return value was needed to make use of the old nir_foreach_block
helper, but not needed anymore with the macro version. Then go one
step further and move the foreach directly into the register variable
uses function.

v2: Move foreach to register_variable_uses(). (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Jason Ekstrand
bc2b170d68 nir: Use nir_builder in lower_io_to_temporaries
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-11 11:03:22 -07:00
Bas Nieuwenhuizen
bd95397d65 radv: Enable RB+ on Raven.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-11 18:46:55 +02:00
Tapani Pälli
9f29b1a4c8 vulkan: fix build issue on android (both anv/radv)
Fixes linking errors against:

   anv_GetPhysicalDeviceImageFormatProperties2KHR
   radv_GetPhysicalDeviceImageFormatProperties2KHR

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-11 13:55:49 +03:00
Nicolai Hähnle
41e6ffee49 radeonsi: correctly parse disassembly with labels
LLVM now emits labels as part of the disassembly string, which is very
useful but breaks the old parsing approach.

Use the semicolon to detect the boundary of instructions instead of going
by line breaks.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-11 12:44:30 +02:00
Nicolai Hähnle
0630e52c9e radeonsi: pass -O halt_waves to umr for hang debugging
This will give us meaningful wave information in the case of a hang where
shaders are still running in an infinite loop.

Note that we call umr multiple times for different sections of the ddebug
hang dump, and so the wave information will not necessarily match up
between sections.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-11 12:44:24 +02:00
Jason Ekstrand
69f447553c vulkan: Drop vk_android_native_buffer.xml
All the information in vk_android_native_buffer.xml is now in vk.xml.
The only exception is the extension type attribute which we can work
around in the generators while we wait for the XML to be fixed.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-10 19:29:49 -07:00
Jason Ekstrand
ae3a856c34 nir/lower_atomics: Rework the main walker loop a bit
This replaces some "if (...} { }" with "if (...) continue;" to reduce
nesting depth and makes nir_metadata_preserve conditional on progress
for the given impl.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-10 19:28:49 -07:00
Bas Nieuwenhuizen
ed94638156 radv: Enable RB+ where possible.
According to Marek, not enabling it on Stoney has a significant
negative performance impact. (And I guess this might impact
performance on Raven as well)

The register settings are pretty much copied from radeonsi. I did
not put this in the pipeline as that would make the pipeline more
dependent on the format which mean we would have to have more
pipelines for the meta shaders.

v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet
    does already.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-11 01:19:10 +02:00
Topi Pohjolainen
5d895a1f37 nir: Check if u_vector_init() succeeds
However, it only fails when running out of memory. Now, if we
are about to check that, we should be consistent and check
the allocation of the worklist as well.

CID: 1433512
Fixes: edb18564c7 nir: Initial implementation of a nir_instr_worklist
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
Topi Pohjolainen
98d3874754 mesa: Assert base format before truncating to unsigned short
CID: 1433709
Fixes: ca721b3d8: mesa: use GLenum16 in a few more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
Topi Pohjolainen
26f48fe010 intel/dev: Assert the number of slices is not zero
Fixes: c1900f5b intel: devinfo: add helper functions to fill...
CID: 1433511
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
Kenneth Graunke
8960903c90 i965: Remove brw_bo_alloc_tiled_2d from intel_detect_swizzling.
I'd like to drop this pre-isl function.  This drops one of the two uses.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-10 15:31:31 -07:00
Timothy Arceri
a05faf80c3 mesa: fix glsl version mismatch in compat profile
Drivers that only support compat 3.0 were reporting GLSL 1.40
support. This fixes issues with the menu of Dawn of War II.

Fixes: a0c8b49284 "mesa: enable OpenGL 3.1 with ARB_compatibility"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
2018-04-11 08:05:19 +10:00
Samuel Pitoiset
0babc8e5d6 radv: fix picking the method for resolve subpass
The source and destination image parameters were swapped.

No CTS changes on Polaris10, but I suspect this might
fix something.

Fixes: 2a04f5481d ("radv/meta: select resolve paths")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-10 21:55:28 +02:00
Samuel Pitoiset
9f6a28eb27 radv: add shader BOs to the list at pipeline bind time
Otherwise, the shader BOs are not added to the list on SI because
prefetching isn't supported. Calling radv_cs_add_buffer() in the
prefetch codepath was a bad idea.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105952
Fixes: 4ad7595f35 ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Turo Lamminen <turo@alternativegames.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-10 21:55:28 +02:00
Marek Olšák
e29facff31 ac/surface: don't set the display flag for obviously unsupported cases (v2)
This enables the tile swizzle for some cases of the displayable micro mode,
and it also fixes an addrlib assertion failure on Vega.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-04-10 13:06:03 -04:00
Marek Olšák
19ce5048ee radeonsi: add shader binary padding for UMR 2018-04-10 13:05:20 -04:00
Marek Olšák
b64b712558 ac/surface/gfx9: request desired micro tile mode explicitly
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-10 12:44:41 -04:00
Emil Velikov
5dd02123a0 docs/release-calendar: update to include 18.1 and 18.2
Dylan has kindly stepped up to help with 18.1.0, while I've taken the
liberty to nominate Andres for 18.2.0 ;-)

As always, people are welcome to swap/adjust where needed.

v2: Add Juan for 18.0.x (Juan)

Cc: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 16:08:54 +01:00
Emil Velikov
8eceac9de7 glsl: remove unreachable assert()
Earlier commit enforced that we'll bail out if the number of terminators
is different than 2. With that in mind, the assert() will never trigger.

Fixes: 56b867395d ("glsl: fix infinite loop caused by bug in loop
unrolling pass")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 16:04:50 +01:00
Juan A. Suarez Romero
0d0ef8ae33 spirv: autotools: add vtn_gather_types_c.py in distribution tarball
Fixes: 042ee4bea2 "(spirv: Move SPIR-V building to Makefile.spirv.am and
spirv/meson.build")

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 10:37:46 +02:00
Juan A. Suarez Romero
15ed757834 radeonsi: autotools: add si_build_pm4.h in dist tarball
Fixes: 5777488406 ("radeonsi: move r600_cs.h contents into si_pipe.h,
si_build_pm4.h")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 10:33:28 +02:00
Bas Nieuwenhuizen
4381be4648 ac/nir: Use an array instead of hashtable for SSA defs.
Saves about 2% of compile time for F1 2017, as well as reduce code
size of an optimized libvulkan_radeon.so by about 1 KiB.

This still keeps the hashtable, as we also stored blocks in there.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-10 09:53:16 +02:00
Timothy Arceri
6066f08ee9 st/mesa: finalise tcs/tes/geom NIR before storing it to the cache
We don't create variants of the NIR so here we finalise it before
caching to avoid unnecessary processing when restoring it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 15:10:16 +10:00
Timothy Arceri
bc71e20993 st/mesa: exit st_translate_fragment_program() earlier for NIR path
This avoids a bunch of scanning that is only used by the TGSI path.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 15:10:16 +10:00
Timothy Arceri
494a5c3501 radeonsi/nir: tidy up si_nir_load_sampler_desc()
This makes it easier to follow the code, and also initialises
dynamic_index which will be useful for adding bindless textures
support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
d7cbe795ed radeonsi/nir: set uses_bindless_images for images
V2: add missing intrinsics (Spotted-by: Samuel Pitoiset)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
74b3fc2ce0 nir: dont lower bindless samplers
We neeed to skip the var if its not a uniform here as well as checking
the bindless flag since UBOs can contain bindless samplers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
bd4cc54c8b st/glsl_to_nir: set paramater value offset as driver location for packed uniforms
This allows us to simplify the code and will also be useful for supporting
bindless textures.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
222d862cd3 radeonsi/nir: don't add bindless samplers/images to declared bitmasks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
f33d9036b9 st/mesa: stop calling _mesa_init_shader_object_functions()
This sets the LinkShader function for the driver, but for the st we
set it properly with the following call to st_init_program_functions().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Jason Ekstrand
c3f9d5c235 anv/pipeline: Lower more constant initializers earlier
Once we've gotten rid of everything but the main entrypoint, there's no
reason why we should go ahead and lower them all.  This is what radv
does and it will make future work easier.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Jason Ekstrand
14e0a222d9 spirv: Use the LOCAL_GROUP_SIZE system value
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Jason Ekstrand
131d454c35 nir/lower_system_values: Support SYSTEM_VALUE_LOCAL_GROUP_SIZE
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Lionel Landwerlin
f3353e53db intel: aubinator: print out addresses of invalid instructions
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-10 00:58:38 +01:00
Bas Nieuwenhuizen
41fbcc7901 radv: Always reset draw user SGPRs after secondary command buffer.
As we sometimes reset them to -1, -1 does not mean that they are
not written by the secondary command buffer.

Fixes: ad11fc3571 "radv: don't emit unneeded vertex state."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 23:04:42 +02:00
Bas Nieuwenhuizen
74b0b869dd radv: Don't set instance count using predication.
The packet can sometimes be skipped, but we still think the change takes effect.

This just makes the packet always take effect.

Fixes: ad11fc3571 "radv: don't emit unneeded vertex state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105942
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 23:04:35 +02:00
Rob Clark
d66dc34316 mesa/st/nir: fix instruction removal
At one point this kinda worked (or at least didn't cause problems).  But
with deref-instructions it results in dangling deref instructions not
being properly removed.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Rob Clark
becf2d1fac mesa/st/nir: fix naked lowering pass call
Not using the macro means no nir_validate in debug builds, resulting in
problems showing up only after later passes.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Rob Clark
c4457113e9 nir: add comment about nir_src_copy()
So it is more clear about when to use nir_instr_rewrite_src()

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Nanley Chery
1d94aa1987 i965: Make the miptree clear color setter take a gl_color_union
We want to hide the internal details of how the miptree's clear color
is calculated.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Nanley Chery
3dbb49a978 i965/miptree: Move the clear color and value setter implementations
These will get more complex in later commits.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Nanley Chery
1ce7ae391e i965: Use the brw_context for the clear color and value setters
Do what all the other functions in the miptree API do.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Bas Vermeulen
c63bef15fc radeonsi: convert dispatch packet to little endian
The parameters for the compute engine are wrong when using
an E8860 on a big endian machine.
To fix this, convert the contents of struct dispatch_packet
to little endian.

This ensures that get_global_id(0) and similar functions
in the OpenCL code get the correct endian values, and
makes my simple OpenCL program work correctly.

Signed-off-by: Bas Vermeulen <bas@daedalean.ai>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-04-09 13:47:52 -04:00
Bas Vermeulen
be628e4749 radeonsi: correct si_vgt_param_key on big endian machines
Using mesa OpenCL failed on a big endian PowerPC machine because
si_vgt_param_key is using bitfields and a 32 bit int for an
index into an array.

Fix si_vgt_param_key to work correctly on both little endian
and big endian machines.

Signed-off-by: Bas Vermeulen <bas@daedalean.ai>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-09 13:42:30 -04:00
Marek Olšák
f33e4482b3 radeonsi: don't set RB+ registers on GFX9 chips without RB+
CLEAR_STATE initializes them properly.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 13:40:25 -04:00
Emil Velikov
ea2536cd26 etnaviv: meson: add etnaviv_query_pm.[ch] to the sources
Otherwise building the driver will fail with unresolved symbols.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105960
Fixes: 72d2043be0 ("etnaviv: add perfmon query implementation")
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Clayton Craft <clayton.a.craft@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-09 19:09:24 +02:00
Xiong, James
f23b45dce3 i965: return the fourcc saved in __DRIimage when possible
When creating a image from a texture, the image's dri_format is
set to the first plane's format, and used to look up for the
fourcc. e.g. for FOURCC_NV12 texture, the dri_format is set to
__DRI_IMAGE_FORMAT_R8, we end up with a wrong entry in function
intel_lookup_fourcc():
   { __DRI_IMAGE_FOURCC_R8, __DRI_IMAGE_COMPONENTS_R, 1,
     { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 }, } },
instead of the correct one:
   { __DRI_IMAGE_FOURCC_NV12, __DRI_IMAGE_COMPONENTS_Y_UV, 2,
     { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 },
       { 1, 1, 1, __DRI_IMAGE_FORMAT_GR88, 2 } } },
as a result, a wrong fourcc __DRI_IMAGE_FOURCC_R8 was returned.

To fix this bug, the image inherits the texture's planar_format that
has the original fourcc; Upon querying, if planar_format is set,
return the saved fourcc; Otherwise fall back to the old way.

v3: add a bug description and "cc mesa-stable" tag (Jason)
  remove redundant null pointer check (Tapani)
  squash 2 patches into one (James)
v2: fall back to intel_lookup_fourcc() when planar_format is NULL
  (Dongwon & Matt Roper)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Xiong, James <james.xiong@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-09 18:16:59 +03:00
Bastien Orivel
42c2f5b579 nir: Fix a typo in src/compiler/Makefile.nir.am
Since 31d91f019b, the makefile tries to
find the file SConstript.spirv instead of SConscript.spirv which breaks
the make dist command.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-09 08:32:45 -06:00
Samuel Pitoiset
04e609f1f8 radv: fix prefetching of vertex shader and VBOs on SI
Forgot one check... Too many mistakes for a simple change.

Fixes: f1d7c16e85 ("radv: fix prefetching compute shaders on CIK and older chips")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 16:14:12 +02:00
Samuel Pitoiset
56a4d03b0c radv: implement VK_AMD_shader_core_properties
Simple extension that only returns information for AMD hw.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
466aba9fa2 radv: add RADV_NUM_PHYSICAL_VGPRS constant
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
2f7bb93146 radv: add radv_get_num_physical_sgprs() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
b30dec738a vulkan: Update the XML and headers to 1.1.72
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Andres Gomez
a055f5108d docs: properly escape characters
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-04-09 13:47:40 +03:00
Andres Gomez
7cf3932098 mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage
Fixes: 03fd6704db ("mesa: Add support for a new override string
MESA_GLES_VERSION_OVERRIDE")

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-09 13:47:40 +03:00
Marek Olšák
806ab42c0f mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override
v2:
 - Provide a correct explanation on the envvars documentation (Ian).
 - Provide a more correct explanation on the function comments (Andres).
v3:
 - Homogenize documentation and inline comments (Emil).
 - Correct a typo (Emil).

Fixes: 2599b92eb9 ("mesa: allow forcing >=3.1 compatibility contexts
with MESA_GL_VERSION_OVERRIDE")

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-09 13:47:40 +03:00
Andres Gomez
c6067fcd07 dri_util: don't fail when not supporting ARB_compatibility with GL3.1
Currently, any driver that does not support the ARB_compatibility
extension will fail on GL3.1 context creation if the application does
not request the forward-compatiblity flag.

Restore the original check which changes mesa_api to API_OPENGL_CORE,
only when:
 - GL3.1 is requested, without the forward-compatiblity flag.
 - driver does not support ARB_compatibility - as deduced by
max_gl_compat_version.

Fixes: a0c8b49284 ("mesa: enable OpenGL 3.1 with ARB_compatibility")

v2:
 - Improve commit log (Emil).
 - Provide a correct explanation on the features documentation (Ian).

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-09 13:46:34 +03:00
Andres Gomez
044acd3569 dri_util: when overriding, always reset the core version
This way we won't fail when validating just because we may have a non
overriden core version that is lower than the requested one, even when
the compat version is high enough.

For example, running glcts from VK-GL-CTS with i965, this will
succeed:

$ MESA_GL_VERSION_OVERRIDE=4.6 ./glcts --deqp-case=KHR-GL46.info.vendor

While, this will fail:

$ MESA_GL_VERSION_OVERRIDE=4.6COMPAT ./glcts --deqp-case=KHR-GL46.info.vendor

Fixes: 464c56d3d5 ("dri_util: Use
_mesa_override_gl_version_contextless")

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-09 13:18:16 +03:00
Samuel Pitoiset
b0f8ad189c radv: add radv_image_is_tc_compat_htile() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:26 +02:00
Samuel Pitoiset
95d5ad80e9 radv: add radv_use_dcc_for_image() helper
And add some TODOs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:24 +02:00
Samuel Pitoiset
fab5fe4284 radv: rename radv_image_is_tc_compat_htile()
... to radv_use_tc_compat_htile_for_image(). This function
name makes more sense to me because we want to know if and
only if TC-compat HTILE should be used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:21 +02:00
Samuel Pitoiset
2692736cee radv: simplify a check in radv_initialise_color_surface()
If the image has FMASK metadata, the number of samples is > 1
because radv_image_can_enable_fmask() handles that already.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:16 +02:00
Samuel Pitoiset
ed41e776d0 radv: clean up radv_vi_dcc_enabled()
And rename to radv_dcc_enabled() to be consistent.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:14 +02:00
Samuel Pitoiset
e213f19907 radv: clean up radv_htile_enabled()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:12 +02:00
Samuel Pitoiset
0fc9113ac5 radv: add radv_image_has_{cmask,fmask,dcc,htile}() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:10 +02:00
Samuel Pitoiset
32f5174ce8 radv: add radv_get_cmask_fast_clear_value() helper
DCC for MSAA textures are currently unsupported but that will
be used later on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:08 +02:00
Samuel Pitoiset
f882c62218 radv: add radv_clear_{cmask,dcc} helpers
They will help for DCC MSAA textures and if we support mipmaps
in the future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:05 +02:00
Axel Davy
d899826733 st/nine: Do not use scratch for face register
Scratch registers are reused every instructions.
Since vFace is reused, a new temporary register
should be used.

Fixes: https://github.com/iXit/Mesa-3D/issues/311

Signed-off-by: Axel Davy <davyaxel0@gmail.com>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-04-08 22:49:43 +02:00
Christian Gmeiner
9e80273693 etnaviv: expose perfmon query groups
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:45 +02:00
Christian Gmeiner
c320b158f5 etnaviv: add query_group_info for perfmon counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:38 +02:00
Christian Gmeiner
5a3b744ed2 etnaviv: assign group_ids to perfmon queries
Prep work for AMD_performance_monitor support.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:34 +02:00
Christian Gmeiner
4020fa3e08 etnaviv: support MC performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:40 +02:00
Christian Gmeiner
3c3f936ae1 etnaviv: support TX performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:12 +02:00
Christian Gmeiner
f380ce13f0 etnaviv: support RA performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:04 +02:00
Christian Gmeiner
3af0e228e5 etnaviv: support SE performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:50 +02:00
Christian Gmeiner
9ae86c1306 etnaviv: support PA performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:46 +02:00
Christian Gmeiner
69bebe06e3 etnaviv: support SH performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:42 +02:00
Christian Gmeiner
1f603402f6 etnaviv: support PE performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:37 +02:00
Christian Gmeiner
d0bed0b494 etnaviv: support HI performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:32 +02:00
Christian Gmeiner
72d2043be0 etnaviv: add perfmon query implementation
Add needed infrastructure to use performance monitor
requests for queries.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:25 +02:00
Christian Gmeiner
7e3dba301e etnaviv: sw queries: return correct number of groups
Fixes: 3d912bd742 ("etnaviv: add query_group_info for sw counters")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:13:04 +02:00
Lucas Stach
208891650b etnaviv: advertise YUV formats as external only
We only support importing YUV as OES external resources.
This will change in the future, but for now this fixes the
advertised capabilities in eglQueryDmaBufModifiersEXT.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:11:46 +02:00
Lucas Stach
dfe4a08ccd gallium/util: implement util_format_is_yuv
This adds a helper to check if a pipe format is in YUV color space.
Drivers want to know about this, as YUV mostly needs special handling.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:10:57 +02:00
Rhys Perry
19254a977b nvc0: finish implementation of PIPE_QUERY_SO_OVERFLOW_PREDICATE
This also removes some useless code leftover from old changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Rhys Perry
14cc8c55ea nvc0: change ACQUIRE_EQUAL to ACQUIRE_GEQUAL in nvc0_hw_query_fifo_wait
If a fence is created in between nvc0_hw_end_query and
nvc0_hw_query_fifo_wait, the sequence number in nvc0->screen->fence.bo can
be larger than hq->fence->sequence before the semaphore is created,
resulting in the semaphore never being triggered.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Rhys Perry
98d15e0550 nvc0: ensure the query's fence has been emitted in nvc0_hw_query_fifo_wait
If the fence has not been emitted, hq->fence->sequence would be zero. This
would result in the semaphore never being triggered, blocking all later
commands in the pushbuf.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
[imirkin: use nouveau_fence_emit instead]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Ilia Mirkin
90bb2d7152 st/mesa: tex offsets can't be in a const or 2d-indexed
All consts are now implicitly 2d (they set .Dimension), so trigger
asserts. Also, the texture offset can't handle any sort of 2d indexing.
While this could be tacked on, this seems unnecessary, just move it off
into a separate temp.

Fixes assertion failure in
tests/spec/arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag

Note that this was an issue even before the const-always-2d thing, since
there was no detection of when even a proper second dimension was used,
e.g. for UBO or geom/tess inputs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-07 16:45:00 -04:00
Ilia Mirkin
2a2b22e9b1 nvc0: restore image binding on RGB10A2, remove from BGR10A2
Fixes a bunch of new CTS pbo tests that use those as an output format,
which the state tracker converts into buffer image writes.

No part of the driver is ready for BGR10A2. It could probably be enabled
on Maxwell+, but seems unnecessary. This error was introduced when
flipping the displayable bit on those formats, which accidentally also
moved the image bit.

Fixes: e1a70aed10 (nv50,nvc0: mark ABGR format as displayable instead of ARGB format)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Rob Clark
684f7cd7e3 freedreno/ir3: use lower_global_vars_to_local in cmdline compiler
tgsi_to_nir emits things with arrays as global vars.. and nir->ir3 does
lower_locals_to_regs.  But nothing was lowering global to local, which
breaks compiling tgsi shaders

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-07 11:33:41 -04:00
Kenneth Graunke
a3782a612f i965: Use %x instead of %u in debug print.
I mistakenly printed out the address as 0x<decimal number> instead of
printing a proper hex number.  This was...surprising.
2018-04-06 22:57:48 -07:00
Dylan Baker
b5f92b6fd4 meson: fix warnings about comparing unlike types
In the old days (0.42.x), when mesa's meson system was written the
recommendation for handling conditional dependencies was to define them
as empty lists. When meson would evaluate the dependencies of a target
it would recursively flatten all of the arguments, and empty lists would
be removed. There are some problems with this, among them that lists and
dependencies have different methods (namely .found()), so the
recommendation changed to use `dependency('', required : false)` for
such cases.  This has the advantage of providing a .found() method, so
there is no need to do things like `dep_foo != [] and dep_foo.found()`,
such a dependency should never exist.

I've tested this with 0.42 (the minimum we claim to support) and 0.45.
On 0.45 this removes warnings about comparing unlike types, such as:

meson.build:1337: WARNING: Trying to compare values of different types
(DependencyHolder, list) using !=.

v2: - Use dependency('', required : false) instead of
      declare_dependency(), the later will always report that it is
      found, which is not what we want.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-06 15:29:53 -07:00
Ian Romanick
81ed629b38 intel/compiler: Explicitly cast register type in switch
brw_reg::type is "enum brw_reg_type type:4".  For whatever reason, GCC
is treating this as an int instead of an enum.  As a result, it doesn't
detect missing switch cases and it doesn't detect that flow can get out
of the switch.

This silences the warning:

src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-06 15:22:10 -07:00
Axel Davy
39240926cd st/nine: Declare lighting consts for ff shaders
The lighting constants were not declared previously,
but were accessed with indirect addressing, which is
illegal.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105442

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-04-06 23:34:31 +02:00
Caio Marcelo de Oliveira Filho
67c728f7a9 nir: rename variables in nir_lower_io_to_temporaries for clarity
In the emit_copies() function, the use of "newv" and "temp" names made
sense when only copies from temporaries to the new variables were
being done. But now there are other calls to copy with other pairings,
and "temp" doesn't always refer to a temporary created in this
pass. Use the names "dest" and "src" instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-06 11:08:08 -07:00
Samuel Pitoiset
8f9f62c2db radv: don't pass the pipeline to radv_flush_constants()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:27 +02:00
Samuel Pitoiset
2bd50cceff radv: rename radv_cmd_buffer_update_vertex_descriptors()
... to radv_flush_vertex_descriptors().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-06 19:46:23 +02:00
Samuel Pitoiset
e829a0cc1e radv: do not try to skip draw calls when VBOs upload failed
This is unnecessary because we record an error which should
be returned by vkEndCommandBuffer(), and the app shouldn't
submit a command buffer when this happens.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:21 +02:00
Samuel Pitoiset
f1d7c16e85 radv: fix prefetching compute shaders on CIK and older chips
Because the check was moved to radv_emit_prefetch_L2().

Fixes: 4ad7595f35 ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:18 +02:00
Samuel Pitoiset
7fe586f6fb radv: only enable PERFECT_ZPASS_COUNTS for precision occlusion queries
This unnecessary when the precision bit flag is not set, and this
might hurt performance. The Vulkan explains that not setting
VK_QUERY_CONTROL_PRECISE_BIT might be more efficient on some
implementations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 09:07:34 +02:00
Samuel Pitoiset
d53dff3bfc radv: enable the Polaris small primitive filter control
Enable it directly in the preamble, but do not enable line
on Polaris10/11/12 because there is a hw bug.

There is possibly an issue when MSAA is off, but this doesn't
regress any CTS and AMDVLK doesn't have a workaround as well.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 09:07:31 +02:00
Jason Ekstrand
c5b87c94d8 anv: Add WSI support for the I915_FORMAT_MOD_Y_TILED_CCS
v2 (Jason Ekstrand):
 - Return the correct enum values from anv_layout_to_fast_clear_type

v3 (Jason Ekstrand):
 - Always return ANV_FAST_CLEAR_NONE and leave doing the right thing for
   the patch which adds a modifier which supports fast-clears.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Daniel Stone <daniels@collabora.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-05 21:17:02 -07:00
Anuj Phogat
ff8b82666a Add more Coffee Lake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-04-05 14:50:11 -07:00
Jan Vesely
2406e8848e radeonsi: Reorder checks in si_check_render_feedback
si_get_total_colormask accesses NULL pointer on compute shaders
Fixes crashes on clover
Fixes: 0669dca9c0 ("radeonsi: skip DCC render feedback checking if color writes are disabled")
CC: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-05 17:11:18 -04:00
Kevin Rogovin
cc41603d6d intel/tools: new intel_sanitize_gpu tool
Adds a new debug tool to pad each GEM BO allocated with (weak)
pseudo-random noise values which are then checked after each
batchbuffer dispatch to the kernel. This can be quite valuable to
find diffucult to track down heisenberg style bugs.

[scott.d.phillips@intel.com: split to separate tool]

v2: (by Scott D Phillips)
    - track gem handles per fd (Kevin)
    - remove handles on GEM_CLOSE (Kevin)
    - ignore prime handles
    - meson & shell script

v3: (by Scott D Phillips)
    - don't track prime bos at all (Kevin)
    - protect the hash table with a mutex (Kevin)
    - hook fds by drm_version.name, not path (Chris Wilson)

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-05 13:52:49 -07:00
Jason Ekstrand
e85b95269e prog/nir: Simplify some load/store operations
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-05 13:20:39 -07:00
Marek Olšák
c7dd59b06d radeonsi: fix a crash if ps_shader.cso is NULL in si_get_total_colormask 2018-04-05 15:53:52 -04:00
Marek Olšák
be4250aa88 radeonsi: remove more R600 references
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c0dfc0c6df radeonsi: try to fix android
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f55d1f806e radeonsi: try to fix meson
This is not fully tested. Meson can't link LLVM even though automake can.

PATH=/usr/llvm/x86_64-linux-gnu/bin:$PATH meson build/ -Dgallium-va=false \
    -Dplatforms=x11,drm -Dgallium-drivers=radeonsi -Ddri-drivers= \
    -Dgallium-omx=disabled -Dgallium-xvmc=false -Dgles1=false \
    -Dtexture-float=true -Dvulkan-drivers=

src/gallium/auxiliary/libgallium.a(gallivm_lp_bld_misc.cpp.o):
(.data.rel.ro._ZTI26DelegatingJITMemoryManager[_ZTI26DelegatingJITMemoryManager]+0x10):
undefined reference to `typeinfo for llvm::RTDyldMemoryManager'

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
38faac43e3 radeonsi: don't build libradeon.la separately
for better parallelism

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f9323ddbb9 radeonsi: clean up GET_MAX_VIEWPORT_RANGE definition
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
6a93441295 radeonsi: remove r600_common_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5f77361d2e radeonsi: remove r600_pipe_common::screen
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
321bd6c280 radeonsi: move r600_buffer_common.c and r600_texture.c into radeonsi
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
d58080b318 radeonsi: move r600_gpu_load.c to si_gpu_load.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f7f4ba5306 radeonsi: move r600_query.c/h files to si_query.c/h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5777488406 radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
eced536ed6 radeonsi: rename query definitions R600_ -> SI_
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
72e9e98076 radeonsi: move and rename R600_ERR out of r600_pipe_common.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
076afb4f0e radeonsi: rename a few R600/r600_ -> SI_/si_
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5f1cddde78 radeonsi: move definitions out of r600_pipe_common.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
a67ee02388 radeonsi: move functions out of and remove r600_pipe_common.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
90d12f1d77 radeonsi: rename r600 -> si in some places
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
50c7aa6756 radeonsi: use si_context instead of pipe_context in parameters pt3
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e332ba61f4 radeonsi: use si_context instead of pipe_context in parameters pt2
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c424f86180 radeonsi: use si_context instead of pipe_context in parameters pt1
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2a62e5eec9 radeonsi: pass sctx to si_rebind_buffer and clean up
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
605ba1b9ae radeonsi: use r600_common_context less pt7
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0b2f2a6a18 radeonsi: use r600_common_context less pt6
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
4c5efc40f4 radeonsi: update copyrights
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
95bc30275b radeonsi: switch radeon_add_to_buffer_list parameter to si_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e5053060eb radeonsi: use r600_common_context less pt5
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
884fd97f6b radeonsi: use r600_common_context less pt4
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
a8291a23c5 radeonsi: use r600_common_context less pt3
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3069cb8b78 radeonsi: use r600_common_context less pt2
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
71d9028b7a radeonsi: use r600_common_context less pt1
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0606190059 radeonsi: don't use r600_common_context in si_emit_cache_flush
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3de323f9bb radeonsi: switch r600_atom::emit parameter to si_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2b70dd8c8a radeonsi: flatten / remove struct r600_ring
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f7de8686de radeonsi: remove r600_ring::flush callback
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
4598ad6a00 radeonsi: make radeon_add_to_buffer_list_check_mem be gfx-only
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
426ef367f3 radeonsi: add_to_buffer_list functions can return void
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c0987d8adf radeonsi: move saved_cs functions from r600_pipe_common.c to si_debug.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
37ef4765ff radeonsi: move DMA CS functions from r600_pipe_common.c to si_dma_cs.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
19f550f1d2 radeonsi: move EOP event code from r600_pipe_common.c to si_fence.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fc6a44e169 radeonsi: rename si_hw_context.c -> si_gfx_cs.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
42500d1dab radeonsi: move si_destroy_saved_cs to si_debug.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
02a61e71a2 radeonsi: rename si_begin_new_cs -> si_begin_new_gfx_cs
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fa09388704 radeonsi: rename si_need_cs_space -> si_need_gfx_cs_space
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
85e75b2da5 radeonsi: remove r600_pipe_common::blit_decompress_depth
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e04389cc2a radeonsi: remove r600_pipe_common::decompress_dcc
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
9d7f809c03 radeonsi: remove r600_pipe_common::invalidate_buffer
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
898500c440 radeonsi: remove r600_pipe_common::rebind_buffer
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fbf1bf9b8f radeonsi: remove r600_common_context::set_occlusion_query_state
and remove unused old_enable parameter.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5ed8b54ffe radeonsi: remove r600_pipe_common::save_qbo_state
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
72842d15ac radeonsi: remove unused query code
The get_size perf counter callback is also inlined and removed.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3f55fe99d6 radeonsi: use num_cs_dw_queries_suspend
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
54f28359b5 radeonsi: remove r600_pipe_common::need_gfx_cs_space
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0447e8e59e radeonsi: remove r600_pipe_common::set_atom_dirty
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5c125ab1ba radeonsi: remove r600_pipe_common::check_vm_faults
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
17e8f1608e radeonsi: call CS flush functions directly whenever possible
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0669dca9c0 radeonsi: skip DCC render feedback checking if color writes are disabled 2018-04-05 15:34:58 -04:00
Dylan Baker
6ac87c1769 meson: fix megadriver symlinking
Which should be relative instead of absolute.

Fixes: f7f1b30f81
       ("meson: extend install_megadrivers script to handle symmlinking")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105567
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-05 10:48:38 -07:00
Dylan Baker
19dbed6477 meson: Set .so version for xa like autotools does
Fixes: 0ba909f0f1
       ("meson: build gallium xa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-05 10:46:14 -07:00
Rafael Antognolli
7728720f07 anv: Make blorp update the clear color.
Instead of updating the clear color in anv before a resolve, just let
blorp handle that for us during fast clears.

v5: Update comment about HiZ clear color (Jordan).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
e8cadb673d anv: Use clear address for HiZ fast clears too.
Store the default clear address for HiZ fast clears on a global bo, and
point to it when needed.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
021e1885d0 anv: Emit the fast clear color address, instead of value.
On Gen10+, instead of copying the clear color from the state buffer to
the surface state, just use the address of the state buffer in the
surface state directly. This way we can avoid the copy from state buffer
to surface state.

v4:
 - Remove use_clear_address from anv code. (Jason)
 - Use the helper to extract clear color from attachment (Jason)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
3f96b459f4 anv: Add a helper to extract clear color from the attachment.
Extract the code from color_attachment_compute_aux_usage, so we can
later reuse it to update the clear color state buffer.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
7987d041fd i965/surface_state: Emit the clear color address instead of value.
On Gen10, when emitting the surface state, use the value stored in the
clear color entry buffer by using a clear color address in the surface
state.

v4: Use the clear color offset from the clear_color_bo, when available.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
2efe8309d3 i965/blorp: Update the fast clear value buffer.
On Gen10, whenever we do a fast clear, blorp will update the clear color
state buffer for us, as long as we set the clear color address
correctly.

However, on a hiz clear, if the surface is already on the fast clear
state we skip the actual fast clear operation and, before gen10, only
updated the miptree. On gen10+ we need to update the clear value state
buffer too, since blorp will not be doing a fast clear and updating it
for us.

v4:
 - do not use clear_value_size in the for loop
 - Get the address of the clear color from the aux buffer or the
 clear_color_bo, depending on which one is available.
 - let core blorp update the clear color, but also update it when we
 skip a fast clear depth.

v5: Better subject (Jordan).
v6: Remove outdated comment (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
5449f942f2 i965: Add aux_buf variable to simplify code.
In a follow up patch, we make use of clear_color_bo, which is in
mt->mcs_buf or mt->hiz_buf. To avoid duplicating more code that does the
same thing on both aux buffers, just use aux_buf already.

v5: Add aux_buf to brw_wm_surface_state too.
v6: Drop aux_surf and use aux_buf->surf instead (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
8735c86ce0 i965/miptree: Add new clear color BO for winsys aux buffers
Add an extra BO to store clear color when we receive the aux buffer from
the window system. Since we have no control over the aux buffer size in
this case, we need the new BO to store only the clear color.

v5:
 - Better subject (Jordan).
 - Drop alignment from brw_bo_alloc().

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
ab633c2d61 i965/miptree: Add space to store the clear value in the aux surface.
Similarly to vulkan where we store the clear value in the aux surface,
we can do the same in GL.

v2: Remove unneeded extra function.
v3: Use clear_value_state_size instead of clear_value_size.
v4:
 - rename to clear_color_state_size
 - store clear_color_bo and clear_color_offset in the aux buf struct
v5: Unreference clear color bo (Jordan)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
14260e7c60 intel/blorp: Update clear color state buffer during fast clears.
We always want to update the fast clear color during a fast clear on
i965. On anv, we are doing that before a resolve, but by adding support
to blorp, we can do a similar thing and update it during a fast clear
instead.

The goal is to remove some code from anv that does such update, and
centralize everything in blorp, hopefully removing a lot of code
duplication. It also allows us to have a similar behavior on gen < 9 and
gen >= 10.

v5: s/we/we are/ (Jordan)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
92eb5bbc68 intel/blorp: Only copy clear color when doing a resolve.
We only need to copy the clear color from the state buffer to the
inlined surface state when doing a resolve.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
188a473b9a intel/blorp: Add support for fast clear address.
On gen10+, if surface->clear_color_addr is present, use it directly
intead of copying it to the surface state.

v4: Remove redundant #if clause for GEN <= 10 (Jason)
v5: Move flush after the reloc, and keep lower bits (Topi).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
b8f45cf967 intel/isl: Add support to emit clear value address.
gen10 can emit the clear color by setting it on a buffer somewhere, and
then adding only the address to the surface state.

This commit add support for that on isl_surf_fill_state, and if that is
requested, skip setting the clear value itself.

v2: Add assert to make sure we are at least on gen10.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
94675edcfd intel: Use Clear Color struct size.
The size of the clear color struct (expected by the hardware) is 8
dwords (isl_dev.ss.clear_value_state_size here). But we still need to
track the size of the clear color, used when memcopying it to/from the
state buffer. For that we keep isl_dev.ss.clear_value_size.

v4:
 - Add struct to gen11 too (Jason, Jordan)
 - Add field for Converted Clear Color to gen11 (Jason)
 - Add clear_color_state_offset to differentiate from
   clear_value_offset.
 - Fix all the places where clear_value_size was used.

v5 (Jason):
 - Split genxml changes to another commit.
 - Remove unnecessary gen checks.
 - Bring back missing offset increment to init_fast_clear_color().

v6 (Jason):
 - On init_fast_clear_color, change:
   addr.offset += 4 => sdi.Address.offset += i * 4
 - Use GEN_GEN instead of GEN_VERSIONx10.

[jordan.l.justen@intel.com: isl_device_init changes]
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
f77789a3f0 intel/genxml: Add Clear Color struct to gen10+.
v5: Split genxml changes into its own commit (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
7e616ae201 intel/genxml: Use a single field for clear color address on gen10.
genxml does not support having two address fields with different names
but same position in the state struct. Both "Clear Color Address"
and "Clear Depth Address Low" mean the same thing, only for different
surface types.

To workaround this genxml limitation, rename "Clear Color Address"
to "Clear Value Address" and use it for both color and depth. Do the
same for the high bits.

TODO: add support for multiple addresses at the same position in the
xml.

v2: Combine high and low order bits into a single address field.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
8e1f2e1d2d genxml: Preserve fields that share dword space with addresses.
Some instructions contain fields that are either an address or a value
of some type based on the content of other fields, such as clear color
values vs address. That works fine if these fields are in the less
significant dword, the lower 32 bits of the address, because they get
OR'ed with the address. But if they are in the higher 32 bits, they get
discarded.

On Gen10 we have fields that share space with the higher 16 bits of the
address too. This commit makes sure those fields don't get discarded.

v5: Remove spurious whitespace (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
f421a31637 anv/image: Do not override lower bits of dword.
The lower bits seem to have extra fields in every platform but gen8
(even though we don't use them in gen9). So just go ahead and avoid
using them for the address.

v4: Use Jason's suggestion for comment explaining the change.
v5: Fix aux_address comment in anv_private.h (Jason)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-05 07:42:45 -07:00
Samuel Pitoiset
942fdfe357 radv: implement a fast prefetch path for the vertex stage
This allows to start draws as soon as possible.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:48 +02:00
Samuel Pitoiset
4ad7595f35 radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:45 +02:00
Samuel Pitoiset
a8a696a38f radv: use a mask for VBOs and shaders prefetching
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:42 +02:00
Marek Olšák
8cd58df2f2 gallium/pp: fix MLAA shaders
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549
2018-04-04 20:01:43 -04:00
Marek Olšák
096942be2c gallium/pp: use user constant buffers
This fixes a radeonsi crash.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026
2018-04-04 20:01:43 -04:00
Marek Olšák
d9dc26c94e st/mesa: set stencil border color the same as intensity
This fixes some stencil border color tests on Vega and Raven chips.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-04 16:55:52 -04:00
Jon Turney
498d9d0f4d Fix use of alloca() without #include <c99_alloca.h>
Fix use of alloca() without #include <c99_alloca.h> in 1da345e5

vbo/vbo_context.c: In function '_vbo_draw_indirect':
vbo/vbo_context.c:284:34: error: implicit declaration of function 'alloca' [-Werror=implicit-function-declaration]
       struct _mesa_prim *space = alloca(draw_count*sizeof(struct _mesa_prim));
                                  ^~~~~~
vbo/vbo_context.c:284:34: warning: initialization makes pointer from integer without a cast [-Wint-conversion]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-04-04 14:34:07 +01:00
Samuel Pitoiset
922cd38172 radv: implement out-of-order rasterization when it's safe on VI+
Disabled by default for now, it can be enabled with
RADV_PERFTEST=outoforder.

No CTS regressions on Polaris, and all Vulkan games I tested
look good as well.

Expect small performance improvements for applications where
out-of-order rasterization can be enabled by the driver.

Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
d6709c91a6 radv: change blend_enable field to use four bits per CB
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
a8818d1af2 radv: scan which color blend attachments are enabled
With cb_target_enabled_4bit in order to have four bits per CB.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
ac456d0d1b radv: put more fields in radv_blend_state
Some will be used for further optimizations (ie. out-of-order rast).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
e4976ca33b radv: do not always disable dual quad mode when chip has RbPlus
For GFX9+ only, RadeonSI does this too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
b8c06a961c radv: don't use the SPI barrier management bug workaround
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
ab147cba77 radv: mask out high VM address bits in registers where needed
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Lionel Landwerlin
1beb80cb56 intel: compiler: silence compiler warning
../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]

Introduced by 8f83eea71e ("i965: Add negative_equals methods").

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-04 11:57:39 +01:00
Iago Toral Quiroga
41ac0b1443 compiler/spirv: set is_shadow for depth comparitor sampling opcodes
From the SPIR-V spec, OpTypeImage:

"Depth is whether or not this image is a depth image. (Note that
 whether or not depth comparisons are actually done is a property of
 the sampling opcode, not of this type declaration.)"

The sampling opcodes that specify depth comparisons are
OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set
is_shadow only for these (we were using the deph property of the
image until now).

v2:
 - Do the same for OpImageDrefGather.
 - Set is_shadow to false if the sampling opcode is not one of these (Jason)
 - Reuse an existing switch statement instead of adding a new one (Jason)

Fixes crashes in:
dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2018-04-04 07:57:58 +02:00
Sergii Romantsov
98b860e311 i965: Extend the negative 32-bit deltas to 64-bits
Gen8+ use 48-bit address relocations so need to extend the sign
to 64-bit return value. Without it we have higher bits zeroed
and missing the negavive values.
Haswell and older use 32-bit deltas so are unaffected by this issue.

v2:
  used int32_t fucntion parameter instead of explicit type conversion.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Andriy Khulap <andriy.khulap@globallogic.com>
Tested-by: Stuart Young <cefiar@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
2018-04-03 22:48:09 -07:00
Jason Ekstrand
800df942ea nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination
Otherwise we may end up trying to coalesce in a case such as

ssa_1 = fadd r1, r2
r3.x = fneg(r2);
r3 = vec4(ssa_1, ssa_1.y, ...)

and that would cause us to move the writes to r3 from the vec to the
fadd which would re-order them with respect to the write from the fneg.
In order to solve this, we just don't coalesce if the destination of the
vec is not SSA.  We could try to get clever and still coalesce if there
are no writes to the destination of the vec between the vec and the ALU
source.  However, since registers only come from phi webs and indirects,
the chances of having a vec with a register destination that is actually
coalescable into its source is very slim.

Shader-db results on Haswell:

    total instructions in shared programs: 13657906 -> 13659101 (<.01%)
    instructions in affected programs: 149291 -> 150486 (0.80%)
    helped: 0
    HURT: 592

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
Fixes: 2458ea95c5 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"
Reported-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-04-03 22:21:23 -07:00
Kevin Strasser
5bbde9b80f anv: Fix close(fd) before import issue in vkCreateDmaBufImageINTEL
If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel
will hit a -EBADF error. Move the close(fd) call to the end of
anv_CreateDmaBufImageINTEL().

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-03 18:33:17 -07:00
Timothy Arceri
b42633db8e glsl: always call do_lower_jumps() after loop unrolling
This fixes a bug in radeonsi where LLVM cannot handle the case where
a break exists but its not the last instruction in the block.

LLVM would fail with:
Terminator found in the middle of a basic block!
LLVM ERROR: Broken function found, compilation aborted!

Fixes: 96fe8834f5 "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively"

Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317
2018-04-04 08:40:16 +10:00
James Legg
a58fdc61e9 vulkan/wsi/wayland: fix leaks
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: Jason Ekstrand <jason@jlekstrand.net>
2018-04-03 22:09:57 +01:00
Juan A. Suarez Romero
06076ead28 docs: update calendar, add news and link release notes to 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-03 17:38:36 +00:00
Juan A. Suarez Romero
ca71b7bab8 docs: add sha256 checksums for 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ba371c7262)
2018-04-03 17:34:16 +00:00
Juan A. Suarez Romero
d89ef8ce62 docs: add release notes for 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3bf5c10c5c)
2018-04-03 17:34:16 +00:00
Jakob Bornecrantz
88e958257c st/mesa: Also use PIPE_FORMAT_R8G8B8A8_SRGB for framebuffer_sRGB.
When running virgl on a GLES host the only sRGB formats that support
rendering is RGBA and RGBX. That pipe format is in the sRGB default
lists that the state tracker uses when mapping mesa formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-04-03 17:48:52 +01:00
Lionel Landwerlin
78c18d99dc intel: gen-decoder: print all dword a field belongs to
Prior to printing a decoded field, print out all dwords that field
belongs to. In particular with address fields spanning multiple
dwords, we want to have all the dwords presented before the field is
decoded to make it easier to read.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
4d59127213 intel: genxml: decode variable length MI_LRI
MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one
command. In our drivers we only use one tuple at a time, but the
kernel might load more than one at a time.

Instead of making all the tuple part of a group, we leave out the
first tuple (the one we use in the generated packing structures).

This is particularly useful for looking at error stats generated by
the kernel.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
2841af6238 intel: gen-decoder: don't decode fields beyond a dword length
For example, a PIPE_CONTROL with DWordLength = 2 should look like
this :

0xffffe374:  0x7a000002:  PIPE_CONTROL
0xffffe374:  0x7a000002 : Dword 0
    DWord Length: 2
0xffffe378:  0x00800000 : Dword 1
    Depth Cache Flush Enable: false
    Stall At Pixel Scoreboard: false
    State Cache Invalidation Enable: false
    Constant Cache Invalidation Enable: false
    VF Cache Invalidation Enable: false
    DC Flush Enable: false
    Pipe Control Flush Enable: false
    Notify Enable: false
    Indirect State Pointers Disable: false
    Texture Cache Invalidation Enable: false
    Instruction Cache Invalidate Enable: false
    Render Target Cache Flush Enable: false
    Depth Stall Enable: false
    Post Sync Operation: 0 (No Write)
    Generic Media State Clear: false
    TLB Invalidate: false
    Global Snapshot Count Reset: false
    Command Streamer Stall Enable: false
    Store Data Index: 0
    LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
    Destination Address Type: 0 (PPGTT)
    Flush LLC: false
0xffffe37c:  0x00000000 : Dword 2
    Address: 0x00000000
0xffffe384:  0x05000000:  MI_BATCH_BUFFER_END

Prior to this change, fields beyond the length of the command would be
decoded (notice the MI_BATCH_BUFFER_END decoded as part of the
previous PIPE_CONTROL) :

0xffffe374:  0x7a000002:  PIPE_CONTROL
0xffffe374:  0x7a000002 : Dword 0
    DWord Length: 2
0xffffe378:  0x00800000 : Dword 1
    Depth Cache Flush Enable: false
    Stall At Pixel Scoreboard: false
    State Cache Invalidation Enable: false
    Constant Cache Invalidation Enable: false
    VF Cache Invalidation Enable: false
    DC Flush Enable: false
    Pipe Control Flush Enable: false
    Notify Enable: false
    Indirect State Pointers Disable: false
    Texture Cache Invalidation Enable: false
    Instruction Cache Invalidate Enable: false
    Render Target Cache Flush Enable: false
    Depth Stall Enable: false
    Post Sync Operation: 0 (No Write)
    Generic Media State Clear: false
    TLB Invalidate: false
    Global Snapshot Count Reset: false
    Command Streamer Stall Enable: false
    Store Data Index: 0
    LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
    Destination Address Type: 0 (PPGTT)
    Flush LLC: false
0xffffe37c:  0x00000000 : Dword 2
    Address: 0x00000000
0xffffe380:  0x00000000 : Dword 3
0xffffe384:  0x05000000 : Dword 4
    Immediate Data: 83886080
0xffffe384:  0x05000000:  MI_BATCH_BUFFER_END

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
81375516b2 intel: error_decode: add an option to decode all buffers
The kernel reports workaround batch buffers, but we're not presenting
them currently. Also they might not be useful for debugging purely
userspace driver issues, when problems arise because of interactions
between kernel & userspace drivers, it's nice to be able to decode
them.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
b3aa18dfd6 intel: genxml: add preemption control instructions
Helpful to debug kernel workaround batchbuffers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Dylan Baker
6f6e711c72 mesa: ensure that variable is initialized
This variable controls whether we link using the glsl code path or the
spirv path. It's set when we validate that all shaders are glsl or
spirv, but if there are no shaders attached to the program it will
remain unset, resulting in undefined behavior. We want to go down the
glsl path in that case, so initialize to false.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820
Fixes: 16f6634e7f
       ("mesa/program: Link SPIR-V shaders using the SPIR-V code-path")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-04-03 08:47:59 -07:00
Marek Olšák
d3e96b1063 radeonsi/gfx9: fix bad LLVM params in monolithic LS+HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-03 11:07:28 -04:00
Samuel Pitoiset
acf60abc54 radv: enable VK_EXT_shader_viewport_index_layer
The driver already supports exporting the Layer and ViewportIndex
built-ins from vertex or tessellation shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-03 14:05:46 +02:00
Rob Clark
51888bf07d nir+drivers: add helpers to get # of src/dest components
Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-03 06:08:56 -04:00
Rob Clark
91f9450b32 freedreno/ir3: fix fallout of unused false-depth elimination
Since we were MARK flag for both preventing loops, and tracking whether
instructions were used, we could end up in an infinite loop due to
bd2ca2bcdd.  Instead invert the logic.. mark all instructions UNUSED
up front and clear the flag as we visit them.

Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-03 06:08:56 -04:00
Timothy Arceri
7e9b7ec094 gallium/pipebuffer: fix parenthesis location
Without this the return value will never get set to -1. This
was first added in 49866c8f34 and copied in 2b396eeed9.

Fixes: 2b396eeed9 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342
2018-04-03 16:05:59 +10:00
Tapani Pälli
6b21391729 Revert "mesa: add GL_HALF_FLOAT as supported type to readpixels"
This reverts commit 41cf30b8bc.

Commit caused regressions with KHR-GLES3.packed_pixels.* tests.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eric Anholt <eric@anholt.net>
2018-04-03 08:43:30 +03:00
Mike Lothian
0bdbe4583f gallivm: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Mike Lothian
5e07881305 radeonsi: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Mike Lothian
7e144ace95 ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Daniel Stone
4cbecb6168 st/dri: Initialise modifier to INVALID for DRI2
When allocating a buffer for DRI2, set the modifier to INVALID to inform
the backend that we have no supplied modifiers and it should do its own
thing. The missed initialisation forced linear, even if the
implementation had made other decisions.

This resulted in VC4 DRI2 clients failing with:
  Modifier 0x0 vs. tiling (0x700000000000001) mismatch

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Andreas Müller <schnitzeltony@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 3f8513172f ("gallium/winsys/drm: introduce modifier field to winsys_handle")
2018-04-02 19:07:57 +01:00
Marek Olšák
2be6143032 radeonsi: implement GL_KHR_blend_equation_advanced
MSAA is supported using sample shading. Layered rendering and all texture
targets are also supported.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:25 -04:00
Marek Olšák
e04631b0f2 radeonsi: rename unpack_param -> si_unpack_param
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:23 -04:00
Marek Olšák
dc04e4bba2 radeonsi: move FMASK shader logic to shared code
We'll need it for FBFETCH in both TGSI and NIR paths.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:22 -04:00
Marek Olšák
eb77961292 radeonsi: add R600_DEBUG=nofmask to disable MSAA compression
For testing.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:20 -04:00
Marek Olšák
56342c97ee gallium/u_tests: test FBFETCH and shader-based blending with MSAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:18 -04:00
Marek Olšák
5d91c2ccea ac/gpu_info: print GB_ADDR_CONFIG 2018-04-02 13:10:37 -04:00
Marek Olšák
b1f33086ec ac/gpu_info: reorder the fields and print them nicely 2018-04-02 13:10:37 -04:00
Marek Olšák
a0a96819e1 ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memory 2018-04-02 13:10:37 -04:00
Marek Olšák
32b3932de1 ac/gpu_info: don't print irrelevant fields 2018-04-02 13:10:37 -04:00
Marek Olšák
f754217517 st/mesa: don't draw if the bound element array buffer is not allocated
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:10:36 -04:00
Iago Toral Quiroga
31881079af anv/cmd_buffer: honor pending clear views for depth/stencil attachments
v2: rebased on top of subpass rework.

v3: rebased

v4:
 - rebased
 - reset pending clear views in one go rather one bit at a time (Caio)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-02 09:53:24 +02:00
Iago Toral Quiroga
f60c5fc17e anv/cmd_buffer: consider multiview masks for tracking pending clear aspects
When multiview is active a subpass clear may only clear a subset of the
attachment layers. Other subpasses in the same render pass may also
clear too and we want to honor those clears as well, however, we need to
ensure that we only clear a layer once, on the first subpass that uses
a particular layer (view) of a given attachment.

This means that when we check if a subpass attachment needs to be cleared
we need to check if all the layers used by that subpass (as indicated by
its view_mask) have already been cleared in previous subpasses or not, in
which case, we must clear any pending layers used by the subpass, and only
those pending.

v2:
  - track pending clear views in the attachment state (Jason)
  - rebased on top of fast-clear rework.

v3:
  - rebased on top of subpass rework.

v4: rebased.

v5 (Caio):
 - Rebased.
 - Initialize pending clear views to only have bits set for layers
   that exist.
 - Reset pending clear views in one go rather one bit at a time.
 - Put "last subpass for this attachment" condition in a separate
   function to simplify the conditional that resets pending_clear_aspects.

Fixes:
dEQP-VK.multiview.readback_implicit_clear.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-02 09:53:15 +02:00
Timothy Arceri
c88e7fe29e radeonsi/nir: fix explicit component packing for geom/tess doubles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
dd3d3cc877 radeonsi/nir: gather buffers declared more accurately and use const fast path
For now we skip SI && HAVE_LLVM < 0x0600 for simplicity. We also skip
setting the more accurate masks for builtin uniforms for now as it
causes some piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
56017d8100 radeonsi: create load_const_buffer_desc_fast_path() helper
This will be shared by the TGSI and NIR backends. For simplicity
we leave the SI LLVM 5.0 and lower work around only in the TGSI
backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
7aad5e15f6 radeonsi/nir: set TGSI_PROPERTY_NEXT_SHADER
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
2ca5d9548f st/glsl_to_nir: gather next_stage in shader_info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Rob Clark
2f175bfe5d freedreno/a5xx: don't align height for PIPE_BUFFER
Buffers can be large, so we probably don't want to make them all 32x
bigger.  But they can't be rendered to (at least in GL) so we don't
need this workaround to prevent page faults on mem<->gmem.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-01 11:26:01 -04:00
Rob Clark
1866f76f7b freedreno/a5xx: fix page faults on last level
We could alternatively fall back to using "old style" draw's for
mem<->gmem (ie. what <= a4xx do) when height is not aligned to 32,
but that is somewhat more work (and not really something that could
be applied to stable)

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-01 10:50:11 -04:00
Rob Clark
afde9294b5 freedreno/ir3: fix issue w/ glamor composite shaders
Fixes an issue that became possible when we started lowering phi webs to
regs (a7ea2b4e) (although was not really seen until we also switched to
using peephole select pass (ec8bc54a) instead of lowering *all* if/else
to select).

If texture coord (or anything else that uses create_collect() to collect
scalar values in a sequence of scalar registers) was consuming a value
produced on either side of an if/else (ie. a phi lowered to nir reg,
which in ir3 is an "array" of length 1) then register allocation would
happen incorrectly and we'd end up sampling from garbage coordinates.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 16:25:13 -04:00
Rob Clark
2191a18e75 freedreno/ir3: more half-precision fixes
Some instructions require src/dst to be in full or half precision
register depending on src/dst type.  So do a better job of propagating
register type.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:16:16 -04:00
Rob Clark
e04e068f75 freedreno/ir3: add helper to create immed of specified size
We'll also need to be able to create a half-precision immediate.  So
re-work create_immed().  Prep work for following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:13:11 -04:00
Rob Clark
1f45320e51 freedreno/ir3: pass ctx instead of block to create_collect()
Prep work for following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:12:33 -04:00
Rob Clark
bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Previously false-dependencies would get flagged as used, even if the
only "use" was a false dep to (for example) prevent a load from being
scheduled after a store.

In addition to being pointless instructions, in some cases they can
cause problems.  For example, ldg (and similar instructions) depend on
an immed arg getting CP'd into the instruction, but this doesn't happen
if an instruction is otherwise unused.  Which can result in undefined
results (overwriting unintended registers).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:11:46 -04:00
Rob Clark
4f78383809 freedreno/ir3: add local_group_size
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:10:56 -04:00
Rob Clark
96e7927fb2 freedreno/ir3: clear SSA flag when assigning "ARRAY" regs too
Avoids a misleading "INVALID FLAGS" warning in debug builds.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:10:16 -04:00
Rob Clark
6514b4e3fd freedreno/ir3: print array live ranges
This is also useful to see if optmsgs are enabled.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:09:42 -04:00
Wladimir J. van der Laan
e8e3aa68d6 freedreno: a2xx: Implement DP2 instruction
Use DOT2ADDv instruction with 0.0f constant add.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
79d6b194f2 freedreno: a2xx: implement SEQ/SNE instructions
Extend translate_sge_slt to emit these, in analogous fashion
but using CNDEv.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
837fabaaa3 freedreno: a2xx: Compressed textures support
Add support for:

- PIPE_FORMAT_ETC1_RGB8
- PIPE_FORMAT_DXT1_RGB
- PIPE_FORMAT_DXT1_RGBA
- PIPE_FORMAT_DXT3_RGBA
- PIPE_FORMAT_DXT5_RGBA

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
92d529e7e4 freedreno: a2xx: Support TEXTURE_RECT
Denormalized texture coordinates are required for text rendering in
GALLIUM_HUD.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
6be017fdc4 freedreno: a2xx: Prevent crash in emit_texture if view is not set
Textures will sometimes be updated if texture view state was
un-set, without this change that causes an assertion crash or
segfault.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
fb41372761 freedreno: a2xx: Fix fd2_tex_swiz
Compose swizzles using util_format_compose_swizzles instead
of the custom code (which somehow had a bug).

This makes the GL_ALPHA internal format work.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
faed84a615 freedreno: a2xx: Change use of BLEND_ to BLEND2_
Change use of BLEND_ to BLEND2_,

    BLEND_* a3xx_rb_blend_opcode
    BLEND2_* is a2xx_rb_blend_opcode

This makes no effective difference as the used enumerant has the same
value (0), but the other enumerants do not match 1-to-1 so this will
avoid future problems.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
cb6dd7070f freedreno: a2xx: Update rnndb header for formats enumeration
The format enumeration comes comes from the yamoto
register headers that are part of the amd-gpu kernel driver.
(see freedreno envytools commit b8fb7978e7ae106d0d11d0b238ab2ba2d4dd9d43)

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Mathias Fröhlich
1da345e569 vbo: Use alloca for _vbo_draw_indirect.
Avoid using malloc in the draw path of mesa.
Since the draw_count is a user api input, fall back to malloc if
the amount of consumed stack space may get too high.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:15 +02:00
Mathias Fröhlich
3f1cd957d3 vbo: Remove unused includes to vbo_private.h
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
6e9f00e3fc vbo: Move vbo_split into the tnl module.
Move the files, adapt to the naming scheme in tnl, update callers
and build system.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
245f9a3977 vbo: Readd the arrays argument to the legacy draw methods.
The legacy draw paths from back before 2012 contained a gl_vertex_array
array for the inputs to be used for draw. So all draw methods from legacy
drivers and everything that goes through tnl are originally written
for this calling convention. The same goes for tools like t_rebase or
vbo_split*, that even partly still have the original calling convention
with a currently unused such pointer.
Back in 2012 patch 50f7e75

mesa: move gl_client_array*[] from vbo_draw_func into gl_context

introduced Array._DrawArrays, which was something that was IMO aiming for
a similar direction than Array._DrawVAO introduced recently.
Now several tools like t_rebase and vbo_split*, which are mostly used by
tnl based drivers, would need to be converted to use the internal
Array._DrawVAO instead of Array._DrawArrays. The same goes for the driver
backends that use any of these tools.
Alternatively we can reintroduce the gl_vertex_array array in its call
argument list and put these tools finally into the tnl directory.
So this change reintroduces this gl_vertex_array array for the legacy
draw paths that are still required for the tools t_rebase and vbo_split*.
A followup will move vbo_split also into tnl.

Note that none of the affected drivers use the DriverFlags.NewArray
driver bit. So it should be safe to remove this also for the legacy
draw path.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
461698af26 vbo: Remove the now unused vbo draw path.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
784fdef4e7 tnl: Push down the gl_vertex_array inputs into tnl drivers.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
7f8db5ca47 vbo: Remove vbo_indirect_draw_func.
Remove the vbo_indirect_draw_func vbo callback and make the default
implementation use the drivers main draw callback function directly.
This will be needed with the next changes when drivers without own main
drivers DrawIndirect implementation get moved to the main drivers
Draw method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
4db9d83a2d i965: Push down the gl_vertex_array inputs into i965.
Let the i965 backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.
Note that brw_draw_indirect_prims calls brw_draw_prims internally
and gets its update to Array._DrawArray by this way.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:12 +02:00
Mathias Fröhlich
fca1550550 gallium: Push down the gl_vertex_array inputs into gallium.
Let the gallium backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:12 +02:00
Jason Ekstrand
9978f55cd1 nir/validator: Validate that all used variables exist
We were validating this for locals but nothing else.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Jason Ekstrand
2b977989f3 intel/vec4: Set channel_sizes for MOV_INDIRECT sources
Otherwise, any indirect push constant access results in an assertion
failure when we start digging through the channel_sizes array.  This
fixes dEQP-VK.pipeline.push_constant.graphics_pipeline.dynamic_index_vert
on Haswell.  It should be a harmless no-op for GL since indirect push
constants aren't used there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e69e5c7006 "i965/vec4: load dvec3/4 uniforms first in the..."
2018-03-30 17:20:27 -07:00
Jason Ekstrand
6018f5b079 nir/lower_indirect_derefs: Support interp_var_at intrinsics
This fixes the fs-interpolateAtCentroid-block-array piglit test on i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-03-30 17:20:27 -07:00
Jason Ekstrand
0517d65f96 nir/vars_to_ssa: Remove copies from the correct set
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-03-30 17:20:27 -07:00
Jason Ekstrand
a1452a94fc nir: Return a cursor from nir_instr_remove
Because nir_instr_remove is an inline wrapper around nir_instr_remove_v,
the compiler should be able to tell that the return value is unused and
not emit the extra code in most cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Jason Ekstrand
956f17395b nir: Add src/dest num_components helpers
We already have these for bit_size

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Brian Paul
bebf758c49 docs: document WGL_SWAP_INTERVAL env var
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-30 14:45:05 -06:00
Brian Paul
c8906b8459 st/wgl: check if WGL_SWAP_INTERVAL is defined in wglSwapIntervalEXT()
This allows the WGL_SWAP_INTERVAL env var to override any application
calls to wglSwapIntervalEXT().  Useful for debugging, or to set the
interval to zero to effectively disable the swap interval.

Note: we also rename the previous instance of SVGA_SWAP_INTERVAL to
WGL_SWAP_INTERVAL since this is a WGL feature and not related to the
svga driver.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-30 14:44:50 -06:00
Brian Paul
1bf201ddce glapi: define GL_API to be KEYWORD1 in glapi_dispatch.c (v2)
This fixes a Windows build warning where the prototypes for the ES
function in the header file don't match the prototypes in this file
because the GL_API and GLAPI macros are defined differently.

v2: defined GL_API to KEYWORD1 instead of GLAPI, per Mathias.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-30 14:33:33 -06:00
Brian Paul
26bc983c83 spirv: s/uint/unsigned/ to fix MSVC build
Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
f3164c2ed9 nir/spirv: s/uint32_t/SpvOp/ in various functions
The MSVC compiler warns when the function parameter types don't
exactly match with respect to enum vs. uint32_t.  Use SpvOp everywhere.

Alternately, uint32_t could be used everywhere.  There doesn't seem
to be an advantage to one over the other.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
cb619a3c9a nir/spirv: fix MSVC syntax error in vtn_handle_texture()
Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
c58c9f712d nir/spirv: move NORETURN annotation on _vtn_fail() prototype
This needs to before the function, not after, to compile with MSVC.
This works with gcc too.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
84be45fc20 nir/spirv: fix MSVC warning in vtn_align_u32()
Fixes warning that "negation of an unsigned value results in an
unsigned value".

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Neil Roberts
31d91f019b spirv: Fix building with SCons
The SCons build broke with commit ba975140d3 because a SPIR-V
function is called from Mesa main. This adds a convenience library for
SPIR-V and adds it to everything that was including nir. It also adds
both nir and spirv to drivers/x11/SConscript.

Also add nir/spirv modules to osmesa and libgl-gdi targets. (Brian Paul)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105817
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2018-03-30 14:33:03 -06:00
Brian Paul
cdc34e2cea mesa: fix MSVC bitshift overflow warnings
In the BITFIELD_MASK() macro, if b==32 the expression evaluates to
~0u, but the compiler still sees the expression (1 << 32) in the
unused part and issues a warning about integer bitshift overflow.

Fix that by using (b) % 32 to ensure the max shift is 31 bits.

This issue has been present for a while, but shows up much more
often because of the recent VBO changes.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-03-30 11:04:32 -06:00
Brian Paul
fa18a427e9 st/mesa: add missing GLSL_TYPE_[U]INT8 cases in st_glsl_type_dword_size()
Silences a compiler warning about unhandled enum switch cases.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-30 11:04:32 -06:00
Jakob Bornecrantz
e16b92ad7e vbo: MaxVertexAttribStride is not always set
This assert is hit on hardware which does not expose GL 4.4 or GLES 3.1.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-03-30 17:23:08 +01:00
Daniel Stone
696762eef5 x11: Only report supported DRI3/Present versions
The version passed to QueryVersion requests is the version that the
client supports. We were just passing in whatever version of XCB was
present on the system, which may not be a version that Mesa actually
explicitly supports, e.g. it might bring unwanted semantics.

Set specific protocol versions which we support, and only pass those.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-30 16:53:51 +01:00
Samuel Pitoiset
2a329f4ada radv: set SAMPLE_RATE to the number of samples of the current fb
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-30 17:32:15 +02:00
Brian Paul
fc1d1dbe81 nir: s/uint/unsigned/ to fix MSVC/MinGW build
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-30 08:37:59 -06:00
Eduardo Lima Mitev
e7fc18097e i965: Don't call process_glsl_ir() for SPIR-V shaders
v2: Use 'spirv_data' from gl_linked_shader instead, to check if shader
   is SPIR-V. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
e7d97aa75d i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders
This is the main fork of the shader compilation code-path, where a NIR
shader is obtained by calling spirv_to_nir() or glsl_to_nir(),
depending on its nature..

v2: Use 'spirv_data' member from gl_linked_shader to know which method
   to call. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
abb6d0797c mesa/glspirv: Add a _mesa_spirv_to_nir() function
This is basically a wrapper around spirv_to_nir() that includes
arguments setup and post-conversion validation.

v2: * Rebase update (SpirVCapabilities not a pointer anymore,
    spirv_to_nir_options added, and others).
    * Code-style improvements and remove debug hunk. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
16f6634e7f mesa/program: Link SPIR-V shaders using the SPIR-V code-path
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
9c36e9f862 mesa/glspirv: Add _mesa_spirv_link_shaders() function
This is the equivalent to link_shaders() from
src/compiler/glsl/linker.cpp, but for SPIR-V programs. It just
creates the program and its gl_linked_shader objects, giving drivers
the opportunity to implement any linking of SPIR-V shaders they choose,
at a later stage.

v2: Bail out if we see more that one shader for the same stage, and
    add a corresponding comment. (Timothy Arceri)

v3:
  * Adds also a linker error log to the condition above, with a
    reference to the specification issue. (Timothy Arceri)
  * Squash with the patch adding the function boilerplate (Timothy
    Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
22b6b3d0a7 mesa: Add a reference to gl_shader_spirv_data to gl_linked_shader
This is a reference to the spirv_data object stored in gl_shader, which
stores shader SPIR-V data that is needed during linking too.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Nicolai Hähnle
ba975140d3 mesa: Implement glSpecializeShaderARB
v2:
  * Use gl_spirv_validation instead of spirv_to_nir.  This method just
    validates the shader. The conversion to NIR will happen later,
    during linking. (Alejandro Piñeiro)
  * Use gl_shader_spirv_data struct to store the SPIR-V data.
    (Eduardo Lima)
  * Use the 'spirv_data' member to tell if the gl_shader is a SPIR-V
    shader, instead of a dedicated flag. (Timothy Arceri)

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
9063bf7ad8 nir/spirv: add gl_spirv_validation method
ARB_gl_spirv adds the ability to use SPIR-V binaries, and a new
method, glSpecializeShader. Here we add a new function to do the
validation for this function:

From OpenGL 4.6 spec, section 7.2.1"

   "Shader Specialization", error table:

    INVALID_VALUE is generated if <pEntryPoint> does not name a valid
    entry point for <shader>.

    INVALID_VALUE is generated if any element of <pConstantIndex>
    refers to a specialization constant that does not exist in the
    shader module contained in <shader>.""

v2: rebase update (spirv_to_nir options added, changes on the warning
    logging, and others)

v3: include passing options on common initialization, doesn't call
    setjmp on common_initialization

v4: (after Jason comments):
  * Rename common_initialization to vtn_builder_create
  * Move validation method and their helpers to own source file.
  * Create own handle_constant_decoration_cb instead of reuse existing one

v5: put vtn_build_create refactoring to their own patch (Jason)

v6: update after vtn_builder_create method renamed, add explanatory
    comment, tweak existing comment and commit message (Timothy)
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
bebe3d626e spirv: add vtn_create_builder
Refactored from spirv_to_nir, in order to be reused later.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

v2: renamed method (from vtn_builder_create), add explanatory comment
    (Timothy)
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
3761e675e2 i965: initialize SPIR-V capabilities
Needed for ARB_gl_spirv. Those are not the same that the Intel vulkan
driver. From the ARB_spirv_extensions spec:

   "3. If a new GL extension is added that includes SPIR-V support via
   a new SPIR-V extension does it's SPIR-V extension also get
   enumerated by the SPIR_V_EXTENSIONS_ARB query?.

   RESOLVED. Yes. It's good to include it for consistency. Any SPIR-V
   functionality supported beyond the SPIR-V version that is required
   for the GL API version should be enumerated."

So in addition to the core SPIR-V support, there is the possibility of
specific GL extensions enabling specific SPIR-V extensions (so
capabilities). That would mean that it is possible that OpenGL and
Vulkan not having the same capabilities supported, even for the same
driver. For this reason it is better to keep them separated.

As an example: at the time of this patch writing Intel vulkan driver
support multiview, but there isn't any OpenGL multiview GL extension
supported.

Note: we initialize SPIR-V capabilities at brwCreateContext instead of
the usual brw_initialize_context_constants because we want to do that
only if the extension is enabled.

v2:
   * Rebase update (SpirVCapabilities not a pointer anymore)
   * Fill spirv capabilities for OpenGL >= 3.3 (Ian Romanick)

v3:
   * Drop multiview support, as i965 doesn't support any multiview GL
     extension (Jason)
   * Fill spirv capabilities only if the extension is enabled (Jason)

v4: Capabilities are supported only on gen7+. Added comment and assert
    (Jason)
2018-03-30 09:14:56 +02:00
Nicolai Hähnle
ca5cc78206 mesa: add gl_constants::SpirVCapabilities
For drivers to declare which SPIR-V features they support.

v2: Don't use a pointer (Ian Romanick)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Ian Romanick
19e0dd1ad3 i965: Don't request GLSL IR lowering of gl_VertexID
Let the lowering in NIR handle it instead.

This hurts one shader that occurs twice in shader-db (SynMark GSCloth)
on IVB and HSW.  No other shaders or platforms were affected.

total cycles in shared programs: 253438422 -> 253438426 (0.00%)
cycles in affected programs: 412 -> 416 (0.97%)
helped: 0
HURT: 2

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
2018-03-29 14:16:07 -07:00
Ian Romanick
2765633116 i965: Silence unused parameter warning
src/mesa/drivers/dri/i965/brw_draw_upload.c: In function ‘double_types’:
src/mesa/drivers/dri/i965/brw_draw_upload.c:225:34: warning: unused parameter ‘brw’ [-Wunused-parameter]
 double_types(struct brw_context *brw,
                                  ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:16:04 -07:00
Ian Romanick
042ee4bea2 spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.build
Future changes will add generated files used only from
src/compiler/glsl.  These can't be built from Makefile.nir.am, and we
can't move all the rules from Makefile.nir.am to Makefile.spirv.am (and
it would be silly anyway).

v2: Do it for meson too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (the meson bits)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (the automake bits)
2018-03-29 14:16:01 -07:00
Ian Romanick
2c9621ee5c compiler: All leaf Makefile.am should use +=
This slightly simplifies later changes that add more Makefile.*.am
files.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:41 -07:00
Ian Romanick
4925347ec5 util: Include bitscan.h directly
Previously bitset.h would include u_math.h to get bitscan.h.  u_math.h
lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h
live in src/util.  Having the one file directly include another file
that lives in the same directory makes much more sense.

As a side-effect, several files need to directly include standard header
files that were previously indirectly included.

v2: Fix build break in src/amd/common/ac_nir_to_llvm.c.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:30 -07:00
Ian Romanick
ef7a4c9015 util: Optimize util_is_power_of_two_nonzero
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:29 -07:00
Ian Romanick
cd18aa1e50 util: Use util_is_power_of_two_nonzero in u_vector
Previously size=0, element_size=0 would have been allowed.  That
combination can only lead to despair.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:09:28 -07:00
Ian Romanick
22fbb5c594 util: Add and use util_is_power_of_two_nonzero
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:28 -07:00
Ian Romanick
d76c204d05 util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero
The new name make the zero-input behavior more obvious.  The next
patch adds a new function with different zero-input behavior.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:09:23 -07:00
Dylan Baker
a3a16d4aa7 meson: use dep_libdrm version for pkg-config
This corrects pkg-config to use the libdrm version (as computed by the
previous patch) instead of using a hardcoded value that may or may not
(probably not) be right.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:20:52 -07:00
Dylan Baker
c445b1d56f meson: Use the same version for all libdrm checks
Currently each driver specifies it's own version, and core libdrm
specifies a version. In the most common case this is fine, since there
will be exactly one libdrm installed on a system, but if there are more
than one it's possible that mesa will be linked against different
versions of libdrm. There is also the possibility that the current
approach makes the pkg-config files we generate incorrect, since there
could be #defines that use newer features if they're available.

This patch corrects all of that. All of the versions are still set by
driver (along with a default core version). Then all of the drivers that
are enabled have their versions compared and the highest version is
selected, then all libdrm checks are made with that version.

v2: - Reorder the list to have the name first and whether the dependency
      is needed second (Eric)

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:20:52 -07:00
Dylan Baker
acadf06f56 meson: group libdrm dependencies
The reason libdrm is after libdrm_* will be made clear in later patches.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:18:47 -07:00
Brian Paul
e520ca562a gl.h: remove stale comment, trailing whitespace
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-29 08:46:55 -06:00
Brian Paul
4ff6a7b0de glapi: add glBlendBarrier(), glPrimitiveBoundingBox() prototypes
in glapi_dispatch.c, as we have for many other GLES functions.
Fixes a cross-compile issue (missing prototype) when GLES support
is disabled.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-03-29 08:45:10 -06:00
Brian Paul
5cd5878a1f st/mesa: silence unhandled switch case warning
And improve the unreachable() error message.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-29 08:45:10 -06:00
Henri Verbeet
0b73c86b80 mesa: Inherit texture view multi-sample information from the original texture images.
Found running "The Witness" in Wine. Without this patch, texture views created
on multi-sample textures would have a GL_TEXTURE_SAMPLES of 0. All things
considered such views actually work surprisingly well, but when combined with
(plain) multi-sample textures in a framebuffer object, the resulting FBO is
incomplete because the sample counts don't match.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-29 14:38:25 +04:30
Samuel Pitoiset
e45fe0ed66 radv: fix scanning output_usage_mask with structs
To fix a regression in:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.struct

And the following regressions (Polaris only):
dEQP-VK.glsl.indexing.varying_array.*

Fixes: f3275ca01c ("ac/nir: only enable used channels when exporting parameters")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-29 10:22:10 +02:00
Karol Herbst
6179a87c1e nvc0/ir: fix emiting NOTs with predicates
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-03-29 03:06:36 +02:00
Aaron Watry
1dae92f150 broadcom/vc4: Fix out-of-tree build with automake.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-28 17:48:41 -07:00
Eric Anholt
81f82ecc56 broadcom/vc5: Start using nir_opt_move_load_ubo().
In the absence of a general NIR or VIR-level scheduler, this at least
avoids spilling in
GTF-GLES3.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_storage_layouts
2018-03-28 17:48:41 -07:00
Eric Anholt
1fe4c748f7 broadcom/vc5: Fix setup of integer surface clear values.
I'm disappointed that the compiler didn't warn me about use of
uninitialized uc in these paths.  Just use the incoming clear color
instead of the packing temporary if we're doing our own packing.

Fixes GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_*
2018-03-28 17:48:41 -07:00
Eric Anholt
123ee37627 broadcom/vc5: Stop trying to swizzle around RGBA4 clear color.
We always want A in the A slot in the tile buffer, and any other swapping
should happen elsewhere.

Fixes RGBA4-using cases in fbo-clear-formats and
GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_fixed.
2018-03-28 17:48:41 -07:00
Eric Anholt
2f4c4e10c2 broadcom/vc5: Work around scissor w/h==0 bug same as rasterizer discard.
The 7268 HW apparently lets some rendering through in this case.  Fixes
GTF-GLES2.gtf.GL2FixedTests.scissor.scissor
2018-03-28 17:48:41 -07:00
Eric Anholt
0349c79bdc st: Don't try to finalize the texture in st_render_texture().
We can't necessarily finalize the texture at this point if we're rendering
to a texture image whose format is different from the baselevel's format.
This was introduced as a fix for fbo-incomplete-texture-03 in
de414f4915, but the later fix for vmware on
that testcase in 95d5c48f68 made it
unnecessary.

Fixes assertion failures in util_resource_copy_region() in
KHR-GLES3.copy_tex_image_conversions.forbidden.* when trying to finalize
an R8 texture image to the RG8 texture object's pt.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-28 17:48:41 -07:00
Marek Olšák
e159d46fc7 drirc: whitelist glthread for Medieval II: TW, Carnivores: DHR, Far Cry 2 2018-03-28 20:00:48 -04:00
Daniel Schürmann
b91cd5dba4 radv: enable VK_AMD_shader_trinary_minmax extension
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:39 +02:00
Daniel Schürmann
d00fb7ce54 ac: add support for trinary_minmax instructions
v2: Add missing break (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:35 +02:00
Dave Airlie
fe5d5d19b0 spirv: add support for SPV_AMD_shader_trinary_minmax
Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:29 +02:00
Dave Airlie
3e830a1af2 nir: add support for min/max/median of 3 srcs
These are needed for SPV_AMD_shader_trinary_minmax,
the AMD HW supports these.

Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:28:58 +02:00
Marek Olšák
025105453a radeonsi: simplify DCC format categories
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Marek Olšák
3fea237c85 radeonsi: don't use the SPI barrier management bug workaround
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Marek Olšák
3045c5f274 radeonsi: use maximum OFFCHIP_BUFFERING on Vega12
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Bas Nieuwenhuizen
4503ff760c ac/nir: Add workaround for GFX9 buffer views.
On GFX9 whether the buffer size is interpreted as elements or bytes
depends on whether IDXEN is enabled in the instruction. If the index
is a constant zero, LLVM optimizes IDXEN to 0.

Now the size in elements is interpreted in bytes which of course
results in out of bounds accesses.

The correct fix is most likely to disable the LLVM optimization,
but we need something to work with LLVM <= 6.0.

radeonsi does the max between stride and element count on the CPU
but that results in the size intrinsics returning the wrong size
for the buffer. This would cause CTS errors for radv.

v2: Also include the store changes.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 00:03:03 +02:00
Marek Olšák
4f96747530 ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738

Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 17:23:41 -04:00
Samuel Pitoiset
1c4fdcf444 radv: enable VK_EXT_sampler_filter_minmax
Only enable for CIK+ because it's buggy on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 22:55:48 +02:00
Samuel Pitoiset
413d77e7f9 radv: add support for VK_EXT_sampler_filter_minmax
The driver only supports the required formats for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 22:55:48 +02:00
Samuel Pitoiset
99b52aa1da radv: rename VEGA10 device name
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 20:15:17 +02:00
Samuel Pitoiset
4d2c46dda3 radv: add support for Vega12
Based on RadeonSI. Untested.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 20:15:14 +02:00
Matt Turner
3e6326deb9 build: Fix up nir_intrinsics.Plo
nir_intrinsics.c existed as a static file until commit 76dfed8ae2 began
generating it as part of the build process. autotools is incapable of
coping, and so a build-tree from before this commit would then fail with
it:

[4]: *** No rule to make target '../../../mesa/src/compiler/nir/nir_intrinsics.c', needed by 'nir/nir_intrinsics.lo'.  Stop.

Add a few lines to configure.ac to update the broken build files.

Fixes: 76dfed8ae2 ("nir: mako all the intrinsics")
2018-03-28 11:09:23 -07:00
Dylan Baker
2cfc68d984 autotools: Include intel/dev/meson.build in tarball
Fixes: 272bef0601
       ("intel: Split gen_device_info out into libintel_dev")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-28 10:19:05 -07:00
Dylan Baker
bc2fdb9759 autotools: include meson_get_version
Otherwise meson won't read the VERSION file and won't set a version.
That means that pkg-config files will have version unset as well.

Fixes: 3e9533d9b8
       ("meson: Add script to use VERSION file for getting version")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-28 10:13:23 -07:00
Eric Engestrom
d77844a529 docs: fix 18.0 release note version
Fixes: 839fb3a696 "docs: Update 18.0.0 release notes"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-28 16:52:56 +01:00
Marek Olšák
20eb44ad65 radeonsi: add support for Vega12
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-28 11:37:43 -04:00
Marek Olšák
5425d32fcf amd/addrlib: update to the latest version for Vega12
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-28 11:37:43 -04:00
Eric Engestrom
431a1d12cc gbm: remove never-implemented function
I assume this was implemented in a previous version of that commit, but
was removed in the version that actually landed.

Fixes: 8430af5ebe "Add support for swrast to the DRM EGL platform"
Cc: Giovanni Campagna <gcampagna@src.gnome.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-28 16:25:52 +01:00
Stefan Schake
77ade10c86 android: Use new nir intrinsics python scripts
Fixes: 76dfed8ae2 ("nir: mako all the intrinsics")
Signed-off-by: Stefan Schake <stschake@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-28 14:48:47 +03:00
Eric Anholt
a691fa4a1b broadcom/vc5: Fix padding of NPOT miplevels >= 2.
The power-of-two padded size that gets minified is based on level 1's
dimensions, not level 0's, which starts to differ at a width of 9.

Fixes all failures on texelFetch fs sampler2D 1x1x1-64x64x1
2018-03-27 21:16:23 -07:00
Timothy Arceri
92fa89a08d ac/radeonsi: pass bindless bool to load_sampler_desc()
We also fix the base_index for bindless by using the driver
location.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:16 +11:00
Timothy Arceri
5411b98d52 st/glsl_to_nir: set driver location for bindless images and samplers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:15 +11:00
Timothy Arceri
f94b6b79be radeonsi/nir: set uses_bindless_samplers for samplers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:15 +11:00
Timothy Arceri
5c810a2c05 nir: add bindless to nir data
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 12:56:15 +11:00
Kenneth Graunke
fb18d0dbe4 i965: Drop unnecessary bo->align field.
bo->align is always 0; there's no need to waste 8 bytes storing it.
Thanks to C99 initializers zeroing fields, we can completely drop the
only read of the field altogether.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
037d738a23 i965: Drop unused alignment parameter from brw_bo_alloc().
brw_bo_alloc no longer uses this parameter, so there's no point.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
07ec3a2e0f i965: Drop alignment parameter from bo_alloc_internal().
Buffers are always page aligned on 965+ hardware; I believe this extra
parameter is a vestige from the Gen2-3 era.

All callers pass 0, and in fact we assert that the alignment is 0 unless
BO_ALLOC_BUSY is set (for some reason).  We can just drop the parameter
and set the value to 0 explicitly.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
b9a54b18f6 i965: Drop BO_ALLOC_BUSY in intel_miptree_create_for_bo().
intel_miptree_create_for_bo does not actually allocate a BO, so
specifying allocation flags accomplishes nothing and is confusing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
2c01215c1b i965: Drop PIPE_CONTROL_NO_WRITE from various calls.
This is just zero - passing nothing already gives us a post-sync
operation of "nothing".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-27 18:41:44 -07:00
Jason Ekstrand
5f21a7afe0 nir/intrinsics: Don't report negative dest_components
I have no idea why but having dest_components == -1 was causing a memory
leak somewhere.  Without this, you can't get through a full shader-db
run without running out of memory.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-27 18:18:26 -07:00
Jason Ekstrand
7e38f49a8f intel/fs: Don't emit a des copy for image ops with has_dest == false
This was causing us to walk dest_components times over a thing with no
destination.  This happened to work because all of the image intrinsics
without a destination also happened to have dest_components == 0.  We
shouldn't be reading dest_components if has_dest == false.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-27 18:18:21 -07:00
Ilia Mirkin
776e6af879 nvc0/ir: fix INTERP_* with indirect inputs
There were two problems, both of which are fixed now:
 - The indirect address was not being shifted by 4
 - The indirect address was being placed as an argument in the offset case

This fixes some of the new interpolateAt* piglits which now test for
these situations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-03-27 20:41:11 -04:00
Timothy Arceri
629ee690ad nir: fix crash in loop unroll corner case
When an if nesting inside anouther if is optimised away we can
end up with a loop terminator and following block that looks like
this:

        if ssa_596 {
                block block_5:
                /* preds: block_4 */
                vec1 32 ssa_601 = load_const (0xffffffff /* -nan */)
                break
                /* succs: block_8 */
        } else {
                block block_6:
                /* preds: block_4 */
                /* succs: block_7 */
        }
        block block_7:
        /* preds: block_6 */
        vec1 32 ssa_602 = phi block_6: ssa_552
        vec1 32 ssa_603 = phi block_6: ssa_553
        vec1 32 ssa_604 = iadd ssa_551, ssa_66

The problem is the phis. Loop unrolling expects the last block in
the loop to be empty once we splice the instructions in the last
block into the continue branch. The problem is we cant move phis
so here we lower the phis to regs when preparing the loop for
unrolling. As it could be possible to have multiple additional
blocks/ifs following the terminator we just convert all phis at
the top level of the loop body for simplicity.

We also add some comments to loop_prepare_for_unroll() while we
are here.

Fixes: 51daccb289 "nir: add a loop unrolling pass"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
2018-03-28 09:59:38 +11:00
Timothy Arceri
48f6014903 st/glsl_to_nir: correctly handle arrays packed across multiple vars
Fixes piglit test:
tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:38 +11:00
Timothy Arceri
b260efbd5e radeonsi/nir: fix input processing for packed varyings
The location was only being incremented the first time we processed a
location. This meant we would incorrectly skip some elements of
an array if the first element was packed and proccessed previously
but other elements were not.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:38 +11:00
Timothy Arceri
51f175028d ac/nir_to_llvm: fix component packing for double outputs
We need to wait until after the writemask is widened before we
adjust it for component packing.

Together with the previous patch this fixes a number of
arb_enhanced_layouts component layout piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Timothy Arceri
fc51fdbcde st/glsl_to_nir: fix driver location for dual-slot packed doubles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Timothy Arceri
47eee04556 radeonsi/nir: fix scanning of multi-slot output varyings
This fixes tcs/tes varying arrays where we dont lower indirects and
therefore don't split arrays. Here we also fix useagemask for dual
slot doubles.

Fixes a number of arb_tessellation_shader piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Eric Anholt
9f1b4f6204 broadcom/vc5: Fix RG16I/UI texture sampling.
How many times did I look at this table without noticing the missing 'G'
in the texture column?

Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.
2018-03-27 15:49:58 -07:00
Rob Clark
16581904b0 nir: fix generated nir_intrinsics.c for MSVC
Apparently it is not happy about things like: .foo = {}

So skip over initializers for empty lists.

Fixes: 76dfed8ae2
Reported-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-27 15:01:11 -04:00
Emil Velikov
eda2f58d15 docs: update calendar 18.0.0 is out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-27 19:11:45 +01:00
Emil Velikov
02f89b62fe docs: add news item and link release notes for 18.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-27 19:08:48 +01:00
Emil Velikov
62eb721ed8 docs: add sha256 checksums for 18.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit fb64913d19)
2018-03-27 19:06:27 +01:00
Emil Velikov
839fb3a696 docs: Update 18.0.0 release notes
Note: the file was originally 17.4.0, yet git stuggles to detect the
move :-\

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit dceb1ce807)
2018-03-27 19:06:19 +01:00
Rob Clark
76dfed8ae2 nir: mako all the intrinsics
I threatened to do this a long time ago.. I probably *should* have done
it a long time ago when there where many fewer intrinsics.  But the
system of macro/#include magic for dealing with intrinsics is a bit
annoying, and python has the nice property of optional fxn params,
making it possible to define new intrinsics while ignoring parameters
that are not applicable (and naming optional params).  And not having to
specify various array lengths explicitly is nice too.

I think the end result makes it easier to add new intrinsics.

v2: couple small fixes found with a test program to compare the old and
    new tables
v3: misc comments, don't rely on capture=true for meson.build, get rid
    of system_values table to avoid return value of intrinsic() and
    *mostly* remove side-effects, add autotools build support
v4: scons build

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 08:36:37 -04:00
Rob Clark
cc3a88e81d nir: fix per_vertex_output intrinsic
This is supposed to have both BASE and COMPONENT but num_indices was
inadvertantly set to 1.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 08:20:40 -04:00
Rob Clark
1e0a06000b glsl_types: fix build break with intel/msvc compiler
The VECN() macro was taking advantage of a GCC specific feature that is
not available on lesser compilers, mostly for the purposes of avoiding a
macro that encoded a return statement.

But as suggested by Ian, we could just have the macro produce the entire
method body and avoid the need for this.  So let's do that instead.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740
Fixes: f407edf340
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Ian Romanick <idr@freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-27 08:17:11 -04:00
Lin Johnson
41cf30b8bc mesa: add GL_HALF_FLOAT as supported type to readpixels
EXT_color_buffer_float spec states:

  "An INVALID_OPERATION error is generated ... if the color buffer is
   a floating-point format and type is not FLOAT, HALF FLOAT, or
   UNSIGNED_INT_10F_11F_11F_REV."

This means that GL_HALF_FLOAT type should be supported when color
buffer has floating-point format.

Fixes Android CTS test android.view.cts.PixelCopyTest.

v2: remove comments of EXT_color_buffer_half_float as
    EXT_color_buffer_float can use type GL_HALF_FLOAT

Signed-off-by: Lin Johnson <johnson.lin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-27 09:04:52 +03:00
Eric Anholt
0024b77e87 broadcom/vc5: Fix swizzling of RGB10_A2UI render targets.
This is the actual hardware layout, and we were only swizzling R/B back
around in texturing.  Fixes part of
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in
simulation.
2018-03-26 17:46:23 -07:00
Eric Anholt
c2b13627d9 broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes.
Just like TLB without a config uniform, we don't have a register index.
2018-03-26 17:46:23 -07:00
Eric Anholt
494da6c2dd broadcom/vc5: Implement workaround for GFXH-1431.
This should fix some blending errors, but doesn't impact any testcases in
the CTS.
2018-03-26 17:46:19 -07:00
Eric Anholt
1bf466270d broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well.
Once we've disabled EZ for some draws, we need to not use EZ on future
draws.  Implementing that made implementing the GT/GE direction trivial.

Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.
2018-03-26 17:46:19 -07:00
Eric Anholt
262208eb3c broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled.
On 3.x, we just don't flag the primitive as needing TF, but those
primitive bits are now allocated to the new primitive types.  Now we need
to actually update the enable flag at draw time.
2018-03-26 17:46:19 -07:00
Eric Anholt
ef2cf9cc3c broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job.
The next job from this client will turn it back on unless TF gets
disabled, but we don't want the state to leak from this client to another
(which causes GPU hangs).
2018-03-26 17:46:19 -07:00
Eric Anholt
1fa820cef8 broadcom/vc5: Move the BCL epilogue code to a per-version compile.
I need to do some new packets for transform feedback on 4.1.
2018-03-26 17:46:19 -07:00
Eric Anholt
3387864130 broadcom/vc5: Fix transform feedback in the presence of point size.
I had this note to myself, and it turns out that a lot of CTS tests use
XFB with points to get data out without using a fragment shader.  Keep
track of two sets of precomputed TF specs (point size in VPM prologue or
not), and switch between them when we enable/disable point size.
2018-03-26 17:46:19 -07:00
Eric Anholt
09ac5ade8f broadcom/vc5: Split transform feedback specs update from buffers.
The specs update will be changing based on additional state flags in the
next commit, and this unindents the buffer update code.
2018-03-26 17:46:18 -07:00
Eric Anholt
9e62aec9cd broadcom/vc5: Limit each transform feedback data spec to 16 dwords.
The length-1 field only has 4 bits, so we need to generate separate specs
when there's too much TF output per buffer.

Fixes
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type
and transform_feedback_max_interleaved.
2018-03-26 17:33:37 -07:00
Eric Anholt
0356db022d gallium/u_vbuf: Protect against overflow with large instance divisors.
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1
as a divisor, so we would overflow to count=0 and upload no data,
triggering the assert below.  We want to upload 1 element in this case,
fixing the test on VC5.

v2: Use some more obvious logic, and explain why we don't use the normal
    round_up().

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 17:33:37 -07:00
Eric Anholt
d491ad1d36 st: Allow accelerated CopyTexImage from RGBA to RGB.
There's nothing to worry about here -- the A channel just gets dropped by
the blit.  This avoids a segfault in the fallback path when copying from a
RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an
RGBA16_SINT texture (the fallback path tries to get/fetch to float
buffers, but the float pack/unpack functions are NULL for SINT/UINT).

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5.

v2: Extract the logic to a helper function and explain what's going on
    better.
v3: const-qualify args

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 17:33:37 -07:00
Marek Olšák
7d2079908d winsys/amdgpu: always allow GTT placements on APUs
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-03-26 19:23:30 -04:00
Marek Olšák
769603564e radeonsi: don't reallocate on DMABUF export if local BOs are disabled 2018-03-26 19:22:12 -04:00
Timothy Arceri
56b867395d glsl: fix infinite loop caused by bug in loop unrolling pass
Just checking for 2 jumps is not enough to be sure we can do a
complex loop unroll. We need to make sure we also have also found
2 loop terminators.

Without this we were attempting to unroll a loop where the second
jump was nested inside multiple ifs which loop analysis is unable
to detect as a terminator. We ended up splicing out the first
terminator but failed to actually unroll the loop, this resulted
in the creation of a possible infinite loop.

Fixes: 646621c66d "glsl: make loop unrolling more like the nir unrolling path"

Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
2018-03-27 09:15:02 +11:00
Vinson Lee
dc94a0506f gallium: Do not add -Wframe-address option for gcc <= 4.4.
This patch fixes these build errors with GCC 4.4.

  Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions

Fixes: 370e356eba ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-26 11:23:51 -07:00
Alyssa Rosenzweig
029f1a2d61 gallium: Correct minor typo in header comments
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-26 10:15:04 -07:00
Rafael Antognolli
27581d18bc intel/aubinator_error_decode: Decode more registers.
Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
70d7c70e8d intel/genxml: Add SAMPLER_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
227edf05f3 intel/genxml: Add ROW_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
4c0ae36143 intel/genxml: Add SC_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Ian Romanick
91225cb33f i965/vec4: Fix null destination register in 3-source instructions
A recent commit (see below) triggered some cases where conditional
modifier propagation and dead code elimination would cause a MAD
instruction like the following to be generated:

    mad.l.f0  null, ...

Matt pointed out that fs_visitor::fixup_3src_null_dest() fixes cases
like this in the scalar backend.  This commit basically ports that code
to the vec4 backend.

NOTE: I have sent a couple tests to the piglit list that reproduce this
bug *without* the commit mentioned below.  This commit fixes those
tests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Fixes: ee63933a7 ("nir: Distribute binary operations with constants into bcsel")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105704
2018-03-26 08:50:44 -07:00
Ian Romanick
2c643fd978 nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditional
Now that i965 recognizes that a-b generates the same conditions as 'a <
b', there is no reason to condition this transformation on 'is not used
by conditional.'

Since this was the only user of the is_not_used_by_conditional function,
delete it.

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14400775 -> 14400595 (<.01%)
instructions in affected programs: 36712 -> 36532 (-0.49%)
helped: 182
HURT: 26
helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1
helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90%
95% mean confidence interval for instructions value: -0.97 -0.76
95% mean confidence interval for instructions %-change: -0.59% -0.43%
Instructions are helped.

total cycles in shared programs: 532929592 -> 532926345 (<.01%)
cycles in affected programs: 478660 -> 475413 (-0.68%)
helped: 187
HURT: 22
helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18
helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03%
HURT stats (abs)   min: 1 max: 214 x̄: 30.86 x̃: 11
HURT stats (rel)   min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86%
95% mean confidence interval for cycles value: -19.50 -11.57
95% mean confidence interval for cycles %-change: -1.42% -0.58%
Cycles are helped.

GM45 and Iron Lake had similar results. (Iron Lake shown)
total cycles in shared programs: 177851578 -> 177851810 (<.01%)
cycles in affected programs: 24408 -> 24640 (0.95%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44%
HURT stats (abs)   min: 24 max: 108 x̄: 60.00 x̃: 54
HURT stats (rel)   min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02%
95% mean confidence interval for cycles value: -7.75 85.08
95% mean confidence interval for cycles %-change: -0.39% 1.49%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
cd635d149b i965/vec4: Propagate conditional modifiers from compares to adds
No changes on Broadwell or later as those platforms do not use the vec4
backend.

Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11682119 -> 11681056 (<.01%)
instructions in affected programs: 150403 -> 149340 (-0.71%)
helped: 950
HURT: 0
helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1
helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71%
95% mean confidence interval for instructions value: -1.19 -1.04
95% mean confidence interval for instructions %-change: -0.84% -0.79%
Instructions are helped.

total cycles in shared programs: 257495842 -> 257495238 (<.01%)
cycles in affected programs: 270302 -> 269698 (-0.22%)
helped: 271
HURT: 13
helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2
helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28%
HURT stats (abs)   min: 2 max: 12 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -2.41 -1.84
95% mean confidence interval for cycles %-change: -0.31% -0.26%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10430493 -> 10429727 (<.01%)
instructions in affected programs: 120860 -> 120094 (-0.63%)
helped: 766
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 146138718 -> 146138446 (<.01%)
cycles in affected programs: 244114 -> 243842 (-0.11%)
helped: 132
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2
helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19%
95% mean confidence interval for cycles value: -2.12 -2.00
95% mean confidence interval for cycles %-change: -0.18% -0.15%
Cycles are helped.

GM45 and Iron Lake had identical results. (Iron Lake shown)
total instructions in shared programs: 7780251 -> 7780248 (<.01%)
instructions in affected programs: 175 -> 172 (-1.71%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49%

total cycles in shared programs: 177851584 -> 177851578 (<.01%)
cycles in affected programs: 9796 -> 9790 (-0.06%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05%

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
780f307ba8 i965/vec4: Allow cmod propagation when src0 is a uniform or shader input
No shader-db changes.  This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input.  However, this
change allows the next commit to help more shaders.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
020b0055e7 i965/fs: Propagate conditional modifiers from compares to adds
The math inside the add and the cmp in this instruction sequence is the
same.  We can utilize this to eliminate the compare.

add(8)          g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
cmp.z.f0(8)     null<1>F        g2<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };

This is reduced to:

add.z.f0(8)     g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };

This optimization pass could do even better.  The nature of converting
vectorized code from the GLSL front end to scalar code in NIR results in
sequences like:

add(8)          g7<1>F          g4<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
add(8)          g6<1>F          g3<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
add(8)          g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
cmp.z.f0(8)     null<1>F        g2<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };
cmp.z.f0(8)     null<1>F        g3<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g10<1>F         (abs)g6<8,8,1>F 3e-37F          { align1 1Q };
cmp.z.f0(8)     null<1>F        g4<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g12<1>F         (abs)g7<8,8,1>F 3e-37F          { align1 1Q };

In this sequence, only the first cmp.z is removed.  With different
scheduling, all 3 could get removed.

Skylake
total instructions in shared programs: 14407009 -> 14400173 (-0.05%)
instructions in affected programs: 1307274 -> 1300438 (-0.52%)
helped: 4880
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.45 -1.35
95% mean confidence interval for instructions %-change: -0.72% -0.69%
Instructions are helped.

total cycles in shared programs: 532943169 -> 532923528 (<.01%)
cycles in affected programs: 14065798 -> 14046157 (-0.14%)
helped: 2703
HURT: 339
helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2
helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21%
HURT stats (abs)   min: 1 max: 739 x̄: 39.86 x̃: 12
HURT stats (rel)   min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41%
95% mean confidence interval for cycles value: -8.66 -4.26
95% mean confidence interval for cycles %-change: -0.24% -0.14%
Cycles are helped.

LOST:   0
GAINED: 1

Broadwell
total instructions in shared programs: 14719636 -> 14712949 (-0.05%)
instructions in affected programs: 1288188 -> 1281501 (-0.52%)
helped: 4845
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.43 -1.33
95% mean confidence interval for instructions %-change: -0.72% -0.68%
Instructions are helped.

total cycles in shared programs: 559599253 -> 559581699 (<.01%)
cycles in affected programs: 13315565 -> 13298011 (-0.13%)
helped: 2600
HURT: 269
helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2
helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20%
HURT stats (abs)   min: 1 max: 790 x̄: 53.07 x̃: 20
HURT stats (rel)   min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75%
95% mean confidence interval for cycles value: -8.47 -3.77
95% mean confidence interval for cycles %-change: -0.27% -0.18%
Cycles are helped.

LOST:   0
GAINED: 8

Haswell
total instructions in shared programs: 12978609 -> 12973483 (-0.04%)
instructions in affected programs: 932921 -> 927795 (-0.55%)
helped: 3480
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58%
95% mean confidence interval for instructions value: -1.53 -1.42
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 410270788 -> 410250531 (<.01%)
cycles in affected programs: 10986161 -> 10965904 (-0.18%)
helped: 2087
HURT: 254
helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4
helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21%
HURT stats (abs)   min: 1 max: 519 x̄: 40.49 x̃: 16
HURT stats (rel)   min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47%
95% mean confidence interval for cycles value: -12.82 -4.49
95% mean confidence interval for cycles %-change: -0.31% -0.18%
Cycles are helped.

LOST:   0
GAINED: 5

Ivy Bridge
total instructions in shared programs: 11686082 -> 11681548 (-0.04%)
instructions in affected programs: 937696 -> 933162 (-0.48%)
helped: 3150
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49%
95% mean confidence interval for instructions value: -1.49 -1.38
95% mean confidence interval for instructions %-change: -0.71% -0.67%
Instructions are helped.

total cycles in shared programs: 257514962 -> 257492471 (<.01%)
cycles in affected programs: 11524149 -> 11501658 (-0.20%)
helped: 1970
HURT: 239
helped stats (abs) min: 1 max: 3525 x̄: 17.48 x̃: 3
helped stats (rel) min: <.01% max: 49.60% x̄: 0.46% x̃: 0.17%
HURT stats (abs)   min: 1 max: 1358 x̄: 50.00 x̃: 15
HURT stats (rel)   min: 0.02% max: 59.88% x̄: 1.84% x̃: 0.65%
95% mean confidence interval for cycles value: -17.01 -3.35
95% mean confidence interval for cycles %-change: -0.33% -0.08%
Cycles are helped.

LOST:   9
GAINED: 1

Sandy Bridge
total instructions in shared programs: 10432841 -> 10429893 (-0.03%)
instructions in affected programs: 685071 -> 682123 (-0.43%)
helped: 2453
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 1.20 x̃: 1
helped stats (rel) min: 0.02% max: 7.55% x̄: 0.64% x̃: 0.46%
95% mean confidence interval for instructions value: -1.23 -1.17
95% mean confidence interval for instructions %-change: -0.67% -0.62%
Instructions are helped.

total cycles in shared programs: 146133660 -> 146134195 (<.01%)
cycles in affected programs: 3991634 -> 3992169 (0.01%)
helped: 1237
HURT: 153
helped stats (abs) min: 1 max: 2853 x̄: 6.93 x̃: 2
helped stats (rel) min: <.01% max: 29.00% x̄: 0.24% x̃: 0.14%
HURT stats (abs)   min: 1 max: 1740 x̄: 59.56 x̃: 12
HURT stats (rel)   min: 0.03% max: 78.98% x̄: 1.96% x̃: 0.42%
95% mean confidence interval for cycles value: -5.13 5.90
95% mean confidence interval for cycles %-change: -0.17% 0.16%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 1

GM45 and Iron Lake had similar results (GM45 shown):
total instructions in shared programs: 4800332 -> 4798380 (-0.04%)
instructions in affected programs: 565995 -> 564043 (-0.34%)
helped: 1451
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 1.35 x̃: 1
helped stats (rel) min: 0.05% max: 5.26% x̄: 0.47% x̃: 0.31%
95% mean confidence interval for instructions value: -1.40 -1.29
95% mean confidence interval for instructions %-change: -0.50% -0.45%
Instructions are helped.

total cycles in shared programs: 122032318 -> 122027798 (<.01%)
cycles in affected programs: 8334868 -> 8330348 (-0.05%)
helped: 1029
HURT: 1
helped stats (abs) min: 2 max: 40 x̄: 4.43 x̃: 2
helped stats (rel) min: <.01% max: 1.83% x̄: 0.09% x̃: 0.04%
HURT stats (abs)   min: 38 max: 38 x̄: 38.00 x̃: 38
HURT stats (rel)   min: 0.25% max: 0.25% x̄: 0.25% x̃: 0.25%
95% mean confidence interval for cycles value: -4.70 -4.08
95% mean confidence interval for cycles %-change: -0.09% -0.08%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
5bbb3d60d3 i965/fs: Allow cmod propagation when src0 is a uniform or shader input
No shader-db changes.  This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input.  However, this
change allows the next commit to help about 900 more shaders.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
8f83eea71e i965: Add negative_equals methods
This method is similar to the existing ::equals methods.  Instead of
testing that two src_regs are equal to each other, it tests that one is
the negation of the other.

v2: Simplify various checks based on suggestions from Matt.  Use
src_reg::type instead of fixed_hw_reg.type in a check.  Also suggested
by Matt.

v3: Rebase on 3 years.  Fix some problems with negative_equals with VF
constants.  Add fs_reg::negative_equals.

v4: Replace the existing default case with BRW_REGISTER_TYPE_UB,
BRW_REGISTER_TYPE_B, and BRW_REGISTER_TYPE_NF.  Suggested by Matt.
Expand the FINISHME comment to better explain why it isn't already
finished.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Gert Wollny
a21da49e5c mesa/st/tests: Use tgsi opcode enum also in the test classes
Fixes: ec478cf9c31K ("st/mesa,tgsi: use enum tgsi_opcode")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105737
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 09:04:53 -06:00
Eric Engestrom
1e36fe5dc4 meson: fix header check message
before: Checking if "endian.h works" compiles: YES
after:  Checking if "endian.h" compiles: YES

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-03-26 09:59:32 +01:00
Rob Clark
2f181c8c18 glsl_types: vec8/vec16 support
Not used in GL but 8 and 16 component vectors exist in OpenCL.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-25 10:42:54 -04:00
Rob Clark
f407edf340 glsl_types: refactor/prep for vec8/vec16
Refactor things so there isn't so much typing involved to add new
things.

Also drops a pointless conditional (out of bounds rows or columns
already returns error_type in all paths.. might as well drop it
rather than make the check more convoluted in the next patch by
adding the vec8/vec16 case).

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-25 10:42:54 -04:00
Jordan Justen
d60eaf7b1f anv: Set genX_table for gen11
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-23 17:23:59 -07:00
Jordan Justen
af8535d02f anv: Add gen11 to anv_genX_call
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-23 17:23:59 -07:00
Mathias Fröhlich
4a8ef1f5d4 vbo: Make sure the internal VAO's stay within limits.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:59:02 +01:00
Mathias Fröhlich
1a131aaf4b mesa: Flag early if we modify a SharedAndImmutable VAO.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:58:59 +01:00
Mathias Fröhlich
19526a57f5 mesa: When copying a VAO also copy the vertex attribute mode.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:58:54 +01:00
Emil Velikov
5a75019ad0 configure: use AC_CHECK_HEADERS to check for endian.h
The currently we use the singular CHECK_HEADER combined with explicit
append to the DEFINES variable. That is a legacy misnomer, since it
requires us to add $DEFINES to every piece that we build.

Using the plural version of the helper sets the HAVE_ macro for us, plus
ensures it's passed to the compiler - if config.h is available in there
(not in the case of mesa) otherwise on the command line.

In hindsight, we should replace all the AC_CHECK_{FUNC,HEADER} instances
with the plural version (or even the _ONCE suffixed version) and drop
the DEFINES hacks.

Fixes: cbee1bfb34 ("meson/configure: detect endian.h instead of trying
to guess when it's available")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105717
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-03-23 18:12:52 +00:00
Kenneth Graunke
90f556f0b1 android: Use local i915_drm.h rather than the system one.
Fixes: 2d26c99933 (intel: devinfo: meson: include drm uapi)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-03-23 10:05:02 -07:00
Brian Paul
e31d5bd2f9 st/mesa: s/unsigned/enum pipe_shader_type/ for st_bind_ubos()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
6a93deedf5 st/mesa: whitespace/formatting fixes in st_atom_constbuf.c
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
aad23f91ee st/mesa: s/unsigned/enum pipe_shader_type/
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
93581c2ca0 svga: simplify uses_flat_interp expression in emit_input_declarations()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
c99f46c2ac svga: replace unsigned with proper enum names
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
7181a9fa0e tgsi,softpipe: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
ec478cf9c3 st/mesa,tgsi: use enum tgsi_opcode
Need to update the tgsi code and st_glsl_to_tgsi code at the same time
to prevent compile break since C++ is much pickier about implicit
enum/unsigned casting.

Bump size of glsl_to_tgsi_instruction::op to 10 bits to be sure to
avoid MSVC signed enum overflow issue.  No change in class size.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
ccecb2bbd3 tgsi/nir: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
22a3190c85 tgsi: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
9413d1c0fe gallivm: use enum tgis_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
7df96826f8 svga: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
4e0f967f6d tgsi: convert opcode macros to enums
Enums are nicer in gdb.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Lionel Landwerlin
412fae46c0 compiler: glsl: silence valgrind warning on write cache
I don't think it actually fixes anything, but that's nice not to have valgrind warnings.
It manifests itself when running the piglit test : glsl-fs-raytrace-bug27060

==2058== Uninitialised byte(s) found during client check request
==2058==    at 0xC5BB040: blob_write_bytes (blob.c:152)
==2058==    by 0xC595359: write_variable (nir_serialize.c:144)
==2058==    by 0xC59560C: write_var_list (nir_serialize.c:192)
==2058==    by 0xC5982E4: nir_serialize (nir_serialize.c:1124)
==2058==    by 0xC0B729D: brw_program_serialize_nir (brw_program.c:835)
==2058==    by 0xC0AB2D6: brw_link_shader (brw_link.cpp:358)
==2058==    by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169)
==2058==    by 0xC36C7ED: create_new_program(gl_context*, state_key*) (ff_fragment_shader.cpp:1127)
==2058==    by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157)
==2058==    by 0xC1B50AF: update_program (state.c:134)
==2058==    by 0xC1B56DF: _mesa_update_state_locked (state.c:352)
==2058==    by 0xC1B579A: _mesa_update_state (state.c:386)
==2058==  Address 0xf1eab8a is 58 bytes inside a block of size 96 alloc'd
==2058==    at 0x4C2CB8F: malloc (vg_replace_malloc.c:299)
==2058==    by 0xC0FD306: ralloc_size (ralloc.c:121)
==2058==    by 0xC0FD5B1: ralloc_array_size (ralloc.c:208)
==2058==    by 0xC452B3B: (anonymous namespace)::nir_visitor::visit(ir_variable*) (glsl_to_nir.cpp:448)
==2058==    by 0xC45CE8B: ir_variable::accept(ir_visitor*) (ir.h:428)
==2058==    by 0xC46D0B5: visit_exec_list(exec_list*, ir_visitor*) (ir.cpp:1898)
==2058==    by 0xC451D2F: glsl_to_nir (glsl_to_nir.cpp:162)
==2058==    by 0xC0B5223: brw_create_nir (brw_program.c:79)
==2058==    by 0xC0AAB67: brw_link_shader (brw_link.cpp:257)
==2058==    by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169)
==2058==    by 0xC36C7ED: create_new_program(gl_context*, state_key*) (ff_fragment_shader.cpp:1127)
==2058==    by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-23 13:05:12 +00:00
Eric Engestrom
cbee1bfb34 meson/configure: detect endian.h instead of trying to guess when it's available
Cc: Maxin B. John <maxin.john@gmail.com>
Cc: Khem Raj <raj.khem@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Suggested-by: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: <mesa-stable@lists.freedesktop.org>
2018-03-23 11:44:21 +00:00
Juan A. Suarez Romero
ee2b943fa8 wayland-drm: do not distribute generated sources
Instead we will re-generate them again on building.

v2: get rid of BUILT_SOURCES (Daniel, Emil)
v3: keep BUILT_SOURCES for egl/Makefile.am (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-23 11:27:12 +01:00
Samuel Pitoiset
ccc64f3133 radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8
The hardware only supports 32-bit depth surfaces, but we can
enable TC-compat HTILE for 16-bit depth surfaces if no Z planes
are compressed.

The main benefit is to reduce the number of depth decompression
passes. Also, we don't need to implement DB->CB copies which is
fine.

This improves Serious Sam 2017 by +4%. Talos and F12017 are also
affected but I don't see a performance difference.

This also improves the shadowmapping Vulkan demo by 10-15%
(FPS is now similar to AMDVLK).

No CTS regressions on Polaris10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:57 +01:00
Samuel Pitoiset
5ae9772245 radv: add radv_calc_decompress_on_z_planes() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:55 +01:00
Samuel Pitoiset
9b8e75bee3 radv: add radv_image_is_tc_compat_htile() helper
Instead of that huge conditional that's going to be crazy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:54 +01:00
Jason Ekstrand
884d27bcf6 nir: Rename image intrinsics to image_var
Generated with

git grep -l nir_intrinsic_image | xargs \
sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g'

and some manual fixing in nir_intrinsics.h

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-23 13:48:11 +11:00
Dave Airlie
fa683385de virgl: add ARB_cull_distance support.
This just allows the properties through to the host if we have
cull dist support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-23 10:21:10 +10:00
Eric Anholt
d7a015cbc6 broadcom/vc5: Account for InstanceID/VertexID in VPM segment size.
Fixes failure in
GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size
2018-03-22 15:12:21 -07:00
Eric Anholt
b8387dbc49 broadcom/vc5: Allow FBOs with mixed color formats.
This is required by GLES3, fixing
GTF-GLES3.gtf.GL3Tests.framebuffer_srgb.framebuffer_srgb_draw
2018-03-22 15:12:21 -07:00
Eric Anholt
4f62679be5 broadcom/vc5: Add missing support for 2101010_REV vertex attributes.
Fixes
GTF-GLES3.gtf.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_invalid2,
where we hadn't thrown a GL error as needed in the extension-disabled
case.  We want to be exposing the extension anyway.
2018-03-22 15:12:21 -07:00
Eric Anholt
ba29b89dc7 broadcom/vc5: Set up a vertex position if the shader doesn't.
Our backend needs some sort of vertex position value to emit the scaled
viewport values and such.  Fixes potential segfaults in
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx
2018-03-22 15:12:21 -07:00
Lionel Landwerlin
903e9952fb i965: add performance query support on CNL
v2: Add brw_oa_cnl.xml to EXTRA_DIST (Emil)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
e7f6d1e5f8 i965: perf: add support for new equation operators
Some equations of the CNL metrics started to use operators we haven't
defined yet, just add those.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
57a11550bc i965: perf: query topology
With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam to tell userspace what subslices
are available.

We introduce a new uAPI in the kernel driver to report exactly what
part of the GPU are fused and require this to be available on Gen10+.

Prior generations can continue to rely on GETPARAM on older kernels.

This patch is quite a lot of code because we have to support lots of
different kernel versions, ranging from not providing any information
(for Haswell on 4.13 through 4.17), to being able to query through
GETPARAM (for gen8/9 on 4.13 through 4.17), to finally requiring 4.17
for Gen10+.

This change stores topology information in a unified way on
brw_context.topology from the various kernel APIs. And then generates
the appropriate values for the equations from that unified topology.

v2: Move slice/subslice masks fields to gen_device_info (Rafael)

v3: Add a gen_device_info_subslice_available() helper (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
c1900f5b0f intel: devinfo: add helper functions to fill fusing masks values
There are a couple of ways we can get the fusing information from the
kernel :

  - Through DRM_I915_GETPARAM with the SLICE_MASK/SUBSLICE_MASK
    parameters

  - Through the new DRM_IOCTL_I915_QUERY by requesting the
    DRM_I915_QUERY_TOPOLOGY_INFO

The second method is more accurate and also gives us the EUs fusing
masks. It's also a requirement for CNL as this platform has asymetric
subslices and the first method SUBSLICE_MASK value is assumed uniform
across slices.

v2: Change gen_device_info_update_from_masks() to generate topology
    and call into gen_device_info_update_from_topology (Lionel/Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
2d26c99933 intel: devinfo: meson: include drm uapi
Already available with the autotools build.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
5d3e74a5a5 drm-uapi: bump headers
Required updates from drm-next for changes in i965.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
c471716574 intel: devinfo: store slice/subslice/eu masks
We want to store values coming from the kernel but as a first step, we
can generate mask values out the numbers already stored in the
gen_device_info masks.

v2: Add a helper to set EU masks (Lionel/Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
7e2c6147da intel: devinfo: store number of EUs per subslice
This will be reused to store values reported by the kernel. The main
use case will be for use as the input values of the metric sets
equations for the INTEL_performance_queries extension. By storing this
information in the gen_device_info we make this non GL specific so
this can be reused by Vulkan if we ever have an equivalent extension.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Dylan Baker
8e5988eb35 Revert "meson: merge C and C++ compiler arguments check"
This reverts commit cb2ddcefa5.

This causes clang to error out building C++ code. The plan is to fix the
build to work with clang, but in the mean time we'll just revert this

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2018-03-22 11:35:08 -07:00
Lionel Landwerlin
1603ce1921 i965/perf: fix config registration when uploading to kernel
When registring configurations to the kernel for the first time, we
run into an issue where the id number is not properly set (we're using
the wrong variable). As a result when trying to use that id later on,
we get an error.

This issue manifest itself the first time you use frameretrace after
reboot, subsequent runs are fine.

Fixes: 27ee83eaf7 ("i965: perf: add support for userspace configurations")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 18:21:57 +00:00
Lepton Wu
a8b846bccd gallium/winsys/kms: Add support for multi-planes
Add a new struct kms_sw_plane which delegate a plane and use it
in place of sw_displaytarget. Multiple planes share same underlying
kms_sw_displaytarget.

v2:
 - add more check for plane size (Tomasz)
v3:
 - split from larger patch (Emil)
v4:
 - no change from v3
v5:
 - remove mapped field (Tomasz)
v6:
 - remove change-id in commit message (Tomasz)
v7:
 - add revision history in commit message (Emil)

Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2018-03-22 18:10:44 +00:00
Lepton Wu
d891f28df9 gallium/winsys/kms: Fix possible leak in map/unmap.
If user calls map twice for kms_sw_displaytarget, the first mapped
buffer could get leaked. Instead of calling mmap every time, just
reuse previous mapping. Since user could map same displaytarget with
different flags, we have to keep two different pointers, one for rw
mapping and one for ro mapping. Also introduce reference count for
mapped buffer so we can unmap them at right time.

v2:
 - avoid duplicated mapping and leaked mapping (Tomasz)
v3:
 - split from larger patch (Emil)
v4:
 - remove munmap from dt_destory (Emil)
v5:
 - introduce reference count for mapping (Tomasz)
 - add back munmap in dt_destory
v6:
 - remove change-id in commit message (Tomasz)
v7:
 - remove munmap from dt_destory again (Emil)
 - add revision history in commit message (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2018-03-22 18:10:42 +00:00
Juan A. Suarez Romero
4db269f30c broadcom/vc4: add path to nir_builder.h
As the other VC4 files do. Otherwise, it won't find nir_builder.h

v2: add path in source code rather changing autotools (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
d39e828c82 autotools: add tegra header files
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
40ecee89b7 swr/rast: autotools: add events_private.proto in dist tarball.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
0bf1274883 radv: autotools: add radv_extensions.h in the generated VULKAN list
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
13459c637a anv/radv: autotools: include vulkan_*.h headers
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
f8b749b7c0 nir: autotools, meson: add GLSL.ext.AMD.h in the files list
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Matt Turner
724586a266 intel/compiler: Readd ICL to test_eu_validate.cpp
Now that the PCI IDs are upstream, this can be readded.
2018-03-22 09:56:09 -07:00
Matt Turner
65b060d9cb intel/compiler: Skip 64-bit type tests when types not available
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
ad7ed86bf7 intel: Add a Ice Lake PCI IDs
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-22 09:56:09 -07:00
Anuj Phogat
1065acfb69 intel: Disable fast color clear on icl
Disabling fast color clear makes fbo-clearmipmap test render correct
texture in base miplevel. Fast color clear is anyways disabled for
non-base miplevels.

Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Jason Ekstrand
d2eecf0b0b intel/compiler/icl: Clear "null render target" bit in extended message descriptor
Otherwise all our render target writes go no where.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
1484876ef7 intel/compiler/icl: Update the assert in brw_stage_has_packed_dispatch()
Rafael ran piglit with the test code enabled and saw no additional GPU
hangs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
f05e0d9c2a intel/common/icl: Disable hiz surface sampling
On gen11+ AUX_HIZ is not a supported value for surfaces being
sampled by the 3D sampler.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
370af9dcc0 intel/common/icl: Add L3 config
ICL uses the same L3 configs as CNL, just leaving the SLM configs out.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Matt Turner
f56693af4b intel/tools/aubinator: Drop platform list from print_help()
We all know the platform names, and I don't want to update this list
continually.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Derek Foreman
aa18a63512 egl/wayland: Make swrast display_sync the correct queue
commit 03dd9a88b0 introduced per surface
queues, but the display_sync for swrast_commit_backbuffer remained on
the old queue.  This is likely to break when dispatching the correct
queue at the top of function (which can't dispatch the sync callback
we're waiting for).

The easiest known reproduction case is running weston-subsurfaces under
weston --use-pixman

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-22 15:27:35 +00:00
Samuel Pitoiset
52fba3f45d radv: remove unused radv_pipeline::needs_data_cache variable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-22 14:30:37 +01:00
Eric Engestrom
cb2ddcefa5 meson: merge C and C++ compiler arguments check
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 11:59:12 +00:00
Mathias Fröhlich
880c1718b6 omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}
We're trying to be -Wundef clean so that we can turn it on (and
eventually make it an error).

Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead
of #ifdef; I could've changed these, but the point of -Wundef is to
catch typos, so we might as well make the change the right way.

Fixes: 83d4a5d5ae "st/omx/tizonia: Add H.264 decoder"
Fixes: b2f2236dc5 "st/omx/tizonia: Add H.264 encoder"
Fixes: c62cf1f165 "st/omx/tizonia/h264d: Add EGLImage support"
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 11:39:28 +00:00
Mathias Fröhlich
795b465c50 meson: simplify omx logic
and let's make sure `with_gallium_omx` is never 'auto' and can only be
one of [bellagio, tizonia, disabled].

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 10:08:10 +00:00
Mathias Fröhlich
862c872c48 vbo: Remove now duplicate _DrawVAO notification.
The DriverFlags.NewArray bit is already set to NewDriverState in
_mesa_set_draw_vao since we have actually just above changed the VAOs
content. So this can be removed.
The _vbo_update_inputs is called by the vbo...recalculate_inputs being
set through the same mechanism as described above.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:53 +01:00
Mathias Fröhlich
006b5e798a vbo: Remove now duplicate _vbo_update_inputs from dlist draw.
At the current state, _vbo_update_inputs is called from
the draw callback if vbo...recalculate_inputs is set.
But that is now set of the _DrawVAO or its content or the
vertex program mode is changed.
So remove _vbo_update_inputs from the direct dlist draw path.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:53 +01:00
Mathias Fröhlich
2887c98140 vbo: Remove redundant set of DriverFlags.NewArray in vbo_bind_arrays.
Now that setting vbo...recalculate_inputs also sets the
DriverFlags.NewArray bits into the NewDriverState setting that from
vbo_bind_arrays is redundant.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
9f5b6ef2ef vbo: Remove vbo...recalculate_inputs from vbo_exec_invalidate_state.
This flag is now set when the actual Array._DrawVAO changes.
So setting this flag is redundant here.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
bf328359a7 mesa: A change of gl_vertex_processing_mode needs an array update.
Since arrays also handle the mapping of current values into the
disabled array slots, we need to tell the array update code that
this mapping has changed. Also mark only dirty if it has changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
5b91786225 mesa: Set DriverFlags.NewArray together with vbo...recalculate_inputs.
Both mean something very similar and are set at the same time now.
For that vbo module to be set from core mesa, implement a public vbo
module method to set that flag. In the longer term the flag should
vanish in favor of a driver flag of the appropriate driver.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
d3c604e12e mesa: Update VAO internal state when setting the _DrawVAO.
Update the VAO internal state on Array._DrawVAO instead of
Array.VAO. Also the VAO internal state update gets triggered now
by a change of Array._DrawVAO instead of the _NEW_ARRAY state flag.
Also no driver looks at any VAO's NewArrays value from within
the Driver.UpdateState callback. So it should be safe to move
this update into the _mesa_set_draw_vao method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
c4c56ff303 vbo: Move vbo_bind_arrays into a dd_driver_functions draw callback.
Factor out that common call into the almost single place.
Remove the _mesa_set_drawing_arrays call from vbo_{exec,save}_draw code
paths as the function is now called through vbo_bind_arrays.
Prepare updating the list of struct gl_vertex_array entries via
calling _vbo_update_inputs for being pushed into those drivers that
finally work on that long list of gl_vertex_array pointers.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
6307d1be0a mesa: Move vbo draw functions into dd_function_table.
Move vbo draw functions into struct dd_function_table.
For now just wrap the underlying vbo functions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Aaron Watry
23100acc8f clover/llvm: Fix build against LLVM/Clang 4.0
The opencl 1.0 langstandard was renamed in 5.0+

v2: Move preprocessor check into compat.hpp

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-21 21:03:23 -05:00
Timothy Arceri
c135316555 ac/nir_to_llvm: add frexp support
Fixes CTS tests:
KHR-GL40.gpu_shader_fp64.builtin.frexp_double
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4

And piglit test:
tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-22 12:42:34 +11:00
Timothy Arceri
cca2141745 nir: add frexp_exp and frexp_sig opcodes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-22 12:42:34 +11:00
Caio Marcelo de Oliveira Filho
12c22b897a anv/pipeline: don't pass constant view index in multiview
If view mask has only one bit set, view index is effectively a
constant, so doesn't need to be passed to the next stages, just always
set it.

Part of this was in the original patch that added
anv_nir_lower_multiview.c but disabled.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 14:49:50 -07:00
Caio Marcelo de Oliveira Filho
5e7c1d05d4 anv/pipeline: use less instructions for multiview
The view_index is encoded in the remainder of dividing instance id by
the number of views in the view mask (n). In the general case (handled
by the else clause), there is a need to map from 0..n-1 into the
number of the view being masked. For that a map is encoded.

In the case only the first n bits in the mask are set, the mapping is
trivial, 0..n-1 already represent what view is being referred to.

That case was in the original patch that added
anv_nir_lower_multiview.c but disabled.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 14:49:50 -07:00
Eric Anholt
baeb6a4b4a broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI.
Unfortunately TGSI doesn't record the type of the FS output like GLSL
does, but VC5's TLB writes depend on the output's base type.  Just record
the type in the key at variant compile time when we've got a TGSI input
and then fix it up.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a
GPU hang that breaks most tests that come after it.
2018-03-21 14:02:34 -07:00
Neil Roberts
61603f0e42 spirv: Add a 64-bit implementation of Frexp
The implementation is inspired by
lower_instructions_visitor::dfrexp_sig_to_arith.

This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4
test using the ARB_gl_spirv branch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 20:18:44 +01:00
Rafael Antognolli
5297a17571 aubinator_error_decode: Compare only the class_name of the ring.
ring_name is "<class_name> + <instance_id>" (e.g. rcs0). So we need to
first compare the class name only, then get the instance id.

Without this, INSTDONE is not being decoded.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-21 11:35:15 -07:00
Thomas Helland
8d5cd91ca0 nir: Migrate nir_dce to instr worklist
Shader-db runtime change avarage of five runs:
   Before 125,77 seconds (+/- 0,09%)
   After  124,48 seconds (+/- 0,07%)

Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric at anholt.net>
2018-03-21 19:26:40 +01:00
Thomas Helland
edb18564c7 nir: Initial implementation of a nir_instr_worklist
Make a simple worklist by basically just wrapping u_vector.
This is intended used in nir_opt_dce to reduce the number of calls
to ralloc, as we are currenlty spamming ralloc quite bad. It should
also give better cache locality and much lower memory usage.

Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric at anholt.net>
2018-03-21 19:26:27 +01:00
Scott D Phillips
cab8df1e3e intel/tools: aubinator: Catch gen11 "enhanced execlist" submission
Different registers are used for execlist submission in gen11, so
also watch those. This code only watches element zero of the
submit queue, which is all aubdump currently writes.

Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-21 11:07:15 -07:00
Marek Olšák
a8d55374dc radeonsi: fix a snprintf warning on gcc 7.3.0 2018-03-21 13:43:09 -04:00
Marek Olšák
cf0a95afac radeonsi/gfx9: print the swizzle mode for testdma
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-21 13:40:06 -04:00
Marek Olšák
f7ffa504a0 ac/surface: compute tile swizzle for GFX9
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-21 13:40:06 -04:00
Eric Anholt
9f0c9c6d18 broadcom/vc5: Don't skip job submit just because everything is scissored.
The coordinate shaders may now have side effects in the form of transform
feedback.

Part of fixing
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_misc
2018-03-21 10:04:21 -07:00
Eric Anholt
024e814dee broadcom/vc5: Handle sparsely populated SO target array.
Fixes
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_state_variables
2018-03-21 10:04:21 -07:00
Eric Anholt
f735ac6b1c broadcom/vc5: Fix 3D miplevel limit to match other texture targets.
Fixes segfault in
GTF-GLES3.gtf.GL3Tests.texture_storage.texture_storage_texture_levels on
level 13.
2018-03-21 10:04:21 -07:00
Eric Anholt
ba87d85b04 broadcom/vc5: Clamp the instance divisor to 16 bits.
Fixes debug assert on
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor

Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-21 10:04:21 -07:00
Lionel Landwerlin
3dd92184d5 i965: fix android build
This is the equivalent of commit 5770e1d89e for
android.

v2: fix xml files path and file given to --header

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Fixes: 2d2b15fbca ("i965: fix autotools/android build")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105634
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 18:56:47 +02:00
Juan A. Suarez Romero
e5cd376c2f docs: fix typo in 17.3.6 release notes
Title is about 17.3.5, when it must be about 17.3.6.

CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-21 16:37:49 +00:00
Caio Marcelo de Oliveira Filho
8571c577aa nir/dead_cf: also remove useless ifs
Generalize the code for remove dead loops to also remove dead if
nodes. The conditions are the same in both cases, if the node (and
it's children) don't have side-effects AND the nodes after it don't
use the values produced by the node.

The only difference is when evaluating side effects: loops consider
only return jumps as a side-effect -- they can stop execution of nodes
after it; 'if' nodes outside loops should consider all kinds of
jumps (return, break, continue) since all of them can cause execution
of nodes after it to be skipped.

After this patch, empty ifs (those which both then and else blocks are
empty) will be removed by nir_opt_dead_cf.

It caused no change to shader-db, in part because the removal of empty
ifs is currently covered by nir_opt_peephole_select.

v2: Improve the identification of cases where break/continue can cause
    side-effects. (Jason)

v3: Move code comment changes to a different patch. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 09:36:09 -07:00
Caio Marcelo de Oliveira Filho
470056d37b nir/dead_cf: rephrase definition of a dead loop node
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 09:35:57 -07:00
Juan A. Suarez Romero
e1f8c23e18 docs: update calendar, add news and link release notes to 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-03-21 16:02:37 +00:00
Juan A. Suarez Romero
543e7c8382 docs: add sha256 checksums for 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 13dd6016d7)
2018-03-21 15:58:55 +00:00
Juan A. Suarez Romero
09448940ed docs: add release notes for 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 8a51f3857c)
2018-03-21 15:58:52 +00:00
Leo Liu
c4de2f0880 radeon/vce: move feedback command inside of destroy function
On the CI family, firmware requires the destory command have to be the
last command in the IB, moving feedback command after destroy is causing
issues on CI cards, so we have to keep the previous logic that moves
destroy back to the last command.

But as the original issue fixed previously, with the newer family like Vega10,
feedback command have to be included inside of the task info command along
with destroy command.

Fixes: 6d74cb25("radeon/vce: move destroy command before feedback command")

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-03-21 11:24:35 -04:00
Eric Engestrom
1346a36162 egl: pull update from Khronos and drop local define
Added in Khronos in 2b6bb4ee45cc46c89d4a "EGL_MESA_drm_image: add
EGL_DRM_BUFFER_USE_CURSOR_MESA to egl.xml" [1] as part of PR #36 [2].

[1] 2b6bb4ee45
[2] https://github.com/KhronosGroup/EGL-Registry/pull/36

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Eric Engestrom
f744c6c1e2 egl: align the formatting of Haiku section of eglplatform.h with Khronos'
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Eric Engestrom
ac698ae4a0 egl: add Ozone section to eglplatform.h
This pulls in commit a93f559e9c11fa53fb5f1cc255b8f75433f85d2a "Add Ozone
section to eglplatform.h" from Khronos [1] added by Brian Anderson [2]
a few months ago.

[1] a93f559e9c
[2] https://github.com/KhronosGroup/EGL-Registry/pull/26

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Aaron Watry
c95d953b18 clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version
Use get_language_version to calculate default cl standard based on
device capabilities and -cl-std specified in build options.

v5; move dev_clc_version declaration from an earlier patch
v4: Squash the __OPENCL_VERSION__ and CLC language version patches
v3: (Jan) Allow device_version up to 2.2 while device_clc_version
    only goes to 2.0
    Use get_cl_version to calculate version instead
v2: Split out from the previous patch (Pierre)

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
CC: Jan Vesely <jan.vesely@rutgers.edu>
2018-03-21 06:59:46 -05:00
Aaron Watry
29b4090d18 clover/llvm: Add get_[cl|language]_version, validation and some helpers
Used to calculate the default CLC language version based on the --cl-std in build args
and the device capabilities.

According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
 1) If you have -cl-std=CL1.1+ use the version specified
 2) If not, use the highest 1.x version that the device supports

Curiously, there is no valid value for -cl-std=CL1.0

Validates requested cl-std against device_clc_version

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>

v7: (Pierre) Split cl/clc versions into separate lists and
    make more references const.

v6: (Pierre) Add more const and fix some whitespace

v5: (Aaron) Use a collection of cl versions instead of switch cases
    Consolidates the string, numeric version, and clc langstandard::kind

v4: (Pierre) Split get_language_version addition and use into separate patches
    Squash patches that add the helpers and validate the language standard

v3: Change device_version to device_clc_version

v2: (Pierre) Move create_compiler_instance changes to correct patch
    to prevent temporary build breakage.
    Convert version_str into unsigned and use it to find language version
    Add build_error for unknown language version string
    Whitespace fixes
2018-03-21 06:59:37 -05:00
Juan A. Suarez Romero
14fffefc60 docs: add 17.3.{8,9} in the release calendar
Mesa 18.0 series has not been released yet, so let's extend 17.3 lifetime.

v2: add 17.3.9 in the calendar (Andres Gomez)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 11:57:44 +01:00
Eric Anholt
4d8b476fa9 intel/blorp: Fix compiler warning about num_layers.
The compiler doesn't notice that the condition for num_layers to be
undefined already defined it above (as our assert checked in a debug
build).

v2: Move the pair of assignments to one outside of the block.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 14:06:46 -07:00
Samuel Pitoiset
f0211155f1 radv: add support for VK_EXT_depth_range_unrestricted
This extension removes the restrictions on minDepth/maxDepth,
minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth.

The following CTS tests now pass:

dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth
dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth
dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth
dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted
dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 21:55:41 +01:00
Samuel Pitoiset
4e9b0b39b5 radv: only enable one channel when exporting prim id
It's a 32-bit integer like the layer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 21:54:48 +01:00
Lionel Landwerlin
5770e1d89e i965: fix out of tree autotools build
Fixes: 2d2b15fbca ("i965: fix autotools/android build")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-20 19:48:56 +00:00
Stéphane Marchesin
1117edc60d virgl: Implement seamless cube maps
This was previously ignored.

Along with the virglrenderer patch, this fixes ~100 dEQP tests:
dEQP-GLES3.functional.texture.filtering.cube.*

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-21 05:44:52 +10:00
Emil Velikov
c43715d30b i965: annotate brw_oa.py's --header and --code as required
As of earlier commit, the --header was made a hard requirement when
using --code.

Hence - annotate both as required and drop a few no longer needed
checks.

Fixes: 035cc7a12d ("i965: perf: reduce i965 binary size")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 17:21:49 +00:00
Lionel Landwerlin
d3e5d3955c i965: pipecontrol: add LRI write immediate flag
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 16:58:30 +00:00
Lionel Landwerlin
7f977d51b3 intel: genxml: add INSTPM/CS_DEBUG_MODE2 registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 16:58:30 +00:00
Lionel Landwerlin
2d2b15fbca i965: fix autotools/android build
Autotools/android builds generate the header & code files in 2 steps,
but the code generation requires the name of the header file to
include it.

This change generates both files in one command.

Fixes: 035cc7a12d ("i965: perf: reduce i965 binary size")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-20 16:58:29 +00:00
Daniel Stone
9f3509665d dri3: Fix typo in version check
The have-new-DRI3 codepaths would never actually properly trigger, since
there was a typo in configure.ac which broke the version check. This
went unnoticed but for an error in config.log if you looked closely
enough.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Lukas F. Hartmann <lukas@mntmn.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
Cc: Dave Airlie <airlied@redhat.com>
2018-03-20 16:38:08 +00:00
Daniel Stone
bc5e59119e meson: Don't build svga by default on ARM/AArch64
VMware has no (published) support for Arm-architecture guests.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reported-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-20 16:18:37 +00:00
Daniel Stone
d7603cb518 meson: Add default DRI drivers for ARM/AArch64
On all Arm architectures (ARMv7 and below as 'arm', ARMv8 and above as
'aarch64'), only build swrast for DRI drivers. The only classic drivers
which could be used are r200 and NV20 cards, which seems unlikely enough
that it shouldn't be the default.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Javier Jardón <jjardon@gnome.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-20 16:18:37 +00:00
Emil Velikov
28780c5028 st/mesa: add compiler/nir/ prefix for nir includes
Stay consistent with the rest of the codebase, effectively fixing the
autotools build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105621
Fixes: ffa4bbe466 ("st/nir/radeonsi: move nir_lower_uniforms_to_ubo()
to the state tracker")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-20 16:11:19 +00:00
Scott D Phillips
d849d36c6c anv: off-by-one in GetDescriptorSetLayoutSupport
Loop was accessing one more than bindingCount elements from
pBindings, accessing uninitialized memory.

Fixes: ddc4069122 ("anv: Implement VK_KHR_maintenance3")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 07:58:10 -07:00
Lionel Landwerlin
035cc7a12d i965: perf: reduce i965 binary size
Performance metric numbers are calculated the following way :

   - out of the 256 bytes long OA reports, we accumulate the deltas
     into an array of uint64_t

   - the equations' generated code reads the accumulated uint64_t
     deltas and normalizes them for a particular platform

Our hardware is such that a number of counters in the OA reports
always return the same values (i.e. they're not programmable), and
they return the same values even across generations, and as a result a
number of equations are identical in different metric sets across
different generations.

Up to now we've kept the generated code of the equations separated in
different files (per generation/GT), and didn't apply any
factorization of the common equations. We could have make some
improvement by reusing equations within a given metrics file, but we
can go even further and reuse across generations (i.e. all files).

This change changes the code generation to emit a single file in which
we reuse equations emitted code based on the hash of equations'
strings.

Here are the savings in a meson build :

Before(.old)/after :
   $ du -h ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old
   43M	./build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   47M	./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old

   $ size build/src/mesa/drivers/dri/libmesa_dri_drivers.so build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old
       text   data          bss	     dec            hex filename
   13054002 409424	 671856	14135282	 d7aff2	build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   14550386 409552	 671856	15631794	 ee85b2	build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old

As a side comment here is the size of the drivers if we remove all of
the metrics from the build :

   $ du -sh build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   40M	build/src/mesa/drivers/dri/libmesa_dri_drivers.so

v2: Fix an issue with hashing of counter equations (Lionel)
    Build system rework (Emil)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (build system part)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 13:56:07 +00:00
Lionel Landwerlin
e9a9e85948 i965: perf: fix a counter return type on hsw
The equation code computes a float (percentage) yet the return type
was an uint64_t.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 11:36:13 +00:00
Tapani Pälli
604cac9f73 mesa: fix leaking ParameterValueOffset
==15115== 48 bytes in 1 blocks are definitely lost in loss record 16 of 66
==15115==    at 0x4C2EC15: realloc (vg_replace_malloc.c:785)
==15115==    by 0x8602C3E: _mesa_reserve_parameter_storage (prog_parameter.c:212)
==15115==    by 0x8602D1E: _mesa_add_parameter (prog_parameter.c:252)
==15115==    by 0x86032C4: _mesa_add_sized_state_reference (prog_parameter.c:384)
==15115==    by 0x8603324: _mesa_add_state_reference (prog_parameter.c:409)

Fixes: edded12376 "mesa: rework ParameterList to allow packing"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-20 13:25:07 +02:00
Daniel Stone
478fc2d2a1 dri3: Don't fail on version mismatch
The previous commit to make DRI3 modifier support optional, breaks with
an updated server and old client.

Make sure we never set multibuffers_available unless we also support it
locally. Make sure we don't call stubs of new-DRI3 functions (or empty
branches) which will never succeed.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
2018-03-20 08:52:59 +00:00
Timothy Arceri
9a243eccae radv: don't lower indirects until after opts have run
Noticed while passing by. Not sure if it impacts anything, but
likely to impact GFX9 more than anything else since we lower
inputs, outputs and locals there.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 15:01:44 +11:00
Timothy Arceri
dfe2f19855 st/nir: fix atomic lowering for gallium drivers
i965 and gallium handle the atomic buffer index differently. It was
just by luck that the single piglit test for this was passing.

For gallium we use the atomic binding so that we match the handling
in st_bind_atomics().

On radeonsi this fixes the CTS test:
KHR-GL43.shader_storage_buffer_object.advanced-write-fragment

It also fixes tressfx hair rendering in Tomb Raider.

Reviewed-by: Marek Olšák  <marek.olsak@amd.com>
2018-03-20 14:29:53 +11:00
Timothy Arceri
632d5e97ef st/radeonsi: enable uniform packing in NIR backend
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:19:35 +11:00
Timothy Arceri
231333a20d st: add uniform packing support to lower_uniforms_to_ubo()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
9c51a7ea29 gallium: add packed uniform CAP
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
ffa4bbe466 st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker
This will only ever be used by gallium drivers so it probably doesn't
belong in the nir toolkit. Also we want to pass it some non NIR
things in the following patch.

To avoid regressions we wrap the lowering calls that have been moved
to st_glsl_to_nir with a quick hack so that they are only called for
radeonsi, we will replace the hack with a check for uniform packing
in a following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
a80cf442d9 st: add st_glsl_type_dword_size() helper
This will be used to support uniform packing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
5488166730 st/glsl_to_nir: add support for packed builtin uniforms
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
57ebab64c0 mesa: add _mesa_add_sized_state_reference() helper
This will be used for adding packed builtin uniforms.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
2377754329 mesa: add support propagate uniform support for packed uniforms
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
40711a7a60 mesa: allow for uniform packing when adding uniforms to param list
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
a2198d4fdb mesa: add packing support for setting uniform handles
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
6cfa15b803 mesa: add packing support for setting uniforms
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
4a7c5c079b mesa: create copy uniform to storage helpers
These will be used in the following patch to allow copying directly
to the param list when packing is enabled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
edded12376 mesa: rework ParameterList to allow packing
Currently everything is padded to 4 components. Making the list
more flexible will allow us to do uniform packing.

V2 (suggestions from Nicolai):
- always pass existing calls to _mesa_add_parameter() true for padd_and_align
- fix bindless param value offsets
- remove left over wip logic from pad and align code
- zero out param value padding
- whitespace fix

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
b13b9eb432 mesa: add PackedDriverUniformStorage const
Will be used to determine whether to take packing code paths or not.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Eric Anholt
00910e3057 broadcom/vc5: Don't annotate dumps with stale live intervals.
As you're debugging register allocation, you may have changed the
intervals and not recomputed yet.  Just skip the dump in that case.
2018-03-19 16:44:20 -07:00
Eric Anholt
facc3c6f58 broadcom/vc5: Add support for register spilling.
Our register spilling support is nice to have since vc4 couldn't at all,
but we're still very restricted due to needing to not spill during a TMU
operation, or during the last segment of the program (which would be nice
to spill a value of, when there's a long-lived value being passed through
with little modification from the start to the end).

We could do better by emitting unspills for the last-segment values just
before the last thrsw, since the last segment is probably not the maximum
interference area.

Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3
others.
2018-03-19 16:44:06 -07:00
Eric Anholt
271fc58ba1 broadcom/vc5: Remove redundant last_inst lookup.
The point was to get the MOV, which the MOV_dest already returned.
2018-03-19 16:42:59 -07:00
Eric Anholt
34dc64f627 broadcom/vc5: On QPU pack error, dump the instruction and return cleanly.
This is nice for debugging when you've made a bad instruction.
2018-03-19 16:42:59 -07:00
Eric Anholt
d721348dcd broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's.
This will let me do lowering late in compilation using the same
instruction builder as we use in nir_to_vir.
2018-03-19 16:42:59 -07:00
Eric Anholt
c81d681742 broadcom/vc5: Move the umul macro to a header.
Anywhere we want to multiply, we probably want this.
2018-03-19 16:42:59 -07:00
Eric Anholt
9e28c18cd1 broadcom/vc5: Correct the arg count of TIDX/EIDX. 2018-03-19 16:42:59 -07:00
Eric Anholt
55bf298333 broadcom/vc5: Re-do live variables after removing thrsws.
Otherwise our start/ends ips won't line up with the actual instructions.
2018-03-19 16:42:59 -07:00
Eric Anholt
c3a504f470 broadcom/vc5: Add a QPU helper for instructions using the TLB.
This will be used for detecting last thread segment in register spilling.
2018-03-19 16:42:59 -07:00
Eric Anholt
09c4dd1971 broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm().
These helpers will be used in register spilling to determine where to add
a last thrsw if needed, and might help refactor QPU scheduling.
2018-03-19 16:42:59 -07:00
Eric Anholt
407f21ef1b broadcom/vc5: The ldvpm signal also a case of using the VPM.
The QPU scheduling code calling this function already separately checked
this signal.
2018-03-19 16:42:59 -07:00
Eric Anholt
4760040c09 broadcom/vc5: Extract v3d_qpu_writes_tmu() helper.
This will be reused in register spilling.
2018-03-19 16:42:59 -07:00
Dave Airlie
32791a0502 radv: don't export NULL layer.
We have some cases where in subpass we want the layer but having
it be 0 and loaded in the frag shader without the vertex shader
exporting it is fine.

So don't export the layer if we don't have a value to put in it.

Fixes: d4c74aed7a (radv/multiview: mark layer_input if we have input attachments.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 21:36:48 +00:00
Marek Olšák
f674b50d0e mesa: adjust incorrect comment in texture_buffer_range 2018-03-19 16:56:17 -04:00
Ian Romanick
6aeaa7d363 nir: Don't compare b2f or b2i with zero
All of the shaders that had loops changed were in Tomb Raider.  The one
shader that lost SIMD16 is one of those.

Skylake
total instructions in shared programs: 14391653 -> 14390468 (<.01%)
instructions in affected programs: 111891 -> 110706 (-1.06%)
helped: 501
HURT: 0
helped stats (abs) min: 1 max: 155 x̄: 2.37 x̃: 1
helped stats (rel) min: 0.05% max: 21.54% x̄: 1.61% x̃: 1.01%
95% mean confidence interval for instructions value: -3.23 -1.50
95% mean confidence interval for instructions %-change: -1.77% -1.45%
Instructions are helped.

total cycles in shared programs: 532793024 -> 532776598 (<.01%)
cycles in affected programs: 987682 -> 971256 (-1.66%)
helped: 348
nnHURT: 41
helped stats (abs) min: 1 max: 3074 x̄: 54.91 x̃: 18
helped stats (rel) min: 0.05% max: 32.24% x̄: 3.36% x̃: 1.68%
HURT stats (abs)   min: 1 max: 422 x̄: 65.39 x̃: 24
HURT stats (rel)   min: 0.09% max: 39.29% x̄: 9.50% x̃: 2.02%
95% mean confidence interval for cycles value: -64.08 -20.38
95% mean confidence interval for cycles %-change: -2.78% -1.23%
Cycles are helped.

total loops in shared programs: 4854 -> 4829 (-0.52%)
loops in affected programs: 27 -> 2 (-92.59%)
helped: 18
HURT: 0

LOST:   1
GAINED: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 13:52:35 -07:00
Dave Airlie
e8d9b7ab02 radv: lower constant initializers on output variables earlier
If a shader only writes to an output via a constant initializer we
need to lower it before we call nir_remove_dead_variables so that
this pass sees the stores from the initializer and doesn't kill the
output.

Fixes test failures in new work-in-progress CTS tests:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float

This is ported from anv:
99b57daf4a anv/pipeline: lower constant initializers on output variables earlier
from Iago Toral Quiroga <itoral@igalia.com>

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:40 +00:00
Dave Airlie
032014ac01 radv/query: handle multiview timestamp queries.
For each view bit we need to emit a timestamp query.

Fixes: dEQP-VK.multiview.queries*

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:14 +00:00
Dave Airlie
32b4f3c38d radv/query: handle multiview queries properly. (v3)
For multiview we need to emit a number of sequential queries
depending on the view mask.

This avoids dEQP-VK.multiview.queries.15 waiting forever
on the CPU for query results that are never coming.

We only really want to emit one query,
and the rest should be blank (amdvlk does the same),
so we emit begin/end pairs for all the others except
the first query.

v2: fix tests
v3: split out patch.

Fixes: dEQP-VK.multiview.queries*
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:09 +00:00
Dave Airlie
4034dc5c72 radv/query: split out begin/end query emission
This just splits out the begin/end query hw emissions,
it makes it easier to add multiview support for queries.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:05 +00:00
Dave Airlie
d4c74aed7a radv/multiview: mark layer_input if we have input attachments.
This fixes:
dEQP-VK.multiview.input_attachments*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:26:39 +00:00
Caio Marcelo de Oliveira Filho
f6338c3b85 anv/pipeline: set active_stages early
Since the intermediate states of active_stages are not used,
i.e. active_stages is read only after all stages were set into it,
just set its value before compiling the shaders.

This will allow to conditionally run certain passes based on what
other shaders are being used, e.g. a certain pass might only be
applicable to the vertex shader if there's no geometry or tessellation
shader being used.

v2: Use vk_to_mesa_shader_stage. (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-19 18:00:49 +00:00
Caio Marcelo de Oliveira Filho
318073ce66 anv/pipeline: fail if TCS/TES compile fail
v2: Add Fixes tag. (Lionel)

Fixes: e50d4807a3 ("anv: Compile TCS/TES shaders.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-19 18:00:49 +00:00
Jordan Justen
2ed288363f main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED
This change allows the disk shader cache to work with programs loaded
with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set,
then they try to use the shader cache.

Since the program loaded by ProgramBinary is similar to loading the
shader from the disk cache, this is probably more appropriate.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
d2b74ca2b5 i965: Allow disk shader cache usage with LINKING_SUCCESS status
Currently, we only look in the disk shader cache if we see that the
shader program is in the cache during the link step.

If the shader cache entry isn't found during the program link, there
are still some (fairly unlikely) scenarios where later it might be
useful to search the cache for gen binary programs.

1. If the cache evicts the serialized glsl cache, there might still be
   valid gen program entries in the disk cache.

2. If two applications are running in parallel, then it is possible
   that one may write out the cached gen program item which the other
   application can then make use of.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
b5baaee0d6 glsl/serialize: Save shader program metadata sha1
When the shader cache is used, this can be generated. In fact, the
shader cache uses this sha1 to lookup the serialized GL shader
program.

If a GL shader program is restored with ProgramBinary, the shaders are
not available, and therefore the correct sha1 cannot be generated. If
this is restored, then we can use the shader cache to restore the
binary programs to the program that was loaded with ProgramBinary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
9b473f9e3c glsl: Remove api_enabled tracking for transform feedback
We used this to prevent usage of the disk shader cache when transform
feedback was enabled via the GL API. This is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
fc4a7aaa82 i965: Allow disk shader cache usage with transform feedback
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
6d830940f7 glsl/shader_cache: Allow shader cache usage with transform feedback
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Suggested-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jose Fonseca
e10dc12f6f scons: need to split CC or things might fail
We've seen this fail internally.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-03-19 16:41:57 +01:00
Jordan Justen
d07a49fb18 i965: Add INTEL_DEBUG stages support for disk shader cache
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-19 00:07:29 -07:00
Dave Airlie
8f052a3e25 radv: handle exporting view index to fragment shader. (v1.1)
The fragment shader was trying to read this, but nothing
was exporting it from the vertex shader. This handles
it like the prim id export.

Fixes:
dEQP-VK.multiview.secondary_cmd_buffer.*
dEQP-VK.multiview.index.fragment_shader.*

v1.1: updated to use 0x1 (Samuel)

Fixes: e3265c10c8 (radv: Implement multiview draws.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-19 01:20:00 +00:00
Axel Davy
dbc24835d7 st/nine: Fix non inversible matrix check
There was a missing absolute value when
checking if the determinant was big enough.

Fixes: https://github.com/iXit/Mesa-3D/issues/292

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:46 +01:00
Axel Davy
f61e9a958b st/nine: Fixes warning about implicit conversion
Makes the conversion explicit.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102542

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:42 +01:00
Axel Davy
71eae7940e st/nine: Fix bad tracking of vs textures for NINESBT_ALL
Stateblocks with NINESBT_ALL should track all textures.
For better performance they have a faster path which
copies all the required.

This path was only tracking ps textures.

Fixes: https://github.com/iXit/Mesa-3D/issues/303

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:36 +01:00
Axel Davy
76fa1f730b st/nine: Fix bad tracking of bound vs textures
An incorrect formula was used to compute bound_samplers_mask_vs.
Since s is above always 8 for vs and the variable is encoded on 8 bits,
it was always 0.
This resulted in commiting the samplers every call when
there was at least one texture read in the vs shader.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-18 22:53:32 +01:00
Grazvydas Ignotas
e1b2e5667c radv: make vk_format_description structures static
No need to bother the linker about them.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-17 18:53:21 +02:00
Grazvydas Ignotas
331141e87e radv: fix stale comment in generated vk_format_table.c
It seems to be a leftover from u_format_table.py.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-17 18:53:21 +02:00
Eric Anholt
7db1c09d12 anv: Silence warning about heap_size.
We only get VK_SUCCESS if it was initialized, but apparently my compiler
doesn't track that far.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-16 15:10:05 -07:00
Eric Anholt
d25640c3a3 i965: Silence compiler warning about promoted_constants.
We only have a cfg != NULL if we went through one of the paths that set
it, but my compiler doesn't figure that out.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6411defdcd ("intel/cs: Re-run final NIR optimizations for each SIMD size")
2018-03-16 15:09:55 -07:00
Eric Anholt
9f89452ea3 anv: Silence compiler warnings about uninitialized bind_offset.
This is a legitimate warning: if anv's blorp_alloc_binding_table() throws
an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently
continue to use this undefined value.  The rest of this code doesn't seem
very allocation-error-proof, though, either.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-16 15:09:47 -07:00
Matt Turner
f3833f1ca7 intel/compiler: Use gen_get_device_info() in test_eu_validate
Previously the unit test filled out a minimal devinfo struct. A previous
patch caused the test to begin assert failing because the devinfo was
not complete. Avoid this by using the real mechanism to create devinfo.

Note that we have to drop icl from the table, since we now rely on the
name -> PCI ID translation done by gen_device_name_to_pci_device_id(),
and ICL's PCI IDs are not upstream yet.

Fixes: f89e735719 ("intel/compiler: Check for unsupported register sizes.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-16 13:20:21 -07:00
Matt Turner
54db78b196 intel: Add cfl to gen_device_name_to_pci_device_id()
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-16 13:20:21 -07:00
Rob Clark
bc5001325b meson+dri3: allow building against older xcb (v3)
Similar to previous patch, make xcb 1.13 optional.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-16 16:18:42 -04:00
Dave Airlie
7aeef2d4ef dri3: allow building against older xcb (v3)
I'm not sure everyone wants to be updating their dri3 in a forced
march setting, this allows a nicer approach, esp when you want
to build on distro that aren't brand new.

I'm sure there are plenty of ways this patch could be cleaner,
and I've also not built it against an updated dri3.

For meson I've just left it alone, since if you are using meson
you probably don't mind xcb updates, and if you are using meson
you can fix this better than me.

v3: just don't put a version in for dri3/present without
modifiers, should allow building with 1.11 as well

(feel free to supply meson followups)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:19:45 -04:00
Marek Olšák
f099c3aef1 r600: consolidate PIPE_BIND_SHARED/SCANOUT handling
(Ported from radeonsi commit f70f6baaa3)

Allows cached BOs to be reused in more cases.

Bugzilla: https://bugs.freedesktop.org/105171
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2018-03-16 17:31:28 +01:00
Rafael Antognolli
f89e735719 intel/compiler: Check for unsupported register sizes.
Make sure we don't emit 64 bit types if the hardware doesn't support
them.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-16 09:27:16 -07:00
Jason Ekstrand
315ee5faec loader: Include include/drm-uapi in the autotools build
We're already including it in the meson build.  This fixes build issues
on systems which have a drm_fourcc.h that doesn't have modifiers.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-16 08:50:07 -07:00
Wu, Zhongmin
5fc21c6044 egl/android: Implement the eglSwapinterval for Android.
Implement the eglSwapinterval for Android platform to
enable the async mode for some GFX benchmarks such as
Daimler C217, CityBench.

Results of the dEQP-EGL.*swap_interval tests

'dEQP-EGL.functional.query_config.get_config_attrib.max_swap_interval'..
'dEQP-EGL.functional.query_config.get_config_attrib.min_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_only.max_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_only.min_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_and_sort.max_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_and_sort.min_swap_interval'..
'dEQP-EGL.functional.negative_api.swap_interval'..

 Test run totals:
   Passed:        7/7 (100.0%)
   Failed:        0/7 (0.0%)
   Not supported: 0/7 (0.0%)
   Warnings:      0/7 (0.0%)

Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
[Emil Velikov: polish inline comment, add dEQP stats, s/dpy/disp/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-16 13:58:56 +00:00
Emil Velikov
3a9fb4f7ad st/mesa: simplify st_init_limits() via tgsi_processor_to_shader_stage
Reuse the tgis helper and remove a bunch of duplicated code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:49:16 +00:00
Emil Velikov
f7f95310f0 tgsi: move tgsi_processor_to_shader_stage() to a header
This way we can utilise it with later patches.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:48:46 +00:00
Emil Velikov
9fa1d822bf egl/dri2: move wayland header inclusion where applicable
Instead of indirectly pulling the wayland headers everywhere, use
forward declarations and #include only as needed.

Should effectively fix build errors like the following:

make[5]: Entering directory
'/.../src/gallium/state_trackers/omx/tizonia'
   CC       h264dprc.lo
In file included from h264dprc.c:45:0:
.../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error:
wayland/wayland-egl/wayland-egl-backend.h: No such file or directory
  #include "wayland/wayland-egl/wayland-egl-backend.h"

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
2018-03-16 13:47:59 +00:00
Emil Velikov
d091c9c4cf vulkan/wsi/x11: correct DRI3 version in comment
During development the version was bumped, yet the comment did not get
an update.

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-16 13:47:52 +00:00
Emil Velikov
19ec817756 vulkan/wsi/x11: use ARRAY_SIZE where applicable
Use the handy macro instead of hard coded numbers.

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-16 13:45:47 +00:00
Juan A. Suarez Romero
705a6446b4 mesa: RGB9_E5 invalid for CopyTexSubImage* in GLES
According to OpenGL ES 3.2, section 8.6, CopyTexSubImage* should return
an INVALID_OPERATION if the internalformat of the texture is RGB9_E5.

This fixes
dEQP-GLES31.functional.debug.negative_coverage.*.copytexsubimage2d_texture_internalformat.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-16 12:49:16 +00:00
Christian Gmeiner
5e51f72374 etnaviv: remove superfluous \n from DBG(..) callers
The DBG(..) macro appends a \n already so there is no
need to do it twice.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-03-16 11:41:27 +01:00
Samuel Pitoiset
e96a1d27dc radv: run nir_opt_move_load_ubo
Polaris10:
SGPRS: 108560 -> 107856 (-0.65 %)
VGPRS: 74576 -> 74520 (-0.08 %)
Spilled SGPRs: 7375 -> 7113 (-3.55 %)
Code Size: 4273464 -> 4274364 (0.02 %) bytes
Max Waves: 9434 -> 9446 (0.13 %)

Vega10:
Totals from affected shaders:
SGPRS: 108264 -> 107576 (-0.64 %)
VGPRS: 69068 -> 69000 (-0.10 %)
Spilled SGPRs: 7221 -> 6959 (-3.63 %)
Code Size: 3800796 -> 3801496 (0.02 %) bytes
Max Waves: 10687 -> 10709 (0.21 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-16 09:58:19 +01:00
Samuel Pitoiset
af355aaa07 nir: add nir_opt_move_load_ubo() optimization pass
This pass moves load UBO operations just before their first use,
loosely based on nir_opt_move_comparisons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-16 09:50:31 +01:00
Dave Airlie
9d0d806332 radv: drop geometry stride user sgpr.
This removes the other geometry specific user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:21 +00:00
Dave Airlie
6f051549c3 radv: get rid of geometry user sgpr for num entries.
This drops one of the geometry specific user sgprs,
we can work this out at compile time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:17 +00:00
Dave Airlie
9188bd78d7 radv: migrate lds size calculations to shader gen.
This moves the lds_size calcs into the shader so we have all
the size stuff in one file.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:12 +00:00
Dave Airlie
384aced65e radv: drop scanning the tess shader in the nir code.
This drops the now unneeded scanning and results in favour
of the ones in the info.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:08 +00:00
Dave Airlie
f50d520acf radv: use num_patches output from tcs shader.
Instead of recalculating the value, use the shader calculated value.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:05 +00:00
Dave Airlie
bf9a0ea853 radv/tess: remove last chunk of tess sgprs
This removes the last TES-specifc user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:01 +00:00
Dave Airlie
6db44d6a8c radv: pass num_patches to tes from tcs
TES needs num_patches to do some of the calculations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:58 +00:00
Dave Airlie
010d055aae radv: drop tess offchip layout for tcs.
This removes the last TCS specific user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:54 +00:00
Dave Airlie
ee31cff856 radv: drop tcs_out_offsets
Move all calculations to shader generation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:47 +00:00
Dave Airlie
b0460bbf1c radv: drop tcs_out_layout
Move all calculations to shader generation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:43 +00:00
Dave Airlie
6adf99165c radv/tess: drop tcs_in_layout setting completely.
Inline all calcs at shader creation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:37 +00:00
Dave Airlie
f343d11ae7 radv: drop ls_out_layout const.
We can precalculate input_vertex_size at compile time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:32 +00:00
Dave Airlie
d89b16b7b9 radv/shader_info: start gathering tess output info (v2)
This gathers the ls outputs written by the vertex shader,
and the tcs outputs, these are needed to calculate certain
tcs parameters.

These have to be separate for combined gfx9 shaders.

This is a bit pessimistic compared to the nir pass,
as we don't work out the individual slots for tcs outputs,
but I actually thing it should be fine to just mark the whole
thing used here.

v2: move to radv, handle clip dist (Samuel),
    handle compacts and patchs properly.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:23 +00:00
Dave Airlie
2012dae19a radv: migrate unique index info shader info (v2)
This just moves this function to an inline so the shader_info
pass can use it.

v2: use inline (Samuel)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:19 +00:00
Samuel Pitoiset
f02f1ad13f Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"
This reverts commit f314a532fd.

This appears to introduce some blinking textures in UT2004. Not
sure exactly what's the root cause because we don't have much
information about the issue.

Anyway, this was just a micro optimization that actually breaks,
at least, one app almost one year later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105436
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-15 21:32:52 +01:00
Lionel Landwerlin
51783f3e7d anv: silence unused variable warning
Fixes: 59b0ea0c74 ("anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:56:26 +00:00
Lionel Landwerlin
b5b56f91f5 i965: silence unused function warning
[123/227] Compiling C object 'src/mesa/drivers/dri/i965/libi965_gen110@sta/genX_blorp_exec.c.o'.
../src/mesa/drivers/dri/i965/genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:56:23 +00:00
Lionel Landwerlin
0f544a3c51 anv: silence unused function warning on gen11
[84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'.
../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:55:42 +00:00
Dylan Baker
2a7027f79a meson: fix pipe-loaders after omx changes
with_gallium_omx used to be a boolean, but now it's a string. That means
it needs to be compared to 'disabled' instead of false.

CC: Rob Clark <robdclark@gmail.com>
Fixes: 34e852d5b5
       ("meson: Re-add auto option for omx")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Rob Clark <robdclark@gmail.com
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-15 10:02:32 -07:00
Dylan Baker
9bd7a6f6f0 meson: require amdgpu >= 2.4.91
the meson equivalent of f8773edb0a

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-15 10:00:02 -07:00
Marek Olšák
f8773edb0a configure.ac: require libdrm_amdgpu 2.4.91
Since 2.4.90 is problematic, just ask for the next version.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-15 12:44:40 -04:00
Marek Olšák
5d0acff39e configure.ac: blacklist libdrm 2.4.90
Cc: 18.0 17.3 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-15 12:44:37 -04:00
Samuel Pitoiset
16ecf037f9 radv: dump LLVM IR when a hang is detected
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:07 +01:00
Samuel Pitoiset
81818662a5 radv: record LLVM IR when debugging shaders
If AMD_shader_info or RADV_TRACE_FILE is used we might need to
keep trace of LLVM IR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:03 +01:00
Samuel Pitoiset
d07edf5fdf radv: add dump_shader to the NIR compiler options
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:00 +01:00
Samuel Pitoiset
50fcca328c radv: pass the NIR compiler options to ac_compile_llvm_module()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:58 +01:00
Samuel Pitoiset
14c27c2511 radv: print some information when RADV_TRACE_FILE is set
Just to be sure all options are enabled when trying to generate
a hang report.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:54 +01:00
Samuel Pitoiset
5be2757c35 radv: only display options that are enabled
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:52 +01:00
Eric Engestrom
6332893594 mailmap: Use Eric Engestrom's personal email address
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-15 12:03:41 +00:00
Alejandro Piñeiro
50767214a7 spirv/radv: add AMD_gcn_shader capability, remove current extensions
So now, during spirv_to_nir, it uses the capability instead of the
extension. Note that we are really doing here is treating
SPV_AMD_gcn_shader as other supported extensions. SPV_AMD_gcn_shader
is not the first SPV extension supported. For example, the capability
draw_parameters infers if the extension SPV_KHR_shader_draw_parameters
is supported or not.

This could be seen as counter-intuitive, and that it would be easier
to define which extensions are supported, and based our checks on
that, but we need to take into account that some capabilities are
optional from core, and others came from new extensions.

Also this commit would make the implementation of ARB_spirv_extensions
easier.

v2: AMD_gcn_shader capability renamed to gcn_shader (Daniel Schürmann)

Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 12:08:25 +01:00
Samuel Iglesias Gonsálvez
adf58e59d3 spirv: update arguments for vtn_nir_alu_op_for_spirv_opcode()
We don't need anymore the source and destination's data type, just
their bitsize.

v2:
- Use glsl_get_bit_size () instead (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-15 08:56:15 +01:00
Samuel Iglesias Gonsálvez
ce2fd87056 spirv: fix the translation of SPIR-V conversion opcodes to NIR
There are some SPIRV opcodes (like UConvert and SConvert) have some
expectations of the output that doesn't depend on the operands
data type. Generalize the solution of all of them.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-15 08:51:01 +01:00
Mathias Fröhlich
98f35ad63c vbo: Correctly handle source arrays in vbo_split_copy.
The original approach did optimize away a bit too many fields.
Restablish the pointer into the original array and correctly feed that
one.

Reviewed-by: Brian Paul <brianp@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105471
Fixes: 64d2a20480
    mesa: Make gl_vertex_array contain pointers to first order VAO members.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-15 06:11:57 +01:00
Apple SWE
361f79c97f sched.h needs to be imported on Darwin/OSX targets.
sched_yield is used but the include reference on Darwin is missing. This patch
conditionally guards on Darwin/OSX to import sched.h first.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-14 22:08:34 -07:00
Apple SWE
67f27b1e18 Add processor topology calculation implementation for Darwin/OSX targets.
The implementation for bootstrapping SWR on Darwin targets is based on the Linux version.
Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the
physical identifiers, processor identifiers, core counts and thread-processor affinities.

With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and
llvmpipe.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-14 22:08:34 -07:00
Dave Airlie
4b15b5e803 virgl: resize resource bo allocation if we need to.
This fixes an illegal command buffer on the host seen with
piglit arb_internalformat_query2-max-dimensions

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-15 12:26:39 +10:00
Mario Kleiner
c1e47a3c1f nv50,nvc0: Support BGRX1010102 and RGBX1010102 for sampling.
Add them as usable for textures, so they can be used by
Wayland drm in 10 bpc mode and for X11 compositing under
GLX and EGL. We need these formats to be supported at
least for sampling, otherwise GLX_texture_from_pixmap
and the equivalent EGL image extension won't work with
X11 drawables of depth 30 and just display an all black
window.

Do not expose these formats as renderable, and thereby
not as a fbconfig/EGLConfig/Visual, as NVidia hw does
not support 10 bpc unorm formats without alpha channel.

Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing,
and under Wayland+Weston drm backend with a Tesla and
Pascal gpu.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-03-14 21:41:27 -04:00
Thomas Helland
03e37ec6d7 util: Use set_foreach instead of rolling our own
This follows the same pattern as in the hash_table.

Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>
2018-03-14 20:03:57 +01:00
Thomas Helland
5f129c05e6 glsl: Use hash table cloning in copy propagation
Walking the whole hash table, inserting entries by hashing them first
is just a really bad idea. We can simply memcpy the whole thing.

V2: Remove leftover creation of acp in two places

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 19:52:02 +01:00
Thomas Helland
6baaf4291b util: Implement a hash table cloning function
V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 19:52:01 +01:00
Guillaume Charifi
388ed47081 st/mesa: Factorize duplicate code in st_BlitFramebuffer()
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-14 14:46:51 -04:00
Dylan Baker
7dd261ac50 autotools: add -I/src/egl to tizonia
This fixes the following build breakage:

make[5]: Entering directory
'/mnt/sdc1/Gits/mesa/src/gallium/state_trackers/omx/tizonia'
   CC       h264dprc.lo
In file included from h264dprc.c:45:0:
../../../../../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error:
wayland/wayland-egl/wayland-egl-backend.h: No such file or directory
  #include "wayland/wayland-egl/wayland-egl-backend.h"
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

meson got the same fix in 7598dedfde.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 11:23:19 -07:00
Dylan Baker
848f2b6e31 Revert "Add processor topology calculation implementation for Darwin/OSX targets."
This reverts commit de0d10db93.

This breaks the build on at least Linux, probably other non-apple
platforms.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-14 09:30:17 -07:00
Dylan Baker
0f30c80932 Revert "sched.h needs to be imported on Darwin/OSX targets."
This reverts commit 9dc5063262.

This breaks the build on at least Linux, probably other non-apple
platforms.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-14 09:28:58 -07:00
Karol Herbst
b617bfcccf compiler: int8/uint8 support
OpenCL kernels also have int8/uint8.

v2: remove changes in nir_search as Jason posted a patch for that

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-03-14 10:08:42 -04:00
Alex Smith
fcf267ba08 radv: Fix CmdCopyImage between uncompressed and compressed images
From the spec:

    "When copying between compressed and uncompressed formats the
     extent members represent the texel dimensions of the source
     image and not the destination."

However, as per 7b890a36, we must still use the destination image type
when clamping the extent so that we copy the correct number of layers
for 2D to 3D copies.

Fixes: 7b890a36 "radv: Fix vkCmdCopyImage for 2d slices into 3d Images"
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-14 09:59:21 +00:00
Samuel Pitoiset
38f34117dd radv: fix vkGetDeviceQueue2() when create flags don't match
This fixes CTS:
dEQP-VK.api.device_init.create_device_queue2_unmatched_flags

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@gmail.com>
2018-03-14 09:53:42 +01:00
Neil Roberts
25a966a23d spirv: Handle doubles when multiplying a mat by a scalar
The code to handle mat multiplication by a scalar tries to pick either
imul or fmul depending on whether the matrix is float or integer.
However it was doing this by checking whether the base type is float.
This was making it choose the int path for doubles (and presumably
float16s).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:43:33 +01:00
Iago Toral Quiroga
1a0aba7216 anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands
af5f2322d0 addressed this for extension commands, but the spec mandates
this behavior also for core API commands. From the Vulkan spec,
Table 2. vkGetDeviceProcAddr behavior:

device     pname                            return
----------------------------------------------------------
(..)
device     core device-level command        fp
(...)

See that it specifically states "device-level".

Since the vk.xml file doesn't state if core commands are instance or
device level, we identify device level commands as the ones that take a
VkDevice, VkQueue or VkCommandBuffer as their first parameter.

Fixes test failures in new work-in-progress CTS tests.

Also see the public issue:
https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323

v2:
  - Include reference to github issue (Emil)
  - Rebased on top of Vulkan 1.1 changes.

v3:
  - Remove the not in the condition and switch the then/else cases (Jason)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:09:15 +01:00
Iago Toral Quiroga
a631575ff4 anv/entrypoints: dispatches to VkQueue are device-level
v2:
  - Add trampoline functions (Jason)
  - Add an assertion for unhandled trampoline cases

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:09:15 +01:00
Dave Airlie
3b0f2081b5 radv: drop assert on bindingDescriptorCount > 0
The spec is pretty clear that this can be 0, and that it operates
as a reserved binding.

Fixes:
dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 16:54:52 +10:00
Apple SWE
9dc5063262 sched.h needs to be imported on Darwin/OSX targets.
sched_yield is used but the include reference on Darwin is missing. This patch
conditionally guards on Darwin/OSX to import sched.h first.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-13 22:50:56 -07:00
Apple SWE
de0d10db93 Add processor topology calculation implementation for Darwin/OSX targets.
The implementation for bootstrapping SWR on Darwin targets is based on the Linux version.
Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the
physical identifiers, processor identifiers, core counts and thread-processor affinities.

With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and
llvmpipe.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-13 22:50:27 -07:00
Roland Scheidegger
274f8bf05e r600: fix abs for op3 sources
If a src was referencing the same temp as the dst, the per-component
copy code didn't work.
e.g.
  cndge r0.xy, r0.xx, |r2|, r3
got expanded into
  mov  r12.x, |r2|
  cndge r0.x, r0.x, r12, r3
  mov  r12.y, |r2|
  cndge r0.y, r0.x, r12, r3
hence for the second cndge r0.x was mistakenly the previous cndge result.
Fix this by doing all the movs first, so there's no bogus alu.last in between.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905

Tested-by: <iive@yahoo.com>
Reviewed-by: Dave Airlie <airlied@gmail.com>
2018-03-14 04:54:45 +01:00
Dave Airlie
27a5e5366e radv: mark all tess output for an indirect access.
If a shader does a tcs store with an indirect access, we
were only marking the first spot as used. For indirect access
we always now mark all slots used by the variable.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Dave Airlie
4f0c89d66c ac/nir: pass the nir variable through tcs loading.
I was going to have to add another parameter to this monster,
so we should just pass the nir_variable in, I can't find any
reason this would be a bad idea.

This needed for the next fix.

Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Dave Airlie
f9de2d409b radv: get correct offset into LDS for indexed vars.
This seems more correct to me, since if we have an array
of floats they'll be vec4 aligned, and if we do af[2],
we want the const index to increase by 2 slots in the non
compact case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Rob Clark
4e4428482e nir: lower_load_const_to_scalar fix for 8/16b types
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-13 20:17:04 -04:00
Dylan Baker
2aad12b2af Update the documentation for meson
Meson is pretty well tested and works in most configurations now, so we
can remove the warning about it being unsuited for actual use.

It's also worth documenting that meson 0.42.0 or greater is required.

v2: - Minor rewording of supported platforms as suggested by Emil
    - Add two missing tags as reported by xmllint --html

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2018-03-13 14:54:47 -07:00
Jason Ekstrand
85000b812d ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 13:25:27 -07:00
Jason Ekstrand
3d1d7e8561 nir/subgroups: Add lowering for vote_ieq/vote_feq to a ballot
This is based heavily on 97f10934ed, "ac/nir: Add vote_ieq/vote_feq
lowering pass." from Bas Nieuwenhuizen.  This version is a bit more
general since it's in common code.  It also properly handles NaN due to
not flipping the comparison for floats.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 13:25:15 -07:00
Dylan Baker
8247a30838 meson: don't use compiler.has_header
Meson's compiler.has_header is completely useless, it only checks that a
header exists, not whether it's usable. This creates problems if a
header contains a conditional #error declaration, like so:

> #if __x86_64__
> # error "Doesn't work with x86_64!"
> #endif

Compiler.has_header will return true in this case, even when compiling
for x86_64. This is useless.

Instead, we'll do a compile check so that any #error declarations will
be treated as errors, and compilation will work.

Fixes compilation on x32 architecture.

Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746
meson bug: https://github.com/mesonbuild/meson/issues/2246
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-13 11:41:10 -07:00
Jason Ekstrand
8379bff6c4 i965: Emit texture cache invalidates around blorp_copy
This is a terrible hack but it fixes CTS regressions.  It's still
incredibly unclear exactly what is going wrong in the hardware to cause
this to be an issue so this isn't a good fix by any means.  However, it
does fix tests so there is that.

Fixes: fb0e9b5197 "i965: Track the depth and render caches separately"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103746
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-13 11:24:40 -07:00
Eric Anholt
a326eedc75 brodacom/vc4: Fix simulator since the perfmon change.
It would be nice to support perfmon with simulator, and might be a useful
tool for regression testing performance (since the simulator would be
deterministic).
2018-03-13 10:32:58 -07:00
Eric Anholt
191bc7ce61 spirv: Silence compiler warning about undefined srcs[0]
v2: Use assume() at the srcs[] definition instead.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-13 10:32:55 -07:00
Samuel Pitoiset
7c83430672 ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:28 +01:00
Samuel Pitoiset
b128fd773f ac/nir: remove some unnecessary includes and declarations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:27 +01:00
Samuel Pitoiset
cd4e823341 ac/nir: drop radv prefix from radv_lower_gather4_integer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:25 +01:00
Samuel Pitoiset
fbe694562b ac/nir: move ac_nir_compiler_options and friends to radv folder
Also replace ac_ by radv_.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:23 +01:00
Samuel Pitoiset
237229430f ac: move ac_shader_info to radv folder
This is RADV specific code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:21 +01:00
Samuel Pitoiset
2cfba40eea ac/nir: move ac_shader_variant_info and friends to radv folder
Also replace ac_ by radv_.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:16 +01:00
Samuel Pitoiset
b2653007b9 ac/nir: move all RADV related code to radv_nir_to_llvm.c
Now the "ac/nir" prefix will really be the shared code between
RadeonSI and RADV, that might avoid confusions in the future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
8e15824b9d ac/nir: make emit_barrier() non-static
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
4e3117b718 ac/nir: move radeon_llvm_reg_index_soa() to ac_nir_to_llvm.h
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
3a30b89353 ac/nir: make handle_shader_output_decl() non-static
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
3fe47b1290 ac/nir: change prototype of handle_shader_output_decl()
This allows to remove the ac_nir_context dependency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
61a91ca3f5 ac/nir: move unpack_param() to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
28bb6873ec ac/nir: move trim_vector to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
895632baef ac/nir: move cast_ptr() to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
bf6368297b ac/nir: move ac_build_alloca() to ac_llvm_build.c
As well as si_build_alloca_undef() and drop the si prefix.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Timothy Arceri
370e356eba gallium: silence __builtin_frame_address nonzero argument is unsafe warning
Calling __builtin_frame_address with a nonzero argument is unsafe
but is sometimes done for debugging purposes. Since this code is
part of some debug util code I'm assuming that is the case here
and using GCC pragma to silence the warning.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-13 09:38:10 +11:00
Dylan Baker
b7c6870f87 meson: Add moduledir to d3d.pc
This is required to build wine with the nine patchset

Fixes: 6b4c7047d5
       ("meson: build gallium nine state_tracker")
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-12 13:52:38 -07:00
Mathias Fröhlich
a2f08dd574 gallium: Use struct gl_array_attributes* as st_pipe_vertex_format argument.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-12 18:24:31 +01:00
Ian Romanick
def0030e64 mesa: Don't write to user buffer in glGetTexParameterIuiv on error
With some sets of optimization flags, GCC will generate warnings like
this:

src/mesa/main/texparam.c:2327:27: warning: ‘*((void *)&ip+12)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             params[3] = ip[3];
                         ~~^~~
src/mesa/main/texparam.c:2320:16: note: ‘*((void *)&ip+12)’ was declared here
          GLint ip[4];
                ^~

ip is not initialized in cases where a GL error is generated.  In these
cases, we should *not* write to the user's buffer, so this is actually a
bug.  I wrote a new piglit test gl-3.0-texparameteri to show this bug.

I suspect that Coverity also detected this, but the scan site is
currently down.

Fixes: c2c507786 "main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv."
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-12 10:13:30 -07:00
Roman Gilg
f94597f554 gallium: work around libtool relink issue for libdrm
This is similar to commit 90633079. libtool links first to system directories
instead of custom locations of libdrm on relinking. Since a more recent libdrm
version than the one provided by the system is often needed when compiling
mesa, make sure this works by putting libdrm in front.

See also: https://bugs.freedesktop.org/show_bug.cgi?id=100259

Signed-off-by: Roman Gilg <subdiff@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-12 14:49:07 +00:00
Emil Velikov
678ba53240 vulkan: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
8151f5cad9 wayland-drm: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
1178e0cf49 egl: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
08189731a4 docs: document removal of GLX_SGIX_swap_{barrier,group} stubs
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
5ef608fab7 glx: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
2c765b0d9a gallium/x11: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
afab516f5f x11: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
742b8e3301 glx: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
447731348e gallium/x11: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:51 +00:00
Emil Velikov
1d2d519d78 x11: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:51 +00:00
Emil Velikov
f197f02e50 configure: remove unused AM_CONDITIONAL
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-12 14:48:51 +00:00
Bas Nieuwenhuizen
997306c031 radv: Increase the number of dynamic uniform buffers.
The vulkan API is not ideal as it does not allow us have a
shared limit.

Feral needs 15+6 for one of their games, and I'm not a fan
of overcommitting the limits, so increase the number of
dynamic uniform buffers to 16.

CC: <mesa-stable@lists.freedesktop.org>
CC: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-12 09:46:22 +01:00
Dave Airlie
e76cf1ff12 u_vbuf/translate: pass max_index into the set_buffer.
This fixes a memory trashing crash (not the test) seen with
dEQP-GLES3.stress.draw.unaligned_data.random.203
on virgl.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-12 11:57:13 +10:00
Dave Airlie
5d4fbc2b54 r600: implement callstack workaround for evergreen.
This is ported from the sb backend, there are some issues with
evergreen stacks on the boundary between entries and ALU_PUSH_BEFORE
instructions.

Whenever we are going to use a push before, we check the stack
usage and if we have to use the workaround, then we switch to
a separate push.

I noticed this problem dealing with some of the soft fp64 shaders,
in nosb mode, they are quite stack happy.

This fixes all the glitches and inconsistencies I've seen with them

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Elie Tournier <elie.tournier@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-12 11:11:44 +10:00
Marek Olšák
163a29099a gallium/util: add helper util_wait_for_idle
This is an old patch that I had.
2018-03-11 13:14:27 -04:00
Roland Scheidegger
0f0a6fa21d u_blit: (trivial) u_blit.h needs to include p_defines.h
(For the pipe_tex_filter enum)

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-10 20:09:04 +01:00
Christian Gmeiner
c9b153fea7 travis: bump libxcb version to 1.13
Fixes following dependency problem:
  Native dependency xcb-dri3 found: NO found '1.11' but need: '>= 1.13'

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
2018-03-10 16:55:36 +01:00
Mathias Fröhlich
64d2a20480 mesa: Make gl_vertex_array contain pointers to first order VAO members.
Instead of keeping a copy of the vertex array content in
struct gl_vertex_array only keep pointers to the first order
information originaly in the VAO.
For that represent the current values by struct gl_array_attributes
and struct gl_vertex_buffer_binding.

v2: Change comments.
    Remove gl... prefix from variables except in the i965 directory where
    it was like that before. Reindent because of that.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-10 07:33:51 +01:00
Roland Scheidegger
d62f0df354 draw: fix alpha value for very short aa lines
The logic would not work correctly for line lengths smaller than 1.0,
even a degenerated line with length 0 would still produce a fragment
with anyhwere between alpha 0.0 and 0.5.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-10 02:11:50 +01:00
Jordan Justen
24b415270f intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-09 16:15:58 -08:00
Jordan Justen
06e3bd02c0 i965: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
2018-03-09 16:15:34 -08:00
Marek Olšák
db495b8962 st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER
Tested by our OpenCL team.

Fixes: 9c499e6759 "st/mesa: don't invoke st_finalize_texture & st_convert_sampler for TBOs"

Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-09 16:33:31 -05:00
Marek Olšák
2bdb54bce7 radeonsi: add a workaround for GFX9 hang with init_config alignment
Fixes: 75c5d25f0f "radeonsi: align command buffer starting address to fix some Raven hangs"
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-03-09 16:28:29 -05:00
Marek Olšák
e99212e970 ac/gpu_info: print ib_start_alignment, add assertion 2018-03-09 16:28:29 -05:00
Greg V
e30a165be2 meson: Use system_has_kms_drm in default driver selection
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-09 10:02:44 -08:00
Eric Anholt
c57d5ea3bb broadcom/vc4: Add an accelerated path to turn raster R8/RG88 into tiled.
Drawing a 1080p YV12 video stream generated by MMAL goes from 10.5 FPS to
36.
2018-03-09 09:59:54 -08:00
Eric Anholt
cf170616da gallium: Add a util_blitter path for using a custom VS and FS.
Like the r600 paths to use other custom states, we pass in a couple of
parameters to customize the innards of the blitter.  It's up to the caller
to wrap other state necessary for its shaders (for example, constant
buffers for the uniforms the shader uses).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-09 09:59:54 -08:00
Eric Anholt
46a32e3d2e broadcom/vc4: Allow binding non-zero constant buffers.
We're going to use UBO loads for implementing YUV linear-to-T-format
blits.
2018-03-09 09:59:54 -08:00
Eric Anholt
2725ab2b12 broadcom: Remove our defines of DRM_FORMAT_MOD_INVALID.
The imported drm_fourcc.h handles it now.
2018-03-09 09:59:54 -08:00
Eric Anholt
a3a4c23dec broadcom: Suppress compiler warnings about enum pipe_tex_filter. 2018-03-09 09:59:54 -08:00
Louis-Francis Ratté-Boulianne
3160cb86aa egl/x11: Re-allocate buffers if format is suboptimal
If PresentCompleteNotify event says the pixmap was presented
with mode PresentCompleteModeSuboptimalCopy, it means the pixmap
could possibly have been flipped instead if allocated with a
different format/modifier.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne
069fdd5f9f egl/x11: Support DRI3 v1.1
Add support for DRI3 v1.1, which allows pixmaps to be backed by
multi-planar buffers, or those with format modifiers. This is both
for allocating render buffers, as well as EGLImage imports from a
native pixmap (EGL_NATIVE_PIXMAP_KHR).

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne
61309c2a72 vulkan/wsi/x11: Return VK_SUBOPTIMAL_KHR for X11
When it is detected that a window could have been flipped
but has been copied because of suboptimal format/modifier.
The Vulkan client should then re-create the swapchain.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:13 +00:00
Daniel Stone
c80c08e226 vulkan/wsi/x11: Add support for DRI3 v1.2
Adds support for multiple planes and buffer modifiers.

v4: Rename "has_dri3_v1_1" to "has_dri3_modifiers"
v12: Multi-planar/modifier support is now DRI3 v1.2; also update release
     versions
2018-03-09 17:47:13 +00:00
Dylan Baker
7258be91c5 autotools: include all meson.build files
Otherwise SWR cannot be built with meson from an autotools generated
tarball, such as the 18.0.0-rc4 tarball.

Fixes: 16bf813830 ("meson/swr: re-shuffle generated files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-09 08:15:04 -08:00
Michel Dänzer
2a4596a2f0 st/mesa: gl_program::info.system_values_read is a 64-bit-field
We were dropping the upper 32 bits, which caused assertion failures in
some compute shader piglit tests with radeonsi since the commit below.

Fixes: 752e969703 ("compiler: Add two new system values for subgroups")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-09 16:52:11 +01:00
George Kyriazis
379e00dc27 swr/rast: Refactor memory gather operations
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:42 -06:00
George Kyriazis
3f7ce10b3e swr/rast: Add KNOB_DISABLE_SPLIT_DRAW
This is useful for archrast data collection. This greatly speeds up the
post processing script since there is significantly less events generated.

Finally, this is a simpler option to communicate to users than having
them directly adjust MAX_PRIMS_PER_DRAW and MAX_TESS_PRIMS_PER_DRAW.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:30 -06:00
George Kyriazis
e0a4a25829 swr/rast: Add VPOPCNT
Supports popcnt on vector masks (e.g. <8 x i1>)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:23 -06:00
George Kyriazis
b56afe1a4f swr/rast: Add tracking for stream out topology
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:14 -06:00
George Kyriazis
2f6ae8cfcd swr/rast: Add split draw and other state information to DrawInfoEvent.
Removed specific split draw events.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:07 -06:00
George Kyriazis
714093203e swr/rast: Refactor api and worker event handlers.
In the API event handler we want to share information between the core
layer and the API. Specifically, around associating various ids with
different kinds of events. For example, associate render pass id with
draw ids, or command buffer ids with draw ids.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:59 -06:00
George Kyriazis
cfdd35beaf swr/rast: Add support for generalized late and early z/stencil stats
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:52 -06:00
George Kyriazis
9e25f298eb swr/rast: Rasterized Subspans stats support
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:47 -06:00
George Kyriazis
d78b28fc33 swr/rast: Added comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:34:55 -06:00
Eric Engestrom
e903a7b0bb vulkan/wsi: clean up cleanup path
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-09 13:25:44 +00:00
Bas Nieuwenhuizen
a793e7899f radv: Fix the autotools build take 2.
Forgot to remove a word....

Fixes: 04ffabf17a "radv: Fix autotools build."
2018-03-09 14:10:24 +01:00
Lucas Stach
1f55d06783 etnaviv: allow mixing different bit depths for color and depth surfaces
Vivante hardware supports this just fine. There is no reason why this shouldn't
be advertised as a valid combination.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-03-09 12:06:07 +01:00
Thierry Reding
6d4d46bca9 autotools: Add tegra to AM_DISTCHECK_CONFIGURE_FLAGS
This allows the driver to be built on a make distcheck and makes sure
that it properly builds when a distribution tarball is made.

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:22 +01:00
Thierry Reding
1755f608f5 tegra: Initial support
Tegra K1 and later use a GPU that can be driven by the Nouveau driver.
But the GPU is a pure render node and has no display engine, hence the
scanout needs to happen on the Tegra display hardware. The GPU and the
display engine each have a separate DRM device node exposed by the
kernel.

To make the setup appear as a single device, this driver instantiates
a Nouveau screen with each instance of a Tegra screen and forwards GPU
requests to the Nouveau screen. For purposes of scanout it will import
buffers created on the GPU into the display driver. Handles that
userspace requests are those of the display driver so that they can be
used to create framebuffers.

This has been tested with some GBM test programs, as well as kmscube and
weston. All of those run without modifications, but I'm sure there is a
lot that can be improved.

Some fixes contributed by Hector Martin <marcan@marcan.st>.

Changes in v2:
- duplicate file descriptor in winsys to avoid potential issues
- require nouveau when building the tegra driver
- check for nouveau driver name on render node
- remove unneeded dependency on libdrm_tegra
- remove zombie references to libudev
- add missing headers to C_SOURCES variable
- drop unneeded tegra/ prefix for includes
- open device files with O_CLOEXEC
- update copyrights

Changes in v3:
- properly unwrap resources in ->resource_copy_region()
- support vertex buffers passed by user pointer
- allocate custom stream and const uploader
- silence error message on pre-Tegra124
- support X without explicit PRIME

Changes in v4:
- ship Meson build files in distribution tarball
- drop duplicate driver_tegra dependency

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:22 +01:00
Thierry Reding
2052dbdae3 nouveau: Add framebuffer modifier support
This adds support for framebuffer modifiers to Nouveau. This will be
used by the Tegra driver to share metadata about the format of buffers
(such as the tiling mode or compression).

Changes in v2:
- remove unused parameters to nouveau_buffer_create()
- move format modifier query code to nvc0 backend
- restrict format modifiers to 2D textures
- implement ->query_dmabuf_modifiers()

Changes in v4:
- add UAPI include path on meson builds

Changes in v5:
- remove unnecessary includes

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:08 +01:00
Thierry Reding
b964cab80a nouveau/nvc0: Extract common tile mode macro
Add a new macro that can be used to extract the tiling mode from a
tile_mode value. This is will be used to determine the number of GOBs
used in block linear mode.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:47:54 +01:00
Thierry Reding
75bf489628 drm/tegra: Sanitize format modifiers
The existing format modifier definitions were merged prematurely, and
recent work has unveiled that the definitions are suboptimal in several
ways:

  - The format specifiers, except for one, are not Tegra specific, but
    the names don't reflect that.
  - The number space is split into two, reserving 32 bits for some
    "parameter" which most of the modifiers are not going to have.
  - Symbolic names for the modifiers are not using the standard
    DRM_FORMAT_MOD_* prefix, which makes them awkward to use.
  - The vendor prefix NV is somewhat ambiguous.

Fortunately, nobody's started using these modifiers, so we can still fix
the above issues. Do so by using the standard prefix. Also, remove TEGRA
from the name of those modifiers that exist on NVIDIA GPUs as well. In
case of the block linear modifiers, make the "parameter" smaller (4
bits, though only 6 values are valid) and don't let that leak into any
of the other modifiers.

Finally, also use the more canonical NVIDIA instead of the ambiguous NV
prefix.

This is based on commit 268892cb63a822315921a8dab48ac3e4abf7dd03 from
Linux v4.16-rc1.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:44:35 +01:00
Thierry Reding
ffc85cfac0 drm/fourcc: Fix fourcc_mod_code() definition
Avoid a compiler warnings when the val parameter is an expression.

This is based on commit 5843f4e02fbe86a59981e35adc6cabebee46fdc0 from
Linux v4.16-rc1.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:44:35 +01:00
Bas Nieuwenhuizen
04ffabf17a radv: Fix autotools build.
Forgot it again ....

Fixes: b6347807a9 "radv: Generate icd files."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-09 09:36:19 +01:00
Samuel Pitoiset
365850fd68 ac/nir: set number of channels for packed mrt exports
Bit 0 enables VSRC0 (R in low bits, G high) and bit 2 enables
VSRC1 (B in low bits, A high).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-09 09:28:20 +01:00
Bas Nieuwenhuizen
68201ab2da radv: Update version to 1.1.70.
Turns out they did not reset the patch number on release.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-09 07:53:39 +01:00
Bas Nieuwenhuizen
b6347807a9 radv: Generate icd files.
If the api version is too low, the loader clamps the application
requested version to the advertized version, which messes with
which extensions are enabled.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-09 07:53:39 +01:00
Ian Romanick
6878c9aabc nir: Don't i2b a value that is already Boolean
A bunch of shaders have sequences like:

    i2b(u2i(floatBitsToUint(intBitsToFloat(x == y ? -1 : 0))))

Other optimizations (and NIR's typeless nature) reduce this to

    i2b(x == y)

which is silly.

Skylake
total instructions in shared programs: 14498698 -> 14497948 (<.01%)
instructions in affected programs: 74480 -> 73730 (-1.01%)
helped: 277
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.71 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.45% x̃: 0.68%
95% mean confidence interval for instructions value: -3.35 -2.06
95% mean confidence interval for instructions %-change: -1.74% -1.16%
Instructions are helped.

total cycles in shared programs: 532015500 -> 531999238 (<.01%)
cycles in affected programs: 5943878 -> 5927616 (-0.27%)
helped: 251
HURT: 74
helped stats (abs) min: 1 max: 13149 x̄: 127.89 x̃: 14
helped stats (rel) min: 0.01% max: 17.31% x̄: 1.55% x̃: 0.53%
HURT stats (abs)   min: 1 max: 4550 x̄: 214.04 x̃: 15
HURT stats (rel)   min: <.01% max: 44.43% x̄: 2.81% x̃: 0.33%
95% mean confidence interval for cycles value: -158.51 58.43
95% mean confidence interval for cycles %-change: -1.07% -0.04%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4753 -> 4735 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

Haswell and Broadwell had simliar results. (Broadwell shown)
total instructions in shared programs: 14791877 -> 14791127 (<.01%)
instructions in affected programs: 77326 -> 76576 (-0.97%)
helped: 278
HURT: 1
helped stats (abs) min: 1 max: 32 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.42% x̃: 0.68%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49%
95% mean confidence interval for instructions value: -3.33 -2.05
95% mean confidence interval for instructions %-change: -1.70% -1.13%
Instructions are helped.

total cycles in shared programs: 558250067 -> 558252872 (<.01%)
cycles in affected programs: 5806328 -> 5809133 (0.05%)
helped: 235
HURT: 83
helped stats (abs) min: 1 max: 10630 x̄: 81.73 x̃: 16
helped stats (rel) min: 0.03% max: 18.58% x̄: 1.60% x̃: 0.51%
HURT stats (abs)   min: 1 max: 10590 x̄: 265.19 x̃: 20
HURT stats (rel)   min: <.01% max: 15.28% x̄: 1.89% x̃: 0.54%
95% mean confidence interval for cycles value: -89.87 107.51
95% mean confidence interval for cycles %-change: -1.06% -0.32%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4735 -> 4717 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

total fills in shared programs: 83111 -> 83110 (<.01%)
fills in affected programs: 28 -> 27 (-3.57%)
helped: 1
HURT: 0

Ivy Bridge
total instructions in shared programs: 11774173 -> 11773436 (<.01%)
instructions in affected programs: 70819 -> 70082 (-1.04%)
helped: 267
HURT: 0
helped stats (abs) min: 1 max: 48 x̄: 2.76 x̃: 2
helped stats (rel) min: 0.21% max: 19.51% x̄: 1.57% x̃: 0.63%
95% mean confidence interval for instructions value: -3.51 -2.01
95% mean confidence interval for instructions %-change: -1.94% -1.21%
Instructions are helped.

total cycles in shared programs: 257153833 -> 257148932 (<.01%)
cycles in affected programs: 585341 -> 580440 (-0.84%)
helped: 167
HURT: 100
helped stats (abs) min: 1 max: 1327 x̄: 44.89 x̃: 16
helped stats (rel) min: 0.04% max: 26.54% x̄: 2.41% x̃: 0.88%
HURT stats (abs)   min: 1 max: 200 x̄: 25.95 x̃: 16
HURT stats (rel)   min: 0.04% max: 9.81% x̄: 1.34% x̃: 0.65%
95% mean confidence interval for cycles value: -33.25 -3.46
95% mean confidence interval for cycles %-change: -1.47% -0.54%
Cycles are helped.

total loops in shared programs: 3416 -> 3398 (-0.53%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

LOST:   2
GAINED: 0

Sandy Bridge
total instructions in shared programs: 10499306 -> 10499094 (<.01%)
instructions in affected programs: 6051 -> 5839 (-3.50%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 4.93 x̃: 2
helped stats (rel) min: 0.39% max: 12.90% x̄: 4.29% x̃: 2.45%
95% mean confidence interval for instructions value: -7.66 -2.20
95% mean confidence interval for instructions %-change: -5.47% -3.12%
Instructions are helped.

total cycles in shared programs: 145862568 -> 145861370 (<.01%)
cycles in affected programs: 61733 -> 60535 (-1.94%)
helped: 36
HURT: 2
helped stats (abs) min: 16 max: 66 x̄: 36.61 x̃: 35
helped stats (rel) min: 0.45% max: 17.31% x̄: 4.92% x̃: 2.81%
HURT stats (abs)   min: 18 max: 102 x̄: 60.00 x̃: 60
HURT stats (rel)   min: 1.10% max: 1.85% x̄: 1.48% x̃: 1.48%
95% mean confidence interval for cycles value: -41.28 -21.77
95% mean confidence interval for cycles %-change: -6.16% -3.00%
Cycles are helped.

total loops in shared programs: 1803 -> 1785 (-1.00%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

LOST:   4
GAINED: 0

No changes on Iron Lake of GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-08 15:26:26 -08:00
Ian Romanick
1583f49eaa i965/vec4: Allow CSE on subset VF constant loads
v2: Rewrite the code that generates the VF mask.  Suggested by Ken.

No changes on other platforms.

Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 13059891 -> 13059884 (<.01%)
instructions in affected programs: 431 -> 424 (-1.62%)
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -3.39% -0.71%
Instructions are helped.

total cycles in shared programs: 409260032 -> 409260018 (<.01%)
cycles in affected programs: 4228 -> 4214 (-0.33%)
helped: 7
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28%
95% mean confidence interval for cycles value: -2.00 -2.00
95% mean confidence interval for cycles %-change: -1.15% 0.07%

Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-08 15:26:26 -08:00
Ian Romanick
360899d457 i965/vec4: Relax writemask condition in CSE
If the previously seen instruction generates more fields than the new
instruction, still allow CSE to happen.  This doesn't do much, but it
also enables a couple more shaders in the next patch.  It helped quite a
bit in another change series that I have (at least for now) abandoned.

v2: Add some extra comentary about the parameters to instructions_match.
Suggested by Ken.

No changes on Skylake, Broadwell, Iron Lake or GM45.

Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11780295 -> 11780294 (<.01%)
instructions in affected programs: 302 -> 301 (-0.33%)
helped: 1
HURT: 0

total cycles in shared programs: 257308315 -> 257308313 (<.01%)
cycles in affected programs: 2074 -> 2072 (-0.10%)
helped: 1
HURT: 0

Sandy Bridge
total instructions in shared programs: 10506687 -> 10506686 (<.01%)
instructions in affected programs: 335 -> 334 (-0.30%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-08 15:26:26 -08:00
Ian Romanick
52c7df1643 i965/fs: Merge CMP and SEL into CSEL on Gen8+
v2: Fix several problems handling inverted predicates.  Add a much
bigger comment around the BRW_CONDITIONAL_NZ case.

v3: Allow uniforms and shader inputs as sources for the original SEL and
CMP instructions.  This enables a LOT more shaders to receive CSEL
merging (5816 vs 8564 on SKL).

v4: Report progress.

Broadwell and Skylake had similar results. (Broadwell shown)
helped: 8527
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70%
95% mean confidence interval for instructions value: -2.51 -2.36
95% mean confidence interval for instructions %-change: -1.15% -1.10%
Instructions are helped.

total cycles in shared programs: 559442317 -> 558288357 (-0.21%)
cycles in affected programs: 372699860 -> 371545900 (-0.31%)
helped: 6748
HURT: 1450
helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12
helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70%
HURT stats (abs)   min: 1 max: 2538 x̄: 53.08 x̃: 14
HURT stats (rel)   min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90%
95% mean confidence interval for cycles value: -179.01 -102.51
95% mean confidence interval for cycles %-change: -2.37% -2.08%
Cycles are helped.

LOST:   0
GAINED: 6

No changes on earlier platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-08 15:26:26 -08:00
Kenneth Graunke
70de61594d i965/fs: Add infrastructure for generating CSEL instructions.
v2 (idr): Don't allow CSEL with a non-float src2.

v3 (idr): Add CSEL to fs_inst::flags_written.  Suggested by Matt.

v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt).  Don't
reset the access mode afterwards (suggested by Samuel and Matt).  Add
support for CSEL not modifying the flags to more places (requested by
Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-08 15:26:26 -08:00
Ian Romanick
54e8d2268d nir: Narrow some dot product operations
On vector platforms, this helps elide some constant loads.

v2: Reorder the transformations.

No changes on Broadwell or Skylake.

Haswell
total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
instructions in affected programs: 1277532 -> 1243902 (-2.63%)
helped: 13216
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.57 -2.49
95% mean confidence interval for instructions %-change: -3.65% -3.54%
Instructions are helped.

total cycles in shared programs: 409580819 -> 409268463 (-0.08%)
cycles in affected programs: 71730652 -> 71418296 (-0.44%)
helped: 9898
HURT: 2352
helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16
helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50%
HURT stats (abs)   min: 2 max: 276 x̄: 23.25 x̃: 6
HURT stats (rel)   min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97%
95% mean confidence interval for cycles value: -33.19 -17.80
95% mean confidence interval for cycles %-change: -4.50% -4.26%
Cycles are helped.

total fills in shared programs: 82059 -> 82052 (<.01%)
fills in affected programs: 21 -> 14 (-33.33%)
helped: 7
HURT: 0

Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown)
total instructions in shared programs: 11811851 -> 11780605 (-0.26%)
instructions in affected programs: 1155007 -> 1123761 (-2.71%)
helped: 12304
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.56 -2.48
95% mean confidence interval for instructions %-change: -3.71% -3.59%
Instructions are helped.

total cycles in shared programs: 257618409 -> 257316805 (-0.12%)
cycles in affected programs: 71999580 -> 71697976 (-0.42%)
helped: 9155
HURT: 2380
helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16
helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62%
HURT stats (abs)   min: 2 max: 290 x̄: 21.14 x̃: 4
HURT stats (rel)   min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33%
95% mean confidence interval for cycles value: -34.32 -17.97
95% mean confidence interval for cycles %-change: -4.55% -4.29%
Cycles are helped.

GM45 and Iron Lake had nearly identical results (Iron Lake shown)
total instructions in shared programs: 7886750 -> 7879944 (-0.09%)
instructions in affected programs: 373781 -> 366975 (-1.82%)
helped: 3715
HURT: 47
helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1
helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06%
HURT stats (abs)   min: 1 max: 6 x̄: 2.55 x̃: 2
HURT stats (rel)   min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35%
95% mean confidence interval for instructions value: -1.85 -1.77
95% mean confidence interval for instructions %-change: -2.91% -2.73%
Instructions are helped.

total cycles in shared programs: 178114636 -> 178095452 (-0.01%)
cycles in affected programs: 7227666 -> 7208482 (-0.27%)
helped: 3349
HURT: 301
helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4
helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63%
HURT stats (abs)   min: 2 max: 42 x̄: 9.13 x̃: 10
HURT stats (rel)   min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50%
95% mean confidence interval for cycles value: -5.52 -4.99
95% mean confidence interval for cycles %-change: -0.81% -0.73%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-03-08 15:26:26 -08:00
Lionel Landwerlin
d10a39ebe0 i965: perf: consolidate unmapping oa perf bo outside accumulation
Do this in one place outside the only caller of the accumulation
function.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:29 +00:00
Lionel Landwerlin
fb921a2870 i965: perf: count number of accumlated reports
This will be reused later.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:26 +00:00
Lionel Landwerlin
e4387faafb i965: perf: reuse timescale base function from query
We already have the same function in brw_queryobj.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:23 +00:00
Lionel Landwerlin
b71da26496 i965: perf: store sysfs device entry into context
We want to reuse it later on.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:21 +00:00
Lionel Landwerlin
5742b17da1 i965: perf: store the hw_id of the context in the query
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:18 +00:00
Lionel Landwerlin
80cd669a32 i965: perf: default case for unknown query types
Just some extra safety before further changes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:00 +00:00
Marek Olšák
9b7db12815 radeonsi: remove chip_class parameter from si_lower_nir
We can get it from si_screen.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
78ef16e2f9 winsys/amdgpu: query GDS info
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
a4a113b5bc winsys/amdgpu: pad compute IBs
v2: pad with PKT2 NOPs on SI

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
35cd86d4e9 radeonsi: expand constbuf 0 address correctly to fix Vega10 hangs
This is only required with the latest libdrm.

This fixes 32-bit support with high addresses.
(and possibly 64-bit support too because the high bits need to be masked out)

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
75c5d25f0f radeonsi: align command buffer starting address to fix some Raven hangs
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Christian Gmeiner
5b68a7297d etnaviv: add get_driver_query_group_info(..)
This enables AMD_performance_monitor extension.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-03-08 20:44:04 +01:00
Christian Gmeiner
3d912bd742 etnaviv: add query_group_info for sw counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-03-08 20:43:55 +01:00
Dylan Baker
1e9d779331 meson: Fix building gallium media libs without egl
v2: - rebase on omx fix

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2018-03-08 10:14:02 -08:00
Dylan Baker
f74cf04d3e meson: Allow building dri based EGL without GLX
It should be possible to build EGL without GLX, but the meson build
currently doesn't allow that because it too tightly couples glx and dri.
This patch eases dri and glx apart, so that EGL without GLX can be
built.

CC: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-03-08 09:12:24 -08:00
Thierry Reding
d41ee9ba5d glx/apple: Ship meson build file in tarball
The meson build file for Apple GLX is not listed in the EXTRA_DIST make
variable and therefore isn't shipped as part of the release tarball, so
meson builds from the tarball will fail.

Add the file to EXTRA_DIST to ensure it is included in the tarball.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-08 12:11:32 +01:00
Samuel Pitoiset
4e3c1ace65 ac/nir: do not emit unnecessary null exports in fragment shaders
Null exports should only be needed when no other exports are
emitted. This removes a bunch of 'exp null off, off, off, off done vm'.

Affected games are Dota 2 and Wolfenstein 2, not sure if that
really helps, but code size is decreasing there.

Polaris10:
Totals from affected shaders:
SGPRS: 8216 -> 8216 (0.00 %)
VGPRS: 7072 -> 7072 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 454968 -> 453896 (-0.24 %) bytes
Max Waves: 772 -> 772 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-08 11:56:05 +01:00
Eric Engestrom
19dd7f007e drirc: whitespace fix
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-08 09:53:34 +00:00
Thomas Hellstrom
93e58d5e17 drirc: Disable the GLX_SGI_video_sync extension for gnome-shell on vmware
With this extension enabled and a server GLX implementation that actually
honors it, Window movement lags considerably on gnome-shell/vmware, so
disable it by default.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Thomas Hellstrom
4ca9ad2bb2 gallium/st_dri: Honor the glx_disable_sgi_video_sync config option
This option is disabled by default. Primarily intended for drivers on
virtual hardware.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Thomas Hellstrom
f4070956d4 glx/dri: Add a driconf option to disable GLX_SGI_video_sync
Drivers on virtual hardware don't want to expose this extension to
GLX compositors, similarly to GLX_OML_sync_control, since that significantly
increases latency.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Timothy Arceri
0c90264da4 ac/radeonsi: add emit_kill to the abi
This should fix a regression with Rocket League grass rendering
on the NIR backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717
2018-03-08 11:28:37 +11:00
Timothy Arceri
50cc97d98a radeonsi: add si_llvm_emit_kill() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 11:28:37 +11:00
Timothy Arceri
f4b877631e spirv: fix autotools builds
Fixes: 68a6a3b51a "spirv: handle AMD_gcn_shader extended instructions"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-08 10:45:56 +11:00
Timothy Arceri
99cdc019bf ac: make use of if/loop build helpers
These helpers insert the basic block in the same order as they
appear in NIR making it easier to follow LLVM IR dumps. The helpers
also insert more useful labels onto the blocks.

TGSI use the line number of the corresponding opcode in the TGSI
dump as the label id, here we use the corresponding block index
from NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Timothy Arceri
6e1a142863 radeonsi: make use of if/loop build helpers in ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Timothy Arceri
42627dabb4 ac: add if/loop build helpers
These have been ported over from radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Daniel Schürmann
ffbf75cde4 radv: enable AMD_gcn_shader extension
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
18c7f1e041 ac: implement AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
68a6a3b51a spirv: handle AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
a1a2a8dfda nir: add AMD_gcn_shader extended instructions
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
39437025de spirv: import AMD extensions header from glslang
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Dylan Baker
cba104ebe3 meson: Fix indent in omx meson.build
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:54 -08:00
Dylan Baker
6f628951af meson: Use include directory variables instead of traversing
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Dylan Baker
34e852d5b5 meson: Re-add auto option for omx
This re-adds the auto option for omx, without it we default to tizonia
and the build fails almost immediately, this is especially obnoxious
those building a driver that doesn't support the OMX state tracker to
begin with.

v2: - Only define OMX_FOO for auto cases if the dependencies are found.
      This fixes building tizonia with auto (Julien, Eric)

CC: Gurkirpal Singh <gurkirpal204@gmail.com>
Fixes: bb5e27fab6
       ("st/omx/bellagio: Rename st and target directories")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com> (v1)
2018-03-07 13:30:53 -08:00
Dylan Baker
7598dedfde meson: fix tizonia compilation
It needs to have src/egl in it's includes as well.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Dylan Baker
2d3004ef1c meson: combine state trackers and target if blocks
This is needed later since tizonia requires dri

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Marek Olšák
55376cb31e st/mesa: expose 0 shader binary formats for compat profiles for Qt
Bugzilla: https://bugreports.qt.io/browse/QTBUG-66420
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105065
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2018-03-07 15:36:31 -05:00
Roland Scheidegger
8ba3750d3d draw: fix line stippling with aa lines
In contrast to non-aa, where stippling is based on either dx or dy
(depending on if it's a x or y major line), stippling is based on
actual distance with smooth lines, so adjust for this.

(It looks like there's some minor artifacts with mesa demos
line-sample and stippling, it looks like the line endpoints
aren't quite right with aa + stippling - maybe due to the
integer math in the stipple stage, but I can't quite pinpoint it.)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-07 21:29:00 +01:00
Roland Scheidegger
dbb2cf388b draw: simplify (and correct) aaline fallback (v2)
The motivation actually was to get rid of the additional tex
instruction, since that requires the draw fallback code to intercept
all sampler / view calls (even if the fallback is never hit).
Basically, the idea is to use coverage of the pixel to calculate
the alpha value, and coverage is simply based on the distance
to the center of the line (in both line direction, which is useful
for wide lines, as well as perpendicular to the line).
This is much closer to what hw supporting this natively actually does.
It also fixes an issue with line width not quite being correct, as
well as endpoints getting stretched too far (in line direction) with
wide lines, which is apparent with mesa demo line-sample.
(For llvmpipe, it would probably make sense to do something like this
directly when drawing lines, since rendering two tris is twice as
expensive as a line, but it would need some changes with state
management.)
Since we're no longer relying on mipmapping to get the alpha value,
we also don't need to draw 3 rects (6 tris), one is sufficient.

There's still issues (as before):
- quite sure it's not correct without half_pixel_center, but can't test
this with GL.
- aaline + line stipple is incorrect (evident with line-sample demo).
Looking at the spec the stipple pattern should actually be based on
distance (not just dx or dy for x/y major lines as without aa).
- outputs (other than pos + the one used for line aa) should be
reinterpolated since we actually increase line length by half a pixel
(but there's no tests which would care).

v2: simplify the math (should be equivalent), don't need immediate
v3: use float versions of atan2,cos,sin, minor cleanups

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-07 21:28:31 +01:00
Bas Nieuwenhuizen
034cce96b4 radv: Don't emit a warning on VI-GFX9.
We are conformant:

https://www.khronos.org/conformance/adopters/conformant-products#submission_308

v2: Actually not emit it on gfx9.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
04d65d2b76 radv: Enable vulkan 1.1.0 for configurations that can support it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
0168eaaa42 radv: Disable sampler ycbcr conversion.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
cce62f4065 radv: Expose that we don't support any VK_KHR_16_bit_storage parts.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
b99b9cc864 radv: Implement vkEnumerateInstanceVersion.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
5240fddb9d radv: Add trivial device group implementation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
84e877aa77 radv: Implement vkCmdDispatchBase.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
de5e25898c radv: Implement VkGetDeviceQueue2.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
b137e25277 radv: Support VkPhysicalDeviceProtectedMemoryFeatures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
4bcf4d1678 radv: Support VkPhysicalDeviceShaderDrawParameterFeatures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
41d958d073 radv: Implement VK_KHR_maintenance3.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
8f9af587a2 radv: Add minimal subgroup support.
Deliberately not implementing workgroup scopes as that is not needed
for core vulkan.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
89651fba9b radv: Change client version check.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen
5b3979704d radv: Update MAX_API_VERSION to 1.1.0
v2: Don't bump supported version.
v3: Update json files.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen
97f10934ed ac/nir: Add vote_ieq/vote_feq lowering pass.
The old vote_eq implementation supported only booleans, but now
we have to support arbitrary values, so use the read_first_invocation
intrinsic + ballot.

I took this as an opportunity to figure out how easy it was to do this
in nir instead of in the nir_to_llvm pass, and it actually turned out
pretty okay IMO. Only creating the pass is some extra code.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:32 +01:00
Jason Ekstrand
c217607b65 anv: Support version overrides
While always sketchy to do, this is useful for debugging.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
a1ee51309e vulkan/util: Add a helper to get a version override
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d6b65222df anv: Enable Vulkan 1.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
03c07ac548 anv: Add support for SPIR-V 1.3 subgroup operations
This requires us to bump the subgroup size to 32 for all shader stages
because Vulkan requires that to be a physical device query.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8b4a5e641b intel/fs: Add support for subgroup quad operations
NIR has code to lower these away for us but we can do significantly
better in many cases with register regioning and SIMD4x2.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
2292b20b29 intel/fs: Implement reduce and scan opeprations
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
4150920b95 intel/fs: Add a helper for emitting scan operations
This commit adds a helper to the builder for emitting "scan" operations.
Given a binary operation #, a scan takes the vector [a0, a1, ..., aN]
and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each
channel contains the combination of all previous channels.  The sequence
of instructions to perform the scan is fairly optimal; a 16-wide scan on
a 32-bit type is only 6 instructions.  The subgroup scan and reduction
operations will be implemented in terms of this.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
b0858c1cc6 intel/fs: Add a couple of simple helper opcodes
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
57bff0a546 spirv: Add support for subgroup arithmetic
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
789221dcfa nir: Add a helper for getting binop identities
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
82d493a939 nir: Add subgroup arithmetic reduction intrinsics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
b3a5b0f3fc spirv: Add subgroup quad support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
493a165544 nir: Add quad operations and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
90c9f29518 i965/fs: Add support for nir_intrinsic_shuffle
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8256ee3fa3 spirv: Add subgroup shuffle support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
149b92ccf2 nir: Add subgroup shuffle intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
7cfece820d i965/fs: Support nir_intrinsic_vote_feq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
0e893356fe nir/lower_subgroups: Add scalarizing for vote_eq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d792f3d4cd spirv: Add subgroup vote support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
44681e4795 nir: Generalize nir_intrinsic_vote_eq
The SPIR-V extension wants us to be able to do an AllEqual on any vector
or scalar type.  This has two implications:

 1) We need to be able to handle vectors so we switch the vote_eq
    intrinsics to be vectorized intrinsics.

 2) We need to handle floats which have different behavior with respect
    to +-0, NaN, etc. than the integer variant so we need two variants.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
9812fce60b spirv: Add subgroup ballot support
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
974daec495 i965/fs: Implement basic SPIR-V subgroup intrinsics
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
adc077797a spirv: Add initial subgroup support
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
5162a1d884 nir: Add new SPIR-V ballot intrinsics and lowering
Someone can make the lowering optional later if they want something
different for their hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
752e969703 compiler: Add two new system values for subgroups
This will be required for SPIR-V subgroup support

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
34c60ea02b nir: Add new SPIR-V ballot ALU intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
cc587ee9a7 spirv: Handle the new OpModuleProcessed instruction
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
59b0ea0c74 anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER
From the Vulkan 1.1 spec:

    "Vulkan 1.0 implementations were required to return
    VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0.
    Implementations that support Vulkan 1.1 or later must not return
    VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion."

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
cbab2d1da5 anv: Implement vkEnumerateInstanceVersion
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Iago Toral Quiroga
605fd7c0da anv/device: fail to initialize device if we have queues with unsupported flags
This is not strictly necessary since users should not be requesting any
flags that are not valid for the list of enabled features requested and
we already fail if they attempt to use an unsupported feature, however
it is an easy to implement sanity check that would help developes realize
that they are doing things wrong, so we might as well do it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-07 12:13:47 -08:00
Iago Toral Quiroga
b262f17b15 anv/device: GetDeviceQueue2 should only return queues with matching flags
From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure:

   "The queue returned by vkGetDeviceQueue2 must have the same flags value
    from this structure as that used at device creation time in a
    VkDeviceQueueCreateInfo instance. If no matching flags were specified
    at device creation time then pQueue will return VK_NULL_HANDLE."

For us this means no flags at all since we don't support any.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
9c8b40001d anv: Support querying for protected memory
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
773a51e772 anv: Implement GetDeviceQueue2
This belongs to the protected memory feature but there's nothing about
it that's specific to protected memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
68df93ecbc anv: Trivially implement VK_KHR_device_group
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
dfe18be09e anv: Implement vkCmdDispatchBase
This is part of the device groups extension/feature but it's a decent
chunk of work in its own right so it's worth breaking into its own
patch.  The mechanism we use is fairly straightforward: we just push the
base work group id into the shader and add it to the work group id we
get from dispatch.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
ff9db1a4cc nir/spirv: Add support for device groups
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
ddc4069122 anv: Implement VK_KHR_maintenance3
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
1deb7967c8 anv: Support VkPhysicalDeviceShaderDrawParameterFeatures
This advertises the VK_KHR_shader_draw_parameters functionality as a
"core optimal feature" in Vulkan 1.1.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
06719f9d4b anv/entrypoints: Drop support for protect attributes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
bd1279bd9f Get rid of a bunch of KHR suffixes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
af461986db anv: Add version 1.1.0 but leave it disabled
This requires us to rename any Vulkan API entrypoints which became core
in 1.1 to no longer have the KHR suffix.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
0128187335 spirv: Update the SPIR-V headers and json to 1.3.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
205c271562 vulkan: Update the XML and headers to 1.1.70
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
7fb86fb511 vulkan/enum_to_str: Add support for aliases and new Vulkan versions
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
539a0aec45 vulkan/enum_to_str: Add a add_value_from_xml helper to VkEnum
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
eb23ca069f anv/entrypoints: Generate #ifdef guards from platform attributes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
05fc377f2e anv/extensions: Add support for multiple API versions
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8efa173ed2 anv/entrypoints_gen: Add support for aliases in the XML
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
39d9fcea13 anv/entrypoints: Allow an entrypoint to require multiple extensions
In this case, we say an entrypoint is supported if ANY of the extensions
is supported.  This is because, in the XML, entrypoints don't require
extensions so much as extensions require entrypoints.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8e8f167c72 anv/entrypoints: Add an is_device_entrypoint helper
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
54b3493fc0 anv/entrypoints_gen: Allow the string map to grow
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d91da06df5 anv/entrypoints_gen: A bit of refactoring
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
a4ca4c99ba anv/entrypoints: Generalize the string map a bit
The original string map assumed that the mapping from strings to
entrypoints was a bijection.  This will not be true the moment we
add entrypoint aliasing.  This reworks things to be an arbitrary map
from strings to non-negative signed integers.  The old one also had a
potential bug if we ever had a hash collision because it didn't do the
strcmp inside the lookup loop.  While we're at it, we break things out
into a helpful class.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
3960d0e332 vulkan: Rename multiview from KHX to KHR
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
68af9f04a4 spirv: Rework barriers
Our previous handling of barriers always used the big hammer and didn't
correctly emit memory barriers when specified along with a control
barrier.  This commit completely reworks the way we emit barriers to
make things both more precise and more correct.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
de518f38e5 spirv: Add a vtn_constant_value helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Marek Olšák
9779f34326 radeonsi: remove si_llvm_add_attribute 2018-03-07 13:55:49 -05:00
Marek Olšák
2c3f3651c4 radeonsi: fix passing address32_hi to LLVM for high values
The old function treats high values as negative, which LLVM interprets as 0.
2018-03-07 13:55:49 -05:00
Marek Olšák
b3b6b00ac8 radeonsi: assume has_virtual_memory == true 2018-03-07 13:55:48 -05:00
Marek Olšák
53db2790c0 radeonsi: add/update assertions for 32-bit address space 2018-03-07 13:55:47 -05:00
Marek Olšák
16856a1ee8 radeonsi: prevent a negative buffer offset in si_upload_descriptors 2018-03-07 13:55:42 -05:00
Marek Olšák
9b55498059 radeonsi: properly extract a buffer address from a descriptor 2018-03-07 13:55:40 -05:00
Marek Olšák
2a47660754 radeonsi: fix vertex buffer address computation with full 64-bit addresses 2018-03-07 13:55:38 -05:00
Marek Olšák
2e30268877 radeonsi: mask out high VM address bits in registers where needed 2018-03-07 13:55:35 -05:00
Bas Nieuwenhuizen
94c9096c83 radv: Add entrypoints generation with the new vk.xml
A lot of it is based on intel again.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 15:50:19 +01:00
Simon Hausmann
fb5825e7ce glsl: Fix memory leak with known glsl_type instances
When looking up known glsl_type instances in the various hash tables, we
end up leaking the key instances used for the lookup, as the glsl_type
constructor allocates memory on the global mem_ctx. This patch changes
glsl_type to manage its own memory, which fixes the leak and also allows
getting rid of the global mem_ctx and its mutex.

v2: remove lambda usage (Tapani)
    (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Simon Hausmann <simon.hausmann@qt.io>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-07 14:33:34 +02:00
Caio Marcelo de Oliveira Filho
c17808562e spirv: Add SpvCapabilityShaderViewportIndexLayerEXT
This capability allows gl_ViewportIndex and gl_Layer to also be used
as outputs in Vertex and Tesselation shaders.

v2: Make conditional to the capability, add gl_Layer, add tesselation
    shaders. (Iago)

v3: Don't export to tesselation control shader.

v4: Add Reviewd-by tag.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 07:04:20 +01:00
Mauro Rossi
487f8d48c9 android: anv: add libmesa_intel_dev static dependency
Fixes the following building errors:

external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override'
external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name'
external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 272bef0601 "intel: Split gen_device_info out into libintel_dev"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-07 07:55:34 +02:00
Timothy Arceri
1fdb21541e Revert "nir: bump loop unroll limit to 96."
This reverts commit 2d36efdb7f.

This raised limit turns out to harmful for more complex shaders,
it causes excessive spilling in some Bioshock Infinite shaders.

The fps for the ssao demo on radv remains unchanged when reverting
this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-07 15:10:05 +11:00
Dave Airlie
fb077b0728 ac/nir: don't put lod into args if it's zero.
If it's zero but put it in args we still end up consuming a
register for it.

This fixes some spilling in the NIR paths in Dirt Rally that
isn't seen with TGSI.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-07 03:34:59 +00:00
Christian Gmeiner
38e91e2b81 freedreno: bump required libdrm version
Fixes: 26a9321d0a "freedreno: add global_bindings state"

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-06 21:52:59 +01:00
Ian Romanick
e3ea166a2c nir: Simplify some comparisons like a+b < a
All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14514555 -> 14514547 (<.01%)
instructions in affected programs: 1972 -> 1964 (-0.41%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.41% -0.40%
Instructions are helped.

total cycles in shared programs: 533141444 -> 533136780 (<.01%)
cycles in affected programs: 164728 -> 160064 (-2.83%)
helped: 181
HURT: 3
helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30
helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80%
HURT stats (abs)   min: 4 max: 54 x̄: 24.00 x̃: 14
HURT stats (rel)   min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68%
95% mean confidence interval for cycles value: -27.12 -23.58
95% mean confidence interval for cycles %-change: -3.54% -3.16%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533667 -> 10533539 (<.01%)
instructions in affected programs: 10148 -> 10020 (-1.26%)
helped: 124
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1
helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04%
95% mean confidence interval for instructions value: -1.06 -1.00
95% mean confidence interval for instructions %-change: -2.46% -1.95%
Instructions are helped.

total cycles in shared programs: 146136887 -> 146132122 (<.01%)
cycles in affected programs: 206382 -> 201617 (-2.31%)
helped: 171
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30
helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67%
95% mean confidence interval for cycles value: -29.19 -26.54
95% mean confidence interval for cycles %-change: -3.20% -2.76%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886515 -> 7886507 (<.01%)
instructions in affected programs: 3016 -> 3008 (-0.27%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.27% -0.26%
Instructions are helped.

total cycles in shared programs: 178100396 -> 178100388 (<.01%)
cycles in affected programs: 156128 -> 156120 (<.01%)
helped: 4
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -3.68 1.68
95% mean confidence interval for cycles %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857872 -> 4857868 (<.01%)
instructions in affected programs: 1544 -> 1540 (-0.26%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.28% -0.24%
Instructions are helped.

total cycles in shared programs: 122167654 -> 122167662 (<.01%)
cycles in affected programs: 96248 -> 96256 (<.01%)
helped: 0
HURT: 4
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: <.01% 0.02%
Cycles are HURT.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:30 -08:00
Ian Romanick
d1ed4ffe0b nir: Use De Morgan's Law on logic compounded comparisons
The replacement of the comparison operators must happen during this
step.  If it does not, the next pass of nir_opt_algebraic will reapply
De Morgan's Law in the "opposite direction" before performing dead code
elimination.  The resulting infinite loop will eventually get OOM
killed.

Haswell, Broadwell, and Skylake had similar results. (Broadwell shown)
total instructions in shared programs: 14808185 -> 14808036 (<.01%)
instructions in affected programs: 13758 -> 13609 (-1.08%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3
helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01%
95% mean confidence interval for instructions value: -4.67 -2.97
95% mean confidence interval for instructions %-change: -1.09% -0.88%
Instructions are helped.

total cycles in shared programs: 559438333 -> 559435832 (<.01%)
cycles in affected programs: 199160 -> 196659 (-1.26%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51
helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40%
HURT stats (abs)   min: 2 max: 40 x̄: 27.33 x̃: 40
HURT stats (rel)   min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74%
95% mean confidence interval for cycles value: -71.47 -39.69
95% mean confidence interval for cycles %-change: -1.64% -0.93%
Cycles are helped.

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11811776 -> 11811553 (<.01%)
instructions in affected programs: 15201 -> 14978 (-1.47%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6
helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26%
95% mean confidence interval for instructions value: -7.21 -4.23
95% mean confidence interval for instructions %-change: -1.48% -1.12%
Instructions are helped.

total cycles in shared programs: 257617270 -> 257614589 (<.01%)
cycles in affected programs: 212107 -> 209426 (-1.26%)
helped: 45
HURT: 0
helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54
helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32%
95% mean confidence interval for cycles value: -74.02 -45.14
95% mean confidence interval for cycles %-change: -1.59% -1.01%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886648 -> 7886515 (<.01%)
instructions in affected programs: 14106 -> 13973 (-0.94%)
helped: 29
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4
helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81%
95% mean confidence interval for instructions value: -5.65 -3.52
95% mean confidence interval for instructions %-change: -1.03% -0.76%
Instructions are helped.

total cycles in shared programs: 178100812 -> 178100396 (<.01%)
cycles in affected programs: 67970 -> 67554 (-0.61%)
helped: 29
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54%
95% mean confidence interval for cycles value: -18.30 -10.39
95% mean confidence interval for cycles %-change: -0.71% -0.45%
Cycles are helped.

GM45
total instructions in shared programs: 4857939 -> 4857872 (<.01%)
instructions in affected programs: 7426 -> 7359 (-0.90%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4
helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77%
95% mean confidence interval for instructions value: -6.06 -2.87
95% mean confidence interval for instructions %-change: -1.06% -0.67%
Instructions are helped.

total cycles in shared programs: 122167930 -> 122167654 (<.01%)
cycles in affected programs: 43118 -> 42842 (-0.64%)
helped: 15
HURT: 0
helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54%
95% mean confidence interval for cycles value: -25.03 -11.77
95% mean confidence interval for cycles %-change: -0.82% -0.41%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
52607658ff nir: Replace fmin(b2f(a), b) with a bcsel
All of the affected shaders are HDR mappers from Serious Sam 3.

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516285 -> 14516273 (<.01%)
instructions in affected programs: 348 -> 336 (-3.45%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -5.55% -3.06%
Instructions are helped.

total cycles in shared programs: 533163876 -> 533163808 (<.01%)
cycles in affected programs: 1144 -> 1076 (-5.94%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94%
95% mean confidence interval for cycles value: -18.84 -15.16
95% mean confidence interval for cycles %-change: -6.20% -5.68%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533321 -> 10533309 (<.01%)
instructions in affected programs: 372 -> 360 (-3.23%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -4.96% -2.86%
Instructions are helped.

total cycles in shared programs: 146136632 -> 146136428 (<.01%)
cycles in affected programs: 11668 -> 11464 (-1.75%)
helped: 12
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29%
95% mean confidence interval for cycles value: -17.66 -16.34
95% mean confidence interval for cycles %-change: -2.82% -1.58%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886301 -> 7886277 (<.01%)
instructions in affected programs: 576 -> 552 (-4.17%)
helped: 12
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.30% -3.72%
Instructions are helped.

total cycles in shared programs: 178113176 -> 178113176 (0.00%)
cycles in affected programs: 2116 -> 2116 (0.00%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for cycles value: -3.25 3.25
95% mean confidence interval for cycles %-change: -0.93% 0.94%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857756 -> 4857744 (<.01%)
instructions in affected programs: 294 -> 282 (-4.08%)
helped: 6
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.71% -3.09%
Instructions are helped.

total cycles in shared programs: 122178730 -> 122178722 (<.01%)
cycles in affected programs: 700 -> 692 (-1.14%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
b974dfee11 nir: Pull b2f out of bcsel
All platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516592 -> 14516586 (<.01%)
instructions in affected programs: 500 -> 494 (-1.20%)
helped: 2
HURT: 0

total cycles in shared programs: 533167044 -> 533166998 (<.01%)
cycles in affected programs: 6988 -> 6942 (-0.66%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
f50400cc80 nir: Replace an odd comparison involving fmin of -b2f
I noticed the fge version while looking at a shader for an unrelated
reason.  The feq version prevents a regression in a later change that
performs strength reduction of some compares.

Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514808 -> 14514796 (<.01%)
instructions in affected programs: 750 -> 738 (-1.60%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.43% -0.36%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533144939 -> 533144853 (<.01%)
cycles in affected programs: 8911 -> 8825 (-0.97%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19
helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31%
95% mean confidence interval for cycles value: -32.94 -10.06
95% mean confidence interval for cycles %-change: -2.30% -0.26%
Cycles are helped.

Haswell
total instructions in shared programs: 13093785 -> 13093775 (<.01%)
instructions in affected programs: 924 -> 914 (-1.08%)
helped: 4
HURT: 2
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.53 1.20
95% mean confidence interval for instructions %-change: -2.02% 0.97%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 409580553 -> 409580118 (<.01%)
cycles in affected programs: 10909 -> 10474 (-3.99%)
helped: 5
HURT: 1
helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18
helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78%
HURT stats (abs)   min: 13 max: 13 x̄: 13.00 x̃: 13
HURT stats (rel)   min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39%
95% mean confidence interval for cycles value: -180.68 35.68
95% mean confidence interval for cycles %-change: -19.55% 3.79%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 11811851 -> 11811840 (<.01%)
instructions in affected programs: 1032 -> 1021 (-1.07%)
helped: 5
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1
helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.17 0.51
95% mean confidence interval for instructions %-change: -1.86% 0.36%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 257618403 -> 257618168 (<.01%)
cycles in affected programs: 10784 -> 10549 (-2.18%)
helped: 4
HURT: 2
helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17
helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72%
HURT stats (abs)   min: 9 max: 14 x̄: 11.50 x̃: 11
HURT stats (rel)   min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33%
95% mean confidence interval for cycles value: -133.11 54.78
95% mean confidence interval for cycles %-change: -14.79% 5.59%
Inconclusive result (value mean confidence interval includes 0).

GM45, Iron Lake, and Sandy Bridge had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10533871 -> 10533859 (<.01%)
instructions in affected programs: 865 -> 853 (-1.39%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.16% -0.29%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 146139904 -> 146139852 (<.01%)
cycles in affected programs: 15213 -> 15161 (-0.34%)
helped: 4
HURT: 0
helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15
helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29%
95% mean confidence interval for cycles value: -23.79 -2.21
95% mean confidence interval for cycles %-change: -0.88% 0.09%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
380136e998 nir: Mark bcsel-to-fmin (or fmax) transformations as inexact
These transformations are inexact because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 11:17:14 -08:00
Ian Romanick
4addd34b04 nir: Recognize some more open-coded fmin / fmax
This transformation is inexact because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.

v2: Reorder operands and mark as inexact.  The latter suggested by
Jason.

shader-db results:

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514817 -> 14514808 (<.01%)
instructions in affected programs: 229 -> 220 (-3.93%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%

total cycles in shared programs: 533145211 -> 533144939 (<.01%)
cycles in affected programs: 37268 -> 36996 (-0.73%)
helped: 8
HURT: 0
helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total cycles in shared programs: 257618409 -> 257618403 (<.01%)
cycles in affected programs: 12582 -> 12576 (-0.05%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 11:17:14 -08:00
Gurkirpal Singh
c62cf1f165 st/omx/tizonia/h264d: Add EGLImage support
Example Gstreamer pipeline :
MESA_ENABLE_OMX_EGLIMAGE=1 GST_GL_API=gles2 GST_GL_PLATFORM=egl gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! glimagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 17:21:11 +00:00
Gurkirpal Singh
b2f2236dc5 st/omx/tizonia: Add H.264 encoder
v2: Refactor out screen functions to st/omx

Example Gstreamer pipeline :
gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videoconvert ! omxh264enc ! h264parse ! avdec_h264 ! videoconvert ! ximagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 17:20:08 +00:00
Gurkirpal Singh
83d4a5d5ae st/omx/tizonia: Add H.264 decoder
v2: Refactor out screen functions to st/omx

Example Gstreamer pipeline :
gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! videoconvert ! ximagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
430ccdbcb9 st/omx/tizonia: Add entrypoint
Adds base files for adding components

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
e2afa154e9 st/omx/tizonia: Add --enable-omx-tizonia flag and build files
Allow only bellagio or tizonia to be used at the same time.
Detect tizonia package config file
Generate libomx_mesa.so and install it to libtizcore.pc::pluginsdir
Only compile empty source (target.c) for now.

GSoC Project link: https://summerofcode.withgoogle.com/projects/#4737166321123328

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
bb5e27fab6 st/omx/bellagio: Rename st and target directories
v2: Refactor out screen functions to st/omx

Allows to keep all the code under st/omx (st/omx/tizonia and
st/omx/bellagio).
Reverts targets/omx_bellagio to omx as additions to existing files
is enough to compile for both bellagio and tizonia.

* autotools changes:
  --enable-omx -> --enable-omx-bellagio

* meson changes:
  -Dgallium-omx=false -> -Dgallium-omx=disabled
  -Dgallium-omx=true  -> -Dgallium-omx=bellagio

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 13:07:03 +00:00
Samuel Pitoiset
e96e6f60f7 radv: report the scratch private memory size with shader stats
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:42 +01:00
Samuel Pitoiset
7f6b91c9c3 ac/nir: count the scratch private memory size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:40 +01:00
Samuel Pitoiset
3b8e7459f2 ac: add ac_count_scratch_private_memory()
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:38 +01:00
Samuel Pitoiset
f3275ca01c ac/nir: only enable used channels when exporting parameters
This allows us to generate, for example,
"exp param0 v0, off, off, off" if only the first channel is needed.

Not sure if this improves performance but it's worth trying.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:35 +01:00
Samuel Pitoiset
675dde13b2 ac: update enabled channels mask when optimizing PARAM exports
When the mask is not 0xf we need to update the number of
enabled channels, otherwise the hardware won't emit the
components that are combined.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:52 +01:00
Samuel Pitoiset
c24abae9dc ac/nir: pass the number of enabled channels to si_llvm_init_export_args()
Currently, it's always 0xf but an upcoming patch will reduce the
number of channels for parameters export.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:50 +01:00
Samuel Pitoiset
5cd34f03c0 ac/shader: scan output usage mask for VS and TES
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:47 +01:00
Clayton Craft
d1fa30e0f8 intel: Add missing includes for building on Android
This adds a missing library to the i965/Android.mk file, and updates
intel/Android.mk to include the new library. Without this, mesa does not
build on Android.

Fixes: 272bef0601 "intel: Split gen_device_info out into
libintel_dev"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-06 00:14:22 -08:00
Tapani Pälli
237c9caa78 vulkan: do not expose surface/swapchain extensions on Android
On Android surface/swapchain extensions are implemented by the loader. Patch
modifies both anv and radv extension scripts disabling currently exposed
ones. See also earlier commit 9f763c1f9b.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 08:02:59 +02:00
Tapani Pälli
85518657a9 anv: Don't expose VK_KHX_multiview on android.
Just like commit 2ffe395 does for radv.

Fixes following dEQP test on i965:
   dEQP-VK.api.info.android.no_unknown_extensions

v2: make it !ANDROID since this extension is not about
    surfaces/swapchain

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-06 08:01:20 +02:00
Roland Scheidegger
cf4a92fda2 gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128
Some state trackers require 128.
(There are no plans to increase PIPE_MAX_SAMPLERS too, since with gl
state tracker it's unlikely more than 32 will be needed, if you need
more use bindless.)
2018-03-06 05:18:17 +01:00
Roland Scheidegger
06e724c7b4 tgsi/scan: use wrap-around shift behavior explicitly for file_mask
The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
noone uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
Albeit it feels a bit hacky (but in any case, explicit behavior there
is better than undefined behavior).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-06 05:18:17 +01:00
Aaron Watry
95ae6c0355 clover: Allow overriding platform/device version numbers
Useful for testing API, builtin library, and device completeness of
not-yet-supported versions.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>

v4: Remove redundant std::string wrapper around debug_get_option calls
v3: mark CL version overrides as static and const
v2: Make version_string in platform const in case
2018-03-05 20:09:46 -06:00
Aaron Watry
106020712f clover/llvm: Pass device down to compile
We'll need to be able to detect device version to define the appropriate
__OPENCL_VERSION__ header.

v2: Rebase after removing the previous patch (Pierre)
  - Removed "clover: Add device_clc_version to llvm::create_compiler_instance"

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-05 20:09:46 -06:00
Aaron Watry
fc629e3594 clover: Pass device to llvm::create_compiler_instance
We'll be using dev.device_clc_version to select the default language version
soon along with the existing ir_target field.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

v4: Pass the device down instead of device_clc_version as a separate field
v3: Revise to acknowledge that we now have the device in compile/link_program
    instead of the string values.
v2: (Pierre) Move changes to create_compiler_instance invocation to correct
    patch to prevent temporary build breakage.
    (Jan) Use device_clc_version instead of device_version for compile/link
2018-03-05 20:09:46 -06:00
Aaron Watry
dd81ca3883 clover/llvm: Use device in llvm compilation instead of copying fields
Copying the individual fields from the device when compiling/linking
will lead to an unnecessarily large number of fields getting passed
around.

v3: Rebase on current master
v2: Use device in function args before making additional changes in
    following patches

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-05 20:09:46 -06:00
Timothy Arceri
71b3d681d8 radeonsi/nir: fix handling of doubles for gs inputs
Fixes piglit test:
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Timothy Arceri
20bd0f6a2b ac: pass the unmodified number of components to load gs inputs
Currently both users of this would overflow an array when the
input was a dual slot double as they expected the number of
components to be a max of 4.

Since we pass the type we can just let the functions handle
doubles in a way they choose.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Timothy Arceri
2a68c6c6c8 radeonsi: move si_nir_load_input_gs() to si_shader.c
All the tess shader and tgsi equivalents are here and it allows
use to use llvm_type_is_64bit() in the following patch without
exposing it externally.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Boris Brezillon
9ea90ffb98 broadcom/vc4: Add support for HW perfmon
The V3D engine provides several perf counters.
Implement ->get_driver_query_[group_]info() so that these counters are
exposed through the GL_AMD_performance_monitor extension.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-05 15:54:04 -08:00
Boris Brezillon
5924379a58 drm-uapi: Update vc4 header with perfmon related definitions
v2: Update to the final version with the documentation.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-05 15:53:48 -08:00
Roland Scheidegger
434523cf2a r600: fix color export mask
The r600 code (not the eg one) forgot to copy the ps_color_export_mask
in commit 5b14e06d8b when updating the
pixel state, leading to misrenderings (probably with MRT).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262

Tested-by: LoneVVolf <lonewolf@xs4all.nl>
Tested-by: Pavel Vinogradov <public@sourcemage.org>
2018-03-05 20:15:05 +01:00
Andres Gomez
72552012c7 travis: keep meson version below 0.45.0
Recently Meson upgraded to 0.45.0 and it needs python 3.5+, which is
not available in Trusty.

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-05 21:12:37 +02:00
Kenneth Graunke
0472aa3efe intel: Drop SURFACE_FORMAT enum from genxml.
We want people to be using ISL_FORMAT_*, rather than the genxml format
enumerations. This patch drops 10 separate copies, and drops a bunch
of ugly casting.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Minor changes for rebase]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:08 -08:00
Jordan Justen
755e7e6c20 intel/common: Use isl for decoder surface formats
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:04 -08:00
Jordan Justen
bd3392423d intel/isl: Add isl_format_is_valid
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:01 -08:00
Jordan Justen
272bef0601 intel: Split gen_device_info out into libintel_dev
Split out the device info so isl doesn't depend on intel/common. Now
it will depend on the new intel/dev device info lib.

This will allow the decoder in intel/common to use isl, allowing us to
apply Ken's patch that removes the genxml duplication of surface
formats.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:47:37 -08:00
Gert Wollny
9a0d7bb48c gallium/aux/hud: Avoid possible buffer overflow
Limit the length of acceptable cpu names for use in hud_get_num_cpufreq
in order to avoid a buffer overflow later in add_object when this name
is copied into cpufreq_info::name.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-05 11:38:28 -05:00
Eric Engestrom
b98c905a46 gbm: give a name to rgba fields
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-05 15:14:36 +00:00
Andres Gomez
40abffb295 egl: remove duplicated initialization
Found by inspection.

The line removed is a duplicate of the line literally just above the
the 3 lines context usually printed in a commit log.

v2: enhance the commit log (Emil).

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-05 15:55:53 +02:00
Rob Clark
5a5a43078c freedreno/ir3: start dealing with half-precision
Some instructions, assume src and/or dst is half-precision based on a
type field (ie. f32/s32/u32 are full precision but others are half
precision).  So add some code to sanity check the src/dst registers to
catch mixups.

Also propagate half-precision flag for SSA sources.  The instruction
consuming a SSA value needs to be of the same type as the one producing
it.

This is probably not complete half-precision support, but a useful first
step.  We do still need to add support for nir alu instructions for
converting between half/full precision.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
175d1b4372 freedreno/ir3: fix fixing-up register footprint
It isn't just vertex shaders that need to fixup reg footprint for inputs
populated before shader starts.

This problem showed up with compute shaders.  If you have (for example)
a localregid sysval, but only the .x component is used, the hw still
writes the .yz components, which could overflow into other threads
causing corruption.  Showed up in cl cts 'basic/test_basic intmath_int'.
But in theory the same problem could crop up elsewhere.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9a62536108 freedreno: surfaces can be PIPE_BUFFER
At least for clover.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
d7af35a7f3 freedreno/a5xx: handle compute resources
Not *entirely* sure why this is a different BIND bit, but it is.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
82c71b09d5 freedreno/ir3: ignore return jump
I think this should also always only occur at the end of a BB (by
definition), and the BB successor should be the end block.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
c9b1cc33df freedreno: add some more compute caps
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9630f4df3b freedreno/a5xx: don't expose 64b pointers yet
Temporary hack, but since we can't do 64b math yet in ir3, pretend that
we don't support 64b pointers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
54988f1e6b freedreno: steal handy macro for compute caps from nouveau
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
26a9321d0a freedreno: add global_bindings state
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
8c42f63151 freedreno/ir3: small cleanup
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
76687b0c0a freedreno: add pctx->memory_barrier()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9e4f5966e8 freedreno/ir3: cmdline compiler updates for spv shaders
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Samuel Pitoiset
322a51b549 ac: add ac_build_fsign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:36 +01:00
Samuel Pitoiset
e8bdde2289 ac: add ac_build_isign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:32 +01:00
Samuel Pitoiset
459e33900f ac: add ac_build_fract()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:30 +01:00
gurchetansingh@chromium.org
fe0647df5a virgl: add offset alignment values to to v2 caps struct
glBindBufferRange(..) in vrend_draw_bind_ubo is failing with
more than one uniform block. This is due to improper alignment
of the start of the second block. Let's query the proper
alignment from the driver and pass it back to Mesa.

Let's query for the texture alignment too, even though the Virgl
renderer doesn't call glTexBufferRange yet.

The default values are the widest workable range possible (for example,
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256).

Fixes:
	dEQP-GLES3.functional.ubo.* on Nvidia

Example test:
	dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex

Note: This is based on "virgl: reduce some default capset limits.",
which hasn't landed in Mesa yet but should relatively soon.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:39 +10:00
Dave Airlie
9283cf2ad1 virgl: reduce some default capset limits.
Since v2 might take a while to rollout, we should reduce
these inside some gathered minimums and then v2 can increase
them using host values.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:38 +10:00
Dave Airlie
cd32258ec1 virgl: handle getting new capsets.
This checks the kernel api is new enough and asks for the
larger caps size since the kernel won't mess it up now.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:38 +10:00
Timothy Arceri
70190a6567 radeonsi/nir: call ac_lower_indirect_derefs()
Fixes piglit tests:
tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test
tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Timothy Arceri
561503e3bd radeonsi: add chip class to compiler_ctx_state
This will be used in the following patch.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Timothy Arceri
0f2c7341e8 ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c
Until llvm handles indirects better we will need to use these
workarounds in the radeonsi backend also.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Bas Nieuwenhuizen
eea20d59ab radv: Fix copying from 3D images starting at non-zero depth.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-05 01:04:54 +01:00
Vinson Lee
bb742b6ebf swr/rast: Fix macOS macro.
Fixes: a25093de71 ("swr/rast: Implement JIT shader caching to disk")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-03-04 13:23:57 -08:00
Mathias Fröhlich
411aa8c322 vbo: Try to reuse the same VAO more often for successive dlists.
The change tries to catch more opportunities to reuse the same set
of VAO's when building up display lists. Instead of checking the
offset with respect to the beginning of the vertex buffer object
the change tries to apply this same optimization with respect to the
previous display list node.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-03 05:56:35 +01:00
Ian Romanick
a9eb455e29 mesa: Silence unused parameter warnings from TEXSTORE_PARAMS
Reduces my build from 1717 warnings to 1547 warnings by silencing 170
instances of things like

In file included from ../../SOURCE/master/src/mesa/main/texcompress_bptc.h:30:0,
                 from ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:31:
../../SOURCE/master/src/mesa/main/texcompress_bptc.c: In function ‘_mesa_texstore_bptc_rgba_unorm’:
../../SOURCE/master/src/mesa/main/texstore.h:60:14: warning: unused parameter ‘dstFormat’ [-Wunused-parameter]
  mesa_format dstFormat, \
              ^
../../SOURCE/master/src/mesa/main/texcompress_bptc.c:1276:32: note: in expansion of macro ‘TEXSTORE_PARAMS’
 _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS)
                                ^~~~~~~~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
1049b57bf2 i965: Silence unused parameter warnings in genX_state_upload
Reduces my build from 1772 warnings to 1717 warnings by silencing 55
instances of things like

../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_vertex_buffer_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:313:41: warning: unused parameter ‘end_offset’ [-Wunused-parameter]
                                unsigned end_offset,
                                         ^~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_sampler_state_pointers_xs’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4689:58: warning: unused parameter ‘brw’ [-Wunused-parameter]
 genX(emit_sampler_state_pointers_xs)(struct brw_context *brw,
                                                          ^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4690:62: warning: unused parameter ‘stage_state’ [-Wunused-parameter]
                                      struct brw_stage_state *stage_state)
                                                              ^~~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_upload_default_color’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4730:40: warning: unused parameter ‘format’ [-Wunused-parameter]
                            mesa_format format, GLenum base_format,
                                        ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘translate_wrap_mode’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4906:41: warning: unused parameter ‘brw’ [-Wunused-parameter]
 translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest)
                                         ^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_update_sampler_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4972:37: warning: unused parameter ‘batch_offset_for_sampler_state’ [-Wunused-parameter]
                            uint32_t batch_offset_for_sampler_state)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
50bf186829 isl: Silence unused parameter warnings in __gen_combine_address implementations
Reduces my build from 1808 warnings to 1772 warnings by silencing 36
instances of things like

../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’:
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter]
 __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
                             ^~~~
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter]
 __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
                                         ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
492a472b28 genxml: Silence unused parameter warnings in generated pack code
Reduces my build from 1960 warnings to 1808 warnings by silencing 152
instances of things like

In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0,
                 from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36:
src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’:
src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_uint(uint64_t v, uint32_t start, uint32_t end)
                                                 ^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’:
src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter]
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
                                   ^~~~~
src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
                                                   ^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’:
src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits)
                                                ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
f726695cce i965: Silence unused parameter warnings in blorp
Reduces my build from 2023 warnings to 1960 warnings by silencing 63
instances of things like

In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter]
                        const struct blorp_params *params)
                                                   ^~~~~~
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter]
                          const struct blorp_params *params)
                                                     ^~~~~~
In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter]
                     const struct blorp_params *params)
                                                ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                       ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                                    ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                                                  ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
3a944316c4 nir: Silence unused parameter warnings in generated nir_constant_expressions code
Reduces my build from 2075 warnings to 2023 warnings by silencing 52
instances of things like

src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’:
src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter]
 evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size,
                                                             ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
ab8f2e30b8 i965: Silence unused parameter warnings in generated OA code
Reduces my build from 6301 warnings to 2075 warnings by silencing 4226
instances of things like

src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c: In function ‘hsw__render_basic__gpu_core_clocks__read’:
src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c:41:62: warning: unused parameter ‘brw’ [-Wunused-parameter]
 hsw__render_basic__gpu_core_clocks__read(struct brw_context *brw,
                                                              ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
a55dae6ea2 i965: Silence warnings about mixing enum and non-enum in conditional
Reduces my build from 6451 warnings to 6301 warnings by silencing 150
instances of

../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info*, const brw_inst*)’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
    unsigned file = __builtin_strcmp("dst", #reg) == 0 ?                       \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
                    BRW_GENERAL_REGISTER_FILE :                                \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                    brw_inst_##reg##_reg_file(devinfo, inst);                  \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’
 REG_TYPE(src1)
 ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
feefb7810e intel/compiler: Silence unused parameter warnings in release builds
Reduces my build from 7005 warnings to 6451 warnings by silencing 554
instances of

In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0:
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo,
                                                         ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo,
                                                         ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_set_3src_a1_src2_imm(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn)
                                               ^~~~~~~
In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0,
                 from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29:
../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’:
../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’:
../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter]
         unsigned _reg_file,
                  ^~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
c8a03ab453 i965: Silence unused parameter warnings
Reduces my build from 7119 warnings to 7005 warnings by silencing 114
instances of

In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_context.h:46:0,
                 from ../../SOURCE/master/src/mesa/drivers/dri/i965/intel_pixel_read.c:38:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h: In function ‘brw_bo_unmap’:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h:258:47: warning: unused parameter ‘bo’ [-Wunused-parameter]
 static inline int brw_bo_unmap(struct brw_bo *bo) { return 0; }
                                               ^~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Kenneth Graunke
9fa95359df intel: Drop program size pointer from vec4/fs assembly getters.
These days, we're just passing a pointer to a prog_data field, which
we already have access to.  We can just use it directly.

(In the past, it was a pointer to a separate value.)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-02 14:20:22 -08:00
Kenneth Graunke
b04cf529f2 i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT.
This should have no practical impact.  For the default uploader, we
don't really care, but for others, we may want to append more data
as the GPU is reading existing data, which means we need async and
persistent flags.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-02 14:19:33 -08:00
Kenneth Graunke
eb99bf8abe i965: Generalize intel_upload.c to support multiple uploaders.
I'd like to reuse the upload logic for a new program cache, but the
buffers will need to have a different lifetime than the default
uploader, and also some address space restrictions.  So, we can't
use a single uploader for both situations - we'll need two of them.

This creates a public 'uploader' structure, and adjusts the interface
to take an uploader rather than always using brw->upload.  It should
have no functional change at the moment.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-02 14:19:33 -08:00
Anuj Phogat
56dc9f9f49 intel/compiler: Memory fence commit must always be enabled for gen10+
Commit bit in the message descriptor (Bit 13) must be always set
to true in CNL+ for memory fence messages. It also fixes a piglit
GPU hang on cnl+ in simulation environment.
Piglit test: arb_shader_image_load_store-shader-mem-barrier
See HSD ES # 1404612949

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-02 11:45:21 -08:00
Francisco Jerez
4b4838b1ae Revert "i965/fs: Predicate byte scattered writes if needed"
This reverts commit a4031bdfa9.  It's
redundant with the sample mask predication done at this point by the
common logical send lowering infrastructure, and rather buggy because
it wasn't applying the correct sample mask in shaders using discard,
since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS
doesn't reflect samples discarded by the shader, so it could have led
to data corruption in fragment shader invocations that execute discard
based on a non-dynamically uniform condition.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
c063e88909 intel/fs: Handle surface opcode sample masks via predication.
The main motivation is to enable HDC surface opcodes on ICL which no
longer allows the sample mask to be provided in a message header, but
this is enabled all the way back to IVB when possible because it
decreases the instruction count of some shaders using HDC messages
significantly, e.g. one of the SynMark2 CSDof compute shaders
decreases instruction count by about 40% due to the removal of header
setup boilerplate which in turn makes a number of send message
payloads more easily CSE-able.  Shader-db results on SKL:

 total instructions in shared programs: 15325319 -> 15314384 (-0.07%)
 instructions in affected programs: 311532 -> 300597 (-3.51%)
 helped: 491
 HURT: 1

Shader-db results on BDW where the optimization needs to be disabled
in some cases due to hardware restrictions:

 total instructions in shared programs: 15604794 -> 15598028 (-0.04%)
 instructions in affected programs: 220863 -> 214097 (-3.06%)
 helped: 351
 HURT: 0

The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL
laptop with this change.  According to Eero this improves performance
of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2
desktop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
2018-03-02 11:28:56 -08:00
Francisco Jerez
e7c9adca57 intel/eu: Plumb header present bit to codegen helpers for HDC messages.
This makes sure that the header-present bit of the message descriptor
is in sync with the IR instruction fields, which gives the optimizer
more control to avoid the overhead of setting up a message header when
it's possible to do so.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
6edb332b44 intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.
This shouldn't cause any functional change at this point, it changes
SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at
the IR level instead of the hard-coded f1.0, now that it can be
represented in backend_instruction::flag_subreg.  This will be
necessary for scheduling to behave correctly once more things start
making use of f1.0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
cc0fc8b8ac intel/ir: Allow representing additional flag subregisters in the IR.
This allows representing conditional mods and predicates on f1.0-f1.1
at the IR level by adding an extra bit to the flag_subreg
backend_instruction field.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
9ec3362e0b intel/l3: Don't allocate SLM partition on ICL+.
SLM has a chunk of special-purpose memory separate from L3 on ICL+, we
shouldn't allocate a partition for it on L3 anymore.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Charmaine Lee
af8877af3b svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs
Since geometry shader also consumes prescale constants, the
geometry shader constant buffer will need to be updated when prescale
factor is changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
dc79b88402 svga: fix blending regression
The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend
state when it's disabled for a colorbuffer") subtly changed the
details of gallium's per-RT blend state.

In particular, when pipe_rt_blend_state[i].blend_enabled is true,
we have to get the src/dst blend terms from pipe_rt_blend_state[i],
not [0] as before.

We now have to scan the blend targets to find the first one that's
enabled (if any).  We have to use the index of that target for getting
the src/dst blend terms.  And note that we have to set identical blend
terms for all targets.

This fixes the Piglit fbo-drawbuffers2-blend test.  VMware bug 2063493.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
b871a77316 svga: check svga_have_vgpu10() in svga_delete_blend_state()
We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not
enabled (bs->id==0 by default), resulting in lots of device errors.

Reviewed-by: Neha Bhende<bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
72df3a7a39 svga: if svga_update_state() fails, skip the draw call
If svga_update_state() fails, we flush the command buffer and retry.
If it fails again, it likely means we were unable to translate a shader
for some reason (uses too many resources, for example).  In that case,
let's just skip the draw call.  The alternative, just disabling the
shader stage in question, would certainly lead to bad rendering anyway,
and probably device errors.

Fixes failed assertion running Piglit glsl-1.50/execution/
variable-indexing/gs-output-array-vec4-index-wr.shader_test since it
uses too many GS output registers (though the test still fails).
VMware bug 2063492.

v2: also call pipe_debug_message() so apps or apitrace can be notified
when this issue occurs.
v3: use svga_update_state_retry().

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
0a7deaa0d6 svga: let svga_update_state_retry() return a bool
This will allow minor simplifications elsewhere.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
35c5cf8959 svga: s/unsigned/boolean/ for a few local vars
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-02 12:23:50 -07:00
Dylan Baker
e23192022a meson: install vulkan_intel.h header
Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-02 11:11:20 -08:00
Boyuan Zhang
1ad89fa138 st/omx_bellagio: add picture profile and entry point
Profile and entry point were missing in the picture structure.
Therefore, add them back.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-03-02 12:04:36 -05:00
Boyuan Zhang
6a62e455f2 radeonsi: fix radeon create encoder return
Previous patch missed a "return" when trying to modify the create encoder
function, which made the whole logic fail. Therefore, add the return back.

Fixes: b38b208ff8 "radeonsi:create uvd hevc enc entry"

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-02 12:04:36 -05:00
Thierry Reding
f9bc48d41d loader: Add support for platform and host1x busses
ARM SoCs usually have their DRM/KMS devices on the platform bus, so add
support for this bus in order to allow use of the DRI_PRIME environment
variable with those devices.

While at it, also support the host1x bus, which is effectively the same
but uses an additional layer in the bus hierarchy.

Note that it isn't enough to support the bus that has the rendering GPU
because the loader code will also try to construct an ID path tag for a
scanout-only device if it is the default that is being opened.

The ID path tag for a device can be obtained by running udevadm info on
the device node, as shown in this example on NVIDIA Tegra:

	$ udevadm info /dev/dri/card0 | grep ID_PATH_TAG
	E: ID_PATH_TAG=platform-50000000_host1x

The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is
constructed, can be found in the sysfs "uevent" attribute for the card0
device's parent:

	$ grep OF_FULLNAME /sys/devices/platform/50000000.host1x/drm/uevent
	OF_FULLNAME=/host1x@50000000

Similarily, /dev/dri/card1 corresponds to the GPU:

	$ udevadm info /dev/dri/card1 | grep ID_PATH_TAG
	E: ID_PATH_TAG=platform-57000000_gpu

and:

	$ grep OF_FULLNAME /sys/devices/platform/57000000.gpu/uevent
	OF_FULLNAME=/gpu@57000000

Changes in v2:
- avoid confusing pre-increment in strdup()
- add examples of tags to commit message

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-02 14:40:29 +01:00
Thierry Reding
498faea103 disk cache: Link with -latomic if necessary
The disk cache implementation uses 64-bit atomic operations. For some
architectures, such as 32-bit ARM, GCC will not be able to translate
these operations into atomic, lock-free instructions and will instead
rely on the external atomics library to provide these operations.

Check at configuration time whether or not linking against libatomic
is necessary and if so, create a dependency that can be used while
linking the mesautil library.

This is the meson equivalent of 2ef7f23820 ("configure: check if
-latomic is needed for __atomic_*").

For some background information on this, see:

	https://gcc.gnu.org/wiki/Atomic/GCCMM

Changes in v2:
- clarify meaning of lock-free in commit message
- fix build if -latomic is not necessary

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-02 11:31:59 +01:00
Samuel Pitoiset
c133a3411b radv: do not set pending_reset_query in BeginCommandBuffer()
This is just useless for two reasons:
1) flush_bits is not set accordingly, so nothing will be flushed
   in BeginQuery().
2) we always flush caches in EndCommandBuffer(), so if a reset
   is done in a previous command buffer we are safe.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-02 09:44:12 +01:00
Dave Airlie
bf2af063c3 r600/cayman: fix fragcood loading recip generation.
This fixes some hangs seen where the recip_ieee opcodes would
end up split across the wrong slots.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-02 00:33:18 +00:00
Kenneth Graunke
cee9f38903 i965: Allow 48-bit addressing on Gen8+.
This allows most GPU objects to use the full 48-bit address space
offered by Gen8+ platforms, rather than being stuck with 32-bit.
This expands the available GPU memory from 4G to 256TB or so.

A few objects - instruction, scratch, and vertex buffers - need to
remain pinned in the low 4GB of the address space for various reasons.
We default everything to 48-bit but disable it in those cases.

Thanks to Jason Ekstrand for blazing this trail in anv first and
finding the nasty undocumented hardware issues.  This patch simply
rips off all of his findings.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-01 15:46:11 -08:00
Kenneth Graunke
6712611735 i965: Shorten the name of the workaround BO.
This makes the name shorter in debug printouts.  If "workaround_bo"
is good enough for the code, it's probably good enough for debugging.
2018-03-01 15:46:11 -08:00
Kenneth Graunke
b04c5cece7 i965: Add debugging code to dump the validation list.
When anything goes wrong with this code, dumping the validation list
is a useful way to figure out what's happening.
2018-03-01 15:46:11 -08:00
Jason Ekstrand
ff4726077d intel/fs: Set up sampler message headers in the visitor on gen7+
This gives the scheduler visibility into the headers which should
improve scheduling.  More importantly, however, it lets the scheduler
know that the header gets written.  As-is, the scheduler thinks that a
texture instruction only reads it's payload and is unaware that it may
write to the first register so it may reorder it with respect to a read
from that register.  This is causing issues in a couple of Dota 2 vertex
shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-01 15:11:01 -08:00
Timothy Arceri
f5305c1b44 ac: fix nir_intrinsic_shared_atomic_comp_swap handling
Following on from 49879f3778 this makes sure we use the correct
src index.

Fixes cts test:
KHR-GL46.compute_shader.atomic-case3

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-02 09:11:20 +11:00
Timothy Arceri
13cdf4e590 st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs
We only need to check for previously processed location on user
defined varyings as they are the only ones that support component
packing. Therefore a single instance of processed_locs can be
shared by regular varyings and patches.

For simplicity we make processed_locs an array in order to handle
dual source bleanding.

Fixes the follow piglit test on radeonsi:
tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-02 09:11:20 +11:00
Jason Ekstrand
89f78cf333 anv: Enable MSAA fast-clears
This speeds up the Sascha Willems multisampling demo by around 25% when
using 8x or 16x MSAA.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
00da139477 anv/cmd_buffer: Add support for MCS fast-clears and resolves
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
1805c483b1 anv/cmd_buffer: Add helpers for computing resolve predicates
We'll want to re-use the complex resolve predicate computations for MCS
resolves so it's nice to have them as helper functions.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
a0a319f16e anv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage
This doesn't actually do anything because att_state->fast_clear is
determined based on the return value of anv_layout_to_fast_clear_type
which currently returns NONE for multisampled images.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
d0f701d2f1 anv/blorp: Pass the clear address to blorp for subpass MSAA resolves
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
f4f95496cb anv/blorp: Allow indirect clear colors on blorp sources on gen7
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
d85f05bd6f anv/blorp: Add partial clear support to anv_image_mcs_op
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
c34feaea52 intel/blorp: Add indirect clear color support to mcs_partial_resolve
This is a bit complicated because we have to get the indirect clear
color in there somehow.  In order to not do any more work in the shader
than needed, we set it up as it's own vertex binding which points
directly at the clear color address specified by the client.

Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
ca7ab1a6a5 intel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE
There are enough #ifs in there that it's kind-of pointless to duplicate
it for each buffer.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Andriy Khulap
7859701920 i965: Fix RELOC_WRITE typo in brw_store_data_imm64()
Fixes: 6c530ad116
("i965: Reduce passing 2x32b of reloc_domains to 2 bits")

Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-01 11:20:04 -08:00
Jonathan Gray
034bbaa6c0 gallium/util: use sockets on PIPE_OS_UNIX in u_network
Instead of listing all the UNIX PIPE_OS platforms just use
PIPE_OS_UNIX.  Makes BSD sockets available on PIPE_OS_BSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-01 18:44:39 +00:00
Jonathan Gray
7bea40e566 util: use clock_gettime() on PIPE_OS_BSD
OpenBSD, FreeBSD, NetBSD and DragonFlyBSD all have clock_gettime()
so use it when PIPE_OS_BSD is defined.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-01 18:44:38 +00:00
Jose Maria Casanova Crespo
4420d8866c nir/search: Include 8 and 16-bit support in construct_value
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-01 09:16:03 -08:00
Jason Ekstrand
99ee40fb54 nir/search: Support 8 and 16-bit constants in match_value
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-03-01 09:15:01 -08:00
Andres Gomez
b5b912dfee travis: make Meson find the proper llvm-config
Travis CI has moved to LLVM 5.0, and meson is detecting automatically
the available version in /usr/local/bin based on the PATH env variable
order preference.

As for 0.44.x, Meson cannot receive the path to the llvm-config binary
as a configuration parameter. See
https://github.com/mesonbuild/meson/issues/2887 and
7c8b6ee3fa

We want to use the custom (APT) installed version. Therefore, let's
make Meson find our wanted version sooner than the one at
/usr/local/bin

Once this is corrected, we would still need a patch similar to:
https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html

v2: Create the link only to the specificly wanted LLVM version (Gert).

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Gert Wollny <gw.fossdev@gmail.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-01 12:21:30 +02:00
Andres Gomez
98f7650add meson: fix LLVM version detection when <= 3.4
3 digits versions in LLVM only started from 3.4.1 on.

Hence, even if you can perfectly build with an old LLVM (< 3.4.1) in
the system while not needing LLVM at all (auto), when passing through
the LLVM version detection code, meson will fail when accessing
"_llvm_version[2]" due to:

"Index 2 out of bounds of array of size 2."

v2: Properly compare LLVM version and set patch version to 0
    if < 3.4.1 (Eric).

v3: Improve the commit log explanation (Eric).

Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-01 12:16:23 +02:00
Iago Toral Quiroga
bc73016703 i965/sbe: fix number of inputs for active components
In 16631ca30e we fixed gen9 active components to account for padded
inputs in the URB, which we can have with SSO programs. To do that,
instead of going through the bitfield of inputs (which doesn't include
padding information), we compute the number of inputs from the size
of the URB entry.

Unfortunately, there are some special inputs that are not stored in
the URB and that we also need to account for. These special inputs
are identified and handled during calculate_attr_overrides().

Instead of keeping track of the exact number of inputs, we just
program active components for all possible inputs like we do in
anvil.

This fixes a regression in a WebGL program that uses Point Sprite
functionality (specifically, VARYING_SLOT_PNTC).

v2:
 - Add 'Fixes' tag (Mark Janes)
 - make no_vue_inputs int instead of uint32_t, and add const qualifier
   to num_inputs variable (Ian)

v3:
 - Do not try to count inputs correctly, just program all input
   slots like we do in anvil (Ken)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105224
Fixes: 16631ca30e (i965/sbe: fix active components for SSO programs with over 16 inputs)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-01 10:55:12 +01:00
Samuel Pitoiset
c27f5419f6 radv: only emit cache flushes when the pool size is large enough
This is an optimization which reduces the number of flushes for
small pool buffers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:53:40 +01:00
Samuel Pitoiset
2fe07933bd radv: keep track of the query pool size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:53:39 +01:00
Samuel Pitoiset
c956d0f406 radv: make sure to emit cache flushes before starting a query
If the query pool has been previously resetted using the compute
shader path.

Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:14:49 +01:00
Alejandro Piñeiro
e72fb4e611 nir/serialize: handle var->name being NULL
var->name could be NULL under ARB_gl_spirv for example. And in any
case, the code is already handing var name being NULL when reading a
variable, so it is consistent to do it writing a variable too.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-01 08:23:33 +01:00
Jose Maria Casanova Crespo
ba642ee3ee anv: Enable VK_KHR_16bit_storage for PushConstant
Enables storagePushConstant16 features of VK_KHR_16bit_storage for Gen8+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
02266f9ba1 spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned
The introduction of 16-bit types with VK_KHR_16bit_storages implies that
push constant offsets could be multiple of 2-bytes. Some assertions are
updated so offsets should be just multiple of size of the base type but
in some cases we can not assume it as doubles aren't aligned to 8 bytes
in some cases.

For 16-bit types, the push constant offset takes into account the
internal offset in the 32-bit uniform bucket adding 2-bytes when we access
not 32-bit aligned elements. In all 32-bit aligned cases it just becomes 0.

v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
23ffb7c2d1 spirv: Calculate properly 16-bit vector sizes
Range in 16-bit push constants load was being calculated
wrongly using 4-bytes per element instead of 2-bytes as it
should be.

v2: Use glsl_get_bit_size instead of if statement
    (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
994d210429 anv: Enable VK_KHR_16bit_storage for SSBO and UBO
Enables storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccesss
features of VK_KHR_16bit_storage for Gen8+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
69be3a82ca i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout
Restrict the use of untyped_surface_write with 16-bit pairs in
ssbo to the cases where we can guarantee that offset is multiple
of 4.

Taking into account that VK_KHR_relaxed_block_layout is available
in ANV we can only guarantee that when we have a constant offset
that is multiple of 4. For non constant offsets we will always use
byte_scattered_write.

v2: (Jason Ekstrand)
    - Assert offset_reg to be multiple of 4 if it is immediate.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
8dd8be0323 i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layout
16-bit load_ubo/ssbo operations that call do_untyped_read_vector don't
guarantee that offsets are multiple of 4-bytes as required by untyped_read
message. This happens for example in the case of f16mat3x3 when then
VK_KHR_relaxed_block_layout is enabled.

Vectors reads when we have non-constant offsets are implemented with
multiple byte_scattered_read messages that not require 32-bit aligned offsets.

Now for all constant offsets we can use the untyped_read_surface message.
In the case of constant offsets not aligned to 32-bits, we calculate a
start offset 32-bit aligned and use the shuffle_32bit_load_result_to_16bit_data
function and the first_component parameter to skip the copy of the unneeded
component.

v2: (Jason Ekstrand)
    Use untyped_read_surface messages always we have constant offsets.

v3: (Jason Ekstrand)
    Simplify loop for reads with non constant offsets.
    Use end - start to calculate the number of 32-bit components to read with
    constant offsets.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
2dd94f462b i965/fs: shuffle_32bit_load_result_to_16bit_data now skips components
This helper used to load 16bit components from 32-bits read now allows
skipping components with the new parameter first_component. The semantics
now skip components until we reach the first_component, and then reads the
number of components passed to the function.

All previous uses of the helper are updated to use 0 as first_component.
This will allow read 16-bit components when the first one is not aligned
32-bit. Enabling more usages of untyped_reads with 16-bit types.

v2: (Jason Ektrand)
    Change parameters order to first_component, num_components

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
67d7dd594e isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit
The surfaces that backup the GPU buffers have a boundary check that
considers that access to partial dwords are considered out-of-bounds.
For example, buffers with 1,3 16-bit elements has size 2 or 6 and the
last two bytes would always be read as 0 or its writting ignored.

The introduction of 16-bit types implies that we need to align the size
to 4-bytew multiples so that partial dwords could be read/written.
Adding an inconditional +2 size to buffers not being multiple of 2
solves this issue for the general cases of UBO or SSBO.

But, when unsized arrays of 16-bit elements are used it is not possible
to know if the size was padded or not. To solve this issue the
implementation calculates the needed size of the buffer surfaces,
as suggested by Jason:

surface_size = isl_align(buffer_size, 4) +
               (isl_align(buffer_size, 4) - buffer_size)

So when we calculate backwards the buffer_size in the backend we
update the resinfo return value with:

buffer_size = (surface_size & ~3) - (surface_size & 3)

It is also exposed this buffer requirements when robust buffer access
is enabled so these buffer sizes recommend being multiple of 4.

v2: (Jason Ekstrand)
    Move padding logic fron anv to isl_surface_state.
    Move calculus of original size from spirv to driver backend.
v3: (Jason Ekstrand)
    Rename some variables and use a similar expresion when calculating.
    padding than when obtaining the original buffer size.
    Avoid use of unnecesary component call at brw_fs_nir.
v4: (Jason Ekstrand)
    Complete comment with buffer size calculus explanation in brw_fs_nir.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Mathias Fröhlich
4c232dc721 vbo: Remove vbo_save_vertex_list::vertex_size.
Like before use local variables from compile_vertex_list instead.
Remove vertex_size from struct vbo_save_vertex_list.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
478a9bc7bb vbo: Remove vbo_save_vertex_list::buffer_offset.
The buffer_offset is used in aligned_vertex_buffer_offset.
But now that most of these decisions are done in compile_vertex_list
we can work on local variables instead of struct members in the
display list code. Clean that up and remove buffer_offset.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
bfa8d8e5bf vbo: Remove vbo_save_vertex_list::start_vertex.
Replace last use on replay with _vbo_save_get_{min,max}_index. Appart from
that it is not used anymore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
6dd3e98c21 vbo: Remove vbo_save_vertex_list::attrsz.
Is not used anymore on replay, move the last use in display list
compilation to the original array in the display list compiler.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
95b4be4f29 vbo: Remove vbo_save_vertex_list::attrtype.
Is not used anymore on replay, move the last use in display list
compilation to the original array in the display list compiler.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
77df52cc4f vbo: Remove vbo_save_vertex_list::enabled.
Is not used anymore on replay.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
19a0f27a49 vbo: Remove reference to the vertex_store from the dlist node.
Since we now store a set of VAOs in the display list, use these object
to get the reference to the VBO in several places.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
6e410270ee vbo: Implement current values update in terms of the VAO.
Use the information already present in the VAO to update the current values
after display list replay. Set GL_OUT_OF_MEMORY on allocation failure
for the current value update storage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
08aa0d9bf4 vbo: Implement vbo_loopback_vertex_list in terms of the VAO.
Use the information already present in the VAO to replay a display list
node using immediate mode draw commands. Use a hand full of helper methods
that will be useful for the next patches also.

v2: Insert asserts, constify local variables.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
f7178d677c vbo: Use a local variable for the dlist offsets.
The master value is now stored inside the VAO already present in
struct vbo_save_vertex_list. Remove the unneeded copy from dlist storage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
1cc3516a11 vbo: Remove unused vbo_save_context::wrap_count.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
07915020f0 vbo: Remove unused vbo_save_vertex_list::dangling_attr_ref.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Jason Ekstrand
6d3edbea16 anv: Always set has_context_priority
We don't zalloc the physical device so we need to unconditionally set
everything.  Crucible helpfully initializes all allocations to 139 so it
was getting true regardless of whether or not the kernel actually
supports context priorities.

Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 17:31:20 -08:00
Mark Janes
0fc009b8c7 Revert "i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+"
This reverts commit a2c1e48f15.

On BDWGT3e and KBLGT3e systems, this commit regressed the following
tests:

  piglit.spec.ext_framebuffer_multisample.accuracy 2 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 4 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 6 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 8 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy all_samples stencil_resolve small depthstencil
2018-02-28 17:26:08 -08:00
Dave Airlie
6c1b5a40fd radeonsi/nir: increase values to 8 for gs fetch.
This stops a crash when running (still fails):
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:35:09 +10:00
Bas Nieuwenhuizen
f9898b211e radv: Use the syncobj wait ioctl to wait on fences if possible.
Handles the !waitAll and signal after the start of the wait cases correctly.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
34bd5e2e2e radv: Implement more efficient !waitAll fence waiting.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
6968d782d3 radv: Implement waiting on non-submitted fences.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
2a404c6f92 radv: Implement WaitForFences with !waitAll.
Nothing to do except using a busy wait loop. At least for old kernels.

A better implementation for newer kernels to come later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Dave Airlie
49879f3778 ac/nir: fix shared atomic operations.
The nir->llvm conversion was using the wrong srcs.

Fixes:
tests/spec/arb_compute_shader/execution/shared-atomics.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:06:06 +10:00
Dave Airlie
69495b30a3 ac/nir: don't apply slice rounding on txf_ms
This matches the tgsi code.

Fixes arb_texture_multisample texelFetch piglit tests.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:04:34 +10:00
Timothy Arceri
f383fec903 radeonsi: set some context vars for nir path
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-01 10:51:56 +11:00
Timothy Arceri
7e46214f87 gallium: remove llvm from ir struct
This was added in 425dc4c4b3 but never used. Also since
100796c15c native has superseded llvm.

Acked-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:51:56 +11:00
Kenneth Graunke
e51b0664e0 i965: Don't emit MOVs with undefined registers for Gen4 point clipping.
Gen4 point clipping calls brw_clip_tri_alloc_regs with nr_verts == 0,
which means that c->reg.vertex[] isn't initialized.  It then emits MOVs
to stomp components of those uninitialized registers to 0.

This started causing assertions after Matt's recent series, when those
uninitialized registers started getting BRW_REGISTER_TYPE_NF, which
definitely doesn't exist on Gen4-5.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-28 15:03:51 -08:00
Eric Anholt
e4e79a02da broadcom/vc5: Fix regression in the page-cache slice size alignment.
We need to align the size of the slice, not the offset of the next slice.
Fixes KHR-GLES3.texture_repeat_mode.rgba32ui_11x131_2_clamp_to_edge.

Fixes: b4b4ada761 ("broadcom/vc5: Fix layout of 3D textures.")
2018-02-28 13:59:50 -08:00
Jason Ekstrand
a2c1e48f15 i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 13:31:42 -08:00
Jason Ekstrand
67da59e320 i965: Be more clever about setting up our viewport clip
Before, we were trusting in the hardware to take the intersection
of the viewport clip with the drawing rectangle.  Unfortunately,
3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly
does a full pipeline stall.  If we're a bit more careful with our
viewport clipping, we can just re-emit it once at context creation
time.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 13:31:42 -08:00
Matt Turner
debaa822ef intel/compiler: Re-add .vs_inputs_dual_locations = true
Looks like a rebase mistake.

Fixes: 89fe5190a2 ("intel/compiler: Lower flrp32 on Gen11+")
2018-02-28 13:25:21 -08:00
Dave Airlie
7cb9353de3 r600/shader: when using images always load thread id gpr at start (v2)
The delayed loading code was fail if we had control flow.

This fixes:
tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test

v2: don't use temp_reg before setting temp_reg up.

Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 20:16:19 +00:00
Dave Airlie
8369fdee8b r600: fix whitespace in recent 1d texture commit.
trivial fix.
2018-02-28 20:16:19 +00:00
Matt Turner
6f00bf519d intel/compiler: Add ICL to test_eu_validate.cpp
With the Align16 tests now disabled, we can run the rest of the tests in
ICL mode (and see them pass!)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
ff4b41dd1d intel/compiler: Disable Align16 tests on Gen11+
Align16 is no more.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
c31d77ac22 intel/compiler: Add instruction compaction support on Gen11
Gen11 only differs from SKL+ in that it uses a new datatype index table.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
d5bf093cf9 intel/compiler: Mark line, pln, and lrp as removed on Gen11+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
89fe5190a2 intel/compiler: Lower flrp32 on Gen11+
The LRP instruction is no more.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
2134ea3800 intel/compiler/fs: Implement ddy without using align16 for Gen11+
Align16 is no more. We previously generated an align16 ADD instruction
to calculate DDY:

   add(16) g25<1>F  -g23<4>.xyxyF   g23<4>.zwzwF   { align16 1H };

Without align16, we now implement it as:

   add(4) g25<1>F   -g23<0,2,1>F    g23.2<0,2,1>F  { align1 1N };
   add(4) g25.4<1>F -g23.4<0,2,1>F  g23.6<0,2,1>F  { align1 1N };
   add(4) g26<1>F   -g24<0,2,1>F    g24.2<0,2,1>F  { align1 1N };
   add(4) g26.4<1>F -g24.4<0,2,1>F  g24.6<0,2,1>F  { align1 1N };

where only the first two instructions are needed in SIMD8 mode.

Note: an earlier version of the patch implemented this in two
instructions in SIMD16:

   add(8) g25<2>F   -g23<4,2,0>F    g23.2<4,2,0>F  { align1 1N };
   add(8) g25.1<2>F -g23.1<4,2,0>F  g23.3<4,2,0>F  { align1 1N };

but I realized that the channel enable bits will not be correct. If we
knew we were under uniform control flow, we could emit only those two
instructions however.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
62cfd4c656 intel/compiler/fs: Simplify ddx/ddy code generation
The brw_reg() constructor just obfuscates things here, in my opinion.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
bed0267ff6 intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcode
In a future patch, generate_ddy will want to inspect inst->exec_size.
Change generate_ddx as well for consistency.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
3a584a15c0 intel/compiler/fs: Don't generate integer DWord multiply on Gen11
Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer
multiplies.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
432674ce93 intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+
The PLN instruction is no more. Its functionality is now implemented
using two MAD instructions with the new native-float type. Instead of

   pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F

we now have

   mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F
   mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F

... and in the case of SIMD8 only the first pair of MAD instructions is
used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
b5d8781e19 intel/compiler/fs: Return multiple_instructions_emitted from generate_linterp
If multiple instructions are emitted, special handling of things like
conditional mod and NoDDClr/NoDDChk need to be performed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
b1afdf9fc1 intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pair
This isn't technically broken, but the next patch will make this
function report whether it generated multiple instructions, and that
information will be used to disable the application of conditional mod
by the generic code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
2cff324210 intel/compiler: Add Gen11+ native float type
This new type exposes the additional precision offered by the
accumulator register and will be used in the next patch to implement the
functionality of the PLN instruction using a pair of MAD instructions.

One weird thing to note: align1 ternary instructions may only have an
accumulator in the dst or src1 normally, but when src0's type is :NF
the accumulator is read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
58611ff913 intel/compiler: Add Gen11 register types
The hardware register types' encodings have changed on Gen11. Good thing
we have that superfluous looking brw_reg_type abstraction lying around!

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
bb428454a9 intel: Disable 64-bit extensions on platforms without 64-bit types
Gen11 does not support DF, Q, UQ types in hardware. As a result, we have
to disable some GL extensions until they can be reimplemented.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-02-28 11:15:47 -08:00
Anuj Phogat
5e42103f3b intel: Add icl pci id for INTEL_DEVID_OVERRIDE
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-02-28 11:15:47 -08:00
Matt Turner
35bfe20995 i965: Warn about preliminary support for Gen11
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:14:03 -08:00
Anuj Phogat
5ac804bd9a intel: Add a preliminary device for Ice Lake
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@intel.com>
2018-02-28 11:14:03 -08:00
Tapani Pälli
0c983b9094 anv: remove anv_gem_set_context_priority helper
anv_gem_set_context_param is to be used directly instead!

Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 19:50:54 +02:00
George Kyriazis
a01d5e3712 swr/rast: revert clip distance precision
Fixes piglit tests that broke with 8a64593bde

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:50 -06:00
George Kyriazis
7e813f6214 swr/rast: Faster frustum prim culling
Fix clipper validMask setting. We don't need to run frustum rejected
primitives through the clipper.  Perform frustum culling with only
frustum clip codes. Guardband clip codes cannot be used because they
overlap frustum codes.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:46 -06:00
George Kyriazis
1c73f42e6e swr/rast: Consolidate TRANSLATE_ADDRESS
Translate is now part of an overloaded LOAD call which required a change to
the code gen to skip the load functions in order to handle them manually
to make them virtual.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:41 -06:00
George Kyriazis
e2a4fd0761 swr/rast: Code generation cleanup
Generate more compact code from gen_llvm.hpp.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:37 -06:00
George Kyriazis
190ead3d79 swr/rast: Remove draw type from event definitions
- Have the draw type sent to DrawInfoEvent in handlers created in
  archrast.cpp.  The draw type no longer needs to be sent during during
  AR_API_EVENT() call in api.cpp.

- Remove draw type from event defintions in events_private.proto, no
  longer needed

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:32 -06:00
George Kyriazis
90e3e23f63 swr/rast: whitespace change
Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:28 -06:00
George Kyriazis
539de78633 swr/rast: Fix index buffer overfetch issue for non-indexed draws
Populate pLastIndex, even for the non-indexed case.  An zero pLastIndex
can cause the index offsets inside the fetcher to have non-sensical values
that can be either very large positive or very large negative numbers.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:19 -06:00
Roland Scheidegger
26103487b5 softpipe: don't iterate through PIPE_MAX_SHADER_SAMPLER_VIEWS
We were setting view to NULL if the iteration was larger than i.
But in fact if the view is NULL the code did nothing anyway...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-28 18:22:28 +01:00
Roland Scheidegger
b923f21eaa cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroy
There's no point, we know the highest non-null one.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-28 18:22:28 +01:00
Roland Scheidegger
89ae5def8c draw: don't needlessly iterate through all sampler view slots
We already stored the highest (potentially) used number.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-28 18:22:28 +01:00
Tapani Pälli
6d8ab53303 anv: implement VK_EXT_global_priority extension
v2: add ANV_CONTEXT_REALTIME_PRIORITY (Chris)
    use unreachable with unknown priority (Samuel)

v3: add stubs in gem_stubs.c (Emil)
    use priority defines from gen_defines.h

v4: cleanup, add anv_gem_set_context_param (Jason)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (v2)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v3)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 14:36:57 +02:00
Tapani Pälli
5960023cf4 i965: use context priority definitions from gen_defines.h
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-28 14:36:57 +02:00
Tapani Pälli
4449a1f80d intel: add new common header gen_defines.h
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-28 14:36:57 +02:00
Christian König
33633690aa winsys/amdgpu: request high addresses
We now have hopefully fixed all bugs regarding high addresses on Vega10 and
Raven. Start to use the high range to make room for SVM in the low
range.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 13:30:32 +01:00
Samuel Pitoiset
639c4f2b54 ac/shader: move scanning some info about input PS declarations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-28 10:14:26 +01:00
Samuel Iglesias Gonsálvez
e207b2e2c8 glsl/linker: fix bug when checking precision qualifier
According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders"
section, the precision qualifier should match for uniform variables.
This also applies to previous GLSL ES 3.x specs.

This 'if' checks the condition for uniform variables, while for UBOs
it is checked in link_interface_blocks.cpp.

Fixes: b50b82b8a5
("glsl/es31: precision qualifier doesn't need to match in shader interface block members")

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-28 07:04:13 +01:00
Samuel Iglesias Gonsálvez
c757c9dc03 anv: set maxResourceSize to the respective value for each generation
v2:
- Add the proper values to gen9+ (Jason)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 06:54:48 +01:00
Dave Airlie
a5853a3333 r600: partly revert disabling tiling for 1d texture.
Previously we had a check for 1d of narrow 2D textures, however
narrow 2d textures caused gpu hangs, but it was correct for 1d
textures.

This fixes a bunch of 1D image piglits for me.

Fixes: 7b8e1c089d (r600/texture: drop lowering 1d/2d images to linear.)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 04:59:37 +00:00
Timothy Arceri
0c1f37cc2d nir: fix interger divide by zero crash during constant folding
From the GLSL 4.60 spec Section 5.9 (Expressions):

   "Dividing by zero does not cause an exception but does result in
    an unspecified value."

Fixes: 89285e4d47 "nir: add new constant folding infrastructure"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271
2018-02-28 15:55:39 +11:00
Ilia Mirkin
086c88551d st/mesa: ensure that images don't try to reference non-existent levels
Ideally the st_finalize_texture call would take care of that, but it
doesn't seem to with KHR-GL45.shader_image_size.advanced-nonMS-*. This
assertion makes sure that no such values are passed to the driver.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-27 22:38:33 -05:00
Dave Airlie
c7b25005a1 ac/radv: move load base vertex abi setup to vertex shader.
This was segfaulting:
dEQP-VK.memory.pipeline_barrier.host_write_index_buffer.1024

Fixes: 8de6f79707 (ac/radeonsi: add load_base_vertex() to the abi)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:58:12 +10:00
Dave Airlie
3401b028df ac/shader: fix vertex input with components.
This fixes:
dEQP-VK.glsl.440.linkage.varying.component.*

Fixes: 1c57a6da5e (ac/shader: scan vertex inputs usage mask)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:04:46 +10:00
Dave Airlie
6bafd4f4dd radv: remove device pointer from buffer.
This is never used.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:03:26 +10:00
Timothy Arceri
a050ea60ee nir: add lower_ldexp to nir compiler options
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
08fa84bb9a ac: implement nir_op_ldexp
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
9790921ff5 ac: fix nir_op_fdd{x,y} handling
radeonsi, i965 and anv all treat fdd{x,y} opcodes the same as
fdd{x,y}_coarse by default. The SPIR-V spec lets the implementation
decide how it should be handled and radv was previously going
for the higher quality option. Here we change the shared amd
code to match how nir_op_fdd{x,y} is expected to be handled
by the other NIR drivers.

Fixes piglit test:
./bin/arb_shader_texture_lod-texgrad -auto

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
8de6f79707 ac/radeonsi: add load_base_vertex() to the abi
Fixes the following piglit tests:

./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo
./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
7f91473414 radeonsi: create get_base_vertex() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
ae47af50d6 radeonsi/nir: disable vertex_id_zero_based lowering
The lowering is incompatible with how the radeonsi backend works.

Fixes piglit test:
./bin/arb_shader_draw_parameters-basevertex vertexid-zerobased -auto

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
5504bebfc4 ac: add support for handling nir_intrinsic_load_vertex_id
This will be used by radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
3a0b4187dd ac: fix f2b and i2b for doubles
Without this llvm was asserting in debug builds.

V2: use LLVMConstNull()

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-28 09:23:49 +11:00
Francisco Jerez
cb309d27c5 intel/ir: Fix invalid type aliasing with undefined behavior in test_eu_compact.
test_fuzz_compact_instruction() was attempting to modify the uint64_t
data array of a brw_inst through a pointer to uint32_t, which has
undefined behavior.  This was causing the test_eu_compact unit test to
fail mysteriously for me on GCC 7 with some additional
harmless-looking changes I had applied to my tree, which happened to
affect the order instructions are emitted by GCC causing the bit
twiddling to be done after the clear_pad_bits() call which is supposed
to overwrite the same data through a pointer of different type,
leading to data corruption.  A similar failure has been reported by
Vinson Lee on the master branch built with GCC 8.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105052
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-27 11:42:39 -08:00
Francisco Jerez
69b4a9d21d util/bitset: Make C++ wrapper trivially constructible.
In order to fix a build failure on compilers not implementing
unrestricted unions, which is a C++11 feature.

v2: Provide signed integer comparison and assignment operators instead
    of BITSET_WORD ones to avoid spurious ambiguity warnings on
    comparisons with a signed integer literal.

Fixes: ba79a90fb5 "glsl: Switch ast_type_qualifier to a 128-bit bitset."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105238
Tested-by: Roland Scheidegger <sroland@vmware.com>
Tested-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-27 11:38:18 -08:00
Jordan Justen
9f223d860b intel/tools: Use gen_device_name_to_pci_device_id in aubinator
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
8ff89250ff intel/common: Add gen_device_name_to_pci_device_id
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
c2134f94c8 intel/vulkan: Support INTEL_DEVID_OVERRIDE environment variable
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
843f6d187a i965: Use gen_get_pci_device_id_override
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
e560bb9dc2 intel/common: Add gen_get_pci_device_id_override
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
6b274d5cc6 intel/vulkan: Support INTEL_NO_HW environment variable
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Harish Krupo
b9af043716 android: fix source files path for libmesa_anv_gen11
Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-27 14:16:08 +02:00
Eric Engestrom
248c593132 meson: avoid changing types for the dri3 option
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-27 11:21:20 +00:00
Eric Engestrom
76e8d61999 meson: simplify the gbm option code, and avoid changing types
v2: drop gallium comment (Dylan)

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-27 11:21:20 +00:00
Samuel Pitoiset
a549da877b ac/nir: clean up a hack about rounding 2nd coord component
It's basically just the opposite, and it only makes sense to
round the layer for 2D texture arrays.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-27 10:09:27 +01:00
Ilia Mirkin
e683a797c6 nvc0: collapse output slots to have adjacent registers
The hardware skips over unallocated slots, so we have to make sure those
registers are packed together.

Fixes KHR-GL45.enhanced_layouts.fragment_data_location_api

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-27 00:10:39 -05:00
Dave Airlie
250468f6b7 radv: expose async compute on SI
It looks like we had all the pieces in place for this,
just never tested it and turned it on.

I don't see any CTS regressions and the computeshader
demo runs.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-27 00:54:59 +00:00
Dave Airlie
1fc19a0f27 radv: merge tess rings into a single bo
Inspired by a passing commit to radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-27 00:54:59 +00:00
Emil Velikov
784d81e97e docs: update calendar, add news and link release notes to 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-27 00:32:14 +00:00
Emil Velikov
d9391014de docs: add sha256 checksums for 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b00880973e)
2018-02-27 00:29:44 +00:00
Emil Velikov
676c58fbdb docs: add release notes for 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b3e5a3f35b)
2018-02-27 00:29:43 +00:00
Dylan Baker
b9636fe38a meson: fix building without GL
libgl will be undefined _glx, so move that check inside the
`if with_glx != 'disabled'` block.

v2: - Simplify commit message (Eric, Emil)

Fixes: 5c460337fd ("meson: Fix GL and EGL pkg-config files with glvnd")
Reported-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
CC: Daniel Stone <daniels@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Untested-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-26 09:32:14 -08:00
Lionel Landwerlin
fca9f5b585 intel: aubinator_error_decode: fix segfault on missing register
Some register might be missing in our genxmls. Don't try to decode
them.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-26 16:54:48 +00:00
Eric Engestrom
11d45304fd *-symbol-check: use correct nm path when cross-compiling
Inspired-by: a similar patch for libdrm by Heiko Becker
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-26 13:50:59 +00:00
Karol Herbst
ef308d4007 nvir/gm107: consider FILE_FLAGS dependencies in SchedDataCalculatorGM107
currently while insterting barriers, writes and reads to FILE_FLAGS aren't
considered. This can lead to WaR hazards in some situations.

With the previous commit fixes shaders with intstructions like this:
  mad u32 $r2 $r4 $r11 $r2
  mad u32 { $r5 $c0 } $r4 $r10 $r6
  mad (SUBOP:1) u32 $r3 $r4 $r10 $r2 $c0

Affects OpenCL CTS tests on Maxwell+:
basic/test_basic intmath_long
basic/test_basic intmath_long2
basic/test_basic intmath_long4

v2: only put barriers on instructions which actually read flags

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-02-26 14:41:58 +01:00
Karol Herbst
2f07f823c9 nvir/gm107: iterate over all defs in SchedDataCalculatorGM107::findFirstUse
In the sched data calculator we have to track first use of defs by iterating
over all defs of an instruction, not just the first one.

v2: fix minGRP and maxGRP values

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-02-26 14:41:58 +01:00
Samuel Pitoiset
e05507a427 ac/nir: use ordered float comparisons except for not equal
Original patch from Timothy Arceri, I have just fixed the
not equal case locally.

This fixes one important rendering issue in Wolfenstein 2
(the cutscene transition issue).

RadeonSI uses the same ordered comparisons, so I guess that
what we should do as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-02-26 13:59:04 +01:00
Mauro Rossi
6451b0703f android: vulkan/util: add dependency on libnativewindow for O and later
Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building error:

In file included from out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util/vk_enum_to_str.c:26:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-02-26 14:50:24 +02:00
Mauro Rossi
d448954228 android: anv: add dependency on libnativewindow for O and later
Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building errors:

In file included from external/mesa/src/intel/vulkan/gen7_cmd_buffer.c:30:
In file included from external/mesa/src/intel/vulkan/anv_private.h:72:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal
error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.
...
In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
In file included from external/mesa/src/intel/vulkan/anv_private.h:72:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal
error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-02-26 14:49:06 +02:00
Mauro Rossi
9a508b719b android: anv/extensions: fix generated sources build
Building rules are aligned to automake ones

The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py
Generation rules for anv_extensions.c requires --out-c option
Generation rules for anv_extensions.h were missing
Necessary include paths are added to avoid following build errors:

cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c':
No such file or directory

In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30:
external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: dd088d4bec ("anv/extensions: Generate a header file with extension tables")
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-26 14:37:33 +02:00
Marek Olšák
8799eaed99 radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointers
The effect of the last 13 commits on user SGPR counts:

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:19 +01:00
Marek Olšák
3fa7a59d69 radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input
so that it can be removed and replaced with inline VBO descriptors,
and the pointer can be packed in unused bits of VBO descriptors.
This also removes the pointer from merged TES-GS where it's useless.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:08 +01:00
Marek Olšák
c78640ce31 radeonsi: set correct num_input_sgprs for VS prolog in merged shaders
We need to take num_input_sgprs from VS, not the second shader.
No apps suffered from this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:05 +01:00
Marek Olšák
f852b24ce0 radeonsi: allow fewer input SGPRs in 2nd shader of merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:03 +01:00
Marek Olšák
8d6e6b1d7c radeonsi: don't use struct si_descriptors for vertex buffer descriptors
VBO descriptor code will change a lot one day.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:00 +01:00
Daniel Stone
61d6ff3ba3 build: Move wayland-scanner check into platform
Also only check for wayland-scanner if building for the Wayland
platform.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:19 +00:00
Daniel Stone
d33cd875e8 build: Move wayland-protocols check into platform
In line with wayland-client and wayland-server, move the check for
wayland-protocols into the wayland platform branch.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:16 +00:00
Daniel Stone
d8f19d9aa0 vulkan/wsi/wayland: Move Wayland protocol from BUILT_SOURCES
autotools wants to have the BUILT_SOURCES ready as soon as it enters the
directory, even if they are not used. This meant the build failed if
wayland-protocols was not available on the system, even if it was not
enabled.

As BUILT_SOURCES cannot be used in a conditional (cf. 166852ee95), do
the same thing as EGL and manually encode the dependencies in the
Makefile.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:12 +00:00
Dave Airlie
0cc5be7741 r600: fix tgsi clock last setting
On cayman this was hitting an assert later, which probably wasn't
see on non-cayman due to having the t slot.

Fixes: 9041730d1 (r600: add support for ARB_shader_clock.)
2018-02-26 11:05:45 +10:00
Dave Airlie
4d72a1efea r600: add time lo/hi debugging output.
This just adds the these to the debug prints.
2018-02-26 11:05:26 +10:00
Timothy Arceri
22430224fe radeonsi/nir: enable lowering of fpow
Lowering fpow in NIR rather than LLVM can be beneficial.

Polaris results:

Totals from affected shaders:
SGPRS: 124928 -> 124896 (-0.03 %)
VGPRS: 68616 -> 68332 (-0.41 %)
Spilled SGPRs: 394 -> 413 (4.82 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3668912 -> 3658368 (-0.29 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 18575 -> 18593 (0.10 %)
Wait states: 0 -> 0 (0.00 %)

Fixes: d6b7539206 "ac/nir: remove emission of nir_op_fpow"

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
9873bd9dcd ac: make use of ac_get_llvm_num_components() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
1a757c9c97 gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info
Seems to have not been used since 16be87c904

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
9f7c940840 radeonsi/nir: fix loading of doubles for tess varyings
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
81f9d03807 radeonsi/nir: fix lds store in tcs outputs handling
We were ignoring the channel offset.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Gert Wollny
c7cadcbda4 r600: Take ALU_EXTENDED into account when evaluating jump offsets
ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU
clause within a IF-JUMP or ELSE branch is ALU_EXTENDED the target jump
offset needs to be adjusted accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-26 10:29:48 +10:00
Francisco Jerez
51562ea7a0 mesa: Expose EXT_shader_framebuffer_fetch(_non_coherent) on desktop and embedded GL.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
c6c64d4d6a glsl: Silence warnings when reading from a framebuffer fetch output.
Framebuffer fetch outputs are implicitly initialized upon entry to the
fragment shader.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
537bb1da98 glsl: Specify framebuffer fetch coherency mode in lower_blend_equation_advanced().
This requires passing an extra argument to the lowering pass because
the KHR_blend_equation_advanced specification doesn't seem to define
any mechanism for the implementation to determine at compile-time
whether coherent blending can ever be used (not even an "#extension
KHR_blend_equation_advanced_coherent" directive seems to be required
in the shader source AFAICT).

In the long run we'll probably want to do state-dependent recompiles
based on the value of ctx->Color.BlendCoherent, but right now there
would be no benefit from that because the only driver that supports
coherent framebuffer fetch is i965 on SKL+ hardware, which are unable
to support the non-coherent path for the moment because of texture
layout issues, so framebuffer fetch coherency is always enabled for
them.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
ef9e3f63ca glsl: Add support for the framebuffer fetch layout(noncoherent) qualifier.
This allows the application to request framebuffer fetch coherency
with per-fragment output granularity.  Coherent framebuffer fetch
outputs (which is the default if no qualifier is present for
compatibility with older versions of the EXT_shader_framebuffer_fetch
extension) will have ir_variable_data::memory_coherent set to true.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
0aeec504b4 glsl: Allow layout token for EXT_shader_framebuffer_fetch_non_coherent.
EXT_shader_framebuffer_fetch_non_coherent requires layout qualifiers
even on GL(ES) 2.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
1bc01db95f glsl: Initialize ir_variable_data::fb_fetch_output earlier for GL(ES) 2.
At the same point where it is initialized on GL(ES) 3.0+ so we can
implement some common layout qualifier handling in a future commit.
Until now the fb_fetch_output flag would be inherited from the
original implicit gl_LastFragData declaration at a later point in the
AST to GLSL IR translation.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
6ebefb0fd5 glsl: Replace MESA_shader_framebuffer_fetch extension flags with EXT ones.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
ba79a90fb5 glsl: Switch ast_type_qualifier to a 128-bit bitset.
This should end the drought of bits in the ast_type_qualifier object.
The bitset_t type works pretty much as a drop-in replacement for the
current uint64_t bitset.

The only catch is that the bitset_t type as defined in the previous
commit doesn't have a trivial constructor (because it has a
user-defined constructor), so it cannot be used as union member
without providing a user-defined constructor for the union (which
causes it in turn to be non-trivially constructible).  This annoyance
could be easily addressed in C++11 by declaring the default
constructor of bitset_t to be the implicitly defined one -- IMO one
more reason to drop support for GCC 4.2-4.3.

The other minor change was required because glsl_parser_extras.cpp was
hard-coding the type of bitset temporaries as uint64_t, which (unlike
would have been the case if the uint64_t had been replaced with
e.g. an __int128) would otherwise have caused a build failure, because
the boolean conversion operator of bitset_t is marked explicit (if
C++11 is available), so the bitset won't be silently truncated down to
1 bit in order to use it to initialize the uint64_t temporaries
(yikes).

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
bdbc2ffa42 util/bitset: Add C++ wrapper for static-size bitsets.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
8d1f1ce412 util: Add EXPLICIT_CONVERSION macro.
This can be used to specify that a C++ conversion operator is not
meant to be used for implicit conversions, which can lead to
unintended loss of information in some cases.  Implemented as a macro
in order to keep old GCC versions happy.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
378e918e28 mesa: Implement glFramebufferFetchBarrierEXT entry point.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
e4124f9bc1 glapi: Update XML for last revision of EXT_shader_framebuffer_fetch.
Desktop GL is now supported, and there is an additional entry-point
for EXT_shader_framebuffer_fetch_non_coherent.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
6a8ec78c2a mesa: Rename MESA_shader_framebuffer_fetch gl_extensions bits to EXT.
The changes I had originally planned for the MESA_shader_framebuffer_fetch
extension have been merged into the EXT spec, there's no point in keeping
MESA_shader_framebuffer_fetch extension enables.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
d0bef79f12 mesa: Rename dd_function_table::BlendBarrier to match latest EXT spec.
This GL entry point was renamed to glFramebufferFetchBarrier() in the
EXT extension on request from Khronos members.  Update the Mesa
codebase to match the latest spec.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
27c829da28 i965: Fix KHR_blend_equation_advanced with some render targets.
This reverts two bogus and seemingly useless changes from the commits
referenced below, which broke KHR_blend_equation_advanced (and
EXT_shader_framebuffer_fetch_non_coherent which wasn't exposed yet)
for any kind of render target surface that would cause the
get_isl_surf() call in brw_emit_surface_state() to do anything useful
(notice how the result of get_isl_surf() is completely ignored by the
caller right now), as was the case while using those extensions with
1D array or 3D framebuffers in particular.

Fixes: f5859b45b1 "i965/miptree: Switch remaining surfaces to isl"
Fixes: bf24c3539e "i965/miptree: Clean-up unused"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Marek Olšák
fb410ae392 radeonsi: remove si_descriptors parameter from emit_shader_pointer functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
63ea0a00a3 radeonsi: preload the tess offchip ring in TES
so that it's not done multiple times in branches

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
2d03c4cac8 radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs
TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address
aligned to 512KB. Hey, it's a 13-bit pointer!

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
190e064e63 radeonsi: move 2nd-shader descriptor pointers into s[0:1]
If 32-bit pointers are supported, both pointers can be moved into s[0:1]
and then ESGS has exactly the same user data SGPR declarations as VS.

If 32-bit pointers are not supported, only one pointer can be moved into
s[0:1]. In that case, the 2nd pointer is moved before TCS constants,
so that the location is the same in HS and GS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
1d1df76d2b radeonsi: change si_descriptors::shader_userdata_offset type to short
We will want to use SH registers outside of user data SGPRs, like the GFX9
special SGPRs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
fca7dee9c6 radeonsi: put both tessellation rings into 1 buffer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
d2963d8b5f radeonsi: move tessellation ring info into si_screen
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
41895c26d3 radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits
For a later patch.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Karol Herbst
f0b39779a0 nvir: dont optimize mad with subops to shladd
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-24 18:48:13 +01:00
James Legg
afd8fd0656 radv: Really use correct HTILE expanded words.
When transitioning to an htile compressed depth format, Set the full
depth range, so later rasterization can pass HiZ. Previously, for depth
only formats, the depth range was set to 0 to 0. This caused unwanted
HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer
(VK_FORMAT_D32_SFLOAT was not affected somehow).

These values are derived from PAL [0], since I can't find the
specification describing the htile values.

[0] 5cba4ecbda/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp (L1500)

CC: Dave Airlie <airlied@redhat.com>
CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Fixes: 5158603182 "radv: Use correct HTILE expanded words."
2018-02-24 02:16:22 +01:00
Mauro Rossi
8eed942136 radv/extensions: fix c_vk_version for patch == None
Similar to cb0d1ba156 ("anv/extensions: Fix VkVersion::c_vk_version for patch == None")
fixes the following building errors:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48:
error: use of undeclared identifier 'None'; did you mean 'long'?
      return instance && VK_MAKE_VERSION(1, 0, None) <= core_version;
                                               ^~~~
                                               long
external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION'
    (((major) << 22) | ((minor) << 12) | (patch))
                                          ^
...
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

Fixes: e72ad05c1d ("radv: Return NULL for entrypoints when not supported.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-24 00:31:31 +01:00
Eric Anholt
b4b4ada761 broadcom/vc5: Fix layout of 3D textures.
Cube maps are entire miptrees repeated, while 3D textures have each level
have all of its layers next to each other.  Fixes tex3d and
tex-miplevel-selection GL2:texture() 3D.
2018-02-23 15:07:26 -08:00
Eric Anholt
97dc077303 broadcom/vc5: Ignore unused usage flags in is_format_supported.
Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to
match.  Just drop the whole retval == usage thing and return early when we
hit a known unsupported case.

Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
2018-02-23 15:07:18 -08:00
Eric Anholt
880573e737 gbm: Fix the alpha masks in the GBM format table.
Once GBM started looking at the values of the alpha masks, ARGB/ABGR
wouldn't match any more because we had both A and R in the low bits.

Fixes: 2ed344645d ("gbm/dri: Add RGBA masks to GBM format table")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-23 15:03:36 -08:00
Mathias Fröhlich
b54bf0e3e3 mesa: Update vertex processing mode on _mesa_UseProgram.
The change is a bug fix for 92d76a169:
  mesa: Provide an alternative to get_vp_mode()
that actually got exposed through 4562a7b0:
  vbo: Make use of _DrawVAO from the dlist code.

Fixes: KHR-GLES31.core.shader_image_load_store.advanced-sso-simple
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105229
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 21:08:35 +01:00
Marek Olšák
d169438d8e mesa: rename has_core_gs -> has_gs in get_programiv
This is also true for GLES.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:23 +01:00
Marek Olšák
1881f41b6c mesa: replace some API_OPENGL_CORE checks with _mesa_is_desktop_gl
This is more accurate with respect to the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:22 +01:00
Marek Olšák
1defc973db mesa: add some of missing compatibility support for ARB_bindless_texture
The extension is exposed in the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:20 +01:00
Marek Olšák
b8e2e9e1a1 mesa: expose ARB_enhanced_layouts in the compatibility profile
GLSL 1.40 is required.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:19 +01:00
Marek Olšák
a0c8b49284 mesa: enable OpenGL 3.1 with ARB_compatibility
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:17 +01:00
Marek Olšák
605a7f6db5 mesa: implement ARB_compatibility
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:15 +01:00
Emil Velikov
14a2c87c41 swr: remove dead LLVM code paths
LLVM requirement was bumped to 4.0.0 with earlier commit.
Hence any code tailored for older versions is now unreachable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-02-23 19:17:31 +00:00
Eric Anholt
5980a41c0f broadcom/vc4: Remove the retval==usage check in is_format_supported().
This got us into trouble recently, so just remove it entirely.
2018-02-23 08:42:13 -08:00
Eric Anholt
bc3d16e633 broadcom/vc4: Add support for YUV textures using unaccelerated blits.
Previously we would assertion fail about having no hardware format.  This
is enough to get kmscube -M nv12-2img working.
2018-02-23 08:42:13 -08:00
Eric Anholt
c824a045ea broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows.
When we set up the shadow resource we were copying the original resource
as the template, including its prsc->next field.  When we shadowed the
first YUV plane's resource for linear-to-tiled conversion, we would end up
unbalancing the refcount on the shadow resource's destruction.
2018-02-23 08:42:13 -08:00
Eric Anholt
6deb158ec1 broadcom/vc4: Add pipe_reference debugging for vc4_bos.
Trying to track down the YUV EGLImage use-after-free, it helps to see what
the mystery objects are that are being refcounted.
2018-02-23 08:42:13 -08:00
Eric Anholt
34ea1aca92 broadcom/vc4: Remove dead vc4_bo_set_reference().
It would be broken if NULL was passed to it anyway, since it wouldn't
participate in screen->bo_handles management.
2018-02-23 08:42:13 -08:00
Eric Anholt
a49738290c broadcom/vc4: Use pipe_resource_reference in sampler views.
Improves u_debug_refcount output.
2018-02-23 08:42:13 -08:00
Eric Anholt
0c1dd9dee0 broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride.
This is part of supporting YUV textures -- MMAL will be handing us a
single GEM BO with the planes at offsets within it, and MMAL-decided
stride.
2018-02-23 08:42:13 -08:00
Eric Anholt
978b884afc broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported().
We were failing the retval == usage check at the end.

Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
2018-02-23 08:42:13 -08:00
Lucas Stach
8df11f3fad etnaviv: fix in-place resolve tile count
TS tiles map to a fixed amount of bytes in the color/depth surface,
so the blocksize of the format needs to be taken into account when
calculating the number of tiles to fill.

The simplest fix is to just use the layer stride, which is the surface
size in bytes.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 15:34:39 +01:00
Lucas Stach
add23b59c9 etnaviv: switch magic single buffer state to "3"
Some of the 16bit formats misrender with missing tiles with the current
"2" state. As all the previously working formats also work with the "3"
state, just always use that one.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 15:34:39 +01:00
Lucas Stach
8befc11186 etnaviv: add debug switch to disable single buffer feature
This feature has caused some trouble already. Add a debug switch to
allow users to quickly check if a specific issue is caused by this
feature.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-02-23 15:34:31 +01:00
Dylan Baker
5c460337fd meson: Fix GL and EGL pkg-config files with glvnd
Currently meson will generate a pkg-config that links to EGL_mesa (or
GLX_mesa), but this isn't correct, it should always link to EGL or GL.
Probably the "right" solution is to have glvnd itself provide the pkg
config files for GL and EGL, but that also means that glvnd needs to
provide many of the header files, which makes it a more involved job.

Fixes: a47c525f32 ("meson: build glx")
Fixes: 035ec7a2bb ("meson: Add support for EGL glvnd")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-23 13:30:28 +00:00
Frank Binns
6160bf97db egl/dri2: fix segfault when display initialisation fails
dri2_display_destroy() is called when platform specific display
initialisation fails. However, this would typically lead to a
segfault due to the dri2_egl_display vbtl not having been set up.

Fixes: 2db9548296 ("loader_dri3/glx/egl: Optionally use a blit
context for blitting operations")
Signed-off-by: Frank Binns <francisbinns@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-23 11:13:22 +00:00
Juan A. Suarez Romero
e1623b303c mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
RGB9_E5 should be accepted by RenderbufferStorage if the
EXT_texture_shared_exponent is exposed. It is left to the
implementations to return GL_FRAMEBUFFER_UNSUPPORTED_EXT
when checking the framebuffer completeness if they do not
support rendering in this format.

Discussed in:
https://github.com/KhronosGroup/OpenGL-API/issues/32

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5

v2: Added more info to the commit message (Antia)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
2018-02-23 10:12:06 +01:00
Christian Gmeiner
e72062b66d etnaviv: npot_tex_any_wrap needs one bit only
Reduces size of struct etna_specs from 100 to 94 bytes.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 09:38:16 +01:00
Mathias Fröhlich
4562a7b0e8 vbo: Make use of _DrawVAO from the dlist code.
Finally use an internal VAO to execute display list draws. Avoid
duplicate state validation for display list draws. Remove client arrays
previously used exclusively for display lists.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:14 +01:00
Mathias Fröhlich
2f35140846 mesa: Use atomics for shared VAO reference counts.
VAOs will be used in the next change as immutable object across multiple
contexts. Only reference counting may write concurrently on the VAO. So,
make the reference count thread safe for those and only those VAO objects.

v3: Use bool/true/false for gl_vertex_array_object::SharedAndImmutable.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:11 +01:00
Mathias Fröhlich
8a3a4b6fae vbo: Make use of _DrawVAO from immediate mode draw
Finally use an internal VAO to execute immediate mode draws. Avoid
duplicate state validation for immediate mode draws. Remove client arrays
previously used exclusively for immediate mode draws.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:07 +01:00
Mathias Fröhlich
c757e416ce vbo: Implement tool functions for vbo specific VAO setup.
Correct VBO_MATERIAL_SHIFT value.
The functions will be used next in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:04 +01:00
Mathias Fröhlich
ef8028017d mesa: Add flush_vertices to _mesa_bind_vertex_buffer.
We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:01 +01:00
Mathias Fröhlich
354b76ad20 mesa: Make _mesa_vertex_attrib_binding public.
Change vertex_attrib_binding() to _mesa_vertex_attrib_binding(), add a
flush_vertices argument, and make it publicly available.
The function will be needed later in the series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:58 +01:00
Mathias Fröhlich
4331969ac4 mesa: Add flush_vertices to _mesa_{enable,disable}_vertex_array_attrib.
We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:55 +01:00
Mathias Fröhlich
195bb990ed vbo: Use _DrawVAO for array type draw commands.
Switch over to use the _DrawVAO for all the array type draws.
The _DrawVAO needs to be set before we enter _mesa_update_state, so move
setting the draw method in front of the first call to _mesa_update_state
which is in turn called from the *validate*Draw* calls. Using the
gl_vertex_array_object::_Enabled bitmask, gl_vertex_program_state::_VPMode
and gl_vertex_array_object::_AttributeMapMode we can already set
varying_vp_inputs before we call _mesa_update_state the first time.
Thus remove duplicate state validation.

v2: Update comments.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:50 +01:00
Mathias Fröhlich
6002ab564b vbo: Implement method to track the inputs array.
Provided the _DrawVAO and the derived state that is maintained if we have
the _DrawVAO set, implement a method to incrementally update the array of
gl_vertex_array input pointers.

v2: Add some more comments.
    Rename _vbo_array_init to _vbo_init_inputs.
    Rename vbo_context::arrays to vbo_context::draw_arrays.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:46 +01:00
Mathias Fröhlich
08c7474189 mesa: Introduce a yet unused _DrawVAO.
During the patch series this VAO gets populated with either the currently
bound VAO or an internal VAO that will be used for immediate mode and
dlist rendering.

v2: More comments about the _DrawVAO, filter and enabled mask.
    Rename _DrawVAOEnabled to _DrawVAOEnabledAttribs.
v3: Fix and move comment.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:43 +01:00
Mathias Fröhlich
ce3d2421a0 vbo: Remove get_vp_mode() and enum vp_mode.
Is now unused.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:40 +01:00
Mathias Fröhlich
60c3ca1b23 vbo: Use _VPMode instead of get_vp_mode().
At those places where we used get_vp_mode() use
gl_vertex_program_state::_VPMode instead.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:36 +01:00
Mathias Fröhlich
92d76a1691 mesa: Provide an alternative to get_vp_mode()
To get equivalent information than get_vp_mode(), track the vertex
processing mode in a per context variable at
gl_vertex_program_state::_VPMode.
This aims to replace get_vp_mode() as seen in the vbo module.
But instead of the get_vp_mode() implementation which only gives correct
answers past calling _mesa_update_state() this context variable is
immediately tracked when the vertex processing state is modified. The
correctness of this value is asserted on state validation.

With this in place we should be able to untangle the dependency with
varying_vp_inputs and state invalidation.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:30 +01:00
Ilia Mirkin
d73f1f2ad8 nv50,nvc0: fix integer MS resolves using 2d engine
We don't want filtering for integer textures, same as depth/stencil.

Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Ilia Mirkin
33ce3569c5 nvc0: fix writing query results into buffer
We need to mark the range as valid, and validate the resource using a
helper to ensure that the buffer status is marked properly.

Fixes some CTS pipeline stats query tests, and
KHR-GL45.direct_state_access.queries_functional

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Ilia Mirkin
f6e4f95668 nv50,nvc0: fix clear buffer acceleration
Two things were off:
 - valid range was not updated, which could affect waiting for future
   maps
 - fencing was done manually instead of using the *_resource_validate
   helper, which resulted in a missed dirty buffer flag being set

Fixes: KHR-GL45.direct_state_access.buffers_clear
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Lionel Landwerlin
bd9672695b i965: perf: ensure reading config IDs from sysfs isn't interrupted
Fixes: 458468c136 "i965: Expose OA counters via INTEL_performance_query"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-23 01:44:07 +00:00
Bas Nieuwenhuizen
032870beda radv: Fix autotools build.
Somewhere along the way the Makefile changes got lost ...

Fixes: 4db78f3a6b "radv: Put supported extensions in a struct."
Acked-by: Dave Airlie <airlied@redhat.com>
2018-02-23 01:54:12 +01:00
Bas Nieuwenhuizen
e72ad05c1d radv: Return NULL for entrypoints when not supported.
This implements strict checking for the entrypoint ProcAddr
functions.

 - InstanceProcAddr with instance = NULL, only returns the 3 allowed
   entrypoints.
 - DeviceProcAddr does not return any instance entrypoints.
 - InstanceProcAddr does not return non-supported or disabled
   instance entrypoints.
 - DeviceProcAddr does not return non-supported or disabled device
   entrypoints.
 - InstanceProcAddr still returns non-supported device entrypoints.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
414f5e0e14 radv: Reword radv_entrypoints_gen.py
With a big inspiration from anv as always ...

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
076f7cfc6b radv: Track enabled extensions.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
4db78f3a6b radv: Put supported extensions in a struct.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Jose Fonseca
1f5618e81c appveyor: Build with MSVC 2015.
The MSVC version we (at VMware) primarily care about from now on is
2015.

See https://ci.appveyor.com/project/jrfonseca/mesa/build/46

We can drop support for building with 2013 in a future commit.  I'm not
aware of significant changes in C99/C11 support from MSVC 2013 to 2015,
but there's no point in continuing supporting old MSVC versions when
nobody cares.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-22 21:10:20 +00:00
Samuel Pitoiset
d6b7539206 ac/nir: remove emission of nir_op_fpow
fpow is now lowered at NIR level.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:44:46 +01:00
Samuel Pitoiset
7aa008d1d7 radv: enable lowering of fpow to fexp2 and flog2
There is no fpow in hardware, so it's always lowered somewhere,
but it appears that lowering at NIR level is better. Figured while
comparing compute shaders between RadeonSI and RADV.

Polaris10:
Totals from affected shaders:
SGPRS: 18936 -> 18904 (-0.17 %)
VGPRS: 12240 -> 12220 (-0.16 %)
Spilled SGPRs: 2809 -> 2809 (0.00 %)
Code Size: 718116 -> 719848 (0.24 %) bytes
Max Waves: 1409 -> 1410 (0.07 %)

Vega10:
Totals from affected shaders:
SGPRS: 18392 -> 18392 (0.00 %)
VGPRS: 12008 -> 11920 (-0.73 %)
Spilled SGPRs: 3001 -> 2981 (-0.67 %)
Code Size: 777444 -> 778788 (0.17 %) bytes
Max Waves: 1503 -> 1504 (0.07 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:47 +01:00
Samuel Pitoiset
63fb30c674 nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a)
Similar for the 4 case.

Suggested by Bas.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:45 +01:00
Samuel Pitoiset
b18997876f nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b))
Otherwise the code size increases because the original fexp2()
instructions can't be deleted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:43 +01:00
Samuel Pitoiset
a01e9996b5 ac/nir: set GLC=1 for load/store of coherent/volatile images
This disables persistence accross wavefronts.

F1 2017 and Wolfenstein 2 appear to use some coherent images
but this patch doesn't seem to change anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:39:55 +01:00
Samuel Pitoiset
3c40be126f spirv: apply memory qualifiers to images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:39:53 +01:00
Chuck Atkins
540e49e105 glx: Properly handle cases where screen creation fails
This fixes a segfault exposed by a29d63ecf7 which occurs when swr is
used on an unsupported architecture.

v2: re-work to place logic in xmesa_init_display

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-22 10:20:32 -05:00
Iago Toral Quiroga
7668b594e6 anv/blorp: multisample resolve all attachment layers
We were only resolving the first.

v2:
  - Do not require that the number of layers on dst and src are an
    exact match, it is okay if the dst has more layers so long as
    it has at least the same that we are going to resolve.
  - Do not always resolve array_len layers, we should resolve
    only from base_array_layer to array_len.

v3:
  - v2 was assuming that array_len represented the total number of
    layers in the image, but it represents the number of layers
    starting at the base array ayer.

v4:
 - The number of layers to resolve should be taken from the
   framebuffer (Nanley).

Fixes new CTS tests for multisampled layered rendering:
dEQP-VK.renderpass.multisample_resolve.layers_*

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-22 08:23:39 +01:00
Jason Ekstrand
2dce4ac6ac intel/isl: Improve the documentation on get_default_aux_state
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
24952160fd i965: Use finish_external instead of make_shareable in setTexBuffer2
The setTexBuffer2 hook from GLX is used to implement glxBindTexImageEXT
which has tighter restrictions than just "it's shared".  In particular,
it says that any rendering to the image while it is bound causes the
contents to become undefined.

The GLX_EXT_texture_from_pixmap extension provides us with an acquire
and release in the form of glXBindTexImageEXT and glXReleaseTexImageEXT.
The extension spec says,

    "Rendering to the drawable while it is bound to a texture will leave
    the contents of the texture in an undefined state.  However, no
    synchronization between rendering and texturing is done by GLX.  It
    is the application's responsibility to implement any synchronization
    required."

From the EGL 1.4 spec for eglBindTexImage:

    "After eglBindTexImage is called, the specified surface is no longer
    available for reading or writing.  Any read operation, such as
    glReadPixels or eglCopyBuffers, which reads values from any of the
    surface’s color buffers or ancillary buffers will produce
    indeterminate results.  In addition, draw operations that are done
    to the surface before its color buffer is released from the texture
    produce indeterminate results

In other words, between the bind and release calls, we effectively own
those pixels and can assume, so long as we don't crash, that no one else
is reading from/writing to the surface.  The GLX and EGL implementations
call the setTexBuffer2 and releaseTexBuffer function pointers that the
driver can hook.

In theory, this means that, between BindTexImage and ReleaseTexImage, we
own the pixels and it should be safe to track aux usage so we
can avoid redundant resolves so long as we start off with the right
assumption at the start of the bind/release pair.

In practice, however, X11 has slightly different expectations.  It's
expected that the server may be drawing to the image at the same time as
the compositor is texturing from it.  In that case, the worst expected
outcome should be tearing or partial rendering and not random corruption
like we see when rendering races with scanout with CCS.  Fortunately,
the GEM rules about texture/render dependencies save us here.  If X11
submits work to write to a pixmap after the compositor has submitted
work to texture from it, GEM inserts a dependency between the compositor
and X11.  If X11 is using a high-priority context, this will cause the
compositor to get a temporarily boosted priority while the batch from
X11 is waiting on it.  This means that we will never have an actual race
between X11 and the compositor so no corruption can happen.

Unfortunately, however, this means that X11 will likely be rendering to it
between the compositor's BindTexImage and ReleaseTexImage calls.  If we
want to avoid strange issues, we need to be a bit careful about
resolves because we can't really transition it away from the "default"
aux usage.  The only case where this would practically be a problem is
with image_load_store where we have to do a full resolve in order to use
the image via the data port.  Even there it would only be a problem if
batches were split such that X11's rendering happens between the resolve
and the use of it as a storage image.  However, the chances of this
happening are very slim so we just emit a warning and hope for the best.

This commit adds a new helper intel_miptree_finish_external which resets
all aux state to whatever ISL says is the right worst-case "default" for
the given modifier.  It feels a little awkward to call it "finish"
because it's actually an acquire from the perspective of the driver, but
it matches the semantics of the other prepare/finish functions.  This
new helper gets called in intelSetTexBuffer2 instead of make_shareable.
We also add an intelReleaseTexBuffer (we passed NULL to releaseTexBuffer
before) and call intel_miptree_prepare_external in it.  This probably
does nothing most of the time but it means that the prepare/finish calls
are properly matched.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
00926a2730 i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2
The old code made a new miptree that referenced the same BO as the
renderbuffer and just trusted in the memory aliasing to work.  There are
only two ways in which the new miptree is liable to differ from the one
in the renderbuffer and neither of them matter:

 1) It may have a different target.  The only targets that we can ever
    see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
    and the difference between the two doesn't matter as far as the
    miptree is concerned; genX(update_sampler_state) only looks at the
    gl_texture_object and not the miptree when determining whether or
    not to use normalized coordinates.

 2) It may have a very slightly different format.  Again, this doesn't
    matter because we've supported texture views for quite some time so
    we always look at the gl_texture_object format instead of the
    miptree format for hardware setup anyway.

On the other hand, because we were recreating the miptree, we were using
intel_miptree_create_for_bo which doesn't understand modifiers.  We
really want this function to work without doing a resolve so long as you
have modifiers so we need to fix that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
41d45eb21e i965/tex_image: Pull the tex format from the renderbuffer in intelSetTexBuffer2
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
344b57b10b i965/miptree: Loosen the format check in miptree_match_image
This function is used to determine when we need to re-allocate a
miptree.  Since we do nothing different in miptree allocation for
sRGB vs. linear, loosening this should be safe and may lead to less
copying and reallocating in some odd cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
5b1b710e6f i965/state: Ignore intel_obj->_Format for depth/stencil and ETC2
We're about to start letting the intel_obj->_Format be the "real"
texture format.  For depth/stencil textures, this may be a combined
depth stencil format.  For ETC2 on gen7 and earlier, this will be the
actual ETC2 format.  This makes a bit more GL sense but means we have to
be careful in state upload.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Kenneth Graunke
183ce5e629 glsl: Parse 'layout' as a token with advanced blending or bindless
Both KHR_blend_equation_advanced and ARB_bindless_texture provide
layout qualifiers, and are exposed in compatibility contexts.  We
need to parse the layout qualifier as a token in order for those
to work, but forgot to extend this check.

ARB_shader_image_load_store would need a similar treatment, but we
don't expose that in legacy OpenGL contexts.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-02-21 17:50:57 -08:00
Daniel Stone
c7e22483fe vulkan/wsi/x11: Consistently update and return swapchain status
Use a helper function for updating the swapchain status. This will be
used later to handle VK_SUBOPTIMAL_KHR, where we need to make a
non-error status stick to the swapchain until recreation.  Instead of
direct comparisons to VK_SUCCESS to check for error, test for negative
numbers meaning an error status, and positive numbers indicating
non-error statuses.

v2 (Jason Ekstrand):
 - Use a pattern of "return x11_swapchain_result(chain, VK_WHATEVER)"
 - Handle wsi_queue_pull returning VK_TIMEOUT
 - Call x11_swapchain_result in x11_present_to_x11

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
6937c61324 vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails
This most likely means we lost our connection to the X server so
OUT_OF_DATE is reasonable.  This was also the one case where we pushed a
UINT32_MAX into the queue without setting an error condition.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Daniel Stone
bfa22266cd vulkan/wsi/wayland: Add support for zwp_dmabuf
zwp_linux_dmabuf_v1 lets us use multi-planar images and buffer
modifiers.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
c757fd2852 anv/image: Add support for modifiers for WSI
This adds support for the modifiers portion of the WSI "extension".

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
adca1e4a92 anv/image: Separate modifiers from legacy scanout
For a bit there, we had a bug in i965 where it ignored the tiling of the
modifier and used the one from the BO instead.  At one point, we though
this was best fixed by setting a tiling from Vulkan.  However, we've
decided that i965 was just doing the wrong thing and have fixed it as of
5048572352.

The old assumptions also affected the solution we used for legacy
scanout in Vulkan.  Instead of treating it specially, we just treated it
like a modifier like we do in GL.  This commit goes back to making it
it's own thing so that it's clear in the driver when we're using
modifiers and when we're using legacy paths.

v2 (Jason Ekstrand):
 - Rename legacy_scanout to needs_set_tiling

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
f5433e4d6c vulkan/wsi: Add modifiers support to wsi_create_native_image
This involves extending our fake extension a bit to allow for additional
querying and passing of modifier information.  The added bits are
intended to look a lot like the draft of VK_EXT_image_drm_format_modifier.
Once the extension gets finalized, we'll simply transition all of the
structs used in wsi_common to the real extension structs.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Daniel Stone
55b27e1e5f vulkan/wsi: Add drm_modifier member to wsi_image
Not yet used anywhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Daniel Stone
61c3feb38d vulkan/wsi: Add multiple planes to wsi_image
Not currently used.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Timothy Arceri
cdeac00267 nir: remove old assert
This was originally intended to make sure the remap location
was not -1. However the code has changed alot since then,
the location is now never set to -1 and we also handle
components meaning this old assert has been doing comparisions
with the pointer to the array of component data.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
2018-02-22 09:31:00 +11:00
Timothy Arceri
86098696fc radeonsi/nir: collect more accurate output_usagemask
Fixes assert in the glsl-1.50-gs-max-output-components piglit test.

Note that the double handling will only work for doubles that
don't take up multiple slots i.e. double and dvec2. However
dual slot double handling is an existing bug which is made no
worse by this patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
79dc94828a radeonsi/nir: disable GLSL IR loop unrolling
Delaying unrolling and allowing NIR to do it instead has been shown
to result in better code in drivers such as i965. shader-db results
appear to show the same is true for radeonsi.

The other advantage is that using NIR unrolling improves compile
times significantly.

Totals from affected shaders:
SGPRS: 9624 -> 10016 (4.07 %)
VGPRS: 6800 -> 6464 (-4.94 %)
Spilled SGPRs: 0 -> 2 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 359176 -> 332264 (-7.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1355 -> 1432 (5.68 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
e6269ffc2e radeonsi/nir: fix tess varying loads for doubles
Fixes the following piglit tests:

tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test
tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
6d338d757f ac/radeonsi: pass type to load_tess_varyings()
We need this to be able to load 64bit varyings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Daniel Stone
eef890b7b1 x11/dri3: Store raw present completion mode
The DRI3 drawable info struct currently stores a boolean for whether the
last completed operation was a flip or not. As we need to track the full
completion mode for handling suboptimal returns, change the 'flipping'
field to the raw present completion mode from the server.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-21 21:57:38 +00:00
Daniel Stone
a6f1952814 x11/dri3: Don't open-code ARRAY_SIZE
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-21 21:57:38 +00:00
Jason Ekstrand
52056206e1 anv: Don't assert that stencil HiZ clears are single-slice
It's true for depth HiZ clears because we only have HiZ on single-slice
images right now.  However, for stencil-only clears there is no such
restriction.

Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-21 13:54:11 -08:00
Jason Ekstrand
7dd0f73fe1 anv: Only copy clear dwords if we're rendering to the first slice
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-02-21 12:47:17 -08:00
Marek Olšák
b494ed168c radeonsi: don't flush when si_eliminate_fast_color_clear is no-op 2018-02-21 20:03:11 +01:00
Marek Olšák
5f55f4c59f radeonsi: make texture_discard_cmask/eliminate functions non-static 2018-02-21 20:03:11 +01:00
James Zhu
81dd4a7637 radeonsi: enable uvd encode for HEVC main
Enable UVD encode for HEVC main profile

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
b38b208ff8 radeonsi:create uvd hevc enc entry
Add UVD hevc encode pipe video codec creation entry

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
e7d51e27ed radeon/uvd:add uvd hevc enc functions
Implement UVD hevc encode functions

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
2b86f5fa0b radeon/uvd:add uvd hevc enc hw ib implementation
Implement required IBs for UVD HEVC encode.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
461508c15c radeon/uvd:add uvd hevc enc hw interface header
Add hevc encode hardware interface for UVD

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
c6acae22c8 winsys/amdgpu:add uvd hevc enc support in amdgpu cs
Support UVD HEVC encode in amdgpu cs

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
f0ad908e79 amd/common:add uvd hevc enc support check in hw query
Based on amdgpu hardware query information to check if UVD hevc enc support

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-21 13:53:38 -05:00
Karol Herbst
7319311a50 nvir/nvc0: fix legalizing of ld unlock c0[0x10000]
We have to increase the file index also for 0x10000 not just for values
greater than 0x10000.

Fixes: 37b67db6ae
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-21 11:12:45 +01:00
Samuel Pitoiset
a6accad68f ac/nir: add glsl_is_array_image() helper
For consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-21 09:41:51 +01:00
Samuel Pitoiset
ff83dfb364 ac/nir: set the DA field when performing atomics on 3D images
This doesn't fix anything known but it should definitely be set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-21 09:41:49 +01:00
Eric Anholt
afa7b2f199 i965: Fix compiler warning about write being undefined.
This looks like it should be protected by the assume() about
nr_color_regions, but my compiler warns anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
4636ce362d glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.
Fixes: d32956935e ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
7075c084fc loader: Fix compiler warnings about truncating the PCI ID path.
My build was producing:

../src/loader/loader.c:121:67: warning: ‘%1u’ directive output may be truncated writing between 1 and 3 bytes into a region of size 2 [-Wformat-truncation=]

and we can avoid this careful calculation by just using asprintf (as we do
elsewhere in the file).

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
1b313eedb5 glsl: Silence warnings in the uniform initializer test about 16-bit types
They should probably get unit tests implemented, but this cleans up a
bunch of warnings in my build for now.

Fixes: 59f458cd87 ("glsl: Add 16-bit types")
Cc: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Jordan Justen
96fe36f7ac i965: Enable disk shader cache by default
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-20 18:49:43 -08:00
Dave Airlie
baa0feb73d radv: don't send num_tcs_input_cp to sgprs.
We never use it in the shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:36 +00:00
Dave Airlie
952222ddd4 radv/tess: don't need to look in constant for vertices_per_patch
This just avoids passing this value via user sgprs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:28 +00:00
Dave Airlie
77fd1b9187 ac/radv: cleanup some tcs output values access
Just consolidates some code to make it easier to change.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:23 +00:00
Dave Airlie
0e6f0d400b ac/radv: remove total_vertices variable
This just removes an unneeded variable.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:19 +00:00
Dave Airlie
e9b9fb3616 ac/radv: don't mark tess inner as used if we don't use it.
This just avoids marking it as a used output if we don't
actually use it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:15 +00:00
Dave Airlie
d5b2d7ed67 ac/nir: to integer the args to bcsel.
dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw
was hitting an llvm assert due to one value being an int and the
other a float.

This just casts both values to integer and fixes the test.

Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-20 23:15:18 +00:00
Jason Ekstrand
c66fb12117 anv/blorp: Use layout_to_aux_usage when a layout is provided
Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me
something reasonable" we now use anv_layout_to_aux_usage whenever a
layout is available.  If a layout is available, we ignore the aux_usage
parameter.  For the cases where we have an explicit aux usage such as
clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:17 -08:00
Jason Ekstrand
0fa040e6f5 anv/cmd_buffer: Delete some assert-only variables
Checking the sample count is almost as good as aux usage in this case.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:16 -08:00
Jason Ekstrand
e10a62662b anv/cmd_buffer: Use layout_to_* helpers in compute_aux_usage
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:14 -08:00
Jason Ekstrand
7ea8131aa0 anv/cmd_buffer: Simplify transition_depth_buffer
If we don't have HiZ, then anv_layout_to_aux_usage will return NONE for
both layouts.  If the two layouts are the same, they will get the aux
usage.  In either case, the code below will give us ISL_AUX_OP_NONE and
we'll return without doing anything.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:09 -08:00
Jason Ekstrand
87e86ee2e6 anv/cmd_buffer: Do subpass image transitions in begin/end_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
7d5f6b6088 anv/cmd_buffer: Mark depth/stencil surfaces written in begin_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
8a3f086a42 anv/cmd_buffer: Sync clear values in begin_subpass
This is quite a bit cleaner because we now sync the clear values at the
same time as we do the fast clear.  For loading the clear values into
the surface state, we now do it once when we handle the LOAD_OP_LOAD
instead of every subpass.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
a4136b8c1a anv/pass: Store usage in each subpass attachment
This requires us to ditch the VkAttachmentReference struct in favor of
an anv-specific struct.  However, we can now easily identify from just
the subpass attachment what kind of an attachment it is.  This will make
iteration over anv_subpass::attachments a little easier in some case.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
bd356e1bcf anv/cmd_buffer: Add a concept of pending load aspects
These are the same as pending clear aspects only for the "load"
operation.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
e526d49edd anv/cmd_buffer: Iterate all subpass attachments when clearing
This unifies things a bit because we now handle depth and stencil at the
same time.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
2cc3445eb2 anv/cmd_buffer: Decide whether or not to HiZ clear up-front
This moves the decision out of begin_subpass and into BeginRenderPass
like the decision for color clears.  We use a similar name for the
function for depth/stencil as for color even though no aux usage is
really getting computed.

v2 (Jason Ekstrand):
 - Don't always disable HiZ clears by accident
 - Use the initial layout to decide whether to do fast clears

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
6fc8555610 anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
7991838973 intel/blorp: Add a blorp_hiz_clear_depth_stencil helper
This is similar to blorp_gen8_hiz_clear_attachments except that it takes
actual images instead of trusting in the already set depth state.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
1900dd76d0 anv/cmd_buffer: Move the color portion of clear_subpass into begin_subpass
This doesn't really change much now but it will give us more/better
control over clears in the future.  The one interesting functional
change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and
friends for each clear.  However, this only happens at begin_subpass
time so it shouldn't be substantially more expensive.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
6fb9d6c6f5 anv/cmd_buffer: Pass a subpass id into begin_subpass
This is a bit less awkward than passing in the subpass because it means
we don't have to extract the subpass id from the subpass.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
01223b8199 anv/cmd_buffer: Add begin/end_subpass helpers
Having begin/end_subpass is a bit nicer than the begin/next/end hooks
that Vulkan gives us.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
b5bd3fb4e4 anv/cmd_buffer: Apply subpass flushes before set_subpass
This seems slightly more correct because it means that the flushes
happen before any clears or resolves implied by the subpass transition.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
869448a8ab anv: Use framebuffer layers for implicit subpass transitions
Fixes: de3be61801 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
85d0bec961 anv: Be more careful about fast-clear colors
Previously, we just used all the channels regardless of the format.
This is less than ideal because some channels may have undefined values
and this should be ok from the client's perspective.  Even though the
driver should do the correct thing regardless of what is in the
undefined value, it makes things less deterministic.  In particular, the
driver may choose to fast-clear or not based on undefined values.  This
level of nondeterminism is bad.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
4796025ba5 intel/isl: Add an isl_color_value_is_zero helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
116e818ef1 anv/gpu_memcpy: CS Stall before a MI memcpy on gen7
This fixes a pile of hangs caused by the recent shuffling of resolves
and transitions.  The particularly problematic case is when you have at
least three attachments with load ops of CLEAR, LOAD, CLEAR.  In this
case, we execute the first CLEAR followed by a MI memcpy to copy the
clear values over for the LOAD followed by a second CLEAR.  The MI
commands cause the first CLEAR to hang which causes us to get stuck on
the 3DSTATE_MULTISAMPLE in the second CLEAR.

We also add guards for BLORP to fix the same issue.  These shouldn't
actually do anything right now because the only use of indirect clears
in BLORP today is for resolves which are already guarded by a render
cache flush and CS stall.  However, this will guard us against potential
issues in the future.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:19 -08:00
Guillaume Charifi
a572ec2efe st/mesa: Factorize duplicate code for atomic buffer binding
Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 20:54:49 +01:00
Guillaume Charifi
56bfcd50f7 st/mesa: Factorize duplicate code in st_update_framebuffer_state()
Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 20:54:49 +01:00
Rob Clark
4c4e6232ee freedreno/ir3: fix use_count refcnt'ing issue
Was hitting an assert with vs-varying-array-mat4-index-col-row-wr.shader_test

When eliminating a copy, we were dropping the use_count of the mov that
is skipped, but not increasing the use_count of it's src instruction.

Fixes: 76440fcca9 freedreno/ir3: clean up dangling false-dep's
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-20 13:43:42 -05:00
Eric Engestrom
ac731531a1 docs: fix patent url
Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 15:14:34 +00:00
Brian Paul
e7d1a93723 svga: replaced 'unsigned' with proper enum types in shader code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-20 08:11:06 -07:00
Jonathan Gray
9401d90a53 configure.ac: pthread-stubs not present on OpenBSD
pthread-stubs is no longer required on OpenBSD and has been removed.
libpthread parts involved moved to libc.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 15:08:47 +00:00
Andres Gomez
36ac485bd1 swr: bump minimum supported LLVM version to 4.0
Since radv and radeonsi removed support for LLVM 3.9 the distcheck
target got broken because SWR distribution needed 3.9.x.

After checking with George Kyriazis, SWR is OK with moving to LLVM 4.0
and above, which will solve this problem.

Fixes: 3bf1e036e8 ("amd: remove support for LLVM 3.9")
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2018-02-20 17:03:06 +02:00
Andres Gomez
b39f6d5fc7 travis: radeonsi and radv need LLVM 4.0
Fixes: 3bf1e036e8 ("amd: remove support for LLVM 3.9")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 16:58:30 +02:00
Samuel Pitoiset
1ac741d690 ac/nir: move ac_declare_lds_as_pointer() outside of the switch
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-20 10:44:59 +01:00
Samuel Pitoiset
b5d111ae76 radv: allow to force family using RADV_FORCE_FAMILY
Useful for pipeline-db.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-20 10:44:47 +01:00
Thomas Hellstrom
f386776ea5 loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback
Removing this callback caused rendering corruption in some multi-screen cases,
so it is reinstated but without the drawable argument which was never used
by implementations and was confusing since the drawable could have been
created with another screen.

Cc: "17.3 18.0" mesa-stable@lists.freedesktop.org
Fixes: 5198e48a0d (loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105013
Reported-by: Daniel van Vugt <daniel.van.vugt@canonical.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-20 10:36:53 +01:00
Thomas Hellstrom
80c31f7837 svga: Fix a leftover debug hack
Fix what appears to be a leftover debug hack.
The hack would force the driver to take a different blit path; possibly,
although unverified, reverting to software blits.

Tested using piglit tests/quick. No related regressions.

Cc: "17.2 17.3 18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 9d81ab7376 (svga: Relax the format checks for copy_region_vgpu10 somewhat)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104625
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-20 10:12:19 +01:00
Iago Toral Quiroga
af5f2322d0 anv/entrypoints: make vkGetDeviceProcAddr return NULL for instance commands
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-20 08:12:32 +01:00
Ilia Mirkin
e1a70aed10 nv50,nvc0: mark ABGR format as displayable instead of ARGB format
This matches the hardware's capabilities.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-19 22:33:58 -05:00
Ilia Mirkin
f7604d8af5 st/dri: only expose config formats that are display targets
In the case of NVIDIA hardware, ABGR is displayable but ARGB is not.
Only advertise the one set in the visuals list.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-02-19 22:33:58 -05:00
Ilia Mirkin
ebdc4c31e2 mesa: add xbgr support adjacent to xrgb
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-02-19 22:33:58 -05:00
Timothy Arceri
d88a2906f8 st/shader_cache: copy nir pointer to gl_program after deserializing
This fixes a crash when running the arb_get_program_binary-api-errors
piglit test twice.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Timothy Arceri
691c320de0 radeonsi: add nir shader cache support
In future we might want to try avoid calling nir_serialize() but
this works for now.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Timothy Arceri
2b431808ab radeonsi: rename variables tgsi_binary -> ir_binary
This better represents that the ir could be either tgsi or nir.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Emil Velikov
1270990438 docs: update calendar, add news and link release notes to 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-19 22:10:18 +00:00
Emil Velikov
be5a996039 docs: add sha256 checksums for 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 164a993112)
2018-02-19 22:08:14 +00:00
Emil Velikov
ca614d40cd docs: add release notes for 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2529d77179)
2018-02-19 22:08:12 +00:00
Marek Olšák
f78fe98fff radeonsi: fix regression from 32-bit pointers on CI
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-02-19 17:56:23 +01:00
Samuel Pitoiset
549c7f3724 radv: compact varyings after removing unused ones
It makes no sense to compact before, and the description of
nir_compact_varyings() confirms that.

Polaris10:
Totals from affected shaders:
SGPRS: 108528 -> 108128 (-0.37 %)
VGPRS: 74548 -> 74500 (-0.06 %)
Spilled SGPRs: 844 -> 814 (-3.55 %)
Code Size: 3007328 -> 2992932 (-0.48 %) bytes
Max Waves: 16019 -> 16009 (-0.06 %)

Vega10:
Totals from affected shaders:
SGPRS: 106088 -> 106232 (0.14 %)
VGPRS: 74652 -> 74700 (0.06 %)
Spilled SGPRs: 692 -> 658 (-4.91 %)
Code Size: 2967708 -> 2953028 (-0.49 %) bytes
Max Waves: 18178 -> 18162 (-0.09 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-19 12:19:17 +01:00
Timothy Arceri
51e745cf77 radeonsi/nir: fix gl_FragCoord for pixel_center_integer
Fixes piglit test glsl-arb-fragment-coord-conventions

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-19 08:47:48 +11:00
Timothy Arceri
347038baa9 glsl/nir: add pixel_center_integer to shader info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-19 08:47:48 +11:00
Ilia Mirkin
fe76fc11b1 gm107/ir: avoid using kepler instruction capabilities
Split up the op properties table into generation-specific bits, and only
use the kepler ones on kepler. Fixes some CTS images tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-02-17 23:41:21 -05:00
Ilia Mirkin
f08fd676bf nvc0: add support for bindless on maxwell+
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 23:41:21 -05:00
Ilia Mirkin
0255550eb1 gm107/ir: change how SUQ works in preparation for bindless
All this information can be retrieved from the TIC directly. Avoid
having to dip into the constbuf information about the image.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 23:41:21 -05:00
Kenneth Graunke
fa8a764b62 i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address.  This makes it unusable for pushing UBOs.

There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer.  Setting that gives us full
flexibility.  This lets us push up to 4 UBO ranges.

We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance.  We also need a brand new kernel that supports
context isolation - on older kernels, newly created contexts inherit
register state from whatever happened to be running.  So, setting this
would have catastrophic impact on other drivers such as libva, Beignet,
or older Mesa.

See commit 8ec5a4e4a4 where we did this
once before, but had to revert it in commit 013d331220.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-17 11:26:31 -08:00
Kenneth Graunke
a63c74be85 i965: Stop restoring the default L3 configuration on Kernel 4.16+.
Kernel 4.16 has proper context isolation, which means we can change
the L3 configuration without worrying about that leaking to other
newly created contexts, breaking the assumptions of other userspace.

So, disable our workaround to reprogram it back to the default.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-17 11:26:18 -08:00
Mikko Perttunen
5a1606c51f nvc0: Use GP100_COMPUTE_CLASS on GP10B
GP10B requires the use of GP100_COMPUTE_CLASS instead of
GP104_COMPUTE_CLASS as is used for other non-GP100 chips.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 14:16:10 -05:00
Daniel Stone
9d21dbeb88 i965: Fix aux-surface size check
The previous commit reworked the checks intel_from_planar() to check the
right individual cases for regular/planar/aux buffers, and do size
checks in all cases.

Unfortunately, the aux size check was broken, and required the aux
surface to be allocated with the correct aux stride, but full image
height (!).

As the ISL aux surface is not recorded in the DRIimage, we cannot easily
access it to check. Instead, store the aux size from when we do have the
ISL surface to hand, and check against that later when we go to access
the aux surface.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: c2c4e5bae3 ("i965: Fix bugs in intel_from_planar")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-17 10:22:35 +00:00
Marek Olšák
931ec80eeb radeonsi: implement 32-bit pointers in user data SGPRs (v2)
User SGPRs changes:
    VS:     14 ->  9
    TCS:    14 -> 10
    TES:    10 ->  6
    GS:      8 ->  4
    GSCOPY:  2 ->  1
    PS:      9 ->  5
    Merged VS-TCS: 24 -> 16
    Merged VS-GS:  18 -> 11
    Merged TES-GS: 18 -> 11

SGPRS: 2170102 -> 2158430 (-0.54 %)
VGPRS: 1645656 -> 1641516 (-0.25 %)
Spilled SGPRs: 9078 -> 8810 (-2.95 %)
Spilled VGPRs: 130 -> 114 (-12.31 %)
Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
Code Size: 52094872 -> 52692540 (1.15 %) bytes
Max Waves: 371848 -> 372723 (0.24 %)

v2: - the shader cache needs to take address32_hi into account
    - set amdgpu-32bit-address-high-bits

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2018-02-17 04:52:17 +01:00
Marek Olšák
5722cd4084 radeonsi: disallow constant buffers with a 64-bit address in slot 0
State trackers must use a user buffer or const_uploader,
or set pipe_resource::flags same as const_uploader->flags.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
d790b6cece radeonsi: move const_uploader allocations to 32-bit address space
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
50581549b7 winsys/radeon: implement and enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
1104d1e9d3 winsys/radeon: add struct radeon_vm_heap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
48ecacfefa winsys/amdgpu: enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
c2da45be86 gallium/radeon: add 32-bit address space heaps
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
0977b7f7b3 ac: query high bits of 32-bit address space 2018-02-17 04:51:58 +01:00
Marek Olšák
16be55da94 gallium: use PIPE_CAP_CONSTBUF0_FLAGS 2018-02-17 04:20:55 +01:00
Marek Olšák
8e7222f4e5 gallium: allow drivers to impose BO flags restrictions on constant buffer 0
Required by radeonsi for optimal behavior.
2018-02-17 04:20:55 +01:00
Alexander von Gluck IV
834d221512 meson: Add Haiku platform support v4
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-16 16:56:34 -06:00
Anuj Phogat
7b283544dc anv/icl: Add render target flush after uploading binding table
The PIPE_CONTROL command description says:

"Whenever a Binding Table Index (BTI) used by a Render Taget Message
points to a different RENDER_SURFACE_STATE, SW must issue a Render
Target Cache Flush by enabling this bit. When render target flush
is set due to new association of BTI, PS Scoreboard Stall bit must
be set in this packet."

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
136f583a24 anv/icl: Enable float blend optimization
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
cd7102972f anv/icl: Use gen11 functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
9673c21d4f anv/icl: Build anv libs for gen11
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
1f108b436b anv/icl: Generate gen11 entry point functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
a86c0a08df anv/icl: Don't use DISPATCH_MODE_SIMD4X2
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
cd5fc634a8 anv/icl: Don't use SingleVertexDispatch
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
6e3940b3cf anv/icl: Don't set ResetGatewayTimer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
41a4c2c8e8 anv/icl: Add #define genX
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Anuj Phogat
413d475b44 anv/icl: Add gen11 mocs defines
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Kenneth Graunke
1d6cf433d2 i965: Implement GenerateMipmap directly, rather than using Meta.
Meta is awful and we'd like to stop using it.  Implementing this using
BLORP allows us to stop trashing a bunch of GL state every time.

This follows the structure of st_generate_mipmap().
compute_num_levels is lifted directly from there.

Improves performance in Gl41HdrBloom by about 11.794% +/- 1.01919% (n=3)
on Kabylake GT2 at 1280x720 (the difference seems much smaller at higher
resolutions).

v2 (idr): Don't try depth or depth-stencil blorp blits on Gen4 or Gen5
because it's not implemented yet.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-02-16 10:48:10 -08:00
Kenneth Graunke
9bcd31ea90 mesa: Move compute_num_levels from st_gen_mipmap.c to mipmap.c.
I want to use compute_num_levels inside i965.  Rather than duplicating
it, move it from mesa/st to core Mesa, and make it non-static.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-16 10:48:10 -08:00
Dylan Baker
03ab40b1f7 meson: freedreno depends on nir
This fixes a race condition in building targets that link in freedreno.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120
Fixes: 0bbecc5a85 ("meson: define driver dependencies")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Mark Janes <mark.a.janes@intel.com>
2018-02-16 10:10:18 -08:00
George Kyriazis
f1fbeb1a53 swr/rast: blend_epi32() should return Integer, not Float
fix gcc8 compiler error for KNL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105029
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
7dd793d10c swr/rast: Normalize path for debug metadata
in template gen_llvm.hpp

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
f979d0bc2f swr/rast: Consolidate archrast Draw events
Consolidate archrst draw events into single draw event with an attribute
that represents the type of draw

- Add handlers for new private proto versions of DrawInstancedEvent,
  DrawIndexedInstancedEvent, DrawInstancedSplitEvent, and
  DrawIndexedInstancedSplitEvent
- Convert the draw events to generic DrawInfoEvents
- parse_proto_event_fields() replaces 'AR_DRAW_TYPE' as a field type with
  'uint32_t'. This draw type is actually an enum, but can be represented
  as an unsigned integer.
- is_draw_or_dispatch() recognizes DrawInfoEvent as a draw event

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
45df1a6520 swr/rast: Add semantics for translating address
Added support for another full translation path in fetch jitter.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
c09483cf0a swr/rast: Convert C Sampler intrinsics
Convert portions of the C sampler to the rasty SIMD lib.

Also fix SRL call with a non-immediate.  Don't count on the compiler
automagically converting an srli call to srl if the shift count isn't
an immediate.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
37ebf86add swr/rast: Make SIMDLib templated types easier to use
"typename SIMD_T::TypeName" --> "TypeName<SIMD_T>"

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
74e8bb4a22 swr/rast: Be more explicit when fetching next component
Use a new function to denote that we want to get offset to next component
and hide the fact that GEP is used underneath.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
da77eb55d5 swr/rast: Fix bug related to passing AR handle
We were passing a garbage handle. Let's not do that.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
48d62409f8 swr/rast: Fix primitive replication issue in tesselation PA.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
e12db47a7d swr/rast: Use llvm intrinsic masked gather
Use llvm intrinsic masked.gather instead of manual unroll for the cases
where we have vector of pointers.  Improves llvm IR debug experience by
reducing a ton of IR to a single intrinsic call. Also seems to reduce
overall stack use considerably.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
9cc9688e49 swr/rast: Misc cleanup
Together with correct detection of clipDistance NaNs when no cullDistance is set

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
036c8b6247 swr/rast: Renamed variable in vertexbufferstate
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
b25efa36e6 swr/rast: Fix GATHERPS to avoid assertions.
With the pBase type change, LLVM was asserting because of wrong types.
Cast appropriately.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
8a64593bde swr/rast: More precise user clip distance interpolation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
3e560b7c85 swr/rast: Cull prims when all verts have negative clip distances
Performance optimization, and fixes some clipping issues.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
cb4b604ebd swr/rast: whitespace and comment cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
5df4d98780 swr/rast: Fix invalid number of attributes
Fix invalid number of attributes passed into tesselation PA.
Needs to take into account any offsets from the shader.
Innocuous issue, but removes an assert firing in debug.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
2053472723 swr/rast: Add clipper stats.
Clipper event is now:

event ClipperEvent
{
    uint32_t drawId;
    uint32_t trivialRejectCount;
    uint32_t trivialAcceptCount;
    uint32_t mustClipCount;
};

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
0420b2be89 swr/rast: Separate event types to public and private
Split into two proto files and modify appropriate build rules for
configure / scons / meson builds.

There are private internal events (proxy) that communicate information
from rasterizer to ArchRast. ArchRast can use these events to calculate
a final answer and then emit other public events which will be saved to
file. Users will use the public proto file and not the private one.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
e48dd2489c swr/rast: Clean up event types and remove BE events
Begin/End events not needed anymore.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
7070027d7b swr/rast: Removed unused variable
Gets rid of zillions of unused variable warnings, made worse by templates.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
e3f92bb7af swr/rast: Separate RDTSC code from archrast
Renamed rdstc defines more appropriately

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
8bce71622e swr/rast: Cleanup of mpPrivateContext in Builder
Provide access functions for mpPrivateContext in Builder.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
5697dc3e23 swr/rast: Remove some JIT debug code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
2407b8c9b4 swr/rast: Don't include private context in gather args
Move mpPrivateContext to compensate

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
a4c23fc25b swr/rast: Cleanup knob definitions
Rename some of the categories and move some options around.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:42 -06:00
George Kyriazis
ec34ed73d6 swr/rast: Add missing parameter to a few gather functions
We now pass pDrawContext as a default parameter

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:39:42 -06:00
Philipp Zabel
bfe4e24a42 etnaviv: add useful information to BO import errors
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-02-16 17:05:43 +01:00
Daniel Stone
ff5432dc50 egl/wayland: Always use in-tree wayland-egl-backend.h
A recent patchset to Wayland[0] migrated Mesa's libwayland-egl backend
into Wayland itself, so implementations could provide backends. Mesa
still uses its own, and the two have already diverged[1].

The include from egl_dri2.h could pick up either the installed Wayland
wayland-egl-backend.h (with a 'driver_private' member), or the Mesa
internal wayland-egl-backend.h (with a 'private' member), failing the
build in the first instance.

Add an explicit directory prefix to the include, so we always get our
in-tree version.

[0]: https://patchwork.freedesktop.org/series/31663/
[1]: https://cgit.freedesktop.org/wayland/wayland/commit/?id=9fa60983b579

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105103
Fixes: 198af27c67 ("wayland-egl: rename wayland-egl-{priv,backend}.h")
2018-02-16 14:04:19 +00:00
Daniel Stone
f766e1afa5 meson: Move Wayland dmabuf to wayland-drm
As the comment notes: linux-dmabuf has nothing to do with wayland-drm,
but we need a single place to build these files we can use from both EGL
and Vulkan, which is guaranteed to be included before both EGL and
Vulkan WSI.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-02-16 14:04:19 +00:00
Eric Engestrom
65dda6c9ec egl/wayland: check for invalid format index
v2: just tell the compiler to assume the format will always be found, as
it comes from the table itself to begin with. (DanielS)

CID: 1429516
Fixes: d32b23f383 "egl/wayland: Add bpp to visual map"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-16 13:14:29 +00:00
Eric Engestrom
a176b053b6 glsl: fix sizeof(pointer) bug
Doesn't really change anything to the test though ¯\_(ツ)_/¯

CID: 1429511
Fixes: e8495646af "glsl/tests: changes to test_disk_cache_create test"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-16 12:04:29 +00:00
Timothy Arceri
2f5d3df9fc radeonsi/nir: set TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL correctly
We set this for post_depth_coverage in addition to early_fragment_tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-16 15:53:13 +11:00
Dave Airlie
60c14a0db2 virgl: remap query types to hw support.
The gallium query types changed, so we need to remap from the
gallium ones to the virgl ones.

Fixes:
dEQP-GLES3.functional.transform_feedback.basic_types*

"This also fixes:

dEQP-GLES3.functional.transform_feedback.array.separate*
dEQP-GLES3.functional.transform_feedback.array_element*
dEQP-GLES3.functional.transform_feedback.interpolation.*

Gallium's p_defines.h and virglrenderer's p_defines.h have diverged
quite a bit, so not including
PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE there makes sense for now."
 - Gurchetan Singh

Fixes: 3f6b3d9db (gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-16 12:42:06 +10:00
Anuj Phogat
8a05b06146 i965/icl: Add render target flush after uploading binding table
From PIPE_CONTROL command description in gfxspecs:

"Whenever a Binding Table Index (BTI) used by a Render Taget Message
 points to a different RENDER_SURFACE_STATE, SW must issue a Render
 Target Cache Flush by enabling this bit. When render target flush
 is set due to new association of BTI, PS Scoreboard Stall bit must
 be set in this packet."

V2: Move the PIPE_CONTROL to update_renderbuffer_surfaces() in
    brw_wm_surface_state.c (Ken).

Fixes a fulsim error and a GPU hang described in below JIRA.
JIRA: MD5-322
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
3f8289164f i965/icl: Enable float blend optimization and Wa3DStateMode
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
ba3cbee6c5 intel/common/icl: Add has_sample_with_hiz flag in gen_device_info
Sampling from hiz is enabled in i965 for GEN9+ but this feature has
been removed from gen11. So, this new flag will be useful to turn
the feature on/off for different gen h/w. It will be used later
in a patch adding device info for gen11.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
9c144dc81e i965/icl: Add assertions to check dispatch mode is SIMD8
SIMD4x2 dispatch mode has been removed in GEN11. We're not using
it anyways in Mesa. Adding few asserts to make it explicit.

Use GEN_GEN macro in place of devinfo->gen (Ken)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
02e91b6d62 i965/icl: Update switch statements
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
27d0034938 i965/icl: Update the assert in brw_memory_barrier()
Nothing is changed here from gen10 to gen11. So, just update
the assert.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
d6b26649a6 i965/icl: Define and use icl mocs settings
Gen11 MOCS settings are duplicate of Gen10 MOCS settings.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
e9ad5c9a5d i965/icl: Update the comment for maximum number of threads per PSD
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
93f601d7ed i965/icl: Build and use gen11 functions for genxml state-upload and blorp
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:56 -08:00
Anuj Phogat
85f319155f i965/icl: Don't set ResetGatewayTimer
This field is removed in gen11+

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
772a75be46 intel/icl: Do StateCacheInvalidation for indirect clear color
StateCacheInvalidation is required on all gen7+ platforms. We
don't need to update this check for every new gen h/w unless
this requirement is changed. So, dropping the check for latest
gen h/w.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
bff24e2173 intel/isl/icl: Build and use gen11 surface state emit functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:55 -08:00
Anuj Phogat
0427bd4954 intel/isl/icl: Add the maximum surface size limit
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
c68ede0be7 intel/genxml/icl: Update genx_bits header
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
165a68b05a intel/genxml/icl: Generate packing headers
Move build system changes in to one patch (Ken, Emil)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:55 -08:00
Anuj Phogat
7ed27d8cbf intel/genxml/icl: Add gen11.xml
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 16:14:55 -08:00
Kenneth Graunke
4dee8f0548 i965: Drop EXEC_OBJECT_CAPTURE defines.
These only existed to avoid making people update libdrm for new uABI
headers.  A while ago we imported those headers into the Mesa repo,
so the dependency is gone and these are no longer useful.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-15 15:35:52 -08:00
Jan Vesely
78673b614b clover: Fix build after llvm r325155 and r325160
r325155 ("Pass a reference to a module to the bitcode writer.")
and
r325160 ("Pass module reference to CloneModule")

change function interface from pointer to reference.

v2: Fix indentation (tab instead of spaces)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-15 18:18:53 -05:00
Bas Nieuwenhuizen
05d84ed68a radv: Always lower indirect derefs after nir_lower_global_vars_to_local.
Otherwise new local variables can cause hangs on vega.

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105098
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-15 23:45:59 +01:00
Dylan Baker
2ab1ce30c4 meson: fix xvmc target linkage
This needs to link the state tracker with --whole-archive to expose the
right symbols.

v4: - Always add libswdri and libswkmsdri to the link_with list

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:38:43 -08:00
Dylan Baker
0b73c329bc meson: Fix xa target linkage
This needs to use --whole-archive (link_whole in meson) to properly
expose symbols.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 0ba909f0f1 ("meson: build gallium xa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:31 -08:00
Dylan Baker
91a59b6287 meson: Fix omx-bellagio target linkage
This needs to use --whole-archive (link_whole in meson) to properly
expose symbols.

v4: - Always add libswdri and libswkmsdri to link_with

Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:26 -08:00
Dylan Baker
2e4be28fb2 meson: fix va target linkage
The state tracker needs to be linked with whole-archive (like
autotools). As a result there are symbols from libswdri and libswkmsdri
that are needed, so link those as well.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:16 -08:00
Dylan Baker
90d361753c meson: fix vdpau target linkage
The VDPAU state tracker needs to be linked with whole-archive (autotools
does this). Because we are linking the whole archive we alos need to
link with libswdri and libswkmsdri if those have been enabled.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 68076b8747 ("meson: build gallium vdpau state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:09 -08:00
Dylan Baker
3403055768 meson: Actually link xvmc target with libxvmc
Unlike vdpau this is required.

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:04 -08:00
Dylan Baker
7708103857 meson: actually link with libomxil-bellagio
This state tracker actually needs to link, unlike vdpau.

Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:57 -08:00
Dylan Baker
7023b373ec meson: link dri3 xcb libs into vlwinsys instead of into each target
This makes the dependencies easier to manage, since each media target
doesn't need to worry about linking to half a dozen libraries.

Fixes: b1b65397d0 ("meson: Build gallium auxiliary")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:51 -08:00
Dylan Baker
424e654cb0 meson: use va-api version reported by pkg-config
Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:47 -08:00
Dylan Baker
8eb608df61 meson: add libswdri and libswkmsdri to dri link_with
Fixes: b154b44ae3 ("meson: build radeonsi gallium driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:42 -08:00
Dylan Baker
be879f9f29 meson: add libswdri and libswkmsdri to d3dadaptor link_with
v5: - Fix libswdi -> libswdri typo

Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:36 -08:00
Dylan Baker
d672084ba2 meson: define empty variables for libswdri and libswkmsdri
This allows these variables to unconditionally included in `link_with`
lists, even if they're not used. This allows deleting duplicated logic
in nearly every gallium target implemented in meson today. This also
removes the now useless `build_by_default` flag from swdri and swkmsdri.

v4: - add this patch

Fixes: 66c94b9313
       ("meson: build gallium winsys for dri, null, and wrapper")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:23 -08:00
Dylan Baker
7d0e342af2 meson: add convenience variable for anv_extensions.py depdendency
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:46:07 -08:00
Dylan Baker
0e617c04f1 meson: use depend_files for adding extra file dependencies
cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: dd088d4bec ("anv/extensions: Generate a header file with extension tables")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:46:04 -08:00
Dylan Baker
b03969a5ad meson: use depend_files to track extra file dependencies
cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: f939940809 ("anv: Split anv_extensions.py into two files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:45:56 -08:00
Dylan Baker
384bff13e0 Revert "anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py"
This reverts commit 10d1b0be8e.

This is unnecessary, the depend_files argument is for adding
dependencies on files that are not part of the input, which is already
done.

cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 10d1b0be8e
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:45:40 -08:00
Brian Paul
64a1223a80 svga: replace gotos with else clauses
Simple clean-up.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:49:06 -07:00
Brian Paul
fa901768a4 svga: s/unsigned/enum pipe_shader_type/
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-15 09:05:09 -07:00
Brian Paul
8b54299c34 svga: move duplicated code for setting fillmode/flatshade state
Move the calls to svga_hwtnl_set_fillmode() and svga_hwtnl_set_flatshade()
out of the two retry_draw_*() functions to the svga_draw_vbo() function.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:09 -07:00
Brian Paul
072df89a79 svga: move svga_update_state() call in draw code
This fixes a few Piglit transform feedback regressions caused by
commit 7a1401938b.

In that change I moved the moved svga_update_state() into the loops,
after the calls to svga_hwtnl_set_flatshade().  But
svga_hwtnl_set_flatshade() actually depends on some derived shader
state.  This patch moves the svga_update_state() call into
svga_draw_vbo() so it's not duplicated in two places.

Fixes: 7a1401938b ("svga: clean up retry_draw_range_elements(),
retry_draw_arrays()")

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:08 -07:00
Brian Paul
6f0aec5671 svga: call tgsi_scan_shader() for dummy shaders
If we fail to compile the normal VS or FS we fall back to a simple/
dummy shader.  We need to rescan the the shader to update the shader
info.  Otherwise, this can lead to further translations failures
because the shader info doesn't match the actual shader.

Found by adding some extra debug assertions in the state-update code
while debugging something else.

v2: also update shader generic_inputs/outputs, etc. per Charmaine

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:01 -07:00
Samuel Pitoiset
579b33c1fd ac/nir: do not reserve user SGPRs for unused descriptor sets
In theory this might lead to corruption if we bind a descriptor
set which is unused, because LLVM is smart and it can re-use
unused user SGPRs. In practice, this doesn't seem to fix
anything.

As a side effect, this will reduce the number of emitted
SH_REG packets.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
Samuel Pitoiset
309854148c ac/shader: fix gathering of desc_set_used_mask
This was quite wrong.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
Samuel Pitoiset
61a4fc3ecc ac/shader: be a little smarter when scanning vertex buffers
Although meta shaders don't use any vertex buffers, there is no
behaviour change but I think it's better to do this. Though,
this saves two user SGPRs for push constants inlining or
something else.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
Louis-Francis Ratté-Boulianne
a34715ad9c dri: fromPlanar() can return NULL as a valid result
It was assumed that fromPlanar() could return NULL to mean
that the planar image is the same as the parent DRI image.
That assumption wasn't made everywhere though.

Let's fix things and make sure that all callers understand
a NULL result

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-15 11:58:17 +00:00
Emil Velikov
f0654dfa65 docs: correct link to the 17.3.3 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 11:33:27 +00:00
Emil Velikov
dd4734d5c1 docs: update calendar, add news and link release notes to 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 11:33:04 +00:00
Emil Velikov
eadde35f83 docs: add sha256 checksums for 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 26c84b8af9)
2018-02-15 11:28:19 +00:00
Emil Velikov
6f4a6e2310 docs: add release notes for 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2f9820c553)
2018-02-15 11:28:18 +00:00
Karol Herbst
7bc15090fc nvc0: disable MS Images for sample_count == 1 on Maxwell
fixes KHR-GL45.multi_bind.dispatch_bind_textures on Maxwell

Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-15 11:14:46 +01:00
Gurchetan Singh
c6694793e1 mesa: don't clamp just based on ARB_viewport_array extension
The ARB_viewport_array spec says:

"Dependencies
    OpenGL 1.0 is required.

    OpenGL 3.2 or the EXT_geometry_shader4 or ARB_geometry_shader4 extensions
    are required.

    This extension is written against the OpenGL 3.2 (Compatibility)
    Specification."

As such, we should ignore it for GLES2 contexts.

Fixes:
dEQP-GLES2.functional.state_query.integers.viewport_getinteger
dEQP-GLES2.functional.state_query.integers.viewport_getfloat

on llvmpipe and virgl.

v2: Use _mesa_has_* (Ilia)

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-15 01:58:50 +01:00
Dylan Baker
5317211fa0 meson: use a custom target instead of a generator for i965 oa
Generators really are never the thing you want. The problem in this case
is that a generator must create a file that contains any file that the
generated target depends on. Since brw_oa.py doesn't generate such a
file the generated sources are not regenerated even if the xml files
they should depend on changes.

While we could change brw_oa.py to write such a file, that's silly, it
depends on itself and the xml file. So we'll just use a custom target
instead, which will have the correct dependency behavior and doesn't
really add that much code.

Fixes: 3218056e0e ("meson: Build i965 and dri stack")
CC: Ian Romanick <idr@freedesktop.org>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-14 16:45:40 -08:00
Anuj Phogat
0cd37f9178 isl: Don't use surface format R32_FLOAT for typed atomic integer operations
From Skylake PRM Surface Formats section:

   "The surface format for the typed atomic integer operations must
    be R32_UINT or R32_SINT."

Fixes an error and a piglit GPU hang in simulation environment.
Piglit test: gl45-imageAtomicExchange-float.shader_test

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.co
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
2018-02-14 16:30:05 -08:00
Timothy Arceri
7be5f30bb1 radeonsi/nir: fix si_nir_load_tcs_varyings() for outputs
We were incorrectly using the input info for outputs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
9740c8a8aa ac: implement nir_intrinsic_image_samples
Fixes cts test:
KHR-GL45.shader_texture_image_samples_tests.image_functional_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
c6b70a0eae st: add NIR GL_ARB_get_program_binary support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
928be4e97e st/shader_cache: add st_{de}serialise_nir_program() helpers
These will be used for NIR GL_ARB_get_program_binary support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
3ad52501dc ac/nir_to_llvm: fix image size for arrays of arrays
Fixes cts test:
KHR-GL44.shader_image_size.advanced-changeSize

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
6acab18828 radeonsi/nir: fix shader ballot return value bitsize
Fixes cts test:
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Jason Ekstrand
8534af44e4 intel/aubinator: Correctly decode INTERFACE_DESCRIPTOR_DATA
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 13:17:26 -08:00
Jason Ekstrand
5c9d47d9c6 i965: Add gl_state_index casts for PATCH_VERTICES_IN
This fixes the build in clang

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105088
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 13:16:47 -08:00
Scott D Phillips
3b4f432d9b i965/miptree: Initialize mcs with a linear map
When initializing mcs, map with MAP_RAW and fill in the linear
map. Removes a place where gtt mapping is used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 12:38:34 -08:00
Scott D Phillips
d13ab69a78 i965/tiled_memcpy: change linear pointer from (0, 0) to (xt1, yt1)
In all current uses, the linear surface is only allocated starting
at (xt1, yt1) anyway, so this improves the calling ergonomics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 12:38:34 -08:00
Scott D Phillips
ecaad89525 i965/tiled_memcpy: linear_to_ytiled a cache line at a time
TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0
Thus a cache line in the tiled surface is composed of a 2d area of
16x4 bytes of the linear surface.

Add a special case where the area being copied is 4-line aligned
and a multiple of 4-lines so that entire cache lines will be
written at a time.

On Apollolake, this increases tiling throughput to wc maps by
84.0103% +/- 0.862818%

v2: Split [y0, y1) and [y2, y3) loops apart for clarity (Jason Ekstrand)
v3: Don't reset src var (Jason), Ensure y0 <= y1 <= y2 <= y3

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-14 12:38:34 -08:00
Rafael Antognolli
eb2e17e2d1 docs: Add Cannonlake support to 18.0 release notes.
17.4 is actually 18.0.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:11:05 -08:00
Rafael Antognolli
fcae3d1a9a anv/gen10: Remove warning message.
Gen10 seems pretty stable so far, remove "alpha support" message.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:11:01 -08:00
Rafael Antognolli
bf1577fe09 i965/gen10: Remove warning message.
Gen10 seems pretty stable so far, so there's no reason to keep this
message.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:09:41 -08:00
Louis-Francis Ratté-Boulianne
aad14cf15a egl/x11: Fix leak in dri3_create_image_khr_pixmap
bp_reply wasn't properly free'd

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-14 11:52:06 +00:00
Iago Toral Quiroga
cb9dbd6dec i965/compiler: clean up nir_intrinsic_load_input for vertex shaders
This code to re-set the type of the source and destination is not
necessary since we never manipulate the types. Looks like a
left over from a time where we had to retype to float temporarily
to handle 64-bit inputs.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-14 12:00:14 +01:00
Iago Toral Quiroga
4917d38321 intel/compiler: fix first_component for 64-bit types on vertex inputs
Divide it by two as we do for other stages. This is because the
component layout qualifier is always in 32-bit units.

Fixes issues in a new CTS test (still WIP):
KHR-GL45.enhanced_layouts.varying_double_components

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-14 12:00:14 +01:00
Samuel Pitoiset
ad4b58ea70 ac/nir: rename nir_to_llvm_context to radv_shader_context
There is still more to do in that area, but it's a good start.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:16 +01:00
Samuel Pitoiset
141db61509 ac: remove nir_to_llvm_context from ac_nir_translate()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:14 +01:00
Samuel Pitoiset
a541117ff4 ac/nir: remove nir_to_llvm_context::nir link
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:12 +01:00
Samuel Pitoiset
e9f0205ca2 ac: move the outputs array to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:10 +01:00
Samuel Pitoiset
07e4268f36 ac/shader: scan force_persample
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:08 +01:00
Dave Airlie
b9d2ff05a6 r600: fix regression in gl_FragColor drawing
This fixes a regression in the broadcast color to all color bufs case.

Fixes: 6c691081a (r600: fixup sparse color exports.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 14:02:41 +10:00
Dave Airlie
9c9a9bee44 r600: fix array spill if temp[0] is before all arrays
I found a shader with
DCL TEMP[0], LOCAL
DCL TEMP[1..256], ARRAY(1), LOCAL
DCL TEMP[257..512], ARRAY(2), LOCAL
DCL TEMP[513..768], ARRAY(3), LOCAL
DCL TEMP[769], LOCAL

This would remap badly, as it would add up all the spilled sizes
and subtract it from the temp for 0. If the current temp is less
than the array start break out.

Fixes: 1d871aa6 (r600g: Implement spilling of temp arrays (v2))
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:37:59 +10:00
Dave Airlie
8f2656c75b virgl: add ARB_sample_shading support.
This enable ARB_sample_shading if the renderer supports it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:06:07 +10:00
Dave Airlie
9b95b70719 virgl: add ARB_draw_indirect support.
This relies on the renderer code landing first.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:06:07 +10:00
Roland Scheidegger
f6718baabc tgsi: Recognize RET in main for tgsi_transform
Shaders coming from dx10 state trackers have a RET before the END.
And the epilog needs to be placed before the RET (otherwise it will
get ignored).
Hence figure out if a RET is in main, in this case we'll place
the epilog there rather than before the END.
(At a closer look, there actually seem to be problems with control
flow in general with output redirection, that would need another
look. It's enough however to fix draw's aa line emulation in some
internal bug - lines tend to be drawn with trivial shaders, moving
either a constant color or a vertex color directly to the output).

v2: add assert so buggy handling of RET in main is detected

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-14 02:06:54 +01:00
Bas Nieuwenhuizen
7461bd5b8f ac: Use the renumbered const address space for LLVM 7.
The LLVM AMDGPU backend decided to renumber the constant address
space ....

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-14 01:05:03 +01:00
Dave Airlie
9ddacd9af4 gallium: drop all the guard band float caps.
Nobody queries these and nobody sets them to anything useful,
the docs say TODO.

Drop them until a use appears.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 08:50:08 +10:00
Vadym Shovkoplias
a553c54abf mesa: add glsl version query (v4)
Add support for GL_NUM_SHADING_LANGUAGE_VERSIONS
and glGetStringi for GL_SHADING_LANGUAGE_VERSION

v2:
  - Combine similar functionality into
    _mesa_get_shading_language_version() function.
  - Change GLSL version return mechanism.
v3:
  - Add return of empty string for GLSL ver 1.10.
  - Move _mesa_get_shading_language_version() function
    to src/mesa/main/version.c.
v4:
  - Add OpenGL version check.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104915
Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 13:24:31 -07:00
Brian Paul
b08d718703 mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()
The EXTRA_VERSION_40 predicate is tested as part of
extra_gl40_ARB_sample_shading but there was no switch case for it.

Fixes: 77b440e42d ("mesa: Add new functions and enums required
by GL_ARB_sample_shading")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-13 10:35:55 -07:00
Mark Janes
e5809788d6 mesa: fix compile failure
Missing header triggered a failure in i965 CI buildtest project.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
Fixes: e149a0253c
2018-02-13 00:22:05 -08:00
Mark Janes
d9de7aaca3 Partially revert "mesa: use GLenum16 in a few more places"
This reverts part of commit ca721b3d89.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
2018-02-13 00:22:05 -08:00
Mark Janes
3e5758a70a Revert "mesa: reduce the size of gl_texture_image"
This reverts commit f4ea2b2a9e.

Several members reduced in size by the offending commit are not large
enough to store the data needed by the i965 driver.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
2018-02-13 00:22:05 -08:00
Dave Airlie
db5f422169 i965: fix tessellation regressions with gl_state_index16
Looks like one conversion was missed.

Fixes: e149a0253 (mesa,glsl,nir: reduce gl_state_index size to 2 bytes)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-02-12 23:05:16 -08:00
Stéphane Marchesin
5e4a2b394e virgl: Support v2 caps struct (v2)
This struct allows us to report:
- accurate max point size/line width.
- accurate texel and texture gather offsets
- vertex/geometry limits.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-13 14:23:54 +10:00
Timothy Arceri
10457712ed ac/nir: add nir_intrinsic_{load,store}_shared support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
c787cbfa33 ac/nir_to_llvm: add support for nir_intrinsic_shared_atomic_*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
b6cf898ec2 radeonsi: make si_declare_compute_memory() more generic and call for nir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
94fa090fad st/glsl: set req_local_mem earlier for compute shaders
Without this change it will never be set for backends using nir.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-13 14:43:05 +11:00
Marek Olšák
6b1e26e181 mesa: move STATE_LENGTH to shader_enums.h and use it everywhere
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
f4ea2b2a9e mesa: reduce the size of gl_texture_image
80 -> 40 bytes.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
4794fbc86e mesa: reduce the size of gl_program_parameter
40 -> 24 bytes, which includes the gl_state_index16 change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
e149a0253c mesa,glsl,nir: reduce gl_state_index size to 2 bytes
Let's use the new gl_state_index16 type everywhere and remove
the typecasts.

This helps reduce the size of gl_program_parameter.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
a7882013d3 mesa: reduce the size of gl_viewport_attrib
All drivers convert these to float, so there is no reason to use double.
The piglit test that expects double precision from glGet will be adjusted
not to require it (there is a piglit patch).

gl_context::ViewportArray: 512 -> 384 bytes

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
d7550d783a mesa: reduce the size of gl_texture_object
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
65ed98839b mesa: reduce the size of gl_program
gl_program: 1456 -> 976 bytes

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
78f1decc95 mesa: reduce the size of gl_image_unit (v2)
gl_context::ImageUnits: 6144 -> 4608 bytes

v2: use ASSERT_BITFIELD_SIZE

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
ca5c5d96d8 mesa: further reduce the size of ctx->Texture
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
78043a75f6 mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8
GL allows doing glTexEnv on 192 texture units, while in reality,
only MaxTextureCoordUnits units are used by fixed-func shaders.

There is a piglit patch that adjusts piglits/texunits to check only
MaxTextureCoordUnits units.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
07c10cc59c mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
79aca14f5f mesa: inline init_texture_unit
because this is going to be changed

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
ca721b3d89 mesa: use GLenum16 in a few more places
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Jason Ekstrand
4c77e21c81 anv: Move setting current_pipeline to cmd_state_init
We were setting current_pipeline to UINT32_MAX and then calling
cmd_cmd_state_reset which memsets the entire state struct to 0 which
implicitly resets current_pipeline to 3D.  I have no idea how this
hasn't caused everything to explode.

Fixes: cd3feea745 "anv/cmd_buffer: Rework anv_cmd_state_reset"
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-12 15:18:23 -08:00
Jason Ekstrand
f37bd726c7 anv: Don't resolve or ambiguate non-existent layers
The previous code was trying to avoid non-existent layers by taking a
MAX with anv_image_aux_layers.  Unfortunately, it wasn't taking into
account that layer_count starts at base_layer which may not be zero.
Instead, we need to subtract base_layer from anv_image_aux_layers with
a guard against roll-over.

Fixes: de3be61801 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-12 15:14:57 -08:00
Daniel Stone
c2c4e5bae3 i965: Fix bugs in intel_from_planar
This commit fixes two bugs in intel_from_planar.  First, if the planar
format was non-NULL but only had a single plane, we were falling through
to the planar case.  If we had a CCS modifier and plane == 1, we would
return NULL instead of the CCS plane.  Second, if we did end up in the
planar_format == NULL case and the modifier was DRM_FORMAT_MOD_INVALID,
we would end up segfaulting in isl_drm_modifier_has_aux.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 8f6e54c929
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-12 15:14:45 -08:00
Eric Anholt
1aed66dc1e radv: Fix compiler warning about uninitialized 'set'
The compiler doesn't figure out that we only get result == VK_SUCCESS if
set got initialized.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 20:48:47 +00:00
Eric Anholt
21670f8208 glsl/tests: Fix strict aliasing warning about int64/double.
Fixes: 4bf9862747 ("glsl/tests: Add UINT64 and INT64 types")
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2018-02-12 20:48:43 +00:00
Eric Anholt
091bff8317 ac/nir: Fix compiler warning about uninitialized dw_addr.
Even switching the def's condition to be the same chip revision check as
the use, the compiler doesn't figure it out.  Just NULL-init it.

Fixes: ec53e52742 ("ac/nir: Add ES output to LDS for GFX9.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 20:48:29 +00:00
Eric Anholt
7a83be4b28 gallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax.
My gcc doesn't figure out that dims >= 1 (seems reasonable), and doesn't
notice that ddmax is used from the same no_rho_opt as its initialization.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-12 20:48:18 +00:00
Kenneth Graunke
bd87bd178c anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf.
The kernel used to have execbuf parameters to program the INSTPM bit
for whether 3DSTATE_CONSTANT_* should be relative to dynamic state
base address or an absolute address.  However, they never worked in
the presence of hardware contexts, so I deleted them a while back.

It doesn't make sense to set this flag, as it doesn't exist anymore.
It also never did anything anyway - the flag is zero, so |'ing it in
did nothing.  The default is relative anyway.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-12 07:00:41 -08:00
Eric Engestrom
111d4bf1d0 r200: remove left over dead code
0aaa27f291 removed the references to this array without
removing the array itself

Cc: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 0aaa27f291 "mesa: Pass the translated color logic op dd_function_table::LogicOpcode"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-02-12 11:19:44 +00:00
Samuel Pitoiset
f4e85ba93f ac/nir: remove backlink to nir_to_llvm_context
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:39 +01:00
Samuel Pitoiset
be5f6eb13e ac/nir: remove nir_to_llvm_context::module
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:36 +01:00
Samuel Pitoiset
90a815ddeb ac/nir: remove nir_to_llvm_context::builder
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:34 +01:00
Samuel Pitoiset
759acfa180 ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:31 +01:00
Samuel Pitoiset
e7373a6498 ac/nir: drop nir_to_llvm_context from visit_var_atomic()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:29 +01:00
Samuel Pitoiset
485346b05a ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:27 +01:00
Samuel Pitoiset
cd6dfacda9 ac/nir: drop nir_to_llvm_context from visit_load_push_constant()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:25 +01:00
Samuel Pitoiset
5c9e398c83 ac/nir: drop nir_to_llvm_context from cast_ptr()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:23 +01:00
Samuel Pitoiset
5ef5944848 ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:21 +01:00
Samuel Pitoiset
da8b0b8264 ac/nir: drop nir_to_llvm_context from emit_f2f16()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:19 +01:00
Samuel Pitoiset
e32f374944 ac: remove unused parameters in abi::load_tess_coord()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:17 +01:00
Samuel Pitoiset
1e69db003d ac/nir: remove useless bitcast in load_tess_coord()
nir_intrinsic_load_tess_coord always returns a v3i32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:15 +01:00
Samuel Pitoiset
ed179fbdf3 ac: add load_resource() to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:13 +01:00
Samuel Pitoiset
ecf229706f ac: add load_sample_mask_in() to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:11 +01:00
Samuel Pitoiset
0f48eeea05 ac: move view_index to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:09 +01:00
Samuel Pitoiset
0efbede949 ac: move push_constants to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:07 +01:00
Samuel Pitoiset
460d3ce726 ac: move tg_size to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:04 +01:00
Samuel Pitoiset
054c92190c ac/nir: remove unused nir_to_llvm_context:{defs,phis}
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:02 +01:00
Eric Anholt
0b97eb02b0 egl/gbm: Fix compiler warning about visual matching.
The compiler doesn't know that num_visuals > 0.

Fixes: 37a8d907cc ("egl/gbm: Ensure EGLConfigs match GBM surface format")
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-12 09:16:44 +00:00
Rob Clark
831fb29252 freedreno: small fix for flushing dependent batches
Flush a resource's previous write_batch synchronously.  Because a
resource's associated batches are not updated until after the flush
thread submits rendering to the kernel, this was causing a bit of
confusion in the following loop.  This fixes a bug that appeared with
recent stk.

Perhaps we need to re-work things a bit to clear out dependent patches
in the ctx's thread and use a fence to deal with the period between
when a flush is queued and when it is submitted to the kernel.  But
this will do until time permits a larger refactor.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
c57ed8e01c freedreno/ir3: intra-block scheduling
Because of loops, we can't schedule all of a block's predecessors first.
Instead just assume that the result consumed in a block was written far
enough away in all paths into a block.  And do an intra-block scheduling
pass to figure out if there are any cases where we need to insert extra
nop's.  This works out better than always assuming the worst case (ie.
that a value live into a block was written in the last instruction in
the predecessor block).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
2a2099a875 freedreno/ir3: "boost" the depth of if/else condition
Account for the move to predicate register, to try to avoid needing to
insert extra NOPs later.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ffb00f6841 freedreno/ir3: account for arrays in delayslot calc
Normally false-deps are not something to consider, since they mostly
exist for delay-slot related reasons:

 * barriers
 * ordering writes after read
 * SSBO/image access ordering

The exception is a false-dependency on an array store.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
f54d2b4f10 freedreno/ir3: more clever legalize algorithm
Previously we didn't handle flow control in legalize, and instead just
set (ss)(sy) on the first instruction in every block.  Which isn't very
clever.

Instead, consider output state of all predecessor blocks, so we only
set a sync bit if needed for any possible path leading into a block.
Because of loops, we can't require that all successor blocks are
legalized before a given block, so instead run in a loop until results
converge.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
015afb6a38 freedreno/ir3: track block predecessors
Useful in the following patches.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
76440fcca9 freedreno/ir3: clean up dangling false-dep's
Maybe there is a better way for this..  where it comes useful is "array"
loads, which end up as a false-dep for a later array store.

If all the uses of an array load are CP'd into their consumer, it still
leaves the dangling array load, leading to funny things like:

  mov.u32u32 r5.y, r0.y
  mov.u32u32 r5.y, r0.z

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
aea223741f freedreno/ir3: handle IMMED for mad 2nd src special case
Consider also immediates for swapping the first two srcs, because they
can be lowered to constant.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
242a8a1957 freedreno/ir3: remove ir3 phi instruction
Now that we convert phi webs to ssa, we can drop all this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a7b569d60c freedreno/ir3: remove lower_if_else pass
Now that it is unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
268ab05484 freedreno/ir3: add experimental GCM pass
Generally seems to do worse on instruction count and register usage,
according to shader-db.  But shader-db also doesn't do a very good job
of weighting loop bodies, so that might not be totally valid.

So add an env variable to enable GCM pass for easier experimentation.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
4c15c53d91 freedreno/ir3: change opt passes
There are more useful nir passes added since initial conversion to nir.
But ir3 was never updated to use them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ec8bc54ad2 freedreno/ir3: use peephole select pass
Agressively lowering all if/else to selects in some extreme cases
results in much higher register pressure.  Using peephole select instead
with a modest threshold speeds up alu2 4x!

16 seems like a good limit, low enough to help alu2 but not too low that
it penalizes everything else.  With a bit better scheduling of the
instruction that moves a value into a predicate register, we might be
able to lower this limit a bit more in the future, but since we need 6
cycles from the move to predicate register to predicated branch, that
puts some sort of lower bound on how far we can lower this threshold.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a7ea2b4eba freedreno/ir3: lower phi webs to regs
nir's from_ssa pass is much better at avoiding inserting extra moves
than our logic is.  And lowering phi webs to regs just treats anything
involved in a phi web as an array of length=1.  Which with previous
array related fixes in RA/etc ends up working out quite well.  This cuts
down on extra instructions and also helps with register pressure.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
0a6ddf964f freedreno/ir3: separate arrays from groups
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
55f14a1ac4 freedreno/ir3: make block/instruction serialno per-shader
Makes it easier to compare values seen in-game (where there are many
shaders) to cmdline standalone compiler.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
5a7de94392 freedreno/ir3: add spirv support to cmdline compiler
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
942341bcd0 freedreno/ir3: don't lower fsat
Instead, if possible fold (sat) flag into src, otherwise use:

  (sat)max.f rD, rS, rS

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
b2fc94f074 freedreno/ir3: add encoding/decoding for (sat) bit
Seems to be there since a3xx, but we always lowered fsat.  But we can
shave some instructions, especially in shaders that use lots of
clamp(foo, 0.0, 1.0) by not lowering fsat.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
1b658533e1 freedreno/ir3: extend liverange of arrays
Use livein state of other blocks to extend liverange of arrays when they
are still needed by successor blocks.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ac459a6f7f freedreno/ir3: avoid extra mov's for "arrays"
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
2bc3fb6992 freedreno/ir3: a couple more array fixes
(Plus a couple TODOs)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
8ea1ef4191 freedreno/ir3: keep array stores
Since these are not in SSA form, add to block's keeps so it doesn't
appear unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
c60f150d56 freedreno/ir3: propagate barrier information
When eliminating movs, the instruction that is now directly using the
src of the mov has the same scheduling order constraints as the original
mov instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
98702c1010 freedreno/ir3: remove pointless statement
Function ends after this if/else ladder, so it was pointless.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
930ca0e038 freedreno/ir3: some more debug prints
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a84e324847 freedreno/ir3: fix printing of relative branch offsets
The number of bits depends on generation.  But printing negative values
with a5xx encoding (largest size) but compiling for a3xx or a4xx, would
result in negative values printed as large positive values.

I guess in practice huge negative branch offsets aren't likely (and if
that is the case, the shader is probably too big to grok by reading the
assembly).  So just print using smallest bitfield size.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a5c28fe07b freedreno/ir3: be more clever with if/else jumps
Try to clean up things like:

  br !p0.x #2
  br p0.x #something

to eliminate the first branch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
44dd7dcd2f freedreno/ir3: avoid some spurious sync bits
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
069c0ac625 freedreno/ir3: print # of sync bits for shaderdb
When trying to optimize to reduce stalls, it is nice to see this info.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
7d45e2e39f freedreno: add debug trace for flush
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Grazvydas Ignotas
9b9a89cd79 intel/compiler: fix 64bit value prints on 32bit
Fix the following:
warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but
argument 3 has type ‘uint64_t {aka long long unsigned int}.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-10 17:59:02 +02:00
Timothy Arceri
ff0e3fa1fe st/glsl_to_nir: remove unused options variable 2018-02-10 11:06:55 +11:00
Timothy Arceri
8f378c116e st/radeonsi: enable disk cache for nir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
bc9d9f9b86 st: add nir shader disk cache support
v2: include compute shader support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
97efdc0d57 st/glsl_to_tgsi: move nir detection earlier
We move the nir check before the shader cache call so that we can
call a nir based caching function in a following patch.

Also with this change we simply check if vertex shaders support
NIR rather than looping over the stages as mixing of shader types
is not supported anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
b5e23887fe radeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR
Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead.

This change indirectly enables NIR support for compute shaders
on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
73f1d6f0c1 r600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR
We now use PIPE_SHADER_CAP_SUPPORTED_IRS to check for native support
in clover.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
51f484bb44 clover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR
PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR
for compute shaders, so we let clover pick the one it wants to use.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
3af4f34e61 r600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
ce836487b8 radeonsi/nir: add depth layout to scan pass
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Timothy Arceri
6a8efbe652 radeonsi/nir: add FRAG_RESULT_COLOR to scan pass
Fixes a number of draw buffers piglit tests.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Timothy Arceri
ef8082baf8 ac: convert nir_op_f2f32 src to a float
Fixes the following piglit test:

./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo

Where we would end up with the nir such as:

	vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10
	vec1 32 ssa_12 = f2f32 ssa_2

And our pack_64_2x32_split nir to llvm code always produces
a 64bit integer as output.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Timothy Arceri
1b1e5f8edf ac: fix some 64bit unpack asserts
Previously the asserts did not take swizzles into account.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Mark Janes
9a05c66feb Revert "i965: prevent potentially null pointer access"
This reverts commit 712332ed54, which
caused over 90k failures in Mesa i965 CI.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-09 09:46:07 -08:00
Daniel Stone
37a8d907cc egl/gbm: Ensure EGLConfigs match GBM surface format
When we create an EGL window surface on a GBM surface, ensure that the
EGLConfig is compatible with the GBM format, notwithstanding XRGB/ARGB
interchange.

For example, rendering with an XRGB8888 EGLConfig on to an ARGB8888
gbm_surface (and vice-versa) are acceptable, but rendering with an
XRGB2101010 EGLConfig on to an XRGB8888 gbm_surface will now be
rejected.

This was previously allowed through; when 10bpc formats were enabled,
clients which picked a completely random EGL config and hoped/assumed
they were XRGB8888 would break.

If you have bisected a failure to start a GBM/KMS client to this commit,
please look at its EGLConfig selection (e.g. through eglChooseConfigs),
and add an EGL_NATIVE_VISUAL_ID == gbm_surface format match to the
attribs for config selection.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
8174e5b49e egl/gbm: Remove duplicate format table
Now that we have mask/channel information in gbm_dri's format conversion
table, we can remove the copy in EGL.

As this table contains more formats (notably including R8 and RG8, which
can be used for BO but not surface allocation), we now compare the masks
of all channels when trying to find a suitable config. Without doing
this, an XRGB8888 EGLConfig would match on an R8 format.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
314714ac53 gbm/dri: Expose visuals table through gbm_dri_device
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
2ed344645d gbm/dri: Add RGBA masks to GBM format table
Eventually, we can replace the visuals list inside GBM EGL driver with
this one.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
4732094cff egl/wayland: Use an array for modifiers
Each Wayland EGLDisplay currently contains a struct with one vector of
modifiers per format, hardcoded in the header. To allow easier support
for more formats, turn this into an array of u_vectors which is opaque
outside of platform_wayland.c.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
5bc49d4cbf egl/wayland: Remove has_format enum
Instead of the has_format enum, use an index into the visual array. This
makes adding new formats less typing.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
d32b23f383 egl/wayland: Add bpp to visual map
Both the DRI2 GetBuffersWithFormat interface, and SHM buffer allocation,
had their own format -> bpp lookup tables. Replace these with a lookup
into the visual map.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
4de98a9c07 egl/wayland: Use visual map for DRIImage<->FourCC map
When trying to translate between DRIImage format enums and FourCC codes,
use our visual map rather than an open-coded subset.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
68a80c11bd egl/wayland: Use visual map for format advertisement
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
3323ce72ff egl/wayland: Use visual map for buffer_from_image
When creating a wl_buffer on an upstream Wayland display from an
existing EGLImage, use the dri2_wl_visual map rather than another
hardcoded list of formats.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
a9cc4edb60 egl/wayland: Use visual map for config->format lookup
Having hoisted the format -> config map into common code, we now use it
for config -> format lookups.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
1dc013f1ee egl/wayland: Add format enums to visual map
Extend the visual map from only containing names and bitmasks, to also
carrying the three format enums we need. These are the DRIImage format
tokens for internal allocation, FourCC codes for wl_drm and dmabuf
protocol, and wl_shm codes for swrast drivers.

We will later use these formats to eliminate a bunch of open-coded
conversions.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
66912641df egl/wayland: Use proper enum type in visual definition
No semantic change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
845c2f6156 egl/wayland: Widen channel masks to bpp
Widen the channel masks given in the visual table to the full width of
the pixel format, i.e. as many leading zeros as required.

No functional change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
19cbca38e4 egl/wayland: Hoist format <-> EGLConfig definition up
Pull the mapping between Wayland formats and EGLConfigs up to the top
level, so we can reuse it elsewhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
4fbd2d50b1 egl/wayland: Fix ARGB/XRGB transposition in config map
When 0b2b719121 moved from an if tree to a struct to map between
wl_drm formats and EGLConfigs, it transposed the mapping between XRGB
and ARGB. Luckily, everyone exposes both formats, so this is harmless.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 0b2b719121 ("egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:06 +00:00
Marek Olšák
76085f2048 st/mesa: generate blend state according to the number of enabled color buffers
Non-MRT cases always translate blend state for 1 color buffer only.
MRT cases only check and translate blend state for enabled color buffers.

This also avoids an assertion failure in translate_blend for:
  dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Marek Olšák
c446dd7927 st/mesa: don't translate blend state when color writes are disabled
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Marek Olšák
3d06c8afb5 st/mesa: don't translate blend state when it's disabled for a colorbuffer
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Lionel Landwerlin
712332ed54 i965: prevent potentially null pointer access
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
CID: 1418110
2018-02-09 14:02:59 +00:00
Mark Thompson
5db29d62ce st/va: Make the vendor string more descriptive
Include the Mesa version and detail about the platform.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-02-09 13:37:43 +01:00
Mark Thompson
768f1487b0 st/va: Enable vaExportSurfaceHandle()
It is present from libva 2.1 (VAAPI 1.1.0 or higher).

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-02-09 13:37:36 +01:00
Tapani Pälli
41c5bf3836 disk cache: move path creation back to constructor
This patch moves disk cache path and index creation back to the
constructor which matches previous behavior. We still allow create
to succeed without path so that cache can be used with callback
functionality.

Fixes: c95d3ed091 "disk cache: create cache even if path creation fails"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 11:33:25 +02:00
Samuel Pitoiset
3a2bb4db23 ac/nir: compute correct number of user SGPRs on GFX9
For merged shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 10:16:04 +01:00
Michel Dänzer
171076f082 st/mesa: Initialize tex_target in compile_tgsi_instruction
Initialize to TGSI_TEXTURE_BUFFER (== 0), same as was done before the
variable type was changed to enum tgsi_texture_type.

Fixes a bunch of piglit failures with radeonsi, e.g.:

gles-3.0-transform-feedback-uniform-buffer-object: ../../../../src/gallium/auxiliary/tgsi/tgsi_util.c:502: tgsi_util_get_texture_coord_dim: Assertion `!"unknown texture target"' failed.

Corresponding compiler warning:

  CXX      state_tracker/st_glsl_to_tgsi.lo
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp: In function ‘pipe_error st_translate_program(gl_context*, uint, ureg_program*, glsl_to_tgsi_visitor*, const gl_program*, GLuint, const ubyte*, const ubyte*, const ubyte*, const ubyte*, const ubyte*, GLuint, const ubyte*, const ubyte*, const ubyte*)’:
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5992:23: warning: ‘tex_target’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       ureg_memory_insn(ureg, inst->op, dst, num_dst, src, num_src,
       ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                        inst->buffer_access,
                        ~~~~~~~~~~~~~~~~~~~~
                        tex_target, inst->image_format);
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5866:27: note: ‘tex_target’ was declared here
    enum tgsi_texture_type tex_target;
                           ^~~~~~~~~~

Fixes: 9f9ce1625f ("st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:26:40 +01:00
Alejandro Piñeiro
f32b01ca43 glsl/linker: remove ubo explicit binding handling
This is already handled at link_uniform_blocks, specifically at
process_block_array_leaf.

Additionally, this code was not handling correctly arrays of
arrays. When creating the name of the block to set the binding, it
only took into account the first level, so any attempt to set a
explicit binding on a array of array ubo would trigger an assertion.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 08:32:42 +01:00
Mathias Fröhlich
77cb2fc0bd mesa: Only update enabled VAO gl_vertex_array entries.
Instead of updating all modified gl_vertex_array_object::_VertexArray
entries just update those that are modified and enabled.
Also release buffer object from the _VertexArray that belong
to disabled attributes.

v2: Also set Ptr and Size to zero.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-09 04:26:23 +01:00
Mathias Fröhlich
437cae411e gallium: Mute arrays for several meta like callbacks.
Set the _DrawArray pointer to NULL when calling into the Drivers
Bitmap/CopyPixels/DrawAtlasBitmaps/DrawPixels/DrawTex hooks.
This fixes an assert that gets uncovered when the following
patch gets applied.

v2: Mute from within the state tracker instead of generic mesa.
v3: Avoid evaluating _DrawArrays from within st_validate_state.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 04:26:13 +01:00
Mathias Fröhlich
2f9eb0aad5 mesa: Fix VAO buffer object tracking.
When changing the attribute binding in the VAO we also need to
account for getting rid of non vbo bits from VertexAttribBufferMask.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-09 04:21:36 +01:00
Timothy Arceri
d8bca3809d radeonsi/nir: gather some missing fs info
Fixes some early-z arb_shader_image_load_store piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 12:51:27 +11:00
Timothy Arceri
c77078c942 ac: pass struct ac_llvm_context to emit_membar()
Fixes segfault in piglit test:

./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 12:51:27 +11:00
Marek Olšák
12fd567c78 radeonsi: copy the NIR enablement debug bit to the shader cache flags
When NIR is enabled, TGSI must not be used. When NIR is disabled, TGSI

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 02:01:45 +01:00
Jason Ekstrand
8f20cf166e intel/blorp: Use isl_aux_op instead of blorp_hiz_op
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1e941a0528 intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1810f965c8 anv: Allow fast-clearing the first slice of a multi-slice image
Now that we're tracking aux properly per-slice, we can enable this for
applications which actually care.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
de3be61801 anv/cmd_buffer: Rework aux tracking
This commit completely reworks aux tracking.  This includes a number of
somewhat distinct changes:

 1) Since we are no longer fast-clearing multiple slices, we only need
    to track one fast clear color and one fast clear type.

 2) We store two bits for fast clear instead of one to let us
    distinguish between zero and non-zero fast clear colors.  This is
    needed so that we can do full resolves when transitioning to
    PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear
    values in all sorts of places we wouldn't normally.

 3) We now track compression state as a boolean separate from fast clear
    type and this is tracked on a per-slice granularity.

The previous scheme had some issues when it came to individual slices of
a multi-LOD images.  In particular, we only tracked "needs resolve"
per-LOD but you could do a vkCmdPipelineBarrier that would only resolve
a portion of the image and would set "needs resolve" to false anyway.
Also, any transition from an undefined layout would reset the clear
color for the entire LOD regardless of whether or not there was some
clear color on some other slice.

As far as full/partial resolves go, he assumptions of the previous
scheme held because the one case where we do need a full resolve when
CCS_E is enabled is for window-system images.  Since we only ever
allowed X-tiled window-system images, CCS was entirely disabled on gen9+
and we never got CCS_E.  With the advent of Y-tiled window-system
buffers, we now need to properly support doing a full resolve of images
marked CCS_E.

v2 (Jason Ekstrand):
 - Fix an bug in the compressed flag offset calculation
 - Treat 3D images as multi-slice for the purposes of resolve tracking

v3 (Jason Ekstrand):
 - Set the compressed flag whenever we fast-clear
 - Simplify the resolve predicate computation logic

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
2cbfcb205e anv/cmd_buffer: Move the mi_alu helper higher up
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
2e69045c4d anv/image: Simplify some verbose commennts
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
f0523f70ef anv: Use blorp_ccs_ambiguate instead of fast-clears
Even though the blorp pass looks a bit on the sketchy side, the end
result in the Vulkan driver is very nice.  Instead of having this weird
case where you do a fast clear and then maybe have to resolve, we just
do the ambiguate and are done with it.  The ambiguate does exactly what
we want of setting all the CCS values to 0 which puts it into the
pass-through state.

This should also improve performance a bit in certain cases.  For
instance, if we did a transition from UNDEFINED to GENERAL for a surface
that doesn't have CCS enabled all the time, we would end up doing a
fast-clear and then a full resolve which ends up touching every byte in
the main surface as well as the CCS.  With the ambiguate pass, that
transition only touches the CCS.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
84fd2ebfbc anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
3ef8c4b2f5 anv/cmd_buffer: Pull the undefined layout condition into the if
Now that this isn't a multi-case if and it's just the one case, it's a
bit clearer if the condition is just part of the if instead of being
pulled out into a boolean variable.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
857b5b5a7f intel/blorp: Add a CCS ambiguation pass
This pass performs an "ambiguate" operation on a CCS-compressed surface
by manually writing zeros into the CCS.  On gen8+, ISL gives us a fairly
detailed notion of how the CCS is laid out so this is fairly simple to
do.  On gen7, the CCS tiling is quite crazy but that isn't an issue
because we can only do CCS on single-slice images so we can just blast
over the entire CCS buffer if we want to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
13b621d6fd anv: Only fast clear single-slice images
The current strategy we use for managing resolves has an issues where we
track clear colors and the need for resolves per-LOD but we still allow
resolves of only a subset of the slices in any given LOD and doing so
sets the "needs resolve" flag for that LOD to false while leaving the
remaining layers unresolved.  This patch is only the first step and does
not, by itself fix anything.  However, it's fairly self-contained and
splitting it out means any performance regressions should bisect to this
nice obvious commit rather than to the giant "rework aux tracking"
commit.

Nanley and I did some testing and none of the applications we tested
even tried to fast-clear anything other than the first slice of an
image.  The test was done by adding a printf right before we call
blorp_fast_clear if we were every going to touch any slice other than
the first with a fast-clear.  Due to the way the original code was
structured, this would not have included applications which only cleared
a subset of layers.  The applications tested were:

 * All Sascha Willems demos
 * Aztec Ruins
 * Dota 2
 * The Talos Principle
 * Mad Max
 * Warhammer 40,000: Dawn of War III
 * Serious Sam Fusion 2017: BFE

While not the full list of shipping applications, it's a pretty good
spread and covers most of the engines we've seen running on our driver.
If this is ever shown to be a performance problem in the future, we can
reconsider our strategy.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
571ed588ac anv/cmd_buffer: Add a mark_image_written helper
Currently, this helper does nothing but we call it every place where an
image is written through the render pipeline.  This will allow us to
properly mark the aux state so that we can handle resolves correctly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
9876d6f0ef anv/blorp: Add src/dst_level helper variables in CmdCopyImage
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
c180c2c868 anv/cmd_buffer: Add an anv_genX_call macro
This is copied and pasted from the similar macro we added to ISL.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
ab7543b13d anv/cmd_buffer: Generalize transition_color_buffer
This moves it to being based on layout_to_aux_usage instead of being
hard-coded based on bits of a priori knowledge of how transitions
interact with layouts.  This conceptually simplifies things because
we're now using layout_to_aux_usage and layout_supports_fast_clear to
make resolve decisions so changes to those functions will do what one
expects.

There is a potential bug with window system integration on gen9+ where
we wouldn't do a resolve when transitioning to the PRESENT_SRC layout
because we just assume that everything that handles CCS_E can handle it
all the time.  When handing a CCS_E image off to the window system, we
may need to do a full resolve if the window system does not support the
CCS_E modifier.  The only reason why this hasn't been a problem yet is
because we don't support modifiers in Vulkan WSI and so we always get X
tiling which implies no CCS on gen9+.  This patch doesn't actually fix
that bug yet but it takes us the first step in that direction by making
us actually pick the correct resolve op.  In order to handle all of the
cases, we need more detailed aux tracking.

v2 (Jason Ekstrand):
 - Make a few more things const
 - Use the anv_fast_clear_support enum

v3 (Jason Ekstrand):
 - Move an assert and add a better comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
151771b390 anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
bea7373c92 anv/image: Support color aspects in layout_to_aux_usage
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
b09464db42 anv/image: Add a helper for determining when fast clears are supported
v2 (Jason Ekstrand):
 - Return an enum instead of a boolean

v3 (Jason Ekstrand):
 - Return ANV_FAST_CLEAR_NONE instead of false (Topi)
 - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE
 - Add documentation for the enum values

v4 (Jason Ekstrand):
 - Remove a dead comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1f7eee6bc1 anv/image: Update a comment
This got lost in all of the aspect vs. plane rebasing of YCBCR.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
5c38ab8f07 anv/blorp: Rework HiZ ops to look like MCS and CCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1d473e26f2 anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image
If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old
behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what
we want for blits/copies.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
42f1668a54 anv/blorp: Rework image clear/resolve helpers
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op whatever that may be on CCS or MCS.  This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
482c24783e intel/isl: Codify AUX operations in an enum
Right now, we have different entrypoints and enums in blorp for these
different operations.  This provides us a central enum which we can
begin to transition to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Gert Wollny
c36172e387 r600/sb: Check whether optimizations would result in reladdr conflict
v2: * Check whether the node src and dst registers are NULL before using
      them.
    * fix a type in the commit message.

Two cases are handled with this patch:

1. If copy propagation tries to eliminated a move from a relative
   array access then it could optimize

     MOV R1, ARRAY[RELADDR_1]
     MOV R2, ARRAY[RELADDR_2]
     OP2 R3, R1 R2

   into

     OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2]

   which is forbidden, because there is only one address register available.

2. When MULADD(x,a,MUL(x,c)) is handled

      MUL TMP, R1, ARRAY[RELADDR_1]
      MULLADD R3, R1, ARRAY[RELADDR_2], TMP

   by folding this into

      ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1]
      MUL R3, R1, TMP

   which is also forbidden.

Test for these cases and reject the optimization if a forbidden combination
of relative access would be created.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 10:00:38 +10:00
Glenn Kennard
1d871aa626 r600g: Implement spilling of temp arrays (v2)
Pessimistically spills arrays if GPR limit is exceeded.

v2: fix r600 support [airlied]

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:26 +10:00
Dave Airlie
22fc5eff80 r600/sb: handle scratch mem reads on r600
On r600 we use the scratch mem with read/read_ind, in that case
sb should track the rw_gpr as a dst instead of a src.

This stops the whole shader being optimised out.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:21 +10:00
Glenn Kennard
cd34deb585 r600g/sb: Add dependency tracking for scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:19 +10:00
Glenn Kennard
a100d906b2 r600g/sb: Support scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:16 +10:00
Glenn Kennard
6b4303f358 r600g: Implement scratch buffer state management (v2)
v2: add Glenn's fixes

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:12 +10:00
Glenn Kennard
9d31596d7a r600g: Add pending output function
Spills have to happen after the VLIW bundle currently
processed, so defer emitting the spill op.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:08 +10:00
Glenn Kennard
9c48a139b0 r600g: Support emitting scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:52:48 +10:00
Dave Airlie
2a891ed190 r600: fix texture gather swizzling.
This fixes:
KHR-GL45.texture_gather.swizzle
on cayman and redwood.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:32:20 +10:00
Timothy Arceri
12a2350e6d ac: add 64bit support to ac_find_lsb()
v2: use LLVMBuildTrunc()

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Timothy Arceri
a9f6b392c7 ac: move get_elem_bits() to ac_llvm_build.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Timothy Arceri
19f9839f0b ac: add 64bit bitCount support
v2: use LLVMBuildTrunc()

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Samuel Pitoiset
bb750d265c ac/nir: clean up handle_fs_outputs_post()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:33 +01:00
Samuel Pitoiset
528bc14fa5 ac/nir: add radv_load_output() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:30 +01:00
Samuel Pitoiset
834d9845ca ac/shader: scan info about output PS declarations
NIR->LLVM should only be a translation pass, and all scan stuff
should be done before.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:27 +01:00
Samuel Pitoiset
a8e04e91de ac/nir: add radv_export_param() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:26 +01:00
Samuel Pitoiset
e3cfd6b805 ac/nir: remove set but unused export_mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:24 +01:00
Samuel Pitoiset
724136d590 ac/nir: remove dead code in handle_vs_outputs_post()
The memcpy can't be reached because the condition is always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:22 +01:00
Samuel Pitoiset
c63d8d0284 ac/nir: remove useless check in si_llvm_init_export_args()
values can't be NULL because we use ac_build_export_null() now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:20 +01:00
Samuel Pitoiset
26ab5a4269 ac/nir: use ac_build_export_null()
The number of enabled channels should be 0 when exporting null.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:11:44 +01:00
Samuel Pitoiset
bd9f7b7635 ac: add ac_build_export_null() helper
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 22:11:42 +01:00
Scott D Phillips
1f4d2433e7 meson: Add build option for tools
Add a build option to control building some of the misc tools we
have. Also set the executables to install, presumably you want
that if you're asking for the build.

v2: set 'install:' to the with_tools value, not true (Jordan)
    handle 'all' in a the comma list (Dylan)
    Add freedreno's tools (Dylan)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-08 11:24:42 -08:00
Anuj Phogat
464d057c86 intel: Add Coffee Lake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-02-08 10:26:34 -08:00
Brian Paul
11e92889aa gallium/util: silence clang warning in blitter code
Silence "warning: comparison of constant 4294967295 with expression
of type 'ubyte'".

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-08 10:27:31 -07:00
Brian Paul
4b0a45da25 tgsi: s/unsigned/enum tgsi_semantic/ in ureg_DECL_output()
So the function matches the prototype.  Found with clang.
v2: fix copy&paste error

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-08 10:27:19 -07:00
Brian Paul
d95c2d86cc tgsi: use TGSI_INTERPOLATE_x arguments instead of zeros in ureg code
TGSI_INTERPOLATE_CONSTANT and TGSI_INTERPOLATE_LOC_CENTER have the
value zero so there's no change in behavior.  It seems funny to
declare these fs input registers with constant interpolation.  But
it looks like ureg_DECL_input_layout() is not called anywhere and
ureg_DECL_input() is only called from
util_make_geometry_passthrough_shader().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
26948ba761 gallium/util: s/uint/enum tgsi_semantic/ in simple shader code
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
0f40f4ffda tgsi: s/unsigned/enum pipe_shader_type/ in ureg code
And add a default switch case to silence a compiler warning.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
c0dc337ecd gallium/util: s/uint/enum tgsi_semantic/ in u_blitter.c
And put static qualifier on const arrays.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
e55de6e20c st/mesa: s/unsigned/enum tgsi_semantic/ st_cb_drawpixels.c
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
b9ff185e41 vbo: add a comment on vbo_draw_transform_feedback()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
93b3d38176 gallium/util: trivial whitespace/formatting fixes in u_blit.c
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
5396f8546a vbo: improve comments on vbo_draw_func()
And rename a parameter name.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
b03ade55b9 cso: add a couple sanity check assertions in cso_draw_vbo()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
5cf342704d st/mesa: rename some vars related to indirect draw count
'indirect_params' was a bit vague.  Use the names that we use in
gallium's pipe_draw_indirect_info.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Marek Olšák
d9e6e0bbe3 st/mesa: remove out_num_textures from update_textures
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-08 16:14:11 +01:00
Marek Olšák
08496c5d52 st/mesa: don't store non-fragment sampler states and views in st_context
those are unused.

st_context: 10120 -> 3704 bytes

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-08 16:14:11 +01:00
Lionel Landwerlin
e843667733 i965: perf: cleanup detection of kernel support for loadable configs
The initial revision of the patch adding loadable configs was testing
the feature's availability by adding a new config successfully and
then removing it.

A second version tested the availability just by exercising the
removal. But some unused code remained.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-08 10:52:14 +00:00
Lionel Landwerlin
bd6c0cab60 i965: perf: use drmIoctl() instead of ioctl()
ioctl() might be interrupted, use drmIoctl() instead as it'll retry
automatically.

Fixes: 27ee83eaf7 "i965: perf: add support for userspace configurations"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-02-08 10:51:40 +00:00
Lionel Landwerlin
0f952b778f i965: perf: add debug messages for loaded configs
This helps figuring out potential problems when metrics don't show up
on frameretrace for example.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-08 10:51:01 +00:00
Dave Airlie
3f7a7bd897 r600: implement tg4 integer workaround. (v2)
This ports the texture gather integer workaround from radeonsi.

This fixes:
KHR-GL45.texture_gather.plain-gather-uint/int*

v2: add rect support, fix 2d array shadow
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (on irc)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 16:21:40 +10:00
Glenn Kennard
77b1b33724 r600: clean up initial shader register setup
This is taken from Glenn Kennards scratch series, but separated
out as a cleanup by me.

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 16:21:35 +10:00
Roland Scheidegger
b936f4d1ca r600: partly fix sampleMaskIn value
The hw gives us coverage for pixel, not for individual fragment shader
invocations, in case execution isn't per pixel (eg, unlike cm, actually
cannot do "real" minSampleShading, it's either per-pixel or per-fragment,
but it doesn't really make a difference here).
Also, with msaa disabled, the hw still gives us a mask corresponding to
the number of samples, where GL requires this to be 1.
Fix this up by masking the sampleMaskIn bits with the bit corresponding to
the sampleID, if we know this shader is always executed at per-sample
granularity. (In case of a per-sample frequency shader and msaa disabled,
the sampleID will always be 0, so this works just fine there.)
Fixing this for the minSampleShading case will need a shader key (radeonsi
uses the prolog part for) (for eg, could get away with a single bit, cm
would need more bits depending on sample/invocation ratio, or read the
bits from a uniform), unless we'd want to always use a sample mask uniform
(which is probably not a good idea, as it would make the ordinary common
msaa case slower for no good reason).
This fixes some parts of piglit arb_sample_shading-samplemask (with fixed
test), in particular those which use a sampleID, still failing others
as expected.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
07d724326a r600: clean up fragment shader input scan code
For some reason, we were iterating through the code twice (first just for
instructions needing barycentrics, then for instructions and input dcls).
Move things around slightly so this is no longer necessary.
There also was a unnedeed enabling of the fixed_pt_position_gpr - this is only
needed if the per-sample interpolation comes from an input, not from an
instruction (just move the assert where it belongs) (since the sample id to
sample from comes from a tgsi src in this case, and isn't sampleID).
Otherwise there should be no functional change.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
6fd3c39590 mesa: (trivial) remove unused ignore_sample_qualifier_parameter
This parameter for _mesa_get_min_incations_per_fragment() was once used
by the intel driver, but it's long gone.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@vmware.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
becc7faae2 r600/cm: (trivial) code cleanup for emitting msaa state
No functional change (compile tested only).

Reviewed-by: Dave Airlie <airlied@redhate.com>
2018-02-08 04:07:52 +01:00
Brian Paul
b99cb13002 tgsi: use tgsi_semantic enum type in ureg code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:43:01 -07:00
Brian Paul
174f3a4ab7 st/mesa: use tgsi_semantic enum type
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:43:01 -07:00
Brian Paul
0f7be4fc16 tgsi: use TGSI enum types in ureg code
v2: fix enum tgsi_interpolate_mode/loc typo.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:42:39 -07:00
Brian Paul
9f9ce1625f st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Brian Paul
6321b1bd40 gallium/util: replace uint with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Brian Paul
15874338ff gallium/util: replace unsigned with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Fredrik Höglund
5a38d8f103 radv: implement VK_EXT_external_memory_host
Ported from the radeonsi GL_AMD_pinned_memory implementation.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 00:46:07 +01:00
Dave Airlie
5dd385f378 r600: fix rendering regression on r6/7 gpus
Fixes: 2d5b5d267e (r600: work out target mask at framebuffer bind.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104989

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 09:37:09 +10:00
Grazvydas Ignotas
f91aa68ac6 radeonsi: avoid int-to-pointer-cast warnings on 32bit
I hope the actual dropping of MSB is ok, but that's what's already
happened before this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 01:13:58 +02:00
Grazvydas Ignotas
13ada91740 gallium/hud: update some query functions
It seems these were missed when struct pipe_context * argument was
added to hud_graph::query_new_value.

Fixes: 3132afdf4c "gallium/hud: pass pipe_context explicitly to most functions"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 01:12:07 +02:00
Roland Scheidegger
09f49b9e50 Revert "gallium: build ddebug, noop, rbug, trace as part of auxiliary"
This reverts commit 6f82b8d8d0.

This broke scons build, and reportedly clover with autotools/meson too.
2018-02-07 23:47:39 +01:00
Marek Olšák
6f82b8d8d0 gallium: build ddebug, noop, rbug, trace as part of auxiliary
Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU.
(gallium build time is reduced by 15% when building only radeonsi)

Non-recursive makefiles are great!
2018-02-07 22:08:34 +01:00
Roland Scheidegger
def09f8db0 u_blit: (trivial) fix bogus argument order for set_fragment_shader
Amazingly this still worked sometimes, albeit I'm not even sure why...
This fixes d7bec6f7a6.
2018-02-07 22:03:18 +01:00
Andres Rodriguez
83990dd529 mesa: fix incorrect type when allocating arrays
The array members are have type 'struct gl_buffer_object *'

Found by coverity.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-07 14:50:21 -05:00
Roland Scheidegger
d7bec6f7a6 u_blit,u_simple_shaders: add shader to convert from xrbias format
We need this to handle some oddball dx10 format
(DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM). What you can do with this
format is very limited, hence we don't want to add it as a gallium
format (we could not express the properties of this format as
ordinary format properties neither, so like all special formats
it would need specific code for handling it in any case).
While here, also nuke the array for different shaders for different
writemasks, as it was not actually used (always full masks are
passed in for generating shaders).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-07 17:09:37 +01:00
Roland Scheidegger
afd1e9be17 u_simple_shaders: fix mask handling in util_make_fragment_tex_shader_writemask
The writemask handling was busted, since writing defaults to output
meant they got overwritten by the tex sampling anyway. Albeit the
affected components were undefined, so maybe with some luck it
still would have worked with some drivers - if not could as well
kill it... (This would have affected u_blitter but not u_blit since
the latter always used xyzw mask.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-07 17:08:24 +01:00
Bas Nieuwenhuizen
5d754872b5 autotools: Only build libmesa-st-tests-common.a for tests.
We don't need the library if we don't build tests, and building
it adds a dependency on gtest which adds a dependency on cxxabi.h.

Fixes: 6569b33b6e "mesa/st/tests: unify MockCodeLine* classes"
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-02-07 14:04:04 +01:00
Tapani Pälli
9d322fde97 i965: add __DRI2_BLOB support and set cache functions
v2: adjust to change that moved cache from ctx to screen

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
ae00ef2702 disk cache: add callback functionality
v2: add disk_cache_has_key, disk_cache_put_key support
    using blob cache (Nicolai, Jordan)

v3: rename set_cb as put_cb to match existing naming (Timothy)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
6a651b6b77 disk cache: initialize cache path and index only when used
This patch makes disk_cache initialize path and index lazily so
that we can utilize disk_cache without a path using callback
functionality introduced by next patch.

v2: unmap mmap and destroy queue only if index_mmap exists

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
e8495646af glsl/tests: changes to test_disk_cache_create test
Next patch will allow disk_cache instance to be created without
path set for it, modify some test cases that assume disk_cache
creation to fail with invalid path. Creation should succeed but
simple put/get test fail.

v2: leave tests as is but check that both cache struct exists
    and try simple put/get that should fail with invalid path set
    (Emil)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
83c81b6cce glsl/tests: move utility functions in cache_test
Patch moves functions higher so that we can utilize them from
test_disk_cache_create which is modified by next patch.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
6f5b57093b egl: add support for EGL_ANDROID_blob_cache
v2: cleanup, move callbacks to _egl_display struct (Emil Velikov)
    adapt to earlier ctx->screen changes

v3: remove useless checking, add _eglSetFuncName (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
cf4569da6b dri: add interface for EGL_ANDROID_blob_cache extension
v2: move from __DRIcontext to __DRIscreen (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Samuel Pitoiset
757d36ee70 ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Ported from RadeonSI.

Only one F1 2017 shader is affected, code size decreased
from 532 to 488 on both Polaris10 and Vega10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:13 +01:00
Samuel Pitoiset
2f54d7382d ac/nir: avoid loading unused VS input components
Polaris10:
Totals from affected shaders:
SGPRS: 122840 -> 120984 (-1.51 %)
VGPRS: 78812 -> 78440 (-0.47 %)
Spilled SGPRs: 177 -> 129 (-27.12 %)
Code Size: 2950028 -> 2941276 (-0.30 %) bytes
Max Waves: 17899 -> 17976 (0.43 %)

Vega10:
Totals from affected shaders:
SGPRS: 117144 -> 115776 (-1.17 %)
VGPRS: 77580 -> 77532 (-0.06 %)
Spilled SGPRs: 0 -> 152 (0.00 %)
Code Size: 3352656 -> 3347860 (-0.14 %) bytes
Max Waves: 19756 -> 19866 (0.56 %)

This increases SGPRs spilling a bit with Talos, but I have
some other ideas that might reduce it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:09 +01:00
Samuel Pitoiset
1c57a6da5e ac/shader: scan vertex inputs usage mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:07 +01:00
Iago Toral Quiroga
f474b19875 i965: allocate a SGVS element when VertexID or InstanceID are read
Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS
to put these beyond the last vertex element it seems that we still
need to allocate the SVGS element, otherwise we have observed cases
where we end up reading garbage. Specifically, the CTS test mentioned
below was flaky with a fail rate of ~1% on some gen9+ platforms caused
by reading garbage for the gl_InstanceID value. The flakyness goes
away as soon as we start allocating the SVGS element.

v2:
  - Do this for gen8+, not just gen9+, and pull the boolean
    outside the #if block (Jason)

Fixes flaky test:
KHR-GL45.vertex_attrib_64bit.limits_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-07 11:11:16 +01:00
Dylan Baker
c74719cf4a glapi: fix check_table test for non-shared glapi with meson
v2: - Add glapitable_h generated source to requirements

Fixes: 3218056e0e ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-02-06 15:00:17 -08:00
Dylan Baker
002fbde71e glapi: Don't search through subdirs from glapitable.h
Because meson won't put it in that folder.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
aac3d01178 state_tracker: Don't build st-renumerate-test without shared glapi
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
0316aa432d glapi: remove APPLE extensions from test
Fixes: 7009955281 ("mesa: Remove GL_APPLE_vertex_array_object stubs")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
a4f1fc5dd1 glapi/check_table: Remove 'extern "C"' block
Using 'extern "C"' around includes is always incorrect, as the header may
contain C++ symbols (as it does in this case), which means it cannot use
C linkage. In this case the header has a template in it, which obviously
cannot be linked with C linkage rules.

Fixes: a29ad2b421 ("mesa/tests: Add tests for the generated dispatch table")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
105178db8f meson: fix test source name for static glapi
fixes: 43a6e84927 ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
9be7487f30 glapi: don't walk backwards for includes
Instead just set the proper -I flags and include it from a more standard
path. In this case we'll add -Isrc/mesa (which is common), and #include
main/foo.h.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Brian Paul
e7a4536e64 mesa: rename gl_vertex_array_object::_VertexAttrib -> _VertexArray
Since the type is gl_vertex_array.  Update comment to explain that
these arrays are only used by the VBO module.

Also rename some local variables in _mesa_update_vao_derived_arrays().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:36:47 -07:00
Brian Paul
d9ab39ea65 mesa: minor whitespace fixes, line wrapping in texcompress.c
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:23:26 -07:00
Brian Paul
b38196b452 mesa: simplify _mesa_get_compressed_formats()
Instead of testing for formats==NULL everywhere, just point formats at
a dummy array which will be discarded.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:23:26 -07:00
Vlad Golovkin
d919ff0f27 util: remove redundant check for the __clang__ macro
Clang defines __GNUC__ macro, so one doesn't need to check __clang__
macro in this particular case.

v2: added comment as per Brian Paul's suggestion

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 15:23:26 -07:00
Brian Paul
77bc74e674 st/mesa: use st_access_flags_to_transfer_flags() helper in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Brian Paul
1852a2e1a2 st/mesa: refactor st_bufferobj_map_range()
Use a new helper function, st_access_flags_to_transfer_flags(), to
convert the GL_MAP_x flags to PIPE_TRANSFER_x flags.

We'll be able to use this function in a couple other places.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Brian Paul
8a32dd2ec9 st/mesa: refactor bufferobj_data()
Split out some of the code into three new helper functions:
buffer_target_to_bind_flags(), storage_flags_to_buffer_flags(),
buffer_usage() to make the code more managable.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Samuel Pitoiset
3488a3f033 radv: run nir_opt_shrink_load
LLVM can't shrink loads.

Polaris10:
Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Code Size: 1355504 -> 1355172 (-0.02 %) bytes
Max Waves: 11710 -> 11670 (-0.34 %)

Vega10:
Totals from affected shaders:
SGPRS: 51448 -> 50371 (-2.09 %)
VGPRS: 39140 -> 39048 (-0.24 %)
Spilled SGPRs: 16 -> 16 (0.00 %)
Code Size: 1307188 -> 1304296 (-0.22 %) bytes
Max Waves: 11312 -> 11292 (-0.18 %)

This reduces SGPRs spilling in MadMax, and it also reduces
number of SGPRs in DOW3 and F12017. The number of waves slightly
decreases in F1 but I don't see any performance changes after
benchmarking it. Talos and Serious Sam are not affected because
they don't use any push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:44 +01:00
Samuel Pitoiset
e68562b94b nir: add nir_opt_shrink_load pass
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:39 +01:00
Timothy Arceri
e2ea9e1191 radeonsi/nir: add nir support for compiling compute shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
9c52902c76 ac/radeonsi: add num_work_groups to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
f12e2f9c12 ac: implement nir_intrinsic_shader_clock
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
b7b89bbddb ac/radeonsi: create ac_build_shader_clock() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
d116af383f ac/radeonsi: add load_local_group_size() to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
f6932d1ef3 radeonsi: add get_block_size() helper
This will be reused by the nir backend in a later patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
e3ebffdbb0 ac: don't call emit_outputs() for compute
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
c8066cdfa7 ac/radeonsi: add local_invocation_ids to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
fa5239c153 ac/radeonsi: add workgroup_ids to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
64c10c9737 radeonsi/nir: gather some compute info in si_nir_scan_shader()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
1142b1d3e1 radeonsi/nir: always set input_usage_mask as using all components
This fixes a regression for now, in the future we should gather
the used components properly.

V2: just set for VS and correctly handle doubles

Fixes: be973ed21f "radeonsi: load the right number of components for VS inputs and TBOs"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:38:52 +11:00
Timothy Arceri
ffeebcfa7e i965: remove unused brw_nir_lower_cs_shared()
This has been unused since 8761a04d0d.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-02-07 08:38:01 +11:00
Bas Nieuwenhuizen
a3e42e7a69 vulkan/wsi: Fix OOM behavior with prime images.
Fixes: d50937f137 "vulkan/wsi: Implement prime in a completely generic way"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-06 21:52:39 +01:00
Bas Nieuwenhuizen
c7d640fbbf ac/nir: fix GS load input type.
Fixes: df1d5174fc "ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-06 21:52:38 +01:00
Mathias Fröhlich
e8a9473d32 mesa: Factor out _mesa_disable_vertex_array_attrib.
And use it in the enable code path.
Move _mesa_update_attribute_map_mode into its only remaining file.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
Mathias Fröhlich
236657842b vbo: Move vbo_rebase into its only caller module tnl.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
Mathias Fröhlich
2313c33e95 mesa: Use atomics for buffer objects reference counts.
The mutex is currently used for reference counting and updating
the minmax index cache.
The change uses atomics directly for reference counting and
the mutex for the minmax cache.
This is safe since the reference count is not modified beside
in _mesa_reference_buffer_object where atomics aim to be used.
While using the minmax cache, the calling code holds a reference
to the buffer object. Thus unreferencing or even referencing the
buffer object does not need to be serialized with accessing
the minmax cache.
The change reduces the time _mesa_reference_buffer_object_ takes
by about a factor of two when looking at perf results for some
of my favorite use cases.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
Dave Airlie
6c691081a1 r600: fixup sparse color exports.
If we have gaps in the shader mask we have to have 0x1 in them
according to a comment in radeonsi, and this is required to fix
the test at least on cayman.

We also need to record the highest one written to write to the
ps exports reg.

This fixes:
KHR-GL45.enhanced_layouts.fragment_data_location_api

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:59 +10:00
Dave Airlie
2d5b5d267e r600: work out target mask at framebuffer bind.
If we only get 1,2,3,6 framebuffers we want a sparse target mask.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:55 +10:00
Dave Airlie
5b14e06d8b r600: work out shader export mask at shader build time (v1.1)
Since enhanced layouts allows setting specific MRT outputs, we
can get sparse outputs, so we have to calculate the shader
mask earlier.

v1.1: update checks for state update (Roland)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:27 +10:00
Dave Airlie
f292eceae1 r600: fix xfb stream check.
This fixes:
KHR-GL45.enhanced_layouts.xfb_vertex_streams

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
680cb9898a r600/compute: add render cond support.
Set render cond and emit atom.

Fixes:
KHR-GL45.compute_shader.conditional-dispatching

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
5fd7b282b3 r600: fix not-very indirect compute
We need to get the grid sizes earlier to fill in to the const
buffer.

Fixes:
KHR-GL45.compute_shader.built-in-variables
and
KHR-GL45.compute_shader.dispatch-indirect

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
00a112641b r600: overhaul buffer resource query.
This cleans up and fixes the previous fix even more.

Buffers from textures start at max const,
buffers from buffers/images come in from the 168 offset.

This fixes a bunch of:
KHR-GL45.shader_storage_buffer_object*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
736b150768 r600/eg: fix buffer sizing.
For buffers we want the size in bytes,
For images we want it in elements.

This fixes:
KHR-GL45.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
c9c4f0b722 r600/images: set offset for compute shaders with number of declared samplers
for frag shaders we get a value in the key, I expect I need
to make compute work better

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
ab5cee4c24 r600/compute: only mark buffer/image state dirty for fragment shaders
The compute emission path always emits this currently, and emitting
it on the fragment path breaks the blitter.

This fixes gpu hangs in KHR-GL45.compute_shader.resource-texture

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
4e3b43f180 r600/atomic: fix ATOMCAS instruction.
This has 4 srcs.

This fixes:
KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
8bdad9fa1f r600/sb/cayman: fix indirect ubo access on cayman
With sb enabled on cayman, this was overwriting the proper
cf index value with random ones if the dst gpr was 2 or 3,
only save the value for a MOVA instruction.

Fixes:
KHR-GL45.gpu_shader5.uniform_blocks_array_indexing
(on cayman with sb)

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
012100b809 r600/eg: use texture target to pick array size not view target (v2)
This fixes a few CTS cases in :
KHR-GL45.texture_view.view_sampling

some multisample cases are still broken, but not sure this is
the same problem.

v2: fix more cases

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
e7e81f362d radv: don't support tc-compat on multisample d32s8 at all.
RX550 fails
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2

So increase the range of the workaround.

Fixes: f4c534ef6 (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1))

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-06 19:56:00 +00:00
Michal Navratil
4081e08896 winsys/amdgpu: allow non page-aligned size bo creation from pointer
Fix INVALID_OPERATION caused by BufferData with target
EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD when the buffer size is
not page aligned.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-06 18:51:12 +01:00
Jon Turney
9440599c8e meson: ensure xmlpool/options.h is generated for libgallium
In file included from ../src/gallium/targets/dri/target.c:1:
In file included from ../src/gallium/auxiliary/target-helpers/drm_helper.h:8:
../src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found

See also 26bde1e3.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:56:12 +00:00
Andres Gomez
1ec88755c2 vbo: provide 64bits support to print_draw_arrays
Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:29 +02:00
Andres Gomez
0057ae4038 vbo: take into account the size when printing VAO elements
When using print_draw_arrays for debugging, we were printing an "n"
amount of vertex but that meant not to print all the size in the "n"
vertex, depending on the stride used.

Now we print the whole size in the "n" vertex.

Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:23 +02:00
Andres Gomez
c9325b4fa9 vbo: print first element of the VAO when the binding stride is 0
Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:12 +02:00
Iago Toral Quiroga
a5053ba27e anv/device: initialize the list of enabled extensions properly
The loop goes through the list of enabled extensions marking them as
enabled in the list, but this relies on every other extension being
initialized to false by default.

This bug would make us, for example, advertise certain device extension
entry points as available even when the corresponding extensions had
not been enabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: abc62282b5 "anv: Add a per-device table of enabled extensions"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-02-06 07:51:00 +01:00
Iago Toral Quiroga
ef439a4fdc spirv: split constant initializers on in/out structs
The SPIR-V parser splits in/out struct variables and creates
a separate variable for each first-level member of the struct.
When the struct variable has an initializer this means that we also
need to split the initializer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-06 07:50:18 +01:00
Iago Toral Quiroga
1d20001d97 i965/nir: do int64 lowering before optimization
Otherwise loop unrolling will fail to see the actual cost of
the unrolling operations when the loop body contains 64-bit integer
instructions, and very specially when the divmod64 lowering applies,
since its lowering is quite expensive.

Without this change, some in-development CTS tests for int64
get stuck forever trying to register allocate a shader with
over 50K SSA values. The large number of SSA values is the result
of NIR first unrolling multiple seemingly simple loops that involve
int64 instructions, only to then lower these instructions to produce
a massive pile of code (due to the divmod64 lowering in the unrolled
instructions).

With this change, loop unrolling will see the loops with the int64
code already lowered and will realize that it is too expensive to
unroll.

v2: Run nir_algebraic first so we can hopefully get rid of some of
    the int64 instructions before we even attempt to lower them.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-06 07:49:27 +01:00
Ilia Mirkin
02a6d901ee mesa: add OES_EGL_image_external_essl3 support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-06 07:28:11 +02:00
Vinson Lee
fe32f796f2 r600/fp64: Fix build.
CC       r600_shader.lo
r600_shader.c: In function ‘egcm_int_to_double’:
r600_shader.c:4543:12: error: ‘ctx’ is a pointer; did you mean to use ‘->’?
     if (ctx.bc->chip_class == CAYMAN)
            ^
            ->

Fixes: 35b4301577 ("r600/fp64: fix integer->double conversion")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 15:32:20 -08:00
Dave Airlie
35b4301577 r600/fp64: fix integer->double conversion
Doing a straight uint/int->fp32->fp64 conversion causes
some precision issues, Roland suggested splitting the
integer into two portions and doing two separate
int->fp32->fp64 conversions then adding the results.

This passes the tests in CTS and piglit.

[airlied: fix cypress conversion opcodes]
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-06 08:21:48 +10:00
Samuel Pitoiset
0170ae1e23 ac/nir: remove emission of nir_op_fdiv
RadeonSI and RADV lower fdiv.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-05 23:09:34 +01:00
Jon Turney
b5af199f92 travis: add macOS meson build
v2: Simplify set of options now we have better defaults

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:42:01 +00:00
Jon Turney
80bc41b2ec meson: osx ld doesn't support --build-id
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 19:40:43 +00:00
Jon Turney
ea8730024f meson: build src/glx/apple
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:40:43 +00:00
Dylan Baker
569628dd24 meson: set apple glx defines
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:40:43 +00:00
Jon Turney
4772909447 meson: better defaults for osx, windows and cygwin
set suitable defaults for 'dri-drivers', 'gallium-drivers', 'vulkan-drivers'
and 'platforms' options for osx, windows and cygwin, adding cygwin where
appropriate.

v2: error() for unknown OS

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 19:34:37 +00:00
Matt Turner
e2b31e9acf i965: Move mistakenly placed line
Ken called this out in review, but it seems I forgot to make the change.
I noticed that the control flow annotations in the fragment shader
disassembly of tests/shaders/glsl-fs-loop-continue.shader_test were not
correct, and moving this line to the correct place fixes it.
2018-02-05 09:50:56 -08:00
Juan A. Suarez Romero
4195eed961 glsl/linker: check same name is not used in block and outside
According with OpenGL GLSL 3.20 spec, section 4.3.9:

  "It is a link-time error if any particular shader interface
   contains:
     - two different blocks, each having no instance name, and each
       having a member of the same name, or
     - a variable outside a block, and a block with no instance name,
       where the variable has the same name as a member in the block."

This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the
same name in unnamed block and outside") that covered this case, but
did not take in account that precision qualifiers are ignored when
comparing blocks with no instance name.

With this commit, the original tests
KHR-GL*.shaders.uniform_block.common.name_matching keep fixed, and also
dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision
regression is fixed, which was broken by previous commit.

v2: use helper varibles (Matteo Bruni)

Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777
CC: Mark Janes <mark.a.janes@intel.com>
CC: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-02-05 18:10:43 +01:00
Juan A. Suarez Romero
3d14e72057 mesa: enable ASTC format for CompressedTexSubImage3D
If extensions GL_KHR_texture_compression_astc_hdr or
GL_KHR_texture_compression_astc_sliced_3d are implemented then ASTC
format are supported in CompressedTex*Îmage3D.

Fixes KHR-GLES2.texture_3d.* with this format.

CC: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-02-05 17:00:19 +01:00
Stephan Gerhold
02e2009b92 util/build-id: Fix address comparison for binaries with LOAD vaddr > 0
build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD
segment has a virtual address other than 0x0.

For most shared libraries, the first LOAD segment has vaddr=0x0:

    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    LOAD           0x000000 0x00000000 0x00000000 0x2d2e26 0x2d2e26 R E 0x1000
    LOAD           0x2d2e54 0x002d3e54 0x002d3e54 0x2e248 0x2f148 RW  0x1000

However, compiling the Intel Vulkan driver as 32-bit binary on Android produces
the following ELF header with vaddr=0x8000 instead:

    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    PHDR           0x000034 0x00008034 0x00008034 0x00100 0x00100 R   0x4
    LOAD           0x000000 0x00008000 0x00008000 0x224a04 0x224a04 R E 0x1000
    LOAD           0x225710 0x0022e710 0x0022e710 0x25988 0x27364 RW  0x1000

build_id_find_nhdr_callback() compares the address of dli_fbase from dladdr()
and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a
different memory address, e.g.:

    dli_fbase=0xd8395000 (offset 0x8000)
    dlpi_addr=0xd838d000

At least on glibc and bionic (Android) dli_fbase refers to the address where
the shared object is mapped into the process space, whereas dlpi_addr is just
the base address for the vaddrs declared in the ELF header.

To compare them correctly, we need to calculate the start of the mapping
by adding the vaddr of the first LOAD segment to the base address.

Note: musl users will need the following patch.
https://git.musl-libc.org/cgit/musl/commit/?id=b3ae7beabb9f0c219bb8a8b63567a01c6530c1ac

Cc: Chad Versace <chadversary@chromium.org>
Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642
Fixes: 5c98d38 "util: Query build-id by symbol address, not library name"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-05 14:26:33 +00:00
Boyuan Zhang
d645b0850a radeonsi: enable vcn encode for HEVC main
Enable vcn encode for HEVC main profile on Raven.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
5534a2791f st/va: implement HEVC encode functions
Implement HEVC encode functions based on VAAPI HEVC encode interface.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
9ac50a2e0c st/va: add HEVC encode functions
Add a separate file for HEVC encode functions.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
66087d8a2d st/va: enable dual instances encode only for H264
Logics that related to dual instances encode should only be done for
H264, not other codecs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
a9c0861c6c st/va: add entrypoint check for HEVC
Add entrypoint check for HEVC to differentiate decode and encode jobs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
ecc3944344 st/va: add HEVC picture desc
Add HEVC picture desc, and add codec check when creating and destroying
context.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
9393b53c29 st/va: move H264 enc functions into separate file
Move all H264 encode related functions into separate file. Similar to
VAAPI decode side, there will be separate file for each codec on encode
side as well.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
b391d34916 radeon/vcn: add header implementations for HEVC
Implement encoding of sps, pps, vps, aud, and slice headers for HEVC
based on HEVC specs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
fdc952b320 radeon/vcn: add ib implementations for HEVC
Implement required ibs for vcn HEVC encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
5ab73edddb radeon/vcn: support picture parameters for HEVC
Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that
it can be used for different codecs. Add functions to handle picture
parameters that will be used for HEVC encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
db67d04df3 radeon/vcn: add vcn encode interface for HEVC
Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic
to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
f410936439 vl: add parameters for HEVC encode
Add HEVC encode interface

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Eric Anholt
aa2f609f70 broadcom/vc5: Ignore samplers for finding uniform offsets.
Fixes:
KHR-GLES3.shaders.struct.uniform.sampler_array_fragment
KHR-GLES3.shaders.struct.uniform.sampler_array_vertex
KHR-GLES3.shaders.struct.uniform.sampler_nested_fragment
KHR-GLES3.shaders.struct.uniform.sampler_nested_vertex
2018-02-05 13:56:02 +00:00
Eric Anholt
63a8a0f3c0 broadcom/vc5: Fix non-mipfiltered sampling.
We need to clamp the LOD to 0 if mip filtering is disabled.  This is part
of fixing KHR-GLES3.shaders.struct.uniform.sampler_array_fragment.
2018-02-05 13:53:38 +00:00
Eric Anholt
e29988c908 broadcom/vc5: Fix "hardwrae" typo in a field name in XML. 2018-02-05 13:53:38 +00:00
Samuel Pitoiset
a1d568c830 ac/nir: fix a crash in load_gs_input() on pre-GFX9 chips
Fixes: df1d5174fc ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-05 11:05:52 +01:00
Eric Anholt
8bb000f460 broadcom/vc5: Try to merge more than 2 QPU instructions together.
Obviously it would be good to have an ADD and a MUL and a signal together,
but we can even potentially have multiple signals merged, as well.

total instructions in shared programs: 100423 -> 97874 (-2.54%)
instructions in affected programs:     78812 -> 76263 (-3.23%)
2018-02-05 09:29:37 +00:00
Eric Anholt
dc78643ace broadcom/vc5: Remove no-op MOVs after register allocation.
We emit some MOVs to track lifetimes of payload registers, but we don't
need there to be actual MOV instructions for them.

total instructions in shared programs: 101045 -> 100423 (-0.62%)
instructions in affected programs:     37083 -> 36461 (-1.68%)
2018-02-05 09:29:37 +00:00
Eric Anholt
f3978a7380 broadcom/vc5: Add missing shader-db instruction counting.
I must have misplaced it in the instruction packing rework.
2018-02-05 09:29:37 +00:00
Dave Airlie
7801425028 r600: fix resq for buffer images.
If this is an image buffer, we need to calculate the correct resource
id.

Fixes:
KHR-GL45.shader_image_size.*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-05 05:15:41 +10:00
Dave Airlie
6c1432f0be r600/eg: fix cube map array buffer images.
This fixes a crash in:
KHR-GL45.texture_cube_map_array.texture_size_compute_sh.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-05 05:14:56 +10:00
Marek Olšák
af3685d149 mesa: change ctx->Color.ColorMask into a 32-bit bitmask
4 bits per draw buffer, 8 draw buffers in total --> 32 bits.

This is easier to work with.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-04 01:50:10 +01:00
Jordan Justen
83e60ce927 i965: Create new program cache bo when clearing the program cache
When the disk shader cache CI testing was enabled, we started noticing
occasional failures on deqp test runs. (Mainly SNB, rarely HSW)

Before this change, when we cleared the (in memory) program cache we
reused the same bo. Since the disk shader cache quickly restores
programs, it appears that this would lead to overwrites of the older
program binaries in the in memory program cache that apparently were
still executing in some cases. If these programs were still executing,
this could cause a GPU hang.

This issue is probably not disk shader cache specific, but may have
been hidden due to the compiler taking time to recompile programs
after the cache was cleared.

v2:
 * Don't add `copy` param to brw_cache_new_bo (Ken)
 * Call from brw_program_cache_check_size (Ken)

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-03 12:16:58 -08:00
Jason Ekstrand
589e9db23f aubinator: Multiply count by 4 to compute buffer sizes
The count field is in terms of dwords and not bytes.  In
7d4007d58a, I fixed one instance
of this but missed another.
2018-02-02 22:30:56 -08:00
Eric Anholt
2e746bc63d broadcom/vc5: Enable UIF XOR on textures.
This should increase performance by reducing SDRAM bank conflicts when
crossing between UIF columns (particularly on power-of-two height
textures).

The uif_xor_disable setup is dropped, since we need to allow XOR on lower
miplevels even when level 0 is XOR.  The level 0 force UIF and level 0 XOR
flags should handle setting XOR properly on imported buffers.
2018-02-02 16:50:02 -08:00
Eric Anholt
6a862b0de7 broadcom/vc5: Fix alignment of miplevel 1 with UIF.
The alignment here means that we can't get back the padded height from the
size/stride any more, so it's now a field in the slice as well.

Fixes piglit fbo-generatemipmap-formats RGBA16 NPOT.
2018-02-02 16:27:49 -08:00
Eric Anholt
5c57e0a549 broadcom/vc5: Switch our RGBA4 support to the new gallium format.
Fixes fbo-generatemipmap-formats, fbo-alphatest-formats, etc. tests for
GL_RGBA4, GL_RGB4, GL_RGBA2, etc.
2018-02-02 16:27:49 -08:00
Eric Anholt
2a97f1d3ef gallium: Add a new A4B4G4R4 pipe format for Broadcom.
The VC5 HW puts A in the low bits and R in the high bits.  We can't just
swizzle in the shaders because the blending HW can't pick what channel A
is in, so make a new format to match it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-02 16:27:49 -08:00
Eric Anholt
1429cd74c2 mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.
swapBytes operates on bytes, not 4-bit channels, so you can't just take
non-swapBytes cases and flip the REV flag.

Avoids piglit texture-packed-formats regressions when enabling the
ABGR4444 format.

Fixes: c5a5c9a7db ("mesa/formats: add new mesa formats and their pack/unpack functions.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-02 16:27:49 -08:00
George Kyriazis
bbef9474fa meson/swr: Updated copyright dates
cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-02 17:43:07 -06:00
George Kyriazis
16bf813830 meson/swr: re-shuffle generated files
Move generated files from codegen/meson.build to other directories, in order
to satisfy generated include file dependencies

Add correct file lists for architecture-specific libraries.

cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-02 17:43:00 -06:00
Marek Olšák
3bf1e036e8 amd: remove support for LLVM 3.9
Only these are supported:
- LLVM 4.0
- LLVM 5.0
- LLVM 6.0
- master (7.0)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 23:47:40 +01:00
Dylan Baker
c75a4e5b46 meson: Check for actual LLVM required versions
Currently we always check for 3.9.0, which is pretty safe since
everything except radv work with >= 3.9 and 3.9 is pretty old at this
point. However, radv actually requires 4.0, and there is a patch for
radeonsi to do the same.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 13:22:58 -08:00
Dylan Baker
d7235ef83b meson: Don't confuse the install and search paths for dri drivers
Currently there is not a separate option for setting the search path of
DRI drivers in meson, like there is in scons and autotools. This is an
oversight and needs to be fixed. This adds an extra option
`dri-search-path`, which will default to the value of
`dri-drivers-path`, like autotools does.

v2: - Split input list before joining.
v3: - use : instead of ; as the delimiter. The autotools help string
      incorrectly says ; but the code uses :
v4: - Take list in pre : delimited form (Ilia)
    - Ensure that the dri-search-path is absolute when using
      dri_drivers_path

Fixes: db9788420d ("meson: Add support for configuring dri drivers directory.")
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)
2018-02-02 11:01:42 -08:00
Marek Olšák
847d0a393d radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-02 16:46:22 +01:00
Jon Turney
b3a1d9588e travis: add osx autotools build
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
4701379d96 travis: pip -> pip2
On travis, for OSX, python2 from homebrew is pre-installed. per [1]:

 python points to the macOS system Python (with no manual PATH modification)
 python2 points to Homebrew’s Python 2.7.x (if installed)
 python3 points to Homebrew’s Python 3.x (if installed)
 pip doesn't exist
 pip2 points to Homebrew’s Python 2.7.x’s pip (if installed)
 pip3 points to Homebrew’s Python 3.x’s pip (if installed)

We will end up using 'python2' for building mesa.

Just use 'pip2' instead of 'pip', as that seems to work for all platforms on
travis.

[1] https://docs.brew.sh/Homebrew-and-Python.html

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
7d1ec6d6a9 travis: conditionalize building of prerequisites on if OS=linux
Use a '|' YAML literal block to avoid the convoluted syntax needed to put
the entire conditional on a single line.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
63041ba613 glx/test: fix building for osx
An additional stub for applegl_create_context() is needed
Cannot test indirect API as it's not built on osx, currently

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Andres Gomez
4761a8fea6 i965: check if upload is 0 explicitely, when downsizing a format
downsize_format_if_needed takes an integer as number of uploads
parameter. Hence, let's do an integer comparation instead of a boolean
check, since that is confusing.

Since we are at it, fix a couple of wrongly tabbed indents.

Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-02-02 16:32:30 +02:00
Marek Olšák
51d36f5e02 mesa: don't flag _NEW_COLOR for KHR adv.blend if prog constant doesn't change
This only affects drivers that set DriverFlags.NewBlend.

v2: - fix typo advanded -> advanced
    - return "enum gl_advanced_blend_mode" from
      _mesa_get_advanced_blend_sh_constant
    - don't call FLUSH_VERTICES twice

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-02 15:06:47 +01:00
Samuel Pitoiset
df1d5174fc ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load
The old one generates useless instructions in there, found while
comparing geometry shaders between RadeonSI and RADV.

This improves all Vulkan demos that use geometry shaders, +4%
for deferredshadows, +9% for viewportarray, +7% for
geometryshader on Polaris10.

This seems to also improve DOW3 a little bit (+1%).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by:  Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 12:32:21 +01:00
Dave Airlie
f9c121c420 r600/eg: add crap indirect compute support.
I think the cp packets can be made work, but I think it might
need a kernel change, so for now just do the worst thing.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-02 16:50:18 +10:00
Jason Ekstrand
2f7205be47 i965: Call prepare_external after implicit window-system MSAA resolves
This fixes some rendering corruption in a couple of Android apps that
use window-system MSAA.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-01 21:45:25 -08:00
Roland Scheidegger
c2f0e08857 r600: don't do stack workarounds for hemlock
By the looks of it it seems hemlock is treated separately to cypress, but
certainly it won't need the stack workarounds cedar/redwood (and
seemingly every other eg chip except cypress/juniper) need.
(Discovered by accident.)

Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-02 01:46:43 +01:00
Dave Airlie
8fa5aade43 r600: initial attempt at gl_HelperInvocation (v3)
This passes the CTS and piglit tests.

This also disable sb for helper invocations until it doesn't
mess up the VPM flags.

Thanks to Ilia and Glenn for advice, and Roland for working
out the working evergreen path.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-02 09:46:05 +10:00
Bas Nieuwenhuizen
2ffe395cba radv: Don't expose VK_KHX_multiview on android.
deqp does not allow any KHX extensions, and since deqp is included
in android-cts, android does not allow any khx extensions.

So disable VK_KHX_multiview on android.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-01 23:32:48 +01:00
Mathias Fröhlich
5b3d58520f vbo: Simplify input array distribution for dlist type draws.
Using the newly introduced VAO array maps, we can
simplify vbo_bind_vertex_list.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:08 +01:00
Mathias Fröhlich
fb10a7b7b0 vbo: Simplify input array distribution for imm type draws.
Using the newly introduced VAO array maps, we can
simplify vbo_exec_bind_arrays.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:08 +01:00
Mathias Fröhlich
44b1454b96 vbo: Simplify input array distribution for array type draws.
Using the newly introduced VAO state variable, we can
simplify recalculate_input_bindings.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:07 +01:00
Mathias Fröhlich
3d4fb879dd vbo: Use static const VERT_ATTRIB->VBO_ATTRIB maps.
Instead of each context having its own map instance for
this purpose, use a global static const map.

v2: s,unsigned char,GLubyte,g
    s,_VP_MODE_MAX,VP_MODE_MAX,g
    Change comment style.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:07 +01:00
Mathias Fröhlich
b4fd63015a mesa: Track position/generic0 aliasing in the VAO.
Since the first material attribute no longer aliases with
the generic0 attribute, only aliasing between generic0 and
position is left and entirely dependent on the enabled
state of the VAO. So introduce a gl_attribute_map_mode
in the VAO that is used to track how the position
and the generic 0 attribute alias.
Provide a static const array that can be used to
map from vertex program input indices to VERT_ATTRIB_*
indices. The outer dimension of the array is meant to
be indexed directly by the new VAO member variable.
Also provide methods on the VAO to convert bitmasks of
VERT_BIT's from the VAO numbering to the vertex processing
inputs numbering.

v2: s,unsigned char,GLubyte,g
    s,_ATTRIBUTE_MAP_MODE_MAX,ATTRIBUTE_MAP_MODE_MAX,g
    Change comment style, add comments.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
186f03cfb0 mesa: Put materials at the end of the generic block.
The materials are now moved to the end of the
generic attributes block to the range 4-15.

Before, the way the position and generic 0 attribute
is handled was dependent on the presence and kind of
the currently attached vertex program. With this
change the way the position attribute and the generic 0
attribute is treated only depends on the enabled
flag of those two arrays.
This will later help to untangle the update dependencies
between enabled arrays and shader inputs.

v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
38b41fd718 mesa: Use defines for the aliased material array attributes.
Instead of just assuming that the material attributes
just overlap with the generic attributes 0-12, give
them symbolic defines so that we can easier move them
to an other range.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
f37e29ac22 vbo: Correctly handle attribute offsets in dlist draw.
When executing a display list draw, for the offset
list to be correct, the offset computation needs to
accumulate all attribute size values in order.
Specifically, if we are shuffling around the position
and generic0 attributes, we may violate the order or
if we do not walk the generic vbo attributes we may
skip some of the attributes.
Even if this is an unlikely usecase we can fix this use
case by precomputing the offsets on the full attribute list
and store the full offset list in the display list node.

v2: Formatting fix
v3: Rebase

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:05 +01:00
Brian Paul
7a044ef68b gallivm/llvmpipe: add const qualifiers on sampler variables
Once a lp_build_sampler_soa or lp_build_sampler_aos object is created,
it should never be modified.  Found by inspection.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-01 14:19:58 -07:00
Brian Paul
1bdbeae17c vbo: change an argument in vbo_draw_indirect_prims()
In vbo_draw_indirect_prims() pass the 'indirect_data' argument to
vbo->draw_prims().  All the callers are passing ctx->DrawIndirectBuffer
so this should be no functional change.  Add a (temporary) assertion to
be sure.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
1b7ad3ae97 vbo: add comments on the VBO draw function typedefs
And rename indirect_params -> indirect_draw_count_buffer and
indirect_params_offset -> indirect_draw_count_offset to be more
specific.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
c7bf05c833 vbo: s/drawcount/drawcount_offset
This parameter (from the glMultiDrawArraysIndirectCountARB function)
is poorly named.  It's an offset into the buffer which contains the
number of primitives to draw.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
b0a2f38db9 vbo: use vbo local var for draw call in vbo_save_playback_vertex_list()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
84c3641864 svga: remove unneeded #includes in svga_pipe_draw.c
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
fa98730bf3 svga: whitespace/formatting fixes in svga_pipe_draw.c
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
7a1401938b svga: clean up retry_draw_range_elements(), retry_draw_arrays()
Get rid of a bunch of goto spaghetti.  Remove unneeded do_retry parameter.
No Piglit changes.  Also tested w/ Google Earth and other apps.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
c744289552 svga: remove unused min/max_index params to draw_vgpu10()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Eric Anholt
06858c7348 broadcom/vc5: Fix image_h setup for both loads and stores.
The image_h for the tiling algorithm needs to be the padded-to-a-uifblock
height of the level, not the unpadded height or the height of level 0.
Fixes some cases of KHR-GLES3.texture_repeat_mode.* and
depthstencil-render-miplevels.
2018-02-01 11:02:29 -08:00
Eric Anholt
5329f35ea1 broadcom/vc5: Add appropriate height padding for bank conflicts.
I thought I didn't need this because I was doing level-0-always-UIF and
that the pad there would propagate down, but it turns out that for level 1
the padding ends up being chosen by the HW.  This brings us closer to
being able to turn on UIF XOR for increased performance, as well.
2018-02-01 11:02:29 -08:00
Eric Anholt
dea902c933 broadcom/vc5: Simplify separate stencil surface setup.
If we just make another gallium surface for the separate stencil, it's a
lot easier to keep track of which set of fields we're using in RCL setup.

This also incidentally fixes a little bug in setting up the surface's
padded height for separate stencil when the UIF-ness changes at different
levels of Z versus stencil.
2018-02-01 11:02:29 -08:00
Eric Anholt
7239b3edbe broadcom/vc5: Rename the UIFCFG register in the UAPI.
This matches the naming of the other hub regs we get, and I don't know for
sure if UIFCFG will be the same register between the hub and the cores on
all versions.
2018-02-01 11:02:29 -08:00
Eric Anholt
353b42ccc7 broadcom/vc5: Fix a segfault on mix of booleans.
We don't have a src1 to look up if the compare instruction is "i2b".
2018-02-01 11:02:29 -08:00
Eric Anholt
eb765394c2 broadcom/vc5: Skip over missing color buffers for a couple of checks.
Fixes crashes in piglit alpha-to-coverage-no-draw-buffer-zero 2
2018-02-01 11:02:29 -08:00
Eric Anholt
aec066c7aa broadcom/vc5: Add the missing PIPE_CAP_FENCE_SIGNAL. 2018-02-01 11:02:29 -08:00
Baldur Karlsson
030821a873 mesa: fix query of GL_TEXTURE_COMPRESSION_HINT_ARB
Fixes: f96a69f916 ("mesa: replace GLenum with GLenum16 in common
structures (v4)")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104908
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 11:58:02 -07:00
Lucas Stach
0c71a19fe4 renderonly: fix dumb BO allocation for non 32bpp formats
Take into account the resource format, instead of applying a hardcoded
32bpp. This not only over-allocates 16bpp formats, but also results in
a wrong stride being filled into the handle.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-01 19:36:17 +01:00
Kenneth Graunke
85ec7abc3f intel/decoder: Fix control / evaluation label mixup.
Trivial.  DS is TES, HS is TCS.
2018-02-01 09:44:15 -08:00
Kenneth Graunke
c3cd2aac27 i965: Bump official kernel requirement to Linux v3.9.
In commit 3f353342a6 (present in 17.3.0)
we started unconditionally using I915_EXEC_NO_RELOC, which was
introduced in Linux v3.9.  ChromeOS kernel 3.8 has backported this,
so it should work too.

Running on older kernels would likely result in every single batch
being rejected by the kernel, which is pretty catastrophic.  Yet, it
appears that nobody noticed.  So, let's just bump the official
requirement and move forward ever so slowly.

Fixes: 3f353342a6 ("i965: Use I915_EXEC_NO_RELOC")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 07:58:58 -08:00
Marc Dietrich
4c5f0b4fd4 meson: don't install windows headers on non-windows platforms
Only dive into the windows subdir if windows platform is selected.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Fixes: 5ef75cb02b "meson: build src/glx/windows"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-01 15:33:02 +00:00
Marek Olšák
71c6f64e54 radeonsi: use ac_build_buffer_load_format for image buffer loads
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
b0a6053a99 ac/nir: use ac_build_buffer_load_format for image buffer loads
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
bac9fa9f17 ac: add glc parameter to ac_build_buffer_load_format
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
be973ed21f radeonsi: load the right number of components for VS inputs and TBOs
The supported counts are 1, 2, 4. (3=4)

The following snippet loads float, vec2, vec3, and vec4:

Before:
    buffer_load_format_x v9, v4, s[0:3], 0 idxen          ; E0002000 80000904
    buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen  ; E00C2000 80020005
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen   ; E00C2000 80010507

After:
    buffer_load_format_x v10, v4, s[0:3], 0 idxen         ; E0002000 80000A04
    buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen    ; E0042000 80020805
    buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen   ; E00C2000 80010307

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
472361dd7e radeonsi: remove unused si_shader_context members
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Jon Turney
d3540b405b glx/apple: locate dispatch table functions to wrap by name
Avoid reaching into the dispatch table internals (and thus having to deal
with the complexities of remap etc.) by identifying functions to wrap by
name.

See:
https://lists.freedesktop.org/archives/mesa-dev/2015-June/086721.html et seq.
https://bugs.freedesktop.org/show_bug.cgi?id=90311

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:14:08 +00:00
Jon Turney
b37b7b42dc glx/apple: include util/debug.h for env_var_as_boolean prototype
mesa/src/glx/glxcmds.c:1295:21: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
mesa/src/glx/apple/apple_visual.c:85:28: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:14:02 +00:00
Jon Turney
f8ed9f24d5 osx: ld doesn't support --build-id
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:13:56 +00:00
Jon Turney
7ad7a07c88 configure: Default to gbm=no on osx
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:13:00 +00:00
Andres Rodriguez
bbd00844a2 mesa: remove usage of alloca in externalobjects.c v4
Don't want an overly large numBufferBarriers/numTextureBarriers to blow
up the stack.

v2: handle malloc errors
v3: fix patch
v4: initialize texObjs/bufObjs

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
2018-02-01 09:48:04 -05:00
Samuel Pitoiset
2ef5ce1198 radv: do not insert shaders in cache when it's disabled
When the application doesn't provide its own pipeline cache,
the driver uses a in-memory cache but it shouldn't insert any
entries when the cache is explicitely disabled by the user.

Found while running my experimental pipeline-db tool with a
ton of shaders, the memory footprint was just huge, and sometimes
the process was even killed...

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:40:11 +01:00
Samuel Pitoiset
4922e7f25c radv: use separate bindings for graphics and compute descriptors
The Vulkan spec says:

   "pipelineBindPoint is a VkPipelineBindPoint indicating whether
    the descriptors will be used by graphics pipelines or compute
    pipelines. There is a separate set of bind points for each of
    graphics and compute, so binding one does not disturb the other."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104732
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:37:09 +01:00
Samuel Pitoiset
cf224014dd radv: store the bind point when creating descriptors with templates
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:37:07 +01:00
Dave Airlie
7ea15a36fb r600/eg: make sure we allow vpm bit on other CF ops.
the vpm bit wasn't being applied to the push/pop instructions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 13:41:32 +10:00
Timothy Arceri
4d982ae2c7 gallium/st/clover: remove unused PIPE_SHADER_IR_LLVM
This has been unused since 100796c15c.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-02-01 13:56:34 +11:00
Dave Airlie
0491d5425f r600/sb: just add some missing debug bits
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 12:06:40 +10:00
Dave Airlie
df155a73f4 r600: fix buffer resinfo opcode translation.
The vtx operations never got translated, so things worked by
0 being equal to 0, translate them so we can use the proper buffer
resinfo code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 11:59:55 +10:00
Timothy Arceri
679e4e7a46 st/glsl_to_nir: add more nir opts to st_nir_opts()
All of the current gallium nir driver use these optimisations but
they do so in their backends. Having these called in the backend
only can cause a number of problems:

- Shader compile times are greater because the opts need to do
  significant passes over all shader variants.
- The shader cache is partially defeated due to the significant
  optimisation passes over variants.
- We might miss out on nir linking optimisation opportunities.

Adding these passes to st_nir_opts() alleviates these problems.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-01 09:42:57 +11:00
Andres Gomez
5a7aba2e0a i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the <size> passed
to the VertexAttribL* commands was <= 2.

In 61a8a55f55 ("i965/gen8: Fix vertex attrib upload for dvec3/4
shader inputs"), for gen8+ we needed to determine if the attrib was
dual slot to emit 128 or 256-bit, independently of the VAO size.

Similarly, for gen < 8 we also need to determine whether the attrib is
dual slot to force the emission of 256-bits through 2 uploads.

Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
second upload to fill these unspecified components with zeros, as we
also do for gen8+.

Fixes the following test on Haswell:
KHR-GL46.vertex_attrib_binding.basic-inputL-case1

v2: Added more inline comments to explain why we are using
    ISL_FORMAT_R32_FLOAT and its consequences, as requested by
    Alejandro and Antía.

Fixes: 75968a668e ("i965/gen7: expose OpenGL 4.2 on Haswell when
supported")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Antia Puentes <apuentes@igalia.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-31 22:50:06 +02:00
Kenneth Graunke
ab1f2e6bc4 i965: Make texture validation code use texture objects, not units.
This requires moving the _MaxLevel handling up to the callers.  Another
user of intel_finalize_mipmap_tree will be added later that depends on
_MaxLevel not being modified.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-31 11:33:52 -08:00
Kenneth Graunke
0a2e878c69 i965: Pass tObj into intel_update_max_level instead of intel_obj.
We want both anyway, but this will simplify things a tiny bit in an
upcoming patch.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-31 11:33:52 -08:00
Kenneth Graunke
876f1537e9 i965: Delete more misleading comments.
brw_bo_wait_rendering used to take a brw_context pointer for perf_debug
messages about stalls.  Chris eliminated that in 833108ac14.
This message about passing NULL to avoid those warnings is no longer
relevant, and just adds confusion.  So, drop it.
2018-01-31 11:33:52 -08:00
Andres Rodriguez
8996610acb docs/features: mark EXT_semaphore(_fd) as DONE v2
Support for these extensions is available in radeonsi.

v2: also updated relnotes

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
2018-01-31 12:31:40 -05:00
Brian Paul
d32c22a13f st/mesa: whitespace, formatting fixes in st_glsl_to_tgsi.cpp
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
3b3d8275d8 st/mesa: s/int/GLenum/ in st_glsl_to_tgsi.cpp
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
1882ec4ff7 svga: use opcode local var to simplify some code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
338c35c427 svga: s/unsigned/VGPU10_OPCODE_TYPE/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Samuel Pitoiset
a097a6f519 radv: do not dump meta shader stats
That's quite useless and that pollutes the output.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-31 14:10:26 +01:00
Samuel Pitoiset
26cc3e74b9 ac/nir: fix emission of ffract for 64-bit
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-31 14:10:24 +01:00
Eric Engestrom
2f0db33527 meson: dedup gallium-xa logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
fa5d616bf9 meson: dedup gallium-va logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
86168ed31c meson: dedup gallium-omx logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
724916c8a8 meson: dedup gallium-xvmc logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
992af0a4b8 meson: dedup gallium-vdpau logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Antia Puentes
0da434fb47 Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format"
This reverts commit 513c2263cb.

_mesa_base_fbo_format_ is used to validate the internalformat
passed to RenderbufferStorage, which in the OpenGL 4.6 is said:

"An INVALID_ENUM error is generated if internalformat is not one of the
color-renderable, depth-renderable, or stencil-renderable formats defined
in section 9.4."

RGB9_E5 format is not renderable, as stated in the same specification
(Bug 9338).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794

Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-01-31 12:06:00 +01:00
Michel Dänzer
1cf1bf32ef winsys/radeon: Compute is_displayable in surf_drm_to_winsys
It was always 0, breaking (at least) DRI3 with Xwayland.

Bugzilla: https://bugs.freedesktop.org/104306
Fixes: 5f2073be32 ("ac/surface: add ac_surface::is_displayable")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:53:58 +01:00
Matthew Nicholls
ef272b161e radv: remove predication on cache flushes
This can lead to a situation where cache flushes could get conditionally
disabled while still clearing the flush_bits, and thus flushes due to
application pipeline barriers may never get executed.

Fixes: a6c2001ace (radv: add support for cmd predication.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 13:37:18 +10:00
Brian Paul
1ea9efd2f8 mesa: fix broken glGet*(GL_POLYGON_MODE) query
This reverts part of the patch which introduced the GLenum16 change.
Fixes a conform regression found by Roland.

Fixes: f96a69f916 ("mesa: replace GLenum with GLenum16 in
common structures (v4)")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-30 20:32:37 -07:00
Dave Airlie
49c61d8b84 virgl: also remove dimension on indirect.
This fixes some dEQP tests that generated bad shaders.

Fixes: b6f6ead19 (virgl: drop const dimensions on first block.)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 12:24:11 +10:00
Marek Olšák
fdf01d0244 radeonsi: remove DBG_PRECOMPILE
it's useless and shader-db stats only report the main shader part.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
148b48646b radeonsi: print shader-db stats for main parts, not final binaries
This is needed to get shader-db stats for LS,HS,ES,GS stages on gfx9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
c02c9ee550 radeonsi: move max_simd_waves computation into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
a7311cd7ee mesa: fix glGet MAX_VERTEX_ATTRIB queries
Broken by f96a69f916

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-31 03:21:20 +01:00
Jason Ekstrand
97938dac36 anv/cmd_buffer: Re-emit the pipeline at every subpass
If we ever hit this edge-case, it can theoretically cause problem for
CNL because we could end up changing render targets without re-emitting
3DSTATE_MULTISAMPLE which is part of the pipeline.  Just get rid of the
edge case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-30 17:16:33 -08:00
Ian Romanick
ee63933a73 nir: Distribute binary operations with constants into bcsel
This was specifically designed to simplify 1+mix(0, a-1, condition) to
mix(1, a, condition) by pushing the 1+ inside.

Skylake, Broadwell, and Haswell had similar results.  Skylake shown.
total instructions in shared programs: 14521753 -> 14521716 (<.01%)
instructions in affected programs: 10619 -> 10582 (-0.35%)
helped: 51
HURT: 14
helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1
helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95%
HURT stats (abs)   min: 1 max: 11 x̄: 2.57 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32%
95% mean confidence interval for instructions value: -1.31 0.17
95% mean confidence interval for instructions %-change: -0.80% -0.27%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533000205 -> 533003533 (<.01%)
cycles in affected programs: 110610 -> 113938 (3.01%)
helped: 43
HURT: 28
helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16
helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67%
HURT stats (abs)   min: 2 max: 3066 x̄: 160.50 x̃: 14
HURT stats (rel)   min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62%
95% mean confidence interval for cycles value: -43.81 137.56
95% mean confidence interval for cycles %-change: -1.47% 3.60%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 10018840 -> 10018713 (<.01%)
instructions in affected programs: 9431 -> 9304 (-1.35%)
helped: 51
HURT: 3
helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1
helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81%
HURT stats (abs)   min: 1 max: 12 x̄: 4.67 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22%
95% mean confidence interval for instructions value: -5.36 0.66
95% mean confidence interval for instructions %-change: -1.66% -0.46%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 87571944 -> 87572785 (<.01%)
cycles in affected programs: 117234 -> 118075 (0.72%)
helped: 42
HURT: 23
helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30
helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74%
HURT stats (abs)   min: 1 max: 2341 x̄: 131.35 x̃: 10
HURT stats (rel)   min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61%
95% mean confidence interval for cycles value: -61.05 86.93
95% mean confidence interval for cycles %-change: -3.47% -0.33%
Inconclusive result (value mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10542933 -> 10542844 (<.01%)
instructions in affected programs: 11487 -> 11398 (-0.77%)
helped: 52
HURT: 3
helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72%
HURT stats (abs)   min: 1 max: 11 x̄: 4.33 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22%
95% mean confidence interval for instructions value: -3.17 -0.07
95% mean confidence interval for instructions %-change: -1.13% -0.52%
Instructions are helped.

total cycles in shared programs: 146098397 -> 146097094 (<.01%)
cycles in affected programs: 128140 -> 126837 (-1.02%)
helped: 47
HURT: 8
helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18
helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95%
HURT stats (abs)   min: 1 max: 16 x̄: 8.75 x̃: 9
HURT stats (rel)   min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34%
95% mean confidence interval for cycles value: -37.49 -9.90
95% mean confidence interval for cycles %-change: -1.22% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886711 -> 7886509 (<.01%)
instructions in affected programs: 10425 -> 10223 (-1.94%)
helped: 50
HURT: 2
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89%
95% mean confidence interval for instructions value: -8.05 0.28
95% mean confidence interval for instructions %-change: -1.83% -0.26%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 178115324 -> 178114612 (<.01%)
cycles in affected programs: 765726 -> 765014 (-0.09%)
helped: 39
HURT: 1
helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8
helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -32.07 -3.53
95% mean confidence interval for cycles %-change: -0.86% 0.10%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857762 -> 4857661 (<.01%)
instructions in affected programs: 5523 -> 5422 (-1.83%)
helped: 25
HURT: 1
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86%
95% mean confidence interval for instructions value: -9.99 2.22
95% mean confidence interval for instructions %-change: -2.01% 0.08%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 122179674 -> 122179194 (<.01%)
cycles in affected programs: 530162 -> 529682 (-0.09%)
helped: 22
HURT: 1
helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7
helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -46.56 4.82
95% mean confidence interval for cycles %-change: -1.20% 0.36%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:15 -08:00
Ian Romanick
03fb13f646 nir: Rearrange logic op-compounded integer compares
Skylake and Broadwell had similar results.  Skylake shown.
total instructions in shared programs: 14521769 -> 14521753 (<.01%)
instructions in affected programs: 8782 -> 8766 (-0.18%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.23% -0.16%
Instructions are helped.

total cycles in shared programs: 533000376 -> 533000205 (<.01%)
cycles in affected programs: 447035 -> 446864 (-0.04%)
helped: 9
HURT: 9
helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40
helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09%
HURT stats (abs)   min: 1 max: 52 x̄: 16.78 x̃: 10
HURT stats (rel)   min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12%
95% mean confidence interval for cycles value: -25.07 6.07
95% mean confidence interval for cycles %-change: -0.08% 0.27%
Inconclusive result (value mean confidence interval includes 0).

No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
053be9f020 nir: Rearrange and-compounded float compares
If both comparisons are used as sources for instructions other than the
iand, this transformation is detrimental.  If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.

It is interesting to me that on Iron Lake only 81 shaders have
instruction counts changed, but 726 shaders have cycle counts changed.

shader-db results:

Skylake
total instructions in shared programs: 14525728 -> 14521017 (-0.03%)
instructions in affected programs: 1164726 -> 1160015 (-0.40%)
helped: 1692
HURT: 5
helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2
helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs)   min: 1 max: 12 x̄: 3.20 x̃: 1
HURT stats (rel)   min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86%
95% mean confidence interval for instructions value: -3.52 -2.03
95% mean confidence interval for instructions %-change: -0.86% -0.74%
Instructions are helped.

total cycles in shared programs: 533115449 -> 532991404 (-0.02%)
cycles in affected programs: 119401803 -> 119277758 (-0.10%)
helped: 1145
HURT: 467
helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18
helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42%
HURT stats (abs)   min: 1 max: 1590 x̄: 92.15 x̃: 15
HURT stats (rel)   min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39%
95% mean confidence interval for cycles value: -122.16 -31.74
95% mean confidence interval for cycles %-change: -0.94% -0.57%
Cycles are helped.

total spills in shared programs: 9597 -> 9534 (-0.66%)
spills in affected programs: 403 -> 340 (-15.63%)
helped: 1
HURT: 1

total fills in shared programs: 13904 -> 13790 (-0.82%)
fills in affected programs: 1627 -> 1513 (-7.01%)
helped: 2
HURT: 1

LOST:   0
GAINED: 2

Broadwell
total instructions in shared programs: 14816966 -> 14812590 (-0.03%)
instructions in affected programs: 1499885 -> 1495509 (-0.29%)
helped: 1672
HURT: 15
helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs)   min: 1 max: 21 x̄: 9.20 x̃: 8
HURT stats (rel)   min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53%
95% mean confidence interval for instructions value: -3.14 -2.05
95% mean confidence interval for instructions %-change: -0.85% -0.73%
Instructions are helped.

total cycles in shared programs: 559353622 -> 559345595 (<.01%)
cycles in affected programs: 139893703 -> 139885676 (<.01%)
helped: 921
HURT: 697
helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18
helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87%
HURT stats (abs)   min: 1 max: 2370 x̄: 178.03 x̃: 38
HURT stats (rel)   min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14%
95% mean confidence interval for cycles value: -59.64 49.72
95% mean confidence interval for cycles %-change: -1.02% -0.66%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 78902 -> 78861 (-0.05%)
spills in affected programs: 2418 -> 2377 (-1.70%)
helped: 1
HURT: 11

total fills in shared programs: 83782 -> 83678 (-0.12%)
fills in affected programs: 3515 -> 3411 (-2.96%)
helped: 2
HURT: 11

LOST:   0
GAINED: 5

Haswell and Ivy Bridge had similar results. Haswell shown.
total instructions in shared programs: 9033898 -> 9032010 (-0.02%)
instructions in affected programs: 308064 -> 306176 (-0.61%)
helped: 921
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1
helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for instructions value: -2.21 -1.87
95% mean confidence interval for instructions %-change: -0.88% -0.68%
Instructions are helped.

total cycles in shared programs: 84628949 -> 84620520 (<.01%)
cycles in affected programs: 2164913 -> 2156484 (-0.39%)
helped: 518
HURT: 359
helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20
helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01%
HURT stats (abs)   min: 1 max: 586 x̄: 36.43 x̃: 8
HURT stats (rel)   min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40%
95% mean confidence interval for cycles value: -15.17 -4.05
95% mean confidence interval for cycles %-change: -0.77% -0.32%
Cycles are helped.

LOST:   0
GAINED: 4

Sandy Bridge
total instructions in shared programs: 10544860 -> 10542933 (-0.02%)
instructions in affected programs: 360019 -> 358092 (-0.54%)
helped: 931
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33%
95% mean confidence interval for instructions value: -2.23 -1.89
95% mean confidence interval for instructions %-change: -0.76% -0.58%
Instructions are helped.

total cycles in shared programs: 146106820 -> 146098397 (<.01%)
cycles in affected programs: 3435047 -> 3426624 (-0.25%)
helped: 572
HURT: 329
helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15
helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33%
HURT stats (abs)   min: 1 max: 1714 x̄: 30.93 x̃: 6
HURT stats (rel)   min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19%
95% mean confidence interval for cycles value: -16.85 -1.85
95% mean confidence interval for cycles %-change: -0.39% -0.01%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 7886925 -> 7886711 (<.01%)
instructions in affected programs: 25763 -> 25549 (-0.83%)
helped: 75
HURT: 6
helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1
helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53%
HURT stats (abs)   min: 1 max: 16 x̄: 6.00 x̃: 1
HURT stats (rel)   min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86%
95% mean confidence interval for instructions value: -3.69 -1.60
95% mean confidence interval for instructions %-change: -2.54% -0.57%
Instructions are helped.

total cycles in shared programs: 178116888 -> 178115324 (<.01%)
cycles in affected programs: 5858790 -> 5857226 (-0.03%)
helped: 484
HURT: 242
helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs)   min: 2 max: 76 x̄: 4.07 x̃: 2
HURT stats (rel)   min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03%
95% mean confidence interval for cycles value: -2.76 -1.55
95% mean confidence interval for cycles %-change: -0.12% 0.01%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857870 -> 4857762 (<.01%)
instructions in affected programs: 13994 -> 13886 (-0.77%)
helped: 39
HURT: 5
helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2
helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48%
HURT stats (abs)   min: 1 max: 16 x̄: 4.00 x̃: 1
HURT stats (rel)   min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86%
95% mean confidence interval for instructions value: -3.86 -1.05
95% mean confidence interval for instructions %-change: -2.61% 0.04%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 122180744 -> 122179674 (<.01%)
cycles in affected programs: 3686646 -> 3685576 (-0.03%)
helped: 273
HURT: 141
helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs)   min: 2 max: 76 x̄: 3.66 x̃: 2
HURT stats (rel)   min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02%
95% mean confidence interval for cycles value: -3.42 -1.75
95% mean confidence interval for cycles %-change: -0.15% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
821e7a4d32 nir: Separate a weird compare with zero to two compares with zero
min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0).

No shader-db changes, but it does prevent 6 to 12 instruction
regressions in the next patch on all measured Intel platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
68420d8322 nir: Simplify min and max of b2f
v2: Rebase on almost 2 years.  Require that one of the arguments to fmin
or fmax be used only once.  This prevents some regressions.

shader-db results:

Skylake and Broadwell had similar results.  Skylake shown.
total instructions in shared programs: 14526021 -> 14525913 (<.01%)
instructions in affected programs: 4613 -> 4505 (-2.34%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4
helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42%

total cycles in shared programs: 533118710 -> 533118403 (<.01%)
cycles in affected programs: 34334 -> 34027 (-0.89%)
helped: 24
HURT: 0
helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14
helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03%

No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
d8d18516b0 nir: Undo possible damage caused by rearranging or-compounded float compares
shader-db results:

Skylake and Broadwell had similar results (Skylake shown)
total instructions in shared programs: 14525898 -> 14525836 (<.01%)
instructions in affected programs: 1964 -> 1902 (-3.16%)
helped: 14
HURT: 0
helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1
helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86%
95% mean confidence interval for instructions value: -9.46 0.60
95% mean confidence interval for instructions %-change: -3.97% -0.24%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533119892 -> 533115756 (<.01%)
cycles in affected programs: 96061 -> 91925 (-4.31%)
helped: 13
HURT: 1
helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300
helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42%
HURT stats (abs)   min: 8 max: 8 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46%
95% mean confidence interval for cycles value: -379.43 -211.43
95% mean confidence interval for cycles %-change: -4.84% -3.01%
Cycles are helped.

Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown).
total instructions in shared programs: 9033948 -> 9033898 (<.01%)
instructions in affected programs: 535 -> 485 (-9.35%)
helped: 2
HURT: 0

total cycles in shared programs: 84631402 -> 84628949 (<.01%)
cycles in affected programs: 63197 -> 60744 (-3.88%)
helped: 13
HURT: 2
helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140
helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01%
HURT stats (abs)   min: 4 max: 8 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31%
95% mean confidence interval for cycles value: -253.40 -73.67
95% mean confidence interval for cycles %-change: -4.24% -2.25%
Cycles are helped.

No changes on GM45 or Iron Lake.

v2: Add a couple more tautological compares.  Suggested by Elie.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
3941cba0f7 nir: Be more conservative about rearranging or-compounded compares
If both comparisons are used as sources for instructions other than the
ior, this transformation is detrimental.  If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.

shader-db results:

Skylake
total instructions in shared programs: 14526147 -> 14525898 (<.01%)
instructions in affected programs: 70239 -> 69990 (-0.35%)
helped: 102
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20%
95% mean confidence interval for instructions value: -2.86 -2.02
95% mean confidence interval for instructions %-change: -0.46% -0.31%
Instructions are helped.

total cycles in shared programs: 533120531 -> 533119892 (<.01%)
cycles in affected programs: 994875 -> 994236 (-0.06%)
helped: 76
HURT: 26
helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13
helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18%
HURT stats (abs)   min: 1 max: 167 x̄: 54.62 x̃: 26
HURT stats (rel)   min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39%
95% mean confidence interval for cycles value: -19.44 6.91
95% mean confidence interval for cycles %-change: -0.30% 0.15%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14816005 -> 14815787 (<.01%)
instructions in affected programs: 64658 -> 64440 (-0.34%)
helped: 97
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1
helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20%
95% mean confidence interval for instructions value: -2.62 -1.87
95% mean confidence interval for instructions %-change: -0.45% -0.30%
Instructions are helped.

total cycles in shared programs: 559340386 -> 559339907 (<.01%)
cycles in affected programs: 1090491 -> 1090012 (-0.04%)
helped: 66
HURT: 28
helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16
helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27%
HURT stats (abs)   min: 2 max: 226 x̄: 39.07 x̃: 11
HURT stats (rel)   min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20%
95% mean confidence interval for cycles value: -15.94 5.75
95% mean confidence interval for cycles %-change: -0.35% 0.07%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 1

Haswell
total instructions in shared programs: 9034106 -> 9033948 (<.01%)
instructions in affected programs: 24096 -> 23938 (-0.66%)
helped: 38
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4
helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64%
95% mean confidence interval for instructions value: -4.71 -3.60
95% mean confidence interval for instructions %-change: -0.84% -0.58%
Instructions are helped.

total cycles in shared programs: 84631628 -> 84631402 (<.01%)
cycles in affected programs: 148674 -> 148448 (-0.15%)
helped: 14
HURT: 14
helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12
helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21%
HURT stats (abs)   min: 1 max: 10 x̄: 6.00 x̃: 5
HURT stats (rel)   min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11%
95% mean confidence interval for cycles value: -19.42 3.28
95% mean confidence interval for cycles %-change: -0.59% 0.05%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 10015456 -> 10015293 (<.01%)
instructions in affected programs: 27701 -> 27538 (-0.59%)
helped: 38
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4
helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52%
95% mean confidence interval for instructions value: -4.87 -3.71
95% mean confidence interval for instructions %-change: -0.82% -0.51%
Instructions are helped.

total cycles in shared programs: 87524771 -> 87524569 (<.01%)
cycles in affected programs: 112324 -> 112122 (-0.18%)
helped: 6
HURT: 12
helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20
helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26%
HURT stats (abs)   min: 1 max: 16 x̄: 5.50 x̃: 5
HURT stats (rel)   min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: -29.14 6.69
95% mean confidence interval for cycles %-change: -0.93% 0.08%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 2

Sandy Bridge
total instructions in shared programs: 10545655 -> 10545465 (<.01%)
instructions in affected programs: 37198 -> 37008 (-0.51%)
helped: 42
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4
helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49%
95% mean confidence interval for instructions value: -5.14 -3.91
95% mean confidence interval for instructions %-change: -0.68% -0.47%
Instructions are helped.

total cycles in shared programs: 146113059 -> 146112427 (<.01%)
cycles in affected programs: 423514 -> 422882 (-0.15%)
helped: 32
HURT: 10
helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12
helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11%
HURT stats (abs)   min: 12 max: 19 x̄: 14.70 x̃: 14
HURT stats (rel)   min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14%
95% mean confidence interval for cycles value: -26.03 -4.07
95% mean confidence interval for cycles %-change: -0.43% -0.05%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886959 -> 7886925 (<.01%)
instructions in affected programs: 1340 -> 1306 (-2.54%)
helped: 4
HURT: 0
helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8
helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43%
95% mean confidence interval for instructions value: -20.44 3.44
95% mean confidence interval for instructions %-change: -5.78% 0.89%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 178116996 -> 178116888 (<.01%)
cycles in affected programs: 6262 -> 6154 (-1.72%)
helped: 2
HURT: 2
helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61
helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62%
HURT stats (abs)   min: 6 max: 8 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51%
95% mean confidence interval for cycles value: -93.27 39.27
95% mean confidence interval for cycles %-change: -5.38% 2.27%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857887 -> 4857870 (<.01%)
instructions in affected programs: 674 -> 657 (-2.52%)
helped: 2
HURT: 0

total cycles in shared programs: 122180816 -> 122180744 (<.01%)
cycles in affected programs: 3764 -> 3692 (-1.91%)
helped: 1
HURT: 1
helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78
helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34%

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
cfc0d34802 nir: See through an fneg to apply existing optimizations
Doing the same for the existing feq and fne transformations didn't help
anything in shader-db.

shader-db results:

Broadwell and Skylake (Skylake shown)
total instructions in shared programs: 14529463 -> 14526147 (-0.02%)
instructions in affected programs: 402420 -> 399104 (-0.82%)
helped: 2136
HURT: 131
helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1
helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12%
HURT stats (abs)   min: 1 max: 2 x̄: 1.01 x̃: 1
HURT stats (rel)   min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57%
95% mean confidence interval for instructions value: -1.51 -1.41
95% mean confidence interval for instructions %-change: -3.06% -2.78%
Instructions are helped.

total cycles in shared programs: 533146915 -> 533120531 (<.01%)
cycles in affected programs: 10356261 -> 10329877 (-0.25%)
helped: 1933
HURT: 844
helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16
helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88%
HURT stats (abs)   min: 1 max: 423 x̄: 36.17 x̃: 12
HURT stats (rel)   min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59%
95% mean confidence interval for cycles value: -11.78 -7.22
95% mean confidence interval for cycles %-change: -1.98% -1.65%
Cycles are helped.

Haswell
total instructions in shared programs: 9037416 -> 9034106 (-0.04%)
instructions in affected programs: 389831 -> 386521 (-0.85%)
helped: 2184
HURT: 120
helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57%
95% mean confidence interval for instructions value: -1.49 -1.39
95% mean confidence interval for instructions %-change: -2.68% -2.41%
Instructions are helped.

total cycles in shared programs: 84636243 -> 84631628 (<.01%)
cycles in affected programs: 4745058 -> 4740443 (-0.10%)
helped: 1904
HURT: 960
helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18
helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38%
HURT stats (abs)   min: 1 max: 1080 x̄: 55.11 x̃: 14
HURT stats (rel)   min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81%
95% mean confidence interval for cycles value: -4.51 1.29
95% mean confidence interval for cycles %-change: -1.64% -1.25%
Inconclusive result (value mean confidence interval includes 0).

LOST:   1
GAINED: 0

Sandy Bridge and Ivy Bridge (Ivy Bridge shown)
total instructions in shared programs: 10018873 -> 10015456 (-0.03%)
instructions in affected programs: 512820 -> 509403 (-0.67%)
helped: 2268
HURT: 162
helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88%
HURT stats (abs)   min: 1 max: 4 x̄: 1.59 x̃: 1
HURT stats (rel)   min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50%
95% mean confidence interval for instructions value: -1.46 -1.35
95% mean confidence interval for instructions %-change: -2.38% -2.12%
Instructions are helped.

total cycles in shared programs: 87538223 -> 87524771 (-0.02%)
cycles in affected programs: 5435520 -> 5422068 (-0.25%)
helped: 1916
HURT: 946
helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18
helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97%
HURT stats (abs)   min: 1 max: 633 x̄: 45.41 x̃: 11
HURT stats (rel)   min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62%
95% mean confidence interval for cycles value: -7.34 -2.06
95% mean confidence interval for cycles %-change: -1.62% -1.26%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 7888446 -> 7886959 (-0.02%)
instructions in affected programs: 331581 -> 330094 (-0.45%)
helped: 1160
HURT: 97
helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1
helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25%
95% mean confidence interval for instructions value: -1.25 -1.12
95% mean confidence interval for instructions %-change: -0.91% -0.75%
Instructions are helped.

total cycles in shared programs: 178130766 -> 178116996 (<.01%)
cycles in affected programs: 12534564 -> 12520794 (-0.11%)
helped: 1856
HURT: 187
helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4
helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11%
HURT stats (abs)   min: 2 max: 26 x̄: 3.55 x̃: 2
HURT stats (rel)   min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02%
95% mean confidence interval for cycles value: -7.41 -6.07
95% mean confidence interval for cycles %-change: -0.28% -0.22%
Cycles are helped.

GM45
total instructions in shared programs: 4858912 -> 4857887 (-0.02%)
instructions in affected programs: 237565 -> 236540 (-0.43%)
helped: 867
HURT: 57
helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1
helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: -1.18 -1.04
95% mean confidence interval for instructions %-change: -0.88% -0.71%
Instructions are helped.

total cycles in shared programs: 122189118 -> 122180816 (<.01%)
cycles in affected programs: 8776418 -> 8768116 (-0.09%)
helped: 1213
HURT: 166
helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4
helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11%
HURT stats (abs)   min: 2 max: 26 x̄: 3.35 x̃: 2
HURT stats (rel)   min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02%
95% mean confidence interval for cycles value: -6.78 -5.26
95% mean confidence interval for cycles %-change: -0.24% -0.18%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Timothy Arceri
283e25102b st/glsl_to_nir: disable io lowering and array splitting of fs inputs
We need this to be able to support the interpolateAt builtins in a
sane way. It also leads to the generation of more optimal code.

The lowering and splitting is made conditional on lower_all_io_to_temps
because vc4 and freedreno both expect these passes to be enabled and
niether support glsl 400 so don't need to deal with the interpolateAt
builtins.

We leave the other stages for now as to avoid regressions. Ideally we
could remove the stage checks and just set the nir options correctly
for each stage. However all gallium drivers currently just use return
the same nir compiler options for all stages, and it's probably more
trouble than its worth to change this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
9a2e085680 nir: add lower_all_io_to_temps flag
This will be used for freedreno and vc4 which require all inputs
and outputs to be copied to temps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
3218756262 nir/st_glsl_to_nir: add param to disable splitting of inputs
We need this because we will always copy fs outputs to temps and
split the arrays, but do not want to do either of these with fs
inputs as it is unnessisary and makes handling interpolateAt
builtins difficult.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
93e213f91f st/glsl_to_nir: copy nir compiler options to context
Various nir passes may expect this to be here as does the nir
serialisation pass.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
dd6d6c63a7 radeonsi/nir: add input support for arrays that have not been copied to temps and split
We need this to be able to support the interpolateAt builtins in a
sane way. It also leads to the generation of more optimal code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
d185190222 ac/radeonsi: add lookup_interp_param and load_sample_position to the abi
This will enable the interpolateAt builtins to work on the radeonsi
nir backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
97058168a4 radeonsi/nir: add prim_mask to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
3ff012f142 radeonsi/nir: adjust load_sample_position() to be shared between backends
With this interface change it can be shared between the tgsi and
nir backends.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
3a47b138e3 radeonsi/nir: add si_nir_lookup_interp_param() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
b8808848ce ac/nir_to_llvm: move some interp defines to the header
These will be used in the following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
fea6da9aaa radeonsi/nir: move the interpolation qualifier scanning
We need to collect this when scanning over the instruction rather
than when scanning over the inputs otherwise we might get confliting
values for inputs that are use by the interpolateAt* builtins.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
580f1aa247 radeonsi/nir: add interpolate at intrinsics to scan_instruction()
V2: use the uses_*_opcode_interp_* flags

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Bas Nieuwenhuizen
882eff4d20 radv: Merge raster state with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:02:05 +01:00
Bas Nieuwenhuizen
69364f1c34 radv: Move gs state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:02:01 +01:00
Bas Nieuwenhuizen
e4e060d135 radv: Split out cliprect rule generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:56 +01:00
Bas Nieuwenhuizen
acbaef3005 radv: Merge VGT_GS_MODE computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:52 +01:00
Bas Nieuwenhuizen
4ae6a8b0cd radv: Split out processing the vertex input state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:41 +01:00
Bas Nieuwenhuizen
9062b1c241 radv: Move tessellation state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:38 +01:00
Bas Nieuwenhuizen
4aa1cb4e90 radv: Move blend state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:34 +01:00
Bas Nieuwenhuizen
0f72f0eacb radv: Split out generating VGT_SHADER_STAGES_EN.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:30 +01:00
Bas Nieuwenhuizen
694c34314b radv: Split out the ia_multi_vgt_param precomputation.
Also moved everything in a struct and then return the struct from
the helper function, so it is clear in the caller what part of the
pipeline gets modified.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:26 +01:00
Bas Nieuwenhuizen
0bea0851aa radv: Split out db_shader_control computation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:18 +01:00
Bas Nieuwenhuizen
5dce47ae6d radv: Compute shader_z_format when emitting it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:13 +01:00
Bas Nieuwenhuizen
df2e7ab0db radv: Merge depth stencil state with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:06 +01:00
Bas Nieuwenhuizen
d5a0af84ec radv: Merge ps_input_cntl computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:01 +01:00
Bas Nieuwenhuizen
e2bf18030d radv: Merge vtx_reuse_depth computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:55 +01:00
Bas Nieuwenhuizen
c80747b32c radv: Merge vs state computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:50 +01:00
Bas Nieuwenhuizen
c4191cf944 radv: Merge binning state generation with pm4 emission.
We don't need the pipeline state struct anymore.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:45 +01:00
Bas Nieuwenhuizen
6f1a3f081e radv: Constify some pipeline helpers.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:40 +01:00
Bas Nieuwenhuizen
f0c9ef410a radv: Add PM4 pregeneration for compute pipelines.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:34 +01:00
Bas Nieuwenhuizen
beeab44190 radv: Record a PM4 sequence for graphics pipeline switches.
This gives about 2% performance improvement on dota2 for me.

This is mostly a mechanical copy and replacement, but at bind time
we still do:

1) Some stuff that is only based on num_samples changes.
2) Some command buffer state setting.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:22 +01:00
Bas Nieuwenhuizen
7c366bc152 radv: Determine unneeded dynamic states.
Which avoids setting or emitting them.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:17 +01:00
Andres Rodriguez
0a89784bcc mesa: check for invalid index on UUID glGet queries
This fixes the piglit test:
spec/ext_semaphore/api-errors/usigned-byte-i-v-bad-value

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
566ed727a4 mesa: fix glGet for ext_external_objects parameters
This allows the client to actually query the enums specified in the
ext_external_objects spec.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
0ebd3cc863 mesa: fix error codes for importing memory/semaphore FDs
This fixes the following piglit tests:
spec/ext_semaphore_fd/api-errors/import-semaphore-fd-bad-enum
spec/ext_memory_object_fd/api-errors/import-memory-fd-bad-enum

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
50b06cbc10 radeonsi: fix fence_server_sync() holding up extra work v2
When calling si_fence_server_sync(), the wait operation is associated
with the next kernel submission. Therefore, any unflushed work
submitted previous to fence_server_sync() will also be affected by
the wait.

To avoid adding the dependency to the unflushed work, we flush before
emitting the fence dependency.

v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
e0f16ee666 radeonsi: implement semaphore_server_signal v2
Syncobj based waits or signals only happen at submission boundaries. In
order to guarantee that the requested signal event will occur when the
state tracker requested it, we must issue a flush.

v2: s/fence/semaphore for pipe objects

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
5b07b06d6b radeonsi: add support for importing PIPE_FD_TYPE_SYNCOBJ semaphores
Hook up importing semaphores of type PIPE_FD_TYPE_SYNCOBJ

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
cc9762d74d winsys/amdgpu: add support for syncobj signaling v3
Add the ability to signal a syncobj when a cs completes execution.

v2: corresponding changes for gallium fence->semaphore rename
v3: s/semaphore/fence for pipe objects

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
29b9bd0539 mesa/st: add support for semaphore object signal/wait v4
Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject

v2:
  - corresponding changes for gallium fence->semaphore rename
  - flushing moved to mesa/main

v3: s/semaphore/fence for pipe objects
v4: add bitmap flushing

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
89b52891fd mesa: add support for semaphore object signal/wait v3
Memory synchronization is left for a future patch.

v2: flush vertices/bitmaps moved to mesa/main
v3: removed spaces before/after braces

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
260f7fcc46 mesa: add semaphore parameter stub v2
EXT_semaphore and EXT_semaphore_fd define no pnames. Therefore there
isn't much to do besides determining the correct error code.

v2: removed useless return

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
382067f065 mesa/st: add support for semaphore object create/import/delete v3
Add basic semaphore object operations.

v2: s/semaphore/fence for pipe objects
v3: added missing license headers

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
67d5d08682 mesa: add support for semaphore object creation/import/delete v3
Used by EXT_semmaphore and EXT_semaphore_fd

v2: Removed unnecessary dummy callback initialization
v3: Fixed attempting to free the DummySemaphoreObject

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
8e635f7d65 mesa/st: introduce EXT_semaphore and EXT_semaphore_fd v2
Guarded by PIPE_CAP_SEMAPHORE_SIGNAL

v2: corresponding changes for PIPE_CAP_SEMAPHORE_SIGNAL rename

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
fde1afc495 u_threaded_context: add support for fence_server_signal v2
v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
d34c2cf3e6 gallium: add fence_server_signal() v2
Calling this function will emit a fence signal operation into the
GPU's command stream.

v2: documentation typos

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
458f89be78 gallium: introduce PIPE_FD_TYPE_SYNCOBJ
Denotes that a fd is backed by a synobj. For example, radv shared
semaphores.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
2ab405d254 gallium: introduce PIPE_CAP_FENCE_SIGNAL v2
Protects semaphore signaling functionality required by GL_EXT_semaphore.

v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
585daa2378 gallium: add type parameter to create_fence_fd
An fd can potentially have different types of objects backing it.
Specifying the type helps us make sure we treat the FD correctly.

This is in preparation to allow importing syncobj fence FDs in addition
to native sync FDs.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Dave Airlie
16dd0eb517 ac/llvm: bump the number of results to 8.
This function can get access for a 64-bit dvec4, which means we
have to load 8 components.

This fixes:
R600_DEBUG=nir ./bin/shader_runner generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-abs-dvec4.shader_test -auto

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 05:37:16 +10:00
Dave Airlie
8d633f067b r600/sb: insert the else clause when we might depart from a loop
If there is a break inside the else clause and this means we
are breaking from a loop, the loop finalise will want to insert
the LOOP_BREAK/CONTINUE instruction, however if we don't emit
the else there is no where for these to end up, so they will end
up in the wrong place.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101442
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 04:47:29 +10:00
Brian Paul
1a9aa69ae8 mesa: remove invalid assertion in _mesa_enable_vertex_array_attrib()
The meta module passes some 0-based attrib values.  Should fix Piglit
regressions reported by Mark Janes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104863
Fixes: 4ab7e03e1f ("mesa: add an assertion in
_mesa_enable_vertex_array_attrib()")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-30 11:02:43 -07:00
Brian Paul
efa0993eaf mesa: use gl_vert_attrib enum type in more places
Slightly better readbility.

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 11:02:43 -07:00
Brian Paul
f892e332a8 mesa: rename some 'client' array functions
A long time ago gl_vertex_array was gl_client_array.  Update some function
names to be consistent.

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
d2d9d090e5 mesa: s/src/attribs/ in _mesa_update_client_array()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
e863541e43 mesa: check/assert array index in _mesa_bind_vertex_buffer()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
fcee2cc711 mesa: trivial comment typo fix in arrayobj.c
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
4ab7e03e1f mesa: add an assertion in _mesa_enable_vertex_array_attrib()
Some of the enable/disable vertex array functions take a zero-based
generic index, while others take a VERT_ATTRIB_GENERIC0-based value.
Add an assertion to clarify that in one place.

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
7f12791cc6 mesa: rename some vars in client_state()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
06621e8a0d mesa: Care for differences in fog mode only if fog is consumed.
In creating fixed function vertex shader hash keys do only
care for producing the varying output if fog is enabled and the
varing is consumed in the fragment stage.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
6395a0ecf2 mesa: Reduce ffvertex_prog state_key to 36 bytes.
Using lower alignment restrictions for the state key fields finally
yields to a smaller hashing state key.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
b4216b588e mesa: Remove unused ffvertex_prog texunit_really_enabled.
Remove set but not read field from the state key used for hashing
fixed function vertex shaders.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
1169791c18 mesa: Remove unused bit in ffvertex_prog state_key.
Remove set but not read field from the state key used for hashing
fixed function vertex shaders.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
6726d16098 mesa: texgen_enabled is only 1 bit.
For the state key for hashing fixed function vertex shaders, the
texgen_enabled field requires only a single bit.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
d6b0ad51ec mesa: Encode fog modes in a 2 bit field.
For the state key for hashing fixed function
vertex shaders, encode the different fog modes, including
if fog is generally enabled or not, into a 2 bit field.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
63e845d3cc mesa: Move seperate_specular into the lighting section.
For the state key for hashing fixed function
vertex shaders, the information is only evaluated
if lighting is generally switched on.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Mathias Fröhlich
11e665d434 mesa: Get the point size array state from varying_vp_inputs.
For the state key for hashing fixed function
vertex shaders, The varying_vp_inputs bitmask already
contains the point size array enabled information.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Mathias Fröhlich
bc5c54cadf mesa: Remove unused gl_fog_attrib::_Scale.
The patch removes a variable that is only written to.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Iago Toral Quiroga
99b57daf4a anv/pipeline: lower constant initializers on output variables earlier
If a shader only writes to an output via a constant initializer we
need to lower it before we call nir_remove_dead_variables so that
this pass sees the stores from the initializer and doesn't kill the
output.

Fixes test failures in new work-in-progress CTS tests:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_vert
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_frag

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-30 08:10:29 +01:00
Tapani Pälli
6316c2ecbd i965: move disk cache from brw_context to intel_screen
Now every context refers to same disk_cache instance in screen.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-30 08:42:51 +02:00
Elie Tournier
6f8518e068 mesa: Correctly print glTexImage dimensions
texture_format_error_check_gles() displays error like "glTexImage%dD".
This patch just replace the %d by the correct dimension.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-30 07:48:56 +02:00
Brian Paul
d5f42f96e1 mesa: shrink size of gl_array_attributes (v2)
Inspired by Marek's earlier patch, but even smaller.  Sort fields from
largest to smallest.  Use bitfields for more fields (sometimes with an
extra bit for MSVC).  Reduce Stride field to GLshort.

Note that some fields cannot be bitfields because they're accessed via
pointers (such as for glEnableClientState(GL_VERTEX_ARRAY) to set the
Enabled field).

Reduces size from 48 to 24 bytes.
Also reduces size of gl_vertex_array_object from 3632 to 2864 bytes.

And add some assertions in init_array().

v2: use s/GLuint/unsigned/, improve commit comments.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:16:50 -07:00
Brian Paul
79cafa0df3 mesa: shrink gl_vertex_array
Inspired by Marek's earlier patch, but goes a little further.
Sort fields from largest to smallest.  Use bitfields.

Reduced from 48 bytes to 32.  Also reduces size of gl_vertex_array_object
from 4144 to 3632

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:15:52 -07:00
Marek Olšák
f96a69f916 mesa: replace GLenum with GLenum16 in common structures (v4)
v2: - fix glGet*
    - also use GLenum16 for DrawBuffers
v3: - rebase to top of tree (BrianP) and incorporate Ian's suggestions
v4: - fix a GLenum16 bug in VBO/save code, add some STATIC_ASSERT()s

gl_context = 152432 -> 136840 bytes
vbo_context = 22096 -> 20608 bytes

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-29 21:15:52 -07:00
Brian Paul
94843e6056 mesa: fix incorrect size/error test in _mesa_GetUnsignedBytevEXT()
get_value_size() returns -1 for an error.  The similar check in
_mesa_GetUnsignedBytei_vEXT() is correct.

Found by chance.  There are apparently no Piglit tests which exercise
glGetUnsignedBytei_vEXT() or glGetUnsignedBytevEXT().

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:15:52 -07:00
Neha Bhende
e4ca1d6456 svga: Check rasterization state object before checking poly_stipple_enable
Sometimes rasterization state object could be empty. This is causing
segfault on hw8,9,10 for some traces.

This patch fixes enemy_territory_quake_wars_high,
enemy_territory_quake_wars_low, etqw-demo, lightsmark2008, quake1
glretrace crashes on hw 8,9,10.

Tested with mtt-glretrace and mtt-piglit.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Neha Bhende
d4a5e14fae svga: Adjust alpha for S3TC_DXT1_EXT RGB formats
According to spec, S3TC_DXT1_EXT RGB formats are supposed to be
opaque. Correspoding svga formats are not handling it so explicitly
setting it to 1.0.
This fixes piglit test spec@ext_texture_compression_s3tc@s3tc-targeted
Note: This test is testcase for freedesktop bug 100925

Tested with mtt-piglit and mtt-glretrace on 8,9,10,11 and 15

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Gert Wollny
6a7d1ca2c4 mesa/st/glsl_to_tgsi: Mark first write as unconditional when appropriate
In the register lifetime estimation if the first write is unconditional or
conditional but not within a loop then this is an unconditional dominant
write in the sense of register life time estimation.
Add a test case and record the write accordingly.

Fixes: 807e2539e5 ("mesa/st/glsl_to_tgsi: Add
tracking of ifelse writes in register merging")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104803
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Roland Scheidegger
3c7aa242f5 mesa: skip validation of legality of size/type queries for format queries
The size/type query is always legal (if we made it that far).
Removing this causes a difference for GL_TEXTURE_BUFFER - the reason is that
these parameters are valid only with GetTexLevelParameter() if gl 3.1 is
supported, but not if only ARB_texture_buffer_object is supported.
However, while the spec says that these queries return "the same information
as querying GetTexLevelParameter" I believe we're not expected to return just
zeros here. By definition, these pnames are always valid (unlike for the
GetTexLevelParameter() function which would return an error without GL 3.1).
The spec is a bit inconsistent there and open to interpretation - while
mentioning the "same information as querying GetTexLevelParameter" is
returned, it also mentions that 0 is returned for size/type if the
target/format is not supported - implying correct results to be returned
if it is supported, regardless that GetTexLevelParameter would return
an error. (Also, the bit about this returning the same as
GetTexLevelParameter also includes querying stencil type, which isn't
even possible with GetTexLevelParameter.)

This breaks some piglit arb_internalformat_query2 tests (which I believe to
be wrong).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com
2018-01-30 01:28:47 +01:00
Roland Scheidegger
21fe02d1d3 mesa: restrict formats being supported by target type for formatquery
The code just considered all formats as being supported if they were either
a valid fbo or texture format.
This was quite awkward since then the query would return "supported" for
e.g. GL_RGB9E5 or compressed formats and target RENDERBUFFER (albeit the driver
could still refuse it in theory). However, when then querying for instance the
internalformat sizes, it would just return 0 (due to the checks being more
strict there).
It was also a problem for texture buffer targets, which have a more restricted
list of formats which are allowed (and again, it would return supported but
then querying sizes would return 0).
So only take validation of formats into account which make sense for a given
target.
Can also toss out some special checks for rgb9e5 later, since we'd never get
there if it wasn't supported in the first place.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-30 01:28:47 +01:00
Roland Scheidegger
272e7e1bd5 mesa: (trivial) add TODO comment for default results for internal queries 2018-01-30 01:28:47 +01:00
Roland Scheidegger
09dc4f9012 mesa: remove misleading gles checks for formatquery
Testing for gles there is just confusing - this is about target being
supported, if it was valid at all was already determined earlier
(in _legal_parameters). It didn't make sense at all in any case, since
it would only have said false there for gles for 2d but not 2d arrays etc.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-30 01:28:47 +01:00
Rafael Antognolli
e7ecc5e160 i965: Emit PIPE_CONTROL with ISP bit on older platforms.
Emit it on all platforms since gen7.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-29 14:52:07 -08:00
Rafael Antognolli
fa21ddf7b1 anv/cmd_buffer: Emit PIPE_CONTROL with ISP bit on older platforms.
Emit it on all platforms since gen7.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-29 14:52:07 -08:00
Timothy Arceri
2b4afaef1c st/glsl_to_nir: remove dead io after conversion to nir
This fixes an assert in nir_lower_var_copies() for some bioshock
shaders where an unused clipdistance array has no size.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:14:36 +11:00
Timothy Arceri
327c1a7fb3 radeonsi/nir: add support vs double inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
44067d6f0d radeonsi: pass input_idx to declare_nir_input_vs()
This make it consistent with declare_nir_input_fs() and will allow
us to support doubles.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
cf75ee3ab1 radeonsi: add bitcast_inputs() helper
Will be used in a following patch to help support doubles.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
96cfd4bd7e radeonsi/nir: fix num_inputs for doubles in vs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
09cd484d61 nir: partially revert c2acf97fcc
c2acf97fcc changed the use of double_inputs_read to be
inconsitent with its previous meaning. Here we re-enable the
gather info code that was removed as the modified code from
c2acf97fcc now uses the double_inputs member rather than
double_inputs_read.

This change allows us to use double_inputs_read with gallium
drivers without impacting double_inputs which is used by i965.

We also make use of the compiler option vs_inputs_dual_locations
to allow for the difference in behaviour between drivers that handle
vs inputs as taking up two locations for doubles, versus those that
treat them as taking a single location.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
5b8de4bdff nir: add vs_inputs_dual_locations compiler option
Allows nir drivers to either use a single or dual locations for
vs double inputs.

i965 uses dual locations for both OpenGL and Vulkan drivers, for
now gallium OpenGL drivers only use a single location.

The following patch will also make use of this option when
calling nir_shader_gather_info().

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
f63e05ae9e compiler: tidy up double_inputs_read uses
First we move double_inputs_read into a vs struct in the union,
double_inputs_read is only used for vs inputs so this will
save space and also allows us to add a new double_inputs field.

We add the new field because c2acf97fcc changed the behaviour
of double_inputs_read, and while it's no longer used to track
actual reads in i965 we do still want to track this for gallium
drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Dave Airlie
f6cc15dccd radv/gfx9: fix block compression texture views. (v2)
This ports a fix from amdvlk, to fix the sizing for mip levels
when block compressed images are viewed using uncompressed views.

My original fix didn't power the clamping, but it looks like
the clamping is required to stop the sizing going too large.

Fixes:
dEQP-VK.image.texel_view_compatible.graphic.extended*bc*
Doesn't crash DOW3 anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-30 07:39:13 +10:00
Bas Nieuwenhuizen
0347a83bbf radv: Signal fence correctly after sparse binding.
It did not signal syncobjs in the fence, and also signalled too early
if there was work on the queue already, as we have to wait till that
work is done.

Fixes: d27aaae4d2 "radv: Add external fence support."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-29 17:22:58 +01:00
Brian Paul
0d044f7d61 mesa/vbo: replace vbo_draw_method() with _mesa_set_drawing_arrays()
The arrays specified by ctx->Array._DrawArrays are used for all
vertex drawing via vbo_context::draw_prims().  Different arrays are
used for immediate mode, vertex arrays, display lists, etc.  Changing
from one to another requires updating derived/driver array state.

Before, we indirectly specifid the arrays with the gl_draw_method values.
Now we just directly specify the arrays instead.  This is simpler and
will allow a subsequent display list optimization.

In the future, it might make sense to get rid of ctx->Array._DrawArrays
entirely and just pass the arrays as another parameter to
vbo_context::draw_prims().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
d9894ede02 vbo: s/[0]/[VERT_ATTRIB_POS]/ in recalculate_input_bindings()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
48a6ab472a vbo: add new VBO_ATTRIBS_ masks to vbo_attrib.h
These will be used in a later patch.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
41cd3ee5a2 vbo: s/VBO_ATTRIB_INDEX/VBO_ATTRIB_COLOR_INDEX/
To match the VERT_ATTRIB_COLOR_INDEX name.
Give a name to the previously anonymous enum of VBO_ATTRIB_x values.
Update the comment on the enum.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
425da3bbfc vbo: minor clean-ups in vbo_exec.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
d631ea3a23 vbo: s/_API_NOOP_H/VBO_NOOP_H/ in vbo_noop.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
094a80db4c vbo: whitespace/formatting fixes in vbo_exec.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
b080fc6199 vbo: move, rename vp_mode enums, get_program_mode() function
Instead of NONE/ARB use FF/SHADER.  Move the enum declaration to
vbo_private.h where it's used.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
35e0ff5bd5 vbo: s/cl/array/ in vbo_context.c
I think 'cl' used to mean client array.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Tapani Pälli
d0343bef66 nir: mark unused space in packed_tex_data
This change cleans following scary warnings in valgrind output
when disk cache is being written:

   ==6532== Uninitialised byte(s) found during client check request
   ==6532==    at 0x14423FAD: blob_write_bytes (blob.c:152)
   ==6532==    by 0x144240FB: blob_write_uint32 (blob.c:194)
   ==6532==    by 0x144001A5: write_tex (nir_serialize.c:613)

and later (loads of):

   ==6532== Use of uninitialised value of size 8
   ==6532==    at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11)
   ==6532==    by 0x13F65014: util_hash_crc32 (crc32.c:127)
   ==6532==    by 0x13F5DABA: cache_put (disk_cache.c:947)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 08:11:22 +02:00
Tapani Pälli
b99c88037b i965: fix disk_cache leak when destroying context
==2780== 1,024 bytes in 1 blocks are possibly lost in loss record 180 of 205
   ==2780==    at 0x4C31A1E: calloc (vg_replace_malloc.c:711)
   ==2780==    by 0x13F6467E: util_queue_init (u_queue.c:309)
   ==2780==    by 0x13F5C9F6: disk_cache_create (disk_cache.c:369)
   ==2780==    by 0x13F05406: brw_disk_cache_init (brw_disk_cache.c:428)
   ==2780==    by 0x13F01E78: brwCreateContext (brw_context.c:1068)

Fixes: 1a61a8b9a7 ("i965: Initialize disk shader cache if MESA_GLSL_CACHE_DISABLE is false")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-29 08:11:14 +02:00
Tapani Pälli
28db950b51 i965: fix prog_data leak in brw_disk_cache
==25481== 576 bytes in 1 blocks are definitely lost in loss record 179 of 208
   ==25481==    at 0x4C2FB6B: malloc (vg_replace_malloc.c:299)
   ==25481==    by 0x1404E2CC: ralloc_size (ralloc.c:121)
   ==25481==    by 0x14119F82: read_and_upload (brw_disk_cache.c:176)
   ==25481==    by 0x1411A5C9: brw_disk_cache_upload_program (brw_disk_cache.c:271)
   ==25481==    by 0x1412FCA4: brw_upload_wm_prog (brw_wm.c:597)

Fixes: 516d50db31 ("i965: add initial implementation of on disk shader cache")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 08:11:03 +02:00
Timothy Arceri
9afc38c799 ac: fix indentation
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Timothy Arceri
03086f86ae ac: remove unused nir2llvmtype()
The last use of this was removed in the previous patch.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Timothy Arceri
fa29a9625e ac: fix gs load inputs type
This fixes the scenario where the input is a struct. With this
the Unreal engines Elemental demo now works on radeonsi.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Kai Wasserbäch
0aba967328 ac/nir: call glsl_get_sampler_dim() only once where possible
Changes since v1:
  * Rebased on top of e68150de26 and
    82adf53308.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-29 10:47:31 +11:00
Dave Airlie
2af66ba7e7 docs/features: add r600 ARB_query_buffer_object support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:34 +10:00
Dave Airlie
1c9ea24a19 r600: add ARB_query_buffer_object support
This uses a different shader than radeonsi, as we can't address non-256
aligned ssbos, which the radeonsi code does. This passes some extra
offsets into the shader.

It also contains a set of u64 instruction implementation that may
or may not be complete (at least the u64div is definitely not something
that works outside this use-case). If r600 grows 64-bit integers,
it will use the GLSL lowering for divmod.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:28 +10:00
Dave Airlie
a7ec366e50 r600/shader: refactor mul hi/lo instruction emission
This just makes it a bit simpler for cayman vs eg

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:17 +10:00
Dave Airlie
e0e23ea69c r600/eg: construct proper rat mask for image/buffers.
If the images/buffer bindings had a gap, this produced the wrong values,
this should fix that to generate the correct rat mask for mixes of
images/buffers/cbs.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:41:58 +10:00
Jon Turney
4a0bab1d7f meson: libdrm shouldn't appear in Requires.private: if it wasn't found
Otherwise, using pkg-config to retrieve flags will fail, e.g.

$ pkg-config gl --cflags
Package libdrm was not found in the pkg-config search path.
Perhaps you should add the directory containing `libdrm.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libdrm', required by 'gl', not found

Fixes: 3218056e0e ("meson: Build i965 and dri stack")

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
2018-01-27 18:13:18 +00:00
Eric Anholt
e5a81ac704 broadcom/vc5: Don't forget to get the BO offset when opening a dmabuf.
Fixes black display in DRI due to storing to 0x00000000.
2018-01-27 19:40:14 +11:00
Eric Anholt
314e9ee6c4 broadcom/vc5: Enable the driver on V3D 4.2.
The changes in 4.2 haven't impacted any of our CL or state struct entries
that I can see, so I haven't enabled custom compile for doing 4.2 instead
of 4.1.
2018-01-27 19:39:56 +11:00
Eric Anholt
71c7e9bea1 broadcom/vc5: Enable CLIF dumping of V3D 4.2. 2018-01-27 19:04:21 +11:00
Eric Anholt
91f899cbc1 broadcom/vc5: Update the compiler for V3D 4.2. 2018-01-27 19:04:21 +11:00
Eric Anholt
f2e41daac5 broadcom/vc5: Update QPU instruction pack/unpack for v4.2.
After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because
it's used for other barriers in compute.
2018-01-27 19:03:55 +11:00
Eric Anholt
96d3e8f134 broadcom/vc5: Add XML for V3D 4.2. 2018-01-27 18:57:58 +11:00
Eric Anholt
b026063b16 broadcom/vc5: Fix a race between XML codegen build and CLIF build. 2018-01-27 18:57:58 +11:00
Eric Anholt
de60ea4432 Android: Attempt to fix broadcom build after vc5 changes. 2018-01-27 18:03:58 +11:00
Marek Olšák
b633999a4e ac: rename and move si_const_array into common code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
e17eb8800f ac: move address space definitions to common code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
0d62370bbb ac: don't use byval LLVM qualifier in shaders
shader-db doesn't show any regression and 32-bit pointers with byval
are declared as VGPRs for some reason.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
0e40c6a7b7 gallium/radeon: set number of pb_cache buckets = number of heaps
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
175549e0e9 pb_cache: let drivers choose the number of buckets
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
ecfd521502 pb_cache: call os_time_get outside of the loop
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
e553cb5a68 gallium/radeon: simplify radeon_flags_from_heap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Timothy Arceri
041b18cf23 st/shader_cache: restore num_tgsi_tokens when loading from cache
Without this we will fail to correctly serialise programs when
using glGetProgramBinary() if the program was retrieved from
the disk cache rather than freshly compiled.

Fixes: c69b0dd681 "st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program"

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104762
2018-01-27 10:06:16 +11:00
Marek Olšák
17423c993d winsys/amdgpu: fix assertion failure with UVD and VCE rings
Cc: 18.0 <mesa-stable@lists.freedesktop.org>
2018-01-26 23:12:11 +01:00
Brian Paul
ac0e9e343c mesa: remove MESA_FUNCTION
Just use __func__ in the two macros where it was used.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
bacf72a18d mesa: change gl_link_status enums to uppercase
follow the convention of other enums.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
aff5d9c256 mesa: change gl_compile_status enums to uppercase
To follow the convention of other enums.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
d9832f1fc4 mesa: minor comment reformatting, whitespace fixes in mtypes.h
Trivial.
2018-01-26 13:52:42 -07:00
Rafael Antognolli
131e871385 i965/gen10: Use CS Stall instead of WriteImmediate.
Fixes: ca19ee33d7
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 12:02:34 -08:00
Rafael Antognolli
20578f81a6 anv/gen10: Emit CS stall and mark push constants dirty.
I got reviews and fixed the patches locally, but ended up merging the
ones that I sent originally to the list. This patch fixes those
mistakes.

Fixes: 78c125af39
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 11:59:17 -08:00
Rafael Antognolli
bcfd78e448 i965/gen10: Re-enable push constants.
The GPU hang caused by push constants is apparently fixed, so let's
enable them again.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:44 -08:00
Rafael Antognolli
78c125af39 anv/gen10: Ignore push constant packets during context restore.
Similar to the GL driver, ignore 3DSTATE_CONSTANT_* packets when doing a
context restore.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:40 -08:00
Rafael Antognolli
ca19ee33d7 i965/gen10: Ignore push constant packets during context restore.
These packets were causing GPU hangs when the context was restored,
possibly because they were pointing to BO's that were already
unreferenced. So we tell the hardware to ignore such packets after the
batch buffer ends, since we know those BO's are not around anymore.

This change fixes GPU hangs on CNL. The (partial) solution to this
problem so far was to entirely disable push constants on this platform.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:35 -08:00
Brian Paul
acaec6cdd9 mesa: silence MinGW 'may be unused uninitialized' warning in get.c
The warning happens on line 2114 for the memcpy(data, p, size) call.
I'm not sure why that generates the warning but not the earlier use
of p in the code.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 10:44:05 -07:00
Eleni Maria Stea
8096b558a7 mesa: Fix function pointers initialization in status tracker
We assigned the function that gets the device uuid to the GetDriverUuid
function pointer and the function that gets the driver uuid to the
GetDeviceUuid function pointer inside the state tracker. Exchanged the
pointers.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-26 08:17:55 -07:00
Iago Toral Quiroga
d3ce493b34 anv/pipeline: remove the pipeline layout field from anv_pipeline
It no longer has any users.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:47 +01:00
Iago Toral Quiroga
75a4802060 anv/cmd_buffer: add the pipeline layout to the pipeline state
We need to access the pipeline layout to compute correct dynamic
offsets for dyamic UBO/SSBO descriptors when we emit draw commands.
Instead of taking it from the pipeline object, store the layout
in the command buffer pipeline state.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:47 +01:00
Iago Toral Quiroga
e1a49f974b anv/pipeline: don't take the layout from the pipeline to compile shaders
The Vulkan spec states that VkPipelineLayout objects must not be
destroyed while any command buffer that uses them is in the recording
state, but it permits them to be destroyed otherwise. This means that
applications are allowed to free pipeline layouts after command recording
is finished even if there are pipeline objects that still exist and were
created with these layouts.

There are two solutions to this, one is to use reference counting on
pipeline layout objects. The other is to avoid holding references to
pipeline layouts where they are not really needed.

This patch takes a step towards the second option by making the
pipeline shader compile code take pipeline layout from the
VkGraphicsPipelineCreateInfo provided rather than the pipeline
object.

A follow-up patch will remove any remaining uses of the layout field
so we can remove it from the pipeline object and avoid the need
for reference counting.

v2: Use ANV_FROM_HANDLE, remove unnecessary braces (Jason)

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:46 +01:00
Iago Toral Quiroga
14f6275c92 anv/descriptor_set: add reference counting for descriptor set layouts
The spec states that descriptor set layouts can be destroyed almost
at any time:

   "VkDescriptorSetLayout objects may be accessed by commands that
    operate on descriptor sets allocated using that layout, and those
    descriptor sets must not be updated with vkUpdateDescriptorSets
    after the descriptor set layout has been destroyed. Otherwise,
    descriptor set layouts can be destroyed any time they are not in
    use by an API command."

v2: allocate off the device allocator with DEVICE scope (Jason)

Fixes the following work-in-progress CTS tests:
dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.graphics
dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.compute

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:46 +01:00
Samuel Pitoiset
e28233a527 ac/nir: set amdgpu.uniform and invariant.load for SSBOs
For descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
49b0a140a7 ac/nir: set amdgpu.uniform and invariant.load for UBOs
UBOs are constants buffers.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 41c36c45 ("amd/common: use ac_build_buffer_load() for emitting UBO loads")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
b453f38a47 ac/nir: set the noalias attribute on input pointers
This attribute is similar to the definition of restrict in
C99 and it might help LLVM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
310d17fcf1 ac: only load used channels when sampling buffer views
This allows to reduce the number of dwords that are loaded
with buffer_load_format_xyzw. For example, when the only used
channel is 1, the driver will emit buffer_load_format_x instead.

Shader stats for DOW3 (with some local hacky scripts for SPIRV):

143 shaders in 143 tests
Totals:
SGPRS: 5344 -> 5352 (0.15 %)
VGPRS: 3476 -> 3452 (-0.69 %)
Spilled SGPRs: 30 -> 29 (-3.33 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 269860 -> 269808 (-0.02 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1267 -> 1272 (0.39 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
51e14bc3c0 ac: pass the number of channels to ac_build_buffer_load_format()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
d7c93b558a ac: add ac_build_buffer_load_common() helper
For both versions of llvm.amdgcn.buffer.load.{format}.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
6d07e443ba radv: fix RADV_DEBUG=syncshaders on GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
5391de1262 radv: fix a GPU hang with RADV_DEBUG=syncshaders
The GPU hangs when the driver forces a PS_PARTIAL_FLUSH after
a dispatch call (and vice versa for graphics). Something has
changed in the kernel driver because it used to work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
b358e0e67f ac/shader: scan if fragment shaders write memory
It's better to do that in ac_shader_info.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
b9e2f78d6e ac/nir: only canonicalize 32-bit float min/max outputs on pre-GFX9
According to LLVM, only pre-GFX9 targets do not flush denorms
for fmin/fmax.

All dEQP-VK.glsl.builtin.precision.* still pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Jason Ekstrand
c8949e2498 anv/pipeline: Don't look at blend state unless we have an attachment
Without this, we may end up dereferencing blend before we check for
binding->index != UINT32_MAX.  However, Vulkan allows the blend state to
be NULL so long as you don't have any color attachments.  This fixes a
segfault when running The Talos Principal.

Fixes: 12f4e00b69
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-26 01:44:45 -08:00
Maxin B. John
8116b9170b anv_icd.py: improve reproducible builds
Sort the output to ensure build reproducibility

Signed-off-by: Maxin B. John <maxin.john@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 0ab04ba979 ("anv: Use python to generate ICD json files")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 01:37:45 -08:00
Ian Romanick
c7deeb71a8 nouveau: Remove no-op nvgl_logicop_func function
The values that this function returned were always the values passed
in.  The only thing that happened was either an assertion or undefined
results when an unknown value was passed in.  This doesn't seem that
useful.  Most of nouveau_gldefs.h could be removed in this manner.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-01-26 11:21:46 +08:00
Ian Romanick
f5b9c2a6e3 i915: Silence unused parameter warnings
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_alloc_window_storage’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:290:48: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 intel_alloc_window_storage(struct gl_context * ctx, struct gl_renderbuffer *rb,
                                                ^~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_nop_alloc_storage’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:303:74: warning: unused parameter ‘rb’ [-Wunused-parameter]
 intel_nop_alloc_storage(struct gl_context * ctx, struct gl_renderbuffer *rb,
                                                                          ^~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:32: warning: unused parameter ‘internalFormat’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                ^~~~~~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:55: warning: unused parameter ‘width’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                                       ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:69: warning: unused parameter ‘height’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                                                     ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_bind_framebuffer’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:47: warning: unused parameter ‘fb’ [-Wunused-parameter]
                        struct gl_framebuffer *fb, struct gl_framebuffer *fbread)
                                               ^~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:74: warning: unused parameter ‘fbread’ [-Wunused-parameter]
                        struct gl_framebuffer *fb, struct gl_framebuffer *fbread)
                                                                          ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_renderbuffer_update_wrapper’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:422:57: warning: unused parameter ‘intel’ [-Wunused-parameter]
 intel_renderbuffer_update_wrapper(struct intel_context *intel,
                                                         ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_blit_framebuffer_with_blitter’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:644:61: warning: unused parameter ‘filter’ [-Wunused-parameter]
                                     GLbitfield mask, GLenum filter)
                                                             ^~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
39f875a6b7 i915: Make intelEmitCopyBlit static
And rename to emit_copy_blit.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
9eed6bea6b i965: Make intelEmitCopyBlit static
And rename to emit_copy_blit.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
4e9e964de6 i915: Use enum color_logic_ops for blits
v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
21be331401 i965: Use enum color_logic_ops for blits
v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
0aaa27f291 mesa: Pass the translated color logic op dd_function_table::LogicOpcode
And delete the resulting dead code.  This has only been compile-tested.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
cf0b26ec12 st/mesa: Use the translated color logic op from the context
And delete the resulting dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
0c69db895f i965: Use the translated color logic op from the context
And delete the resulting dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
9c1f010f34 mesa: Also track a remapped version of the color logic op
With the exception of NVIDIA hardware, these are is the values that all
hardware and Gallium want.  The remapping is currently implemented in at
least 6 places.  This starts the process of consolidating to a single
place.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.  Added some comments about the
selection of bit patterns for gl_logicop_mode and the GLenums.
Suggested by Nicolai.  Folded the GLenum_to_color_logicop macro into its
only users.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Bas Nieuwenhuizen
5a3404d443 radeonsi: Export signalled sync file instead of -1.
-1 is considered an error for EGL_ANDROID_native_fence_sync, so
we need to actually create a sync file.

Fixes: f536f45250 "radeonsi: implement sync_file import/export"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-26 01:26:53 +01:00
Jason Ekstrand
db682b8f0e i965/fs: Reset the register file to VGRF in lower_integer_multiplication
18fde36ced changed the way temporary
registers were allocated in lower_integer_multiplication so that we
allocate regs_written(inst) space and keep the stride of the original
destination register.  This was to ensure that any MUL which originally
followed the CHV/BXT integer multiply regioning restrictions would
continue to follow those restrictions even after lowering.  This works
fine except that I forgot to reset the register file to VGRF so, even
though they were assigned a number from alloc.allocate(), they had the
wrong register file.  This caused some GLES 3.0 CTS tests to start
failing on Sandy Bridge due to attempted reads from the MRF:

    ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64

This commit remedies this problem by, instead of copying inst->dst and
overwriting nr, just make a new register and set the region to match
inst->dst.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626
Fixes: 18fde36ced
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-01-25 13:58:55 -08:00
Jason Ekstrand
af9d4ce480 vulkan: Update the XML and headers to 1.0.68
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2018-01-25 13:30:05 -08:00
Dave Airlie
f4c534ef68 radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)
This seems to be broken, at least the cts tests fail.

This fixes:
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_4
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_8

2 samples seems to pass fine, amdvlk doesn't appear to enable TC for
possibly some other reasons here.

This is most likely a hack.

v1.1: add a bit of explaination text. (Samuel)
Fixes: ad3d98da9 (radv: enable tc compatible htile for d32s8 also.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-26 06:55:09 +10:00
Chuck Atkins
6ac5e851f1 configure.ac: add missing llvm dependencies to .pc files
v2: Only add as dependencies for gallium-osmesa and gallium-xlib

CC: <mesa-stable@lists.freedesktop.org>
Signed-of-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 14:54:08 -05:00
George Kyriazis
5d8f270d10 swr/rast: Optimize DumpToFile output size
Modify DumpToFile to only dump the function, not the entire module.
Reduces file sizes and speeds up the dumping.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
dfe4dd48ec swr/rast: Updated copyright dates
on knob-related files.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
36dbbf11a0 swr/rast: Move memory-related JIT functions
Move them to their own file (builder_mem.{h|cpp}).  Add builder_mem.cpp
to the build system.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
94922dbe4b swr/rast: Add extra (optional) parameter in GATHERPS
Now also takes in an additional parameter (draw context) for future
expansion.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
0b46c7b3b0 swr/rast: Better ExecCmd (i.e. system()) implmentation
Hides console window creation during JIT linker execution in apps that
don't have a console.  Remove hooking of CreateProcessInternalA - the
MSFT implementation just turns around and calls CreateProcessInternalW
which, we do hook.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
2d16b61bff swr/rast: Support USE_SIMD16_FRONTEND=0 for EarlyRast
Early Rasterization did not initially work with USE_SIMD16_FRONTEND=0.
Fix it so it works there, too.  Please note that the default setting
is USE_SIMD16_FRONTEND=1.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
Brian Paul
123798eb44 mesa: whitespace fixes in attrib.c
Trivial.
2018-01-25 12:17:26 -07:00
Brian Paul
0e7aaaf5a5 mesa: whitespace fixes in varray.h
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
ba01589c0c mesa: include mtypes.h in varray.h
We actually use some of the types from mtypes.h so include it directly
instead of relying on indirectly including it via bufferobj.h

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
e4504be6fc mesa: s/gl_vertex_attrib_array/gl_array_attributes/ in comments
The structure type was renamed some time ago, but some comments
were not updated.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
6c724fb7c1 mesa: simplify _mesa_delete_list() a bit, add some assertions
All but two cases of the switch did the same n += InstSize[n[0].opcode]
instruction.  Just move it after the switch.

Add some sanity check assertions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
c860171c63 st/mesa: expand glDrawPixels cache to handle multiple images
The newest version of WSI Fusion makes several glDrawPixels calls
per frame.  By caching more than one image, we get better performance
when panning/zooming the map.

v2: move pixel unpack param checking out of cache search loop, per Roland
v3: also move unpack->BufferObj check out of loop, per Roland.
2018-01-25 12:17:26 -07:00
Brian Paul
5092610f29 st/mesa: add some debug code in st_choose_format()
To aid in debugging gallium surface format selection issues.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
94610758a3 svga: s/Bool/SVGA3dBool/ in SVGA3dDevCapResult
And fix whitespace.  To sync up with in-house code.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 11:56:33 -07:00
Emil Velikov
6aeef54644 configure.ac: correct driglx-direct help text
The default was toggled a while back, but the text wasn't updated.

Fixes: bd526ec9e1 ("configure: Always default to
--enable-driglx-direct")
Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-01-25 17:44:35 +00:00
Emil Velikov
7b744a494d swrast: remove non-applicable GLX_SWAP_COPY_OML comment
Noticed while skimming for GLX_ instances in the dri codebase.
Comment is completely off and was in such a state since day 1.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:57 +00:00
Emil Velikov
3e3956d6ae mapi: remove duplicate GL typedefs
Remove the instances already available in gl.h or glext.h.
Sadly GLclampx is only available in GLES(1) so we need to keep that one.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:50 +00:00
Emil Velikov
647f40298a mapi: remove non applicable HAVE_DIX_CONFIG_H hunk
Seeming artefact from when the xserver build was diving directly into
mesa's tree.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:48 +00:00
Emil Velikov
48e7bc6833 mapi: autotools: remove unused MAPI_FILES file list
The sole user was OpenVG, which was removed couple of years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:46 +00:00
Emil Velikov
785d9a4ed8 automake: st/mesa/tests: add st_tests_common.h to the tarball
Fixes: 6569b33b6e ("mesa/st/tests: unify MockCodeLine* classes")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
0beaf7ad3e automake: mesa: include vbo_private.h in the tarball
Fixes: a7cfec3be0 ("vbo: move VBO-private types, prototypes, etc. into
new vbo_private.h header")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
ac4437b20b automake: small cleanup after the meson.build inclusion
Namely extend the EXTRA_DIST list, instead of re-assigning it and bring
back a file dropped by mistake.

Fixes: 436ed65d38 ("autotools: include meson build files in tarball")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
50265cd9ee automake: anv: ship anv_extensions_gen.py in the tarball
Fixes: dd088d4bec ("anv/extensions: Generate a header file with
extension tables")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
265d36c890 automake: vc5: remove non-applicable v3dx_simulator.h
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:28 +00:00
Roland Scheidegger
4fe662c58f gallivm: fix crash with seamless cube filtering with different min/mag filter
We are not allowed to modify the incoming coords values, or things may
crash (as we may be inside a llvm conditional and the values may be used
in another branch).
I recently broke this when fixing an issue with NaNs and seamless cube
map filtering, and it causes crashes when doing cubemap filtering
if the min and mag filters are different.
Add const to the pointers passed in to prevent this mishap in the future.

Fixes: a485ad0bcd ("gallivm: fix an issue with NaNs with seamless cube filtering")

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-01-25 18:03:38 +01:00
Eric Engestrom
57223fb07a egl: keep extension list sorted, per comment at the top
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-01-25 16:38:11 +00:00
George Kyriazis
0e879aad2f swr/rast: support llvm 3.9 type declarations
LLVM 3.9 was not taken into account in initial check-in.

Fixes: 01ab218bbc ("swr/rast: Initial work for debugging support.")
cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104749
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 08:22:52 -06:00
Samuel Pitoiset
e1331c9d61 ac/nir: add break statements in needs_view_index_sgpr()
Previous code is correct but as the first case statement uses
a break, keep it consistent.

CID: 1428579
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-25 13:59:52 +01:00
Eric Engestrom
0663ae0aa1 loader: let compiler figure out the length of the string
Basically, turn comment into code

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 11:40:25 +00:00
Eric Engestrom
57b0ccd178 meson: simplify dri3 logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-25 10:10:04 +00:00
Juan A. Suarez Romero
513c2263cb mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-25 09:54:31 +01:00
Jason Ekstrand
df13588d21 i965: Stop disabling aux during texture preparation
Previously, we were handling self-dependencies by marking the render
buffer and then passing disable_aux=true to prepare_texture so that it
would do a resolve.  This works but ends us up doing to much resolving
in some cases.  Specifically, if we're doing something such as mipmap
generation, this would cause us to resolve all levels of the texture if
even one of them is overlapping.

Instead, this commit makes us wait until we process the framebuffer to
do these resolves and we only resolve the slices needed for rendering.
Doing this resolve puts them into the pass-through state so, even if we
do texture using CCS_E, the CCS data will effectively be ignored and the
real surface contents read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
20f70ae385 i965/draw: Set NEW_AUX_STATE when draw aux changes
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
e52a9f18d6 i965: Replace draw_aux_buffer_disabled with draw_aux_usage
Instead of keeping an array of booleans, we now hang onto an array of
isl_aux_usage enums.  This means that the thing we are passing from
brw_draw.c to surface state setup is the thing that surface state setup
actually needs instead of an input to compute what it needs.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
468ea3cc45 i965/surface_state: Drop brw_aux_surface_disabled
The only purpose of this function is to disable aux on texture surfaces
when the corresponding renderbuffer has aux disabled.  However, the act
of disabling aux on the renderbuffer will cause it to be resolved and
intel_miptree_texture_aux_usage will already check the resolved status
of a texture and return ISL_AUX_USAGE_NONE for it.  Even if we used CCS
for it, that wouldn't really be a problem because the CCS will be in the
pass-through state and so it would effectively be ignored.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
d38ec24f53 i965/miptree: Add an aux_disabled parameter to render_aux_usage
Only one of the callers of intel_miptree_render_aux_usage actually took
brw->draw_aux_buffer_disabled into account.  This was causing us to
ignore draw_aux_buffer_disabled for the intel_miptree_prepare_render.
This isn't a problem because the draw_aux_buffer_disabled entry was set
during texture preparation and we already did the resolve at that time.
However, this also meant that the aux_usage we were passing to
brw_cache_flush_for_render and brw_render_cache_add_bo was wrong so our
automatic cache flushing around aux_usage changes wasn't happening.
This was causing GPU hangs in Oxenfree.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104711
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
dfe0217905 i965/miptree: Take an aux_usage in prepare/finish_render
Both callers of intel_miptree_prepare/finish_render have to call
intel_miptree_render_aux_usage anyway for other reasons.  They may as
well pass the result in instead of us calling it again.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
7d4007d58a aubinator: Multiply count by 4 to compute buffer sizes
The count field is in terms of dwords and not bytes.
2018-01-24 19:05:36 -08:00
Timothy Arceri
e776791432 st/glsl_to_nir: remove reallocation of sampler/image location
As far as I can tell this always just reassigns the same value.

Also as we don't curretly store UniformHash in the shader cache
removing this will help with adding a shader cache to gallium
nir drivers.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-01-25 13:27:22 +11:00
Jordan Justen
62b68d05e7 docs: add 18.1.0-devel release notes template
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-24 17:10:58 -08:00
Jordan Justen
65c18b02fc mesa: bump version to 18.1.0-devel
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-24 17:10:58 -08:00
Greg V
8fae5eddd9 meson: handle LLVM 'x.x.xgit-revision' versions
When LLVM is built inside of a git repo (even way below, e.g. /usr/ports/.git
exists, and LLVM is built in /usr/ports/devel/llvm50/work), its version
becomes something like 5.0.0git-f8ab206b2176.

New meson versions already handle this, but we support older versions too.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-24 15:25:54 -08:00
Greg V
53f9131205 meson: fix getting cflags from pkg-config
get_pkgconfig_variable('cflags') always returns an empty list, it's a
function for getting *custom* variables.

Meson does not yet support asking for cflags, so explicitly invoke
pkg-config for now.

Fixes: 68076b8747 ("meson: build gallium vdpau state tracker")
Fixes: a817af8a89eb ("meson: build gallium xvmc state tracker")
Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-24 15:25:54 -08:00
Greg V
c38c60a63c meson: fix BSD build
CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-24 15:25:54 -08:00
Greg V
7c8cfe2d59 meson: fix missing dependencies
Fixes: 66f97f6640 ("meson: build radeonsi")
Reviewed-by: Emil Velikov <emil.velikov@colalbora.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-24 15:25:54 -08:00
Grazvydas Ignotas
0cc7370733 anv: correct a duplicate check in an assert
Looks like checking both sources was intended, instead of the first one
twice. Found with Coccinelle, coccinellery/xand/xand.cocci semantic patch.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-25 01:10:45 +02:00
Marc Dietrich
a2a1b0e75e meson: fix HAVE_LLVM version define in meson build
LLVM patch level is not included in HAVE_LLVM.

Fixes: e6418ab156 ("meson: build "radv" vulkan driver for radeon hardware")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
2018-01-24 14:04:20 -08:00
Dylan Baker
5781c3d1db meson: correctly set SYSCONFDIR for loading dirrc
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-24 13:10:32 -08:00
Dave Airlie
d2414e64e4 radv: add multisample Z optimisation from amdvlk
This was just found while reading for other stuff,
src/core/hw/gfxip/gfx6/gfx6DepthStencilView.cpp.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-25 06:48:11 +10:00
Dave Airlie
298554541d radv: move spi_baryc_cntl to pipeline
We need to enable the pos float location 2 mode anytime we have
persample not just when forced by the frag shader.

This fixes:
dEQP-VK.pipeline.multisample.min_sample_shading*

Fixes: 58c97a079 (radv: enable location at sample when persample is forced.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-25 06:47:28 +10:00
Marek Olšák
125c0529f3 gallium/u_tests: add texture_barrier and FBFETCH tests
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-24 21:08:45 +01:00
Marek Olšák
022c5b22fe radeonsi: don't ignore pitch for imported textures
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-24 21:08:45 +01:00
Scott D Phillips
0b8d38bd48 meson: Fix define for USE_SSE41
Before we were adding -DHAVE_SSE41 which isn't what the code is
looking for, so some uses of the sse4.1 code were always being
skipped.

v2: Don't add any compile check for the quite old -msse4.1 option (Dylan)

Fixes: 84486f6462 ("meson: Enable SSE4.1 optimizations")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-24 11:32:34 -08:00
Gert Wollny
8172b9ff48 mesa/st/glsl_to_tgsi: remove now unneeded assert.
With the implementation of the tracking of the registers used in reladdr
asserting that a driver calling merge_register() uses the address register
is no longer needed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:34:05 -07:00
Gert Wollny
f2040fbe48 mesa/st/tests: Add tests for lifetime tracking with indirect addressing
Add a code line type that accepts one layer of indirect addressing and
add tests to check that temporary register access used for indirect
addressing is accounted for in the lifetime estimation.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:34:00 -07:00
Gert Wollny
51c0cee267 mesa/st/glsl_to_tgsi: Add tracking of indirect addressing registers
So far indirect addressing was not tracked to estimate the temporary
life time, and it was not needed, because code to load the address
registers was always emitted eliminating the reladdr* handles in the
past glsl-to.tgsi stages. Now, with Mareks patch allowing any 1D register
to be used for addressing on some hardware this changed, and
the tracking becomes necessary.

Because the registers have no direct indication on whether the reladdr* was
already loaded into an address register, the temporaries in reladdr* are
always tracked as reads. This may result in a slight over-estimation of the
lifetime in the cases when the load to the address register was emitted.

v2: no changes
v3: Use debug_log variable instead of directly writing to std::err in debugging
    output.
v6: fix indention and typos

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
517e34c62f mesa/st/tests: Add tests for improved tracking of temporaries
Additional tests are added that check the tracking of access to temporaries
in if-else branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
807e2539e5 mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register merging
Improve the life-time evaluation of temporary registers by also tracking
writes in both if and else branches and in up to 32 nested scopes.
As a result the estimated required register life-times can be further
reduced enabling more registers to be merged.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
8dda01ef5a mesa/st/tests: cleanup whitespace usage and correct some comments
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
6569b33b6e mesa/st/tests: unify MockCodeLine* classes
* Merge the classes MockCodeLine and MockCodelineWithSwizzle into
   one, and  refactor tests accordingly.
 * Change memory allocations to use ralloc* interface.

 v2:
 * move the test classes into a conveniance library
 * rename the Mock* classes to Fake* since they are not really
   Mocks
 * Base assertion of correct number of src and dst registers in tests
   on what the operatand actually expects
 * Fix number of destinations in one test

 v6:
 * fix local includes using "..." insteadof <...>

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
ad1990629e mesa/st/tests: Fix zero-byte allocation leaks
Don't allocate a zero-sized array, when no texture offsets are given.

v5: correct spaces and empty lines

Reviewed-by: Brian Paul <brianp@vmware.com>(v4)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
ee48e3acb8 mesa/st/glsl_to_tgsi: Add some operators for glsl_to_tgsi related classes
Add the equal operator and the "<<" stream write operator for the
st_*_reg classes and the "<<" operator to the instruction class, and
make use of these operators in the debugging output.

v5: Fix empty lines

Reviewed-by: Brian Paul <brianp@vmware.com> (v4)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
6a3421078a mesa/program: Add missing file types to printout
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Brian Paul
365a48abdd vbo: fix incorrect min/max_index values in display list draw call
This fixes another regression from commit 8e4efdc895 ("vbo: optimize
some display list drawing").  The problem was the min_index, max_index
values passed to the vbo drawing function were not computed to compensate
for the biased prim::start values.

https://bugs.freedesktop.org/show_bug.cgi?id=104746
https://bugs.freedesktop.org/show_bug.cgi?id=104742
https://bugs.freedesktop.org/show_bug.cgi?id=104690
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Fixes: 8e4efdc895 ("vbo: optimize some display list drawing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-01-24 10:12:49 -07:00
Brian Paul
2123bd2805 vbo: whitespace/formatting fixes in vbo_split_inplace.c
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
6b0109cf39 vbo: whitespace/formatting fixes in vbo.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
b9280031a8 vbo/i965: move vbo_all_varyings_in_vbos() to brw_draw.c
It's only used in brw_draw_prims().

s/GLboolean/bool/, etc.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a83f7e119c vbo: remove unused vbo_any_varyings_in_vbos() function
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
718f4251c5 vbo: remove unneeded #includes
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
f4376a0c2b vbo: remove vbo_context.h and change includes to use vbo.h instead
Now vbo.h is the public interface to the VBO module.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
aafb56a148 vbo: move remaining items from vbo_context.h to vbo.h
Non-VBO sources files sometimes included vbo.h while others included
vbo_context.h.  We're moving all public types, functions to the former.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a7cfec3be0 vbo: move VBO-private types, prototypes, etc. into new vbo_private.h header
Things which should not be used outside the VBO module.
More public/private clean-ups coming.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
d40fa42292 mesa: use new _vbo_install_exec_vtxfmt() function
Instead of reaching into the vbo_context object in vtxfmt.c

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
04a17ec327 nouveau: remove vbo_context() call
_vbo_DestroyContext() can be safely called even if there's no VBO
module.  Removes a dependency on the vbo_context() function.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
7b0ae96711 i965: use vbo_set_[indirect]_draw_func()
Instead of poking into the vbo_context object.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
3bbf8d9042 vbo: move vbo_sizeof_ib_type() into vbo_exec_array.c
It's only used in this one file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a152cb7492 mesa: move vbo_count_tessellated_primitives() to api_validate.c
It's only used in this file and has nothing VBO-specific about it.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
5d3e10fd27 mesa: update comment on gl_display_list
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
cffa82327d mesa: whitespace clean-ups in mtypes.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
b3a1aa94d9 mesa: remove unused MAT_INDEX_AMBIENT/DIFFUSE/SPECULAR contants
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
67dc551ba9 vbo: move DLIST_DANGLING_REFS from mtypes.h to vbo_save_api.c
It's only used in this file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
cb7ef0df00 vbo: replace assert(0) with unreachable()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
8b3cb7c651 vbo: fix, add comment in vbo_save.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
67ebde19d4 vbo: whitespace, formatting fixes in vbo_split.[ch]
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Topi Pohjolainen
ec4bb693a0 i965: Don't try to disable render aux buffers for compute
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104546
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-24 10:54:08 +02:00
Jason Ekstrand
4064fe59e7 anv/cmd_buffer: Move gen7 index buffer state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:46 -08:00
Jason Ekstrand
38ec78049f anv/cmd_buffer: Move num_workgroups to compute state
While we're here, make it an anv_address.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:44 -08:00
Jason Ekstrand
95ff232294 anv/cmd_buffer: Move dynamic state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:43 -08:00
Jason Ekstrand
24caee8975 anv/cmd_buffer: Use a temporary variable for dynamic state
We were already doing this for some packets to keep the lines shorter.
We may as well just do it for all of them.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:40 -08:00
Jason Ekstrand
8bd5ec5b86 anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state
Vertex buffers are entirely a graphics pipeline thing.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:39 -08:00
Jason Ekstrand
e85aaec148 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:36 -08:00
Jason Ekstrand
97f96610c8 anv: Separate compute and graphics descriptor sets
The Vulkan spec says:

    "pipelineBindPoint is a VkPipelineBindPoint indicating whether the
    descriptors will be used by graphics pipelines or compute pipelines.
    There is a separate set of bind points for each of graphics and
    compute, so binding one does not disturb the other."

Up until now, we've been ignoring the pipeline bind point and had just
one bind point for everything.  This commit separates things out into
separate bind points.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102897
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:33 -08:00
Jason Ekstrand
31b2144c83 anv/cmd_buffer: Use anv_descriptor_for_binding for samplers
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:31 -08:00
Jason Ekstrand
b9e1ca16f8 anv/cmd_buffer: Add a helper for binding descriptor sets
This lets us unify some code between push descriptors and regular
descriptors.  It doesn't do much for us yet but it will.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:30 -08:00
Jason Ekstrand
90cceaa9dd anv/cmd_buffer: Refactor ensure_push_descriptor_set
It's now a function which returns the push descriptor set.  Since we set
the error on the command buffer, returning the error is a little
redundant.  Returning the descriptor set (or NULL on error) is more
convenient.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:28 -08:00
Jason Ekstrand
d5592e2fda anv: Remove semicolons from vk_error[f] definitions
With the semicolons, they can't be used in a function argument without
throwing syntax errors.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:27 -08:00
Jason Ekstrand
9af5379228 anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute
Initially, these just contain the pipeline in a base struct.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:25 -08:00
Jason Ekstrand
ddc2d28548 anv/cmd_buffer: Use some pre-existing pipeline temporaries
There are several places where we'd already saved the pipeline off to a
temporary variable but, due to an artifact of history, weren't actually
using that temporary everywhere.  No functional change.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:24 -08:00
Jason Ekstrand
cd3feea745 anv/cmd_buffer: Rework anv_cmd_state_reset
This splits anv_cmd_state_reset into separate init and finish functions.
This lets us share init code with cmd_buffer_create.  This potentially
fixes subtle bugs where we may have missed some bit of state that needs
to get initialized on command buffer creation.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:22 -08:00
Jason Ekstrand
d6c9a89d13 anv/cmd_buffer: Get rid of the meta query workaround
Meta has been gone for a long time.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:20 -08:00
Jason Ekstrand
bc0a21e348 anv/cmd_state: Drop the scratch_size field
This is a legacy left-over from the mechanism we used to use to handle
scratch.  The new (and better) mechanism doesn't use this.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:19 -08:00
Jason Ekstrand
4b69ba3817 anv/pipeline: Don't assert on more than 32 samplers
This prevents an assert when running one unreleased Vulkan game.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:08 -08:00
Dave Airlie
766589d89a radv: fix sample_mask_in loading. (v3.1)
This is ported from radeonsi and fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_*

v2: don't call this path for radeonsi, it does it in the epilog.
use the radeonsi code path.
v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel)
v3.1: set ps_iter_samples default to 1 (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: bdcbe7c76 (radv: add sample mask input support)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 14:25:11 +10:00
Dave Airlie
c727ea9370 radv: don't use hw resolves for r16g16 norm formats.
radeonsi has a workaround for this, but it uses a R16A16 format,
which vulkan doesn't have, we could probably come up with a work
around but for now just avoid hw resolves.

Fixes:
dEQP-VK.renderpass.suballocation.multisample.r16g16_*norm*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 09:01:12 +10:00
Dave Airlie
4df414bbd2 radv: don't use hw resolve for integer image formats
From reading AMDVLK it currently never uses hw resolve paths.

This patch takes from radeonsi which doesn't use hw resolve
for integer formats, and does the same for radv.

This fixes:
dEQP-VK.renderpass.suballocation.multisample*uint tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 08:53:18 +10:00
Dave Airlie
316d762186 radv: add fs_key meta format support to resolve passes.
Some of the hw resolve passes need the SPI color format setup
correctly.

This fixes lots of 16-bit and 32-bit format tests in
dEQP-VK.renderpass.suballocation.multisample*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 08:50:51 +10:00
Grazvydas Ignotas
224fd17e1e winsys/svga: check correct member after create
.mob_fenced was already checked, probably a copy-paste bug.
Found by Coccinelle.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-23 11:04:07 -07:00
Grazvydas Ignotas
08085df313 svga: fix context alloc error handling
'cleanup' path is dereferencing 'svga' a lot, 'done' is a better choice.
Found by Coccinelle.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-23 11:04:07 -07:00
Christoph Haag
4b4d929c27 meson: remove lib prefix from libd3dadapter9.so
Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-23 09:30:30 -08:00
Emil Velikov
3b6d232a5c docs: update calendar 18.0.0-rc1 is out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 17:02:17 +00:00
Eric Engestrom
eee8dd7c33 radeon: remove left over dead code
Fixes: 4e0d99a635 "r100: Use shared debug code"
Cc: Pauli Nieminen <suokkos@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-01-23 15:39:57 +00:00
Eric Engestrom
10f5e0dce2 docs: ask for backport nominations to cc: the author
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-23 15:39:57 +00:00
Marc Dietrich
911ca587f8 meson: fix some defines misspelled errors in meson.build
Defines
- HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
- HAVE_FUNC_ATTRIBUTE_VISIBILITY
were misspelled.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-23 15:39:57 +00:00
Bas Nieuwenhuizen
5a4dc28500 ac/nir: Use instance_rate_inputs per attribute, not per variable.
This did the wrong thing if we had e.g. an array for which only some
of the attributes use the instance index. Tripped up some new CTS
tests.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-23 12:58:48 +01:00
Jason Ekstrand
de00e8227b anv: Return trampoline entrypoints from GetInstanceProcAddr
Technically, the Vulkan spec requires that we return valid entrypoints
for all core functionality and any available device extensions.  This
means that, for gen-specific functions, we need to return a trampoline
which looks at the device and calls the right device function.  In 99%
of cases, the loader will do this for us but, aparently, we're supposed
to do it too.  It's a tiny increase in binary size for us to carry this
around but really not bad.

Before:
       text    data   bss      dec     hex  filename
    3541775  204112  6136  3752023  394057  libvulkan_intel.so

After:
       text    data   bss      dec     hex  filename
    3551463  205632  6136  3763231  396c1f  libvulkan_intel.so

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
eac29f3a6d anv/entrypoints: Use an named tuple for params
This allows us to store a bit more detailed data per-param

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
1f79d986af anv: Only advertise enabled entrypoints
The Vulkan spec annoyingly requires us to track what core version and
what all extensions are enabled and only advertise those entrypoints.
Any call to vkGet*ProcAddr for an entrypoint for an extension the client
has not explicitly enabled is supposed to return NULL.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
e3d27542ae anv: Add a per-device dispatch table
We also switch GetDeviceProcAddr over to use it.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
0c399dca51 anv: Add a per-instance dispatch table
We also switch GetInstanceProcAddr over to use it.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
a372b9247d anv: Properly NULL for GetInstanceProcAddr with a null instance
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
cb0d1ba156 anv/extensions: Fix VkVersion::c_vk_version for patch == None
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
93e789a266 anv/entrypoints: Parse entrypoints before extensions/features
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
2f493121ae anv/entrypoints: Expose the different dispatch tables
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
083e126694 anv/entrypoints: Split entrypoint index lookup into its own function
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
7039308d7c anv/entrypoints: Add a LAYERS helper variable
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
f54227856f anv/entrypoints: Add an Entrypoint class
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
abc62282b5 anv: Add a per-device table of enabled extensions
Nothing uses this at the moment, but we will need it soon.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
01b9701a5c anv: Use tables for device extension wrangling
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
920bd2c0bc anv: Add a per-instance table of enabled extensions
Nothing needs this yet but we will want it later.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
ff5f3e2b21 anv: Use tables for instance extension wrangling
This lets us move a bunch of stuff out of codegen and back into
anv_device.c which is a bit nicer.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
dd088d4bec anv/extensions: Generate a header file with extension tables
This allows us better introspection into extensions.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
ffb10bfd8e anv/meson: Simplify some dependency and flag tracking
This removes some redundant code between libanv_common, libvulkan_intel,
and libvulkan_intel_test.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
f939940809 anv: Split anv_extensions.py into two files
The new anv_extensions_gen.py is the code generator while the old
anv_extensions.py file is purely declarative.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Jason Ekstrand
10d1b0be8e anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-23 00:15:40 -08:00
Timothy Arceri
e68150de26 ac: fix image load store for GLSL_SAMPLER_DIM_3D
Fixes the following piglit tests:

arb_shader_image_load_store/layer/image3d/layered binding test
arb_shader_image_load_store/max-size/image3d max size test/2048x8x8x1
arb_shader_image_load_store/max-size/image3d max size test/8x2048x8x1
arb_shader_image_load_store/max-size/image3d max size test/8x8x2048x1
arb_shader_image_load_store/semantics/imageload/vertex shader/rgba32f/image3d test

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 18:05:13 +11:00
Timothy Arceri
82adf53308 ac: image size builtin for GLSL_SAMPLER_DIM_3D
This is what radeonsi does. Fixes remaing piglit subtest in:

./bin/arb_shader_image_size-builtin --quick -auto -fbo

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 18:05:13 +11:00
Chuck Atkins
a29d63ecf7 swr: refactor swr_create_screen to allow for proper cleanup on error
This makes the following changes to address cleanup issues:
- Error conditions now return NULL instead of calling exit()
- swr_creen is now freed upon error, rather than leak.
- Library handle from dlopen is now closed upon swr_screen destruction

v2: Added additional context in commit msg and remove unnecessary "PUBLIC"
v3: Fix typo in commit message.

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
cc: mesa-stable@lists.freedesktop.org
2018-01-22 17:56:44 -06:00
Anuj Phogat
56b9060381 intel: Add Geminilake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-22 15:40:04 -08:00
Timothy Arceri
5b9362c248 ac: fix ac_build_varying_gather_values() for packed layouts
This fixes a segfault for varyings not starting at component 0.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 10:00:52 +11:00
Timothy Arceri
209b14c2cb ac: remove arrays when when querying sampler info
Fixes the following ARB_arrays_of_arrays piglit tests:

basic-imagestore-const-uniform-index
basic-imagestore-mixed-const-non-const-uniform-index
basic-imagestore-mixed-const-non-const-uniform-index2
basic-imagestore-non-const-uniform-index

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:50:47 +11:00
Timothy Arceri
0cbc62a4dd glsl: add image and sampler (un)packing support to glsl to nir
This is needed for ARB_bindless_texture support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:44:37 +11:00
Timothy Arceri
549ccbb435 nir: add image and sampler type to glsl_get_bit_size()
These are needed for ARB_bindless_texture support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:44:37 +11:00
Timothy Arceri
324d2fe6a7 ac: fix emit vertex stream parameter
Fixes the following piglit test on radeonsi:

./bin/arb_enhanced_layouts-gs-stream-location-aliasing

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:30:00 +11:00
Timothy Arceri
271067967a ac: add support for gl_HelperInvocation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:23:26 +11:00
Timothy Arceri
3bc5fa69f5 ac/radeonsi: add emit primitive to the abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:18:37 +11:00
Timothy Arceri
dd4591b794 radeonsi: add generic emit primitive helper
This will be shared by the tgsi and nir backends.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:18:36 +11:00
Timothy Arceri
fdc2fb4d88 ac: add stream handling to visit_end_primitive()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:18:36 +11:00
Timothy Arceri
6bf1c93fe0 radeonsi/nir: fix fs output index
Fixes the following piglit tests:

arb_blend_func_extended-fbo-extended-blend
arb_blend_func_extended-fbo-extended-blend-explicit
arb_blend_func_extended-fbo-extended-blend-explicit_gles3
arb_blend_func_extended-fbo-extended-blend-pattern
arb_blend_func_extended-fbo-extended-blend-pattern_gles2
arb_blend_func_extended-fbo-extended-blend-pattern_gles3
arb_blend_func_extended-fbo-extended-blend_gles3
ext_framebuffer_multisample/alpha-to-coverage-dual-src-blend
ext_framebuffer_multisample/alpha-to-one-dual-src-blend

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:11:22 +11:00
Timothy Arceri
882af004d8 ac/nir/radeonsi: add ARB_shader_ballot support
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:11:22 +11:00
Timothy Arceri
4a9643413f ac/nir: add ARB_shader_group_vote support
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:11:22 +11:00
Timothy Arceri
dabff1cf7a radeonsi/nir: add primitive id to inputs scan
Fixes the following piglit tests:

arb_tessellation_shader/fs-primitiveid-instanced
glsl-1.50/primitive-id-no-gs
glsl-1.50/primitive-id-no-gs-first-vertex
glsl-1.50/primitive-id-no-gs-instanced
glsl-1.50/primitive-id-no-gs-strip
glsl-1.50/primitive-id-no-gs-strip-first-vertex

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:11:21 +11:00
Timothy Arceri
c6a0ce7e54 radeonsi/nir: add nir_intrinsic_load_sample_mask_in to ir scan
Fixes a bunch of ARB_sample_shading piglit tests.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:11:21 +11:00
Samuel Thibault
9131e6d3c2 u_thread: Use pthread_setname_np on linux only.
pthread_setname_np was added in glibc 2.12 for the Linux port only, other
ports do not necessarily have it.

Signed-off-by: Jose Fonseca <jfonseca@vmware.com>
2018-01-22 21:12:41 +00:00
Jose Fonseca
dcbb224c68 svga: Prevent use after free.
Courtesy of clang static analyzer.

I was hunting for potential sources of memory corruption using Mesa with
a GL trace, and happened to find this (unrelated) issue.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-01-22 21:12:41 +00:00
Kenneth Graunke
60f15477da i965: Drop render_target_start from binding table struct.
We have to start render targets at binding table index 0 in order to use
headerless FB write messages, and in fact already assume this in a bunch
of places in the code.  Let's finish that off, and not bother storing 0
in a struct to pretend to add it in a few places.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-22 10:03:52 -08:00
Emil Velikov
a9bb067e27 i965: make brw_context::num_samples unsigned int
It is never a negative number. Variable is compared against unsigned
values and passed into functions that expect unsigned int.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-22 16:31:57 +00:00
Emil Velikov
ef1df63046 st/mesa: provide static inline st_init_vdpau_functions
The ifdef spaghetty in st_vdpau.c is rather confusing and misleading.
Simplily it by introducing a static inline helper noop (when
HAVE_ST_VDPAU is not defined) in the header.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-01-22 16:31:15 +00:00
Samuel Pitoiset
33e6e5e6a4 radv: add an option that allows to dump pre-optimization ir
With RADV_DEBUG=preoptir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-22 12:28:33 +01:00
Chris Wilson
525b4f7548 i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext
The forward port of commit 6d87500fe1 ("dri: Change
__DriverApiRec::CreateContext to take a struct for attribs") failed to
adapt the set of allowed attributes for the earlier introduction of
context priorities (commit 1617fca6d1 "i965: Pass the EGL/DRI context
priority through to the kernel").

Fixes: 6d87500fe1 ("dri: Change __DriverApiRec::CreateContext to take a struct for attribs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2018-01-22 10:24:20 +00:00
Matthew Nicholls
005375717b radv: restore previous stencil reference after depth-stencil clear
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
2018-01-22 08:57:42 +00:00
Jason Ekstrand
5048572352 i965: Set tiling on BOs imported with modifiers
We need this to ensure that GTT maps work on buffers we get from Vulkan
on the off chance that someone does a readpixels or something.  Soon, we
will be removing GTT maps from i965 entirely and this can be reverted.
None the less, it's needed for stable.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-01-21 23:07:18 -08:00
Jason Ekstrand
b9e7b29705 i965/bufmgr: Add a create_from_prime_tiled function
This new function is an import and a set tiling in one go.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-01-21 23:07:18 -08:00
Jason Ekstrand
ad424b2243 i965/miptree: Use the tiling from the modifier instead of the BO
This fixes a bug where we were taking the tiling from the BO regardless
of what the modifier said.  When we got images in from Vulkan where it
doesn't set the tiling on the BO, we would treat them as linear even
though the modifier expressly said to treat it as Y-tiled.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-01-21 23:07:18 -08:00
Jason Ekstrand
0465dd13d2 i965/miptree: Add an explicit tiling parameter to create_for_bo
Otherwise, create_for_bo will just grab the tiling from the BO which is
not what we want when using modifiers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-01-21 23:07:16 -08:00
Bas Nieuwenhuizen
4584c4ef04 radv: Don't allow 3d or 1d depth/stencil textures.
addrlib asserts when that happens, and supporting it is not
required so lets not allow this for now.

It also assert on fmask, but we don't have the number of samples here.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-22 00:07:43 +01:00
Bas Nieuwenhuizen
8b98929074 radv: Init variant entry with memset.
This gets memcpy'd and written driectly, and due to alignment, this
resulted in uninitialized gaps. This makes those gaps go away.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-22 00:07:39 +01:00
Bas Nieuwenhuizen
fb0992e967 radv: Fix bufimage failure deallocation.
The inidividual init parts don't clean up their own stuff on failure.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-22 00:07:32 +01:00
Bas Nieuwenhuizen
2c802ca66c radv: Fix fragment resolve init memory allocation failure paths.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-22 00:07:29 +01:00
Bas Nieuwenhuizen
c685076ab0 radv: Fix freeing meta state if the device pipeline cache fails to allocate.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-22 00:07:24 +01:00
Bas Nieuwenhuizen
71f0315a88 radv: Fix memory allocation failure path in compute resolve init.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-22 00:07:19 +01:00
Bas Nieuwenhuizen
d956e0bdf5 radv: Fix ordering issue in meta memory allocation failure path.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-22 00:07:03 +01:00
Lucas Stach
29a0ea699a etnaviv: dirty TS state when framebuffer has changed
When switching between framebuffers with and without TS, the TS state
needs to be flushed to the command stream even if the derived state
isn't changed.

Fixes: 4ee7c2c284 ("etnaviv: enable TS, but disable autodisable")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-01-21 12:58:02 +01:00
Vinson Lee
e03c880971 broadcom/vc5: Fix source file name.
Fixes: c9b2cb7897 ("vc5: add missing files to the tarball")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-01-21 11:13:16 +08:00
Vinson Lee
14abbe604b broadcom/vc5: Add missing include paths.
Fixes: 954a704da3 ("broadcom/vc5: Port the RCL setup to V3D4.1.")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-01-21 11:05:33 +08:00
Eric Anholt
f398aa6aef mesa: Only require independent blending for GLES 3.2.
We've been requiring this since GLES 3.0 was introduced, but the GLES 3.2
spec is the one that has "Supporting blending on a per-draw-buffer basis"
in the new features.  V3D 3.3 would require lowering blending to shader
code to implement independent blending.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-21 10:48:00 +08:00
Kenneth Graunke
60d8fe9216 i965: Delete completely bogus comment
This hasn't been true in 6+ years, if it was even true then.  Before
we rewrote the compiler and introduced GLSL IR in 2010-2011, i965 used
to have two compiler backends for WM programs, based on Mesa IR.  One
handled flow control and was SIMD8-only, while the other was SIMD16
only and didn't handle flow control.  Or something like that.

Even then, this certainly didn't handle vertex shaders, so "all ...
code generation" is a bit strong.
2018-01-20 01:31:25 -08:00
Dylan Baker
436ed65d38 autotools: include meson build files in tarball
This adds the meson.build, meson_options.txt, and a few scripts that are
used exclusively by the meson build.

v2: - Remove accidentally included changes needed to test make dist with
      LLVM > 3.9

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 16:30:51 -08:00
George Kyriazis
9d80ed0862 swr/rast: Fix llvm5 behavior
For some reason llvm5 is picky about accepting a void * type in the
case of building an argument list.

Since we don't care about the type (we ignore the argument for now),
pick another pointer type

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 17:08:30 -06:00
George Kyriazis
d335b32baf swr/rast: Enable early rasterization
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:43 -06:00
George Kyriazis
bacfbe5a32 swr/rast: Implement Early Rasterization optimization
Early Rasterization is an optimization for small triangles.

Scientific workloads often contain very small triangles that has non-zero
area and cannot be trivially rejected as falling between pixel centers,
but does not cover any pixel center. Those triangles can be initially
rasterized as early as in binner and rejected if they cover no pixels The
optimization can be disabled in compilation using KNOB_ENABLE_EARLY_RAST
option in knobs.h

The Early Rast is disabled by default.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:43 -06:00
George Kyriazis
be3cd7add1 swr/rast: Enable simd16 vertex shaders
Flip the switch(es) to enable simd16 vertex shaders:

USE_SIMD16_SHADERS and USE_SIMD16_VS

Both have to be enabled at the same time.  Currently, just setting
USE_SIMD16_SHADERS does not work correctly.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:42 -06:00
George Kyriazis
8c83d2d371 swr: Support simd16 vertex shaders
Supporting simd16 vertex shaders involves packing the output of the
fetch shader appropriately, especially the vertexID buffers that have to
be formatted in one simd16 register, needed by the VS.

As part of this support, we needed to remove the 2nd JitManager, since it
was not accounting for vector width correctly.

USE_SIMD16_SHADERS is also split into two defines.  The additional
one (USE_SIMD16_VS) controls the width of the vertex shader (VS), while
the original one (USE_SIMD16_SHADERS) controls overall front end width.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:42 -06:00
George Kyriazis
1874d95a8e swr/rast: changed jit debug magic number
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:41 -06:00
George Kyriazis
c719f62621 swr/rast: Added ICLAMP builder function
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:41 -06:00
George Kyriazis
f192502001 swr/rast: Jit debug work
Properly validate DLL matches OBJ for jitted function

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:41 -06:00
George Kyriazis
3c405e32b0 swr/rast: silence generated file warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:40 -06:00
George Kyriazis
fe107e3c17 swr/rast: jit shader lib debug work
Create shader_lib during build, link with shaders at DLL generation time

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:40 -06:00
George Kyriazis
0cd9ad98a3 swr/rast: AVX-512 changes to enable 16-wide VS
Add a new define (USE_SIMD16_VS), to denote calling a 16-wide vertex shader.
This is needed because the mesa driver can do 16-wide shaders, but rasty
cannot yet, so we need to distinguish.

Create a new VertexID entry (VertexID16) for the USE_SIMD16_VS case, since
we need to format the vertex id in a way that is digestible by the 16-wide VS

Disabled for now.  To be enabled in a future checkin when driver work
is complete.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:40 -06:00
George Kyriazis
3140e714d2 swr/rast: x86 autogenerated macro work
Add name argument to x86 autogenerated macros.
Add useful variable names for DCL_inputVec implementation.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:39 -06:00
George Kyriazis
4cd6e2ebfd swr/rast: Shorten some filenames
in shader and fetch dump files

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:39 -06:00
George Kyriazis
3936044d07 swr/rast: work supporting optimizations in Debug builds.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:38 -06:00
George Kyriazis
c4a42f5add swr/rast: Add debugging type support for function types.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:38 -06:00
George Kyriazis
e9e7f3ce0a swr/rast: Shader debugging work
- Move debug .ll files to JIT_CACHE_DIR
- Don't link against jitter SRGBLut table, add global data to shader that needs it.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:34 -06:00
George Kyriazis
34bbcb5052 swr/rast: Debug Symbols work
Added support for Fetch / Sample / LD functions
Added DLL link to JitCache implementation

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:30 -06:00
George Kyriazis
01ab218bbc swr/rast: Initial work for debugging support.
Adds ability to step into jitted llvm IR in Visual Studio.
- Updated llvm type generation script to also generate corresponding debug types.
- New module pass inserts debug metadata into the IR for each function

Disabled by default.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:52:22 -06:00
George Kyriazis
4660e13152 swr/rast: Add private state parameter in fetcher
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:48:41 -06:00
George Kyriazis
079ae3c48d swr/rast: Added missing define for Linux/gcc
+ ZeroMemory() macro definition for non win32-compilation in common/os.h

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:48:41 -06:00
George Kyriazis
70f8eac603 swr/rast: Fix one more invalid object format for windows.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:48:41 -06:00
Bas Nieuwenhuizen
61a790409e radv: Always re-emit the sample position offset user SGPR.
The user SGPR location can change between pipelines, so we need to
emit it again to the pottentially changed SGPR index.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 23:35:12 +01:00
Bas Nieuwenhuizen
dbf1e918cd radv: emit pa_sc_mode_cntl_0 with multisample state.
We don't have the meta kludge with 0 viewports anymore,
so we can always enable them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 23:35:12 +01:00
Kenneth Graunke
c7dcee58b5 i965: Avoid problems from referencing orphaned BOs after growing.
Growing the batch/state buffer is a lot more dangerous than I thought.

A number of places emit multiple state buffer sections, and then write
data to the returned pointer, or save a pointer to brw->batch.state.bo
and then use it in relocations.  If each call can grow, this can result
in stale map references or stale BO pointers.  Furthermore, fences refer
to the old batch BO, and that reference needs to continue working.

To avoid these woes, we avoid ever swapping the brw->batch.*.bo pointer,
instead exchanging the brw_bo structures in place.  That way, stale BO
references are fine - the GEM handle changes, but the brw_bo pointer
doesn't.  We also defer the memcpy until a quiescent point, so callers
can write to the returned pointer - which may be in either BO - and
we'll sort it out and combine the two properly in the end.

v2/v3:
- Handle stale pointers in the shadow copy case, where realloc may or
  may not move our shadow copy to a new address.
- Track the partial map explicitly, to avoid problems with buffer reuse
  where multiple map modes exist (caught by Chris Wilson).

v4:
- Don't use realloc in the CPU shadow case, it isn't safe.

Fixes: 2dfc119f22 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v3]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-01-19 11:30:10 -08:00
Kenneth Graunke
8a5bc304ff i965: Rename 'aux' to 'prog_data' in program cache.
'aux' is a very generic name, suggesting it can be a bunch of things.
However, it's always the brw_*_prog_data structure.  So, call it that.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-19 11:29:47 -08:00
Chuck Atkins
a4be2bcee2 swr: allow a single swr architecture to be builtin
Part 2 of 2 (part 1 is autoconf changes, part 2 is C++ changes)

When only a single SWR architecture is being used, this allows that
architecture to be builtin rather than as a separate libswrARCH.so that
gets loaded via dlopen.  Since there are now several different code
paths for each detected CPU architecture, the log output is also
adjusted to convey where the backend is getting loaded from.

This allows SWR to be used for static mesa builds which are still
important for large HPC environments where shared libraries can impose
unacceptable application startup times as hundreds of thousands of copies
of the libs are loaded from a shared parallel filesystem.

Based on an initial implementation by Tim Rowley.

v2: Refactor repetitive preprocessor checks to reduce code duplication
v3: Formatting changes per Bruce C. Also delay screen creation until end
    to avoid leaks when failure conditions are hit.

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
CC: Tim Rowley <timothy.o.rowley@intel.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 13:16:00 -06:00
Chuck Atkins
2ed8b6f827 swr: (autoconf) allow a single swr architecture to be builtin
Part 1 of 2 (part 1 is autoconf changes, part 2 is C++ changes)

When only a single SWR architecture is being used, this allows that
architecture to be builtin rather than as a separate libswrARCH.so that
gets loaded via dlopen.  Since there are now several different code
paths for each detected CPU architecture, the log output is also
adjusted to convey where the backend is getting loaded from.

This allows SWR to be used for static mesa builds which are still
important for large HPC environments where shared libraries can impose
unacceptable application startup times as hundreds of thousands of copies
of the libs are loaded from a shared parallel filesystem.

Based on an initial implementation by Tim Rowley.

v2: Fix comment placement pointed out by Bruce C.

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
CC: Tim Rowley <timothy.o.rowley@intel.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 13:15:54 -06:00
Greg V
8ff8c82630 swr: fix clang 5 null cast warning
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:15:56 +00:00
Gert Wollny
ea89843b3d mesa/program: Fix -Wunused-param warning
v2: Don't annotate, but remove the unused ctx parameter

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-19 15:45:57 +00:00
Gert Wollny
81d8a0f4a4 mesa/program/prog_execute.c: Silence -Wunused-param
v2: Don't annotate, but remove the unused ctx parameter

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:45:57 +00:00
Gert Wollny
fef4b16523 mesa: Make numSamples an unsigned int
As a followup to the previous patch propagate the change of numSamples
from int to unsigned to gl_config::samples and consequently fix some
-Wsign-compare warnings.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-19 15:45:57 +00:00
Gert Wollny
d0e37599ab gallium: Make (num_)samples an unsigned int
According to the ARB_multisample num_samples is a non-negative integer.
Consequently define it as such, fail in glx/choose_visual if a negative
number is given.

v2: split patch into gallium and mesa part

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:45:57 +00:00
Andres Gomez
7a2c87177a docs: correct a typo in releasing instructions
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:25:53 +02:00
Andres Gomez
7760566ab7 docs: move untar line in basic testing instructions for coherence
For scons, windows/mingw dealing with LLVM_CONFIG is done before
untarring. This is also more convenient for copy and paste.

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:25:39 +02:00
Andres Gomez
bd8537fa71 docs: add a notice whenever a release is the final in a series
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:25:27 +02:00
Andres Gomez
b910eec489 docs: add final release note for 17.2.8
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:25:20 +02:00
Andres Gomez
4b50cfef44 docs: add final release note for 17.1.10
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:20:57 +02:00
Grazvydas Ignotas
e6abc613e2 st/vdpau: release held lock in error path
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-01-19 13:30:22 +02:00
Juan A. Suarez Romero
302ff82434 docs: update calendar, add news and link release notes to 17.3.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-01-19 10:46:18 +01:00
Juan A. Suarez Romero
059db12097 docs: add sha256 checksums for 17.3.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit bc1503b13f)
2018-01-19 10:46:18 +01:00
Juan A. Suarez Romero
3205a45fc3 docs: add release notes for 17.3.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 80f5f279b3)
2018-01-19 10:46:18 +01:00
Samuel Iglesias Gonsálvez
7109a1fe13 anv: avoid segmentation fault due to vk_error()
vk_error() is a macro that calls __vk_errorf() with instance == NULL.

Then, __vk_errorf() passes a pointer to instance->debug_report_callbacks
to vk_debug_error(), which segfaults as this pointer is invalid but not
NULL.

Fixes: e5b1bd6ab8 "vulkan: move anv VK_EXT_debug_report implementation to common code."

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-19 09:39:05 +01:00
Bas Nieuwenhuizen
32170d87e3 ac/nir: Fix vector extraction if source vector has >4 elements.
v2: Add forgotten argument and start offset.

Fixes: 91074bb11b "radv/ac: Implement Float64 SSBO stores."
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-19 02:00:28 +01:00
Bas Nieuwenhuizen
f4211e6f93 ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.
Fixes: 91074bb11b "radv/ac: Implement Float64 SSBO stores."
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-19 02:00:14 +01:00
Bas Nieuwenhuizen
4a9fd90e1e ac/nir: Fix TCS output LDS offsets.
When a channel was not set we also did not increase the LDS address,
while that obviously should happen.

The output loading code was inadvertently fixed which resulted in a
mismatch causing the SaschaWillems tessellation demo to result
in corrupt rendering.

Fixes: 7898eb9a60 "ac: rework load_tcs_{inputs,outputs}"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-19 01:54:59 +01:00
Bas Nieuwenhuizen
bd5c942cef radv: Use correct bindings for inputRate in key generation.
The bindings also have an index field.

Fixes: 49d035122e "radv: Add single pipeline cache key."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104677
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 01:54:59 +01:00
Bas Nieuwenhuizen
b1444c9ccb radv: Implement VK_ANDROID_native_buffer.
Passes
  dEQP-VK.api.smoke.*
  dEQP-VK.wsi.android.*

with android-cts-7.1_r12 .

Unlike the initial anv implementation this does
use syncobjs instead of waiting on the CPU.

This is missing meson build coverage for now.

One possible todo is that linux 4.15 now has a
sycall that allows us to export amdgpu fence to
a sync_file, which allows us not to force all
fences and semaphores to use syncobjs. However,
I had trouble with my kernel crashing regularly
with NULL pointers, and I'm not sure how beneficial
it is in the first place given that intel uses
syncobjs for all fences if available.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen
a3e241ed07 radv: Add create image flag to not use DCC/CMASK.
If we import an image, we might not have space in the
buffer for CMASK, even though it is compatible.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen
e344cd8178 radv: Generate VK_ANDROID_native_buffer.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen
0f89f9b8eb radv: Replace an assert with unreachable.
Otherwise we get uninitialized variable warnings for es_vgpr_comp_cnt.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 00:38:45 +01:00
Bas Nieuwenhuizen
e417ab212b radv: Remove DCC check on CS resolve dst image.
Gives a warning when the assert is disabled, and not even
necessarily true.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 00:38:45 +01:00
George Kyriazis
f76ca91ae0 gallivm: support avx512 (16x32) in interleave2_half
lp_build_interleave2_half was not doing the right thing for avx512-style
16-wide loads.

This path is hit in the swr driver with a 16-wide vertex shader. It is
called from lp_build_transpose_aos, when doing texel fetches and the
fetched data needs to be transposed to one component per output register.

Special-case the post-load swizzle operations for avx512 16x32 (16-wide
32-bit values) so that we move the xyzw components correctly to the outputs.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-18 17:07:06 -06:00
Brian Paul
9e6efdd177 vbo: fix VBO optimization regression
The optimization in change 8e4efdc895 ("vbo: optimize some display
list drawing") missed the loopback case.  This is used when the
glBegin/End primitive doesn't have a uniform set of vertex attributes.
The new Piglit gl-1.0-dlist-materials test hits this.

So check the aligned_vertex_buffer_offset(list) value and adjust the
buffer offset accordingly.

We also need to remove the 'start == 0' assertion in the loopback
code since it no longer applies.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-18 15:07:17 -07:00
Dylan Baker
26bde1e354 meson: ensure that xmlpool_options.h is generated for targets that need it
Currently a couple of gallium targets race with xmlpool_options.h being
generated, don't do that.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-01-18 13:31:47 -08:00
Timothy Arceri
3bccb5dba9 ac: fix visit_ssa_undef() for doubles
V2: use LLVMIntTypeInContext()

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-19 08:09:04 +11:00
Dave Airlie
3153d74207 ac/nir: account for view index in the user sgpr allocation.
The view index user sgpr wasn't being accounted for properly,
this refactors out the code to decide if it's required and then
uses that info to account for it.

Fixes: 180c1b924e (ac/nir: Add shader support for multiviews.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 19:47:40 +00:00
Dave Airlie
5758a8c402 r600: enable ARB_enhanced_layouts
Only one piglit test fails,
sso-vs-gs-fs-array-interleave

There are 3 tests using ssbo without checking sizes failing also
but those are test bugs.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-19 05:33:44 +10:00
Chris Wilson
34499e8ddc intel: Future-proof ring names for aubinator_error_decode
The kernel is moving to a $class$instance naming scheme in preparation
for accommodating more rings in the future in a consistent manner. It is
already using the naming scheme internally, and now we are looking at
updating some soft-ABI such as the error state to use the new naming
scheme. This of course means we need to teach aubinator_error_decode how
to map both sets of ring names onto its register maps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2018-01-18 17:35:21 +00:00
Kenneth Graunke
3e18c53e59 i965: Bind null render targets for shadow sampling + color.
Portal 2 appears to bind RGBA8888_UNORM textures to a sampler2DShadow,
and calls shadow2D() on it.  This causes undefined behavior in OpenGL.

Unfortunately, our sampler appears to hang in this scenario, which is
not acceptable.  Just give them a null surface instead, which returns
all zeroes.

Fixes GPU hangs in Portal 2 on Kabylake.

Huge thanks to Jason Ekstrand for noticing this crazy behavior while
sifting through crash dumps.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104487
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-18 09:32:28 -08:00
Iago Toral Quiroga
7ec6e4e689 anv/query: implement multiview interactions
From the Vulkan spec with KHX extensions:

  "If queries are used while executing a render pass instance that has
   multiview enabled, the query uses N consecutive query indices
   in the query pool (starting at query) where N is the number of bits
   set in the view mask in the subpass the query is used in.

   How the numerical results of the query are distributed among the
   queries is implementation-dependent. For example, some implementations
   may write each view's results to a distinct query, while other
   implementations may write the total result to the first query and write
   zero to the other queries. However, the sum of the results in all the
   queries must accurately reflect the total result of the query summed
   over all views. Applications can sum the results from all the queries to
   compute the total result."

In our case we only really emit a single query (in the first query index)
that stores the aggregated result for all views, but we still need to manage
availability for all the other query indices involved, even if we don't
actually use them.

This is relevant when clients call vkGetQueryPoolResults and pass all N
queries to retrieve the results. In that scenario, without this patch,
we will never see queries other than the first being available since we
never emit them.

v2: we need the same treatment for timestamp queries.

v3 (Jason):
 - Better an if instead of an early return.
 - We can't write to this memory in the CPU, we should use
   MI_STORE_DATA_IMM and emit_query_availability (Jason).

v4 (Jason):
 - No need to take the value to write as parameter, just hard code it to 0.

Fixes test failures in some work-in-progress CTS multiview+query tests.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-18 16:37:06 +01:00
Emil Velikov
c9b2cb7897 vc5: add missing files to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-18 11:36:36 +00:00
Emil Velikov
393cf04fa4 broadcom: add missing headers to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-18 11:21:35 +00:00
Mario Kleiner
d67ef48580 i965/screen: Allow drirc to set 'allow_rgb10_configs' again.
Since setup of ALLOW_RGB10_CONFIGS was moved to i965's own
brw_config_options.xml, this was hard-coded to false and
could not be overriden by drirc. Add some parsing into
i965's private screen->optionCache to enable drirc again.

Fixes: b391fb26df ("dri_util: remove ALLOW_RGB10_CONFIGS option (v2)")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-18 08:18:47 +02:00
Samuel Iglesias Gonsálvez
eac629deb6 anv: return VK_ERROR_OUT_OF_DEVICE_MEMORY when surface size is out of HW limits
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-18 06:48:47 +01:00
Timothy Arceri
9248f72c4e ac: tidy up array indexing logic
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-18 15:59:27 +11:00
Rob Clark
4c69961daf mesa/st: translate SO info in glsl_to_nir() case
This was handled for VS, but not for GS.

Fixes for gallium drivers using nir:
spec@arb_gpu_shader5@arb_gpu_shader5-xfb-streams-without-invocations
spec@arb_gpu_shader5@arb_gpu_shader5-xfb-streams*
spec@arb_transform_feedback3@arb_transform_feedback3-ext_interleaved_two_bufs_gs*
spec@ext_transform_feedback@geometry-shaders-basic
spec@ext_transform_feedback@* use_gs
spec@glsl-1.50@execution@geometry@primitive-id*
spec@glsl-1.50@execution@geometry@tri-strip-ordering-with-prim-restart gl_triangle_strip *
spec@glsl-1.50@transform-feedback-builtins
spec@glsl-1.50@transform-feedback-type-and-size

v2: don't call st_translate_program_stream_output) for TCS

v3: drop scanning patch outputs as TCS can't output xfb

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-01-18 15:35:58 +11:00
Dave Airlie
44a27cdcec r600/sb: add lds related peepholes.
if no destination:
a) convert _RET instructions to non _RET variants if no dst
b) set src0 to undefined if it's a READ, this should get DCE then.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:38:17 +00:00
Dave Airlie
3bb2b2cc45 r600/sb: use different stacks for tracking lds and queue usage.
The normal ssa renumbering isn't sufficient for LDS queue access,
this uses two stacks, one for the lds queue, and one for the
lds r/w ordering.

The LDS oq values are incremented in their use in a linear
fashion.
The LDS rw values are incremented in their definitions and used
in the next lds operation to ensure reordering doesn't occur.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:38:09 +00:00
Dave Airlie
8cfec333c0 r600/sb: schedule LDS ops in appropriate places.
So LDS ops have to be SLOT_X,
and LDS OQ reads have read port restrictions so we try
and force those into only having one per slot and avoiding
bank swizzles.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:38:05 +00:00
Dave Airlie
71a50de4fc r600/sb: hit the scheduler with a big hammer to avoid lds splits.
This tries to avoid an lds queue read getting scheduled separately
from an lds ret read, the non-sb code uses the same style of hammer,
this isn't foolproof.

We can do better, but it's a bit tricky, as you have to scan ahead
and either schedule more lds oq moves and more lds reads and that
could lead to you running out of space anyways.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:56 +00:00
Dave Airlie
46549bd6b6 r600/sb: adding lds oq tracking to the scheduler
This adds support for tracking the lds oq read/writes
so can avoid scheduling other things in between.

This patch just adds the tracking and assert to show
problems.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:52 +00:00
Dave Airlie
5002dd4052 r600/sb: add gcm support to avoid clause between lds read/queue read
You have to schedule LDS_READ_RET _, x and MOV reg, LDS_OQ_A_POP
in the same basic block/clause. This makes sure once we've issues
and MOV we don't add another block until we balance it with an
LDS read.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:42 +00:00
Dave Airlie
046cf68cad r600/sb: handle lds special dest registers.
This adds lds to the geom emit handling

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:39 +00:00
Dave Airlie
d72590032f r600/sb: handle LDS operations in folding.
Don't try and fold LDS using expressions.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:35 +00:00
Dave Airlie
c314b0a27a r600/sb: add finalising for lds output queue special values.
We need to convert these to the hw special registers.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:27 +00:00
Dave Airlie
9f3a1e9b0c r600/sb: add initial support for parsing lds operations.
This handles parsing the LDS ops and queue accessess.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:13 +00:00
Dave Airlie
795512b235 r600/sb: disable if conversion for hs
This fixes bad interactions with the LDS special values.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:01 +00:00
Dave Airlie
1ca2eb3bf3 r600/sb: lds ops have no dst register.
Although these are op3s they don't have a dst reg.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:36:52 +00:00
Dave Airlie
09c1c13c44 r600/sb: introduce special register values for lds support.
For LDS read/write ordering we use the LDS_RW value, reads
will wait on previous writes.
For LDS read/read from LDS queue ordering we use the LDS_OQ
values, we define two for now, though initially we'll just
support OQA.

Also add the check for the lds oq values

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:36:47 +00:00
Dave Airlie
2f2cef385f r600/sb: update last_cf if alu is the last clause
It's rare to have a final alu clause on normal shaders (exports)
but tess shaders write to LDS as their output, so we see some
alu clauses, and the CF_END get put in the wrong place.

This makes sure to update last_cf correctly.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:36:41 +00:00
Dave Airlie
da977ad907 r600/sb: start adding GDS support
This adds support for GDS ops to sb backend.

This seems to work for atomics and tess factor writes.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:35:37 +00:00
Dave Airlie
05f5282d63 r600/sb: add tess/compute initial state registers.
This stops them being optimised out.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:35:12 +00:00
Dave Airlie
68b976bd91 r600/sb: fix a bug emitting ar load from a constant.
Some tess shaders were doing MOVA_INT _, c0.x on cayman, and then
hitting an assert in sb_bc_finalize.cpp:translate_kcache.

This makes sure the toplevel kcache tracker gets updated,
and the clause gets fixed up.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:34:46 +00:00
Dave Airlie
7efcafce7c r600/shader: only emit add instruction if param has a value.
Just saves a pointless a = a + 0;

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:34:43 +00:00
Dave Airlie
2bd01adf14 r600: emit 0 gds_op for tf write.
This field is ignored for tf writes so should be 0.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:34:36 +00:00
Dave Airlie
9041730d1c r600: add support for ARB_shader_clock.
Reviewed-by: Gert Wollny <gw.fossedev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 13:25:59 +10:00
Dave Airlie
6785034a70 radv/ws: get rid of useless return value
This also used boolean, so nice to kill that.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 01:57:53 +00:00
Bas Nieuwenhuizen
2ce11ac11f radv: Initialize DCC on transition from preinitialized.
Looks like the decompress does not handle invalid encodings well,
which happens with random memory. Of course apps should not use it
with random memory, but they are allowed to ....

Fixes: 44fcf58744 "radv: Disable DCC for GENERAL layout and compute transfer dest."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-18 01:57:52 +01:00
Timothy Arceri
e2b9296146 ac: fix buffer overflow bug in 64bit SSBO loads
Fixes: 441ee1e65b "radv/ac: Implement Float64 SSBO loads"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-18 10:26:58 +11:00
Timothy Arceri
409e15f26f ac: fix nir_intrinsic_get_buffer_size for radeonsi
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-18 10:25:20 +11:00
Kenneth Graunke
d139b5e4cc i965: Pass brw_growing_bo to grow_buffer().
Cleaner.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-17 13:13:26 -08:00
Kenneth Graunke
81ca8e69e3 i965: Make a helper for recreating growing buffers.
Now that we have two of these, we're duplicating a bunch of this logic.
The next commit will add more logic, which would make the duplication
seem worse.

This ends up setting EXEC_OBJECT_CAPTURE on the batch, which isn't
necessary (it's already captured), but it should be harmless.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-17 13:13:26 -08:00
Kenneth Graunke
02c1c25b1a i965: Replace cpu_map pointers with a "use_shadow_copy" boolean.
Having a boolean for "we're using malloc'd shadow copies for all
buffers" is cleaner than having a cpu_map pointer for each.  It was
okay when we had one buffer, but this is more obvious.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-17 13:13:26 -08:00
Francisco Jerez
11674dad8a intel/fs: Optimize and simplify the copy propagation dataflow logic.
Previously the dataflow propagation algorithm would calculate the ACP
live-in and -out sets in a two-pass fixed-point algorithm.  The first
pass would update the live-out sets of all basic blocks of the program
based on their live-in sets, while the second pass would update the
live-in sets based on the live-out sets.  This is incredibly
inefficient in the typical case where the CFG of the program is
approximately acyclic, because it can take up to 2*n passes for an ACP
entry introduced at the top of the program to reach the bottom (where
n is the number of basic blocks in the program), until which point the
algorithm won't be able to reach a fixed point.

The same effect can be achieved in a single pass by computing the
live-in and -out sets in lock-step, because that makes sure that
processing of any basic block will pick up the updated live-out sets
of the lexically preceding blocks.  This gives the dataflow
propagation algorithm effectively O(n) run-time instead of O(n^2) in
the acyclic case.

The time spent in dataflow propagation is reduced by 30x in the
GLES31.functional.ssbo.layout.random.all_shared_buffer.5 dEQP
test-case on my CHV system (the improvement is likely to be of the
same order of magnitude on other platforms).  This more than reverses
an apparent run-time regression in this test-case from my previous
copy-propagation undefined-value handling patch, which was ultimately
caused by the additional work introduced in that commit to account for
undefined values being multiplied by a huge quadratic factor.

According to Chad this test was failing on CHV due to a 30s time-out
imposed by the Android CTS (this was the case regardless of my
undefined-value handling patch, even though my patch substantially
exacerbated the issue).  On my CHV system this patch reduces the
overall run-time of the test by approximately 12x, getting us to
around 13s, well below the time-out.

v2: Initialize live-out set to the universal set to avoid rather
    pessimistic dataflow estimation in shaders with cycles (Addresses
    performance regression reported by Eero in GpuTest Piano).
    Performance numbers given above still apply.  No shader-db changes
    with respect to master.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104271
Reported-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:56:08 -08:00
Marek Olšák
63b231309e gallium: remove PIPE_CAP_USER_CONSTANT_BUFFERS
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:18:00 +01:00
Marek Olšák
85bbcdda34 st/mesa: assume that user constant buffers are always supported
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Marek Olšák
5981a5226e nine: assume that user constant buffers are always supported
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Marek Olšák
e871abe452 gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Marek Olšák
e411d2572b st/mesa: expose ARB_sync unconditionally
All drivers support it.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Marek Olšák
3778a0a533 gallium: remove PIPE_CAP_TWO_SIDED_STENCIL
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Brian Paul
f631a6ae8c glsl: remove unneeded extern "C" {} bracketing around Mesa includes
The two headers already have the right extern "C" annotations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
7421e34dd6 mesa: move gl_external_samplers() to program.[ch]
The function is only called from a couple places.  It doesn't make
sense to have it in mtypes.h

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
6dc8896726 st/mesa: include util/bitscan.h in st_glsl_to_tgsi_temprename.cpp
And use "" instead of <> for including Mesa headers, as we do elsewhere.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
741d423478 glsl: include util/bitscan.h in serialize.cpp
Instead of relying on indirect inclusion of the header.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
2f14146200 util: include string.h in u_dynarray.h
To get memset() prototype.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
02c0734adc mesa: remove unneeded #includes of main/compiler.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
7845397183 st/mesa: remove unneeded #includes of main/compiler.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
bafa33befd st/mesa: include main/compiler.h in st_cb_queryobj.c
To get CPU_TO_LE32() macro.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
7de0262f8f mesa: include util/macros.h in format_fallback.c
To get definition of unreachable() macro.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
484ac243f6 mesa: include compiler.h in disk_cache.c
Instead of indirect inclusion to get CPU_TO_LE32() macro.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
ad00a78993 mesa/program: change validate_inputs() local var 'inputs' to GLbitfield64
Both state->prog->info.inputs_read and state->InputsBound are GLbitfield64
so it seems that the OR of those values should be of the same type.
I'm not sure this fixes any actual issues though.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-17 11:17:56 -07:00
Brian Paul
d8306de4ac vbo: reindent vbo_attrib.h to use 3 spaces
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
ef01d911ee vbo: whitespace, formatting fixes in vbo_exec_api.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
18f4241b89 vbo: add assertions, comments in vbo_exec_api.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
5e8962b58f vbo: whitespace, formatting fixes in vbo_exec_draw.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
f5140c1a6b vbo: use inputs_read var to simplify code
v2: add some const qualifiers, per Ian.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
e6e78d98a2 vbo: whitespace, formatting fixes in vbo_split_copy.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
380165d399 vbo: use a new local 'array' variable in bind_vertex_list() loop
Make the code a bit more concise.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
37863c38f3 vbo: remove unneeded #includes in vbo_context.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
76a08eeeec vbo: whitespace, formatting fixes in vbo_context.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
95dec9097f vbo: change vbo_context attribute map arrays to GLubyte
The values will never be larger than VBO_ATTRIB_MAX (currently 44).

v2: add STATIC_ASSERT to be sure VBO_ATTRIB_MAX can fit in ubyte,
per Emil.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
5d78440d58 vbo: lift common code out of switch cases
Both switch cases began with the same code.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul
8e4efdc895 vbo: optimize some display list drawing (v2)
The vbo_save_vertex_list structure records one or more glBegin/End
primitives which all have the same vertex format.

To draw these primitives, we setup the vertex array state, then
issue the drawing command.  Before, the 'start' vertex was typically
zero and we used the vertex array pointer to indicate where the
vertex data starts.

This patch checks if the vertex buffer offset is an exact multiple of
the vertex size.  If so, that means we can use zero-based vertex array
pointers and use the draw's start value to indicate where the vertex
data starts.

This means a series of display list drawing commands may have
identical vertex array state.  This will get filtered out by the
Gallium CSO module so we can issue a tight series of drawing commands
without state changes to the device.

Note that this also works for a series of glCallList commands (not
just one list that contains multiple glBegin/End pairs).

No Piglit or conform changes.

v2: minor fixes suggested by Ian.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
4edc8fdcdc vbo: rewrite some code in playback_copy_to_current()
I think this is a little easier to understand.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
ad3b162b81 vbo: add some comments in vbo_save_api.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
ee86c34664 vbo: rename some functions in vbo_save_api.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
efb5ce1c5e vbo: rename some functions in vbo_save_draw.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
2064c4e814 vbo: add comment that vbo_save_vertex_list::buffer_offset is in bytes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
8b06acf642 vbo: minor code simplification in _save_compile_vertex_list()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
4acc8a0de3 vbo: rename prim to prims
Using a plural name makes it easier to see that this is an array and
not a pointer to a single object.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
841d1839e2 vbo: removed unused ctx parameter for alloc_prim_store()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
3442672df4 vbo: rename vbo_save_context::buffer to buffer_map
And move the field and improve comments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
62fd12e5bc vbo: remove unused vbo_save_context::count field
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
1863b18ce6 vbo: s/GLuint/GLbitfield/ for vbo_save_context::replay_flags
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
65df0d41cf vbo: rename vbo_save_vertex_list::count to vertex_count
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
b04195b959 vbo: rename vbo_save_vertex_store::buffer to buffer_map
To match other parts of the VBO code and make things easier to understand.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
213bef62c3 vbo: rename vbo_save_primitive_store::buffer to prims
A little easier to understand.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
1e9ca36058 vbo: whitespace fixes in vbo_save.h
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
c40e9034be vbo: whitespace fixes in vbo_save_draw.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul
7027d9c1fd svga: add num-commands-per-draw HUD query
This query shows the ratio of total commands vs. drawing commands sent
to the vgpu device.  This gives some idea of how many state changes
are sent per draw call.  The closer the ratio is to 1.0, the better.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-17 11:17:56 -07:00
Brian Paul
92840bd276 gallium/hud: Fix support for PIPE_DRIVER_QUERY_TYPE_FLOAT
Evidently, nobody has used PIPE_DRIVER_QUERY_TYPE_FLOAT up to this
point.  Adding a driver query of this type which returns the query
value in pipe_query_result::f resulted in garbage output in the HUD.

The problem is the pipe_query_result::f field was being accessed as
through the u64 field and being added to the query_info::results_cumulative
field.  This patch checks for PIPE_DRIVER_QUERY_TYPE_FLOAT in a few
places and scales the float by 1000 before converting to uint64_t.

Also, add some comments to explain the query_info::result_index field.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
23a5fae317 gallium/hud: remove uint64_t casts in sensor query_sti_load() function
The hud_graph_add_value() function takes a double value, so just pass
the current/critical values as-is since they're doubles.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
f774f2b52f gallium/hud: compute cpu load, percent with doubles
The hud_graph_add_value() function takes a double precision value,
so compute it that way.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul
541f569a19 gallium/hud: s/unsigned/enum pipe_query_type/
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-17 11:17:56 -07:00
Jon Turney
3c35dad1df meson: Set with_dri from with_gallium when DRI glx is explicitly configured
Set with_dri from with_gallium when DRI GLX is explicitly configured, as
well as when DRI GLX is chosen automatically.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-17 18:05:18 +00:00
George Kyriazis
22a2027dd7 meson: add llvm dependency for swr build
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-17 11:38:57 -06:00
Leo Liu
d1833b8cd8 st/va: add break for MPEG4 data buffer handling case
Signed-off-by: Leo Liu <leo.liu@amd.com>
2018-01-17 08:31:38 -05:00
Leo Liu
3d0b561f34 st/va: remove TODO line for JPEG data buffer handling
Nothing to do

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-01-17 08:31:38 -05:00
Rhys Kidd
996a397719 i915: No longer rely on compatability define in intel_bufmgr.h
Symbol rename from dri_* to drm_intel_* introduced a number of compatability
defines within intel_bufmgr.h.

Replace the old function with the new function, consistent with the balance
of this file.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-17 08:08:04 -05:00
Timothy Arceri
1256ab18c1 radeonsi: bump glsl version to 450 for nir backend
We still have more work to do but piglit results are looking
pretty good.

At GLSL 1.50 we have 30647/31118 piglit tests passing.
At GLSL 4.50 we have 37927/38551 piglit tests passing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-18 00:03:33 +11:00
Timothy Arceri
b282207c32 radeonsi/nir: add some missing tcs bits to the nir scan pass
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-18 00:03:33 +11:00
Timothy Arceri
7898eb9a60 ac: rework load_tcs_{inputs,outputs}
This shares more code and calls the new shared load_tess_varyings()
abi so that the radeonsi nir path now supports tcs output loads.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-18 00:03:33 +11:00
Timothy Arceri
9622b445c8 ac/radeonsi: add tcs load outputs support
The code to load outputs is essentially the same as load inputs
so we make the interface more generic to maximise code sharing.

We will make use of the new support in the following patch.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-18 00:03:33 +11:00
Timothy Arceri
a20016d827 st/glsl_to_tgsi: add ARB_get_program_binary support using TGSI
This resolves a game bug in Dead Island. The game doesn't properly
handle ARB_get_program_binary with 0 supported formats, and ends up
crashing.

This will enable ARB_get_program_binary binary support for any
driver that currently enables the on-disk shader cache.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564
2018-01-17 23:43:28 +11:00
Timothy Arceri
a34262aed7 st/glsl_to_tgsi: add st_get_program_binary_driver_sha1() helper
This will be used by ARB_get_program_binary.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 23:43:28 +11:00
Timothy Arceri
9eebc55cc2 st/glsl_to_tgsi: add (de)serialise program helpers
These will be shared between the on-disk shader cache and
ARB_get_program_binary.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 23:43:28 +11:00
Timothy Arceri
dbf7e483b4 st/glsl_to_tgsi: stop passing pipe_shader_state to st_store_tgsi_in_disk_cache()
We can instead just get this from st_*_program.

V2: store tokens to to st_compute_program before attempting to
    write to cache (fixes crash).

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 23:43:28 +11:00
Timothy Arceri
c69b0dd681 st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program
We will need this for ARB_get_program_binary binary support.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 23:43:28 +11:00
Michel Dänzer
7b0e8264dd loader/dri3: Try to make sure we only process our own NotifyMSC events
We were using a sequence counter value to wait for a specific NotifyMSC
event. However, we can receive events from other clients as well, which
may already be using higher sequence numbers than us. In that case, we
could stop processing after an event from another client, which could
have been received significantly earlier. This would have multiple
undesirable effects:

* The computed MSC and UST values would be lower than they should be
* We could leave a growing number of NotifyMSC events from ourselves and
  other clients in XCB's special event queue

I ran into this with Firefox and Thunderbird, whose VSync threads both
seem to use the same window. The result was sluggish screen updates and
growing memory consumption in one of them.

Fix this by checking the XCB sequence number and MSC value of NotifyMSC
events, instead of using our own sequence number.

v2:
* Use the Present event ID for the sequence parameter of the
  PresentNotifyMSC request, as another safeguard against processing
  events from other clients
* Rebase on drawable mutex changes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> # v1
2018-01-17 11:40:22 +01:00
Bas Nieuwenhuizen
0b8991c0b6 radv: Implement VK_EXT_debug_report.
This is not hooked up to any messages yet, but useful for e.g.
renderdoc if you add some messages during development.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-17 11:29:04 +01:00
Bas Nieuwenhuizen
e5b1bd6ab8 vulkan: move anv VK_EXT_debug_report implementation to common code.
For also using it in radv. I moved the remaining stubs back to
anv_device.c as they were just trivial.

This does not move the vk_errorf/anv_perf_warn or the object
type macros, as those depend on anv types and logging.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-17 11:27:52 +01:00
Timothy Arceri
f69cbb2b53 st/glsl_to_nir: disable io lowering to temps for tess
Lowering these to temps makes a big mess, and results in some
piglit test failures. Also the radeonsi backend (the only backend
to support tess) has support for indirects so there is no need to
lower them anyway.

Fixes the following piglit tests on radeonsi:

tests/spec/arb_tessellation_shader/execution/variable-indexing/tes-input-array-vec3-index-rd.shader_test
tests/spec/arb_tessellation_shader/execution/variable-indexing/tes-input-array-vec4-index-rd.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-17 17:14:14 +11:00
Jason Ekstrand
af10ce21ff i965: Enable CCS_E sampling of sRGB textures as UNORM
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-16 21:41:32 -08:00
Jason Ekstrand
96aa558715 i965/draw: Do resolves properly for textures used by TXF
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-16 21:41:32 -08:00
Jason Ekstrand
361e1df1ed i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage
This commit unifies the CCS_E and CCS_D cases.  This should fix a couple
of subtle issues.  One is that when you use INTEL_DEBUG=norbc to disable
CCS_E, we don't get the sRGB blending workaround.  By unifying the code,
we give CCS_D that workaround as well.

The second issue fixed by this refactor is that the blending workaround
was appears to be enabled on all gens but really only applies on gen9.
Due to a happy accident in the way code was laid out, it was only
getting enabled on gen9: gen8 and earlier don't support non-zero-one
clear colors, and gen10 supports sRGB for CCS_E so it got caught in the
format_ccs_e_compat_with_miptree case.  This refactor moves it above the
format_ccs_e_compat_with_miptree case so it's an explicit early exit and
makes it explicitly only on gen9.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
2018-01-16 21:41:32 -08:00
Jason Ekstrand
f79bb2e651 Re-enable regular fast-clears (CCS_D) on gen9+
This reverts commit ee57b15ec7, "i965:
Disable regular fast-clears (CCS_D) on gen9+".  How taht we've fixed the
issue with too many different aux usages in the render cache, it should
be safe to re-enable CCS_D for sRGB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104163
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
2018-01-16 21:41:32 -08:00
Jason Ekstrand
d84275b884 i965: Track format and aux usage in the render cache
This lets us perform render cache flushes whenever a surface goes from
being used with one aux+format to a different aux+format.

This is the "proper" fix for https://bugs.freedesktop.org/102435.
ee57b15ec7 which was really just a partial
revert of 3e57e9494c was just a hack to
get rid of a hang in a bunch of Valve games.  This solves the actual
problem responsible for the hang and lets us enable CCS_E once again.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102435
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
2018-01-16 21:41:32 -08:00
Jason Ekstrand
622786c20c i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer
This makes sure we flush things out of other caches prior to using a
surface through the render cache.  Currently, this is a no-op because GL
won't let you bind anything other than a color surface as color so it
should never end up in the depth cache.  However, this does complete the
flush/add_bo pair for regular drawing which will be required for the
next commit.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
2018-01-16 21:41:32 -08:00
Francisco Jerez
53d8508f1d i965/gen6-7/sol: Bump primitive counter BO size.
Improves performance of SynMark2 OglGSCloth by a further 9.65%±0.59%
due to the reduction in overwraps of the primitive count buffer that
lead to a CPU stall on previous rendering.  Cummulative performance
improvement from the series 81.50% ±0.96% (data gathered on VLV).

Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-16 16:03:56 -08:00
Francisco Jerez
f476b3f6e7 i965/gen6-7/sol: Keep independent counters for the current and previous begin/end block.
This allows us to aggregate the primitive counts of a completed
transform feedback begin/end block lazily, which in the most typical
case (where glDrawTransformFeedback is not used) will allow us to
avoid aggregating the primitive counters on the CPU altogether,
preventing a stall on previous rendering during
glBeginTransformFeedback(), which dramatically improves performance of
applications that rely heavily on transform feedback.

Improves performance of SynMark2 OglGSCloth by 65.52% ±0.25% (data
gathered on VLV).

Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-16 16:03:48 -08:00
Francisco Jerez
b0c8d61281 i965/gen6-7/sol: Restructure primitive counter into a separate type.
A primitive counter encapsulates a scalar aggregating counter for each
vertex stream along with a section within the primitive tally buffer
which hasn't been read out yet.  Defining this as a separate type will
allow us to keep multiple counter objects around for the same
transform feedback object without any code duplication.

Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-16 16:03:42 -08:00
Timothy Arceri
dc520dafdc st/mesa: enable ARB_enhanced_layouts on nir drivers
I'm guessing this may have been disable because of missing
component packing support. However recent nir linking changes
required nir based gallium drivers to support component packing
so this should now be ok to enable.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-17 10:42:55 +11:00
Roland Scheidegger
b0413cfd8b draw: remove VSPLIT_CREATE_IDX macro
Just inline the little bit of code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-17 00:01:19 +01:00
Roland Scheidegger
1f462eaf39 draw: fix vsplit code when the (post-bias) index value is -1
vsplit_add_cache uses the post-bias index for hashing, but the
vsplit_add_cache_uint/ushort/ubyte ones used the pre-bias index, therefore
the code for handling the special case (because -1 matches the initialization
value of the cache) wasn't actually working.
Commit 78a997f728 actually simplified the
cache logic somewhat, but it looks like this particular problem carried over
(and duplicated to the ushort/ubyte cases, since before only uint needed it).
This could lead to the vsplit cache doing the wrong thing, in particular
later fetch_info might indicate there are 0 values to fetch. This only really
affected edge cases which were bogus to begin with, but it could lead to a
crash with the jit vertex shader, since it cannot handle this case correctly
(the count loop is always executed at least once and we would not allocate
any memory for the shader outputs), so add another assert to catch it there.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-01-17 00:01:19 +01:00
Grazvydas Ignotas
0ad73031ec st/va: release held locks in error paths
Found with the help of following Coccinelle semantic patch:
// <smpl>
@@
expression E;
@@

  \(pthread_mutex_lock\|mtx_lock\|simple_mtx_lock\)(E)
  ...
(
  \(pthread_mutex_unlock\|mtx_unlock\|simple_mtx_unlock\)(E);
  ...
  return ...;
|
+ maybe need_unlock(E);
  return ...;
)
// </smpl>

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-01-17 00:39:55 +02:00
Grazvydas Ignotas
cce982a70b mesa: remove unneeded semicolons
Trivial. Found by Coccinelle.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-17 00:39:55 +02:00
Grazvydas Ignotas
e3adb1abaf radeon: remove unneeded semicolons
Trivial. Found by Coccinelle.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-17 00:39:55 +02:00
Grazvydas Ignotas
6129c03cc7 osmesa: don't check SmoothFlag twice
Trivial. Found by Coccinelle.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-17 00:39:55 +02:00
Samuel Pitoiset
05f73b9672 ac: set no-signed-zeros-fp-math when RADV_DEBUG="unsafemath" is used
This is an optimisation that is recommended by Matt Arsenault,
and used by RadeonSI, but it's not compatible with Vulkan.

Note that AC_FLOAT_MODE_UNSAFE_FP_MATH includes the no signed
zeros flag in LLVM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-16 21:39:57 +01:00
Samuel Pitoiset
4f5318df2c ac: set fast math flags when RADV_DEBUG="unsafemath" is used
When that debug option is not used, we use the default float mode
because the no signed zeros optimisation is not Vulkan compatible.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-16 21:39:55 +01:00
Samuel Pitoiset
2091206ad3 ac: import lp_create_builder() from gallivm
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-16 21:39:53 +01:00
Samuel Pitoiset
ad2b3b2a9c ac: replace llvm.AMDGPU.kilp by llvm.amdgcn.kill with LLVM 6
This also replaces llvm.AMDGPU.kilp by llvm.AMDGPU.kill with
LLVM < 6. Similar to RadeonSI codepath.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-16 21:39:51 +01:00
Juan A. Suarez Romero
9b894c88a6 glsl/linker: link-error using the same name in unnamed block and outside
According with OpenGL GLSL 4.20 spec, section 4.3.9, page 57:

   "It is a link-time error if any particular shader interface
    contains:
      - two different blocks, each having no instance name, and each
        having a member of the same name, or
      - a variable outside a block, and a block with no instance name,
        where the variable has the same name as a member in the block."

This means that it is a link error if for example we have a vertex
shader with the following definition.

  "layout(location=0) uniform Data { float a; float b; };"

and a fragment shader with:

  "uniform float a;"

As in both cases we refer to both uniforms as "a", and thus using
glGetUniformLocation() wouldn't know which one we mean.

This fixes KHR-GL*.shaders.uniform_block.common.name_matching.

v2: add fixed tests (Tapani)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-16 19:42:35 +01:00
Samuel Thibault
47ac11bcf8 glx: fix non-dri build
glXGetDriverConfig parameters do not provide a context to dynamically
check for the presence of the function, so the dispatcher directly calls
glXGetDriverConfig, but in non-dri builds dri_glx.c didn't provide
glXGetDriverConfig.

This change make it just return NULL in that case.

Fixes: 84f764a759 "glxglvnddispatch: Add missing dispatch for GetDriverConfig
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-16 16:27:19 +00:00
Indrajit Das
338638a8af st/va: clear pointers for mpeg2 quantiser matrices
This is to fix VA-API issues with GStreamer and MPEG2.
Since gstreamer does not pass quantiser matrices with each frame, invalid
pointers were being passed to the driver. This patch addresses the same.

Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-01-16 10:51:15 +01:00
Indrajit Das
f5277e8492 radeon/vcn: update quantiser matrices only when requested
Only update them when the pointers are valid.

Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-01-16 10:51:00 +01:00
Indrajit Das
38dee62c9a radeon/uvd: update quantiser matrices only when requested
Only upload them when the pointers are valid.

Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-01-16 10:50:08 +01:00
Adam Jackson
1cbcd70b64 Revert "docs: Mark GLX_ARB_context_flush_control done"
This reverts commit d547e18184.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104490
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-01-15 13:48:41 -05:00
Adam Jackson
138f4e3805 Revert "gallium/dri2: Enable {GLX_ARB,EGL_KHR}_context_flush_control"
This reverts commit 0d044351b7.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104490
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-01-15 13:48:39 -05:00
Adam Jackson
f2a5d27ce2 Revert "i965: Enable flush control"
This reverts commit 6ce9006d76.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104490
Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-01-15 13:48:16 -05:00
Samuel Pitoiset
8045f01e2a Revert "ac/shader: gather If TES reads TESSINNER or TESSOUTER"
This can't work for two reasons:
- TESSINNER/TESSOUTER are shader input values, so never translated
to the intrinsic ops
- the shader info pass scans the current stage but we want to know
in TCS, if TES reads the tess factors.

This fixes 6 regressions related to
deqp-vk/tessellation/shader_input_output/tess_level_{inner,outer}_XXX_tes

This reverts commit 5ba1a61648.
2018-01-15 13:47:18 +01:00
Samuel Pitoiset
5842cb0df1 amd/common: fix loading InstanceID for tess on < GFX9
InstanceID is in VGPR2, not 1.

One more failure that CTS didn't catch up...

Reported-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-15 11:59:16 +01:00
Samuel Pitoiset
5ba1a61648 ac/shader: gather If TES reads TESSINNER or TESSOUTER
This shouldn't be scanned in the pipeline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-15 11:51:47 +01:00
Samuel Pitoiset
aebde47840 ac: remove ac_shader_variant_info::fs::output_mask
Unused.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-15 11:48:42 +01:00
Gert Wollny
5d6470d26b r600/shader: Initialize max_driver_temp_used correctly for the first time
Without this initialization the temp registers used in tgsi_declaration
may used random indices, and this may result in failing translation from TGSI
with an error message "GPR limit exceeded", because the random index is greater
then the allowed limit implying that the shader uses more temporary registers then
available.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-15 09:14:22 +10:00
Rob Clark
f10bd0a0e1 freedreno/ir3: "soft" depth scheduling for SFU instructions
First try with a "soft" depth, to try to schedule sfu instructions
further from their consumers, but fall back to hard depth (which might
result in stalling) if nothing else is avail to schedule.

Previously the consumer of a sfu instruction could end up scheduled
immediately after (since "hard" depth from sfu to consumer would be 0).
This works because legalize pass would insert a (ss) sync bit, but it
is sub-optimal since it would cause a stall.

Instead prioritize other instructions for 4 cycles if they would no
cause a nop to be inserted.  This minimizes the stalling.  There is a
slight penalty in general to overall # of instructions in shader (since
we could end up needing nop's later due to scheduling the "deeper" sfu
consumer later), but ends up being a wash on register pressure.

Overall this seems to be worth a 10+% gain in fps.  Increasing the
"soft" depth of sfu consumer beyond 4 helps a bit in some cases, but 4
seems to be a good trade-off between getting 99% of the gain and not
increasing instruction count of shaders too much.

It's possible a similar approach could help for tex/mem instructions,
but the (sy) sync bit seems to trigger a switch to a different thread-
group to hide memory latency (possibly with some limits depending on
number of registers used?).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-01-14 16:14:19 -05:00
Rob Clark
50f9a9aa96 freedreno/a5xx: work around SWAP vs TILE_MODE constraint
If the blit isn't changing format, but is changing tiling, just lie and
call things ARGB (since the exact component order doesn't matter for a
tiling blit).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-01-14 16:14:19 -05:00
Rob Clark
39b63c18f1 freedreno/a5xx: texture tiling
Overall a nice 5-10% gain for most games.  And more for things like
glmark2 texture benchmark.

There are some rough edges.  In particular, the hardware seems to only
support tiling or component swap.  (Ie. from hw PoV, ARGB/ABGR/RGBA/
BGRA are all the same format but with different component swap.)  For
tiled formats, only ARGB is possible.  This isn't a big problem for
*sampling* since we also have swizzle state there (and since
util_format_compose_swizzles() already takes into account the component
order, we didn't use COLOR_SWAP for sampling).  But it is a problem if
you try to render to a tiled BGRA (for example) surface.

The next patch introduces a workaround for blitter, so we can generate
tiled textures in ABGR/RGBA/BGRA, but that doesn't help the render-
target case.  To handle that, I think we'd need to keep track that the
tiled format is different from the linear format, which seems like it
would get extra fun with sampler views/etc.

So for now, disabled by default, enable with FD_MESA_DEBUG=ttile.  In
practice it works fine for all the games I've tried, but makes piglit
grumpy.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-01-14 16:13:39 -05:00
Rob Clark
868b02cfb4 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-01-14 16:10:06 -05:00
Rob Clark
16b91c2254 freedreno: add screen->setup_slices() for tex layout
The rules are sufficiently different for a5xx with tiled textures, so
split this out into something that can be implemented per-generation.
The a5xx specific implementation will come in a later patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-01-14 16:10:06 -05:00
Grazvydas Ignotas
047c6fe2c5 r300g: remove double assignment
Trivial. Found by Coccinelle.
2018-01-14 23:04:49 +02:00
Grazvydas Ignotas
6acf22a179 util: use faster zlib's CRC32 implementaion
zlib provides a faster slice-by-4 CRC32 implementation than the
traditional single byte lookup one used by mesa. As most supported
platforms now link zlib unconditionally, we can easily use it.

Improvement for a 1MB buffer (avg MB/s, n=100, zlib 1.2.8):

  i5-6600K                    C2D E4500
mesa zlib                    mesa zlib
 443 1443 225% +/- 2.1%       403 1175 191% +/- 0.9%

It has been verified the calculation results stay the same after this
change.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-14 19:10:33 +02:00
Grazvydas Ignotas
87f723408b android,configure,meson: define HAVE_ZLIB
The next change wants to use some optional zlib functionality, however
not all platforms currently use zlib. Based on earlier Jordan Justen's
patches and their review feedback.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-14 18:52:23 +02:00
Grazvydas Ignotas
b7347cc313 util/crc32: don't drop the const qualifier
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-14 18:47:50 +02:00
Timothy Arceri
e6378962ce ac: add doubles support to isign
Fixes a number of int64 piglit tests, for example:

generated_tests/spec/arb_gpu_shader_int64/execution/built-in-functions/fs-sign-i64vec2.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-14 11:40:03 +11:00
Timothy Arceri
38876c88d1 ac: add i64_0 and i64_1 to llvm build context
These will be used in the following patch.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-14 11:40:03 +11:00
Timothy Arceri
741b21b713 ac/nir: fix translation of nir_op_b2i for doubles
V2: just zero-extend the 32-bit value.

Fixes a number of int64 piglet tests, for example:

generated_tests/spec/arb_gpu_shader_int64/execution/conversion/frag-conversion-explicit-bool-int64_t.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-14 11:40:03 +11:00
Mauro Rossi
4d61eb8018 ac: fix build error in si_shader
assert() is replaced by unreachable(), to avoid following building error:

external/mesa/src/gallium/drivers/radeonsi/si_shader.c:1967:1:
error: control may reach end of non-void function [-Werror,-Wreturn-type]
}
^
1 error generated.

Fixes: c797cd6 ("ac: add load_patch_vertices_in() to the abi")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-13 18:13:07 +11:00
Timothy Arceri
f0d74ecce8 radv/radeonsi/nir: lower 64bit flrp
Fixes a bunch of arb_gpu_shader_fp64 piglit tests for example:

generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-mix-double-double-double.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-13 18:04:40 +11:00
Eric Anholt
5bc0b63799 broadcom/vc5: Use MSF to ignore discards/non-dispatched channels in loops.
Prevents potential infinite loops when a non-dispatched or discarded
channel never triggers the loop break condition.
2018-01-12 21:58:24 -08:00
Eric Anholt
762dd52951 broadcom/vc5: Use XOR instead of SUB for execute flags comparisons.
I think this should be equivalent other than power, and it's the kind of
comparison we use for nir_op_ieq.
2018-01-12 21:58:18 -08:00
Eric Anholt
8e4cba9d92 broadcom/vc5: Also check the update flags for avoiding DCE.
I was trying to do a NULL-destination UF, and it got removed.
2018-01-12 21:58:11 -08:00
Eric Anholt
ff77ca8a3b broadcom/vc5: Fix up channel swizzling for textures on 4.x.
I had 3.x putting swizzling in the texture state only for 16-bit texture
returns, and in the shader for 32-bit.  This may be due to having mixed up
the return channel setup on 3.x back before I had moved it into the
compiler.  On 4.x, the non-border-color texwrap tests are passing nicely
with both 16 and 32-bit returns with swizzling in the texture state.
2018-01-12 21:58:04 -08:00
Eric Anholt
e41e33fdb8 broadcom/vc5: Port the draw-time state emission to V3D 4.1. 2018-01-12 21:57:52 -08:00
Eric Anholt
aa77a9cf5a broadcom/vc5: Rename V3D 3.x Flat Shade Action to match v4.x naming.
Now that the actions are reused for centroid and nonperspective, give them
a more generic name.
2018-01-12 21:57:45 -08:00
Eric Anholt
8e4c705515 broadcom/vc5: Update pixel center setup for V3D 4.x.
The fxcd/fycd instructions now return half-integer pixel centers when not
doing sample-rate shading.
2018-01-12 21:57:37 -08:00
Eric Anholt
95873a184e broadcom/vc5: Print the buffer name in simulator overflow checks.
Revealed that I was writing past the TSDA, not the Z buffer as I expected.
2018-01-12 21:57:30 -08:00
Eric Anholt
368bab43fd broadcom/vc5: Add support for loading varyings in V3D 4.1.
The LDVARY signal now writes an arbitrary register, so I took out the
magic src register file and replaced it with an instruction with LDVARY
set so we have somewhere to hang a QFILE_TEMP destination for register
allocation.
2018-01-12 21:57:21 -08:00
Eric Anholt
af9753e246 broadcom/vc5: Update state setup for V3D 4.1. 2018-01-12 21:57:09 -08:00
Eric Anholt
5aaea3c4a0 broadcom/vc5: Add compiler support for V3D 4.x texturing. 2018-01-12 21:56:57 -08:00
Eric Anholt
028f6b327c broadcom/vc5: Add the new TMU write addresses for V3D 4.x (and r5rep).
The V3D 3.x series of TMU writes with meaning depending on the texture
type is replaced with writes to specific registers for each texture
argument semantic.
2018-01-12 21:56:48 -08:00
Eric Anholt
42a35da96d broadcom/vc5: Move V3D 3.3 texturing to a separate file.
V3D 4.x texturing changes enough that #ifdefs would just make a mess of
it.
2018-01-12 21:56:37 -08:00
Eric Anholt
acf30e4916 broadcom/vc5: Move V3D 3.3 VPM write setup to a separate file.
For V4.1 texturing, I need the V4.1 XML, so the main compiler needs to
stop including V3.3 XML.
2018-01-12 21:56:24 -08:00
Eric Anholt
725d73981a broadcom/vc5: Set up depth formats for V3D 4.x.
We no longer have the small depth-specific output format enum, and instead
depth is just at the end of the output image format enum.
2018-01-12 21:56:17 -08:00
Eric Anholt
66f2f3ed97 broadcom/vc5: Always use the RGBA8 formats for RGBX8.
The RGBX8 formats were dropped from V3D 4.x, but we don't really need them
anyway (we already handle other non-alpha formats by forcing A to 1).
2018-01-12 21:56:10 -08:00
Eric Anholt
469bbd8387 broadcom/vc5: Move the formats table to per-V3D-version compile. 2018-01-12 21:56:00 -08:00
Eric Anholt
34898c8c45 broadcom/vc5: Add support for V3D 4.1 CLIF dumping. 2018-01-12 21:55:49 -08:00
Eric Anholt
409696b76e broadcom/vc5: Move the body of CLIF dumping to a per-version file.
I want the library's entrypoints to still be unversioned, but the actual
packet dumping needs to be per-version.
2018-01-12 21:55:38 -08:00
Eric Anholt
90269ba353 broadcom/vc5: Use THRSW to enable multi-threaded shaders.
This is a major performance boost on all of V3D, but is required on V3D
4.x where shaders are always either 2- or 4-threaded.
2018-01-12 21:55:30 -08:00
Eric Anholt
86a12b4d5a broadcom/vc5: Properly schedule the thread-end THRSW.
This fills in the delay slots of thread end as much as we can (other than
being cautious about potential TLBZ writes).

In the process, I moved the thread end THRSW instruction creation to the
scheduler.  Once we start emitting THRSWs in the shader, we need to
schedule the thread-end one differently from other THRSWs, so having it in
there makes that easy.
2018-01-12 21:55:23 -08:00
Eric Anholt
a075bb6726 broadcom/vc5: Implement GFXH-1684 workaround.
Apparently the VPM writes need to be flushed out before we end the shader.
2018-01-12 21:55:15 -08:00
Eric Anholt
57965755e2 broadcom/vc5: Port drawing commands to V3D 4.x.
This required extending the CL submit ioctl, because the tile alloc/state
buffer setup has moved from the BCL to register writes.
2018-01-12 21:55:04 -08:00
Eric Anholt
f50d39ab49 broadcom/vc5: Add a test for .ifb in ADD ops.
I had a .ifb being decoded weird in sampid, so this is to check that .ifb
is fine.
2018-01-12 21:54:57 -08:00
Eric Anholt
267f13dbee broadcom/vc5: Add the new tesselation opcodes in V3D 4.1. 2018-01-12 21:54:50 -08:00
Eric Anholt
edbd817c30 broadcom/vc5: Use a physical-reg-only register class for LDVPM.
This is needed for LDVPM on V3D 4.x, but will also be needed for keeping
values out of the accumulators across THRSW.
2018-01-12 21:54:42 -08:00
Eric Anholt
22a02f3e34 broadcom/vc5: Use the new LDVPM/STVPM opcodes on V3D 4.1.
Now, instead of a magic write register for VPM stores we have an
instruction to do them (which means no packing of other ALU ops into it),
with the ability to reorder the VPM stores due to the offset being baked
into the instruction.

VPM loads also gain the ability to be reordered by packing the row into
the A argument.  They also no longer write to the r3 accumulator, and
instead must be stored to a physical register.
2018-01-12 21:54:33 -08:00
Eric Anholt
55f8a01aca broadcom/vc5: Drop dead VC5_QPU_* defines from qpu_instr.c.
I had all the packing code in this file at one point, but these defines
now live in qpu_pack.c.
2018-01-12 21:54:27 -08:00
Eric Anholt
2bd378647b broadcom/vc5: Add support for QPU pack/unpack/disasm of small immediates. 2018-01-12 21:54:18 -08:00
Eric Anholt
5f227ac210 broadcom/vc5: Enable the driver on V3D 4.1 2018-01-12 21:54:12 -08:00
Eric Anholt
39ce1ab7ba broadcom/vc5: Port the simulator to support V3D 4.1
This required moving the register accesses to a separate v3dx file, since
the register definitions for each V3D version collide.  It seems that
initializing the v3d_hw from a file dictating 3.3
(v3d_simulator_wrapper.cpp) is safe, though.
2018-01-12 21:54:00 -08:00
Eric Anholt
c81cc767e4 broadcom/vc5: Drop signal bit #defines.
Signals are more complicated than that, and tables ended up being better.
2018-01-12 21:53:53 -08:00
Eric Anholt
dfee62eed3 broadcom/vc5: Add support for V3Dv4 signal bits.
The WRTMUC replaces the implicit uniform loads in the first two texture
instructions.  LDVPM disappears in favor of an ALU op.  LDVARY, LDTMU,
LDTLB, and LDUNIF*RF now write to arbitrary registers, which required
passing the devinfo through to a few more functions.
2018-01-12 21:53:45 -08:00
Eric Anholt
81ec2ba229 broadcom/vc5: Fix pack/unpack of vfmul input unpack flags. 2018-01-12 21:53:38 -08:00
Eric Anholt
954a704da3 broadcom/vc5: Port the RCL setup to V3D4.1.
The TLB load/store path is rebuilt in this version.  There is no longer a
single-byte resolved store or the 3-byte extended store.  Instead, you get
to always use general loads/stores (which, honestly, was tempting even in
previous versions).
2018-01-12 21:53:26 -08:00
Eric Anholt
80c84241af broadcom/vc5: Fix per-tile extra clear packet.
I accidentally emitted this into the RCL instead of the per-tile generic
list, so we wouldn't get tiles after the first cleared.
2018-01-12 21:52:02 -08:00
Eric Anholt
f13fe510d1 broadcom/vc5: Move the TLB loads and stores to helper functions.
This is going to get more complicated with V3D 4.1 support, which redoes
all the TLB packets.
2018-01-12 21:51:54 -08:00
Eric Anholt
2c48ce74f7 broadcom/vc5: Convert vc5_cl.h to use the V3DX() macros.
To conditionally compile cl_emit() macros per V3D version, we need it to
expand to whatever V3D we're building for.  This required emitting #define
V3D_VERSION 33 in all our currently 3.3-only code.
2018-01-12 21:51:47 -08:00
Eric Anholt
fb4face86a broadcom/vc5: Introduce v3dx_macros.h and v3dx_pack.h headers.
This will be used by vc5 for prefixing functions and including the pack
header in v3d-version-dependent code, following the model of anv.
2018-01-12 21:51:40 -08:00
Eric Anholt
7dedfd9660 broadcom/cle: Fix error path of missing a "type" in the XML.
We try to emit a #error and continue so that you can debug the missing
type at C compile time, but were missing a couple of definitions in that
path (sigh, python).
2018-01-12 21:51:34 -08:00
Eric Anholt
3d8ad50370 broadcom/vc5: Add XML for V3D v4.1 (BCM7278) 2018-01-12 21:48:07 -08:00
Samuel Pitoiset
0eb30d81c4 ac: add 'const' qualifiers to the shader info pass
For clarification purposes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-12 12:25:21 +01:00
Samuel Pitoiset
20f7f9a328 ac: remove unused ac_nir_compiler_options from gather_info_input_decl()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-12 12:25:19 +01:00
Samuel Pitoiset
d5e369ff8a nir: add a 'const' qualifier to nir_ssa_def_components_read()
To avoid compilation warnings and because this helper
shouldn't update anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-12 12:25:17 +01:00
Thomas Hellstrom
897c54d522 loader/dri3: Avoid freeing renderbuffers in use
Upon reception of an event that lowered the number of active back buffers,
the code would immediately try to free all back buffers with an id equal to or
higher than the new number of active back buffers.

However, that could lead to an active or to-be-active back buffer being freed,
since the old number of back buffers was used when obtaining an idle back
buffer for use.

This lead to crashes when lowering the number of active back buffers by
transitioning from page-flipping to non-page-flipping presents.

Fix this by computing the number of active back buffers only when trying to
obtain a new back buffer.

Fixes: 15e208c4cc ("loader/dri3: Don't accidently free buffer holding new back content")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104214
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Andriy.Khulap <andriy.khulap@globallogic.com>
Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2018-01-12 09:17:35 +01:00
Samuel Iglesias Gonsálvez
e63adf8b1e anv: VkDescriptorSetLayoutBinding can have descriptorCount == 0
From Vulkan spec:

"descriptorCount is the number of descriptors contained in the binding,
accessed in a shader as an array. If descriptorCount is zero this
binding entry is reserved and the resource must not be accessed from
any stage via this binding within any pipeline using the set layout."

Fixes:

dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2018-01-12 07:08:51 +01:00
Roland Scheidegger
734bef372d mesa: require at least 14 UBOs for GL 4.3
ARB_ubo requires 12 UBOs (per stage) at least, but this limit has been
raised by GL 4.3 to 14, so don't advertize GL 4.3 without it (only checking
the vertex stage since all drivers probably have the same limit anyway for
other stages). (piglit has minmax tests for that kind of thing, but they go
only up to 3.3, so this won't really be noticed.)
I think this currently should not affect any driver - r600 until very
recently only supported 12 but now advertizes 14 too.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-12 02:52:10 +01:00
Roland Scheidegger
85377dc55c util: fix NORETURN for msvc, add HAVE_FUNC_ATTRIBUTE_NORETURN to c99_compat.h
We've seen some problems internally due to macro redefinition.
Fix this by adding HAVE_FUNC_ATTRIBUTE_NORETURN to c99_compat.h,
and defining it for msvc.
And avoid redefinition just in case.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-01-12 02:52:10 +01:00
Dave Airlie
ad11fc3571 radv: don't emit unneeded vertex state.
If the number of instances hasn't changed and we've already
emitted it, don't emit it again.

If the vertex shader is the same and the first_instance, vertex_offset
haven't changed don't emit them again.

This increases the fps in GL_vs_VK -t 1 -m -api vk from around 40
to around 60 here, it may not impact anything else.

Dieter also reported smoketest going from 1060->1200 fps.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-12 00:43:07 +00:00
Dave Airlie
e37db93246 radv: trim buffer load result (fixes dota2)
Running dota2 since the below commit crashes with an llvm assert.

Trim the vector like the other user. This possible could also be
avoided by not padding inside the load vec3->vec4.

Fixes: 41c36c4549 (amd/common: use ac_build_buffer_load() for emitting UBO loads)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-12 00:41:55 +00:00
Dylan Baker
aca3b647be meson: add variable for including include/GL/internal
Signed-off-by: <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-11 15:40:02 -08:00
Dylan Baker
5fcadaec80 meson: define inc_gbm as empty if not otherwise assigned
Otherwise this could be undefined in the egl directory.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-11 15:40:02 -08:00
Dylan Baker
a0a764cde5 meson: move libsensors dependency to libgallium
This simplifies the build by removing the need to link targets against
libsensors.

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-11 15:40:02 -08:00
Dylan Baker
2083a14179 meson: Use dependencies for nir
This creates two new internal dependencies, idep_nir_headers and
idep_nir. The former encapsulates the generation of nir_opcodes.h and
nir_builder_opcodes.h and adding src/compiler/nir as an include path.
This ensures that any target that needs nir headers will have the
includes and that the generated headers will be generated before the
target is build. The second, idep_nir, includes the first and
additionally links to libnir.

This is intended to make it easier to avoid race conditions in the build
when using nir, since the number of consumers for libnir and it's
headers are quite high.

Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-11 15:40:02 -08:00
Dylan Baker
60856a7b49 meson: don't use intermediate variables that are immediately discarded
For things like:
loop
    x = func()
    list += x
end

just do:
loop
    list += func()
end

Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-11 15:40:02 -08:00
Dylan Baker
4ccb981673 meson: Use consistent style for tests
Don't use intermediate variables, use consistent whitespace.

Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-11 15:40:02 -08:00
Dylan Baker
8e981eb2b7 meson: Use include variables
These were added after adderlib was mesonified, but it still good to use
them instead of open coding them.

Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-11 15:40:02 -08:00
Dylan Baker
fbf192a67e meson: Use consistent style
Currently the meosn build has a mix of two styles:
arg : [foo, ...
       bar],

and
arg : [
  foo, ...,
  bar,
]

For consistency let's pick one. I've picked the later style, which I
think is more readable, and is more common in the mesa code base.

v2: - fix commit message

Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-11 15:40:02 -08:00
Jason Ekstrand
c3d802d68e i965: Use UD types for gl_SampleID setup
We already had to switch all of the W types to UW to prevent issues
with vector immediates on gen10.  We may as well use unsigned types
everywhere.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-01-11 14:31:47 -08:00
Jason Ekstrand
3d2b157e23 i965/fs: Use UW types when using V immediates
Gen 10 has a strange hardware bug involving V immediates with W types.
It appears that a mov(8) g2<1>W 0x76543210V will actually result in g2
getting the value {3, 2, 1, 0, 3, 2, 1, 0}.  In particular, the bottom
four nibbles are repeated instead of the top four being taken.  (A mov
of 0x00003210V yields the same result.)  This bug does not appear in any
hardware documentation as far as we can tell and the simulator does not
implement the bug either.

Commit 6132992cdb was mostly a no-op
except that it changed the type of the subgroup invocation from UW to W
and caused us to tickle this bug with basically every compute shader
that uses any sort of invocation ID (which is most of them).  This is
also potentially an issue for geometry shader input pulls and SampleID
setup.  The easy solution is just to change the few places where we use
a vector integer immediate with a W type to use a UW type.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Fixes: 6132992cdb
2018-01-11 14:31:38 -08:00
Timothy Arceri
30c1a93f6d ac/nir: fix translation of nir_op_fsign for doubles
Without this we end up with the llvm error message:

"Both operands to a binary operator are not of the same type!"

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:18 +11:00
Timothy Arceri
d7b6b8ba52 ac: add f64_0 to the llvm build context
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:18 +11:00
Timothy Arceri
7b971c828a ac/nir: fix translation of nir_op_frcp for doubles
Without this we end up with the llvm error message:

"Both operands to a binary operator are not of the same type!"

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:18 +11:00
Timothy Arceri
24575c815c ac/nir: fix translation of nir_op_frsq for doubles
Without this we end up with the llvm error message:

"Both operands to a binary operator are not of the same type!"

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:17 +11:00
Timothy Arceri
c0eb304acd ac: add f64_1 to the llvm build context
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:17 +11:00
Bas Nieuwenhuizen
b9f4c615f8 radv: reset semaphores & fences on sync_file export.
Per spec:

"Additionally, exporting a fence payload to a handle with copy transference has the same side effects
on the source fence’s payload as executing a fence reset operation. If the fence was using a
temporarily imported payload, the fence’s prior permanent payload will be restored."

And similar for semaphores:

"Additionally, exporting a semaphore payload to a handle with copy transference has the same side
effects on the source semaphore’s payload as executing a semaphore wait operation. If the
semaphore was using a temporarily imported payload, the semaphore’s prior permanent payload
will be restored."

Fixes: 42bc25a79c "radv: Advertise sync fd import and export."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-11 21:56:13 +01:00
Anuj Phogat
fe668b5c15 intel: Add more Coffee Lake PCI IDs
More Coffee Lake PCI IDs have been added to the spec.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-01-11 10:16:54 -08:00
Matt Turner
c0ef14f5b1 Revert "Revert "i965/fs: Use align1 mode on ternary instructions on Gen10+""
This reverts commit 2d04572038.

Acked-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-01-11 10:11:59 -08:00
Matt Turner
01ebfbb67a i965/fs: Add/use functions to convert to 3src_align1 vstride/hstride
Some cases weren't handled, such as stride 4 which is needed for 64-bit
operations. Presumably fixes the assertion failure mentioned in commit
2d04572038 (Revert "i965/fs: Use align1 mode on ternary instructions
on Gen10+") but who can really say since the commit neglected to list
any of them!

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-01-11 10:11:59 -08:00
Alex Smith
4fd85617c1 anv: Make sure state on primary is correct after CmdExecuteCommands
After executing a secondary command buffer, we need to update certain
state on the primary command buffer to reflect changes by the secondary.
Otherwise subsequent commands may not have the correct state set.

This fixes various issues (rendering errors, GPU hangs) seen after
executing secondary command buffers in some cases.

v2 (Jason Ekstrand):
 - Reset to invalid values instead of pulling from the secondary
 - Change the comment to be more descriptive

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2018-01-11 18:11:08 +00:00
Brian Paul
bb951d45f2 svga: simplify failure code in emit_rss_vgpu9()
No need for a goto.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-11 08:06:38 -07:00
Brian Paul
8f884b83d4 svga: remove unused fail parameter to EMIT_RS(), EMIT_RS_FLOAT()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-11 08:06:38 -07:00
Brian Paul
879ea9432d svga: add assertion in svga_queue_rs()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-11 08:06:38 -07:00
Brian Paul
66c6cec612 svga: whitespace/formatting fixes in svga_state_rss.c
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-11 08:06:38 -07:00
Andres Gomez
a1901d092c anv: Import mako templates only during execution of anv_extensions
anv_extensions usage from anv_icd was bringing the unwanted dependency
of mako templates for the latter. We don't want that since it will
force the dependency even for distributable tarballs which was not
needed until now.

Jason suggested this approach.

v2: Patch simplification (Jason).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104551
Fixes: 0ab04ba979 ("anv: Use python to generate ICD json files")
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-11 14:44:03 +02:00
Tapani Pälli
f2c0e47d9c glsl: cleanup shader_cache header guard
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-11 08:04:56 +02:00
Samuel Iglesias Gonsálvez
c0816389c2 anv: fix maxDescriptorSet* limits
"The maxDescriptorSet* limit is n times the corresponding
maxPerStageDescriptor* limit, where n is the number of shader stages
supported by the VkPhysicalDevice. If all shader stages are supported,
n = 6 (vertex, tessellation control, tessellation evaluation,
geometry, fragment, compute)."

Fixes:

dEQP-VK.api.info.device.properties

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-11 07:00:42 +01:00
Timothy Arceri
c797cd605a ac: add load_patch_vertices_in() to the abi
Fixes the follow test for radeonsi nir:

tests/spec/arb_tessellation_shader/execution/quads.shader_test

Also stops 8 other tests from crashing, they now just fail e.g.

tcs-output-array-float-index-rd-after-barrier.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-11 14:28:37 +11:00
Bas Nieuwenhuizen
67e09c8b45 ac/nir: Sanitize location_frac for local variables.
If they were promoted from inputs/outputs, they could have a
non-zero value left over, which messed with our store handling.

Fixes: 06f05040eb "radv: Link shaders."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-11 00:56:52 +01:00
Rob Herring
af8fd38996 tgsi: include struct definitions for tgsi_build declarations
Many of the functions declared in tgsi_build.h return structs (not struct
pointers). Therefore the full struct definitions are needed to avoid
warnings or errors:

In file included from src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:23:
external/mesa3d/src/gallium/auxiliary/tgsi/tgsi_build.h:47:1: error: 'tgsi_build_header' has C-linkage specified, but returns incomplete type 'struct tgsi_header' which could be incompatible with C [-Werror,-Wreturn-type-c-linkage]

This error shows up on Android builds using clang and -Werror.

Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-10 14:56:09 -06:00
George Kyriazis
5c4081d66d swr: Handle indirect indices in GS
BuilderSWR::swr_gs_llvm_fetch_input() (and consequently
 swr_gs_llvm_fetch_input()), did not handle the case where
is_vindex_indirect or is_aindex_direct is set.

Implement it, using the code in draw_llvm.c as a guideline.

Fixes the following piglit tests:
dynamic_input_array_index (crash)
gs-input-array-vec4-index-rd
vs-output-array-vec4-index-wr-before-gs

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10 14:02:17 -06:00
Samuel Pitoiset
41c36c4549 amd/common: use ac_build_buffer_load() for emitting UBO loads
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-10 19:02:27 +01:00
Samuel Pitoiset
7239e265eb amd/common: import get_{load,store}_intr_attribs() from RadeonSI
v2: move those helpers to the header and use static inline

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
2018-01-10 19:02:23 +01:00
Marek Olšák
b391fb26df dri_util: remove ALLOW_RGB10_CONFIGS option (v2)
This is unused because it's for libGL/libEGL, not drivers.

v2: i965 was wrong, because it used dri_util instead of its own config.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-10 17:52:56 +01:00
Tim Rowley
c259888c52 swr/rast: switch win32 jit format to COFF
Allows for call-stack and exception handling for jitted functions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10 09:44:07 -06:00
Tim Rowley
3d4d34e380 swr/rast: don't use 32-bit gathers for elements < 32-bits in size
Using a gather for elements less than 32-bits in size can cause
pagefaults when loading the last elements in a page-aligned-sized
buffer.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10 09:44:07 -06:00
Tim Rowley
f5f1bbcb5c swr/rast: autogenerate named structs instead of literal structs
Results in far smaller and useful IR output.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10 09:44:07 -06:00
Tim Rowley
04d0bfde39 swr/rast: SIMD16 fetch shader jitter cleanup
Bake in USE_SIMD16_BUILDER code paths (for USE_SIMD16_SHADER defined),
remove USE_SIMD16_BUILDER define, remove deprecated psuedo-SIMD16 code
paths.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10 09:44:07 -06:00
Tim Rowley
d3a4c8057d swr/rast: shuffle header files for msvc pre-compiled header usage
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10 09:44:07 -06:00
Tim Rowley
e14b48e00e swr/rast: SIMD16 builder - cleanup naming (simd2 -> simd16)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10 09:44:07 -06:00
Ian Romanick
336afe7d7a glsl/linker: Safely generate mask of possible locations
If MaxAttribs were ever raised to 32, undefined behavior would occur.
We had already gone to the effort (albeit incorrectly) handle this in
one case, so fix them all.

CID: 1369628
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:12 -08:00
Ian Romanick
0c9df36157 glsl/linker: Mark no locations as invalid instead of marking all locations
If max_index were ever 32, the linker would have marked all 32
locations as invalid instead of marking none of them as invalid.  It's
a good thing the maximum value actually set by any driver for
MaxAttribs is 16.

Found by inspection while investigating CID 1369628.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:11 -08:00
Ian Romanick
702dc43f7e glsl: Don't handle visit_stop in several ::accept methods
All cases where the result could be non-visit_continue would have
already returned.

CID: 401351, 1224465, 1224466
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:11 -08:00
Ian Romanick
a170f27958 glsl: Remove unnecessary assignments to type
None of these are necessary because result->type is the only thing used
outside the giant switch-statement.

CID: 1230983, 1230984
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:11 -08:00
Ian Romanick
fd2f4f507f nir: Silence unused parameter warnings
In file included from src/compiler/nir/nir_opt_algebraic.c:4:0:
src/compiler/nir/nir_search_helpers.h: In function ‘is_not_const’:
src/compiler/nir/nir_search_helpers.h:118:59: warning: unused parameter
‘num_components’ [-Wunused-parameter]
 is_not_const(nir_alu_instr *instr, unsigned src, unsigned num_components,
                                                           ^~~~~~~~~~~~~~
src/compiler/nir/nir_search_helpers.h:119:29: warning: unused parameter
‘swizzle ’ [-Wunused-parameter]
              const uint8_t *swizzle)
                             ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:11 -08:00
Bas Nieuwenhuizen
d0ef3d4bb0 radv: Remove some typos.
Trivial.
2018-01-10 13:26:27 +01:00
Bas Nieuwenhuizen
5db0bf9994 radv: Implement VK_EXT_discard_rectangles.
Tested with a modified deferred demo and no regressions in a 1.0.2
mustpass run.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 13:26:22 +01:00
Bas Nieuwenhuizen
11b9cdd2d7 radv: Add mapping between dynamic state mask and external enum.
The EXT values are really large, e.g.
VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT = 1000099000, so 1 << value
is not going to fit into a 32-bit mask.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 13:24:31 +01:00
Samuel Pitoiset
7145b20afb amd/common: bump the number of available user SGPRS to 32 on GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-10 12:35:08 +01:00
Samuel Pitoiset
a1f1f708c0 radv: remove radv_pipeline_layout::push_constant_stages field
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-10 12:31:57 +01:00
Samuel Pitoiset
d43f50c00b amd/common: do not rely on the pipeline for the push constants logic
It makes more sense to rely on nir_intrinsic_load_push_constant
instead of the pipeline layout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-10 12:31:54 +01:00
Samuel Pitoiset
4e701cf75c radv/gfx9: calculate the number of ES VGPRs for merged shaders
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-10 12:31:53 +01:00
Samuel Pitoiset
232c418af5 radv/gfx9: enable LDS for GS only if the ES type is TES
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-10 12:31:51 +01:00
Samuel Pitoiset
9e2395faf5 amd/common: determine the ES type (VS or TES) for the GS on GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-10 12:31:49 +01:00
Iago Toral Quiroga
9ef5b3d517 i965/nir: lower TES PatchVerticesIn to a constant when a TCS is present
When a TCS is present at link time we know the number of vertices in the
patch and we can lower gl_PatchVerticesIn in the TesEval stage directly
to a constant. We already have a pass for this that we use in the
Vulkan pipeline, so we just reuse that.

Notice that the GLSL linker also implements this optimization, which
we are not removing because other drivers may still depend on it, so
this should only be useful for OpenGL SPIR-V shaders for now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-10 08:21:02 +01:00
Iago Toral Quiroga
7e5c81235f glsl: remove Lower{TCS,TES}PatchVerticesIn
Intel was the only user and now NIR can do the lowering.

v2: do not try to handle it as a system value directly for the SPIR-V
    path. In GL we rather handle it as a uniform like we do for the
    GLSL path (Jason).

v3: drop LowerTESPatchVerticesIn as well (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-10 08:21:02 +01:00
Iago Toral Quiroga
dae856eced i965: lower gl_PatchVerticesIn to a uniform
We want this here instead of nir_lower_system_values because for
Vulkan we don't want this lowering to take place.

v2: do not try to handle it as a system value directly for the SPIR-V
    path. In GL we rather handle it as a uniform like we do for the
    GLSL path (Jason).

v3: do this also for the TessEval stage (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-10 08:21:02 +01:00
Iago Toral Quiroga
4317c848b9 i965/nir: add a helper to lower gl_PatchVerticesIn to a uniform
v2: do not try to handle it as a system value directly for the SPIR-V
    path. In GL we rather handle it as a uniform like we do for the
    GLSL path (Jason).

v3:
  - Remove the uniform variable, it is alwats -1 now (Jason)
  - Also do the lowering for the TessEval stage (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-10 08:21:02 +01:00
Roland Scheidegger
ea227f4322 r600: don't emit tes samplers/views when tes isn't active
Similar to const buffers. The driver must not emit any tes-related state if tes
is disabled, since the hw slots are all shared by VS, therefore it would
overwrite them (the mesa state tracker might not do this, but it would be
perfectly legal to do so).
Nevertheless I think the dirty state tracking logic in the driver is
fundamentally flawed when tes is disabled/enabled, since it looks to me like
the VS (and TES) state would not get reemitted to the correct slots (if it's
not dirty anyway). Unless I'm missing something...
Theoretically, the overwrite problem could be solved by using non-overlapping
resource slots for TES and VS (since we're not even close to using half the
resource slots), but it wouldn't work for constant buffers nor samplers, and
for VS would still need to propagate changes to both LS and VS, so probably
not a useful idea.
Unfortunately there's zero coverage of this with piglit, since all tessellation
shader tests are just shader_runner tests, which are unsuitable for testing
any kind of state dependency tracking issues (so I can't even quickly hack
something up to proove it and fix it...).
TCS otoh is just fine - like GS it has its own hw slots.

Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
523b6c8704 r600: increase number of UBOs to 15
With the exception of the default tess levels only ever accessed
by the default tcs shader, the LDS_INFO const buffer was only accessed by vtx
instructions, and not through kcache. No idea why really, but use this to our
advantage by not using a constant buffer slot for it. This just requires us to
throw the default tess levels into the "normal" driver const buffer instead.
Alternatively, could acesss those constants via vtx instructions too, but then
we couldn't use a ordinary ureg prog accessing them as constants and would have
to generate that directly when compiling the default tcs shader. (Another
alternative would be to put all lds info into the ordinary driver const
buffer, albeit we'd maybe need to increase the fixed size as it can't fit
alongside the ucp since vs needs access to the lds info too.)

Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
c5162fd3c4 r600: use GET_BUFFER_RESINFO vtx fetch on eg instead of setting up consts
Contrary to what the comment said, this appears to work just fine on my rv770
(tested with piglit textureSize 140 fs/vs samplerBuffer).
Dave Airlie confirmed it working on cayman too.
I have no clue though if it's actually preferrable to use it (unfortunately
we cannot get rid of the tex constants completely, as we still require them
for cube map txq).
Albeit filling in the format (1 channels or 4?) and the stuff related to mega-
or mini-fetch (what the hell is this...) is just a guess based on other usage
of vtx fetch instructions...

v2: it really needs to be done through texture cache (I botched the
testing because sb optimizations turned it automatically into tc, but
can't rely on it and isn't happening on tes).

Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
0be1dc25cf r600: increase number of ubos by one to 14
Ideally we'd support 16 (d3d11 requires 15, and mesa subtracts one for non-ubo
constants), but that's kind of impossible (it would be only doable if either
we'd somehow merge the mesa non-ubo constants with the driver constants, or
only use the driver constants with vtx fetch instead of through the kcache
mechanism - the latter probably wouldn't be too bad).
For now just do as the comment already said, place the gs ring (not really
a const buffer in any case) which is only ever referred to through vc fetch
clauses at index 16. Throw in a couple asserts for good measure to make sure
the hw limit isn't exceeded.

Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
43292c78b7 r600: set up constants needed for txq for buffers and cube maps with tes
We only did this for the other stages, but obviously tess eval/ctrl need it
too.
This fixes the (newly modified) piglit texturing/textureSize test when run
with tes stage and bufferSampler.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
22ba4ebb18 r600: don't emit reloc for ring buffer out into the blue
It looks like this reloc belongs to setting the constant reg, which is skipped
for gs ring.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
76baf99737 r600: hack up num_render_backends on Juniper to 8
Juniper really has a maximum of 4 RBEs (16 pixels). However, predication
always locks up on my HD 5750, and through experiments it looks like if we're
pretending it has a maximum of 8, with 4 disabled, it works correctly.
My conclusion would be that there's a bug (likely firmware, not hw) which
causes the predication logic to try to read 8 results out of the query buffer
instead of just 4, and since of course noone ever writes the upper 4, the
status bit is never set and hence it will wait for it forever.

Ideally this would be fixed in firmware, but I'd guess chances of that
happening are slim.
This will double the size of (occlusion) query result buffers, write the
status bit for the disabled rbs in these buffers, and will also add 8 results
together instead of just 4 when reading them back. The latter is unnecessary,
but it's probably not worth bothering - luckily num_render_backends isn't
used outside of occlusion queries, so don't need separate value for the
"real" maximum.
Also print out the enabled_rb_mask if it changed from the pre-fixed value
(which is already printed out), just in case there's some more problems
with chips which have some rbs disabled...

This fixes all the lockups with piglit nv_conditional_render tests on my
HD 5750 (all pass).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
f0dd1b3612 winsys/radeon: fix up default enabled_rb_mask for r600
The logic had two fatal flaws which completely killed the default value.
1) drm will overwrite the value anyway even if the chip can't be handled
2) the default value logic is relying on num_render_backends, which was
filled in later.
Luckily noone is relying on it, but it's a bit confusing seeing the chip clock
printed out there (as hex) with R600_DEBUG=info...
(Albeit radeonsi does not appear to fix up the value. If kernels which don't
handle this query are still supported, radeonsi will still end up with a broken
enabled_rb_mask, I have no idea of the potential results of this there.)

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
7c0bc495f1 r600: fix enabled_rb_mask on eg/cm
For eg/cm, the r600_gb_backend_map will always be 0. This is a bug in
the drm kernel driver, as it just just never fills the information in
(it is now being fixed - the history shows it was being filled in when
the query was brand new but got lost shortly thereafter with backend_map
fixes).
This causes r600_query_hw_prepare_buffer to write the "status bit"
(just the highest bit of the occlusion query result) even for active rbes
(all but the first). This doesn't make much sense, albeit I suppose it's mostly
safe. According to the commit history, it's necessary to set these bits for
inactive rbes since otherwise predication will lock up - presumably the hw just
is waiting for the status bit to appear, which will never happen with inactive
rbes. I'd guess potentially predication could be wrong (due to not waiting for
the actual result if the status bit is already there) if this is set for
active rbes.

Discovered while trying to fix predication lockups on Juniper (needs another
patch).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
762ccf483a r600: fix sampler indexing with texture buffers sampling
This fixes the new piglit test.
While here also fix up the logic for early exit of setting up driver consts.

Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Reviewed-by: Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger
6c8d6ce982 r600: don't use vtx offset for load_sample_position
The offset looks bogus to me. Albeit in the end it doesn't matter, by the
looks of it offsets smaller than 4 get ignored there (not sure of the rules,
I suppose either non-dword aligned offsets never work there or the offset
must be at least aligned to the size of a single element).

Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Dave Airlie
f4b1ec2972 r600: drop l2 related queries
radeonsi only.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-10 00:56:09 +00:00
Dave Airlie
e836fb2002 r600/shader: only read back the necessary tess factor components.
This just reduces the lds reads for the the tess factor emission.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-10 00:54:32 +00:00
Jon Turney
adfb9c5c7b Fix use of alloca() without #include <c99_alloca.h>
../../../src/mesa/main/shaderapi.c: In function ‘_mesa_ShaderBinary’:
../../../src/mesa/main/shaderapi.c:2188:9: error: implicit declaration of function ‘alloca’ [-Werror=implicit-function-declaration]
2018-01-09 22:07:52 +00:00
Kenneth Graunke
28c2d0d80b genxml: Add missing INSTDONE_1 bits on Gen7.5+.
This will make aubinator_error_decode decode them properly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-09 10:13:53 -08:00
Kenneth Graunke
8eadc2fb8f intel: Apply Geminilake "Barrier Mode" workaround.
Apparently, Geminilake requires you to whack a chicken bit to select
either compute or tessellation mode for barriers.  The recommendation
is to switch between them at PIPELINE_SELECT time.

We may not need to do this all the time, but I don't know that it hurts
either.  PIPELINE_SELECT is already a pretty giant stall.

This appears to fix hangs in tessellation control shaders with barriers
on Geminilake.  Note that this requires a corresponding kernel change,

    drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake.

in order for the register write to actually happen.  Without an updated
kernel, this register write will be noop'd and the fix will not work.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-01-09 10:13:33 -08:00
Emil Velikov
5e7d06fcb0 docs: update calendar, add news and link release notes for 17.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-09 16:13:31 +00:00
Emil Velikov
ea9c548494 docs: add sha256 checksums for 17.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 3a67ca681b)
2018-01-09 16:10:12 +00:00
Emil Velikov
6c73767596 docs: add release notes for 17.3.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0f27052e32)
2018-01-09 16:10:10 +00:00
Indrajit Das
e05d5b0cf3 st/omx_bellagio: Update default intra matrix per MPEG2 spec
Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-01-09 09:10:24 -05:00
Scott D Phillips
42f421cbbf aubinator: add support for aubinating memtrace aubs
Memtrace aubs are similar to classic aubs, with the major
difference being how command submission is serialized (as register
writes instead of a high-level submit message). Some internal
tools generate or consume only memtrace aubs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-08 21:11:11 -08:00
Scott D Phillips
8cdf5bd292 aubinator: extract aubinator_init() out of the header handler function
A later patch will use the aubinator_init() function from the
memtrace aub header handler.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-08 21:11:11 -08:00
Scott D Phillips
4f0a2ff4c1 aubinator: honor --color option when printing the header
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-08 21:11:11 -08:00
Scott D Phillips
161a97c3d5 .gitignore: Ignore new generated files
New generated files from:

  bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
  65fc16c974 ("autotools: set XA versions in configure.ac and configure header file")

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-08 21:11:11 -08:00
Dylan Baker
73ce7cb474 Meson: ensure variable defined
A gallium driver is undefined if passing -Dgallium-drivers=''

Fixes: e0b037d697 ("meson: Build SWR driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2018-01-08 17:43:45 -08:00
Dylan Baker
21bca27349 meson: Fix typo in clover build
The leading space breaks things.

fixes: 42ea0631f1 ("meson: build clover")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-08 17:31:55 -08:00
Dylan Baker
eab0316d10 meson: set opencl flags for r600
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-08 16:39:48 -08:00
Dylan Baker
42ea0631f1 meson: build clover
This has only been compile tested.

v2: - Have a single option for opencl (Eric E)
    - fix typo "tgis" -> "tgsi" (Curro)
    - Don't add "lib" to pipe loader libraries, which matches the
      autotools behavior
v3: - Remove trailing whitespace
    - Make PIPE_SEARCH_DIR an absolute path
v4: - add trailing / to LIBCLC defines

Acked-by: Curro Jerez <currojerez@riseup.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
cc: Aaron Watry <awatry@gmail.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-08 16:39:42 -08:00
Dylan Baker
425fcbde3f meson: Turn on swr for relevant targets
Currently that's dri, libgl-xlib, and osmesa.

v2: - put drivers on a separate line from normal dependencies (Eric E)

cc: George Kyriazis <george.kyriazis@intel.com>
cc: Tim Rowley <timothy.o.rowley@intel.com>
cc: Bruce Cherniak <bruce.cherniak@intel.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-08 16:39:37 -08:00
Dylan Baker
e0b037d697 meson: Build SWR driver
This enables the SWR driver, but doesn't actually hook it up to any of
the targets yet. I felt like this patch was big and complicated enough
without adding that.

v2: - Fix typo 'delemeited' -> 'delimited' (Eric E)
    - Fix type 'errror' -> 'error' (Eric E)
    - Use variables to hold files instead of looking above the current
      meson build (Eric E)
    - Use foreach loops to reduce the number of unique generators
    - Add comment about why some generators have names and some are just
      added to a list
v3: - Remove trailing whitespace

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-08 16:39:30 -08:00
Timothy Arceri
f04d2ca0d9 ac: rework emit_barrier() to not segfault on radeonsi
nir_to_llvm_context will always be NULL for radeonsi so we need
work around this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-09 10:21:32 +11:00
Timothy Arceri
19f3141e6a ac: add load_tess_level() to the abi
Fixes the following piglit tests in radeonsi:

vs-tcs-tes-tessinner-tessouter-inputs-quads.shader_test
vs-tcs-tes-tessinner-tessouter-inputs-tris.shader_test
vs-tes-tessinner-tessouter-inputs-quads.shader_test
vs-tes-tessinner-tessouter-inputs-tris.shader_test

v2: make use of si_shader_io_get_unique_index_patch()
    via the helper in the previous patch rather than
    shader_io_get_unique_index()

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-09 10:21:32 +11:00
Timothy Arceri
2bd7ab32cf radeonsi: add load_tess_level() helper
This will be shared by the tgsi and nir backends.

v2: move si_shader_io_get_unique_index_patch() call inside
    the helper.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-09 10:21:32 +11:00
Jason Ekstrand
9e5aaa93cb spirv: Do implicit conversions of uint to bool in OpStore
Technically, the GLSLang bug related to this can also affect SSBO writes
where the bool -> uint conversion is missing.  However, the only known
shipping application with an old enough version of GLSLang to cause
issues with this is the new DOOM game so we keep the workaround as small
as possible.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
154668e79c spirv: Loosen the validation for load/store type matching
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104338
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
986303cb92 spirv: Require a storage type for OpStore destinations
This rules out things such as trying to store a pointer to a local
variable.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
70f588778c spirv: Add a vtn_types_compatible helper
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
8bad7f33c6 spirv: Store the id of the type in vtn_type
Previously, we were storing a pointer to the vtn_value because we use it
to look up decorations when we create input/output variables.  This
works, but it also may be useful to have the id itself so we may as well
store that instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
53265c8798 spirv: Add a mechanism for dumping failing shaders
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
819adfdfb4 spirv: Rework asserts in var_decoration_cb
Now that higher levels are enforcing decoration sanity, we don't need
the vtn_asserts here.  This function *should* be safe but we still want
a few well-placed regular asserts in case something goes awry.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
71ea4dded5 spirv: Rework error checking for decorations
This reworks the error checking on our generic handling of decorations.
The objective is to validate all of the SPIR-V assumptions we make
up-front and convert redundant checks to compiled-out asserts.  The most
important part of this is to ensure that member decorations only occur
on OpTypeStruct and that the member is never out-of-bounds.  This way
later code can assume that the member is sane and not have to worry
about OOB array access due to a misplaced OpMemberDecorate.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
d6a4099303 spirv: Add better type validation to OpTypeImage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
03c543d041 spirv: Switch on vtn_base_type in OpComposite(Extract|Insert)
This is a bit simpler since we have fewer enum values in the case.  It's
also a bit more efficient because we're making fewer glsl_get_* calls.
While we're at it, add better type validation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
936f49268e spirv: Refactor Op[Spec]ConstantComposite and add better validation
Now that vtn_base_type is a real and full base type, we can switch on
that instead of the GLSL base type which is a lot fewer cases in our
switch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
dabce5061d spirv: Add better validation to Op[Spec]Constant
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
6cf965751a spirv: Remove a pointless assignment in SpvOpSpecConstant
We re-assign later inside the bit_size switch

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
f13a5cff72 spirv: Unify boolean constants and add better validation
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
0bb18858fb spirv/info: Add spirv_op_to_string
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
ab85fd02d5 spirv: Make 'info' a local array spirv_info_c.py
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand
296046556a spirv: Add better error messages in vtn_value helpers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Caio Marcelo de Oliveira Filho
22980f941e spirv: Import 1.2 rev 3 headers and grammar from Khronos
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-08 13:22:17 -08:00
Samuel Pitoiset
08a5f4412a radv: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3
VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1

Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:30:01 +01:00
Samuel Pitoiset
be16bbe1d3 radv: avoid PS partial flushes when viewports/scissors don't change
For Vega10 and Raven that need a special workaround for the
scissor bug.

This seems to give a minor boost for Talos and Dota 2, at least.

To reduce the cost of memcmp, the driver checks if it's
really useful to do the comparison.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:58 +01:00
Samuel Pitoiset
b09b3f8834 radv: add has_scissor_bug for Vega10 and Raven
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:56 +01:00
Samuel Pitoiset
b462ceb482 radv/gfx9: do not load VGPR1 when GS uses points or lines
VGPR1 is only needed for topology that needs 3 offsets like
triangles or quads.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:53 +01:00
Samuel Pitoiset
a3c2a86757 radv: make shader BOs read-only for the GPU
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:51 +01:00
Samuel Pitoiset
6e3459eaf4 radv: make descriptor BOs read-only for the GPU
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:49 +01:00
Samuel Pitoiset
e4f2ad403f radv: make the indirect GFX config BO read-only for the GPU
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:47 +01:00
Samuel Pitoiset
0e84fc2e2b radv/winsys: make IBs read-only for the GPU
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:45 +01:00
Samuel Pitoiset
a3aaa03624 radv/winsys: add RADEON_FLAG_READ_ONLY
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:43 +01:00
Samuel Pitoiset
2dab5e96ec radv/winsys: rework radv_amdgpu_bo_va_op()
Needed for the following commit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-08 21:24:41 +01:00
Igor Gnatenko
23ce168048 link mesautil with pthreads
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_setname':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:66: undefined reference to `pthread_setname_np'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `thrd_join':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:336: undefined reference to `pthread_join'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:48: undefined reference to `pthread_sigmask'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `thrd_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:296: undefined reference to `pthread_create'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:50: undefined reference to `pthread_sigmask'
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:50: undefined reference to `pthread_sigmask'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `call_once':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:96: undefined reference to `pthread_once'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:84: undefined reference to `pthread_getcpuclockid'
collect2: error: ld returned 1 exit status

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Igor Gnatenko <ignatenko@redhat.com>
2018-01-08 11:40:02 -05:00
Alex Smith
0d8b9c529c anv: Allow PMA optimization to be enabled in secondary command buffers
This was never enabled in secondary buffers because hiz_enabled was
never set to true for those.

If the app provides a framebuffer in the inheritance info when beginning
a secondary buffer, we can determine if HiZ is enabled and therefore
allow the PMA optimization to be enabled within the command buffer.

This improves performance by ~13% on an internal benchmark on Skylake.

v2: Use anv_cmd_buffer_get_depth_stencil_view().

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-08 09:31:17 +00:00
Florian Will
7e025def6d glsl: Respect std430 layout in lower_buffer_access
Respect the std430 rules for determining offset and size of struct
members when using a std430 buffer. std140 rules lead to wrong buffer
offsets in that case.

Fixes my test case attached in Bugzilla. No piglit changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104492
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-08 13:42:09 +11:00
Karol Herbst
efd2169c1a nir: fix st_nir_assign_var_locations for patch variables
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-08 10:12:53 +11:00
Ilia Mirkin
6f4ac7b418 nvc0: enable bindless on kepler
All the functionality is in. Maxwell will take a little bit more
enablement work.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-01-07 11:40:35 -05:00
Ilia Mirkin
23a6e8d8ff nvc0: add bindless image support for kepler
A part of the driver constbuf area is allocated for bindless images. Any
update requires uploading to all driver constbufs. This also extends the
driver constbuf to 64KB, up from 2KB.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-01-07 11:40:35 -05:00
Ilia Mirkin
8eb1214755 nvc0: add support for bindless textures on kepler+
This keeps a list of resident textures (per context), and dumps that
list into the active buffer list when submitting. We also treat bindless
texture fetches slightly differently, wrt the meaning of indirect, and
not requiring the SAMPLER file to be used.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-01-07 11:15:23 -05:00
Ilia Mirkin
7061333653 nv50/ir: use the image info in the instruction rather than decl
In preparation for bindless images, we have to retrieve the
target/format info from the instruction directly, as there will be no
declaration. Furthermore, for bound images, this information is still
available in the instruction, so we can drop the declaration-based
mechanism entirely.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-01-07 11:15:23 -05:00
Ilia Mirkin
7f92c8ee37 nvc0/ir: safen up lowering logic against overwriting reused values
I'm fairly sure both of the changed sites are OK as-is, but they're
fragile, so this is just safening them up. Since this is happening
pre-ssa, we don't want to be overwriting values that may potentially get
used later on.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-01-07 11:15:23 -05:00
Ilia Mirkin
bdf300e09d nvc0: update tic in-place when buffer address changes
This is helpful for bindless, where changing TIC id's is undesirable.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-01-07 11:15:23 -05:00
Ilia Mirkin
adcd241b56 nvc0: ensure that pushbuf keeps ref to old text/tls bos
If we free the bo, then the PTE may get deallocated immediately. We have
to make sure that the submission includes a ref to the old bo so that it
remains mapped for the duration of the command execution.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-07 11:14:51 -05:00
Kenneth Graunke
be144e251c i965: Torch public intel_batchbuffer_emit_dword/float helpers.
intel_batchbuffer_emit_float is dead code, it should go.

intel_batchbuffer_emit_dword only had one user, which had bungled using
them by forgetting to call intel_batchbuffer_require_space first.  So it
seems wise to delete these unsafe helpers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-06 20:10:32 -08:00
Kenneth Graunke
1c9f1a28c0 i965: Require space for MI_BATCHBUFFER_END.
intel_batchbuffer_emit_dword doesn't reserve space for the DWord it
emits.  In the past, we had some reserved batch space to ensure this
worked.  With the switch to growing batches, we need to actually request
space so that we grow if necessary.

Fixes: 2c46a67b41 (i965: Delete BATCH_RESERVED handling.)
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-06 20:10:32 -08:00
Kenneth Graunke
a693a6f51a i965: Shut up a few unused variable warnings.
If asserts are disabled, you get pointless warnings about devinfo
being used (it's used to assert on devinfo->gen).
2018-01-06 17:34:54 -08:00
Marek Olšák
a140aeb619 ac: add ac_build_fmin/fmax helpers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-06 09:51:43 +01:00
Marek Olšák
581507f10a mesa: remove dd_function_table::GetCompressedTexSubImage and clean it up
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-06 09:51:43 +01:00
Neil Roberts
2971f688e6 mesa: Tidy up the 4.6 section of GL4x.xml
The enums are moved to the top and indented like the rest of the file.
Comments are added to split up the function aliases by corresponding
extension. This should make no functional difference.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-06 09:28:45 +01:00
Samuel Pitoiset
87efa71001 radv: remove unused radv_color_buffer_info::cb_clear_valueX
Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-05 17:26:51 +01:00
Alex Smith
12f4e00b69 anv: Take write mask into account in has_color_buffer_write_enabled
If we have a color attachment, but its writes are masked, this would
have still returned true. This is inconsistent with how HasWriteableRT
in 3DSTATE_PS_BLEND is set, which does take the mask into account.

This could lead to PixelShaderHasUAV not being set in 3DSTATE_PS_EXTRA
if the fragment shader does use UAVs, meaning the fragment shader may
not be invoked because HasWriteableRT is false. Specifically, this was
seen to occur when the shader also enables early fragment tests: the
fragment shader was not invoked despite passing depth/stencil.

Fix by taking the color write mask into account in this function. This
is consistent with how things are done on i965.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-05 15:36:22 +00:00
Neil Roberts
0bd1c4676d mesa: Add GL4.6 aliases of functions from GL_ARB_indirect_parameters
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 11:41:10 +01:00
Samuel Pitoiset
ec63ab39be radv: enable denorms for 64-bit and 16-bit floats
Similar to RadeonSI.

This fixes:
dEQP-VK.image.texel_view_compatible.graphic.basic.attachment_read.bc*r16g16b16a16_sfloat
dEQP-VK.image.extended_usage_bit.attachment_write.r16_sfloat

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-05 09:51:33 +01:00
Samuel Pitoiset
7643c71527 amd/common: correctly detect if we need ring buffers
When allocate_user_sgprs() was called, ctx->stage was actually
unset and 0 is for the vertex shader. This doesn't change
anything for now because of the spill support thing.

Though, the number of user SGPRs has to be fixed for merged
shaders on GFX9. It was broken before anyway.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-05 09:49:51 +01:00
Samuel Pitoiset
50cfad0298 amd/common: use ac_image_load when lod is zero
This might decrease VGPR spilling, because we no longer
have to use v4i32 for 2D fetches when level == 0. We now
use v2i32 for those cases.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 09:49:45 +01:00
Samuel Pitoiset
85769759bf radv: limit the scissor bug workaround to Vega 10 and Raven
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-05 09:47:49 +01:00
Alejandro Piñeiro
2719467eb6 glsl/standalone: set MaxTransformFeedbackBuffers
Using 4, as it is the default value on mesa. See mesa/main/config.h
and the following commit that introduced the value:
15ac66e331

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 08:52:22 +01:00
Alejandro Piñeiro
0ba3de2ad7 glsl/standalone: set MaxVertexStreams
ARB_transform_feedback3 sets a minimum of 1, ARB_gpu_shader5 a minimum
of 4. It shouldn't matter too much, so choosing the later.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 08:52:22 +01:00
Alejandro Piñeiro
8dcf131f04 glsl/standalone: set MaxUniformBufferBindings
Used to handle how many ubo you can define on the context. Minimimum
defined as 36 on ARB_uniform_buffer_object spec, up to 84 on OpenGL
4.6 (12 per stage at each moment).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 08:52:22 +01:00
Alejandro Piñeiro
937b210551 glsl/standalone: point which arguments are mandatory
Every now and then I execute the standalone compiler, get the
non-version error, and need to remember what I'm doing wrong

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 08:52:22 +01:00
Timothy Arceri
4a0c24f2dd ac: rework ac_llvm_extract_elem()
Simplifies the logic a little and asserts index is 0.

Suggested-by: Nicolai Hähnle <nhaehnle@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 12:20:38 +11:00
Timothy Arceri
71f82dc9a3 st/glsl_to_nir/radeonsi: enable tessellation shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
9755eeb15a gallium/tgsi: add patch support to tgsi_get_gl_varying_semantic()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
452586b56a radeonsi: add dummy implementation of si_nir_scan_tess_ctrl()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
14adf7853a ac/radeonsi: add load_tess_coord() to the abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
eb1e555cfd radeonsi: make si_llvm_emit_tcs_epilogue compatible with emit_outputs abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
91f3c4ec1b radeonsi/nir: gather tess properties
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
9e1a3caf32 ac/radeonsi: add tcs_rel_ids to the abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
9c2f877830 radeonsi: add unpack_llvm_param() helper
This allows us to pass the llvm param directly rather than looking
it up.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
f93740efc1 ac: add {tcs,tes}_patch_id to the abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
15c6f3fdd5 radeonsi: add nir support for tcs outputs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
b99ebaa4fd ac: move some helpers to ac_llvm_build.c
We will call these from the radeonsi NIR backend.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
2deb822075 ac: add store_tcs_outputs() to the abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
8be0135082 radeonsi: add si_nir_load_input_tcs()
V2: drop type param and just use ctx->i32

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
234507b3cf radeonsi: add get_dw_address_from_generic_indices() helper
This will be used by both the tgsi and nir backends.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
b104e7e172 ac: call load_tcs_input() via the abi
This also enables some code sharing with tes.

V2: drop type param and just use ctx->i32

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
b09a3196e0 ac: add load_tes_inputs() to the abi
V2: drop type param and just use ctx->i32

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Timothy Arceri
e04bf8a619 radeonsi: add si_nir_load_input_tes()
V2: drop type param and just use ctx->i32

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Tim Rowley
396c006d90 swr/rast: fix invalid sign masks in avx512 simdlib code
Should be 0x80000000 instead of 0x8000000.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-04 13:35:17 -06:00
Bas Nieuwenhuizen
76daa30e4a radv: Use correct flush bits for flushing L2 during CB/DB flushes.
Copied from radeonsi.

Putting in the correct metadata flush commands for eventually not
flushing L2 on CB/DB switch.

Does not remove the need for V_028A90_CACHE_FLUSH_AND_INV_TS_EVENT
at the moment.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-04 19:35:36 +01:00
Bas Nieuwenhuizen
f2c9f13ec2 radv: Invalidate L1 for VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT.
These are just shaders reads, so we need to invalidate L1.

Fixes: 6dbb0eaccc "radv: handle subpass cache flushes"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-04 19:35:36 +01:00
Samuel Pitoiset
2670ebb584 radv/gfx9: reduce the number of input VGPRs for the GS stage
This can still be improved, but let's start with this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-04 18:43:25 +01:00
Samuel Pitoiset
a4d2782664 amd/common: scan if gl_PrimitiveID is used before translating to LLVM
It makes more sense to move all scan stuff in the same place.
Also, we don't really need to duplicate the uses_primid field
for each stages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-04 18:43:09 +01:00
Samuel Pitoiset
3b2cb2f99a amd/common: scan if gl_InvocationID is used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-04 18:43:07 +01:00
Rob Herring
aa187fe7bf egl/android: Fix build break with dri2_initialize_android _EGLDisplay parameter
Commit 2f421651ac ("egl: let each platform decided how to handle
LIBGL_ALWAYS_SOFTWARE") broke the build due to copy-n-paste of misnamed
function parameter.:

src/egl/drivers/dri2/platform_android.c:1183:8: error: use of undeclared identifier 'disp'

Rather than just fixing 'disp', rename the function parameter 'dpy' to
'disp' to align with the other EGL platforms' implementations.

Fixes: 2f421651ac ("egl: let each platform decided how to handle LIBGL_ALWAYS_SOFTWARE")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-04 10:18:10 -06:00
Alex Smith
00a81e9909 anv: Add missing unlock in anv_scratch_pool_alloc
Fixes hangs seen due to the lock not being released here.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-04 14:54:02 +00:00
Ilia Mirkin
e923075a82 mesa/bindless: fix missing image _Layer initialization
Some later code relies on _Layer to set first/last_layer. Make sure it's
always initialized.

Detected by valgrind's conditional jump/move with uninit value logic.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-04 02:22:25 -05:00
Józef Kucia
f222cf3c6d radeonsi: fix alpha-to-coverage if color writes are disabled
If alpha-to-coverage is enabled, we have to compute alpha
even if color writes are disabled.

Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-04 01:58:33 +01:00
Bas Nieuwenhuizen
79724c89f8 ac: rename has_sync_file to has_fence_to_handle.
sync_files are in linux since 4.7, while the amdgpu fence_to_handle
ioctl is only in 4.15.

In particular we don't need it for sync_file in radv, because
everything happens via syncobjs, which got support earlier than
fence_to_handle.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-04 01:12:09 +01:00
Bas Nieuwenhuizen
c99426ea83 ac/nir: Handle loading data from compact arrays.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-04 00:14:23 +01:00
Bas Nieuwenhuizen
1c78e4f053 radv: Allow writing 0 scissors.
When rasterization is disabled we can have that few.

Fixes: 76603aa90b "radv: Drop the default viewport when 0 viewports are given."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-04 00:14:19 +01:00
Bas Nieuwenhuizen
5158603182 radv: Use correct HTILE expanded words.
Seems like users are actually hitting 0xFFFFFFFF actually making
things broken for them, and the mad max regression is fixed, so
lets put this in once more.

v2: Use 0xf for depth-only htile. (Dave)

Fixes: af2844116f "radv: Revert HTILE reset word to 0xFFFFFFFF."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-04 00:14:03 +01:00
Marek Olšák
4f19cc82f9 ac: rename has_syncobj_wait -> has_syncobj_wait_for_submit
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-04 00:07:45 +01:00
Eric Anholt
81567d01c1 braodcom/vc5: Fix internal type/bpp for RGB10_A2UI images.
I found that we were getting GPU hangs on most tests rendering to them,
and the simulator was assertion failing.
2018-01-03 14:31:37 -08:00
Eric Anholt
a93fd7b41e broadcom/vc5: Try to fix up compressed texture load/store.
We were trying to load/store the logical width/height number of compressed
blocks.  As long as the textures were large, single-level, and the
load/store at (0,0), it kind of worked.
2018-01-03 14:31:36 -08:00
Eric Anholt
44237b3f85 broadcom/vc5: Fix image_h value for CPU-side tiling on miplevels > 1.
Fixes overflow that caused failure in
dEQP-GLES3.functional.texture.filtering.2d.sizes.128x128_linear.
2018-01-03 14:31:36 -08:00
Eric Anholt
e60e3a56a2 broadcom/vc5: Fix discard_if during control flow.
I want to do the SETMSF.IFA to discard only if execute == 0 and cond, so
our dest of the PUSHZ needs to be nonzero if execute or !cond are nonzero.

Fixes dEQP-GLES3.functional.shaders.discard.dynamic_loop_dynamic.
2018-01-03 14:31:36 -08:00
Eric Anholt
7836c85919 broadcom/vc5: Disable early Z when the stencil func isn't ALWAYS.
Apparently the other funcs will have observable differences when early Z
is enabled.

Fixes (new) simulator assertion failures in
dEQP-GLES3.functional.rasterizer_discard.basic.clear_depth.
2018-01-03 14:31:36 -08:00
Eric Anholt
635131a238 broadcom/vc5: Don't emit component 3/4 F16 TLB writes for float/vec2.
Fixes a simulator assertion failure on
dEQP-GLES3.functional.fragment_out.array.fixed.r8_highp_float.
2018-01-03 14:31:28 -08:00
Eric Anholt
deb552ca27 nir: Add a helper to get the uvec4 type.
I needed this in the vc5 compiler.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-03 14:25:23 -08:00
Eric Anholt
39811a2894 broadcom/vc5: Introduce enums for internal depth/type, with V3D prefixes. 2018-01-03 14:25:23 -08:00
Eric Anholt
d3e8a4b96c broadcom/xml: Fix up safe name confusion with prefixing.
For enums we were doubling the underscore if the value had a numeric first
character of its name (which safe_name() adds an underscore to).  A little
helper function cleans up the other instance of prefixing while also
fixing this.
2018-01-03 14:25:23 -08:00
Eric Anholt
48cabc1e75 broadcom/vc5: Turn the decimate mode field into an enum in the XML. 2018-01-03 14:25:23 -08:00
Eric Anholt
17cb634b1c broadcom/vc5: Turn the output image format into an enum. 2018-01-03 14:25:23 -08:00
Eric Anholt
883a9b02c9 broadcom/vc5: Turn the CLE XML's memory format into an enum. 2018-01-03 14:25:23 -08:00
Eric Anholt
8e5a0ed953 broadcom/vc5: Emit flat shade flags for varying components > 24.
This means that with no flatshading we'll emit the single-byte
ZERO_ALL_FLAT_SHADE_FLAGS, and otherwise emit a set of FLAT_SHADE_FLAGS to
get all the bits we need set.

There's a _SET enum in the packet we could use to possibly set entire
ranges of the bitfield without using another packet, but this at least
fixes the conformance failure.
2018-01-03 14:25:23 -08:00
Eric Anholt
2056e4a777 broadcom/vc5: Emit proper flatshading code for glShadeModel(GL_FLAT).
In updating the simulator, behavior changed slightly so that our old code
wasn't getting glxgears's flatshading interpolated right.  Emit flat
shading code just like we would for a normal flat-shaded varying, by
passing a flag in the shader key for glShadeModel(GL_FLAT) state and
customizing the color inputs based on that.
2018-01-03 14:25:23 -08:00
Eric Anholt
4764699552 braodcom/vc5: Rely on OVRTMUOUT always being set.
It seems that the HW team has decided that it's the only supported mode,
and it's the mode I actually meant to be using but forgot.  Our table of
return_32_bit should have matched the default non-OVRTMUOUT behavior, so
this change should be invisible.

However, the change revealed that some my return_size checks for swizzling
were a bit confused in the shadow case, so I had to move them to draw time
once we have both the sampler and the view together.

Fixes assertion failures in the updated simulator, where the non-OVRTMUOUT
support has been removed.
2018-01-03 14:25:23 -08:00
Eric Anholt
ba965084b6 broadcom/vc5: Move texture return channel setup into the compiler.
The compiler decides how many LDTMUs we're going to emit, and that must
match the P1 flags.  This brings the return channel counting to a single
place (so all that's passed into the compiler is "how many return channels
you may request from this texture's format), and was a necessary step for
shadow samplers once we stop using OVRTMUOUT=0.
2018-01-03 14:25:23 -08:00
Eric Anholt
ac4054ca17 broadcom/vc5: Switch to setting the primitive list format in the RCL.
This means that we get a single copy of it emitted, instead of once at the
start of each tile (though it's still executed once per tile).  Fixes
assertion failures with the updated simulator.
2018-01-03 14:25:23 -08:00
Eric Anholt
7d8b19f0dd broadcom/vc5: Switch to using the C++ interface for the simulator.
In newer versions they've removed the C interface, so make one here.  This
also isolates the Mesa codebase from the simulator codebase, so we don't
have conflicts over things like "unreachable"
2018-01-03 14:25:23 -08:00
Mario Kleiner
190ac52827 mesa: Add GL_UNSIGNED_INT_2_10_10_10_REV OES read type for BGRX1010102.
As Marek noted, the GL_RGBA + GL_UNSIGNED_INT_2_10_10_10_REV type
combo is also good for readback of BGRX1010102 framebuffers, not
only for BGRA1010102 framebuffers for use with glReadPixels()
under GLES, so add it for the GL_IMPLEMENTATION_COLOR_READ_TYPE_OES
query.

Successfully tested on gallium r600 driver with a (quickly hacked
for RGBA 10 10 10 0) dEQP testcase
dEQP-EGL.functional.wide_color.window_1010102_colorspace_default.

Suggested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:57 +01:00
Mario Kleiner
2d8e1a6a27 st/dri: Add option to control exposure of 10 bpc color configs.
Some clients may not like rgb10 fbconfigs and visuals.
Support driconf option 'allow_rgb10_configs' on gallium
to allow per application enable/disable.

The option defaults to enabled.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:57 +01:00
Mario Kleiner
e5ff036c67 st/dri: Add support for BGR[A/X]1010102 formats.
Exposes RGBA 10 10 10 2 and 10 10 10 0 visuals and
fbconfigs for rendering.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:57 +01:00
Mario Kleiner
ef99597d95 st/dri: Support texture_from_pixmap for BGR[A/X]1010102 formats.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:57 +01:00
Mario Kleiner
893723a64b st/dri2: Add buffer handling for BGR[A/X]1010102 formats.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:57 +01:00
Mario Kleiner
f52aa596d4 st/dri2: Add format translations for BGR[A/X]1010102 formats.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:57 +01:00
Mario Kleiner
6e665d07eb st/mesa: Handle BGR[A/X]1010102 formats.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:57 +01:00
Mario Kleiner
3f867d1299 egl/wayland: Add Wayland shm swrast support for RGB10 winsys buffers.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
859ccd2096 egl/wayland: Add Wayland dmabuf support for RGB10 winsys buffers. (v2)
Successfully tested under Weston 3.0.
Photometer confirms 10 rgb bits from rendering to display.

v2: Rebased onto master for dri2_teardown_wayland().

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
84fd5151cd egl/wayland: Add Wayland drm support for RGB10 winsys buffers.
Successfully tested under Weston 3.0.
Photometer confirms 10 rgb bits from rendering to display.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
82a2ede9aa egl/x11: Handle depth 30 drawables for EGL_KHR_image_pixmap.
Enables eglCreateImageKHR() with target set to
EGL_NATIVE_PIXMAP_KHR to handle color depth 30
X11 drawables.

Note that in theory the drawable depth 32 case in the
current implementation is ambiguous: A depth 32 drawable
could be of format ARGB8888 or ARGB2101010, therefore an
assignment of __DRI_IMAGE_FORMAT_ARGB8888 for a pixmap of
ARGB2101010 format would be wrong. In practice however, the
X-Server (as of v1.19) does not provide any depth 32 visuals
for ARGB2101010 EGL/GLX configs. Those are associated with
depth 30 visuals without an alpha channel instead. Therefore
the switch-case depth 32 branch is only executed for ARGB8888
pixmaps and we get away with this.

Tested with KDE Plasma 5 under X11, DRI2 and DRI3/Present,
selecting EGL + OpenGL compositing and different fbconfigs
with/without 2 bit alpha channel. glxinfo confirms use of
depth 30 visuals for ARGB2101010 only.

Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
d0b320c941 egl/x11: Handle depth 30 drawables under software rasterizer.
For fixing eglCreateWindowSurface() under swrast, as tested
with LIBGL_ALWAYS_SOFTWARE=1.

Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
be3d88e539 egl/x11: Match depth 30 RGB visuals to 32-bit RGBA EGLConfigs.
Similar to the matching of 24 bit RGB visuals to 32-bit
RGBA EGLConfigs, so X11 compositors won't alpha-blend any
config with a destination alpha buffer during compositing.

Additionally this fixes failure to select ARGB2101010
configs via eglChooseConfig() with EGL_ALPHA_BITS 2 on
a depth 30 X-Screen. The X-Server doesn't provide any
visuals of depth 32 for ARGB2101010 configs, it only
provides depth 30 visuals. Therefore if we'd only match
ARGB2101010 configs to depth 32 RGBA visuals, we would
not ever get a visual for such a config.

This was apparent in piglit tests for egl configs, which
are fixed by this commit.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
01ce3f2cad mesa: Add GL_RGBA + GL_UNSIGNED_INT_2_10_10_10_REV for OES read type.
This format + type combo is good for BGRA1010102 framebuffers
for use with glReadPixels() under GLES, so add it for the
GL_IMPLEMENTATION_COLOR_READ_TYPE_OES query.

Allows successful testing of 10 bpc / depth 30 rendering with dEQP test
case dEQP-EGL.functional.wide_color.window_1010102_colorspace_default.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
cfb98bcdd0 i965/screen: Honor 'allow_rgb10_configs' option. (v2)
Allows to prevent exposing RGB10 configs and visuals to
clients.

v2: Rename expose_rgb10_configs to allow_rgb10_configs,
    as suggested by Emil.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
67674ad0dc dri/common: Add option to allow exposure of 10 bpc color configs. (v2)
Some clients may not like RGB10X2 and RGB10A2 fbconfigs and
visuals. Add a new driconf option 'allow_rgb10_configs' to
allow per application enable/disable.

The option defaults to enabled.

v2: Rename expose_rgb10_configs to allow_rgb10_configs,
    as suggested by Emil. Add comment to option parsing,
    to make sure it stays before the ->InitScreen().

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
9e63cbacb6 i965/screen: Add basic support for rendering 10 bpc/depth 30 framebuffers. (v3)
Expose formats which are supported at least back to Gen 5 Ironlake,
possibly further. Allow creation of 10 bpc winsys buffers for drawables.

glxinfo now lists new RGBA 10 10 10 2/0 formats.

v2: Move the BGRA/BGRX1010102 formats before the RGBA/RGBX8888
    32 bit formats, as the code comments require. Thanks Emil!
    Update num_formats from 3 to 5, to keep the special Android
    handling intact.

v3: Use num_formats = ARRAY_SIZE(formats) - 2 as suggested by Tapani,
    to only exclude the last 2 Android formats, add Tapani's r-b.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
87faa8b1e8 i965/screen: Add XRGB2101010 and ARGB2101010 support for DRI3.
Allow DRI3/Present buffer sharing for 10 bpc buffers.
Otherwise composited desktops under DRI3 will only display
black client areas for redirected windows.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
304f363a96 loader/dri3: Add XRGB2101010 and ARGB2101010 support.
To allow DRI3/Present buffer sharing for 10 bpc buffers.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
f3878aa622 dri: Add 10 bpc formats as available formats. (v2)
Used to support ARGB2101010 and XRGB2101010
winsys framebuffers / drawables, but added
other 10 bpc fourcc's as well for consistency
with definitions in wayland_drm.h, gbm.h, and
drm_fourcc.h.

v2: Align new defines with tabs instead of spaces, for
    consistency with remainder of that block of definitions,
    as suggested by Tapani.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
26c4d804ff i965: Support accelerated blit for depth 30 formats. (v2)
Extend intel_miptree_blit() to handle at least
ARGB2101010 -> XRGB2101010, ARGB2101010 -> ARGB2101010,
and XRGB2101010 -> XRGB2101010 via the BLT engine,
but not XRGB2101010 -> ARGB2101010 yet.

This works as tested under Compiz, KDE-5, Gnome-Shell.

v2: Restrict BLT fast path to exclude XRGB2101010 -> ARGB2101010,
    as intel_miptree_set_alpha_to_one() isn't ready to set 2 bit
    alpha channels to 1.0 yet. However, couldn't find a test case
    where this specific blit would be needed, so maybe not much
    of a point to improve here.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:56 +01:00
Mario Kleiner
6945f313c4 i965: Support xrgb/argb2101010 formats for glx_texture_from_pixmap.
Makes compositing under X11/GLX work.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-01-03 22:57:55 +01:00
Tim Rowley
ad218754c7 swr/rast: fix MemoryBuffer build break for llvm-6
LLVM api change.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104381
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-03 11:42:00 -06:00
Rob Herring
28234c5bf8 Android: util: fix locale generation in options.h
The parameters to gen_xmlpool.py are wrong and cause the following
warnings:

Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool/es/LC_MESSAGES/options.mo' not found.
Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool/nl/LC_MESSAGES/options.mo' not found.
Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool/fr/LC_MESSAGES/options.mo' not found.
Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool/sv/LC_MESSAGES/options.mo' not found.
Warning: language 'external/mesa3d/src/util/xmlpool/t_options.h' not found.
Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool' not found.
Warning: language 'de' not found.
Warning: language 'es' not found.
Warning: language 'nl' not found.
Warning: language 'fr' not found.
Warning: language 'sv' not found.

The result is English is the only language in options.h. Use "$<"
instead of "$^" because we only need the first dependency (the script),
not all dependencies.

Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-03 09:49:08 -06:00
Kenneth Graunke
74e1d6e20c i965: Drop support for the legacy SNORM -> Float equation.
Older OpenGL defines two equations for converting from signed-normalized
to floating point data.  These are:

    f = (2c + 1)/(2^b - 1)                (equation 2.2)
    f = max{c/2^(b-1) - 1), -1.0}         (equation 2.3)

Both OpenGL 4.2+ and OpenGL ES 3.0+ mandate that equation 2.3 is to be
used in all scenarios, and remove equation 2.2.  DirectX uses equation
2.3 as well.  Intel hardware only supports equation 2.3, so Gen7.5+
systems that use the vertex fetcher hardware to do the conversions
always get formula 2.3.

This can make a big difference for 10-10-10-2 formats - the 2-bit value
can represent 0 with equation 2.3, and cannot with equation 2.2.

Ivybridge and older were using equation 2.2 for OpenGL, and 2.3 for ES.
Now that Ivybridge supports OpenGL 4.2, this is wrong - we need to use
the new rules, at least in core profile.  That would leave Gen4-6 doing
something different than all other hardware, which seems...lame.

With context version promotion, applications that requested a pre-4.2
context may get promoted to 4.2, and thus get the new rules.  Zero cases
have been reported of this being a problem.  However, we've received a
report that following the old rules breaks expectations.  SuperTuxKart
apparently renders the cars red when following equation 2.2, and works
correctly when following equation 2.3:

https://github.com/supertuxkart/stk-code/issues/2885#issuecomment-353858405

So, this patch deletes the legacy equation 2.2 support entirely, making
all hardware and APIs consistently use the new equation 2.3 rules.

If we ever find an application that truly requires the old formula, then
we'd likely want that application to work on modern hardware, too.  We'd
likely restore this support as a driconf option.  Until then, drop it.

This commit will regress Piglit's draw-vertices-2101010 test on
pre-Haswell without the corresponding Piglit patch to accept either
formula (commit 35daaa1695ea01eb85bc02f9be9b6ebd1a7113a1):

    draw-vertices-2101010: Accept either SNORM conversion formula.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2018-01-02 16:51:42 -08:00
Ian Romanick
bd32d4d067 meta: Don't pollute the texture namespace
tl;dr: For many types of GL object, we can *NEVER* use the Gen function.

In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions.  The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.

Here's the problem scenario:

 - Application calls a meta function that generates a name.  The first
   Gen will probably return 1.

 - Application decides to use the same name for an object of the same
   type without calling Gen.  Many demo programs use names 1, 2, 3,
   etc. without calling Gen.

 - Application calls the meta function again, and the meta function
   replaces the data.  The application's data is lost, and the app
   fails.  Have fun debugging that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:52 -08:00
Ian Romanick
5325a34ed7 meta: Use _mesa_bind_texture instead of _mesa_BindTexture
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:52 -08:00
Ian Romanick
e0ad314568 meta: Use _mesa_CreateTextures instead of _mesa_GenTextures
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:52 -08:00
Ian Romanick
173e3045a9 meta: Track temporary textures using gl_texture_object instead of GL API object handle
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
c36e3d3016 meta/blit: Track temporary texture using gl_texture_object instead of GL API object handle
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
05f4be9641 meta/blit: Use _mesa_bind_texture instead of _mesa_BindTexture
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
d17e6bc48e meta/blit: Don't bind texture in _mesa_meta_bind_rb_as_tex_image
All of the callers of _mesa_meta_bind_rb_as_tex_image call
_mesa_meta_setup_sampler shortly after.  _mesa_meta_setup_sampler also
binds the texture.  This is necessary because not all paths that lead to
_mesa_meta_setup_sampler some through _mesa_meta_bind_rb_as_tex_image.

Rename the function _mesa_meta_texture_object_from_renderbuffer to
reflect its true purpose.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
7609d54e4a meta/blit: Track source texture using gl_texture_object instead of GL API object handle
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
29a948e06d meta/blit: Since _mesa_meta_bind_rb_as_tex_image has only one output, return it
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
44e153616d meta/blit: Don't return the texture handle from _mesa_meta_bind_rb_as_tex_image
It's always the same as *texObj->Name.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
922ee3b493 meta/blit: Don't return the target from _mesa_meta_bind_rb_as_tex_image
It's always the same as *texObj->Target.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
9de64d0baa meta/blit: Don't restore state of the temporary texture
It's about to be destroyed, so there's no point.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:51 -08:00
Ian Romanick
a232df1523 meta/blit: Check the values instead of the target before restoring
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:50 -08:00
Ian Romanick
594d02892e mesa: Add _mesa_bind_texture method
Light-weight glBindTexture for internal use.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:50 -08:00
Ian Romanick
e6cef4b081 Revert "mesa: remove unused _mesa_delete_nameless_texture()"
Changes in this series use this function.

This reverts commit 048de9e34a.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-02 16:23:50 -08:00
Ian Romanick
d80be51775 mesa: Fold _mesa_record_error into its only caller
Also, the comment on _mesa_record_error was wrong.
dd_function_table::Error was not called because that function does not
exist.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-02 16:23:50 -08:00
Lucas Stach
0158565924 etnaviv: disable in-place resolve for non-supertiled surfaces
The in-place resolve probably has some additional restrictions when not
operating on a super tiled surface. Disable it on non-supertiled surfaces
for now to work around a GPU hang.

Fixes: 78ade65956 ("etnaviv: Do GC3000 resolve-in-place when possible")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-01-01 22:48:06 +01:00
Bas Nieuwenhuizen
6a36bfc64d radv: Implement binning on GFX9.
Overall it does not really help or hurt. The deferred demo gets 1%
improvement and some games a 3% decrease, so I don't think this
should be enabled by default.

But with the code upstream it is easier to experiment with it.

v2: Remove initializing the registers from si_emit_config.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-31 15:07:07 +01:00
Bas Nieuwenhuizen
b0d17270ad radv: Add flag for enabling binning.
Letting it be disabled by default.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-31 13:47:51 +01:00
Kenneth Graunke
a1afef8de0 i965: Combine {VS,FS}_OPCODE_GET_BUFFER_SIZE opcodes.
These are the same, we don't need a separate opcode enum per backend.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-30 20:30:34 -08:00
Rob Clark
ea0bbe8201 nir: add missing local_group_size intrinsic
For GL_ARB_compute_variable_group_size

Reported-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-30 12:39:07 -05:00
Rhys Kidd
60c2d09483 nv50/ir: Fix unused var warnings in release build
v2: Add preventative comment (Ilia Mirkin)

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
2017-12-29 23:04:42 -05:00
Rhys Kidd
634ca4c2c3 nvc0: Fix unused var warnings in release build
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
2017-12-29 23:04:42 -05:00
Rhys Kidd
540d829d38 nv50: Fix unused var warning in release build
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
2017-12-29 23:04:42 -05:00
Roland Scheidegger
878bc4a5ae r600: fix textureSize queries with tbos
piglit doesn't care, but I'm quite confident that the size actually bound
as range should be reported and not the base size of the resource (and
some quick piglit test hacking confirms this).
Also, the array in the constant buffer looks overallocated by a factor of 4.
For eg, also decrease the size by another factor of 2 by using the same
constant slot for both buffer size (required for txq for TBOs) and the number
of layers for cube arrays, as these are mutually exclusive. Could of course use
some more logic and only actually do this for the samplers/images/buffers where
it's required rather than for all, but ah well...

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-30 03:30:48 +01:00
Roland Scheidegger
eafaf13686 r600: kill off native_integer shader ctx flag
Maybe upon a time it wasn't always true.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-30 03:30:48 +01:00
Bas Nieuwenhuizen
b0a6fd0274 radv: Also set DCC params for sampling for input attachment usage.
Those are implemented as texture sampling, so we need to make the
texture TC-compatible too.

Fixes: 34d23e82ca "radv: set some dcc parameters depending on if texture will be sampled"
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2017-12-29 23:42:30 +01:00
Bas Nieuwenhuizen
ab957243e1 radv: Enable DCC with transfers.
Before this DCC was in practice disabled for most games. This
enables practical DCC use. Expect a 5-10% perf increase on a
bunch of games on vega @ 4k.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:22:02 +01:00
Bas Nieuwenhuizen
eb9a4c3464 radv: Decompress copy destination if formats are incompatible.
If both source and destination are DCC compressed, and their formats
are not compatible, we need to decompress one of them to make
sure we can do reinterpretation (which needs src format == dst format)
.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:21:58 +01:00
Bas Nieuwenhuizen
44fcf58744 radv: Disable DCC for GENERAL layout and compute transfer dest.
Apps can use this for render feedback loops, where things are
defined if they render each pixel only once. However, DCC fails
here, as the level of coherence is a block not a pixel, so disable it.

This is also going to help implementing other stuff.

Even if we optimize this later to only happen if there actually is
a loop (if possible at all ...), then the machinery is still useful
to exclude images accessible by the SDMA queue when that is implemented.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:21:53 +01:00
Bas Nieuwenhuizen
95f50f7f6c radv: Don't init DCC metadata during FS resolve.
It should already be valid there + the RB will update it during
rendering.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:21:49 +01:00
Bas Nieuwenhuizen
1cfab28e6e radv: Make color meta operations layout aware.
For fast clear eliminate and decompressions, we always use the most compressed
format.

For clears, the code already creates a renderpass on demand with the exact same
layout as specified.

Otherwise we start distinguishing between GENERAL and TRANSFER_DST_OPTIMAL.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:21:44 +01:00
Bas Nieuwenhuizen
3e2a6191c9 radv: Add compute DCC decompress.
We do an in place copy where we read compressed and write decompressed.
By doing this in sizes that cover entire DCC blocks and waiting for all
reads in the block before starting to write we avoid corruption.

In the end we clear the DCC metadata to 0xffffffff.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:21:40 +01:00
Bas Nieuwenhuizen
8abaa3aeaa radv: Use the meta fast clear destructor on construction failure.
Simplifies failure paths. The caller already calls
radv_device_finish_meta_fast_clear_flush_state on failure.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:21:35 +01:00
Bas Nieuwenhuizen
e5feeec140 radv: Add GFX DCC decompress.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:21:31 +01:00
Bas Nieuwenhuizen
fc80f52536 radv: Don't enable DCC / TC compat HTILE for storage images.
We don't get a layout when binding to a descriptor set, but can
assume that the LAYOUT is GENERAL.

For DCC stores with the DCC bits set will result in a hang, so
better be safe than sorry.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-29 12:21:15 +01:00
Bas Nieuwenhuizen
516a80b579 Revert "radv/gfx9: fix block compression texture views."
This reverts commit 5951578043.

The mentioned commit causes a hang in DoW3 on Vega.

Fixes: 5951578043 "radv/gfx9: fix block compression texture views."
Acked-by: Dave Airlie <airlied@redhat.com>
2017-12-29 11:21:43 +01:00
Brian Paul
23f37e98a1 svga: update SVGA_NEW_ flags for updating sampler state
The SVGA_NEW_FS flag is needed since we now examine the fragment
shader's fs_shadow_compare_units flags.  The SVGA_NEW_TEXTURE_FLAGS
flag is not needed since it's only for pre-VGPU10.

No piglit changes.  This doesn't fix any known issues but it could
pop up somewhere.  Suggested by Charmaine.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-12-28 22:09:29 -07:00
Brian Paul
f50924cb5b svga: whitespace, formatting fixes in svga_state_tss.c 2017-12-28 22:09:29 -07:00
Dave Airlie
a4c23ce1b6 radv/gfx9: use correct swizzle parameter to work out border swizzle.
This should fix:
dEQP-VK.pipeline.sampler.view_type.*.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
and a few others in that area.

Fixes: b11c4a5546 (radv: add texture descriptor/fmask/cmask support for GFX9)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-29 12:09:13 +10:00
Dave Airlie
868377ab33 radv/gfx9: use a bigger hammer to flush cb/db caches.
amdvlk is probably more subtle than this but it never uses
the inv cb/db variants, we fail some CTS tests without this.

Fixes:
dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.input*.

Fixes: c2fbeb7ca0 (radv: add GFX9 cache flushing support.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (for now :-)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-29 11:43:30 +10:00
Dave Airlie
5951578043 radv/gfx9: fix block compression texture views.
This ports a fix from amdvlk, to fix the sizing for mip levels
when block compressed images are viewed using uncompressed views.

Fixes:
dEQP-VK.image.texel_view_compatible.graphic.extended*bc*

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-29 11:42:47 +10:00
Dave Airlie
420627e6e7 radv/gfx9: fix buffer to image for 3d images on compute queues
This fixes some of the broken:
dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-29 09:37:09 +10:00
Dave Airlie
09612a62e1 radv/gfx9: fix 3d image clears on compute queues
This fixes some of the broken:
dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-29 09:37:05 +10:00
Dave Airlie
d08f267814 radv/gfx9: fix 3d image to image transfers on compute queues.
This fixes some of the broken:
dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-29 09:37:00 +10:00
Jason Ekstrand
967d238c69 anv/device: Mark all state buffers as needing capture
Previously, we were flagging the instruction state buffer for capture
but not surface state or dynamic state.  We want those captured too.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-28 10:39:04 -08:00
Jason Ekstrand
69fa3fb77f intel/aubinator: Gracefully handle dynamic state not being available
Some older versions of the Vulkan driver didn't properly tag dynamic
state as needing to be captured.  Also, this prevents crashes when
looking at dumps on older kernels.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-28 10:39:04 -08:00
Jason Ekstrand
a92d52c3c1 intel/aubinator: Free section data last
We were walking the sections, printing the batches, and then freeing
them in one pass.  If the batch happens to reference any earlier
sections (which it almost certainly will since it's at the end), we will
access freed memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-28 10:39:04 -08:00
Eero Tamminen
5c5f2eaa08 spirv: consider bitsize when handling OpSwitch cases
This reverts commit 7665383a33 and is
squashed together with https://patchwork.freedesktop.org/patch/194610/
(spirv: avoid infinite loop / freeze in vtn_cfg_walk_blocks()) which
fixes https://bugs.freedesktop.org/show_bug.cgi?id=104359 properly.

Fixes: 9702fac68e (spirv: consider bitsize when handling OpSwitch cases)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-28 10:38:58 -08:00
Brian Paul
e0eaeef3e7 svga: check for null fs pointer in update_samplers()
This can happen when there's no active fragment shader, such as
when using transform feedback.  This wasn't hit by any Piglit test
but is hit by Daniel Rákos' Nature demo.  VMware bug 2026189.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-12-28 09:49:31 -07:00
Brian Paul
be5153fbee st/mesa: increase size of glsl_base_type bitfields
Change 59f458cd87 added more enums to glsl_base_type.  We
have to bump up the size of the bitfields for fields of this type
for MSVC.  Also, add another assertion to catch another place where
this enum bitfield is used.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-12-28 08:30:56 -07:00
Dave Airlie
ec1edd0fd2 radv: fix pipeline statistics end query on compute queue
It's legal to a pipeline stat query on a compute queue,
but we'd emit the wrong packet here. This should fix it to emit
the correct packet.

Noticed while inspecting the mpv hang.

Fixes: ad61eac250 (radv: factor out eop event writing code. (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-28 19:31:01 +10:00
Dave Airlie
38e4467e99 radv: fix events on compute queues.
The event emission wasn't sending the correct packet for gfx8 compute
queues, which explains why it works on vega fine.

This fixes the mpv vulkan hang.

Fixes: ad61eac250 (radv: factor out eop event writing code. (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-28 19:30:32 +10:00
Dave Airlie
ff75d3a9aa radv: move local bos usage to a perftest flag.
These seem mildly unstable on vega, crashing CTS in various fun ways,
and looks like leaking memory.

Disable for now, but leave the option to enable them.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-28 19:30:16 +10:00
Dave Airlie
78a8b73e7d vulkan/wsi: free cmd pools
We destroy the pools but don't free the container.

This fixes:
dEQP-VK.wsi.xlib.swapchain.simulate_oom*

Fixes: d50937f137 (vulkan/wsi: Implement prime in a completely generic way)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-28 09:57:33 +10:00
Bas Nieuwenhuizen
a636208ace radv: Always use fragment resolve if dest uses DCC.
HW resolve does not support it either.

Fixes: 2a04f5481d "radv/meta: select resolve paths"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-28 00:30:47 +01:00
Bas Nieuwenhuizen
da192b50b2 radv: Use correct framebuffer size for partial FS resolves.
Framebuffer is from 0,0, not (dst.x, dst.y).

Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-28 00:30:47 +01:00
Bas Nieuwenhuizen
73279da41d radv: Fix fragment resolve destination offset.
The position start at (dst.x, dst.y), so if we want the source to
start at (src.x, src.y), we have to offset by (src.x-dst.x,src.y-dst.y).

Haven't tested that this fixed anything yet, but found by inspection.

Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-28 00:26:07 +01:00
Bas Nieuwenhuizen
258ebe79a0 radv: Don't handle DCC in compute resolve.
If the destination has DCC, we will use the FS resolve.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-28 00:26:07 +01:00
Bas Nieuwenhuizen
cebc9a119d radv: Flush caches before subpass resolve.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-28 00:26:07 +01:00
Bas Nieuwenhuizen
c39947ce30 radv: Invert condition for all samples identical during resolve.
the samples_identical instruction returns 0 if they are differet, so
we have to do the extra work if the result is 0, not if it is != 0.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-28 00:26:07 +01:00
Eric Engestrom
e5a7ef0013 egl: don't try the software path twice
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Brendan King <Brendan.King@imgtec.com>
2017-12-27 22:31:56 +00:00
Eric Engestrom
81cea66ff1 egl: rename LIBGL_ALWAYS_SOFTWARE variable from UseFallback to ForceSoftware
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-27 22:31:50 +00:00
Eric Engestrom
2f421651ac egl: let each platform decided how to handle LIBGL_ALWAYS_SOFTWARE
My refactor in 47273d7312 missed this early return; because
of it, setting UseFallback one layer above actually prevented the
software path from being used.

Remove this early return and let each platform's dri2_initialize_*()
decide what it can do with the LIBGL_ALWAYS_SOFTWARE restriction.

platform_{surfaceless,x11,wayland} were already handling it themselves.

Fixes: 47273d7312 "egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Brendan King <Brendan.King@imgtec.com>
2017-12-27 22:31:38 +00:00
Brendan King
e491bffc5c egl: link libEGL against the dynamic version of libglapi
Note: the following happens only when using slibtool.
Since this is a very serious breakage, we will keep the workaround until
a better solution is available.

DRI modules store the address of the dispatch table in a TLS variable,
_glapi_tls_Dispatch.

Changes to the way libEGL is built in d884d8d007 resulted in
it being statically linked against libglapi, and thus containing its own
copy of _glapi_tls_Dispatch. The result was that some applications would
fail to work (e.g. deqp-egl, which dynamically loads libEGL), due to the
DRI module storing the dispatch table address in one copy of
_glapi_tls_Dispatch, and libEGL obtaining the address from another copy
of the variable.

Fixes: d884d8d007 "egl/dri: link directly to libglapi.so"
Signed-off-by: Brendan King <Brendan.King@imgtec.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-27 22:29:20 +00:00
Dave Airlie
d2acf97e49 radv: don't do format replacement on tc compat htile surfaces.
For copies the texture unit needs to know the depth format so
it can read the htile data properly.

This fixes:
dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.load.clear

Fixes: ad3d98da9f (radv: enable tc compatible htile for d32s8 also.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-28 05:24:52 +10:00
Dave Airlie
5ba26ed6e5 radv/gfx9: use correct stencil format for tc compat htile.
This needs to correspond to the bit depth of the Z plane.

noticed in passing reading amdvlk.

Fixes: fc6c77e162 (radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-28 05:23:49 +10:00
Brian Paul
3e59e442c3 svga: move variant->fs_shadow_compare_units assignment
Fixes a crash since the variant object isn't allocated until later
in the function.  Not sure how this got through.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-12-27 11:34:51 -07:00
Samuel Pitoiset
3260a96c17 amd/common: rework set_userdata_location() and rename to set_loc()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:25:17 +01:00
Samuel Pitoiset
4221a816e2 amd/common: rename set_userdata_location_shader() to set_loc_shader()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:25:15 +01:00
Samuel Pitoiset
5081fd398e amd/common: replace set_userdata_location_indirect() by set_loc_desc()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:25:13 +01:00
Samuel Pitoiset
f8202ef683 amd/common: rename radv_define_vs_user_sgprs_phase2()
... to set_vs_specific_input_locs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:25:11 +01:00
Samuel Pitoiset
9d5a1787ee amd/common: rename radv_define_common_user_sgprs_phase2()
... to set_global_input_locs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:25:08 +01:00
Samuel Pitoiset
9a2393a510 amd/common: rename add_user_sgpr_array_argument() to add_array_arg()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:25:06 +01:00
Samuel Pitoiset
b6217bdbee amd/common: replace add_sgpr_argument() by add_arg()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:25:04 +01:00
Samuel Pitoiset
32bbc9eb0f amd/common: replace add_user_sgpr_argument() by add_arg()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:25:02 +01:00
Samuel Pitoiset
e946b5360d amd/common: replace add_vgpr_argument() by add_arg()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:59 +01:00
Samuel Pitoiset
f1242a8976 amd/common: add new add_arg() helper for SGPRs/VGPRs arguments
The idea is to clean up the add arguments logic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:57 +01:00
Samuel Pitoiset
bedfa06eaf amd/common: rename radv_define_common_user_sgprs_phase1()
... to declare_global_input_sgprs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:55 +01:00
Samuel Pitoiset
0f58f67abe amd/common: rename radv_define_vs_user_sgprs_phase1()
... to declare_vs_specific_inputs_sgprs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:53 +01:00
Samuel Pitoiset
5c91c1614c amd/common: do not try to declare input VS SGPRs for GS
It's a no-op anyway but it looked strange to me, remove it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:51 +01:00
Samuel Pitoiset
fc35a071b6 amd/common: add declare_vs_input_vgprs() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:49 +01:00
Samuel Pitoiset
3015668cad amd/common: add declare_tes_input_vgprs() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:47 +01:00
Samuel Pitoiset
62942aa8c6 amd/common: remove unnecessary num_user_sgprs_used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:46 +01:00
Samuel Pitoiset
6edf1fcdf5 amd/common: remove unnecessary user_sgpr_count
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-27 10:24:44 +01:00
Samuel Pitoiset
0f8290dd32 radeonsi: make use of ac_init_exec_full_mask()
Similar to si_init_exec_full_mask().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-27 09:57:10 +01:00
Brian Paul
2250c967e7 svga: use tgsi_util_get_shadow_ref_src_index() in a couple place
No piglit changes.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-12-26 21:44:42 -07:00
Brian Paul
ad26999d33 tgsi: improve comment on tgsi_util_get_shadow_ref_src_index()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-12-26 21:44:33 -07:00
Brian Paul
8571506e51 svga: fix TGSI_TEXTURE_SHADOW1D coordinate selection
Fixes about 24 Piglit tex-miplevel-selection tests.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-12-26 21:44:22 -07:00
Brian Paul
1e0b64ced9 svga: fix shadow comparison failures
In some cases, We do shadow comparison cases in the fragment shader
instead of with texture sampler state.  But when we do so, we must
disable the shadow comparison test in the sampler state.  As it
was, we were doing the comparison twice, which resulted in nonsense.
Also, we had the texcoord and texel value swapped in the comparison
instruction.

Fixes about 38 Piglit tex-miplevel-selection tests.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-12-26 21:43:37 -07:00
Brian Paul
ac78ab951a util: add trivial comment on u_upload_create() 2017-12-26 21:40:49 -07:00
Dave Airlie
88d09b642d r600: fix atomic counter index mode getting emitted on pre-cayman
This is a regression since I added cayman atomic support, not sure
it fixes anything, but the shader dumps look better.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-27 02:02:46 +00:00
Dave Airlie
34d23e82ca radv: set some dcc parameters depending on if texture will be sampled
This is ported from amdvlk which sets the independent 64b blocks
only for image which will sample dcc.

I'm not sure how to port this to radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-27 11:10:52 +10:00
Dave Airlie
db27907d78 radv/radeonsi: set dcc min uncompressed properly for APUs.
This is ported from amdvlk.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-27 11:10:50 +10:00
Dave Airlie
cf363e4405 amd/common/radv/radeonsi: use register defines for dcc block sizes.
These are just taken from amdvlk, we probably knew these already,
but may as well port them now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-27 11:10:35 +10:00
Timothy Arceri
a88532c612 st/glsl_to_nir: add patch support to st_nir_assign_var_locations()
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-27 11:26:08 +11:00
Timothy Arceri
351eee05d3 st/glsl_to_nir: call post opt functions after opts have finished
We need to move this to a separate loop because
nir_compact_varyings() can alter the IR of a previous stage.

Fixes: 6648bd68fd "st/glsl_to_nir: enable NIR link time opts"

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-27 11:26:08 +11:00
Timothy Arceri
ddc0e7941f st/st_glsl_to_nir: call nir_lower_64bit_pack
Fixes 56 crashes in the radeonsi nir backend.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-27 11:26:08 +11:00
Dave Airlie
ae556ba778 docs/features: show es3.1 compat done on r600.
This was already being reported, just missed the docs.

Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-27 00:08:20 +00:00
Miklós Máté
3667714ccd mesa: always compare optype with symbolic name in ATI_fs
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-25 14:32:23 +01:00
Miklós Máté
0fd82f5455 mesa: document ati_fragment_shader::cur_pass and swizzlerq
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
2017-12-25 14:32:23 +01:00
Miklós Máté
f5415dcc8c mesa: move ATI_fs state compile changes after the error checks
Both in setup and arithmetic instructions. Also, remove the useless
new_*_inst() functions, and refactor check_arith_arg(), because it did
two completely different things.

Piglit: spec/ati_fragment_shader/error04-endshader

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
2017-12-25 14:32:23 +01:00
Miklós Máté
7df48c48f9 tnl: fix not having texture coords in ATI_fs in swrast
ATI_fs in swrast only had access to texture coordinates if there was a
valid texture bound and texturing was enabled.

Piglit: spec/ati_fragment_shader/render-sources and render-notexture

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
2017-12-25 14:32:23 +01:00
Miklós Máté
a64d6fb730 mesa: fix not having secondary color in ATI_fs in swrast
ATI_fs in swrast only had secondary color if GL_COLOR_SUM was enabled.
This patch probably fixes the same issue in r200.

Piglit: spec/ati_fragment_shader/render-sources and render-precedence

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
2017-12-25 14:32:23 +01:00
Miklós Máté
759d193ceb mesa: fix validate for secondary interpolator
This patch fixes multiple problems:
- the interpolator check was duplicated
- both had arg instead of argRep
- I split it into color and alpha for better readability and error msg
- the DOT4 check only applies to color instruction according to the spec
- made the DOT4 check fatal, and improved the error msg

Piglit: spec/ati_fragment_shader/error08-secondary

v2: fixed formatting, added spec quotations

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
2017-12-25 14:32:23 +01:00
Miklós Máté
8b3a519913 mesa: fix typo in ATI_fs dstMod error checking
Piglit: spec/ati_fragment_shader/error14-invalidmod

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-25 14:32:23 +01:00
Miklós Máté
4003ad298a mesa: fix crash when an ATI_fs pass begins with an alpha inst
This fixes crash when:
- first pass begins with alpha inst
- first pass ends with color inst, second pass begins with alpha inst
Also, use the symbolic name instead of a number.

Piglit: spec/ati_fragment_shader/api-alphafirst

v2: fixed formatting

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
2017-12-25 14:32:23 +01:00
Miklós Máté
178a3dfb0e mesa: add fallback texture for SampleMapATI if there is nothing
This fixes crash in the state tracker.

Piglit: spec/ati_fragment_shader/render-notexture

v2: fixed formatting, moved stuff inside the loop,
    moved the fallback later to fix more cases

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
2017-12-25 14:32:22 +01:00
Marek Olšák
f9cd6c502e radeonsi: don't use fast color clear for small images even on APUs
Increase the limit and handle non-square images better.

This makes glxgears 20% faster on APUs, and a little more on dGPUs.
We all use and love glxgears.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-12-25 14:23:18 +01:00
Marek Olšák
afdcf0f6b2 radeonsi: set PNT_SPRITE_ENA = point_quad_rasterization
This is based on how nvc0 translates the state.
2017-12-25 14:23:02 +01:00
Marek Olšák
986e467e4c gallium/util: add util_num_layers helper 2017-12-25 14:23:02 +01:00
Bas Nieuwenhuizen
70b5e85fc3 radv: Fix DCC compatible formats.
DCC was disabled when the image format is !!supported, which is one ! too many.

Ironically the commit that introduced it was supposed to lead to more DCC use ...

Fixes: 969537d935 "radv: Add support for more DCC compression with VK_KHR_image_format_list."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-23 10:58:18 +01:00
Anuj Phogat
2d04572038 Revert "i965/fs: Use align1 mode on ternary instructions on Gen10+"
This reverts commit 9cd60fce9c.
Above commit caused 2000+ piglit tests to assert fail. Disabling
the align1 mode on gen10 for now to avoid failures.

Cc: Matt Turner <mattst88@gmail.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-12-22 16:40:40 -08:00
Andres Gomez
466011e46a docs: update calendar, add news item and link release notes for 17.2.8
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-12-23 00:59:22 +02:00
Andres Gomez
7f4ea112ce docs: add sha256 checksums for 17.2.8
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 3281775ab9)
2017-12-23 00:55:35 +02:00
Andres Gomez
d18f00e160 docs: add release notes for 17.2.8
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 3482790712)
2017-12-23 00:55:33 +02:00
Ilia Mirkin
0dbdb07070 freedreno: set missing internal_format when importing texture
Fixes running piglits without -fbo. Probably lots of other stuff too.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-12-22 09:56:02 -05:00
Samuel Pitoiset
38f9b87af2 amd/common: add ac_export_mrt_z() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-22 10:38:49 +01:00
Samuel Pitoiset
03ef264146 amd/common: pass the family to ac_llvm_context_init()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-22 10:38:44 +01:00
Samuel Pitoiset
79c495aa37 radv: reduce the number of small surfaces that need CMASK or DCC
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-22 10:38:44 +01:00
Ilia Mirkin
05944a392e gm107/ir: use lane 0 for manual textureGrad handling
This is parallel to the pre-SM50 change which does this. Adjusts the
shuffles / quadops to make the values correct relative to lane 0, and
then splat the results to all lanes for the final move into the target
register.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-By: Karol Herbst <kherbst@redhat.com>
2017-12-22 00:17:15 -05:00
Dave Airlie
fbac9f86aa radv/meta: fix blit paths for depth/stencil (v2.1)
This fixes the layout issue for the blit path as well.

This fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint*

v2: use compatible render passes.
v2.1: use enum

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-22 14:11:02 +10:00
Dave Airlie
821b5379f0 radv: handle depth/stencil image copy with layouts better. (v3.1)
If we are doing a general->general transfer with HIZ enabled,
we want to hit the tile surface disable bits in radv_emit_fb_ds_state,
however we never get the current layout to know we are in general
and meta hardcoded the transfer layout which is always tile enabled.

This fixes:
dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general
dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general

v2: refactor some shared helpers for blit patches
v3: we only need multiple render passes as they should be compatible.
v3.1: use enum (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-22 14:10:04 +10:00
Dave Airlie
286fe1db47 radv: refactor blit2d pipeline creation
This just refactors the gfx9 blit2d pipeline creation
to be less lines of code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-22 14:09:49 +10:00
Dave Airlie
9f675bf934 radv/gfx9: add support for 3d images to blit 2d paths
This add support for a 3D image reading path to the blit 2d paths,
like I did for the clear paths.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-22 14:09:28 +10:00
Dave Airlie
a99fa7e8a2 radv/gfx9: add 3d sampler image->buffer copy shader. (v3)
On GFX9 we must access 3D textures with 3D samplers AFAICS.

This fixes:
dEQP-VK.api.image_clearing.core.clear_color_image.3d.single_layer

on GFX9 for me.

v1.1: fix tex->sampler_dim to dim
v2: send layer in from outside
v3: don't regress on pre-gfx9

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-22 14:08:48 +10:00
Dave Airlie
9594667899 radv: fix surface max layer count (v2)
looking at traces I noticed we'd set slice_max too large sometimes.

This should fix it.

v2: fix missing - 1

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-22 14:07:55 +10:00
Francisco Jerez
b3e3cb9901 intel/fs: Initialize fs_visitor::grf_used on construction.
This should shut up some Valgrind errors during pre-regalloc
scheduling.  The errors were harmless since they could only have led
to the estimation of the bank conflict penalty of an instruction
pre-regalloc, which is inaccurate at that point of the program
compilation, but no less accurate than the intended "return 0"
fall-back path.  The scheduling pass is normally re-run after regalloc
with a well-defined grf_used value and accurate bank conflict
information.

Fixes: acf98ff933 "intel/fs: Teach instruction scheduler about GRF bank conflict cycles."
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-21 15:20:17 -08:00
Francisco Jerez
1aa79d5ed5 intel/fs/bank_conflicts: Use posix_memalign() instead of overaligned new to obtain vector storage.
The weight_vector_type constructor was inadvertently assuming C++17
semantics of the new operator applied on a type with alignment
requirement greater than the largest fundamental alignment.
Unfortunately on earlier C++ dialects the implementation was allowed
to raise an allocation failure when the alignment requirement of the
allocated type was unsupported, in an implementation-defined fashion.
It's expected that a C++ implementation recent enough to implement
P0035R4 would have honored allocation requests for such over-aligned
types even if the C++17 dialect wasn't active, which is likely the
reason why this problem wasn't caught by our CI system.

A more elegant fix would involve wrapping the __SSE2__ block in a
'__cpp_aligned_new >= 201606' preprocessor conditional and continue
taking advantage of the language feature, but that would yield lower
compile-time performance on old compilers not implementing it
(e.g. GCC versions older than 7.0).

Fixes: af2c320190 "intel/fs: Implement GRF bank conflict mitigation pass."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104226
Reported-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-21 15:19:59 -08:00
Mark Janes
7665383a33 Revert "spirv: consider bitsize when handling OpSwitch cases"
This reverts commit 9702fac68e, which
hangs vulkancts and crucible on all platforms.

The patch is being reverted because it disables continuous integration
testing.  The patch from bug 104359 does not apply to master.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359
2017-12-21 12:15:40 -08:00
Dave Airlie
b81f1a592b radv: fix issue with multisample positions and interp_var_at_sample.
This fixes vmfaults seen on vega with:
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_single_sample_.128_128_1.samples_1

These were caused by the don't allocate cmask but it was just accidental.

The actual problem was the shader was trying to get the sample positions from
a buffer, but the buffer was never getting configured to contain them, as the
previous shader never needed them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 1171b304f3 (radv: overhaul fragment shader sample positions.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-22 05:42:48 +10:00
Emil Velikov
be86e5e7d5 docs: update calendar, add news item and link release notes for 17.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-21 17:40:16 +00:00
Emil Velikov
022258117e docs: add sha256 checksums for 17.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f66496d291)
2017-12-21 17:39:38 +00:00
Emil Velikov
11ec85ddee docs: add release notes for 17.3.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4f5e85e9e9)
2017-12-21 17:39:38 +00:00
Samuel Pitoiset
9f54675dbe radv/gfx9: fix primitive topology when adjacency is used
Found by inspection.

Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-21 10:49:17 +01:00
Brian Paul
6e5b882339 glsl: disable vec3 packing/splitting in tfb separate mode
This fixes a varying packing issue when using transform feedback in
GL_SEPARATE_ATTRIBS mode.  By time we get to linking, we already
know that the number of feedback attributes is under the
GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit so packing isn't
as critical.  In fact, packing/splitting vec3 attributes can cause
trouble because splitting effectively creates another TFB output
which can exceed device limits.  So, disable vec3 packing when it's
not needed to avoid that issue.

Fixes the Piglit ext_transform_feedback-separate test on VMware
driver.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:17 -07:00
Brian Paul
462df64495 glsl: simply packing class comparison
Handle comparing the packing class using the same method as we do
for var->data.is_xfb_only

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:17 -07:00
Brian Paul
06588a065f glsl: document varying_matches::assign_locations() params and return value
And change *components to components[] as a reminder that it's an array.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul
544f41ff19 glsl: remove some continue statements
In some cases, I think loop code is easier to read without continue
statements.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul
76fc24ba8d glsl: use bitwise operators in varying_matches::compute_packing_class()
The mix of bitwise operators with * and + to compute the packing_class
values was a little weird.  Just use bitwise ops instead.

v2: add assertion to make sure interpolation bits fit without collision,
per Timothy.  Basically, rewrite function to be simpler.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul
cd7705de44 glsl: simplify loop in varying_matches::assign_locations()
The use of break/continue was kind of weird/confusing.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul
47b4183c92 glsl: minor simplification in assign_varying_locations()
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul
a0430bb62c glsl: make varying_matches::is_varying_packing_safe() const
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul
d86c9836d5 glsl: trivial comment fixes in lower_packed_varyings.cpp
Reviewed by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Andres Gomez
a42e96f522 docs: update 17.3 and 18.0 cycles for the release calendar
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-20 19:48:58 +02:00
Juan A. Suarez Romero
24aac5f81e spirv: Makefile.nir.am: include vtn_gather_types_c.py script in tarball dist
Fixes: bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-12-20 17:44:35 +01:00
Lucas Stach
51523ab9fa st/dri: allow direct YUYV import
Push this format to the pipe driver unchanged.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-12-20 16:54:37 +01:00
Juan A. Suarez Romero
9702fac68e spirv: consider bitsize when handling OpSwitch cases
When walking over all the cases in a OpSwitch, take in account the bitsize
of the literals to avoid getting wrong cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-20 10:39:15 +01:00
Tapani Pälli
fcfb423646 drirc: set allow_glsl_cross_stage_interpolation_mismatch for more games
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104288
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-20 09:43:42 +02:00
Samuel Iglesias Gonsálvez
a31f0c4a36 anv: disallow VK_REMAINING_ARRAY_LAYERS in vkCmdClearAttachments()
Vulkan spec doesn't specify that VK_REMAINING_ARRAY_LAYERS is allowed
in the passed VkClearRect struct.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-20 06:55:41 +01:00
Ilia Mirkin
0cf6320eb5 nvc0/ir: change textureGrad to always use lane 0 as the tex origin
Thanks to Karol Herbst for the debugging / tracing work that led to this
change.

Move to using lane 0 as the "work" lane for the texture. It is unclear
why this helps, as that computation should be identical to doing it in
the "correct" lane with the properly adjusted quadops.

In order to be able to use the lane 0 result, we also have to ensure
that lane 0 contains the proper array/indirect/shadow values.

This applies to Fermi and Kepler. Maxwell+ may or may not need fixing,
but that lowering logic is separate.

Fixes KHR-GL45.texture_cube_map_array.sampling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-12-19 23:09:19 -05:00
Eric Anholt
22ceb1f99b broadcom/vc5: Add missing setting of the UIF XOR disable flag in textures.
Most piglit textures happened to work out by RGBW not changing in that
bit, but it did cause failures in RGBA16F fbo-generatemipmap-formats.
2017-12-19 15:55:14 -08:00
Eric Anholt
200562ad01 broadcom/vc5: Clean up the comment and code around level 0 UIF.
I wrote this early in driver development, and our UIF handling is much
better now.
2017-12-19 14:20:19 -08:00
Eric Anholt
5473dc2b1f broadcom/vc5: Simplify the tiling calculations.
The mb_tile_layout table was just the utile_w/h times two, so reuse the
utile code instead.
2017-12-19 14:10:06 -08:00
Eric Anholt
81cdfdd670 broadcom/vc5: Return the depth in all components of depth textures.
Apparently gallium's u_blitter wants depth from at least the .z component,
and other swizzling appears to apply on top of that.  Fixes
fbo-generatemipmap-formats failures with depth formats.
2017-12-19 13:40:30 -08:00
Eric Anholt
8efb57c05e broadcom/vc5: Enable decompressing RGTC for desktop GL support.
This matches freedreno's behavior.
2017-12-19 13:40:30 -08:00
Eric Anholt
a13e9915c2 broadcom/vc5: Use u_transfer_helper for MSAA mappings. 2017-12-19 13:40:30 -08:00
Eric Anholt
7a30517cce broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT.
There may be some more RCL work to be done (I think I need to split my Z/S
stores when doing separate stencil), but this gets piglit's "texwrap
GL_ARB_depth_buffer_float" working.

v2: Unwrap the z32f_wrapper before calling the helper, rather than having
    the helper have a callback.
v3: Rebase on Rob Clark's u_transfer_helper instead
2017-12-19 13:40:30 -08:00
Rob Clark
308076fd55 freedreno: add debug flag to force high priority context
Mainly for testing, FD_MESA_DEBUG=hiprio will force high priority
contexts.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-19 16:36:10 -05:00
Rob Clark
805a72404c freedreno: context priority support
For devices (and kernels) which support different priority ringbuffers,
expose context priority support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-19 16:36:10 -05:00
Rob Clark
0015217c1e gallium: plumb context priority through to driver
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-12-19 16:36:10 -05:00
Rafael Antognolli
85789831b4 intel/compiler/gen10: Disable push constants.
We still have gpu hangs on Cannonlake when using push constants, so
disable them for now until we have a proper fix for these hangs.

v2: Add warning message when creating context too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-12-19 12:32:24 -08:00
Samuel Pitoiset
4237c3d645 radv: properly load unused gl_LocalInvocationID/gl_WorkGroupID components
F1 2017 looks good now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:26:25 +01:00
Samuel Pitoiset
0c4a30eb51 radv: do not add extra SGPR when push constants are not used
This is not because the vertex stage needs some push constants
that other stages need them too. This should reduce the number
of loaded SGPRs in some situations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:18 +01:00
Samuel Pitoiset
39097282f7 radv: change the needs_push_constants logic
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:16 +01:00
Samuel Pitoiset
ca8f3a8d55 radv: store pipeline stages that need push constants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:14 +01:00
Samuel Pitoiset
1cecaa9174 radv: remove one useless check in ac_nir_shader_info_pass()
pipeline->layout can't be NULL now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:12 +01:00
Samuel Pitoiset
f9a07474a1 radv: remove one useless check in radv_flush_constants()
pipeline->layout can't be NULL now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:11 +01:00
Samuel Pitoiset
00162b2108 radv: add assertions to make sure pipeline layout objects are valid
The spec requires it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:09 +01:00
Samuel Pitoiset
3595a11648 radv: create pipeline layout objects for all meta operations
They are dummy objects but the spec requires layout to not be
NULL, this just makes sure we are creating valid pipeline layout
objects. This will allow us to remove some useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:06 +01:00
Bas Nieuwenhuizen
8b5fe4b2b4 radv: Use a sort for rebuilding the sparse buffer bo list.
It uses slightly more memory (though still bounded by the number
of mapped ranges), but gives less quadratic behavior.

Cuts 4 minutes from the runtime of the CTS *.sparse.* tests.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-19 21:12:48 +01:00
Rob Clark
3511a51be0 freedreno/ir3: handle VTXID_BASE for indirect draws
Need to do some gymnastics to copy the parameter from the indirect
parameters buffer to uniform so shader sees the correct base-vertex-id.

Fixes ./bin/arb_draw_indirect-vertexid on a5xx and probably a4xx too.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-19 15:00:18 -05:00
Rob Clark
d7cb509fd3 freedreno/ir3: add ctx->mem_to_mem()
For dealing with indirect-draw + gl_VertexID, we'll introduce another
case where we need to use CP_MEM_TO_MEM.  Rather than adding more
if(a5xx)/else make this a ctx vfunc.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-19 15:00:18 -05:00
Rob Clark
0536737983 freedreno/a5xx: use vertex_id_zero_base
Cmdstream traces from blob make it clear that the blob driver dev's
*think* a5xx has a real (non-zero-based) vtxid.  But reality claims
differently.

Fixes ./bin/gl-3.2-basevertex-vertexid and probably others.

This means draw-indirect is going to need some gymnastics to copy
base-vertex into uniform.  (a4xx probably needs that too.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-19 15:00:18 -05:00
Dave Airlie
a1e18e87c7 r600: clear compressed flags in image state on unbind.
If we aren't binding an image, clear the compressed flags.

This fixes a segfault seen with an apitrace.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104331
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-19 05:38:12 +00:00
George Kyriazis
999f1cd5c6 swr: Account for index_bias in offsets
When calculating buffer offsets for client buffers account for info.index_bias.

Fixes the follow piglit tests:
arb_draw_elements_base_vertex-drawelements-user_varrays
arb_draw_elements_base_vertex-negative-index-user_varrays

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-18 18:51:25 -06:00
Dave Airlie
6ab346c4d9 r600: only reported tgsi ir compute support on evergreen+
This fixes a crash on r600/r700.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-18 21:41:46 +00:00
Bas Nieuwenhuizen
42bc25a79c radv: Advertise sync fd import and export.
Passes dEQP-VK.*.sync_fd.*

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 22:13:31 +01:00
Bas Nieuwenhuizen
52b3f50df8 radv: Implement sync file import/export for fences & semaphores.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 22:13:31 +01:00
Bas Nieuwenhuizen
b98bbdf490 radv/amdgpu: wrap sync fd import/export.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 22:13:31 +01:00
Dave Airlie
dd517ad96d ac/nir: fix lds store for patch outputs.
This wasn't calculating the correct value, this along with
a nir patch fixes a regression in:
dEQP-VK.tessellation.shader_input_output.barrier

Fixes: 043d14db30 (ac/nir: don't write tcs outputs to LDS that aren't read back.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-19 06:44:24 +10:00
Dave Airlie
0e8e7ccf9d nir/linking: always set the used_across_stages/outputs_read bits
If we don't remap and output this code would trample the outputs
read bits.

This fixes a regression in
dEQP-VK.tessellation.shader_input_output.barrier

Fixes: 1c9c42d16b (nir: add varying component packing helpers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-19 06:44:11 +10:00
Jason Ekstrand
3be382cd7c spirv: Relax the validation conditions of OpSelect
The Talos Principle contains shaders with an OpSelect between two
vectors where the condition is a scalar boolean.  This is technically
against the spec bout nir_builder gracefully handles it by splatting
out the condition to all the channels.  So long as the condition is a
boolean, just emit a warning instead of failing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104246
2017-12-18 09:48:58 -08:00
Samuel Pitoiset
8d00e63ca8 radv: remove useless radv_cmask_info::base_address_reg
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:51:11 +01:00
Samuel Pitoiset
79b34d0832 amd/common: add ac_vgt_gs_mode() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:50:50 +01:00
Samuel Pitoiset
55f8431c76 amd/common: add ac_get_cb_shader_mask() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:50:48 +01:00
Samuel Pitoiset
bb01661918 Revert "radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components"
This reverts commit 2294d35b24.

We can't do this without adjusting the input SGPRs/VGPRs logic.
For now, just revert it. I will send a proper solution later.

It fixes a rendering issue in F1 2017 that CTS didn't catch up.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:50:02 +01:00
Dave Airlie
1bdeac545f radv: port merge tess info from anv
anv merges the tess info correctly, but radv wasn't doing this.

This fixes hangs in
dEQP-VK.tessellation.winding.default_domain.hlsl_triangles_ccw

Fixes: 60fc0544e0 (radv/pipeline: handle tessellation shader compilation)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-18 18:36:49 +10:00
Bas Nieuwenhuizen
d27aaae4d2 radv: Add external fence support.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:21 +01:00
Bas Nieuwenhuizen
6abfa37879 radv: Implement VK_KHR_external_fence_fd.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:17 +01:00
Bas Nieuwenhuizen
969421b7da radv: Implement fences based on syncobjs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:12 +01:00
Bas Nieuwenhuizen
b308bb8773 amd/common: Add detection of the syncobj wait/signal/reset ioctls.
First amdgpu bump after inclusion was 20 (which was done for local BOs).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:06 +01:00
Bas Nieuwenhuizen
1c3cda7d27 radv: Add syncobj signal/reset/wait to winsys.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:02 +01:00
Bas Nieuwenhuizen
52be440f48 configure/meson: Bump libdrm_amdgpu version requirement.
For the radv dependencies on syncobj signal/reset.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:30:55 +01:00
Tapani Pälli
61756a6ceb android: fix vulkan driver build
fixes undefined references by adding missing wsi common API

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-18 09:49:15 +02:00
Tapani Pälli
f94e234dd5 android: fix undefined references to futex API
Fixes: f98a2768ca "mesa: Add new fast mtx_t mutex type for basic use cases"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-18 09:24:38 +02:00
Dave Airlie
8de6d4ba0c docs: mark GL4.3 as finished for r600
Still only on fp64 supported hw.
2017-12-18 04:30:14 +00:00
Dave Airlie
7ce3bd9af3 r600: export robust buffer access
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-18 04:30:11 +00:00
Dave Airlie
ec7008f03a r600: export GLSL 430
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-18 04:30:08 +00:00
Dave Airlie
91dd4e44c2 r600/cs: add compute support to caps
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-18 04:30:05 +00:00
Dave Airlie
4388bbbf29 r600: always flush between gfx and compute
This is in no way optimal, but there seems to be some problems
mixing at the moment, lots of hangs, it is possible, just need
to figure out more magic.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-18 04:30:03 +00:00
Dave Airlie
af9e34b8d7 r600: fix unused variable warning
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-18 04:29:57 +00:00
Bas Nieuwenhuizen
b42e106d4d radv: Fix multi-layer blits.
We did not set the layer correctly for the dst, as we would keep
using the base layer. Same for the source image.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102710
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 01:27:49 +01:00
Rob Clark
e095b1347e freedreno/a5xx: add a5xx blitter
FD_MESA_DEBUG=noblit to disable

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-17 12:41:32 -05:00
Rob Clark
37464efa3f freedreno: add generic blitter
Basically a clone of util_blitter_blit() but with special handling to
blit PIPE_BUFFER as a PIPE_TEXTURE_1D.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-17 12:41:32 -05:00
Rob Clark
b852c3bf67 freedreno: add non-draw batches for compute/blit
Get rid of "gmem" (ie. tiling) ringbuffer, and just emit setup commands
directly to "draw" ringbuffer for compute (and in future for blits not
using the 3d pipe).  This way we can have a simple flat cmdstream buffer
and bypass setup related to 3d pipe.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-17 12:41:32 -05:00
Rob Clark
2697480c92 freedreno: track staging and shadow perf ctrs for the HUD
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-17 12:41:32 -05:00
Rob Clark
d848bee50f freedreno: staging upload transfers
In the busy && !needs_flush case, we can support a DISCARD_RANGE upload
using a staging buffer.  This is a bit different from the case of mid-
batch uploads which require us to shadow the whole resource (because
later draws in an earlier tile happen before earlier draws in a later
tile).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-17 12:41:32 -05:00
Rob Clark
f20013a119 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-17 12:41:32 -05:00
Bas Nieuwenhuizen
6d9849d63e anv: Remove unused variable.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-17 14:53:46 +01:00
Marek Olšák
35c3cbad3c radeonsi: don't call force_dcc_off for buffers
This was undefined yet harmless behavior in LLVM.
Not anymore - it causes a hang now.

Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-12-16 01:22:01 +01:00
Kenneth Graunke
02720f8d24 isl: Don't require VALIGN_2 for R32G32B32_FLOAT on Haswell.
According to the RENDER_SURFACE_STATE internal documentation, the
R32G32B32_FLOAT restriction is marked "IVB" only.  We choose to apply
it to Ivybridge and Baytrail, but not Haswell.

Apparently fixes KHR-GL46.texture_size_promotion.functional on Haswell.

Changes these tests from crashing to skipping on Haswell:
- KHR-GL46.direct_state_access.textures_storage_multisample_2d_rgb32f
- KHR-GL46.direct_state_access.textures_storage_multisample_3d_rgb32f

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-15 14:00:09 -08:00
Boyuan Zhang
2ec48039b8 radeon/uvd: add and manage render picture list
Create a list in decoder to store all render picture buffer pointers that
currently being used in reference picture lists.

During get message buffer call, check each pointer in render_pic_list[]
within given pic->ref[] list, remove pointer that no longer being used by
pic->ref[]. Then add current render surface pointer to the render_pic_list[]
and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous increamenting values.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-12-15 16:04:31 -05:00
Boyuan Zhang
f2bfd1cbb7 radeon/vcn: add and manage render picture list
Create a list in decoder to store all render picture buffer pointers that
currently being used in reference picture lists.

During get message buffer call, check each pointer in render_pic_list[]
within given pic->ref[] list, remove pointer that no longer being used by
pic->ref[]. Then add current render surface pointer to the render_pic_list[]
and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous increamenting values.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-12-15 16:04:31 -05:00
Boyuan Zhang
d9727f31a8 vl: remove is idr flag
Remove is_idr flag since not being used anymore.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-12-15 16:04:05 -05:00
Boyuan Zhang
3181065b7f st/va: directly use idr pic flag
Remove is_idr flag, and use idr_pic_flag provided by vaapi directly

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-12-15 16:04:05 -05:00
Boyuan Zhang
130e1d142f radeon/vce: determine idr by pic type
Vaapi encode interface provides idr frame flags, where omx interface doesn't.
Therefore, change to use picture type to determine idr frame, which will
work for both interfaces.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2017-12-15 16:04:05 -05:00
Boyuan Zhang
c87d91b9d8 radeon/vcn: determine idr by pic type
Vaapi encode interface provides idr frame flags, where omx interface doesn't.
Therefore, change to use picture type to determine idr frame, which will
work for both interfaces.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-12-15 16:04:05 -05:00
Emil Velikov
5d03a68640 util: scons: wire up the sha1 test
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-12-15 19:01:12 +00:00
Tim Rowley
f475ac3c40 swr/rast: Move more RTAI handling out of binner
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:57:12 -06:00
Tim Rowley
11a9d4f9b5 swr/rast: EXTRACT2 changed from vextract/vinsert to vshuffle
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:57:06 -06:00
Tim Rowley
12adf2c815 swr/rast: Fix cache of API thread event manager
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:57:01 -06:00
Tim Rowley
c68b2d5c79 swr/rast: Replace VPSRL with LSHR
Replace use of x86 intrinsic with general llvm IR instruction.

Generates the same final assembly.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:56:54 -06:00
Tim Rowley
20f9006603 swr/rast: Rework thread binding parameters for machine partitioning
Add BASE_NUMA_NODE, BASE_CORE, BASE_THREAD parameters to
SwrCreateContext.

Add optional SWR_API_THREADING_INFO parameter to SwrCreateContext to
control reservation of API threads.

Add SwrBindApiThread() function to allow binding of API threads to
reserved HW threads.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:56:46 -06:00
Tim Rowley
182cc51a50 swr/rast: Pull of RTAI gather & offset out of clip/bin code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:56:40 -06:00
Tim Rowley
ca59b2e75c swr/rast: Remove no-op VBROADCAST of vID
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:56:36 -06:00
Tim Rowley
01a57c11cb swr/rast: SIMD16 Fetch - Fully widen 32-bit integer vertex components
Also widen the 16-bit a 8-bit integer vertex component gathers to SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:56:30 -06:00
Tim Rowley
fa3105cdb5 swr/rast: Replace INSERT2 vextract/vinsert with JOIN2 vshuffle
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:56:25 -06:00
Tim Rowley
b38ac9dca1 swr/rast: SIMD16 Fetch - Fully widen 16-bit float vertex components
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:56:19 -06:00
Tim Rowley
df54678ba0 swr/rast: SIMD16 Fetch - Fully widen 32-bit float vertex components
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:56:03 -06:00
Tim Rowley
fbc27ff027 swr/rast: Pass prim to ClipSimd
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:54 -06:00
Tim Rowley
8b06920796 swr/rast: Pull most of the VPAI manipulation out of the binner/clipper
Move out of binner/clipper; hand them down from the frontend code instead.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:49 -06:00
Tim Rowley
f882891684 swr/rast: Move GatherScissors to header
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:42 -06:00
Tim Rowley
cdb61d45cd swr/rast: Rewrite Shuffle8bpcGatherd using shuffle
Ease future code maintenance, prepare for folding simd8 and simd16 versions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:38 -06:00
Tim Rowley
3ec98ab5d4 swr/rast: Convert gather masks to Nx1bit
Simplifies calling code, gets gather function interface closer to llvm's
masked_gather.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:33 -06:00
Tim Rowley
36e276b6b0 swr/rast: WIP - Widen fetch shader to SIMD16
Widen vertex gather/storage to SIMD16 for all component types.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:28 -06:00
Tim Rowley
6d5275498a swr/rast: Corrections to multi-scissor handling
binner's GatherScissors() will be turned into a real gather in the not
too distant future.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:24 -06:00
Tim Rowley
0e9e247687 swr/rast: Binner fixes for viewport index offset handling
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:19 -06:00
Tim Rowley
f2e3900a1e swr/rast: Remove unneeded copy of gather mask
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15 10:55:01 -06:00
Chris Wilson
a68873f668 i965: Allow old begin/end queryobj for gen4/5 with HW contexts
Since we have HW contexts on gen4/5, we could take advantage of them, as
done for gen6+ in commit e32cd5ffbb ("i965: Rely on hardware contexts
for query objects on Gen6+."), to only emit a pair of counters at
begin/end queryobj, rather than around every primitive. However, to keep
queryobj working in the meantime as we bringup support for HW ctx on
gen4/5, we can keep using the existing code.

References: e32cd5ffbb ("i965: Rely on hardware contexts for query objects on Gen6+.")
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-15 13:41:18 +00:00
Rob Clark
d1465b3aee freedreno: use u_transfer_helper
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-15 08:09:44 -05:00
Rob Clark
e94eb5e600 gallium/util: add u_transfer_helper
Add a new helper that drivers can use to emulate various things that
need special handling in particular in transfer_map:

 1) z32_s8x24.. gl/gallium treats this as a single buffer with depth
    and stencil interleaved but hardware frequently treats this as
    separate z32 and s8 buffers.  Special pack/unpack handling is
    needed in transfer_map/unmap to pack/unpack the exposed buffer

 2) fake RGTC.. GPUs designed with GLES in mind, but which can other-
    wise do GL3, if native RGTC is not supported it can be emulated
    by converting to uncompressed internally, but needs pack/unpack
    in transfer_map/unmap

 3) MSAA resolves in the transfer_map() case

v2: add MSAA resolve based on Eric's "gallium: Add helpers for MSAA
    resolves in pipe_transfer_map()/unmap()." patch; avoid wrapping
    pipe_resource, to make it possible for drivers to use both this
    and threaded_context.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-15 08:09:44 -05:00
Tapani Pälli
eac1aad624 i965: enable EXT_disjoint_timer_query extension
Following dEQP cases pass:
   dEQP-EGL.functional.get_proc_address.extension.gl_ext_disjoint_timer_query
   dEQP-EGL.functional.client_extensions.disjoint

Piglit test 'ext_disjoint_timer_query-simple' passes with these changes.

No changes/regression observed in Intel CI.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-15 08:42:48 +02:00
Tapani Pälli
33f73345da mesa: GL_EXT_disjoint_timer_query extension API bits
Patch adds GL_GPU_DISJOINT_EXT and enables to use timer queries when
EXT_disjoint_timer_query is enabled.

v2: enable extension only when EXT_disjoint_timer_query set

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-15 08:42:48 +02:00
Tapani Pälli
0a202dd5e8 glapi: add GL_EXT_disjoint_timer_query
Most entrypoints already available via other extensions like
GL_EXT_occlusion_query_boolean, GL_EXT_timer_query.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-15 08:42:48 +02:00
Tapani Pälli
80d96ca4c8 mesa: add DisjointOperation to gl_shared_state
This state will be used by EXT_disjoint_timer_query. As first
usage, patch sets DisjointOperation true when gpu reset happens.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-15 08:42:48 +02:00
Eric Anholt
49e2586bfc broadcom/vc5: Fix a typo in memcmp for sig unpack checking.
This shockingly ended up working out, because only the first byte of *sig
is used and (sizeof(*sig) != 0) == 1.  Fixes a compiler warning.

Link: https://bugs.freedesktop.org/show_bug.cgi?id=104183
2017-12-14 14:36:24 -08:00
Eric Anholt
1171f1749d broadcom/vc5: Enable NIR txd lowering on all txd instructions.
Fixes almost all of piglit's arb_shader_texture_lod grad tests, except for
the base -texgrad/texgradcube ones which fail on what appear to be
precision problems.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-14 14:36:17 -08:00
Eric Anholt
0bead224fe nir: Add a new lowering option to lower all txd to txl.
VC5 requires that all txd are lowered in the shader.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-14 14:36:17 -08:00
Eric Anholt
b08b628994 nir: Fix interaction of GL_CLAMP lowering with texture offsets.
We want the clamping of the coordinate to apply after the offset, so we
need to do math to lower the offset out of the instruction.  Fixes texwrap
offset cases for GL_CLAMP with GL_NEAREST on vc5.

Note: I moved the get_texture_size() verbatim, so that it was defined
before use.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-14 14:36:17 -08:00
Eric Anholt
52f024b052 broadcom/vc5: Fix shader input/outputs for gallium's new NIR linking. 2017-12-14 14:36:17 -08:00
Roland Scheidegger
1ae48963f7 gallivm: implement accurate corner behavior for textureGather with cube maps
The spec says the missing texel (when we wrap around both x and y axis)
should be synthesized as the average of the 3 other texels. For bilinear
filtering however we instead adjusted the filter weights (because, while
the complexity looks similar, there would be 4 times as many color values
to fix up than weights). Obviously this could not work for gather (hence
accurate corner filtering was disabled with gather).
Implement this by just doing it as the spec implies - calculate the 4th
texel as the average of the other 3. With gather of course there's only
one color to worry about, so it's not all that many instructions neither
(albeit surely the whole cube map filtering is hilariously complex).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-12-14 22:59:55 +01:00
Roland Scheidegger
a485ad0bcd gallivm: fix an issue with NaNs with seamless cube filtering
Cube texture wrapping is a bit special since the values (post face
projection) always are within [0,1], so we took advantage of that and
omitted some clamps.
However, we can still get NaNs (either because the coords already had NaNs,
or the face projection generated them), and in fact we didn't handle them
quite safely. I've seen -INT_MAX + 1 been propagated through as the final int
coord value, albeit I didn't observe a crash. (Not quite a coincidence, since
any stride mul with -INT_MAX or -INT_MAX+1 will turn up as a small positive
number - nevertheless, I'd rather not try my luck, I'm not entirely sure it
can't really turn up negative neither due to seamless coord swapping, plus
ifloor of a NaN is not guaranteed to return -INT_MAX by any standard. And
we kill off NaNs similarly with ordinary texture wrapping too.)
So kill off the NaNs by using the common max against zero method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-12-14 22:59:55 +01:00
Jason Ekstrand
4b8c9ea46b intel/tools: Convert aubinator over to the common framework
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:24 -08:00
Jason Ekstrand
35f9c27be3 intel/batch-decoder: Decode registers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:22 -08:00
Jason Ekstrand
81e4ecbc19 intel/batch-decoder: Decode dynamic state
Unfortunately, in aubinator and aubinator_error_decode we don't always
know how many of a given state we have, so we must guess.  One day,
we'll come up with a way to annotate the batch to solve this problem.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:20 -08:00
Jason Ekstrand
4ac2ee9001 intel/batch-decoder: Decode constants, binding tables, and samplers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:18 -08:00
Jason Ekstrand
d374423eab intel/tools: Switch aubinator_error_decode over to the gen_print_batch
The shared framework can now do everything that aubinator_error_decode
ever did and more.  It's time to make the switch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:16 -08:00
Jason Ekstrand
c86671c438 intel/batch-decoder: Decode graphics shaders
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:15 -08:00
Jason Ekstrand
d4081fb778 intel/batch-decoder: Decode vertex and index buffers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:13 -08:00
Jason Ekstrand
e27ec208ed intel/batch-decoder: Decode MEDIA_INTERFACE_DESCRIPTOR_LOAD
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:12 -08:00
Jason Ekstrand
be20043d00 intel/tools: Add the start of a generic batch decoder
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:10 -08:00
Jason Ekstrand
4cb96fbd91 intel/decoder: Expose the raw field value in the iterator
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:09 -08:00
Jason Ekstrand
79269e8f4b intel/disasm: Take a devinfo in gen_disasm_create
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:06 -08:00
Jason Ekstrand
a7ae72032f intel/decoder: Take a bit offset in gen_print_group
Previously, if a group was nested in another group such that it didn't
start on a dword boundary, we would decode it as if it started at the
start of its first dword.  This changes things to work even more in
terms of bits so that we can properly decode these structs.  This
affects MOCS, attribute swizzles, and several other things.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:04 -08:00
Jason Ekstrand
dca8f466ee intel/decoder: Stop rounding down to the nearest dword
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:03 -08:00
Jason Ekstrand
f264640693 intel/decoder: Convert the iterator to work entirely in bits
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:27:01 -08:00
Jason Ekstrand
ada705b671 intel/decoder: Drop gen_field_decode helper
It's unused

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-12-14 13:26:44 -08:00
Samuel Pitoiset
225b198802 amd/common: add ac_build_waitcnt()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:44 +01:00
Samuel Pitoiset
24601810e9 amd/common: more use of i32_1
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:42 +01:00
Samuel Pitoiset
ec4e566560 amd/common: more use of i32_0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:41 +01:00
Samuel Pitoiset
d43e72fd8c radeonsi: make use of ac_build_fdiv()
And move the comment to amd/common.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:38 +01:00
Samuel Pitoiset
88522e2bcd radv: export SampleMask from pixel shaders at full rate
Use 16_ABGR instead of 32_ABGR if Z isn't written.

Ported from RadeonSI.

No CTS regressions on Polaris.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:23:28 +01:00
Samuel Pitoiset
45872a0a6d radeonsi: make use of ac_get_spi_shader_z_format()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:23:25 +01:00
Samuel Pitoiset
91f4d746e4 amd/common: add ac_get_spi_shader_z_format()
ac_shader_util.c will contain shader helpers for RadeonSI
and RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:23:23 +01:00
Samuel Pitoiset
90c3bf0789 radv: do not load the local invocation index when it's unused
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:26 +01:00
Samuel Pitoiset
2294d35b24 radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components
We should also not load the input SGPRs and VGPRS, but
let's start with this for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:06 +01:00
Samuel Pitoiset
e001944410 amd/common: scan which components of gl_LocalInvocationID are used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:04 +01:00
Samuel Pitoiset
42285ed8c3 amd/common: scan which components of gl_WorkGroupID are used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:02 +01:00
Samuel Pitoiset
5a761167f5 radv: set FORCE_SIMD_DIST(1) for compute when profitable
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:20:59 +01:00
Samuel Pitoiset
75b1c4997f radv: calculate best compute resource limits
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:20:57 +01:00
Samuel Pitoiset
9fdc1437ba radv: store the dispatch initiator into the device
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:20:55 +01:00
Samuel Pitoiset
2e58ef46a8 radv: replace grid_components_used by uses_grid_size
Use a boolean instead because the number of needed SGPRs
is always 3.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:19:42 +01:00
Samuel Pitoiset
97e57740d8 radv: always emit all compute block components
The number of grid components is always 3 when gl_NumWorkGroups
is declared, because it relies on the number of components of
nir_instrinsic_load_num_work_groups.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:19:39 +01:00
Emil Velikov
271fc8606a docs: update calendar, add news item and link release notes for 17.2.7
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-14 13:52:11 +00:00
Emil Velikov
7ddc3d9f15 docs: add sha256 checksums for 17.2.7
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-14 13:49:32 +00:00
Emil Velikov
0811bb3bd3 docs: add release notes for 17.2.7
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-14 13:49:30 +00:00
Harish Krupo
96fc5fbf23 egl/android: Provide an option for the backend to expose KHR_image
From android cts 8.0_r4, a new test case checks if all the required egl
extensions are exposed. In the current implementation we expose KHR_image
if KHR_image_base and KHR_image_pixmap are supported but KHR_image spec
does not mandate the existence of both the extensions.
This patch preserves the current check and also provides the backend
with an option to expose the KHR_image extension.

Test: run cts -m CtsOpenGLTestCases -t \
android.opengl.cts.OpenGlEsVersionTest#testRequiredEglExtensions

Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-14 13:43:03 +02:00
Bas Nieuwenhuizen
4eb0dca46b radv: Don't advertise VK_EXT_debug_report.
We never supported it. Missed during copy and pasting.

Fixes: 17201a2eb0 "radv: port to using updated anv entrypoint/extension generator."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-12-14 10:05:22 +01:00
Kenneth Graunke
fd3fc5f547 i965: Don't allocate an MCS for 16x MSAA and width > 8192.
The hardware doesn't support this, and isl_surf_get_mcs_surf will fail.

I feel a bit bad replicating this logic, but we want to decide up front.

This fixes the following test when run with --deqp-surface-width=16384:
- GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_framebuffers_different_sample_count

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-12-14 00:37:33 -08:00
Rob Herring
546633dce2 Android: fix missing generation of vtn_gather_types.c
Commit bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
added generation of vtn_gather_types.c, but forgot to add it to the
Android build files.

Fixes: bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-13 16:20:15 -06:00
Dylan Baker
e5d8ffdda6 mesa: Add glSpecializeShaderARB to common_desktop_functions
CC: Nicolai Hähnle <nicolai.haehnle@amd.com>
CC: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104231
Fixes: 46b21b8f90 ("mesa: add GL_ARB_gl_spirv boilerplate")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-13 13:24:57 -08:00
Tomasz Figa
5364e73624 egl/android: Partially handle HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED
There is no API available to properly query the IMPLEMENTATION_DEFINED
format. As a workaround we rely here on gralloc allocating either
an arbitrary YCbCr 4:2:0 or RGBX_8888, with the latter being recognized
by lock_ycbcr failing.

Reviewed-on: https://chromium-review.googlesource.com/566793

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-13 14:51:48 -06:00
Bruce Cherniak
ea2ee9cd19 swr: Correct texture allocation and limit max size to 2GB
This patch fixes piglit tex3d-maxsize by correcting 4 things:

The total_size calculation was using 32-bit math, therefore a >4GB
allocation request overflowed and was not returning false (unsupported).

Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle
>4GB allocations.

Added error checking on texture allocations to fail gracefully.

Finally, temporarily decreased supported max texture size from 4GB to 2GB.
The gallivm texture-sampler needs some additional work to correctly handle
larger than 2GB textures (offsets to LLVMBuildGEP are signed).

I'm working on a follow-on patch to allow up to 4GB textures, as this is
useful in HPC visualization applications.

Fixes piglit tex3d-maxsize.

v2: Updated patch description to clarify ">4GB".

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2017-12-13 14:44:04 -06:00
Bruce Cherniak
709f5bdc4a swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.
Environment variable KNOB_MAX_WORKER_THREADS allows the user to override
default thread creation and thread binding.  Previous commit to adjust
linux cpu topology caused setting this KNOB to bind all threads to a single
core.

This patch restores correct functionality of override.

Cc: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-12-13 14:44:01 -06:00
Dylan Baker
1774c10361 meson: fix glx-test race
This test should rely on dispatch.h being generated, but it doesn't.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-12-13 09:37:12 -08:00
Brian Paul
c27a6c45c2 gallium/docs: document behavior of set_sample_mask()
The sample mask is used even if msaa is not explicity enabled when we
have a framebuffer with multisampled surfaces.  That's DX behavior and
what the Radeon drivers do.  Not sure about other drivers at this point.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-12-13 08:38:07 -07:00
Brian Paul
0f2bd31baf glsl: trivial whitespace fixes in link_varyings.cpp 2017-12-13 08:38:07 -07:00
Jordan Justen
dc07bb5fd1 program: Don't reset SamplersValidated when restoring from shader cache
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-13 00:31:06 -08:00
Kai Wasserbäch
729fec6eab mesa: remove second include of errors.h in src/mesa/main/glspirv.c
Fixes: 5bc03d2508 ("mesa: implement SPIR-V loading in glShaderBinary")
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-12 23:42:53 -08:00
Timothy Arceri
3308f4b81a radeonsi: create get_tcs_tes_buffer_address helper
This will be shared between the NIR and TGSI backends.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-13 13:41:53 +11:00
Timothy Arceri
a5f9ac2928 ac: fix nir_op_f2f64
Without this we get the error "FPExt only operates on FP" when
converting the following:

   vec1 32 ssa_5 = b2f ssa_4
   vec1 64 ssa_6 = f2f64 ssa_5

Which results in:

   %44 = and i32 %43, 1065353216
   %45 = fpext i32 %44 to double

With this patch we now get:

   %44 = and i32 %43, 1065353216
   %45 = bitcast i32 %44 to float
   %46 = fpext float %45 to double

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-13 13:20:28 +11:00
Timothy Arceri
cab5513b47 nir: fix shift for uint64_t
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-13 13:20:28 +11:00
Timothy Arceri
dd119a4263 st/glsl_to_nir: skip forced array splitting for tcs
nir_lower_io_to_temporaries() does not support tcs so we cannot
assume there are no indirects here. Also the radeonsi backend
(the only backend to support tess) has support for tcs indirects
so there is no need to lower them anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-13 13:20:28 +11:00
Francisco Jerez
acab52f520 intel/fs/bank_conflicts: Don't touch Gen7 MRF hack registers.
Fixes: af2c320190 "intel/fs: Implement GRF bank conflict mitigation pass."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104199
Reported-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-12-12 12:05:45 -08:00
Kevin Rogovin
b1ce812c51 i965: compute scratch space size correctly for Gen9+
Fixes: 8ecdbb6136 "i965: Pretend there are 4 subslices for compute shader threads on Gen9+."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104005
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
2017-12-12 10:02:43 -08:00
Kevin Rogovin
eea9027f87 i965: Program MEDIA_VFE_STATE in a more readable fashion.
This patch is purely for readability improvements when programming
the MEDIA_VFE_STATE.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-12 10:02:31 -08:00
Brian Paul
7469966ed2 cso: add point rasterization sanity check assertion
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-12-12 09:46:18 -07:00
Brian Paul
38a4fd8ad6 gallium/u_blitter: replace tabs with spaces
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-12 09:46:18 -07:00
Brian Paul
7a46063803 xlib: call _mesa_warning() instead of fprintf()
We use _mesa_warning() everywhere else in this code.  Change requested
by Rick Irons of Mathworks.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-12 09:44:59 -07:00
Brian Paul
63b03dc924 gallium/util: don't pass a pipe_resource to util_resource_is_array_texture()
No need to pass a pipe_resource when we can just pass the target.
This makes the function potentially more usable.  Rename it too.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-12 09:44:59 -07:00
Brian Paul
dde8309cde gallium/aux: include nr_samples in util_resource_size() computation
This function is only used in two places:
1. VMware driver, but only for HUD reporting
2. st/nine state tracker, used for texture memory accounting

Fixes: a69efa9482 ("util: add new util_resource_size() function in
u_resource.[ch]")

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-12 09:44:59 -07:00
Brian Paul
09b69828a3 svga: trivial whitespace/formatting fixes in svga_pipe_rasterizer.c 2017-12-12 09:44:59 -07:00
Brian Paul
71ac73ce76 st/mesa: trivial whitespace/formatting fixes in st_atom_rasterizer.c 2017-12-12 09:44:59 -07:00
Jason Ekstrand
9718ce44c2 spirv: Handle image and sampler function parameters 2017-12-12 07:34:46 -08:00
Jason Ekstrand
7d3ebd1286 spirv/cfg: Refactor the function parameter loop a bit 2017-12-12 07:34:46 -08:00
Jason Ekstrand
e6ba457c99 spirv/cfg: Be a bit more precise about function parameters
Pointers with no storage type are converted to inout variables but SSA
values and pointers with a storage type (which turns into a uint or
uvec2) are just input variables.
2017-12-12 07:34:46 -08:00
Jason Ekstrand
aaeda8d7d4 spirv: Make sampled images a real type
Previously, we just gave them exactly the same type as the respective
image (which already had a sampler2D or similar type).  Now they have
their own base type and a pointer to the vtn_type for the image.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-12-12 07:34:46 -08:00
Eric Engestrom
ec0a4fcec0 i915: add missing 0 defines
Thanks to Emil's -Wundef, t_dd_dmatmp.h now complains that intel_render.c
is missing a couple `#define`s.

Assigning them to 0 keeps the existing behaviour; I'll let someone else
turn them on if this is the behaviour that was intended.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-12 13:59:46 +00:00
Nicolai Hähnle
accb7d4390 mesa: refuse to compile SPIR-V shaders or link mixed shaders
Note that gl_shader::CompileStatus will also indicate whether a shader
has been successfully specialized.

v2: Use the 'spirv_data' member of gl_shader to know if it is a SPIR-V
   shader, instead of a dedicated flag. (Timothy Arceri)

v3: Use bool instead of GLboolean. (Ian Romanick)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-12 08:18:32 +01:00
Nicolai Hähnle
4ccd00d762 mesa/shaderapi: add a getter for GL_SPIR_V_BINARY_ARB
v2: Use the 'spirv_data' member of gl_shader instead of a
   dedicated flag. (Timothy Arceri)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-12 08:18:32 +01:00
Nicolai Hähnle
5bc03d2508 mesa: implement SPIR-V loading in glShaderBinary
v2: * Add a gl_shader_spirv_data member to gl_shader, which already
   encapsulates a gl_spirv_module where the binary will be saved.
   (Eduardo Lima)

    * Just use the 'spirv_data' member to know whether a gl_shader has
   the SPIR_V_BINARY_ARB state. (Timothy Arceri)

    * Remove redundant argument checks. Move extension presence check
   to API entry point where the rest of checks are. Retype 'n' and
   'length'arguments to use the correct and more standard types.
   (Ian Romanick)

    * Fix some nitpicks. (Ian Romanick)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-12 08:18:32 +01:00
Eduardo Lima Mitev
a8889f5cc7 mesa/glspirv: Add struct gl_shader_spirv_data
This is a per-shader structure holding the SPIR-V data associated with the
shader (binary module, specialization constants and entry-point).

This is needed because both gl_shader and gl_linked_shader need to share this
data. Instead of copying the data, we pass a reference to it upon program
linking. That's why it is reference-counted.

This struct is created and associated with the shader upon calling
glShaderBinary(), then subsequently filled up by the call to
glSpecializeShaderARB().

v2: Readability improvements (Ian Romanick)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-12 08:18:32 +01:00
Nicolai Hähnle
74f98ab76f mesa/glspirv: Add struct gl_spirv_module
v2: * Make the SPIR-V module struct part of a larger gl_shader_spirv_data
    struct that will be introduced later, and don't reference it directly
    in gl_shader. (Eduardo Lima)
    * Readability improvements (Ian Romanick)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-12 08:18:32 +01:00
Nicolai Hähnle
46b21b8f90 mesa: add GL_ARB_gl_spirv boilerplate
v2: * Add meson build bits (Eric Engestrom)
    * Return INVALID_OPERATION error on SpecializeShaderARB (Ian Romanick)

v3: Include boilerplate for the GL 4.6 alias of glSpecializeShaderARB
   (Neil Roberts)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-12 08:18:32 +01:00
Jason Ekstrand
df657ebb68 spirv: Add support for all bit sizes in OpSwitch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101560
2017-12-11 22:28:34 -08:00
Jason Ekstrand
58cabae8cc spirv: Restructure the case loop in OpSwitch handling
Instead of calling vtn_add_case for the default case and then looping,
add an is_default variable and do everything inside the loop.  This will
make the next commit easier.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand
5f572ccc95 spirv: Add better parameter validation for vector and matrix types
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand
a7c2be9944 spirv: Add type validation for OpSelect
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand
6737b1b859 spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand
bb1e6ff161 spirv: Add a prepass to set types on vtn_values
This autogenerated pass will automatically find and set the type field
on all vtn_values.  This way we always have the type and can use it for
validation and other checks.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand
2c84b49ddf spirv: Add a vtn_type field to all vtn_values
At the moment, this just lets us drop the const_type for constants and
unify things a bit.  Eventually, we will use this to store the types of
all SPIR-V SSA values.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Samuel Iglesias Gonsálvez
ba4bb0838b anv: fix bug when using component qualifier in FS outputs
We can write to the same output but in different components, like
in this example:

layout(location = 0, component = 0) out ivec2 dEQP_FragColor_0;
layout(location = 0, component = 2) out ivec2 dEQP_FragColor_1;

Therefore, they are not two different outputs but only one.

Fixes:

dEQP-VK.glsl.440.linkage.varying.component.frag_out.*

v3:
- Remove FRAG_RESULT_MAX.
- Add const and use sizeof (Ian).
- Do three-pass to set properly the locations of fragment
  outputs when having arrays (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-12 07:24:55 +01:00
Ilia Mirkin
0332c7484b st/mesa: swizzle argument when there's a vector size mismatch
GLSL IR operation arguments can sometimes have an implicit swizzle as a
result of a vector arg and a scalar arg, where the scalar argument is
implicitly expanded to the size of the vector argument.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103955
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-11 23:08:43 -05:00
Roland Scheidegger
84c363fb09 gallivm: fix texture wrapping for texture gather for mirror modes
Care must be taken that all coords end up correct, the tests are very
sensitive that everything is correctly rounded. This doesn't matter
for bilinear filter (since picking a wrong texel with weight zero is
ok), and we could also switch the per-sample coords mistakenly.
While here, also optimize the coord_mirror helper a bit (we can do the
mirroring directly by exploiting float rounding, no need for fixing up
odd/even manually).
I did not touch the mirror_clamp and mirror_clamp_to_border modes.
In contrast to mirror_clamp_to_edge and mirror_repeat these are legacy
modes. They are specified against old gl rules, which actually does
the mirroring not per sample (so you get swapped order if the coord
is in the mirrored section). I think the idea though is that they should
follow the respecified mirror_clamp_to_edge rules so the order would be
correct.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-12-12 04:23:02 +01:00
Jason Ekstrand
24f019fd69 spirv: Allow ignoring decorations for workgroup variables
Since we switched over to lowering SLM access directly in SPIR-V -> NIR,
we no longer have vtn_variables for SLM.  It's all safe as with UBOs and
SSBOs but we need to let it through in the assert.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104213
Fixes: 8761a04d0d
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-11 19:02:47 -08:00
Jason Ekstrand
2bc9123c33 spirv: Set lengths on scalar and vector types
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 19:02:47 -08:00
Bas Nieuwenhuizen
3342a432fa ac/nir: Support vulkan_resource_reindex.
Fixes: 93b4cb61eb "spirv: Allow OpPtrAccessChain for block indices"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-12 00:16:18 +01:00
Bas Nieuwenhuizen
368f49b284 ac/nir: Don't load the descriptor in vulkan_resource_index.
To support the reindex intrinsic, we need the result to be
something on which we can adjust the index/address.

Since it is all within a basic block, the compiler should be
able to merge any extra loads.

v2: Change visit_get_buffer_size too.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-12 00:16:18 +01:00
Marek Olšák
bf0904e31f winsys/amdgpu: disable local BOs again due to worse performance
Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-11 19:11:14 +01:00
Marek Olšák
8a821fa91c drirc: whitelist glthread for Mount and Blade Warband again 2017-12-11 19:11:12 +01:00
Bas Nieuwenhuizen
6469669beb radv: Don't use local BOs when allocating with export options.
If the app does not plan to put a buffer or image in it
(why? But it is allowed and CTS does it), they do not need to
allocate it with the deciate allocation struct.

Fixes: a639d40f13 "radv: add support for local bos. (v3)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-10 23:47:23 +01:00
Bas Nieuwenhuizen
b926da241a spirv: Fix loading an entire block at once.
There is no chain, so  checking the length ends with a SEGFAULT.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103579
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-10 01:43:26 +01:00
Jason Ekstrand
4c7af87fb9 anv: Enable UBO pushing
Push constants on Intel hardware are significantly more performant than
pull constants.  Since most Vulkan applications don't actively use push
constants on Vulkan or at least don't use it heavily, we're pulling way
more than we should be.  By enabling pushing chunks of UBOs we can get
rid of a lot of those pulls.

On my SKL GT4e, this improves the performance of Dota 2 and Talos by
around 2.5% and improves Aztec Ruins by around 2%.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:26 -08:00
Jason Ekstrand
f1ce0b905a i965/fs: Handle !supports_pull_constants and push UBOs properly
In Vulkan, we don't support classic pull constants and everything the
client asks us to push, we push.  However, for pushed UBOs, we still
want to fall back to conventional pulls if we run out of space.
2017-12-08 15:43:25 -08:00
Jason Ekstrand
8d34077182 anv/device: Increase the UBO alignment requirement to 32
Push constants work in terms of 32-byte chunks so if we want to be able
to push UBOs, every thing needs to be 32-byte aligned.  Currently, we
only require 16-byte which is too small.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
2f9eb045f3 anv/cmd_buffer: Add support for pushing UBO ranges
In order to do this we have to modify push constant set up to handle
ranges.  We also have to tweak the way we handle dirty bits a bit so
that we re-push whenever a descriptor set changes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
0c879b62b0 anv/cmd_buffer: Add some stage asserts
There are several places where we look up opcodes in an array of stages.
Assert that the we don't end up going out-of-bounds.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
1968cd07a2 anv/cmd_buffer: Add some helpers for working with descriptor sets
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
1bce04deb8 anv/pipeline: Translate vulkan_resource_index to a constant when possible
We want to call brw_nir_analyze_ubo_ranges immedately after
anv_nir_apply_pipeline_layout and it badly wants constants.  We could
run an optimization step and let constant folding do it but that's way
more expensive than needed.  It's really easy to just handle constants
in apply_pipeline_layout.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
3b34ed79f1 i965/fs: Rewrite assign_constant_locations
This rewires the logic for assigning uniform locations to work in terms
of "complex alignments".  The basic idea is that, as we walk the list of
instructions, we keep track of the alignment and continuity requirements
of each slot and assert that the alignments all match up.  We then use
those alignments in the compaction stage to ensure that everything gets
placed at a properly aligned register.  The old mechanism handled
alignments by special-casing each of the bit sizes and placing 64-bit
values first followed by 32-bit values.

The old scheme had the advantage of never leaving a hole since all the
64-bit values could be tightly packed and so could the 32-bit values.
However, the new scheme has no type size special cases so it handles not
only 32 and 64-bit types but should gracefully extend to 16 and 8-bit
types as the need arises.

Tested-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
597c194487 anv: Disable VK_KHR_16bit_storage
The testing for this extension is currently very poor.  The CTS tests
only test accessing UBOs and SSBOs at dynamic offsets so none of our
constant-offset paths get triggered at all.  Also, there's an assertion
in our handling of nir_intrinsic_load_uniform that offset % 4 == 0 which
is never triggered indicating that nothing every gets loaded from an
offset which is not a dword.  Both push constants and the constant
offset pull paths are complex enough, we really don't want to ship
without tests.  We'll turn the extension back on once we have decent
tests.
2017-12-08 15:42:55 -08:00
Leo Liu
6d74cb2570 radeon/vce: move destroy command before feedback command
VCE processing IBs starts from session and task info at first level,
other commands processed subsequently. The task info for destroy is
embedded to destroy command, resulting that feedback command is not
properly procoessed. This is causing kernel spin VM fault messages on
Polaris and Vega10 card when running ends at encode application.

The fix is also verified on VCE physical mode card.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Christian König <christian.koenig@amd.com>
2017-12-08 12:56:48 -05:00
Ben Crocker
060eb314eb docs/llvmpipe: document ppc64le as alternative architecture to x86.
Power8, Power8NV, and Power9 are supported on an equal footing
with X86.

Cc: "17.2" "17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

[Eric: changed formatting, reworded a bit (with Ben's ack)]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-08 14:49:00 +00:00
Emil Velikov
bce489a4ed docs/release-calendar: drop 17.3.0 from the table
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-08 13:59:27 +00:00
Emil Velikov
95c9d751ce docs: add news item and link release notes for 17.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-08 13:58:03 +00:00
Emil Velikov
706986bcc9 docs: add sha256 checksums for 17.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 49a612d158)
2017-12-08 13:54:34 +00:00
Emil Velikov
4124ac51f4 docs: Update 17.3.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8d55da9f57)
2017-12-08 13:54:33 +00:00
Samuel Pitoiset
572b2bad1d radv: do not print ASM to stderr when dumping shaders
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:24:24 +01:00
Samuel Pitoiset
33b329f769 radv/winsys: implement query_value()
Might be useful to know the VRAM/GTT usage, the number of VRAM
CPU page faults, etc. Nothing is currently using that new
interface, but it's a first step.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:35 +01:00
Samuel Pitoiset
c202119286 radv: remove useless check radv_set_dcc_need_cmask_elim_pred()
emit_fast_color_clear() already checks that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:03 +01:00
Samuel Pitoiset
d90b7a4c50 radv: remove useless checks in radv_set_{color,depth}_clear_regs()
Already checked by the respective callers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:00 +01:00
Samuel Pitoiset
c7c7b00889 radv: only re-mit the index type when it changes
dota2 binds a ton of index buffers but the type is always 16-bit.
Note that we have to invalidate the type when switching from
indexed draws to normal draws.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:36 +01:00
Samuel Pitoiset
a302009b7b radv: only reset command buffers that are not in the initial state
dota2 always calls vkResetCommandBuffer() before
vkBeginCommandBuffer() which is quite useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:23 +01:00
Samuel Pitoiset
a380bc7ecf radv: track different status of a command buffer
RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think
it makes sense to declare it. Could be used later with better
command buffer error handling.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:21 +01:00
Samuel Pitoiset
fc6c77e162 radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega
Copied from RadeonSI.

This fixes all CTS
dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.clear.*

And some other ones which use the same format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:15:44 +01:00
Jordan Justen
4d81c8e43e docs: Update GL_ARB_get_program_binary docs to support 1 format
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 17:01:02 +11:00
Jordan Justen
b4c37ce214 i965: Add ARB_get_program_binary support using nir_serialization
This resolves an apparent game bug described in 85564. The game
doesn't properly handle ARB_get_program_binary with 0 supported
formats.

V2 (Timothy Arceri):
 - less driver code as more has been moved into the common helpers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 17:00:57 +11:00
Jordan Justen
c1ff99fd70 main: Clear shader program data whenever ProgramBinary is called
The GL_ARB_get_program_binary extension spec says:

 "If ProgramBinary fails to load a binary, no error is generated, but
  any information about a previous link or load of that program object
  is lost."

v2:
 * Re-initialize shProg->data after clear. (Jordan)
   (Required after 6a72eba755)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
50c09a648f main: add binary support to ProgramBinary
V2: call generic mesa_program_binary() helper rather than driver
    function directly to allow greater code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
7ee54ad057 main: add binary support to GetProgramBinary
V2: call generic _mesa_get_program_binary() helper rather than driver
    function directly to allow greater code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
e30ed18215 main: Support getting GL_PROGRAM_BINARY_LENGTH
V2: call generic _mesa_get_program_binary_length() helper
    rather than driver function directly to allow greater
    code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>i (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
c20fd744fe mesa: Add Mesa ARB_get_program_binary helper functions
V2 (Timothy Arceri):
 - add extra code comment
 - stop passing around void *binary and just pass
   program_binary_header *hdr instead.
 - move to src/mesa/main rather than src/util

V3 (Timothy Arceri):
 - Move more code out of the backend and into the common
   helpers.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Timothy Arceri
90d4abdd87 mesa: add driver callbacks for serialising ProgramBinary blobs
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
64ad804e59 main: Support 1 Mesa format with get for GL_PROGRAM_BINARY_FORMATS
Mesa supports either 0 or 1 formats. If 1 format is supported, it is
GL_PROGRAM_BINARY_FORMAT_MESA as defined in the
GL_MESA_program_binary_formats extension spec.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
fb077d603b main: Allow non-zero NUM_PROGRAM_BINARY_FORMATS
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
2e28494af2 i965: Fix memory leak when serializing nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
25b3ce6e3b i965: Add brw_program_serialize_nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:59:22 +11:00
Jordan Justen
b3f1b765e9 i965: Free serialized nir after deserializing
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
cdc7ac23b9 i965: Add brw_program_deserialize_nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
7cf1037d5a main, glsl: Add UniformDataDefaults which stores uniform defaults
The ARB_get_program_binary extension requires that uniform values in a
program be restored to their initial value just after linking.

This patch saves off the initial values just after linking. When the
program is restored by glProgramBinary, we can use this to copy the
initial value of uniforms into UniformDataSlots.

V2 (Timothy Arceri):
 - Store UniformDataDefaults only when serializing GLSL as this
   is what we want for both disk cache and ARB_get_program_binary.
   This saves us having to come back later and reset the Uniforms
   on program binary restores.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
ebd9e789c4 glsl: Split out shader program serialization
This will allow us to use the program serialization to implement
ARB_get_program_binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
219628c118 include: Add GL_MESA_program_binary_formats to GL/GLES2 ext.h files
Thus was merged into the OpenGL Registry in version
667c5a253781834b40a6ae9eb19d05af4542cfe1.

Ref: https://github.com/KhronosGroup/OpenGL-Registry/pull/127
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
0c48487893 mesa: add GL_PROGRAM_BINARY_FORMAT_MESA enum
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Francisco Jerez
4d1959e693 intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution.
This addresses a long-standing back-end compiler bug that could lead
to cross-channel data corruption in loops executed non-uniformly.  In
some cases live variables extending through a loop divergence point
(e.g. a non-uniform break) into a convergence point (e.g. the end of
the loop) wouldn't be considered live along all physical control flow
paths the SIMD thread could possibly have taken in between due to some
channels remaining in the loop for additional iterations.

This patch fixes the problem by extending the CFG with physical edges
that don't exist in the idealized non-vectorized program, but
represent valid control flow paths the SIMD EU may take due to the
divergence of logical threads.  This makes sense because the i965 IR
is explicitly SIMD, and it's not uncommon for instructions to have an
influence on neighboring channels (e.g. a force_writemask_all header
setup), so the behavior of the SIMD thread as a whole needs to be
considered.

No changes in shader-db.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-07 18:27:05 -08:00
Francisco Jerez
9355116bda intel/fs: Don't let undefined values prevent copy propagation.
This makes the dataflow propagation logic of the copy propagation pass
more intelligent in cases where the destination of a copy is known to
be undefined for some incoming CFG edges, building upon the
definedness information provided by the last patch.  Helps a few
programs, and avoids a handful shader-db regressions from the next
patch.

shader-db results on ILK:

  total instructions in shared programs: 6541547 -> 6541523 (-0.00%)
  instructions in affected programs: 360 -> 336 (-6.67%)
  helped: 8
  HURT: 0

  LOST:   0
  GAINED: 10

shader-db results on BDW:

  total instructions in shared programs: 8174323 -> 8173882 (-0.01%)
  instructions in affected programs: 7730 -> 7289 (-5.71%)
  helped: 5
  HURT: 2

  LOST:   0
  GAINED: 4

shader-db results on SKL:

  total instructions in shared programs: 8185669 -> 8184598 (-0.01%)
  instructions in affected programs: 10364 -> 9293 (-10.33%)
  helped: 5
  HURT: 2

  LOST:   0
  GAINED: 2

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-07 18:27:04 -08:00
Francisco Jerez
c3c1aa5aeb intel/fs: Restrict live intervals to the subset possibly reachable from any definition.
Currently the liveness analysis pass would extend a live interval up
to the top of the program when no unconditional and complete
definition of the variable is found that dominates all of its uses.

This can lead to a serious performance problem in shaders containing
many partial writes, like scalar arithmetic, FP64 and soon FP16
operations.  The number of oversize live intervals in such workloads
can cause the compilation time of the shader to explode because of the
worse than quadratic behavior of the register allocator and scheduler
when running out of registers, and it can also cause the running time
of the shader to explode due to the amount of spilling it leads to,
which is orders of magnitude slower than GRF memory.

This patch fixes it by computing the intersection of our current live
intervals with the subset of the program that can possibly be reached
from any definition of the variable.  Extending the storage allocation
of the variable beyond that is pretty useless because its value is
guaranteed to be undefined at a point that cannot be reached from any
definition.

According to Jason, this improves performance of the subgroup Vulkan
CTS tests significantly (e.g. the runtime of the dvec4 broadcast test
improves by nearly 50x).

No significant change in the running time of shader-db (with 5%
statistical significance).

shader-db results on IVB:

  total cycles in shared programs: 61108780 -> 60932856 (-0.29%)
  cycles in affected programs: 16335482 -> 16159558 (-1.08%)
  helped: 5121
  HURT: 4347

  total spills in shared programs: 1309 -> 1288 (-1.60%)
  spills in affected programs: 249 -> 228 (-8.43%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1652 -> 1597 (-3.33%)
  fills in affected programs: 262 -> 207 (-20.99%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 209

shader-db results on BDW:

  total cycles in shared programs: 67617262 -> 67361220 (-0.38%)
  cycles in affected programs: 23397142 -> 23141100 (-1.09%)
  helped: 8045
  HURT: 6488

  total spills in shared programs: 1456 -> 1252 (-14.01%)
  spills in affected programs: 465 -> 261 (-43.87%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1720 -> 1465 (-14.83%)
  fills in affected programs: 471 -> 216 (-54.14%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 162

shader-db results on SKL:

  total cycles in shared programs: 65436248 -> 65245186 (-0.29%)
  cycles in affected programs: 22560936 -> 22369874 (-0.85%)
  helped: 8457
  HURT: 6247

  total spills in shared programs: 437 -> 437 (0.00%)
  spills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total fills in shared programs: 870 -> 854 (-1.84%)
  fills in affected programs: 16 -> 0
  helped: 1
  HURT: 0

  LOST:   0
  GAINED: 107

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-07 18:27:04 -08:00
Francisco Jerez
acf98ff933 intel/fs: Teach instruction scheduler about GRF bank conflict cycles.
This should allow the post-RA scheduler to do a slightly better job at
hiding latency in presence of instructions incurring bank conflicts.
The main purpuse of this patch is not to improve performance though,
but to get conflict cycles to show up in shader-db statistics in order
to make sure that regressions in the bank conflict mitigation pass
don't go unnoticed.

Acked-by: Matt Turner <mattst88@gmail.com>
2017-12-07 15:56:49 -08:00
Francisco Jerez
af2c320190 intel/fs: Implement GRF bank conflict mitigation pass.
Unnecessary GRF bank conflicts increase the issue time of ternary
instructions (the overwhelmingly most common of which is MAD) by
roughly 50%, leading to reduced ALU throughput.  This pass attempts to
minimize the number of bank conflicts by rearranging the layout of the
GRF space post-register allocation.  It's in general not possible to
eliminate all of them without introducing extra copies, which are
typically more expensive than the bank conflict itself.

In a shader-db run on SKL this helps roughly 46k shaders:

   total conflicts in shared programs: 1008981 -> 600461 (-40.49%)
   conflicts in affected programs: 816222 -> 407702 (-50.05%)
   helped: 46234
   HURT: 72

The running time of shader-db itself on SKL seems to be increased by
roughly 2.52%±1.13% with n=20 due to the additional work done by the
compiler back-end.

On earlier generations the pass is somewhat less effective in relative
terms because the hardware incurs a bank conflict anytime the last two
sources of the instruction are duplicate (e.g. while trying to square
a value using MAD), which is impossible to avoid without introducing
copies.  E.g. for a shader-db run on SNB:

   total conflicts in shared programs: 944636 -> 623185 (-34.03%)
   conflicts in affected programs: 853258 -> 531807 (-37.67%)
   helped: 31052
   HURT: 19

And on BDW:

   total conflicts in shared programs: 1418393 -> 987539 (-30.38%)
   conflicts in affected programs: 1179787 -> 748933 (-36.52%)
   helped: 47592
   HURT: 70

On SKL GT4e this improves performance of GpuTest Volplosion by 3.64%
±0.33% with n=16.

NOTE: This patch intentionally disregards some i965 coding conventions
      for the sake of reviewability.  This is addressed by the next
      squash patch which introduces an amount of (for the most part
      boring) boilerplate that might distract reviewers from the
      non-trivial algorithmic details of the pass.

The following patch is squashed in:

SQUASH: intel/fs/bank_conflicts: Roll back to the nineties.

Acked-by: Matt Turner <mattst88@gmail.com>
2017-12-07 15:56:06 -08:00
Dylan Baker
c34b53f133 meson: Fix building gallium media targets with gallium-xlib glx
To demonstrate this bug run meson with the options:
-Ddri-drivers= -Dglx=gallium-xlib

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-07 10:22:27 -08:00
Dylan Baker
2adc3817c6 meson: Add lmsensors to gallium libgl-xlib target.
Fixes: 5e71efef44 ("meson: Add lmsensors support")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-07 10:20:58 -08:00
Eric Engestrom
4cba39331d meson: add dep_thread to every lib that includes threads.h
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104141
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-12-07 17:29:42 +00:00
Eric Engestrom
f0337f0f70 meson: fix pl111 dependency on vc4
src/gallium/winsys/pl111/drm/libpl111winsys.a(pl111_drm_winsys.c.o): In function `pl111_drm_screen_create':
pl111_drm_winsys.c:(.text+0x33): undefined reference to `vc4_drm_screen_create_renderonly'

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-12-07 17:21:03 +00:00
Samuel Pitoiset
5f81a43535 radv: use a faster version for nir_op_pack_half_2x16
This patch is ported from RadeonSI and it has two effects.

It fixes a rendering issue which affects F1 2017 and Dawn
of War 3 (Vega only) because LLVM was ending up by generating
the new v_mad_mix_{hi,lo} instructions which appear to be
buggy in some way. Not sure if Mesa is generating something
wrong or if the issue is in LLVM only. Anyway, that explains why
the DOW3 issue can't be reproduced with GL on Vega.

It also improves performance because v_cvt_pkrtz_f16 is faster,
and because I guess the rounding mode behaviour is similar between
GL and VK, we can use it. About performance, it improves Talos
by +3/4% but I don't see any other impacts.

No CTS regressions on Polaris.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-07 17:21:50 +01:00
Alejandro Piñeiro
25e56b2eba mesa/spirv: move and rename nir_spirv_supported_capabilities
To avoid any vulkan driver to include the GL mtypes.h. Renamed as
eventually this could be used by drivers not using nir.

v2: remove compiler/spirv/spirv.h from mtypes (Alejandro)
v3: added the definition at compiler/shader_info.h (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-07 17:15:11 +01:00
Vadym Shovkoplias
b2490a326c util/disk_cache: Remove unneeded free() on always null string
At this point dc_job->cache_item_metadata.keys always equals
NULL, so call to free() is useless

Fixes: b86ecea344 ("util/disk_cache: write cache item metadata to disk")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-07 11:50:41 +00:00
Samuel Iglesias Gonsálvez
392638d6b5 spirv: fix bug when OpSpecConstantOp calls a conversion
In that case, nir_eval_const_opcode() will evaluate the conversion
but as it was using destination's bit_size, the resulting
value was just a cast of the source constant value. By passing the
source's bit size, it does the conversion properly.

Fixes:

dEQP-VK.spirv_assembly.instruction.*.opspecconstantop.*convert*

v2:
- Remove invalid conversion op cases.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-07 10:19:34 +01:00
Samuel Iglesias Gonsálvez
67ec314347 spirv: allow specialization constants with bitsize different than 32 bits
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-07 10:19:34 +01:00
James Legg
947470d10b nir/opcodes: Fix constant-folding of bitfield_insert
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119
CC: <mesa-stable@lists.freedesktop.org>
CC: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-12-07 08:59:36 +00:00
Alex Smith
8fda98c4f1 radv: Add LLVM version to the device name string
Allows apps to determine the LLVM version so that they can decide
whether or not to enable workarounds for LLVM issues.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-07 08:58:34 +00:00
Alejandro Piñeiro
be2c434308 mesa: remove set_entry from forward type declarations
This type was used at gl_sync_object, but it is not used anymore.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-07 09:01:58 +01:00
Kenneth Graunke
8705ed13e3 meta: Fix ClearTexture with GL_DEPTH_COMPONENT.
We only handled unpacking for GL_DEPTH_STENCIL formats.

Cemu was hitting _mesa_problem() for an unsupported format in
_mesa_unpack_float_32_uint_24_8_depth_stencil_row(), because the
format was depth-only, rather than depth-stencil.

Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94739
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103966
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-06 20:35:46 -08:00
Kenneth Graunke
d6d16c0218 meta: Initialize depth/clear values on declaration.
This helps avoid compiler warningss in the next commit - everything
was initialized, but it wasn't obvious to static analysis.

Suggested-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-06 20:30:24 -08:00
Timothy Arceri
9d53ccccb2 glsl: get correct member type when processing xfb ifc arrays
This fixes a crash in:

KHR-GL45.enhanced_layouts.xfb_block_stride

Fixes: 0822517936 "glsl: add helper to process xfb qualifiers during linking"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-07 15:22:23 +11:00
Gert Wollny
6c268ea79a r600/sb: do not convert if-blocks that contain indirect array access
If an array is accessed within an if block, then currently it is not known
whether the value in the address register is involved in the evaluation of the
if condition, and converting the if condition may actually result in
out-of-bounds array access. Consequently, if blocks that contain indirect array
access should not be converted.

Fixes piglits on r600/BARTS:
spec/glsl-1.10/execution/variable-indexing/
  vs-output-array-float-index-wr
  vs-output-array-vec3-index-wr
  vs-output-array-vec4-index-wr

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104143

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-07 09:48:41 +10:00
Dave Airlie
81683c3d42 r600: add support for compute grid/block sizes. (v2)
We just pass these in from outside in a constant buffer.

The shader side stores them once they are accessed once.

v2: fix to not use a temp_reg.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-06 23:21:09 +00:00
Dave Airlie
4525cdb751 r600: handle image/buffer sizes correctly.
This adds support to compute for the resq workarounds (buffer/cube sizes)

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-06 23:21:06 +00:00
Dave Airlie
f51458637c r600/compute: add support for emitting compute image/buffer atoms
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-06 23:21:02 +00:00
Dave Airlie
a5a50d9c89 r600/compute: handle atomic counters in compute state.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-06 23:20:58 +00:00
Dave Airlie
c82934f212 r600/compute: add support for TGSI compute shaders. (v1.1)
This add paths to handle TGSI compute shaders and shader selection.

It also avoids emitting certain things on tgsi paths,
CBs, vertex buffers, config reg init (not required).

v1.1: fix rat mask calc

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-06 23:20:53 +00:00
Dave Airlie
08dc205c61 r600/shader: add compute support to shader assembler
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-06 23:20:50 +00:00
Dave Airlie
7b8e1c089d r600/texture: drop lowering 1d/2d images to linear.
This appears to cause hangs with compute images. Unless
we can find more specifics, just don't do this for now.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-06 23:20:20 +00:00
Alejandro Piñeiro
0398b31d1d mesa: define nir_spirv_supported_capabilities
Until now it was part of spirv_to_nir_options. But it will be used on
the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
to the OpenGL context, as a way to save what SPIR-V capabilities the
current OpenGL implementation supports.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-06 22:25:52 +01:00
Fredrik Höglund
5e1cb16768 anv: fix a case statement in GetMemoryFdPropertiesKHR
The handle type in the case statement is supposed to be VK_EXTERNAL_-
MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT.

Fixes: ab18e8e59b ("anv: Implement VK_EXT_external_memory_dma_buf")
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 20:04:39 +01:00
Fredrik Höglund
b055045378 radv: fix a case statement in GetMemoryFdPropertiesKHR
The handle type in the case statement is supposed to be VK_EXTERNAL_-
MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT.

Fixes: 546e747867 ("radv: Implement VK_EXT_external_memory_dma_buf")
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-12-06 20:04:39 +01:00
Eric Engestrom
31d403160f meson: fix keyword argument in declare_dependency()
`declare_dependency()` takes `compile_args`, not `c_args`.
It was correct in all the other `declare_dependency()` from that commit.

Fixes: 0bbecc5a85 "meson: define driver dependencies"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-12-06 18:31:33 +00:00
Emil Velikov
526945f7dc i965: include brw_pipe_control.h in the tarball
Fixes: bfe0f3a702 ("i965: Move PIPE_CONTROL defines and prototypes to
brw_pipe_control.h.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-06 17:33:57 +00:00
Emil Velikov
e964e01fdd mesa: document _mesa_extension_override_* variables
Currently there are no users of these outside of extensions.c.
Provide some information why they exist and how to use them.

Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-12-06 17:31:53 +00:00
Emil Velikov
d7ba4f41f9 docs: annotate MESA_program_debug as obsolete
It has been obsolete for years - state it explicitly.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 17:31:53 +00:00
George Kyriazis
bc75adcb1e swr/scons: Fix another intermittent build failure
gen_BackendPixelRate*.cpp depends on gen_ar_eventhandler.hpp.
Fix missing dependency.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-06 11:04:02 -06:00
Marek Olšák
4038db72d4 radeonsi: make const and stream uploaders allocate read-only memory
and anything that clones these uploaders, like u_threaded_context.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 15:19:02 +01:00
Marek Olšák
7a6643fb4c radeonsi: use a separate allocator for fine fences
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 15:19:02 +01:00
Marek Olšák
3e1287caef radeonsi/gfx9: make shader binaries use read-only memory
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 15:19:02 +01:00
Marek Olšák
fef51ebcea winsys/amdgpu: make IBs use read-only memory
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 15:19:02 +01:00
Marek Olšák
ba59064409 radeonsi: print the buffer list for CHECK_VM
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 15:19:02 +01:00
Marek Olšák
010214b403 radeonsi: allow DMABUF exports for local buffers
Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 15:19:02 +01:00
Nicolai Hähnle
20ccb51ffc radeonsi: always place sparse buffers in VRAM
Together with "radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check",
this ensures that sparse buffers are placed in VRAM.

Noticed by an assertion that started triggering with commit d4fac1e1d7
("gallium/radeon: enable suballocations for VRAM with no CPU access")

Fixes KHR-GL45.sparse_buffer_tests.BufferStorageTest in debug builds.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-06 11:19:00 +01:00
Nicolai Hähnle
5e2962c949 radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check
The flag is on the pipe_resource, not the r600_resource.

I don't see an obvious bug related to this, but it could potentially lead
to suboptimal placement of some resources.

Fixes: a41587433c ("gallium/radeon: add R600_RESOURCE_FLAG_UNMAPPABLE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-06 11:18:14 +01:00
Jose Maria Casanova Crespo
a1e257a5bf i965/fs: Use untyped_surface_read for 16-bit load_ssbo
SSBO loads were using byte_scattered read messages as they allow
reading 16-bit size components. byte_scattered messages can only
operate one component at a time so we needed to emit as many messages
as components.

But for vec2 and vec4 of 16-bit, being multiple of 32-bit we can use the
untyped_surface_read message to read pairs of 16-bit components using only
one message. Once each pair is read it is unshuffled to return the proper
16-bit components. vec3 case is assimilated to vec4 but the 4th component
is ignored.

16-bit scalars are read using one byte_scattered_read message.

v2: Removed use of stride = 2 on sources (Jason Ekstrand)
    Rework optimization using unshuffle 16 reads (Chema Casanova)
v3: Use W and D types insead of HF and F in shuffle to avoid rounding
    erros (Jason Ekstrand)
    Use untyped_surface_read for 16-bit vec3. (Jason Ekstrand)
v4: Use subscript insead of chaging type and stride  (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
ce2e572c4c i965/fs: Optimize 16-bit SSBO stores by packing two into a 32-bit reg
Currently, we use byte-scattered write messages for storing 16-bit
into an SSBO. This is because untyped surface messages have a fixed
32-bit size.

This patch optimizes these 16-bit writes by combining 2 values (e.g,
two consecutive components aligned with 32-bits) into a 32-bit register,
packing the two 16-bit words.

16-bit single component values will continue to use byte-scattered
write messages. The same will happens when the first consecutive
component is not aligned 32-bits.

This optimization reduces the number of SEND messages used for storing
16-bit values potentially by 2 or 4, which cuts down execution time
significantly because byte-scattered writes are an expensive
operation as they only write a component for message.

v2: Removed use of stride = 2 on sources (Jason Ekstrand)
    Rework optimization using shuffle 16 write and enable writes
    of 16bit vec4 with only one message of 32-bits. (Chema Casanova)
v3: - Fix coding style (Eduardo Lima)
    - Reorganize code to avoid duplication. (Jason Ekstrand)
    - Include new comments to explain the length calculations to
      fix alignment issues of components. (Jason Ekstrand)
    - Fix issues with writemask yz with 16-bit writes. (Jason Ektrand)
v4: (Jason Ekstrand)
    - Reorganize 64-bit ssbo-writes to avoid using slots_per_component.
    - Comment about why suffle is needed when using byte_scattered_write.

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
66ce6ce78f anv: Enable SPV_KHR_16bit_storage and VK_KHR_16bit_storage for SSBO/UBO
Enables SPV_KHR_16bit_storage on gen 8+.

VK_KHR_16bit_storage is enabled for SSBO/UBO using the
VK_KHR_get_physical_device_properties2 functionality to expose
if the extension is supported or not.

v2: update due rebase against master (Alejandro)
v3: (Jason Ekstrand)
    - Move this patch up in VK_KHR_16bit_storage series enabling only
      storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccess.
    - Only expose VK_KHR_16bit_storage on Gen8+
v4: (Jason Ekstrand)
    - Squash enable SPV_KHR_16bit_storage into VK_KHR_16bit_storage
      enablement for SSBO/UBO.

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jason Ekstrand
3282309f74 i965/fs: Enables 16-bit load_ubo with sampler
load_ubo is using 32-bit loads as uniforms surfaces have a 32-bit
surface format defined. So when reading 16-bit components with the
sampler we need to unshuffle two 16-bit components from each 32-bit
component.

Using the sampler avoids the use of the byte_scattered_read message
that needs one message for each component and is supposed to be
slower.

v2: (Jason Ekstrand)
    - Simplify component selection and unshuffling for different bitsizes
    - Remove SKL optimization of reading only two 32-bit components when
      reading 16-bits types.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
3db31c0b06 i965/fs: Helpers for un/shuffle 16-bit pairs in 32-bit components
This helpers are used to load/store 16-bit types from/to 32-bit
components.

The functions shuffle_32bit_load_result_to_16bit_data and
shuffle_16bit_data_for_32bit_write are implemented in a similar
way than the analogous functions for handling 64-bit types.

v1: Explain need of temporary in shuffle operations. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
fa4a9d63bb i965/fs: Use byte scattered read for 16-bit load_ssbo
Used to enable 16-bit reads at do_untyped_vector_read, that is used on
the following intrinsics:

   * nir_intrinsic_load_shared
   * nir_intrinsic_load_ssbo

v2: Removed use of stride = 2 on 16-bit sources (Jason Ekstrand)

v3: - Add bitsize to scattered read operation (Jason Ekstrand)
    - Remove implementation of 16-bit UBO read from this patch.
    - Avoid assertion at opt_algebraic caused by ADD of two IMM with
      offset with BRW_REGISTER_TYPE_UD type found on matrix tests.
      (Jose Maria Casanova)
v4: (Jason Ekstrand)
    - Put if case for 16-bits at the beginning of the if ladder.
    - Use type_sz(dest.type) * 8 as bit_size parameter for scattered read.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
c57a3f200d i965/fs: Add byte scattered read message and fs support
v2: Fix alignment style (Topi Pohjolainen)
    (Jason Ekstrand)
    - Enable bit_size parameter to scattered messages to enable different
      bitsizes byte/word/dword.
    - Remove use of brw_send_indirect_scattered_message in favor of
      brw_send_indirect_surface_message.
    - Move scattered messages to surface messages namespace.
    - Assert align1 for scattered messages and assume Gen8+.
    - Inline brw_set_dp_byte_scattered_read.

v3: (Jason Ekstrand)
    - Use renamed brw_byte_scattered_data_element_from_bit_size method
    - Assert scattered read for Gen8+ and Haswell.
    - Use conditional expresion at components_read.
    - Include comment about params for scattered opcodes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
a4031bdfa9 i965/fs: Predicate byte scattered writes if needed
While on Untyped Surface messages the bits of the execution mask are
ANDed with the corresponding bits of the Pixel/Sample Mask, that is
not the case for byte scattered writes. That is needed to avoid ssbo
stores writing on helper invocations. So when that can affect, we load
the sample mask, and predicate the send message.

Note: the need for this patch was tested with a custom test. Right now
the 16 bit storage CTS tests doesnt need this path in order to get a
full pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
96f1926aab i965/fs: Use byte_scattered_write on 16-bit store_ssbo
We need to rely on byte scattered writes as untyped writes are 32-bit
size. We could try to keep using 32-bit messages when we have two or
four 16-bit elements, but for simplicity sake, we use the same message
for any component number. We revisit this aproach in the follwing
patches.

v2: Removed use of stride = 2 on 16-bit sources (Jason Ekstrand)

v3: (Jason Ekstrand)
    - Include bit_size to scattered write message and remove namespace
    - specific for scattered messages.
    - Move comment to proper place.
    - Squashed with i965/fs: Adjust type_size/type_slots on store_ssbo.
    (Jose Maria Casanova)
    - Take into account that get_nir_src returns now WORD types for
      16-bit sources instead of DWORD.
v4: (Jason Ekstrand)
    - Rename lenght variable to num_components.
    - Include assertions before emit_untyped_write.
    - Remove type_slot in favor of num_slot and first_slot.

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
f1a9936ee1 i965/fs: Add byte scattered write message and fs support
v2: (Jason Ekstrand)
    - Enable bit_size parameter to scattered messages to enable different
      bitsizes byte/word/dword.
    - Remove use of brw_send_indirect_scattered_message in favor of
      brw_send_indirect_surface_message.
    - Move scattered messages to surface messages namespace.
    - Assert align1 for scattered messages and assume Gen8+.
    - Inline brw_set_dp_byte_scattered_write.
v3: - Remove leftover newline (Topi Pohjolainen)
    - Rename brw_data_size to brw_scattered_data_element and use
      defines instead of an enum (Jason Ekstrand)
    - Assert scattered write for Gen8+ and Haswell (Jason Ekstrand)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
d038deaa40 i965/fs: Add remove_extra_rounding_modes optimization
Although from SPIR-V point of view, rounding modes are attached to the
operation/destination, on i965 it is a status, so we don't need to
explicitly set the rounding mode if the one we want is already set.

Taking into account that the default mode is RTE, one possible
optimization would be optimize out the first RTE set for each
block. For in order to work, we would need to take into account block
interrelationships. At this point, it is not worth to complicate the
optimization for such small gain.

v2: Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate
    with the rounding mode (Curro)
v3: Reset optimization for every block. (Jason Ekstrand)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
82fa4d45e7 i965/fs: Enable rounding mode on f2f16 ops
By default we don't set the rounding mode. We only set
round-to-near-even or round-to-zero mode if explicitly set from nir.

v2: Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate
    with the rounding mode (Curro)

v3: Use new helper brw_rnd_mode_from_nir_op  (Jason Ekstrand)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
d6cd14f213 i965/fs: Define new shader opcode to set rounding modes
Although it is possible to emit them directly as AND/OR on brw_fs_nir,
having a specific opcode makes it easier to remove duplicate settings
later.

v2: (Curro)
  - Set thread control to 'switch' when using the control register
  - Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate
    with the rounding mode.
  - Avoid magic numbers setting rounding mode field at control register.
v3: (Curro)
  - Remove redundant and add missing whitespace lines.
  - Match printing instruction to IR opcode "rnd_mode"

v4: (Topi Pohjolainen)
  - Fix code style.

Signed-off-by:  Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by:  Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
ac8d4734f6 i965: Add support for control register
Control register cr0 in i965 can be used to change the rounding modes
in 32-bit to 16-bit floating-point conversions.

From intel Skylake PRM, vol 07, section "Register and Tegister Regions",
 subsection "Control Register" (page 754):

"Subregister cr0.0:ud contains normal operation control fields such as the
 floating-point mode ... "

Floating-point Rounding mode is changed at bits 5:4 of cr0.0:

"Rounding Mode. This field specifies the FPU rounding mode. It is
initialized by Thread Dispatch."
  00b = Round to Nearest or Even (RTNE)
  01b = Round Up, toward +inf (RU)
  10b = Round Down, toward -inf (RD)
  11b = Round Toward Zero (RTZ)"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
5d5ee507fb i965/fs: Handle 32-bit to 16-bit conversions
Conversions to 16-bit need having aligment between the 16-bit
and 32-bit types. So the conversion operations unpack 16-bit types
to with an stride=2 and then applies a MOV with the conversion.

v2 (Jason Ekstrand):
  - Avoid the general use of stride=2 for 16-bit register types.

v3 (Topi Pohjolainen)
  - Code style fix
   (Jason Ekstrand)
  - Now nir_op_f2f16 was renamed to nir_op_f2f16_undef
    because conversion to f16 with undefined rounding is explicit

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
a05b6f25bf i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type
Note that we don't remove the assert at i965/vec4. At this point half
float support is only for the scalar backend.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
75a88d8567 i965: Support for 16-bit base types in helper functions
v2: Fixed calculation of scalar size for 16-bit types. (Jason Ekstrand)
v3: Fix coding style (Topi Pohjolainen)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Alejandro Piñeiro
2d28ca7000 i965/vec4: Handle 16-bit types at type_size_xvec4
These types have similar vec4 sizes as their 32-bit counterparts.

The vec4 backend doesn't support 16-bit types and probably never will,
but this method is called by the scalar backend at
fs_visitor::nir_setup_outputs(), so we still need to provide valid vec4
sizes for 16-bit types. In the future, something different should be
implemented to avoid this dependency.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev
4049c04122 spirv/nir: Add support for SPV_KHR_16bit_storage
v2: Minor changes after rebase against recent master (Alejandro
    Pinheiro)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
e0667a8bd8 spirv: Enable FPRoundingMode decorator to nir operations
SpvOpFConvert now manages the FPRoundingMode decorator for the
returning values enabling the nir_rounding_mode in the conversion
operation to fp16 values.

v2: Fixed breaking of specialization constants. (Jason Ekstrand)

v3: Avoid nir_rounding_mode * casting. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev
549894a681 spirv/nir: Handle 16-bit types
v2: Added more missing implementations of 16-bit types. (Jason Ekstrand)

v3: Store values in values[0].u16[i] (Jason Ekstrand)
    Include switches based on bitsize for 16-bit types
    (Chema Casanova)
v4: Coding style fixes (Jason Ekstrand)
    Use vtn_u64_literal and u64[0] at 64-bit SpvOpConstant (Jason Ekstrand)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
1f440d00d2 nir: Handle fp16 rounding modes at nir_type_conversion_op
nir_type_conversion enables new operations to handle rounding modes to
convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne
and nir_op_f2f16_rtz.

The undefined behaviour doesn't has any effect and uses the original
nir_op_f2f16 operation.

v2: Indentation fixed (Jason Ekstrand)

v3: Use explicit case for undefined rounding and assert if
    rounding mode is used for non 16-bit float conversions
    (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev
2af63683bc nir: Populate conversion opcodes to 16-bit types
This will include the following NIR ALU opcodes:
 * nir_op_i2i16
 * nir_op_i2f16
 * nir_op_u2u16
 * nir_op_u2f16
 * nir_op_f2i16
 * nir_op_f2u16
 * nir_op_f2f16

v2: Remove "from" 16-bit in commit subject (Topi Pohjolainen)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo
d711445430 nir: Add rounding modes enum
v2: Added comments describing each of the rounding modes. (Jason
    Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev
5165e222d1 nir: Add support for 16-bit types (half float, int16 and uint16)
v2: Renamed glsl_half_float_type() to glsl_float16_t_type().
    (Jason Ekstrand)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev
52b10c7f20 mesa/st: Handle 16-bit types at st_glsl_storage_type_size()
This is basically to avoid "not handle in switch" warnings.

v2: Let the new types hit the assertion instead. (Marek Olšák
    and Jason Ekstrand)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev
59f458cd87 glsl: Add 16-bit types
Adds new INT16, UINT16 and FLOAT16 base types.

The corresponding GL types for half floats were reused from the
AMD_gpu_shader_half_float extension. The int16 and uint16 types come from
NV_gpu_shader_5 extension.

This adds the builtins and the lexer support.

To avoid a bunch of warnings due to cases not handled in switch, the
new types have been added to a few places using same behavior as
their 32-bit counterparts, except for a few trivial cases where they are
already handled properly. Subsequent patches in this set will provide
correct 16-bit implementations when needed.

v2: * Use FLOAT16 instead of HALF_FLOAT as name of the base type.
    * Removed float16_t from builtin types.
    * Don't copy 16-bit types as if they were 32-bit values in
      copy_constant_to_storage().
    * Use get_scalar_type() instead of adding a new custom switch
      statement.
    (Jason Ekstrand)
v3: Use GL_FLOAT16_NV instead of GL_HALF_FLOAT for consistency
    (Ilia Mirkin)
v4: Add missing 16-bit base types support in glsl_to_nir (Eduardo Lima).
v5: Fix coding style (Topi Poholainen).

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 08:57:18 +01:00
Jason Ekstrand
8761a04d0d anv: Add support for the variablePointers feature
Not to be confused with variablePointersStorageBuffer which is the
subset of VK_KHR_variable_pointers required to enable the extension.
This means we now have "full" support for variable pointers.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand
93b4cb61eb spirv: Allow OpPtrAccessChain for block indices
The SPIR-V spec is a bit underspecified when it comes to exactly how
you're allowed to use OpPtrAccessChain and what it means in certain edge
cases.  In particular, what if the base pointer of the OpPtrAccessChain
points to the base struct of an SSBO instead of an element in that SSBO.
The original variable pointers implementation in mesa assumed that you
weren't allowed to do an OpPtrAccessChain that adjusted the block index
and asserted such.  However, there are some CTS tests that do this and,
if the CTS does it, someone will do it in the wild so we should probably
handle it.  With this commit, we significantly reduce our assumptions
and should be able to handle more-or-less anything.

The one assumption we still make for correctness is that if we see an
OpPtrAccessChain on a pointer to a struct decorated block that the block
index should be adjusted.  In theory, someone could try to put an array
stride on such a pointer and try to make the SSBO an implicit array of
the base struct and we would not give them what they want.  That said,
any index other than 0 would count as an out-of-bounds access which is
invalid.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand
32c859125b anv: Handle nir_intrinsic_vulkan_resource_reindex
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand
cfb81f58a0 nir: Add a vulkan_resource_reindex intrinsic
This is required for being able to handle OpPtrAccessChain in SPIR-V
where the base type of the incoming pointer requires us to add to the
block index instead of the byte offset.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand
ae54a4f84f spirv: Add support for lowering workgroup access to offsets
Before, we always left workgroup variables as shared nir_variables and
let the driver call nir_lower_io.  This adds an option to do the
lowering directly in spirv_to_nir.  To do this, we implicitly assign the
variables a std430 layout and then treat them like a UBO or SSBO and
immediately lower all the way to an offset.

As a side-effect, the spirv_to_nir pass now handles variable pointers
for workgroup variables.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand
f6eb5ce39c spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand
992aabf239 spirv: Add theoretical support for single component pointers
Up until now, all pointers have been ivec2s.  We're about to add support
for pointers to workgroup storage and those are going to be uints.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand
843c192e2b spirv: Use offset_pointer_dereference to instead of get_vulkan_resource_index
There is no good reason why we should have the same logic repeated in
get_vulkan_resource_index and vtn_ssa_offset_pointer_dereference.  If
we're a bit more careful about how we do things, we can just use the one
function and get rid of the other entirely.  This also makes the push
constant special case a lot more clear.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:53 -08:00
Jason Ekstrand
6dffef6308 spirv: Refactor a couple of pointer query helpers
This commit moves them both into vtn_variables.c towards the top, makes
them take a vtn_builder, and replaces a hand-rolled instance of
is_external_block with a function call.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 20:56:16 -08:00
Jason Ekstrand
93646fb503 spirv: Refactor the base case of offset_pointer_dereference
This makes us key off of !offset instead of !block_index.  It also puts
the guts inside a switch statement so that we can handle more than just
UBOs and SSBOs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 20:56:14 -08:00
Jason Ekstrand
98edf6bca4 spirv: Add a switch statement for the block store opcode
This parallels what we do for vtn_block_load except that we don't yet
support anything except SSBO loads through this path.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 20:55:39 -08:00
Jason Ekstrand
91d91ce3e2 spirv: Use a dereference instead of vtn_variable_resource_index
This is equivalent and means we don't have resource index code scattered
about.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 20:55:37 -08:00
Brian Paul
dee936c805 mesa: add const qualifier on _mesa_is_renderable_texture_format()
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-05 15:32:25 -07:00
Brian Paul
ca78b6b4f4 mesa: add const qualifier on _mesa_base_fbo_format()
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-05 15:32:25 -07:00
Brian Paul
08ba4a103f mesa: s/%u/%d/ in _mesa_error() call in check_layer()
The layer parameter is signed.  Fixes the error message seen when
running the arb_texture_multisample-errors test which checks a
negative layer value.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-05 15:32:25 -07:00
Brian Paul
323e6029a3 mesa: simplify/improve some _mesa_error() calls in teximage.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-05 15:32:25 -07:00
Brian Paul
726e495bbd mesa: trivial whitespace fixes in transformfeedback.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-05 15:32:25 -07:00
Brian Paul
b2c3fd984a mesa: add const qualifier in test_attachment_completeness()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-05 15:32:25 -07:00
Brian Paul
78dfbc30b6 st/mesa: remove unneeded #include in st_format.h
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-05 15:32:24 -07:00
Brian Paul
9c7840f93f st/mesa: rename a few vars to 'bindings'
To be consistent.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-05 15:32:24 -07:00
Brian Paul
067f6fc1af st/mesa: whitespace fixes in st_format.c
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-05 15:32:24 -07:00
Rob Clark
e0c6769ef5 freedreno/a5xx: hide ARB_base_instance
Grrr..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-05 16:16:18 -05:00
Rob Clark
77c595358f freedreno/ir3: handle input/output component
After the mesa/st nir linking support, we start to see inputs/outputs
like:

   decl_var shader_out INTERP_MODE_NONE float packed:uv (VARYING_SLOT_VAR9.x, 1, 0)
   decl_var shader_out INTERP_MODE_NONE float packed:uv@0 (VARYING_SLOT_VAR9.y, 1, 0)

(ie. were location_frac != .x)

Unfortunately I overlooked the addition of the component parameter to
load_input/store_output, so when we started encountering inputs/outputs
with component other than .x, we'd end up loading/storing the wrong
input/output.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-05 16:03:38 -05:00
Rob Clark
fd6a96635e mesa/st: move cloning of NIR shader for compute
Since in the NIR case, driver takes ownership of the NIR shader, we need
to clone what is passed to the driver.  Normally this is done as part of
creating the shader variant (where is clone is anyways needed).  But
compute shaders have no variants, so we were cloning earlier.

The problem is that after the NIR linking optimizations, we ended up
cloning *before* all the lowering passes where done.

So move this into st_get_cp_variant(), to make compute shaders work more
like other shader stages.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-05 16:03:38 -05:00
Dave Airlie
12a96aaf90 r600: refactor and export some shader selector code for compute
This just moves some code around to make it easier to add compute.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:31:50 +00:00
Dave Airlie
ca64281690 r600: add compute support to compressed resource handling.
This just adds support for decompressing compute resources.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:31:48 +00:00
Dave Airlie
5c78d000e6 r600: update max threads per block for evergreen compute
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:31:37 +00:00
Dave Airlie
5f15d35efc r600/shader: add local memory support to shader assembler.
This is needed for compute shaders.

v1.1: make work for vectors, fix missing lds ops.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:31:34 +00:00
Dave Airlie
dd3630f71c r600/cs: add support for compute to image/buffers/atomics state
This just adds the compute paths to state handling for the
main objects

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:31:24 +00:00
Dave Airlie
84feb6c24a r600: handle compute null key shader state
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:31:13 +00:00
Dave Airlie
9b8b78b457 r600: add some missing cayman register defines
These are just taken from the kernel, and were seen in some fglrx dumps.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:09:43 +00:00
Dave Airlie
3a403a9797 r600: don't set EOP on pop or loop end
This appears to bad, compute shaders hang without it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:09:33 +00:00
Dave Airlie
2fdc21bcab r600/ssbo: refactor out buffer coord calcs and use for atomic path.
The atomic rat path has a bug in the ssbo path, refactor out the
address calcs from the load/store paths and reuse to fix the bug
in the buffer rat atomic path.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:07:08 +00:00
Dave Airlie
a256506b76 r600/ssbo: fix multi-dword buffer loads.
This fixes loading from different channels.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:07:08 +00:00
Dave Airlie
989697eccc r600/ssbo: use r32ui format for ssbo resources.
This works best for returning the correct values and sizes in
tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:07:08 +00:00
Dave Airlie
275293b2b4 r600: refactor out the immediate setup code.
This just refactors the same code out of the images/buffers paths.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:07:07 +00:00
Dave Airlie
df21cd5248 r600/shader: fix ssbo atomic operations formats.
Don't try and use the image format for ssbo, just 32-bit uint.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:07:07 +00:00
Dave Airlie
4ee2b7c452 r600/shader: fix thread id loading.
This just changes how thread id loading is done, it makes
smaller shaders if we don't use thread id gprs.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 20:07:07 +00:00
Rob Herring
20d37da597 Android: enable noreturn and returns_nonnull attributes
Commit 94ca8e04ad ("spirv: Add vtn_fail and vtn_assert helpers") broke
Android builds which have -Werror enabled with the following errors:

external/mesa3d/src/compiler/spirv/spirv_to_nir.c:272:1: error: control may reach end of non-void function [-Werror,-Wreturn-type]
external/mesa3d/src/compiler/spirv/spirv_to_nir.c:810:1: error: control may reach end of non-void function [-Werror,-Wreturn-type]
...

The problem is the noreturn attribute is not enabled and we to define
HAVE_FUNC_ATTRIBUTE_NORETURN.

Auditing src/util/macros.h, we're also missing
HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL and HAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT,
so add them too.

Fixes: 94ca8e04ad ("spirv: Add vtn_fail and vtn_assert helpers")
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-05 07:47:04 -06:00
Marek Olšák
dbad0acfaf gallium/u_upload_mgr: allow drivers to specify pipe_resource::flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-05 13:30:35 +01:00
Marek Olšák
c7f84f6513 winsys/amdgpu: add RADEON_FLAG_READ_ONLY
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-05 13:30:34 +01:00
Marek Olšák
cccf09677f gallium/radeon: remove RADEON_HEAP_VRAM_GTT
Only winsyses can set VRAM|GTT. Drivers shouldn't if they want to use
winsys allocators.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-05 13:30:34 +01:00
Marek Olšák
9ac5504df5 gallium/radeon: move setting VRAM|GTT into winsyses
The combined VRAM|GTT heap will be removed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-05 13:30:34 +01:00
Marek Olšák
5e805cc74b radeonsi: flush the context after resource_copy_region for buffer exports
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-05 13:28:00 +01:00
Mauro Rossi
cd8554502e Android: gallium/radeon: fix libmesa_amd_common dependency
libmesa_amd_common static dependency is added in Android build
to avoid the following building errors:

In file included from external/mesa/src/gallium/drivers/radeon/r600_buffer_common.c:24:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:26:
external/mesa/src/gallium/drivers/radeonsi/si_shader.h:138:10: fatal error: 'ac_binary.h' file not found
         ^~~~~~~~~~~~~
1 error generated.
...
In file included from external/mesa/src/gallium/drivers/radeon/r600_gpu_load.c:34:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:26:
external/mesa/src/gallium/drivers/radeonsi/si_shader.h:138:10: fatal error: 'ac_binary.h' file not found
         ^~~~~~~~~~~~~
1 error generated.

Fixes: 950221f923 ("radeonsi: remove r600_common_screen")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-05 12:05:21 +00:00
Dave Airlie
b501ef164e st/mesa: handle compute atomics
Just reuse the cs atomics bit and emit the hw atomic state.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 10:38:35 +00:00
Dave Airlie
05f594f229 r600/atomic: add cayman version of atomic save/restore from GDS (v2)
On Cayman we don't use the append/consume counters (fglrx doesn't)
and they don't seem to work well with compute shaders.

This just uses GDS instead to do the atomic operations.

v1.1: remove unused line.
v2: use EOS on cayman, it appears to work.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 10:38:07 +00:00
Dave Airlie
cf6d3caee2 r600/atomic: refactor out evergreen atomic setup/save code.
For cayman we want to use different code paths.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-05 10:38:04 +00:00
Timothy Arceri
e9e6476ae5 radeonsi: pass llvm type directly to buffer_load()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-05 15:15:36 +11:00
Dylan Baker
6b4c7047d5 meson: build gallium nine state_tracker
v2: - set d3d_drivers_path instead of dri_drivers_path
    - Fix nine guard to check for all relavent gallium drivers
    - Link with libswdri and libswkmsdri when necessary
    - Fix pkg-config generation
    - Add missing comma

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:58 -08:00
Dylan Baker
0ba909f0f1 meson: build gallium xa state tracker
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:56 -08:00
Dylan Baker
5a785d51a6 meson: build gallium va state tracker
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:53 -08:00
Dylan Baker
1d36dc674d meson: build gallium omx state tracker
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:51 -08:00
Dylan Baker
22a817af8a meson: build gallium xvmc state tracker
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:48 -08:00
Dylan Baker
68076b8747 meson: build gallium vdpau state tracker
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:38 -08:00
Dylan Baker
085070a2c8 meson: drop gallium-media argument
This argument is the wrong approach for handling gallium media state
trackers, since it doesn't allow for an auto option. Instead we'll use
tristates, which do allow for auto.

This option has never been wired to anything anyway.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:35 -08:00
Dylan Baker
f7f1b30f81 meson: extend install_megadrivers script to handle symmlinking
Which is required for the gallium media state trackers.

v2: - Make symlinks local instead of absolute

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:26 -08:00
Dylan Baker
b065de05c6 meson: Add osmesa.sym script as a link dependency (gallium-osmesa)
v2: - Add this patch

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:19 -08:00
Dylan Baker
0cb6d69a72 meson: use driver_deps for gallium osmesa
v2: - Put driver_swrast in the correct field (dependencies)
    - Remove unused osmesa_deps

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:36:10 -08:00
Dylan Baker
60283769ec meson: Use driver dependencies for libgl-xlib target
v2: - put driver_swrast in the right field
    - add dep_threads (dep_llvm requires threads, so it masked this
      previously)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:35:48 -08:00
Dylan Baker
95a791f63e meson: use the driver dependencies for the gallium dri target
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:35:43 -08:00
Dylan Baker
0bbecc5a85 meson: define driver dependencies
This allow us to encapsulate the compiler and linkage requirements of
each driver in a reusable way. The result will be that each target that
needs a specific driver can simply add `driver_<name>` to its
dependencies line and the necessary libraries and compiler args will be
added. This will allow for a lot of code de-duplication between gallium
targets.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:35:36 -08:00
Dylan Baker
831d2fb012 meson: sort gallium drivers after winsys
This is a requirement of the next patch. Since meson does not have
forward declarations, and we're going to define the driver dependencies
in the drivers folder they need to be after the winsys so that the
winsys libs are defined first.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:35:31 -08:00
Dylan Baker
383cdaf990 meson: Combine gallium target subdirs
So that state trackers, targets, and special winsys requirements are all
in a single if statement. This is a cosmetic only cleanup with no
functional changes.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 14:35:03 -08:00
Nanley Chery
0a257b3fe4 i965/cnl: Avoid fast-clearing sRGB render buffers
Gen10 doesn't automatically decode the clear color of sRGB buffers. To
get correct rendering, avoid fast-clearing such buffers for now.

The driver now passes the following piglit tests:
* spec@arb_framebuffer_srgb@msaa-fast-clear
* spec@ext_texture_srgb@multisample-fast-clear gl_ext_texture_srgb

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-04 14:21:47 -08:00
Dylan Baker
6a9611763b meson: Fix overlinkage of dri3 loader
This was covering for underinkage elsewhere. With that fixed these can
be removed.

v2: - sort dependencies

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-04 13:15:13 -08:00
Dylan Baker
d2e9ba2782 meson: fix underlinkage without dri3
There are some case where the dri3 loader is covering for underlinkage
for GLX and EGL, provide the linkage that they actually need.

v2: - remove dep_xcb_dri3 from glx. This was an oversight in v1 and is
      not needed.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-04 13:15:02 -08:00
Dylan Baker
b08a35b150 meson: Reformat glx code to match more common style
Generally in our meson build large arrays are formated in the form:
[
  ..., ..., ..., $
  ...,
]

So use that form

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-04 13:14:00 -08:00
Samuel Pitoiset
5de7c782fb radv: fix a crash in radv_can_dump_shader()
module can be NULL, oops.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-04 19:52:14 +01:00
Chad Versace
a932aee7a8 intel/isl: Declare private array as static const
It's array isl_drm.c:modifier_info[] .

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-04 10:16:33 -08:00
Dylan Baker
2be2565b9e meson: Install dri.pc file when building gallium dri drivers
Currently this pkg-config file is only installed if a classic dri driver
is built. This is wrong, it should be installed if any dri driver is
installed, which includes the gallium dri target.

Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-04 10:14:09 -08:00
Lionel Landwerlin
2ead8f1690 anv: query CS timestamp frequency from the kernel
The reference value in gen_device_info isn't going to be acurate on
Gen10+. We should query it from the kernel, which reads a couple of
register to compute the actual value.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-12-04 18:05:20 +00:00
Lionel Landwerlin
b66e4b51bf i965: read CS timestamp frequency from the kernel on Gen10+
We cannot figure this value out of the PCI-id anymore. Let's read it
from the kernel (which computes this from a few registers).

When running on a (upcoming) 4.16-rc1+ kernel, this will fixes piglit
tests on CNL :

    spec@arb_timer_query@query gl_timestamp
    spec@arb_timer_query@timestamp-get
    spec@ext_timer_query@time-elapsed

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-12-04 18:05:20 +00:00
Lionel Landwerlin
aa8a2a8670 drm-uapi: Update drm/i915 headers from drm-next
Taken from drm-next ca797d29cd63e7b71b4eea29aff3b1cefd1ecb59

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-12-04 18:05:20 +00:00
Jason Ekstrand
1e565bc6ce radv: Implement VK_KHR_get_surface_capabilities2
The WSI core code does all the hard work.  Just add the wrappers and
turn it on.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
0a10e3770f vulkan/wsi: Initialize individual WSI interfaces in wsi_device_init
Now that we have anv_device_init/finish functions, there's no reason to
have the individual driver do any more work than that.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
2e3e55110b vulkan/wsi: Drop some unneeded cruft from the API
This drops the unneeded callbacks struct as well as the queue_get_family
callback we were using before we'd pulled QueuePresent inside.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
c1b1be5196 vulkan/wsi: Add wrappers for all of the surface queries
This lets us move wsi_interface to wsi_common_private.h

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
82931dc007 vulkan/wsi: Drop the can_handle_different_gpu parameter from get_support
Both anv and radv can handle prime now.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
3131fd9dec vulkan/wsi: Move wsi_swapchain to wsi_common_private.h
The drivers no longer poke at this directly.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
516dfb34e1 vulkan/wsi: Add a helper for AcquireNextImage
Unfortunately, due to the fact that AcquireNextImage does not take a
queue, the ANV trick for triggering the fence won't work in general.  We
leave dealing with the fence up to the caller for now.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Dave Airlie
8ff49951c3 vulkan/wsi: move swapchain create/destroy to common code
v2 (Jason Ekstrand):
 - Rebase
 - Alter the names of the helpers to better match the vulkan entrypoints
 - Use the helpers in anv

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
ad4c60d6b8 vulkan/wsi: Move prime blitting into queue_present
This lets us save a QueueSubmit and it also makes prime a lot less
X11-specific.  Also, it means we can only wait on the semaphores once
instead of on every blit.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
393aa3f6c9 vulkan/wsi: Move get_images into common code
This moves bits out of all four corners (anv, radv, x11, wayland) and
into the wsi common code.  We also switch to using an outarray to ensure
we get our return code right.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
1117f843fe anv/wsi: Enable prime support
Now that we're using the same common code as radv, we get prime support
for free.  Just enable it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
ac95335b61 anv/wsi: Use the common QueuePresent code
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
d25a0f21e1 vulkan/wsi: Set a proper pWaitDstStageMask on the dummy submit
Neither mesa driver really cares, but we should set it none the less for
the sake of correctness.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
59e58c348e vulkan/wsi: Only wait on semaphores on the first swapchain
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
b91a1953e8 vulkan/wsi: Refactor result handling in queue_present
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Dave Airlie
6dc3a5e8f0 radv/wsi: Move the guts of QueuePresent to wsi common
v2 (Jason Ekstrand):
 - Better comit message
 - Rebase
 - Re-indent to follow wsi_common style
 - Drop the unneeded _swapchain from the newly added helper
 - Make the clone more true to the original (as per the rebase)

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
42dd06d957 vulkan/wsi: Add a WSI_FROM_HANDLE macro
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Dave Airlie
69365d72de radv/wsi: drop allocate memory special case
Just check if image has scanout flag set

v2 (Jason Ekstrand):
 - Rebase
 - Also drop the now unused radv_mem_flag_bits enum

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
e12688f365 vulkan/wsi: Do image creation in common code
This uses the mock extension created in a previous commit to tell the
driver that the image it's just been asked to create is, in fact, a
window system image with whatever assumptions that implies.  There was a
lot of redundant code between the two drivers to do basically exactly
the same thing.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
d50937f137 vulkan/wsi: Implement prime in a completely generic way
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
df4fc68492 radv: Move wsi initialization later in physical_device_init
We need it to happen after memory type setup so that we can query memory
types in wsi_device_init.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
a50f93ecfb radv/image: Implement the wsi "extension"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
3dabb4011f anv/image: Implement the wsi "extension"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
a44744e01d anv: Require a dedicated allocation for modified images
This lets us set the BO tiling when we allocate the memory.  This is
required for GL to work properly.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
7d19e570e1 anv/image: Add a drm_format_mod field
At the moment, this is always initialized to DRM_FORMAT_MOD_INVALID.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
546e747867 radv: Implement VK_EXT_external_memory_dma_buf
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
ab18e8e59b anv: Implement VK_EXT_external_memory_dma_buf
This is a modified version of the patch originally sent by Chad Versace.
The primary difference is that this version claims that OPQAUE_FD and
DMA_BUF are compatible handle types.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
66dc618215 vulkan/wsi: Add a mock image creation extension
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
cd881dafad vulkan/wsi: Add wsi_swapchain_init/finish functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
764fc1643c vulkan/wsi: Add a wsi_device_init function
This gives the opportunity to collect some function pointers if we'd
like which will be very useful in future.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand
3991098f3b vulkan/wsi/x11: Handle the geometry check earlier in create_swapchain
This fixes a potential leak if allocating the swapchain fails.  Since
geometry checking and bit-depth fetching is self-contained, it makes
sense to just do it first so we can delete the geometry reply.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Daniel Stone
c1163f7b1c vulkan/wsi: Add a wsi_image structure
This is used to hold information about the allocated image, rather than
an ever-growing function argument list.

v2 (Jason Ekstrand):
 - Rename wsi_image_base to wsi_image

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Dave Airlie
2cbeb32555 vulkan/wsi: use function ptr definitions from the spec.
This just seems cleaner, and we may expand this in future.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-04 10:04:19 -08:00
Kenneth Graunke
55a97db523 i965: Emit CS stall before MEDIA_VFE_STATE.
This fixes hangs on GFXBench 5's Aztec Ruins benchmark.

Unfortunately, it regresses OglCSCloth performance by about 10%. There
are some ideas for fixing that.

The Vulkan driver already emits this stall.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-12-04 10:02:46 -08:00
Kenneth Graunke
bfe0f3a702 i965: Move PIPE_CONTROL defines and prototypes to brw_pipe_control.h.
We need to be able to emit PIPE_CONTROLs from genX_state_upload.c,
which can't safely include brw_defines.h because it conflicts with
genxml.  Move all the PIPE_CONTROL related stuff together into a
separate header.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-12-04 10:02:46 -08:00
Jason Ekstrand
d74b1f4809 spirv: Replace unreachable with vtn_fail
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand
b7ef60d846 spirv: Replace assert with vtn_assert
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand
94ca8e04ad spirv: Add vtn_fail and vtn_assert helpers
These helpers are much nicer than just using assert because they don't
kill your process.  Instead, it longjmps back to spirv_to_nir(), cleans
up all the temporary memory, and nicely returns NULL.  While crashing is
completely OK in the Vulkan world, it's not considered to be quite so
nice in GL.  This should help us to make SPIR-V parsing much more
robust.  The one downside here is that vtn_assert is not compiled out in
release builds like assert() is so it isn't free.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand
0c49aa0624 util: Add a NORETURN macro
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand
591a07632c spirv: Do something useful with OpSource
We may as well log the source language and file name.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand
16dfdeefc8 spirv: Rework logging
This commit reworks the way that logging works in SPIR-V to provide
richer and more detailed logging infrastructure.  This commit contains
several improvements over the old mechanism:

 1) Log messages are now more detailed.  They contain the SPIR-V byte
    offset as well as source language information from OpSource and
    OpLine.

 2) There is now a logging callback mechanism so that errors can get
    propagated to the client through debug callbak extensions.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand
11bd753c4e spirv: Re-arrange vtn_builder initialization
This simply moves allocating the vtn_builder and initializing it to the
very beginning before we even parse the header.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand
d74bec1a54 spirv: Parent the nir_shader to the builder while building
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Rob Clark
1ec1ae47f7 freedreno: mark stencil buffer valid too in case of z32x24s8
The separate stencil buffer was not also getting marked as valid if
written by a draw/clear, resulting in gmem2mem getting skipped.  Move
this into fd_batch_resource_used() which also handles the separate
stencil case.

Also fix restore_buffers typo.

Fixes: 4ab6ab8036 freedreno: avoid mem2gmem for invalidated buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-04 11:50:45 -05:00
Rob Clark
e90f1a26c3 freedreno: remove use of u_transfer
Freedreno doesn't treat buffers and images differently, so it's use was
kind of pointless.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-04 11:50:45 -05:00
Eric Engestrom
7c3f958d23 freedreno: add -Wno-packed-bitfield-compat for meson build
Otherwise huge amount of spam from instr-a2xx.h.. gcc has no way to know
that freedreno was never built with such an old gcc version to care
about the bugs in old gcc ;-)

Reported-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
[added commit message]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-04 11:50:45 -05:00
Samuel Iglesias Gonsálvez
fa8c1b92b7 glsl: don't run intrastage array validation when the interface type is not an array
We validate that the interface block array type's definition matches.
However, previously, the function could be called if an non-array
interface block has different type definitions -for example, when the
precision qualifier differs in a GLSL ES shader, we would create two
different types-, and it would return invalid as both definitions are
non-arrays.

We fix this by specifying that at least one definition should be an
array to call the validation.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:32:57 +01:00
Samuel Iglesias Gonsálvez
fc6d55952d glsl/es: precision qualifier doesn't need to match in UBOs
They might mismatch due to the two shaders using different GLSL
versions, and that's ok in desktop GL. In ES, precision qualifiers
don't need to match.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:32:57 +01:00
Pierre Moreau
9bee12160b nvc0/ir: Properly lower 64-bit shifts when the shift value is >32
Fixes: 61d7676df7 "nvc0/ir: add support for 64-bit shift lowering on SM20/SM30"

Fixes fs-shift-scalar-by-scalar.shader_test from piglit for the current
set-up:

uniform int64_t ival -0x7dfcfefbdf6536ff # bit pattern: 0x82030104209ac901
uniform uint64_t uval 0x1400000085010203
uniform int shl 36
uniform int shr 36
uniform int64_t iexpected_shl 0x09ac901000000000
uniform int64_t iexpected_shr -0x7dfcff0 # bit pattern: 0xfffffffff8203010
uniform uint64_t uexpected_shl 0x5010203000000000
uniform uint64_t uexpected_shr 0x0000000001400000
draw rect ortho 12 0 4 4

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-12-04 01:03:47 -05:00
Fabian Bieler
9bdb5457f4 glsl: Match order of gl_LightSourceParameters elements.
spotExponent and spotCosCutoff were swapped in the
gl_builtin_uniform_element struct.
Now the order matches across gl_builtin_uniform_element,
glsl_struct_field and the spec.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-12-03 21:14:14 -07:00
Fabian Bieler
c3ee464d7a glsl: Fix gl_NormalScale.
GLSL shaders can access the normal scale factor with the built-in
gl_NormalScale.  Mesa's modelspace lighting optimization uses a different
normal scale factor than defined in the spec.  We have to take care not
to use this factor for gl_NormalScale.

Mesa already defines two seperate states: state.normalScale and
state.internal.normalScale.  The first is used by the glsl compiler
while the later is used by the fixed function T&L pipeline.  Previously
the only difference was some component swizzling.  With this commit
state.normalScale always uses the normal scale factor for eyespace
lighting.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-12-03 21:13:46 -07:00
Timothy Arceri
27888977c1 st/glsl_to_nir/radeonsi: enable gs support for nir backend
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:19 +11:00
Timothy Arceri
ccd1810bba ac: add si_nir_load_input_gs() to the abi
V2: make use of driver_location and don't expose NIR to the ABI.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:19 +11:00
Timothy Arceri
caf15ce670 ac: move build_varying_gather_values() to ac_llvm_build.h and expose
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:19 +11:00
Timothy Arceri
6fd6cb6616 ac: add basic nir -> llvm type helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
4184e7c417 radeonsi: create si_llvm_load_input_gs()
This creates a common function that can be shared by the tgsi
and nir backends.

v2: use LLVMBuildBitCast() directly

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
c4c8df94bd radeonsi: pass llvm type to lds_load()
v2: use LLVMBuildBitCast() directly

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
650126f3e0 radeonsi: add llvm_type_is_64bit() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
7ef1e42c14 radeonsi: pass llvm type to si_llvm_emit_fetch_64bit()
v2: use LLVMBuildBitCast() directly

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
e51ecbe980 radeonsi: add nir support for gs epilogue
v2: add emit_gs_epilogue() helper function to reduce duplication.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
73918b3172 radeonsi: add nir support for es epilogue
v2: make use of existing si_tgsi_emit_epilogue()

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
204f547852 radeonsi: add nir support for ls epilogue
v2: make use of existing si_tgsi_emit_epilogue()

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
164b6d4aeb st/glsl_to_nir: add gs support to st_nir_assign_var_locations()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
c86baf71fb st/glsl_to_nir: use nir_lower_io_arrays_to_elements() to lower arrays
This pass is more fully featured, it supports geom and tess shaders.
It also supports interpolation intrinsics.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
d99c7e0ff1 nir: allow builin arrays to be lowered
Galliums nir drivers expect this to be done.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
2bc49ac3e6 nir: add array lowering function that assumes there are no indirects
The gallium glsl->nir pass currently lowers away all indirects on both inputs
and outputs. This fuction allows us to lower vs inputs and fs outputs and also
lower things one stage at a time as we don't need to worry about indirects
on the other side of the shaders interface.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
f13790c92f radv: enable nir varying array splitting
Acked-by: Dave Airlie <airlied@redhat.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri
6648bd68fd st/glsl_to_nir: enable NIR link time opts
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
c16a0e11d3 radeonsi/nir: add support for packed inputs
Because NIR can create non vec4 variables when implementing component
packing we need to make sure not to reprocess the same slot again.

Also we can drop the fs_attr_idx counter and just use driver_location.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
c3a5d74377 st/glsl_to_nir: move some calls out of st_glsl_to_nir_post_opts()
NIR component packing will be inserted between these calls and the
calling of st_glsl_to_nir_post_opts().

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
90abaf8a21 st/glsl_to_nir: call some lowering passes earlier
This is required so that we can enbale NIR linking optimisations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
bd98b8c74e st/glsl_to_nir: add basic NIR opt loop helper
We need to be able to do these NIR opts in the state tracker
rather than the driver in order for the NIR linking opts to
be useful.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
a9ac01b96f st/glsl_to_nir: make st_glsl_to_nir() static
Here we also move the extern C functions to the bottom of the file.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
d586f39cb0 st/glsl_to_nir: split the st_glsl_to_nir() function in two
We want to be able to generate NIR then apply NIR optimisations.
Once the optimisations are done we can then apply the new post opt
function which assigns uniforms etc based on the optimised IR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
d38f99baec st/glsl_to_nir: create set_st_program() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
da953b641d st/glsl: move nir linking loop to new function st_link_nir()
This will allow us to refactor linking and include some nir link
time optimisations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
2a35021bc6 nir: fix support for scalar arrays in nir_lower_io_types()
This was just recreating the same vector type we alreay had and
hitting an assert for scalars.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
9530b786d2 st/glsl_to_nir: add st_nir_assign_var_locations() helper
This avoids packed varyings being assigned different driver locations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri
aecb9bec87 radv: enable nir component packing
SaschaWillems Vulkan demo tessellation:

~4000fps -> ~4600fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-04 09:10:30 +11:00
Timothy Arceri
1c9c42d16b nir: add varying component packing helpers
v2: update shader info input/output masks when pack components
v3: make sure interpolation loc matches, this is required for the
    radeonsi NIR backend.
v4: 33dca36f4f fixed nir_gather_info to update outputs_read
    correct, make sure we also adjust this correctly when
    packing components.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
2017-12-04 09:10:30 +11:00
Timothy Arceri
c797bc6aa7 nir: add varying array splitting pass
V2:
 - fix matrix support, non-array matrices were being skipped in v1

v3:
 - handle lowering of tcs output loads correctly
 - correctly mark indirect locations for either in or out not both
   when processing a stage.
 - use nir_src_copy() when lowering stores.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Rob Clark
11efe42a73 freedreno/ir3: relax barriers
Instructions with no barrier_class can move wrt. an EVERYTHING barrier.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
48eef0c182 freedreno/ir3: all mem instructions have WAR hazzard
It isn't just load instructions that have write-after-read hazzard.

Fixes stk gaussian blur compute shaders.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
e6c6495d3a freedreno: add debug option to force emulated indirect
Useful mostly for debugging indirect draw.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
f93f2f7b1e freedreno: also mark draw-indirect buffer as read
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
4b1d0d2844 freedreno: small cleanups
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
91730fb0ff freedreno: avoid unneccessary batch flush
In some cases we can end up trying to add a write dependency on ourself,
which shouldn't trigger a flush.

Avoids an extra couple flushes per from in stk.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
4ab6ab8036 freedreno: avoid mem2gmem for invalidated buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
2fcf6faa06 freedreno: deferred flush support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
15ebf387fc freedreno: rework fence tracking
ctx->last_fence isn't such a terribly clever idea, if batches can be
flushed out of order.  Instead, each batch now holds a fence, which is
created before the batch is flushed (useful for next patch), that later
gets populated after the batch is actually flushed.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:40 -05:00
Rob Clark
deb57fb237 freedreno: proper locking for iterating dependent batches
In transfer_map(), when we need to flush batches that read from a
resource, we should be holding screen->lock to guard against race
conditions.  Somehow deferred flush seems to make this existing
race more obvious.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:40 -05:00
Rob Clark
ef6313ffd3 freedreno/a5xx: correct max_indicies for indirect draws
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:40 -05:00
Jason Ekstrand
e19c623128 spirv: Convert the supported_extensions struct to spirv_options
This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-02 08:09:11 -08:00
Jason Ekstrand
6bd876dcaa spirv: Only emit functions which are actually used
Instead of emitting absolutely everything, just emit the few functions
that are actually referenced in some way by the entrypoint.  This should
save us quite a bit of time when handed large shader modules containing
many entrypoints.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-02 08:07:35 -08:00
Jason Ekstrand
f5aad36d2e spirv: Drop the impl field from vtn_builder
We have a nir_builder and it has an impl field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-02 08:07:35 -08:00
Jordan Justen
fc033742d2 i965: Serialize nir later in the linking process
Fixes MESA_GLSL=cache_fb with piglit
tests/spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out.shader_test

Fixes: 0610a624a1 i965/link: Serialize program to nir after linking for shader cache
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-01 23:17:44 -08:00
Marc Dietrich
d93fabb013 configure: avoid testing for negative compiler options
gcc seems to always accept unsupported negative compiler warning options:

echo "int i;" | gcc -c -xc -Wno-bob - # no error
echo "int i;" | gcc -c -xc -Walice -  # unsupported compiler option

Inverting the options fixes the tests.

V2: fix options in meson build

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
2017-12-01 17:09:42 -08:00
Eric Anholt
0ed952c7e9 broadcom/vc4: Use a single-entry cached last_hindex value.
Since almost all BOs will be in one CL at a time, this cache will almost
always hit except for the first usage of the BO in each CL.

This didn't show up as statistically significant on the minetest trace
(n=340), but if I lop off the throttled lobe of the bimodal distribution,
it very clearly does (0.74731% +/- 0.162093%, n=269).
2017-12-01 15:37:28 -08:00
Eric Anholt
230e646a40 broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.
No significant difference in the minetest replay, but it should reduce
overhead by not requiring that we write quad indices to index buffers that
we repeatedly re-upload (and making the draw packet smaller, as well).

Over the course of the series the actual game seems to be up by 1-2 fps.
2017-12-01 15:37:28 -08:00
Eric Anholt
fefff74b0d broadcom/vc4: Use the new enum functionality of the XML to decode better. 2017-12-01 15:37:28 -08:00
Eric Anholt
5167367050 broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.
Now that there's only one user of it, it's pretty obvious how to avoid
emitting redundant ones.  This should save a bunch of kernel validation
overhead.

No statistically sigificant difference on the minetest trace I was looking
at (n=169), but the maximum FPS is up by .3%
2017-12-01 15:37:28 -08:00
Eric Anholt
842b05d6ad broadcom/vc4: Simplify the relocation handling for index buffers.
Originally there was CL code for handling various relocations back when I
had relocs for the TSDA/TA buffers.  Now that the kernel handles those
entirely on its own, I can inline that code into the one place using it.
2017-12-01 15:37:28 -08:00
Eric Anholt
84ab48c15c broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.
We failed to take the start into account for how many vertices to draw in
this round, so we would end up decrementing count below 0, which as an
unsigned number meant we would loop until the CLs soon ran out of space.

When I wrote the code I was thinking about how to use the previously
emitted shader state (no index bias baked into the elements) by emitting
up to 65535 and then only re-emitting with bias for the second wround, but
that doesn't work if the start is over 65535.  Instead, just delay
emitting shader state until we get into the drawarrays GFXH-515 loop and
always bake the bias in when we're doing the workaround.
2017-12-01 15:37:28 -08:00
Eric Anholt
bcb6ebe91a broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround.
For triangle strips, we step by max_verts - 2.
2017-12-01 15:37:28 -08:00
Dylan Baker
f56e964e01 meson: use dep_thread instead of dependency('threads') in freedreno
They are the same thing, but this is more consistent with the rest of
the project.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 15:31:43 -08:00
Dylan Baker
5e71efef44 meson: Add lmsensors support
v2: - Make -Dlmsensors=false work
    - Simplify auto and true cases

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 15:31:43 -08:00
Dylan Baker
7309207432 meson: Add support for gallium extra hud
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 15:31:43 -08:00
Adam Jackson
a48a6b8a40 glx: Prepare driFetchDrawable for no-config contexts
When we look up the DRI drawable state we need to associate an fbconfig
with the drawable. With GLX_EXT_no_config_context we can no longer infer
that from the context and must instead query the server.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-01 15:53:52 -05:00
Adam Jackson
75d5d22fb7 glx: Use __glXSendError instead of open-coding it
This also fixes a bug, the error path through MakeCurrent didn't
translate the error code by the extension's error base.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-01 15:46:46 -05:00
Adam Jackson
bcb15bee52 glx: Simplify some dummy vtable interactions
The dummy vtable has these slots as NULL already, no need to check for
the dummy context explicitly.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-01 15:46:46 -05:00
Emil Velikov
8893418e99 docs/release-calendar: update and extend
v2: Missing td tag, add Andres + Juan for 17.2.8 and 17.3.3

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-12-01 19:30:23 +00:00
Emil Velikov
8d58e9b2cf docs/specs: annotate MESA_set_3dfx_mode as obsolete
Aimed to work with Glide, which hasn't been a thing in over 10 years.
There are no drivers that implement it, so annotate it as obsolete

v2: Move the extension to OLD/

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-01 19:30:23 +00:00
Emil Velikov
f8aea0ce47 xlib: remove dummy GLX_MESA_set_3dfx_mode implementation
The implementation is a simple 'return EGL_FALSE'. Stop pretending and
simply remove it.

Note: the removal of XMesa API is fine, since there hasn't been any
users for it in years.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-01 19:30:23 +00:00
Emil Velikov
7a4107291d docs/specs: annotate MESA_agp_offset as obsolete
No Mesa driver has implemented the extension in ages. Seemingly non Mesa
drivers don't implement it either.

As mentioned by Ian, the extension is effectively superseded by
ARB_vertex_buffer_object.

v2: Move the extension to OLD/

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-01 19:30:23 +00:00
Emil Velikov
bcf0ce4016 xlib: remove empty GLX_MESA_agp_offset stubs
The extension was never implemented and seemingly never will.
The DRI based libGL dropped support for it over 10 years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-01 19:30:23 +00:00
Emil Velikov
b1e7386f1b xlib: remove empty GLX_NV_vertex_array_range stubs
The extension was never implemented and seemingly never will.
The DRI based libGL dropped support for it over 10 years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-01 19:30:23 +00:00
Rafael Antognolli
e20830db96 i965/gen10: Change the order of PIPE_CONTROL and load register.
I believe the workaround describes that the MI_LOAD_REGISTER_IMM should
come right after the 3DSTATE_SAMPLE_PATTERN.

This fixes GPU hangs in the i965 initial state batchbuffer when running
some Piglit tests with always_flush_batch=true.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-01 11:27:27 -08:00
Rafael Antognolli
2919adffe9 intel/compiler: Implement WaClearTDRRegBeforeEOTForNonPS.
The bspec describes:

   "WA: Clear tdr register before send EOT in all non-PS shader kernels

   mov(8) tdr0:ud 0x0:ud {NoMask}"

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-01 11:27:27 -08:00
Rafael Antognolli
979fc1bc9b i965/gen10: emit 3DSTATE_MULTISAMPLE more often.
On CNL, we see multiple multisample failures on piglit tests. By
emitting this extra state, though not documented in the bspec, those
failures seem to go away.

This workaround could be removed if we ever find out a better solution,
but it should be good enough for now.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-01 11:27:19 -08:00
Dylan Baker
dbeb278e0d meson: install khrplatform header for EGL as well as GLES
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 10:39:19 -08:00
Dylan Baker
91244db186 meson: install dri internal header
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 10:39:04 -08:00
Jason Ekstrand
ee57b15ec7 i965: Disable regular fast-clears (CCS_D) on gen9+
This partially reverts commit 3e57e9494c
which caused a bunch of GPU hangs on several Source titles.  To date, we
have no clue why these hangs are actually happening.  This undoes the
final effect of 3e57e9494c and gets us back to not hanging.  Tested
with Team Fortress 2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102435
Fixes: 3e57e9494c
Cc: mesa-stable@lists.freedesktop.org
2017-12-01 10:14:28 -08:00
Vadym Shovkoplias
a1b4f1877f egl/x11: Remove unneeded free() on always null string
In this condition dri2_dpy->driver_name string always equals
NULL, so call to free() is useless

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 15:15:30 +00:00
Eric Engestrom
29ee934331 gallium/hud: use #ifdef to test for macro existence
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-01 13:49:42 +00:00
Eric Engestrom
13a7a2d455 amd: remove always-true BRAHMA_BUILD define
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-01 13:49:42 +00:00
Vadym Shovkoplias
d555929239 glx/dri3: Remove unused deviceName variable
deviceName string is declared, assigned and freed but actually
never used in dri3_create_screen() function.

Fixes: 2d94601582 ("Add DRI3+Present loader")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 13:49:42 +00:00
George Kyriazis
95adbe1a4e swr/scons: Fix intermittent build failure
gen_rasterizer*.cpp depends on gen_ar_eventhandler.hpp.
Account for new dependency.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-12-01 07:47:13 -06:00
Samuel Pitoiset
80e6e71b82 radv: only reset command buffers when the allocation fails
"vkAllocateCommandBuffers can be used to create multiple command
    buffers. If the creation of any of those command buffers fails, the
    implementation must destroy all successfully created command buffer
    objects from this command, set all entries of the pCommandBuffers
    array to NULL and return the error."

This has been suggested by gabriel@system.is.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-01 11:38:34 +01:00
Samuel Pitoiset
921986b580 radv: do not dump meta shaders with RADV_DEBUG=shaders
It's really annoying and this pollutes the output especially
when a bunch of non-meta shaders are compiled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-01 11:38:26 +01:00
Dave Airlie
4e7f6437b5 r600: add ARB_shader_storage_buffer_object support (v3)
This just builds on the image support. Evergreen only has ssbo
for fragment and compute no other stages.

v2: handle images and ssbo in the same shader properly (Ilia)
v3: fix RESQ on buffers,
    fix missing atom emit
    fix first element offset
    use R32 format
    write separate buffer rat store path.
(from running deqp gles3.1 tests)

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-01 06:12:31 +00:00
Dave Airlie
c758fd05d8 r600/cayman: looks like cmpxchg moved to Z
On cayman it appears the cmp component is now in Z.

Fixes:
arb_shader_image_load_store-dead-fragments on cayman.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-01 03:59:17 +00:00
Dave Airlie
4f3e73516c r600/shader: fix 64->32 conversions
These didn't handle the TGSI at all properly, this fixes
them to use the common path for 64->32 then adds the 32->int
on at the end.

Fixes:
generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/*

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-01 03:48:35 +00:00
Samuel Pitoiset
ff0f17da14 radv: do not allocate CMASK or DCC for small surfaces
The idea is ported from RadeonSI, but using 512x512 instead of
256x256 seems slightly better. This improves dota2 performance
by +2%.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-11-30 21:38:30 +01:00
Samuel Pitoiset
f5955c6bf8 radv: do not set DISABLE_LSB_CEIL on GFX9
The state no longer exists on GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-30 21:38:01 +01:00
Samuel Pitoiset
319f56e675 radv: remove set but unnecessary radv_color_buffer_info::micro_tile_mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-30 21:38:00 +01:00
Samuel Pitoiset
4eab78b03c radv: do not store gfx9_epitch in radv_color_buffer_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-30 21:37:58 +01:00
Dylan Baker
7776dc32eb meson: fix glxext.h install
Another typo, the glext.h header was being install instead.

Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 10:00:49 -08:00
Dylan Baker
a80a3e4cbb meson: fix GLES3/gl31.h install
This is a typo, gl32.h is installed twice.

Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 10:00:49 -08:00
Marek Olšák
186adc514b ac/surface: always compute DCC info when DCC is possible on GFX9
The same code for VI doesn't check for scanout either.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-30 18:46:11 +01:00
Marek Olšák
ed4780383c radeonsi/gfx9: fix importing shared textures with DCC
VI has 11 dwords at least. GFX9 has 10 dwords.

Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-30 18:46:11 +01:00
Jon Turney
6f0ce2617e meson: fix deps and underlinkage of libGL
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 15:09:21 +00:00
Jon Turney
5ef75cb02b meson: build src/glx/windows
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 15:09:21 +00:00
Jon Turney
3ae998a743 meson: don't require dri2proto for darwin or windows
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 15:09:21 +00:00
Jon Turney
dbe36e3b17 meson: set _GNU_SOURCE on cygwin
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 15:09:21 +00:00
Jon Turney
9cdd41b18a meson: set windows glx defines
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 15:09:21 +00:00
Dylan Baker
bb5d663b39 meson: fix generated source inclusion on macOS and Windows
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 15:09:21 +00:00
Vadym Shovkoplias
cdb3eb7174 intel/blorp: Fix possible NULL pointer dereferencing
Fix incomplete check of input params in blorp_surf_convert_to_uncompressed()
which can lead to NULL pointer dereferencing.

Fixes: 5ae8043fed ("intel/blorp: Add an entrypoint for doing
bit-for-bit copies")
Fixes: f395d0abc8 ("intel/blorp: Internally expose
surf_convert_to_uncompressed")
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-11-30 16:20:05 +02:00
Tapani Pälli
faccbaf3fa mesa: add AllowGLSLCrossStageInterpolationMismatch workaround
This fixes issues seen with certain versions of Unreal Engine 4 editor
and games built with that using GLSL 4.30.

v2: add driinfo_gallium change (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801
Acked-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-30 11:43:10 +02:00
Vinson Lee
8c1e4b1afc anv: Check if memfd_create is already defined.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103909
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 01:36:46 -08:00
Iago Toral Quiroga
8620f7ebbc i965/vec4: use a temp register to compute offsets for pull loads
64-bit pull loads are implemented by emitting 2 separate
32-bit pull load messages, where the second message loads from
an offset at +16B.

That addition of 16B to the original offset should not alter the
original offset register used as source for the pull load instruction
though, since the compiler might use that same offset register in other
instructions (for example, for other pull loads in the shader code
that take that same offset as reference).

If the pull load is 32-bit then we only need to emit one message and
we don't need to do offset calculations, but in that case the optimizer
should be able to drop the redundant MOV.

Fixes the following test on Haswell:
KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components

Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103007
2017-11-30 07:57:53 +01:00
Wladimir J. van der Laan
f1a9a724f9 etnaviv: GC7000: Factor out state based texture functionality
Prepare for two texture handling paths, the descriptor-based
path will be added in a future commit. These are structured
so that the texture implementation handles its own state
emission.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:20 +01:00
Wladimir J. van der Laan
075f8cd7de etnaviv: GC7000: Move active_samplers_bits to texture
This needs to be shared between texture_plain and texture_desc.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:16 +01:00
Wladimir J. van der Laan
260a5e2a1a etnaviv: GC7000: Factor out incompatible texture handling logic
This will be shared with the texture descriptor path.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:11 +01:00
Wladimir J. van der Laan
9d1f8805b0 etnaviv: GC7000: Track dirty sampler views
Need this to efficiently emit texture descriptor invalidations.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:07 +01:00
Wladimir J. van der Laan
5cc36f9f21 etnaviv: GC7000: Make point sprites work on HALTI5
Track varying component offset of the point size output, as well as
provide the offset of the point coord input.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:02 +01:00
Wladimir J. van der Laan
3d09bb390a etnaviv: GC7000: State changes for HALTI3..5
Update state objects to add new state, and emit function to emit new
state.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:32:33 +01:00
Wladimir J. van der Laan
acd3dff463 etnaviv: GC7000: Update screen specs for HALTI5
- This core must load shaders from memory (AFAIK)
- Yet another new location for UNIFORMS

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:32:21 +01:00
Wladimir J. van der Laan
c6033e84bb etnaviv: GC7000: Update context reset for ..HALTI5
Update context reset for HALTI3..HALTI5, sorting states for the HALTI
version that has them.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:28:09 +01:00
Wladimir J. van der Laan
baff59ebf0 etnaviv: GC7000: No RS align when using BLT
RS align is not necessary and might even be harmful when using the BLT
engine for blitting.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:28:02 +01:00
Wladimir J. van der Laan
dd3a04c2c3 etnaviv: GC7000: BLT engine blitting support
Add an implemenation of key clear_blit functions using the BLT engine
that replaced the RS on GC7000.

Also set level->size correctly for imported resources. This is important
for the BLT resolve-in-place path to work for them.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:27:57 +01:00
Wladimir J. van der Laan
079bbaec0c etnaviv: GC7000: Factor out RS blit functionality
Prepare for BLT-based blitting path by moving RS-based
blitting to the RS implementation file, making this
self-contained.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:27:53 +01:00
Wladimir J. van der Laan
77768b1859 etnaviv: GC7000: Move etna_coalesce to emit header file
Want to be able to emit state from the texture implementation,
and the blitter implementation.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:27:48 +01:00
Wladimir J. van der Laan
571d980695 etnaviv: GC7000: Support BLT as recipient for etna_stall
When the BLT is involved as source or target, add an extra BLT
enable/disable sequence around the sync sequence.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:27:43 +01:00
Wladimir J. van der Laan
150d8766ea etnaviv: Use only DRAW_INSTANCED on GC3000+
The blob does this, as DRAW_INSTANCED can replace fully all the other
draw commands. It is also required to handle integer vertex formats.
The other path is only there for compatibility and might go away (or at
least rot to become buggy due to dis-use) in newer hardware.

As a by-effect this changes the behavior for GC3000-, by no longer using
the index offset for DRAW_INDEXED but instead adding it to INDEX_ADDR.
This should make no difference.

Preparation for GC7000 support.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-11-30 07:26:55 +01:00
Wladimir J. van der Laan
23630ab1b6 etnaviv: Emit SCALE for vertex attributes
This is used by HALTI2+ (GC3000+) when drawing with DRAW_INSTANCED.

It is also necessary when switching between integer and floating point
vertex element formats.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:26:46 +01:00
Kenneth Graunke
74e38739ca i965: Reorganize batch/state BO fields into a 'brw_growing_bo' struct.
We're about to add more of them, and need to pass the whole lot of them
around together when growing them.  Putting them in a struct makes this
much easier.

brw->batch.batch.bo is a bit of a mouthful, but it's nice to have things
labeled 'batch' and 'state' now that we have multiple buffers.

Fixes: 2dfc119f22 "i965: Grow the batch/state buffers if we need space and can't flush."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103101
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-29 17:30:35 -08:00
Kenneth Graunke
ca43616586 i965: Don't grow batch/state buffer on every emit after an overflow.
Once we reach the intended size of the buffer (BATCH_SZ or STATE_SZ), we
try and flush.  If we're not allowed to flush, we resort to growing the
buffer so that there's space for the data we need to emit.

We accidentally got the threshold wrong.  The first non-wrappable call
beyond (e.g.) STATE_SZ would grow the buffer to floor(1.5 * STATE_SZ),
The next call would see we were beyond STATE_SZ and think we needed to
grow a second time - when the buffer was already large enough.

We still want to flush when we hit STATE_SZ, but for growing, we should
use the actual size of the buffer as the threshold.  This way, we only
grow when actually necessary.

v2: Simplify the control flow (suggested by Jordan)

Fixes: 2dfc119f22 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-11-29 17:30:35 -08:00
Kenneth Graunke
52d32917e1 i965: Preserve EXEC_OBJECT_CAPTURE when growing the BO.
The original state buffer was marked with EXEC_OBJECT_CAPTURE.  When
growing it, we want to preserve that flag so we continue to capture it
in GPU hang reports.

Fixes: 2dfc119f22 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-29 17:30:35 -08:00
Kenneth Graunke
2af7085460 i965: Use old_bo->align when growing batch/state buffer instead of 4096.
The intention here is make the new BO use the same alignment as the old
BO.  This isn't strictly necessary, but we would have to update the
'alignment' field in the validation list when swapping it out, and we
don't bother today.

The batch and state buffers use an alignment of 4096, so this should be
equivalent - it's just clearer than cut and pasting a magic constant.

Fixes: 2dfc119f22 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-11-29 17:30:35 -08:00
Dave Airlie
2c4861e453 r600: no need to reinit compute regs
Compute setup gets emitted into the normal gfx state buffer,
so no need to reinit the basics.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-30 09:53:22 +10:00
Dave Airlie
ea355e29f7 r600: split cb setup code out from evergreen compute path.
This just makes it easier to bypass for TGSI later.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-30 09:39:25 +10:00
Dave Airlie
77c70e5fe5 r600: add support for compute pkt flags to debug dumping.
This just lets us see packets marked for compute.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-30 09:32:31 +10:00
Dave Airlie
779306c8b6 r600: fix bfe where src/dst are same.
This fixes overlaps where src/dst are the same.

Fixes a bunch of the deqp bitfield tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-30 09:32:31 +10:00
Adam Jackson
0d044351b7 gallium/dri2: Enable {GLX_ARB,EGL_KHR}_context_flush_control
Reviewed-and-tested-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-29 16:00:24 -05:00
Kenneth Graunke
cfc5af588c i965: Program the dynamic state heap size to MAX_STATE_SIZE.
STATE_BASE_ADDRESS specifies a maximum size of the dynamic state
section, beyond which data supposedly reads back as 0.  On Gen8+,
we were programming it to the size of the buffer.  This worked fine
until we started growing the state buffer in commit 2dfc119f22.
When the state buffer grows, the value in STATE_BASE_ADDRESS becomes
too small, and our state beyond STATE_SZ bytes would read back as 0.

To avoid having to update the value, we program it to MAX_STATE_SIZE.
We used to program the upper bound to the maximum on older hardware
anyway, so programming it too large isn't a big deal.

Bogus SURFACE_STATE can easily lead to GPU hangs and misrendering.
DiRT Rally was hitting the statebuffer growth path, and suffered from
bad texture corruption and GPU hangs (usually around the same time).

This patch fixes both issues.

Fixes: 2dfc119f22 "i965: Grow the batch/state buffers if we need space and can't flush."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103101
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-29 11:48:29 -08:00
Marek Olšák
2c5f2936af r300,r600,radeonsi: replace RADEON_FLUSH_* with PIPE_FLUSH_*
and handle PIPE_FLUSH_HINT_FINISH in r300.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
950221f923 radeonsi: remove r600_common_screen
Most files in gallium/radeon now include si_pipe.h.

chip_class and family are now here:
    sscreen->info.family
    sscreen->info.chip_class

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
4d1fe8f964 radeonsi: remove r600_pipe_common::barrier_flags::compute_to_L2
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
c0d44fe0e9 radeonsi: remove query/apply_opaque_metadata callbacks
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
2208b760f3 radeonsi: move shader debug helpers out of r600_pipe_common.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
e4cce7dbba radeonsi: dismantle si_common_screen_init/destroy
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
e32d3a648e radeonsi: document our vendor string situation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
eae85b99fc radeonsi: set all pipe buffer functions in r600_buffer_common.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
63f88644a5 radeonsi/uvd: don't call ws->query_info
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
b86feec390 radeonsi: move video queries into si_get.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
30d5f2c942 radeonsi: remove more functions from r600_pipe_common.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
757ea3e613 radeonsi: move/remove ac_shader_binary helpers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
03e2adc990 radeonsi: move all get functions to si_get.c; disk_cache_create to si_pipe.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
1823bbbb1a radeonsi: remove R600_CONTEXT_* flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
d96c7e7822 radeonsi: just include si_pipe.h in r600_query.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
c63e225bff radeonsi: remove some definitions and helpers from r600_pipe_common.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
175ee084ff radeonsi: don't use fast color clear for small surfaces
This removes 35+ clear eliminate passes from DOTA 2.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
8a58724ac9 radeonsi: unify code setting dirty_level_mask for fast clear
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
980bf9a27e radeonsi: clean up si_do_fast_color_clear parameters
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
092756f23f radeonsi: remove r600_common_context::clear_buffer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
b191e2d79d radeonsi: move r600_test_dma.c into si_test_dma.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
132471bde1 radeonsi: move si_pipe_clear_buffer into si_cp_dma.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
7aa2366b70 radeonsi: move all clear() code into si_clear.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
3c4d871ca2 radeonsi: enable DCC with MSAA for VI
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
373f4a48ae radeonsi: implement fast color clear for DCC with MSAA for VI
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
36ac7a1b0e radeonsi: add a workaround for blending with DCC and MSAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
d1f65e5e99 radeonsi: clear PIPE_IMAGE_ACCESS_WRITE when it's invalid to be on the safe side
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
e3c0a5b6e8 ac/surface: enable DCC computation for MSAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
6863651bbd radeonsi: fix layered DCC fast clear
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Jon Turney
2c62ccb10a util: Also include endian.h on cygwin
If u_endian.h can't determine the endianess, the default behaviour in sha1.c
is to build for big-endian

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-29 14:04:40 +00:00
Juan A. Suarez Romero
6d540aa092 mesa: deal with vs_inputs as 64-bit unsigned integer
Commit 78942e ("mesa: shrink VERT_ATTRIB bitfields to 32 bits") uses
vs_prog_data->vs_inputs as if it were a 32-bit unsigned integer.

But actually it is a 64-bit integer, and as such it is used in other
parts of Mesa code. It is worth to note that bits from the entire range
are used, and not only 32-bits. This is due our implementation for
handling 64-bit dual-slot input attributes, which requires to use a
larger bitfield to manage them.

This commit reverts the changes done in brw_draw_upload.c, keeping the
rest of the changes.

This fixes the following tests:

- KHR-GL45.enhanced_layouts.varying_array_locations
- KHR-GL45.enhanced_layouts.varying_locations

Fixes: 78942e ("mesa: shrink VERT_ATTRIB bitfields to 32 bits")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103942
CC: Marek Olšák <marek.olsak@amd.com>
CC: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-11-29 12:13:10 +01:00
Timothy Arceri
a39a3b4b76 mesa: rework _mesa_add_parameter() to only add a single param
This is more inline with what the functions name suggests it should
do, and makes the code much easier to follow.

This will also make adding uniform packing support much simpler.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 21:50:48 +11:00
Dave Airlie
f8a54c489d r600: lds load cleanups.
This is just some cleanups on top of the last patch from my compute branch.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-29 13:18:44 +10:00
Gert Wollny
76837e29e3 r600_shader: only load from LDS what is really used
Use the destination write mask to determine which values are really to be
read from LDS and load only these.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2017-11-29 13:08:29 +10:00
Dave Airlie
579ec9c311 r600/sb: handle jump after target to end of program. (v2)
This fixes hangs on cayman with
tests/spec/arb_tessellation_shader/execution/trivial-tess-gs_no-gs-inputs.shader_test

This has a single if/else in it, and when this peephole activated,
it would set the jump target to NULL if there was no instruction
after the final POP. This adds a NOP if we get a jump in this case,
and seems to fix the hangs, so we have a valid target for the ELSE
instruction to go to, instead of 0 (which causes infinite loops).

v2: update last_cf correctly. (I had some other patches hide this)

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-29 11:52:53 +10:00
Kenneth Graunke
6b91610fc6 i965: Change a ret == -1 check to ret != 0.
For consistency with most other ret checks.  Suggested by Chris.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-28 15:23:16 -08:00
Kenneth Graunke
874d41add3 i965: Use C99 struct initializers in brw_bufmgr.c.
This is cleaner than using a non-standard memclear macro (which does a
memset to 0) and then initializing fields after the fact.  We move the
declarations to where we initialized the fields.  While we're at it, we
move the declaration of 'ret' that goes with the ioctl, eliminating the
declaration section altogether.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-28 15:23:16 -08:00
Kenneth Graunke
3d68329a65 i965: Move perf_debug and WARN_ONCE back to brw_context.h.
These were moved to src/intel/common/gen_debug.h, but they are not
common code.  They assume that brw_context or gl_context variables
exist, named brw or ctx.  That isn't remotely true outside of i965.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-28 15:23:16 -08:00
Eric Engestrom
07d3966694 i965: const a few structs and vars to avoid writing to them by accident
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 15:23:16 -08:00
Kenneth Graunke
760e0156df i965: Fix Smooth Point Enables.
We want to program the 3DSTATE_RASTER field to the gl_context value,
not the other way around.

Fixes: 13ac46557a (i965: Port Gen8+ 3DSTATE_RASTER state to genxml.)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-28 15:23:16 -08:00
Dylan Baker
43b0e5f5cd meson: build virgl driver
Build tested only.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-28 14:06:38 -08:00
Dylan Baker
a537231b22 meson: build svga driver on linux
Build tested only.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-28 14:06:36 -08:00
Dylan Baker
5060c51b6f meson: build r600 driver
v4: - Ensure inc_amd_common defined when radeonsi is disabled (needed by
      r600)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-28 14:06:33 -08:00
Dylan Baker
4ae08296d0 meson: build r300 driver
This is build tested only

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-28 14:06:30 -08:00
Dylan Baker
9169dde941 meson: build i915g driver
Build tested only.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-28 14:06:26 -08:00
Brian Paul
c5d199fa2c svga: move svga_is_format_supported() to svga_format.c
where the other format-related functions live.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-28 06:50:16 -07:00
Brian Paul
bae5b2a87c svga: s/unsigned/SVGA3dDevCapIndex/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-28 06:50:16 -07:00
Lionel Landwerlin
addfa4c5e8 i965: perf: add support for CoffeeLake GT3
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 13:34:04 +00:00
Lionel Landwerlin
b5f6b9b0eb i965: perf: add support for CoffeeLake GT2
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 13:34:04 +00:00
Lionel Landwerlin
74f41fd781 i965: perf: add busyness metric sets on gen8/9 platforms
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 13:34:04 +00:00
Lionel Landwerlin
a543ae4c2a i965: fix time elapsed counter equations in VME/Media configs
There was a mistake just in those metric sets. We probably didn't
noticed because they're not really interesting for 3D workloads.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 13:34:04 +00:00
Lionel Landwerlin
064a4831e3 i965: perf: update counter names on gen8/9 platforms
Just fixing names.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 13:34:04 +00:00
Lionel Landwerlin
349712018b i965: add a debug option to disable oa config loading
This provides a good way to verify we haven't broken using the perf
driver on older kernels (which don't have the oa config loading
mechanism).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 13:34:04 +00:00
Lionel Landwerlin
27ee83eaf7 i965: perf: add support for userspace configurations
This allows us to deploy new configurations without touching the
kernel.

v2: Detect loadable configs without creating one (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 13:34:04 +00:00
Lionel Landwerlin
3e7112e603 i965: perf: update configs for loading from userspace
When making configs loadable from userspace in the kernel, we left to
userspace more responsability around programming some registers. In
particular one register we use to set directly in the driver has now
been moved into the configs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-28 13:34:04 +00:00
Eric Engestrom
44fbbd6fd0 util: add mesa-sha1 test to meson
Fixes: 513d7ffa23 "util: Add a SHA1 unit test program"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-28 11:06:04 +00:00
Eric Engestrom
9d281e1506 compiler: fix typo
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-28 10:54:38 +00:00
Eric Engestrom
7b85b9b877 compiler: use NDEBUG to guard asserts
nir_validate.c's #endif already had the correct NDEBUG comment

Fixes: dcb1acdea0 "nir/validate: Only build in debug mode"
Fixes: 9ff71b649b "i965/nir: Validate that NIR passes call nir_metadata_preserve()"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-28 10:54:38 +00:00
Eric Engestrom
bb46111c01 broadcom: use NDEBUG to guard asserts
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-28 09:50:36 +00:00
Eric Engestrom
7bb89e1c8f vc4: check preprocessor token existence using #ifdef instead of #if
(other uses of USE_VC4_SIMULATOR are already correct)

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-28 09:50:36 +00:00
Ben Crocker
b43daf7bf6 docs/llvmpipe.html: Minor edits
Language and spelling fixups in three places.

Cc: "17.2" "17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

[Eric: move two fixes from the other patch to this one.]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-28 09:50:36 +00:00
Eric Engestrom
bca122902a st/dri: replace hard-coded array size with ARRAY_SIZE()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:50:36 +00:00
Nicolai Hähnle
dd07868904 radeonsi/gfx9: simplify condition for on-chip ESGS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
239d2b5809 radeonsi: clarify that si_shader_selector::esgs_itemsize is set for the ES part
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
26da5d0317 radeonsi: use si_shader_context instead of lp_build_context in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
1c2d19d84d radeonsi: cleanup si_initialize_color_surface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
08f6b4dd7b radeonsi: avoid attempting to create CMASK if the tiling mode doesn't have it
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
e52e8326d9 radeonsi: check that we don't leak fine.buf references
Just as an added precaution.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
377a062321 ac/surface: fix indentation
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
97f42d11df amd/common: sid.h cleanups
Fix a bunch of labels indicating when registers were added/removed
and normalize the SI-class GRBM_GFX_INDEX.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
7e35bdad1c st_glsl_to_tgsi: check for the tail sentinel in merge_two_dsts
This fixes yet another case where DFRACEXP has only one destination. Found
by address sanitizer.

Fixes tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4-only-mantissa.shader_test

Fixes: 3b666aa747 ("st/glsl_to_tgsi: fix DFRACEXP with only one destination")
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:31:33 +01:00
Tapani Pälli
1e508e10d9 mesa/gles: adjust internal format in glTexSubImage2D error checks
When floating point textures are created on OpenGL ES 2.0, driver
is free to choose used internal format. Mesa makes this decision in
adjust_for_oes_float_texture. Error checking for glTexImage2D properly
checks that sized formats are not used. We use same error checking
path for glTexSubImage2D (since there is lot of overlap), however since
those checks include internalFormat checks, we need to pass original
internalFormat passed by the client. Patch adds oes_float_internal_format
that does reverse adjust_for_oes_float_texture to get that format.

Fixes following test failure:
   ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float

(when running test with MESA_GLES_VERSION_OVERRIDE=2.0)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-28 08:57:49 +02:00
Jason Ekstrand
049b84246e radv: Use the suffixed versions of VK_QUEUE_GLOBAL_PRIORITY_*
Acked-by: Dave Airlie <airlied@redhat.com>
2017-11-27 21:42:06 -08:00
Jason Ekstrand
07850893a1 vulkan: Update the XML and headers to 1.0.66
Acked-by: Dave Airlie <airlied@redhat.com>
2017-11-27 21:41:46 -08:00
Jason Ekstrand
d7c8c7bd9d intel/blorp: Drop blorp_resolve_ccs_attachment
The only reason why we needed that version was because the Vulkan driver
needed to be able to create the surface states so it could handle
indirect clear colors.  Now that blorp handles them natively, there's no
need for the extra entrypoint.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-27 16:22:13 -08:00
Jason Ekstrand
5bc2849af9 anv: Let blorp handle indirect clear colors for CCS resolves
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-27 16:22:13 -08:00
Jason Ekstrand
34b95f88e6 anv: Move get_fast_clear_state_address into anv_private.h
While we're at it, we break it into two nicely named functions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-27 16:22:13 -08:00
Jason Ekstrand
8915621882 intel/blorp: Take a range of layers in blorp_ccs_resolve
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-27 16:22:13 -08:00
Jason Ekstrand
67b676f0c5 intel/blorp: Add initial support for indirect clear colors
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-27 16:22:12 -08:00
Jason Ekstrand
85aa4074a2 i965/blorp: Use a designated initializer for blorp_surf
This way uninitialized fields get automatically zeroed and it's safe to
add more fields to blorp_surf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-27 16:22:12 -08:00
Jason Ekstrand
86becfd2de intel/blorp: Add fast-clear to the special case in MSAA resolves
This doesn't go all the way of avoiding the txf_ms if it's fast-cleared,
however it does at least make us only do it once.  This should improve
performance of MSAA resolves in the presence of lots of clear color.
Without the patch, enabling fast-clears in the multisampling Sascha demo
drops the framerate by about 10%.  With this patch, enabling fast-clears
increases the demo's framerate by 25%.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-27 16:22:11 -08:00
Jason Ekstrand
dc21c3937c intel/blorp/blit: Rename blorp_nir_txf_ms_mcs
That name is already taken by one of the helpers in blorp_nir_builder.h
and, while we haven't moved the guts of blorp_blit.c there yet, we'd
like to start using some things from that header.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-27 16:19:38 -08:00
Rob Herring
46148be8e4 Android: disable warnings causing errors
AOSP master has changed the build default to -Werror making all the
warnings errors. Override that with -Wno-error.

Signed-off-by: Rob Herring <robh@kernel.org>
2017-11-27 17:26:45 -06:00
Timothy Arceri
3e789026ca st/glsl_to_tgsi: make use of driver_cache_blob with the disk cache
driver_cache_blob was introduced with the i965 disk cache, it allows
us to simplify the cache a little and possibly offers some minor
speed improvements since we load the GLSL metadata and TGSI from
disk in one pass.

Using driver_cache_blob should also make it straight forward to
implement binary support for ARB_get_program_binary in gallium.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:01:44 +11:00
Gwan-gyeong Mun
4cb27047c8 glsl: Fix typo nagivation -> navigation
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-28 08:48:55 +11:00
Emil Velikov
c7616ac069 gl_table.py: add extern C guard for the generated glapitable.h
The header can be included from C++, hence contents should have
appropriate notation.

Cc: mesa-stable@lists.freedesktop.org
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-27 19:23:05 +00:00
Marek Olšák
6b8909f2d1 ac: pack legacy_surf_level better
r600_texture: 1488 -> 1248 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:46:16 +01:00
Marek Olšák
ec15ff78c3 ac: change legacy_surf_level::slice_size to dword units
The next commit will reduce the size even more.

v2: typecast to uint64_t manually
v3: add more typecasts, add asserts

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:44:04 +01:00
Marek Olšák
474b4a9191 ac: pack ac_surface better
r600_texture: 1736 -> 1488 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:12:38 +01:00
Marek Olšák
b5444877c0 radeonsi: always initialize max_forced_staging_uploads
r600_resource is malloc'd.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103808
Fixes: 4b0dc098b2 ("gallium/u_threaded: don't map big VRAM buffers for the first upload directly")

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:12:38 +01:00
Marek Olšák
95cd74abd4 radeonsi: remove an old hack for evergreen
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:12:38 +01:00
Marek Olšák
1cb731012c radeonsi: set COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST when profitable
ported from Vulkan

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:12:38 +01:00
Dave Airlie
043d14db30 ac/nir: don't write tcs outputs to LDS that aren't read back.
If the TCS doesn't read back the outputs, no need to store them
to LDS in the first place. (except for tess factors).

This seems to give about 50fps (3290->3330) with tessellation demo.

I haven't tested if it impacts DoW3 at all.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-27 13:50:24 +10:00
Dave Airlie
33dca36f4f nir: fill outputs_read field and add patch outputs read (v2)
This is to be used for TCS optimisations on radv.

v2: don't set written on reads (nha)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-27 13:50:03 +10:00
Dave Airlie
fd301472bd r600/eg: dump event type in dumps
This just makes it easier to debug some things.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-27 12:53:18 +10:00
Tobias Klausmann
068a72fbcb nouveau/compiler: Allow to omit line numbers when printing instructions
This comes in handy when checking "NV50_PROG_DEBUG=1" outputs with diff!

V2:
 - Use environmental variable (Karol Herbst)
V3:
 - Use the already populated nv50_ir_prog_info to forward information to the
   print pass (Pierre Moreau)
V4:
 - get rid of default value in PrintPass constructor

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-26 12:51:30 -05:00
Nicolai Hähnle
0fed7f83ba radeonsi: try flushing unflushed fences in si_fence_finish even when timeout == 0
Under certain conditions, waiting on a GL sync objects should act like
a flush, regardless of the timeout.

Portal 2, CS:GO, and presumably other Source engine games rely on this
behavior and hang during loading without this fix.

Fixes: bc65dcab3b ("radeonsi: avoid syncing the driver thread in si_fence_finish")

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103902
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103904
2017-11-26 16:53:00 +01:00
Ilia Mirkin
0bd83d0461 nv50/ir: move LateAlgebraicOpt to the very end
Memory loads can take offsets, but the SHLADD will often attempt to
consume the offsets too. As there may be multiple memory loads with the
same base but different offsets, those would end up in a SHLADD instead
of the offset of the memory operation.

This moves the pass after we've had a chance to attempt to propagate
immediate adds into the indirect offset.

total instructions in shared programs : 6580681 -> 6567716 (-0.20%)
total gprs used in shared programs    : 944261 -> 943375 (-0.09%)
total shared used in shared programs  : 0 -> 0 (0.00%)
total local used in shared programs   : 15328 -> 15328 (0.00%)
total bytes used in shared programs   : 60339896 -> 60221504 (-0.20%)

                local     shared        gpr       inst      bytes
    helped           0           0         555        2698        2698
      hurt           0           0         138         336         336

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-26 01:10:19 -05:00
Ilia Mirkin
3072bbef63 nv50/ir: when merging immediates/consts, load directly
When a MERGE operation gets its constraint moves added, it
susbstantially extends live ranges to be reusing an immediate from
earlier in the program (not to mention the silliness of loading an
immediate into a register, and then moving into another register).

We detect these scenarios and insert moves that take the immediate or
constbuf load directly into the register. If it's the last use, then we
can just move that operation to the closer location.

With SM35 (255 regs) we get these results:

total instructions in shared programs : 6583670 -> 6580681 (-0.05%)
total gprs used in shared programs    : 950818 -> 944261 (-0.69%)
total shared used in shared programs  : 0 -> 0 (0.00%)
total local used in shared programs   : 15328 -> 15328 (0.00%)
total bytes used in shared programs   : 60367456 -> 60339896 (-0.05%)

                local     shared        gpr       inst      bytes
    helped           0           0        4584        3186        3186
      hurt           0           0          55         968         968

I suspect they will be better for SM20 and SM30.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-26 01:10:19 -05:00
Ilia Mirkin
50e913b9c5 nv50/ir: add optimization for modulo by a non-power-of-2 value
We can still use the optimized division methods which make use of
multiplication with overflow.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
2017-11-26 01:10:03 -05:00
Ilia Mirkin
3079993727 nv50/ir: optimize signed integer modulo by pow-of-2
It's common to use signed int modulo in GLSL. As it happens, the GLSL
specs allow the result to be undefined, but that seems fairly
surprising. It's not that much more effort to get it right, at least for
positive modulo operators.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-25 22:48:09 -05:00
Matt Turner
676761252b util: Just give up and define PIPE_ARCH_LITTLE_ENDIAN on MSVC
MSVC doesn't support #warning?! Getting really tired of this.
2017-11-25 16:46:00 -08:00
Andres Gomez
5fa589148a docs: remove bug 103626 from fix list as per 17.2.6
Bug https://bugs.freedesktop.org/show_bug.cgi?id=103626 was
incorrectly listed as fixed.

Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit b9b60dbf55)
2017-11-26 02:18:08 +02:00
Matt Turner
b8cbad624b util: Use preprocessor correctly
Fixes: 6a353479a7 ("util: Assume little endian in the absence of
                      platform-specific handling")
2017-11-25 15:57:37 -08:00
Andres Gomez
63d488d10c docs: update calendar, add news item and link release notes for 17.2.6
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-26 01:46:25 +02:00
Andres Gomez
b0049428b5 docs: add sha256 checksums for 17.2.6
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 93c2beafc0)
2017-11-26 01:42:16 +02:00
Andres Gomez
e6acc4d528 docs: add release notes for 17.2.6
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 00b52f8e99)
2017-11-26 01:42:15 +02:00
Ilia Mirkin
f39a91c152 freedreno/a4xx: add ARB_framebuffer_no_attachments support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-11-25 17:20:17 -05:00
Ilia Mirkin
4f748d12e8 freedreno/a4xx: add indirect draw support
This is a copy of the a5xx logic. Fails a few tests, but basic
functionality is there.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-11-25 17:20:17 -05:00
Ilia Mirkin
c3c8d48725 freedreno: regenerate pm4 header, adjust code for new names
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-11-25 17:20:17 -05:00
Ilia Mirkin
ffdcd51e66 freedreno/a4xx: add stencil texturing support
Copied from a5xx, should be identical.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-11-25 17:20:17 -05:00
Ilia Mirkin
86f12e9377 freedreno/ir3: add a pass to lower tg4 to txl, enable gather on a4xx
Unfortunately Adreno A4xx hardware returns incorrect results with the
GATHER4 opcodes. As a result, we have to lower to 4 individual texture
calls (txl since we have to force lod to 0). We achieve this using
offsets, including on cube maps which normally never have offsets.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-11-25 16:56:59 -05:00
Ilia Mirkin
ab336e8b46 nir: allow texture offsets with cube maps
GL doesn't have this, but some hardware supports it. This is convenient
for lowering tg4 to plain texture calls, which is necessary on Adreno
A4xx hardware.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-11-25 16:56:30 -05:00
Matt Turner
c690a7a8cd util: Fix disk_cache index calculation on big endian
The cache-test test program attempts to create a collision (using key_a
and key_a_collide) by making the first two bytes identical. The idea is
fine -- the shader cache wants to use the first four characters of a
SHA1 hex digest as the index.

The following program

        unsigned char array[4] = {1, 2, 3, 4};
        int *ptr = (int *)array;

        for (int i = 0; i < 4; i++) {
            printf("%02x", array[i]);
        }
        printf("\n");

        printf("%08x\n", *ptr);

prints

   01020304
   04030201

on little endian, and

   01020304
   01020304

on big endian.

On big endian platforms reading the character array back as an int (as
is done in disk_cache.c) does not yield the same results as reading the
byte array.

To get the first four characters of the SHA1 hex digest when we mask
with CACHE_INDEX_KEY_MASK, we need to byte swap the int on big endian
platforms.

Bugzilla: https://bugs.freedesktop.org/103668
Bugzilla: https://bugs.gentoo.org/637060
Bugzilla: https://bugs.gentoo.org/636326
Fixes: 87ab26b2ab ("glsl: Add initial functions to implement an
                      on-disk cache")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-25 12:30:46 -08:00
Matt Turner
513d7ffa23 util: Add a SHA1 unit test program
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-25 12:30:46 -08:00
Matt Turner
532674303a util: Fix SHA1 implementation on big endian
The code defines a macro blk0(i) based on the preprocessor condition
BYTE_ORDER == LITTLE_ENDIAN. If true, blk0(i) is defined as a byte swap
operation. Unfortunately, if the preprocessor macros used in the test
are no defined, then the comparison becomes 0 == 0 and it evaluates as
true.

Fixes: d1efa09d34 ("util: import sha1 implementation from OpenBSD")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-25 12:30:46 -08:00
Matt Turner
6a353479a7 util: Assume little endian in the absence of platform-specific handling 2017-11-25 12:30:46 -08:00
Marek Olšák
78942e7dbf mesa: shrink VERT_ATTRIB bitfields to 32 bits
There are only 32 vertex attribs now.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-25 17:18:22 +01:00
Marek Olšák
43abaf2ad0 mesa: remove unused vertex attrib WEIGHT
We don't support ARB_vertex_blend.

Note that the attribute aliasing check for ARB_vertex_program had to be
rewritten.

vbo_context: 20344 -> 20008 bytes
gl_context: 74672 -> 74616 bytes

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-25 17:17:52 +01:00
Marek Olšák
2116b97418 mesa: don't assign numbers to vertex attrib enums manually
I plan to remove one of them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-25 17:17:52 +01:00
Marek Olšák
bd57f45168 gallium/hud: add HUD sharing within a context share group
This is needed for profiling multi-context applications like Chrome.
One context can record queries and another context can draw the HUD.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
11e25eb7f4 gallium/hud: update the HUD interface for multiple contexts
This is the boring subset of the following commit.
All new parameters are optional.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
9c5b4eb6b4 gallium/hud: prevent a crash if the recording context is inactive
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
37ded08321 gallium/hud: separate code for record context init/release
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
fc07acc21e gallium/hud: separate code for draw context init/release
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
8caf7d51a9 gallium/hud: don't use hud->pipe in hud_parse_env_var
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
65433c3fd0 gallium/hud: use cso_get_pipe_context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
e20364df82 cso: add cso_get_pipe_context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
3132afdf4c gallium/hud: pass pipe_context explicitly to most functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
0e319ed835 gallium/hud: split hud_draw into 3 separate functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
e5148791f6 st/dri: remove dead code and incorrect comment around make_current
Core Mesa already handles flushing based on ContextReleaseBehavior,
so the comment is wrong.

Also, old_st is always NULL, because unbind_context always precedes
make_current.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
6ad83b58e2 st/dri: clean up dri_unbind_context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
2cfa319f9f radeonsi: expose all CB performance counters on Stoney
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
797c447f1c radeonsi: handle imported textures with DCC robustly
now you can hack the driver to enable DCC for displayable textures and
Glamor that doesn't enable that by default won't crash anymore.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
992b6e18d0 radeonsi: fix a typo in creating monolithic ES-GS
This has no effect because both occupy the same memory in a union.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
f783677a82 radeonsi: don't write undefined output channels to LDS in LS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
b63e7d4c6f radeonsi: use ac.lds for shared memory
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Marek Olšák
39b098dafb radeonsi: do 64-bit LDS loads recursively
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-25 17:16:56 +01:00
Jon Turney
b6b4b2c6d8 mapi: Teach es{1,2}api/ABI-check shared library names on Cygwin
Ideally we'd be able to get the library filename from libtool, but that
doesn't seem to be a feature...

Use of ${uname} is presumably ok here as we won't be running 'make check' if
we are cross-compiling

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-24 16:53:55 +00:00
Samuel Pitoiset
1cc00b8e0e Revert "radv: remove unnecessary memset() in radv_AllocateCommandBuffers()"
This fixes two CTS regressions:
- dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
- dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary

These two tests are part the mustpass lists, so presumably they
are correct and my change was wrong.

This reverts commit 0f68208f1d.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-24 12:26:35 +01:00
Samuel Pitoiset
dc391a406a radv/winsys: improve error messages when the buffer list creation failed
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-24 11:18:43 +01:00
Samuel Pitoiset
15c0df785b radv/winsys: do not try to create a BO list with 0 buffers
This happens when all BOs have the RADEON_FLAG_NO_INTERPROCESS_SHARING
(DRM version >= 3.23) flag set. This flag is mainly used for reducing
overhead on the userspace side because we don't have to put those BOs
inside the list.

Though, if the driver tries to create a list with 0 buffers inside it,
libdrm returns -EINVAL and the app just crashes.

This fixes a bunch of CTS dEQP-VK.sparse_resources.* fails (~100).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-24 11:18:38 +01:00
Iago Toral Quiroga
f1873956db i965/vec4: fix splitting of interleaved attributes
When we split an instruction that reads an uniform value
(vstride 0) we need to respect the vstride on the second
half of the instruction (that is, the second half should
read the same region as the first).

We were doing this already, but we didn't account for
stages that have interleaved input attributes which also
have a vstride of 0 and need the same treatment.

Fixes the following on Haswell:
KHR-GL45.enhanced_layouts.varying_locations
KHR-GL45.enhanced_layouts.varying_array_locations
KHR-GL45.enhanced_layouts.varying_structure_locations

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Andres Gomez <agomez@igalia.com>
2017-11-24 09:24:06 +01:00
Wladimir J. van der Laan
35548cae93 etnaviv: Emit vertex buffers consecutively
Vertex buffer legacy state is no longer picked up with new drawing
commands. Change to use different cases depending on the number of
vertex streams in the GPU specs.

This results in slightly more compact state emission as well, on all
vivantes.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-23 22:24:51 +01:00
Eric Engestrom
99aea1e3de REVIEWERS: add Alexander von Gluck IV as a reviewer for Haiku
There's been some Haiku-related activity lately, so let's document who
to cc on these patches.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Alexander von Gluck IV <kallisti5@unixzen.com>
2017-11-23 10:00:55 +00:00
Eric Engestrom
1d3944aeeb genxml: fix assert guards
This removes a few hundred warnings on debug builds with asserts off.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-23 09:44:16 +00:00
Eric Engestrom
f9cb2370f3 meson: add variable for mapi_abi.py instead of going back up the tree
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-23 09:44:16 +00:00
Eric Engestrom
d16af73559 meson: reorder subdirs to avoid directly including more than one level
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-23 09:44:16 +00:00
Eric Engestrom
ab0809e552 meson: fix strtof locale support check
Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-23 09:44:16 +00:00
Roland Scheidegger
71e630753e r600: set DX10_CLAMP for compute shader too
I really intended to set this for all shader stages by
3835009796 but missed it for compute shaders
(because it's in a different source file...).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-23 02:28:38 +01:00
Lionel Landwerlin
d4c52c5408 anv: flag batch & instruction BOs for capture
When the kernel support flagging our BO, let's mark batch &
instruction BOs for capture so then can be included in the error
state.

v2: Only add EXEC_CAPTURE if supported (Kristian)

v3: Fix operator precedence issue (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-22 22:53:27 +00:00
Lionel Landwerlin
118a8c7587 anv: setup BO flags at state_pool/block_pool creation
This will allow to set the flags on any anv_bo created/filled from a
state pool or block pool later.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-22 22:53:27 +00:00
Gert Wollny
799d350870 r600/shader: Fix all warnings issed with "-Wall -Wextra"
- fix a number of -Wsign-compare warnings
- fix two warnings for -Woverride-init because TGSI_OPCODE_CEIL == 83, and
  the according field was defined two times.

[airlied: don't use -1 with unsigned type,
fix whitespace]

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-22 22:50:18 +00:00
Gert Wollny
1d076aafbc r600: Emit EOP for more CF instruction types
So far on pre-cayman chipsets the CF instructions CF_OP_LOOP_END,
CF_OP_CALL_FS, CF_OP_POP, and CF_OP_GDS an extra CF_NOP instruction
was added to add the EOP flag, even though this is not actually
needed, because all these instrutions support the EOP flag.

This patch removes the fixup code, adds setting the EOP flag for the
according instructions as well as others like CF_OP_TEX and CF_OP_VTX,
and adds writing out EOP for this type of instruction in the disassembler.

This also fixes a bug where shaders were created that didn't actually have
the EOP flag set in the last CF instruction, which might have resulted
in GPU lockups.

[airlied: cleaned up a little]
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-22 22:39:42 +00:00
Dylan Baker
c2dad6ca0a meson: replace with_*dri with with_dri_platform
This fixes the windows and macos stubs to be consistent with the *nix
path.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker
33627d23d0 meson: add logic to select apple and windows dri
This is still not fully correct (haiku and BSD is notably probably not
correct), but Linux is not regressed and this should be correct for
macOS and Windows.

v2: - set the dri_platform to windows on Cygwin as well (Jon)
v3: - Add a better todo for Hurd (Eric)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker
2d1a3bf657 meson: Fix LLVM requires for radeonsi
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker
48f64e591f meson: convert llvm option to tristate
This option has been acting as a strange sort of half-tri state anyway.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker
4b61b07e4b meson: Convert platform to auto
This is necessary to support operating systems other than the *nix
family (excluding macOS). For Linux nothing has changed, the defaults
are still the same.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker
b5d98a101b meson: Remove duplicate _GNU_SOURCE
There is one provided unconditionally, and one guarded by platform ==
linux. Remove the unconditional one.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker
9c3e894ebe meson: Remove completed or irrelevant TODO comments
These are all either done already, or are autotools specific. The
misspelled gallium G3DVL is the autotools specific bit, meson is
handling that via build_by_default.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:46:00 -08:00
Dylan Baker
e89842ebbc meson: Fix TODO for missing dl_iterate_phdr function
This function is required for both the Intel "Anvil" vulkan driver and
the i965 GL driver. Error out if either of those is enabled but this
function isn't found.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:46:00 -08:00
Dylan Baker
2d62fc0646 meson: disable x86 asm in fewer cases.
This patch allows building asm for x86 on x86_64 platforms, when the
operating system is the same. Previously cross compile always turned off
assembly. This allows using a cross file to cross compile x86 binaries
on x86_64 with asm.

This could probably be relaxed further thanks to meson's "exe_wrapper",
which is way to specify an emulator or compatibility layer (wine) that
can run the foreign binaries on the build system. Since the meson build
at this point only supports building on Linux I can't test this and I
don't want to write/enable code that cannot even be build tested.

v4: - set condition to build == x86_64 and host == x86 and
      build.system == host.system

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:46:00 -08:00
Dylan Baker
84486f6462 meson: Enable SSE4.1 optimizations
This patch checks for an and then enables sse4.1 optimizations if the
host machine will be x86/x86_64.

v2: - Don't compile code, it's unnecessary since we require a compiler
      which always has SSE4.1 (Matt)
v3: - x64 -> x86_64 (Matt)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:46:00 -08:00
Eric Anholt
6a78416dab broadcom/vc5: Fix BASE_LEVEL handling with txl.
The HW doesn't add the base level anywhere (the min/max lod clamping is
what does base level), so we need to add it manually in this case.

Fixes piglit tex-miplevel-selection *Lod 2D.
2017-11-22 10:56:31 -08:00
Eric Anholt
c55813c22e broadcom/vc5: Fix array texture layer count setup.
Fixes piglit array-texture.
2017-11-22 10:56:31 -08:00
Eric Anholt
ad1521d708 broadcom/vc5: Don't increment primitive queries while they're paused.
Fixes ext_transform_feedback-generatemipmap prims_generated
2017-11-22 10:56:31 -08:00
Eric Anholt
1214c2ea2a broadcom/vc5: Fix incorrect padding of TF outputs.
After the first output, we were padding by an extra size of the previous
output.  Fixes piglit ext_transform_feedback-output-type mat4x3[2] and
friends.
2017-11-22 10:56:31 -08:00
Eric Anholt
b18840ac6e broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes.
The HW was computing an implicit height for the surface based on the image
size, but that may be smaller than the surface with ARB_fbo mismatched
sizes.  In that case, we need to tell it about the pad, either with the
little 4-bit field in the RT config, or the extended field in
CLEAR_COLORS_PART3.

Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.
2017-11-22 10:56:31 -08:00
Wladimir J. van der Laan
9f162fa107 etnaviv: Put HALTI level in specs
The HALTI level is an indication of the gross architecture of the GPU.
It determines for significant part what feature level the GPU has, what
state (especially frontend state) is there, and where it is located.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-22 14:42:06 +01:00
Wladimir J. van der Laan
391c958f08 etnaviv: Const-correctness etnaviv_emit.h
The relocation structure is never changed by submitting it.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-22 14:42:00 +01:00
Juan A. Suarez Romero
1b0638c65f meson: add si_driinfo.h in libgallium_dri
v2: generate target conditionally (Dylan)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-22 12:35:38 +01:00
Iago Toral Quiroga
a217cbd7ec nir/gather_info: recognize load_patch_vertices_in as a system value
This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is
generated to load gl_PatchVerticesIn in the SPIR-V path for both
Vulkan and OpenGL.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-22 08:03:55 +01:00
Jordan Justen
386f6cd041 i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=bat
This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated
samplers & surfaces.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-21 12:11:57 -08:00
Kristian H. Kristensen
24609377f9 intel/genxml: Add helpers for determining field type
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-21 11:15:06 -08:00
Matt Turner
beaea7abfa i965/fs: Check ADD/MAD with immediates in satprop unit test
The gen had to be changed from 4 to 6 so that we could test MAD, which
is new on Gen6.

mad_imm_float_neg_mov_sat tests the case fixed by the previous commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-21 10:13:07 -08:00
Matt Turner
a05af1f7b8 i965/fs: Handle negating immediates on MADs when propagating saturates
MADs don't take immediate sources, but we allow them in the IR since it
simplifies a lot of things. I neglected to consider that case.

Fixes: 4009a9ead4 ("i965/fs: Allow saturate propagation to propagate
                      negations into MADs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616
Reported-and-Tested-by: Ruslan Kabatsayev <b7.10110111@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-21 10:13:07 -08:00
Juan A. Suarez Romero
ce221cbbcf mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3D
From section 8.7, page 179 of OpenGL ES 3.2 spec:

  An INVALID_OPERATION error is generated by CompressedTexImage3D
  if internalformat is one of the the formats in table 8.17 and target
  is not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY or TEXTURE_3D.

  An INVALID_OPERATION error is generated by CompressedTexImage3D if
  internalformat is TEXTURE_CUBE_MAP_ARRAY and the “Cube Map Array”
  column of table 8.17 is not checked, or if internalformat is
  TEXTURE_3D and the “3D Tex.” column of table 8.17 is not checked.

So far it was only considering TEXTURE_2D_ARRAY as valid target. But as
"Cube Map Array" column is checked for all the cases, in practice we can
consider also TEXTURE_CUBE_MAP_ARRAY.

This fixes KHR-GLES32.core.texture_cube_map_array.etc2_texture

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-21 13:05:42 +01:00
Tapani Pälli
6236ffeb83 intel: fix disasm_info memory leaks
Fixes: 4f82b17287 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-21 08:36:43 +02:00
Timothy Arceri
04a9558497 st/glsl_to_nir: don't generate nir twice for gs
This was left out of c980a3aa31

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-21 15:57:39 +11:00
Roland Scheidegger
b5957cee92 llvmpipe: fix snorm blending
The blend math gets a bit funky due to inverse blend factors being
in range [0,2] rather than [-1,1], our normalized math can't really
cover this.
src_alpha_saturate blend factor has a similar problem too.
(Note that piglit fbo-blending-formats test is mostly useless for
anything but unorm formats, since not just all src/dst values are
between [0,1], but the tests are crafted in a way that the results
are between [0,1] too.)

v2: some formatting fixes, and fix a fairly obscure (to debug)
issue with alpha-only formats (not related to snorm at all), where
blend optimization would think it could simplify the blend equation
if the blend factors were complementary, however was using the
completely unrelated rgb blend factors instead of the alpha ones...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-11-21 04:06:29 +01:00
Dave Airlie
464c2d8083 r600: add cull distance support
This passes all the tests in piglit.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-21 09:00:52 +10:00
Aravindan Muthukumar
971b3c019b i965: Optimize bucket index calculation
Reducing Bucket index calculation to O(1).

This algorithm calculates the index using matrix method.  Assuming
PAGE_SIZE is 4096, matrix arrangement is as below:

          1*4096   2*4096    3*4096    4*4096
          5*4096   6*4096    7*4096    8*4096
          10*4096  12*4096   14*4096   16*4096
          20*4096  24*4096   28*4096   32*4096
           ...      ...       ...       ...
           ...      ...       ...       ...
           ...      ...       ...   max_cache_size

From this matrix its clearly seen that every row follows the below way:

          ...       ...       ...        n
        n+(1/4)n  n+(1/2)n  n+(3/4)n    2n

Row is calculated as log2(size/PAGE_SIZE) Column is calculated as
converting the difference between the elements to fit into power size of
two and indexing it.

Final Index is (row*4)+(col-1)

Tested with Intel Mesa CI.

Improves performance of 3DMark on BXT by 0.705966% +/- 0.229767% (n=20)

v4: Review comments on style and code comments implemented (Ian).
v3: Review comments implemented (Ian).
v2: Review comments implemented (Jason).

Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar Karanje <kedar.j.karanje@intel.com>
Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-20 14:52:42 -08:00
Dylan Baker
c8417c8d25 meson: Guard the gallium dri componenet
Currently the target has a redundant guard, and the state tracker isn't
properly guarded.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-20 14:28:31 -08:00
Dylan Baker
689fb74716 meson: don't build gallium subdir unless we're building gallium
This will allow us to simplify some guards within the gallium directory.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-20 14:28:31 -08:00
Eric Anholt
494effd242 broadcom/vc5: Align 1D texture miplevels to 64b.
Fixes tex-miplevel-selection GL2:texture() 1D
2017-11-20 13:54:45 -08:00
Eric Anholt
9d5972da80 broadcom/vc5: Clamp min lod to the last level.
Otherwise, the simulator would complain in tex-miplevel-selection that the
min/max clamp was out of order.  The actual HW seems to have clamped to
the max anyway.
2017-11-20 13:52:33 -08:00
Eric Anholt
2c8913e224 broadcom/vc5: Increase simulator memory for tex-miplevel-selection.
We were overflowing, because of all the little 4k allocations for CLs that
were getting expanded to 128kb in the simulator due to the GMP alignment.
2017-11-20 13:52:33 -08:00
Tim Rowley
34838c2212 swr/rast: Repair simd8 frontend code rot
Keep non-default simd8 frontend code running for comparison purposes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:51:10 -06:00
Tim Rowley
005d937e15 swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader
Disabled for now.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:51:06 -06:00
Tim Rowley
2e244c7168 swr/rast: Simplify GATHER* jit builder api
General cleanup, and prep work for possibly moving to llvm masked
gather intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:51:01 -06:00
Tim Rowley
44025def06 swr/rast: Add alignment to transpose targets
Needed to ensure alignment for avx512.

Fixes address sanitizer crash.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:56 -06:00
Tim Rowley
bc356b0fc0 swr/rast: Cache eventmanager
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:51 -06:00
Tim Rowley
395a298fa5 swr/rast: Enable AVX-512 targets in the jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:45 -06:00
Tim Rowley
37bb69fb88 swr/rast: Points with clipdistance can't go through simplepoints path
Fixes piglit glsl-1.20:vs-clip-vertex-primitives and
glsl-1.30:vs-clip-distance-primitives.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:38 -06:00
Tim Rowley
d9de8f3122 swr/rast: Code style change (NFC)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:29 -06:00
Tim Rowley
08512c52de swr/rast: Widen fetch shader to SIMD16
Widen fetch shader to SIMD16, enable SIMD16 types in the jitter,
and provide utility EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:23 -06:00
Tim Rowley
e612231f20 swr/rast: Support flexible vertex layout for DS output
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:49:59 -06:00
Nicolai Hähnle
3f17d3c017 gallium/u_threaded: avoid syncing in threaded_context_flush
We could always do the flush asynchronously, but if we're going to wait
for a fence anyway and the driver thread is currently idle, the additional
communication overhead isn't worth it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:15 +01:00
Nicolai Hähnle
bc65dcab3b radeonsi: avoid syncing the driver thread in si_fence_finish
It is really only required when we need to flush for deferred fences.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:11 +01:00
Nicolai Hähnle
3db1ce01b1 radeonsi: recompute the relative timeout after waiting for ready fence
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:06 +01:00
Nicolai Hähnle
f5ea8d18ff ddebug: fix the hang detection timeout calculation
Fixes: c9fefa062b ("ddebug: rewrite to always use a threaded approach")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:03 +01:00
Nicolai Hähnle
16f8da2997 ddebug: fix use-after-free of streamout targets
Fixes: b47727a83a ("ddebug: implement pipelined hang detection mode")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:00 +01:00
Nicolai Hähnle
aaebf49eba gallium/u_threaded: properly initialize fence unflushed tokens
This got lost in a rebase but never hurt anything because we happened
to always sync in fence_finish anyway...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:15:56 +01:00
Nicolai Hähnle
81aabb20f3 util/u_queue: really use futex-based fences
The relevant define changed in the final revision of the simple mutex
patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:15:53 +01:00
Nicolai Hähnle
a6e8311723 util/u_queue: fix timeout handling in util_queue_fence_wait_timeout
Fixes: e3a8013de8 ("util/u_queue: add util_queue_fence_wait_timeout")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:15:49 +01:00
Nicolai Hähnle
764bd6ef96 st/mesa: use asynchronous flushes in st_finish
With threaded gallium, the driver may currently be running in another
thread. In that case, we will execute all remaining commands in that
thread instead of syncing, which should be better for cache locality.

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:15:07 +01:00
Nicolai Hähnle
2d8b82baaa st/mesa: implement st_server_wait_sync properly
Asynchronous flushes require a proper implementation of
st_server_wait_sync, because we could have the following with
threaded Gallium:

 Context 1 app     Context 1 driver         Context 2
 -------------     ----------------         ---------
 f = glFenceSync
 glFlush
 <-- app sync -->                           <-- app sync -->
                                            glWaitSync(f)
                                            .. draw calls ..
                   pipe_context::flush
                     for glFenceSync
                   pipe_context::flush
                     for glFlush

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:15:07 +01:00
Nicolai Hähnle
ce470af0b1 u_threaded_gallium: remove synchronization in fence_server_sync
The whole point of fence_server_sync is that it can be used to
avoid waiting in the application thread.

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:15:06 +01:00
Nicolai Hähnle
abeded1cac amd: build addrlib with C++11
It is required for LLVM anyway.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103658
Fixes: 7f33e94e43 ("amd/addrlib: update to latest version")
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 16:26:28 +01:00
Nicolai Hähnle
df5ebe0c26 radeonsi/gfx9: fix VM fault with fetched instance divisors
We need to account for SGPR locations in merged shaders.

This case is exercised by KHR-GL45.enhanced_layouts.vertex_attrib_locations

Fixes: 79c2e7388c ("radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 16:26:10 +01:00
Samuel Pitoiset
3a32858fc3 radv: use a 16 bytes array for the sampled/storage image descriptors
This allows to update them with only one memcpy().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-20 11:18:22 +01:00
Samuel Pitoiset
bc92ed04ac radv: do not add the query pool BO to the list in vkCmdEndQuery()
As per the spec, the query identified by queryPool and query
must currently be active. Applications have to call vkCmdBeginQuery()
before, and thus the query pool BO will already be in the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-20 11:18:20 +01:00
Samuel Pitoiset
cf54ea155e radv: only load needed depth clear regs for fast depth clears
Similar to how the driver sets the depth clear regs after a
fast depth clear. Most of the time, this will copy a 32-bit reg
instead of a 64-bit reg.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-20 10:45:27 +01:00
Samuel Pitoiset
e55b7609fa radv: do not add the image BO in radv_set_depth_clear_regs()
For the fast path, radv_fill_buffer() ensures that the BO is
already in the list. For the slow path, the depth surface is
part of the framebuffer which means the BO is added to the list
when the framebuffer is emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-20 10:45:23 +01:00
Samuel Pitoiset
3c6bba83f0 radv: remove useless assertion in emit_depthstencil_clear()
Already checked in emit_clear().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-20 10:45:21 +01:00
Samuel Pitoiset
403a3d8061 radv: remove useless check in radv_set_depth_clear_regs()
aspects can't be zero and there is an assertion that ensures
it's not in emit_clear().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-20 10:45:19 +01:00
Dave Airlie
59ca0c4b44 docs/features: mark some r600 extensions supported
These just looked to be missed when this file was updated.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-20 10:22:25 +10:00
George Barrett
f09c2cefdd glsl: Catch subscripted calls to undeclared subroutines
generate_array_index fails to check whether the target of a subroutine
call exists in the AST, potentially passing around null ir_rvalue
pointers eventuating in abort/segfault.

Fixes: fd01840c0b ("glsl: add AoA support to subroutines")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438
2017-11-20 11:04:04 +11:00
Eric Anholt
514db90448 broadcom/vc5: Fix up integer texture handling.
The original spec I had didn't expose integer textures and suggested that
you use unfiltered floats.  Now there are proper formats for them.

Fixes 16- and 32-bit texwrap integer tests in piglit, and
dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0.rgb10_a2ui.
2017-11-19 10:12:30 -08:00
Eric Anholt
65ae4527d9 broadcom/vc5: Fix simulator assertion failures about color RT clears.
When we tried to clear color while storing depth, it assertion failed
about basically not having enough information to decide which color RT to
clear.  It turns out the STORE_GENERAL picks the buffer according to the
color buffer being stored, or all of them if NONE.  If you're doing depth,
it doesn't know which to pick.
2017-11-19 10:12:30 -08:00
Rob Clark
ae44845aff freedreno/ir3: add texture gather support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-18 13:39:39 -05:00
Lucas Stach
f5d477f447 etnaviv: enable full overwrite when no color buffer is present
The OVERWRITE bit disables destination fetches, which is exactly what
we want when there is no valid color buffer bound.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-11-18 12:33:49 +01:00
Jason Ekstrand
1eab327ba7 i965: Stop including brw_cfg.h in brw_disasm_info.h
The brw_disasm_info header is included by certain tools in order to get
shader assembly from binaries so it's a semi-external header.  Including
brw_cfg.h also pulls in brw_shader.h so you end up getting quite a bit
of our back-end compiler internals.  Instead, make the couple of forward
declarations we need and make the header more stand-alone.  This fixes
the meson build.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 4f82b17287
2017-11-17 21:51:16 -08:00
Jason Ekstrand
0a6a137eb2 i965: Mark BOs as external when we export their handle
Almost all of our BO export paths were already properly marked the BO as
external and added it to the handle table.  Most export use-cases go
through a prime fd or flink where we have a brw_bo export helper that
does the right thing.  The one missing one happens when you call
queryImage and ask for __DRI_IMAGE_ATTRIB_HANDLE.  We just grabbed the
gem handle out of the BO (because it's really easy to do that) and
handed it off to the client; what could go wrong?  As it turns out, this
path is used by basically every compositor that wants to turn around and
call drmModeAddFB2 on it so it can hand it off to display.  The result,
as of 4b1e70cc57, is that we no longer set
MOCS_PTE on those surfaces and the kernel's attempts to disable caching
fail and we scanout gets corruption.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103759
Fixes: 4b1e70cc57
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-11-17 17:16:44 -08:00
Jason Ekstrand
344252a27f i965/bufmgr: Add a helper to mark a BO as external
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-11-17 17:16:44 -08:00
Andres Gomez
1866f7aee5 i965: Correct disasm_info usage in eu_validate test
Fixes: 4f82b17287 ("i965: Rewrite disassembly annotation code")

Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-18 03:07:06 +02:00
Eric Anholt
8d5994098f broadcom/vc5: Set up the padded height at surface creation time.
This centralizes the calculation in the surface, instead of in each
load/store.
2017-11-17 16:09:55 -08:00
Eric Anholt
87391e23cf broadcom/vc5: Ensure that there is always a TLB write.
This should fix some GPU hangs in our (currently always single-threaded)
fragment shaders, and definitely fixes assertion failures in simulation.
2017-11-17 16:09:55 -08:00
Eric Anholt
c40ac132e4 broadcom/vc5: Fix clear color for swap_color_rb render targets.
Fixes dEQP-GLES3.functional.depth_stencil_clear.depth.*
2017-11-17 16:09:55 -08:00
Eric Anholt
52f3e9e43c broadcom/vc5: Fix pasteo in front stencil ref value setup.
Fixes piglit masked-clear.
2017-11-17 16:09:55 -08:00
Eric Anholt
b63dd626b7 broadcom/vc5: Fix colormasking when we need to swap r/b colors.
Fixes part of piglit masked-clear.
2017-11-17 16:09:55 -08:00
Eric Anholt
2daf941a58 broadcom/vc5: Enable the Z min/max clipping planes. 2017-11-17 16:09:55 -08:00
Eric Anholt
c259bf686c broadcom/vc5: Fix driver for new PIPE_SHADER_CAP_MAX_HW_ATOMIC_*. 2017-11-17 16:09:54 -08:00
Brian Paul
2b5b6bebac r300: add PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTER* switch cases
To silence compiler warnings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-17 16:09:59 -07:00
Brian Paul
af322ed887 tgsi: s/uint/enum pipe_shader_type/
Roland Scheidegger <sroland@vmware.com>
2017-11-17 16:09:40 -07:00
Brian Paul
fdee3e1d82 tgsi: bump tgsi_opcode_info::output_mode size to 4 bits
To avoid problems with MSVC.  And verify size with ASSERT_BITFIELD_SIZE().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-17 16:09:39 -07:00
Kenneth Graunke
a01ba366e0 i965: Revert Gen8 aspect of VF PIPE_CONTROL workaround.
This apparently causes hangs on Broadwell, so let's back it out for now.
I think there are other PIPE_CONTROL workarounds that we're missing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103787
2017-11-17 14:28:22 -08:00
Adam Jackson
ddcd4b05a3 egl: Convert int to attrib in eglGetPlatformDisplay
... because converting attrib to int truncates, and that's bad.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-17 16:43:16 -05:00
Rob Clark
1831e3fb1d docs: update features for freedreno
Just comparing glxinfo and features.txt, and it seems features.txt is
fairly out of date.  The a5xx specific features (compute/images/atomics/
etc) are recent.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-17 15:19:38 -05:00
Matt Turner
821ec473a8 i965: Rename intel_asm_annotation -> brw_disasm_info
It was the only file named intel_* in the compiler.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-17 12:14:38 -08:00
Matt Turner
4f82b17287 i965: Rewrite disassembly annotation code
The old code used an array to store each "instruction group" (the new,
better name than the old overloaded "annotation"), and required a
memmove() to shift elements over in the array when we needed to split a
group so that we could add an error message. This was confusing and
difficult to get right, not the least of which was  because the array
has a tail sentinel not included in .ann_count.

Instead use a linked list, a data structure made for efficient
insertion.

Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-17 12:14:38 -08:00
Matt Turner
f80e97346b i965: Simplify annotation_insert_error()
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-17 12:14:38 -08:00
Matt Turner
f4276ef7ef i965: Move common code out of #ifdef
I'm going to change the call in a later patch and with the difference in
indentation level it wasn't immediately obvious that the calls were
identical.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-17 12:14:38 -08:00
Anuj Phogat
822fd2341d i965: Remove DWord length from MI_FLUSH_DW definition
Fixes: 6165fda59b ("i965: Program DWord Length in MI_FLUSH_DW")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-17 11:51:28 -08:00
Jason Ekstrand
a07f7b2619 anv/cmd_buffer: Take bo_offset into account in fast clear state addresses
Otherwise, if the image is not bound to the start of the buffer, we're
going to be reading and writing its fast clear state in the wrong spot.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-17 11:32:21 -08:00
Jason Ekstrand
a6cc361e5f anv/cmd_buffer: Advance the address when initializing clear colors
Found by inspection

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-17 11:32:21 -08:00
Boyuan Zhang
3b7fd35d01 radeon/video: enable encode support for raven
Enable h.264 encode for vcn hardware (raven)

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
549a41ed9d radeonsi: enable vcn encode
Enable vcn encode by creating radeon_encoder for vcn.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
fe50797d93 radeon/vcn: add create encoder
Add implementation for create_encoder interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
3c53fbbc87 radeon/vcn: add encode get feedback
Add implementation for get_feedback interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
bc9644460d radeon/vcn: add encode destroy
Add implementation for destroy interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
3f83c24366 radeon/vcn: add encode end frame
Add implementation for end_frame interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
47443bc9f0 radeon/vcn: add encode bitstream
Add implementation for encode_bitstream interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
f40fe728a1 radeon/vcn: add encode begin frame
Add implementation for begin_frame interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
c2448f20a3 radeon/vcn: add encode header implementations
Implement encoding of sps, pps, and silce headers using the newly added h.264
header coding descriptors functions based on h.264 specs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
d940fdf765 radeon/vcn: add encode header algorithms
Since bitstream headers, e.g. sps, pps, slice, are encoded in driver side, we
need to add corresponding algorithms that required to generate those headers.
According to h.264 specs, signed/unsigned interger Exp-Golomb-coded syntax
element with left bit first (code_se and code_ue) and unsigned integer using
n bits (code_fixed_bits) descriptors function are needed. Therefore, adding
those algorithms and related variables and output algorithms here.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
be996f2213 radeon/vcn: add ib implementations
Implement required ibs and command buffer submission interfaces for vcn encode

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
7f7ae47385 radeon/vcn: add common encode part
Add a skeleton pipe video interface and encode ib interface for video encode
on vcn hardware. Add function defines and structures for vcn encode. Update
Makefile.sources and meson.build with newly added files.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
58aa4dffb4 st/va: implement poc type
pic_order_cnt_type is a required variable when encoding both sps and
slice header, therefore we need to get this value from st, e.g. vaapi
interface, and then pass it to radeon driver for encoding headers.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
76e0dcd5a9 vl: add poc type
Different from vce encoding, vcn encoding requires driver side to encode
bitstream header, such as pps, sps and slice header. pic_order_cnt_type
is a required variable when encoding both sps and slice header, therefore
we need to add this new variable here, and hold the value passed from st,
e.g. vaapi interface

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
c445cdf649 winsys/amdgpu: add vcn enc cs support
New cs support is needed for vcn encode

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
436a3f8d6d radeon/common: add vcn enc ip info query
New ip info query is needed for vcn encode

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
f2021d92eb radeon/winsys: add vcn enc ring type
New ring type is needed for vcn encode

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Boyuan Zhang
d3d8914275 radeon/vcn: add vcn encode interface
Add a new header file for vcn encode interface

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Gert Wollny
b50eda8498 gallium/aux/util/u_surface.c: Silence warnings and remove unneeded MAYBE_UNUSED
* Explicitely convert values to int in comparison.
 * Remove one MAYBE_UNUSED that is actually not needed.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:58 -07:00
Gert Wollny
c7bf83ef5c gallium/aux/util/u_debug_image.c: Silence warnings -Wunused-param
Decorate the according parameters with UNUSED.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:58 -07:00
Gert Wollny
f23f2146cb gallium/aux/util/u_debug_flush.c: Silence warnings -Wunused-param
Decorate the unused parameters with UNUSED.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:58 -07:00
Gert Wollny
1dca234daf gallium/aux/util/u_debug.c: Silence warnings -Wunused-param
Silence warnings by decoration the parameters with UNUSED.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:57 -07:00
Gert Wollny
bec80e892b gallium/aux/util/u_format_table.py: Add UNUSED decoration to the generated function headers
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:57 -07:00
Gert Wollny
c7ebf95797 gallium/aux/util/u_async_debug.c: Fix -Wtype-limits warning.
Use size_t instread of unsigned for new_max. realloc later expects
size_t as parameter anyway.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:57 -07:00
Gert Wollny
537d04615d gallium/aux/os/os_thread.h: Silence -Wunused-param.
With --disable-debug a parameter is not used. Silence this
warning by fake-using it.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:57 -07:00
Gert Wollny
4bdad3d98c gallium/aux/util/u_debug_refcnt.h: Fix -Wunused-param warnings
Annotate the according parameters accordingly.

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
2049c635ee gallium/aux/util/u_blit.c: Fix -Wunused-param warnings
Annotate the parameters accordingly.

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
v1: Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
0b984188f9 src/util/simple_mtx.h: Fix two -Wunused-param warnings.
Decorate the parameters accordingly with "UNUSED" or "MAYBE_UNUSED" (for
the param that is used in debug mode, but not in release mode).

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
811eb70a57 mesa/main/texcompress_s3tc_tmp.h: Fix two -Wparam-unused warnings.
Decorate the params accordingly with "UNUSED".

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
28a02eb3b8 gallium/aux/util/u_transfer.c: Fix some -Wunused-param warnings.
Decorate the params with "UNUSED" accordingly.

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
82e2f1ea34 gallium/aux/util/u_threaded_context.c: Fix some -Wunused-param warnings.
Decorate the params accordingly with UNUSED or MAYBE_UNUSED (for params
that are used in debug mode).

v2: move *UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
a5da06d9b7 gallium/aux/util/u_surface.c: Silence a -Wsign-compare warning.
Explicitely convert one value to compare.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:57 -07:00
Gert Wollny
c9fef0fa9f gallium/aux/util/u_pstipple.c: Fix one -Wsign-compare warning in ?: construct.
Silence the warning by making the conversion to int explicit.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:57 -07:00
Gert Wollny
2644a80ccf gallium/aux/util/u_mm.c: Fix one -Wparam-unused warning.
Decorate the unused param accordingly with "UNUSED".

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
1c0d4baaf7 gallium/aux/util/u_format_yuv.c: Fix a number of -Wunused-param warnings.
Decorate the params accordingly with "UNUSED".

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
a837a3d10d gallium/aux/util/u_format_rgtc.c: Fix a number of -Wunused-param warnings
Decorate the params accordingly with "UNUSED".

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
a29181d3c1 gallium/aux/util/u_format_other.c: Fix various -Wunused-param warnings
Decorate the unused params with "UNUSED".

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
3b4bacc09f gallium/aux/util/u_format_latc.c: Fix various -Wunused-param warnings, (v2)
Decorate the unused params with "UNUSED".

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:57 -07:00
Gert Wollny
d744387b08 gallium/aux/util/u_format_etc.c: Fix eight -Wunused-param warnings (v2)
Decorate the parameters accordingly with "UNUSED".

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:56 -07:00
Gert Wollny
cf93a7fc9e gallium/aux/util/u_format.c: Fix one -Wunused-param warning
This warning was issued only in release mode. Fix it by fake-using the
parameter.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:56 -07:00
Gert Wollny
e979ab70c2 gallium/aux/util/u_dump_state.c: Fix two -Wunused-paramter warnings
v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:56 -07:00
Gert Wollny
20d3e943b1 gallium/aux/util/u_dump_defines.c: Fix -Wcompare-unsigned warning
u_bit_scan may return -1 that then may be interpreted as (unsigned)-1 in
the following comparison, since num_names is unsigned. Convert the latter to
be int as well.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-17 09:27:56 -07:00
Gert Wollny
373c263e2c gallium/aux/util/u_debug_stack.c: Silence -Wunused-result warning
asprintf is decorated with the attrbute "warn_unused_result", and if the
function call fails, the pointer "temp" will be undefined, but since it is
used later it should contain some usable value.
Test return value of asprintf and assign some save value to "temp" if
the call failed.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:56 -07:00
Gert Wollny
9b80c03870 gallium/aux/util/u_debug_describe.c: Silence an -Wunused-param warning
Annotate the unused parameter.

v2: move UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:56 -07:00
Gert Wollny
ca7d5170eb gallium/aux/util/u_blitter.c: Silence some warnings
* Annotate three parameters that are not used in release mode.
* explicitely convert an int to unsigned in an ?: construct.

v2: move MAYBE_UNUSED decoration in front of parameter declaration

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-17 09:27:56 -07:00
Rob Clark
c267750bb1 freedreno/a5xx: stencil texturing support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-17 11:19:34 -05:00
Rob Clark
a39d403202 freedreno/a5xx/gmem: fix z32/s8 restore/resolve
BLIT_ZS mode is used for either combined z24/s8 or z32 in which case
BLIT_S mode is used for separate stencil.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-17 11:19:34 -05:00
Rob Clark
010ebed72a freedreno/a5xx/gmem: move ZS restore tiling hack
Code motion to simplify next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-17 11:19:33 -05:00
Rob Clark
22605dce4b freedreno: update generated headers 2017-11-17 11:19:33 -05:00
Brian Paul
501591e852 svga: add missing PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTER* cases
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-11-16 20:35:17 -07:00
Brian Paul
92c1290dc5 glsl: s/unsigned/glsl_base_type/ in glsl type code (v2)
Declare glsl_type::sampled_type as glsl_base_type as we do for the
base_type field.  And make base_type a bitfield to save a few bytes.

Update glsl_type constructor to take glsl_base_type instead of unsigned
and pass GLSL_TYPE_VOID instead of zero.

No Piglit regressions with llvmpipe.

v2:
- Declare both base_type and sampled_type as 8-bit fields
- Use the new ASSERT_BITFIELD_SIZE() macro.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 20:35:17 -07:00
Brian Paul
940fba68c9 util/tgsi: use ASSERT_BITFIELD_SIZE() to check opcode field size
I've noticed at least two places where we store the TGSI opcode in
an unsigned:8 bitfield.  We're at 249 opcodes now.  If we hit 256 we'll
need to grow those bitfields.  Use the new ASSERT_BITFIELD_SIZE() macro
to detect that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 20:35:17 -07:00
Brian Paul
d4726b1318 st/mesa: use enum types instead of int/unsigned (v3)
Use the proper enum types for various variables.  Makes life in gdb
a little nicer.  Note that the size of enum bitfields must be one
larger so the high bit is always zero (for MSVC).

v2: also increase size of image_format bitfield, per Eric Engestrom.
v3: use the new ASSERT_BITFIELD_SIZE() macro

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 20:35:17 -07:00
Brian Paul
fe81e1f975 util: add new ASSERT_BITFIELD_SIZE() macro (v3)
For checking that bitfields are large enough to hold the largest
expected value.

v2: move into existing util/macros.h header where STATIC_ASSERT() lives.
v3: add MAYBE_UNUSED to variable declaration

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-16 20:35:17 -07:00
Dave Airlie
c8ce3c2689 st/mesa: don't move ssbo after atomic buffers if we support hw atomics
There is no need to have these overlap if we support hw atomics.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 13:15:38 +10:00
Kenneth Graunke
8f91aa35a5 i965: Upload invariant state once at the start of the batch on Gen4-5.
We want to emit invariant state at the start of a render batch.  In the
past, this more or less happened: a new batch flagged BRW_NEW_CONTEXT
(because we don't have hardware contexts), which triggered the
brw_invariant_state atom.  So, it would be emitted before any 3D
drawing.  (Technically, there might be some BLT commands in the batch
because Gen4-5 have a single combined render/BLT ring, but that should
be harmless).

With the advent of BLORP, this broke.  The first item in a batch might
be a BLORP operation, which bypasses the normal draw upload path.  So,
we need to ensure invariant state happens first.  To do that, we just
upload it when creating a new batch.  On Gen6+ we'd need to worry about
whether it's a RENDER or BLT batch, but because we have a combined ring,
this approach should work fine on Gen4-5.

Seems to fix GPU hangs when playing hardware accelerated video with
mpv -hwdec=vaapi on Ironlake.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103529
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-16 17:39:01 -08:00
Dave Airlie
59162c122f docs: update features/relnotes for r600 shader image support. (v2)
v2: update GLES

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:41 +10:00
Dave Airlie
8c52ece581 r600: enable ARB_shader_image_load_store, ARB_shader_image_size
This also enables GL4.2 for gpus with hw fp64 (cayman, cypress)

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:41 +10:00
Dave Airlie
31db4a3200 r600: handle image size support.
This adds support for the RESQ opcode with the workaround
required due to hw bugs for buffers and cube arrays.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:41 +10:00
Dave Airlie
f94f5c9a9f r600/sb: disable SB for images.
Until we can work further on sb, disable it for images for now.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:41 +10:00
Dave Airlie
aa38bf658f r600/shader: add support for load/store/atomic ops on images.
This adds support to the shader assembler for load/store/atomic
ops on images which are handled via the RAT operations.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:41 +10:00
Dave Airlie
a6b3792843 r600: add core pieces of image support.
This adds the atoms and gallium api implementations,
along with support for compress/decompress paths for
shader images.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:40 +10:00
Dave Airlie
5689bb0022 r600/shader: implement getting thread id.
We need the thread id to use the immediate buffer readback
mechanism, so add support for calculating it.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:40 +10:00
Dave Airlie
c119a1acb3 r600/shader: add flag to denote if shader uses images
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:40 +10:00
Dave Airlie
894e2deb7e r600: implement basic memory barrier.
This isn't 100% perfect (fglrx also fails a bunch of those tests)
but implement the start of a memory barrier for image support.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:40 +10:00
Dave Airlie
ac4f175d79 r600: allocate immed buffer resource for images.
In order to image readback we have to execute a MEM_RAT instruction
that needs a buffer to transfer the result into until the shader
can fetch it.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:40 +10:00
Dave Airlie
77d36cbc8d r600: handle writes_memory properly
This implements proper handling for shaders with side effects.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-17 11:31:40 +10:00
Dylan Baker
d8acf79f0c autotools: change version TINY -> PATCH
Because patch is more common than tiny for talking about the 3rd element
of a version.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
2017-11-16 16:16:45 -08:00
Dylan Baker
65fc16c974 autotools: set XA versions in configure.ac and configure header file
Currently the versions are set in the header, and then sed is used to
extract them, so that autotools can use them elsewhere.

This is odd. Autotools is perfectly capable of configuring the header
with the versions, and then they don't need to be extracted from the
the header. This is cleaner and more obvious.

Tested with make distcheck.

v2: - Split tiny -> patch change
    - Drop temporary variables
    - change XA_VERSION_* -> XA_*
v3: - Finish splitting the tiny -> patch change

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
2017-11-16 16:16:29 -08:00
Kenneth Graunke
f274687413 genxml: Fix PIPELINE_SELECT on G45/Ironlake.
Original 965 sets bits 28:27 to 0, while G45 and later set it to 1.

Note that the G45 docs are incorrect in this regard - see the DevCTG+
note in the Ironlake PRMs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-16 11:01:50 -08:00
Emil Velikov
9b02230466 egl: pass the dri2_dpy to the $plat_teardown functions
Cc: Mark Janes <mark.a.janes@intel.com>
Fixes: 40a01c9a0e ("egl/drm: move teardown code to the platform file")
Fixes: 8d745abc00 ("egl/wayland: move teardown code to the platform file")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Dylan Baker <dylan@pnwbakers.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103784
2017-11-16 18:46:01 +00:00
Rafael Antognolli
306914db92 meson: Add dridriverdir variable to dri.pc.
Xorg (and possibly other things) depend on this variable to find the
path to DRI drivers.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-16 10:40:26 -08:00
Dylan Baker
bc17ac5866 docs: add documentation for building with meson
v2: - Add information about CC, CXX, CFLAGS, and CXXFLAGS (Nicolai)
    - Add message at top that meson for mesa is still a work in progress
    - Add trailing "/" to directories (Eric E.)
    - Fix a number of spelling/grammar/style suggestions from Eric E.
    - Make a number of changes as suggested by Emil.
v3: - Fix order of commands in example (Eric E.)
    - Add documentation for overriding LLVM version (Eric E.)
v4: - Rebase on master
    - update default buildtype
    - add note about b_ndebug
    - Clarify meson configure a bit
v5: - use <code> for command line arguments (Eric E.)
    - Add note about listing options without a build directory
    - Minor formatting changes (Eric E.)
    - Replace the CC, CFLAGS, etc section with an environment variables
      section, which mentions CC, CXX, CFLAGS, CXXFLAGS, LDFLAGS, and
      DESTDIR
    - Add comment that not using buildtype debug might make debugging
      harder
    - Add comment that b_ndebug and buildtype are orthogonal

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)
2017-11-16 09:48:09 -08:00
Kai Wasserbäch
d25123e23a docs: Point to apt.llvm.org for development snapshot packages
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-16 16:21:09 +00:00
Eric Engestrom
ca95d7ad4e egl: fix var type
queryImage() takes an `int*`; compiler is warning about the
signed<->unsigned pointer mismatch.

Fixes: 0db36caa19 "egl/wayland: Add a fallback when fourcc
       query isn't supported"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Derek Foreman <derekf@osg.samsung.com>
2017-11-16 16:20:44 +00:00
Emil Velikov
9e74e2d13c i915: add missing extensions.h include
Otherwise we'll bail with due to -Werror=implicit-function-declaration.
It went unnoticed since the we had a bug which did consistently set the
compiler flag.

Fixes: ba8a347f93 ("mesa: split extensions overrides and glGetString(GL_EXTENSIONS)")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-16 16:11:25 +00:00
Emil Velikov
f3ea07959b mesa: return 'unrecognized' extensions in glGetStringi
Analogous to the glGetString() case - report all the
extensions enabled via MESA_EXTENSION_OVERRIDE

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:17:07 +00:00
Emil Velikov
310e4485cb mesa: rework the way we manage extra_extensions
Store pointers to the tokenized strings in the gl_extensions struct.

This way we can reuse them in glGetStringi() while we construct the
really long string only in _mesa_make_extension_string.

Only 16 pointers/strings are stored for now.

v2: Warn only once when we provide more than 16 unk. extensions, rebase

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-16 14:17:07 +00:00
Emil Velikov
693682bd01 mesa: pass the ctx to _mesa_one_time_init_extension_overrides
Will be needed with next commit

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:08:14 +00:00
Emil Velikov
9aa9b98e63 mesa: call atexit() only as needed
If the extra_extensions string is empty there's no need to call
atexit() - there's nothing to free.

v2: Rebase

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2017-11-16 14:08:03 +00:00
Emil Velikov
3d81e11b49 mesa: remove unnecessary 'sort by year' for the GL extensions
The sorting was originally added to work around broken games (comment
says Quake3 demo) that were copying the extensions list into small
buffer.

Sorting does not solve the problem, since we'll still overflow and cause
corruption/crash.

Better workaround is to actually trim the string ... as done with a
later commit which introduces the MESA_EXTENSION_MAX_YEAR env. variable.

Side note: On my machine, the existing sorting makes no changes to the
extensions string.

Cc: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:07:14 +00:00
Emil Velikov
a3f82876f4 mesa: reuse set_extension() for _mesa_extension_override_disables
We already use it for _mesa_extension_override_enables.
Improve consistency and use it for both extension lists.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:07:14 +00:00
Emil Velikov
444d9e4b08 mesa: drop unnecessary coping of extra_extensions
The function get_extension_override() returns a copy of a string,
only for it to be copied again ...

Drop the unneeded calloc/strdup/free dance.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:07:14 +00:00
Emil Velikov
e4cdce5058 mesa: remove duplicate 'disabled extensions' list
While parsing MESA_EXTENSION_OVERRIDE we keep track of the disabled
extensions, twice - in _mesa_extension_override_disables and
disabled_extensions.

Upon context creation, we use the former to modify the extensions list.
Yet, we still check the updated list against disabled_extensions.

Remove disabled_extensions, it's obsolete.

Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:07:14 +00:00
Emil Velikov
167e958a87 mesa: call _mesa_make_extension_string only as needed
As of previous commit we removed the extension overrides from this
function.

Thus we no longer need to call it during MakeCurrent, so we can
construct the extensions string when needed - _mesa_GetString.

This commit effectively reverts a879d14ecf ("mesa: initialize extension
string when context is first bound")

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:07:14 +00:00
Emil Velikov
ba8a347f93 mesa: split extensions overrides and glGetString(GL_EXTENSIONS)
Currently we apply the extension overrides and construct the extensions
string upon MakeCurrent.

They are two distinct things, so let's slit the two while pushing the
overrides management _before_ _mesa_compute_version(). This ensures that
the version is updated to reflect the enabled/disabled extensions.

Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:07:14 +00:00
Emil Velikov
afd6a964a4 i965: remove ARB_compute_shader extension override
Checking the override was useful in the early stages of developing the
extension.

Now that everything is wired, where possible, we can drop the check.
Doing so allows us to simplify some of the related code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-11-16 14:06:57 +00:00
Emil Velikov
f8812931cf i965: use _mesa_is_desktop_gl helper
Use the helper over opencoding the check.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 14:06:56 +00:00
Emil Velikov
6614804d1e egl: add note about missing $plat_teardown
Some platforms are missing a proper teardown function. Add a small TODO
to make it obvious.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-16 14:03:11 +00:00
Emil Velikov
8d745abc00 egl/wayland: move teardown code to the platform file
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-16 14:03:10 +00:00
Emil Velikov
40a01c9a0e egl/drm: move teardown code to the platform file
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-16 14:03:08 +00:00
Emil Velikov
938fcab08b egl/x11: move teardown code to the platform file
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-16 14:03:06 +00:00
Emil Velikov
55245fe1c9 egl: Provide meaningfull error when built w/o requested platform
The current "No EGL platform enabled." is misleading and wrong.
We reach said code when $platform is missing.

To make this more obvious and clear provide wrappers in the header
file, making the code a bit easier to follow.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-16 14:03:03 +00:00
Jon Turney
8ceccbf80d meson: Don't define HAVE_PTHREAD only on linux
I'm not sure of the reason for this. I don't see anything like this in
configure.ac

In include/c11/threads.h the cases are:

1) building for Windows -> threads_win32.h
2) HAVE_PTHREAD -> threads_posix.h
3) Not supported on this platform

So not defining HAVE_PTHREAD for anything not Windows just means we can't
build at all.

When we are building for Windows, I'm not sure if dependency('threads')
would ever find anything, or defining HAVE_PTHREAD has any effect, but avoid
defining it there, just in case.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-16 13:51:25 +00:00
Rob Clark
ff018a3f55 freedreno: also mark images used by draw/grid
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-16 08:44:19 -05:00
Rob Clark
92e75bf0ec freedreno: mark SSBOs written at draw time
Comment was right, implementation was wrong ;-)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-16 08:44:19 -05:00
Rob Clark
2878af74dd freedreno/a5xx: ARB_framebuffer_no_attachments support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-16 08:44:19 -05:00
Kenneth Graunke
8d48671492 i965: Implement another VF cache invalidate workaround on Gen8+.
...and provide a better citation for the existing one.

v2:
- Apply the workaround to Gen8 too, as intended (caught by Topi).
- Restructure to add bits instead of an extra flush (based on a similar
  patch by Rafael Antognolli).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-11-16 01:43:26 -08:00
Nicolai Hähnle
f3fa3b0d95 tgsi/exec: fix LDEXP in softpipe
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103128
Fixes: cad959d901 ("gallium: add LDEXP TGSI instruction and corresponding cap")
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-16 06:47:05 +01:00
Nicolai Hähnle
2e3d0dd6c8 threads,configure.ac,meson.build: define and use HAVE_TIMESPEC_GET
Tested with Travis and Appveyor.

v2: add HAVE_TIMESPEC_GET for non-Windows Scons builds
v3: use check_functions in Scons (Eric)

Cc: Rob Herring <robh@kernel.org>
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103674
Fixes: f1a3648784 ("threads: update for late C11 changes")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v2)
2017-11-16 06:45:35 +01:00
Timothy Arceri
a8bdf0e0c4 radeonsi: copy some nir gs info
v2: copy input primitive

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 10:54:03 +11:00
Timothy Arceri
b73ce64fb8 ac: add gs_{prim,invocation}_id to the abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 10:54:03 +11:00
Timothy Arceri
8ae92a9209 radeonsi: gather stream info in nir path
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 10:51:35 +11:00
Vinson Lee
cd58b98b03 mapi: Use correct shared libraries suffix on macOS.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-15 15:26:46 -08:00
Brian Paul
ae7b4fdb32 tgsi: whitespace clean-ups in tgsi_util.[ch]
Trivial.
2017-11-15 16:12:44 -07:00
Brian Paul
cbfc92c04b svga: s/unsigned/enum tgsi_texture_type/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-15 16:12:43 -07:00
Brian Paul
dc37cb0f86 tgsi: s/unsigned/enum tgsi_texture_type/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-15 16:12:43 -07:00
Frank Richter
bf41b2b262 gallium/wgl: fix default pixel format issue
When creating a context without SetPixelFormat() don't blindly take the
pixel format reported by GDI. Instead, look for our own closest pixel
format.

Minor clean-ups added by Brian Paul.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103412
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-11-15 16:12:43 -07:00
Brian Paul
824e8084ed svga: issue debug warning for unsupported two-sided stencil state
We only have a single stencil read mask and write mask.  Issue a
warning if different front/back values are used.  The Piglit
gl-2.0-two-sided-stencil test hits this.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-15 16:12:43 -07:00
Brian Paul
cacb88490a st/mesa: whitespace fixes in st_manager.c
Trivial.
2017-11-15 16:12:43 -07:00
Brian Paul
8150690cac st/mesa: whitespace clean-ups in st_context.c
Trivial.
2017-11-15 16:12:43 -07:00
Brian Paul
0605a6cc89 st/mesa: move st_manager_destroy() earlier in file
To avoid forward declaration.

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2017-11-15 16:12:43 -07:00
Brian Paul
3a74eb3a9b st/mesa: move st_init_driver_flags() earlier in file
To get rid of forward declaration.

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2017-11-15 16:12:43 -07:00
Brian Paul
955cbdf120 docs: update llvmpipe.html build instructions 2017-11-15 16:12:42 -07:00
Wladimir J. van der Laan
d61a914394 etnaviv: Add sampler TS support
Sampler TS is an hardware optimization that can be used when rendering
to textures. After rendering to a resource with TS enabled, the
texture unit can use this to bypass lookups to empty tiles. This also
means a resolve-in-place can be avoided to flush the TS.

This commit is also an optimization when not using sampler TS, as
resolve-in-place will now be skipped if a resource has no (valid) TS.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-15 23:27:54 +01:00
Wladimir J. van der Laan
59d76e7ab6 etnaviv: Flush TS cache before changing TS configuration
This is to make sure that the TS is properly flushed to memory before
rendering to a new surface starts.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-15 23:27:39 +01:00
Wladimir J. van der Laan
0d6d9b520b etnaviv: Add TS_SAMPLER formats to etnaviv_format
Sampler TS introduces yet another format enumeration for
renderable+textureable formats. Introduce it into the etnaviv_format
table as another column.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-15 23:27:26 +01:00
Wladimir J. van der Laan
ade528edd1 etnaviv: Check that resource has a valid TS in etna_resource_needs_flush
Resources only need a resolve-to-itself if their TS is valid for any
level, not just if it happens to be allocated.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-15 23:27:09 +01:00
Wladimir J. van der Laan
b24cb40188 etnaviv: rnndb update
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-15 23:26:53 +01:00
Dave Airlie
00bf875d55 radv: it isn't an error to not support a format or driver
This reverts two of the vk_error changes:

reporting unsupported format is common,
and testing non-amdgpu drivers and ignoring them is also common.

Fixes: cd64a4f70 (radv: use vk_error() everywhere an error is returned)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-16 06:12:42 +10:00
Kenneth Graunke
5da2b26dcb i965: Drop some reserved space remnants.
BATCH_RESERVED was deleted in commit 2c46a67b41 (i965: Delete
BATCH_RESERVED handling.)  The reserved_space field is dead code, and
the comments aren't useful these days.
2017-11-15 09:37:32 -08:00
Kenneth Graunke
e48cc01be9 intel: Drop mtypes.h include from brw_compiler.h.
This isn't necessary and causes trouble for a project I'm working on.
2017-11-15 09:37:32 -08:00
Kenneth Graunke
0704702972 i965: Fold ABO state upload code into the SSBO/UBO state upload code.
Having this separate could potentially make programs that rebind atomics
but no other surfaces ever so slightly faster.  But it's a tiny amount
of code to add to the existing UBO/SSBO atom, and very related.

The extra atoms have a cost on every draw call, and so dropping some of
them would be nice.  This also reclaims a dirty bit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-15 09:37:32 -08:00
Kenneth Graunke
ff964916dc i965: Use nir_lower_atomics_to_ssbos and delete ABO compiler code.
We use the same hardware mechanism for both atomic counters and SSBO
atomics, so there's really no benefit to maintaining separate code to
handle each case.  Instead, we can just use Rob's shiny new NIR pass to
convert atomic_uints to SSBOs, and delete piles of code.

The ssbo_start section of the binding table becomes a combined ABO and
SSBO section, with ABOs first, then SSBOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-15 09:37:32 -08:00
Kenneth Graunke
f48f52b030 i965: Make a better helper function for UBO/SSBO/ABO surface handling.
This fixes the missing AutomaticSize handling in the ABO code, removes
a bunch of duplicated code, and drops an extra layer of wrapping around
brw_emit_buffer_surface_state().

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-15 09:37:32 -08:00
Samuel Pitoiset
059d25a06d radv: add the vertex buffers BO to the list at bind time
This should reduce the overhead of adding a BO to the current
list, especially when the list is huge. Also, when a new pipeline
is bound, we only need to update the descriptor, the buffer objects
should already be in the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:01:07 +01:00
Samuel Pitoiset
c665879455 radv: replace vb_dirty with RADV_CMD_DIRTY_VERTEX_BUFFER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:01:05 +01:00
Samuel Pitoiset
8fd213277f radv: drop radv_cmd_dirty_mask_t typedef
I don't think we will need a 64-bit unsigned integer for the
dirty flags in the future, and there is still 20 bits left.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:01:01 +01:00
Samuel Pitoiset
f697365058 radv: use an unsigned 32-bit integer for radv_queue::family_index
VkDeviceQueueCreateInfo::queueFamilyIndex is an unsigned 32-bit
integer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:00:59 +01:00
Samuel Pitoiset
f9e1ff2464 radv: do not add the image BO in radv_set_dcc_need_cmask_elim_pred()
radv_fill_buffer() ensures that the image BO is added to the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:00:57 +01:00
Samuel Pitoiset
40290c805f radv: do not add the image BO in radv_set_color_clear_regs()
radv_fill_buffer() ensures that the image BO is added to the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:00:54 +01:00
Roland Scheidegger
65123ee62c r600: set the number type correctly for float rts in cb setup
Float rts were always set as unorm instead of float.
Not sure of the consequences, but at least it looks like the blend clamp
would have been enabled, which is against the rules (only eg really bothered
to even attempt to specify this correctly, r600 always used clamp anyway).
Albeit r600 (not r700) setup still looks bugged to me due to never setting
BLEND_FLOAT32 which must be set according to docs...
Not sure if the hw really cares, no piglit change (on eg/juniper).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-15 03:13:46 +01:00
Roland Scheidegger
570d5b7992 r600: use ieee version of rsq
Both r600 and evergreen used the clamped version, whereas cayman used the
ieee one. I don't think there's a valid reason for this discrepancy, so let's
switch to the ieee version for r600 and evergreen too, since we generally
want to stick to ieee arithmetic.
With this, behavior for both rcp and rsq should now be the same for all of
r600, eg, cm, all using ieee versions (albeit note rsq retains the abs
behavior for everybody, which may not be a good idea ultimately).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-15 03:13:46 +01:00
Roland Scheidegger
1c8d57a008 r600: use ieee version of rcp
r600 used the clamped version for rcp, whereas both evergreen and cayman
used the ieee version. I don't know why that discrepancy exists (it does so
since day 1) but there does not seem to be a valid reason for this, so make
it consistent. This seems now safer than before the previous commit (using
the dx10 clamp bit).
Note that rsq still uses clamped version (as before even though the table
may have suggested otherwise for evergreen) for r600/eg, but not for cayman.
Will be changed separately for better regression tracking...

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-15 03:13:46 +01:00
Roland Scheidegger
3835009796 r600: use DX10_CLAMP bit in shader setup
The docs are not very concise in what this really does, however both
Alex Deucher and Nicolai Hähnle suggested this only really affects instructions
using the CLAMP output modifier, and I've confirmed that with the newly
changed piglit isinf_and_isnan test.
So, with this bit set, if an instruction has the CLAMP modifier bit (which
clamps to [0,1]) set, then NaNs will be converted to zero, otherwise the result
will be NaN.
D3D10 would require this, glsl doesn't have modifiers (with mesa
clamp(x,0,1) would get converted to such a modifier) coupled with a
whatever-floats-your-boat specified NaN behavior, but the clamp behavior
should probably always be used (this also matches what a decomposition into
min(1.0, max(x, 0.0)) would do, if min/max also adhere to the ieee spec of
picking the non-nan result).
Some apps may in fact rely on this, as this prevents misrenderings in
This War of Mine since using ieee muls
(ce7a045fee), without having to use clamped
rcp opcode, which would also fix this bug there.
radeonsi also seems to set this bit nowadays if I see that righ (albeit the
llvm amdgpu code comment now says "Make clamp modifier on NaN input returns 0"
instead of "Do not clamp NAN to 0" since it was changed, which also looks
a bit misleading).

v2: set it in all shader stages.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103544

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-15 03:13:46 +01:00
Roland Scheidegger
aab0bfc648 r600: use min_dx10/max_dx10 instead of min/max
I believe this is the safe thing to do, especially ever since the driver
actually generates NaNs for muls too.
The ISA docs are not very helpful here, however the dx10 versions will pick
a non-nan result over a NaN one (this is also the ieee754 behavior), whereas
the non-dx10 ones will pick the NaN (verified by newly changed piglit
isinf-and-isnan test).
Other "modern" drivers will most likely do the same.
This was shown to make some difference for bug 103544, albeit it is not
required to fix it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-15 03:13:46 +01:00
Dave Airlie
3ceee04a4f r600: fix cubemap arrays
A lot of cubemap array piglits fail, port the texture type
picking code from radeonsi which seems to fix most of them.

For images I will port the rest of the code.

Fixes:
getteximage-depth gl_texture_cube_map_array-*
fbo-generatemipmap-cubemap array
getteximage-targets cube_array
amongst others.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-15 11:26:11 +10:00
Rob Clark
7676e71113 freedreno/a5xx: small comment fix
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-14 18:12:47 -05:00
Rob Clark
d27318bdd0 freedreno/a5xx: indirect draw support
A couple failures in piglit tests w/ TF or gl_VertexID + indirect draws.
OTOH all the deqp tests (although they don't test those combinations).
I suspect this could be fixed by a firmware update, but I don't think
there is much we can do in mesa for that.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-14 18:10:58 -05:00
Rob Clark
f383cf9d41 freedreno/a5xx: split out helper for pipeline stalls
We need a similar thing for indirect draws.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-14 18:10:51 -05:00
Rob Clark
d74029bddc freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-14 18:10:43 -05:00
Timothy Arceri
5041ea96a0 gallium/radeon: disable the cache when nir backend enabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-15 08:47:31 +11:00
Timothy Arceri
7273e9820e st/glsl_to_tgsi: use tgsi_get_gl_varying_semantic() for gs/tes outputs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-15 08:26:34 +11:00
Timothy Arceri
bc308122cc gallium/tgsi: add tess output supoort to tgsi_get_gl_varying_semantic()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-15 08:26:34 +11:00
Timothy Arceri
4ae9f0b580 st/glsl_to_tgsi: make use of tgsi_get_gl_varying_semantic()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-15 08:26:34 +11:00
Timothy Arceri
3d21eb3b7d gallium/tgsi: add prim id to tgsi_get_gl_varying_semantic()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-15 08:26:34 +11:00
Anuj Phogat
fc59546e9a i965: Make use of brw_load_register_imm32() helper function
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Nanley Chery <nanley.g.chery@intel.com>
2017-11-14 13:23:18 -08:00
Anuj Phogat
1dc45d75bb i965/gen8+: Fix the number of dwords programmed in MI_FLUSH_DW
Number of dwords in MI_FLUSH_DW changed from 4 to 5 in gen8+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
2017-11-14 13:23:18 -08:00
Anuj Phogat
6165fda59b i965: Program DWord Length in MI_FLUSH_DW
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
2017-11-14 13:23:18 -08:00
Anuj Phogat
5d8164c428 anv/gen10: Enable float blend optimization
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-11-14 13:23:18 -08:00
Anuj Phogat
72a239266b intel/genxml: Add Cache Mode SubSlice Register to gen10.xml
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-11-14 13:23:18 -08:00
Anuj Phogat
aacf1943c0 anv/gen10: Implement WaSampleOffsetIZ workaround
We already have this workaround in OpenGL driver.
See Mesa commit 3cf4fe2219.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Nanley Chery <nanley.g.chery@intel.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
2017-11-14 13:23:18 -08:00
Andres Rodriguez
20e8dfcca9 mesa/st: add missing copyright headers to memoryobjects files
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-14 11:32:44 -08:00
Andres Rodriguez
60baf1a962 mesa: minor tidy up for memory object error strings
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-14 11:31:49 -08:00
Andres Rodriguez
f7580e7204 broadcom/vc4: fix indentation in vc4_screen.c
Stumbled into this when adding a new PIPE_CAP.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-14 11:31:36 -08:00
Matt Turner
a31d038208 Revert "intel/fs: Use a pure vertical stride for large register strides"
This reverts commit e8c9e65185.

With the actual bug fixed (by commit 6ac2d16901), this is not
necessary. I'm doubtful of its correctness in any case.
2017-11-14 11:24:08 -08:00
Matt Turner
6ac2d16901 i965/fs: Fix extract_i8/u8 to a 64-bit destination
The MOV instruction can extract bytes to words/double words, and
words/double words to quadwords, but not byte to quadwords.

For unsigned byte to quadword, we can read them as words and AND off the
high byte and extract to quadword in one instruction. For signed bytes,
we need to first sign extend to word and the sign extend that word to a
quadword.

Fixes the following test on CHV, BXT, and GLK:
   KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103628
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-14 10:56:18 -08:00
Matt Turner
cfcfa0b9cd i965/fs: Split all 32->64-bit MOVs on CHV, BXT, GLK
Fixes the following tests on CHV, BXT, and GLK:
    KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot
    dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint32_to_int64
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103115
2017-11-14 10:56:18 -08:00
Tim Rowley
d8489517a5 swr/rast: Faster emulated simd16 permute
Speed up simd16 frontend (default) on avx/avx2 platforms;
fixes performance regression caused by switch to simdlib.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-14 11:40:19 -06:00
Tim Rowley
439904847e swr/rast: Use gather instruction for i32gather_ps on simd16/avx512
Speed up avx512 platforms; fixes performance regression caused
by swithc to simdlib.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-14 11:39:02 -06:00
Derek Foreman
0db36caa19 egl/wayland: Add a fallback when fourcc query isn't supported
When queryImage doesn't support __DRI_IMAGE_ATTRIB_FOURCC wayland clients
will die with a NULL derefence in wl_proxy_add_listener.

Attempt to provide a simple fallback to keep ancient systems working.

Fixes: 6595c69951 ("egl/wayland: Remove more surface specifics from
create_wl_buffer")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103519
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-14 15:38:43 +00:00
Marek Olšák
89e669d2fd radeonsi: remove has_cp_dma, has_streamout flags (v2)
v2: remove r600_can_dma_copy_buffer
2017-11-14 15:24:50 +01:00
Julien Isorce
b904ad7d21 i965: implement (un)mapImage
Already implemented for Gallium drivers.

Useful for gbm_bo_(un)map.

Tests:
  By porting wayland/weston/clients/simple-dmabuf-drm.c to GBM.
  kmscube --mode=rgba
  kmscube --mode=nv12-1img
  kmscube --mode=nv12-2img
  piglit ext_image_dma_buf_import-refcount -auto
  piglit ext_image_dma_buf_import-transcode-nv12-as-r8-gr88 -auto
  piglit ext_image_dma_buf_import-sample_rgb -fmt=XR24 -alpha-one -auto
  piglit ext_image_dma_buf_import-sample_rgb -fmt=AR24 -auto
  piglit ext_image_dma_buf_import-sample_yuv -fmt=NV12 -auto
  piglit ext_image_dma_buf_import-sample_yuv -fmt=YU12 -auto
  piglit ext_image_dma_buf_import-sample_yuv -fmt=YV12 -auto

v2: add early return if (flag & MAP_INTERNAL_MASK)
v3: take input rect into account and test with kmscube and piglit.
v4: handle wraparound and bo reference.
v5: indent, exclude 0 width and height on the boundary, map bo
    independently of the image.

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-14 14:23:13 +00:00
Samuel Pitoiset
8a7d4092d2 radv: force enable LLVM sisched for The Talos Principle
It seems safe and it improves performance by +4% (73->76).

A drirc based solution is not what we want for now, keep it
simple and improve later if it's really needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-14 15:21:50 +01:00
Samuel Pitoiset
ecabe2280c radv: add nosisched debug option
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-14 15:21:48 +01:00
Alejandro Piñeiro
b498172d0e spirv: fix typo on DO NOT EDIT header
Introduced on commit 157c9a1341

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-14 13:07:36 +01:00
Jon Turney
7df9a3609a meson: if dep_dl is an empty list, it's not a dependency object
It's ok to use an empty list for dependencies:, but it's not ok to try to
use the found() method of it.

See also https://github.com/mesonbuild/meson/issues/2324

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-14 12:00:25 +00:00
Bas Nieuwenhuizen
7c25578863 radv: Free temporary syncobj after waiting on it.
Otherwise we leak it.

Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-11-14 10:03:02 +01:00
Bas Nieuwenhuizen
917d3b43f2 radv: Free syncobj with multiple imports.
Otherwise we can leak the old syncobj.

Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-11-14 10:03:02 +01:00
Jason Ekstrand
fb0e9b5197 i965: Track the depth and render caches separately
Previously, we just had one hash set for tracking depth and render
caches called brw_context::render_cache.  This is less than ideal
because the depth and render caches are separate and we can't track
moves between the depth and the render caches.  This limitation led
to some unnecessary flushing around the depth cache.  There are cases
(mostly with BLORP) where we can end up touching a depth or stencil
buffer through the render cache.  To guard against this, blorp would
unconditionally do a render_cache_set_check_flush on it's destination
which meant that if you did any rendering (including a BLORP operation)
to a given surface and then used it as a blorp destination, you would
end up flushing it out of the render cache before rendering into it.

Things get worse when you dig into the depth/stencil state code for
regular GL draw calls.  Because we may end up rendering to a depth
or stencil buffer via BLORP, we did a render_cache_set_check_flush on
all depth and stencil buffers in brw_emit_depthbuffer to ensure that
they got flushed out of the render cache prior to using them for depth
or stencil testing.  However, because we also need to track dirtiness
for depth and stencil so that we can implement depth and stencil
texturing correctly, we were adding all depth and stencil buffers to the
render cache set in brw_postdraw_set_buffers_need_resolve.  This meant
that, if anything caused 3DSTATE_DEPTH_BUFFER to get re-emitted
(currently _NEW_BUFFERS, BRW_NEW_BATCH, and BRW_NEW_BLORP), we would
almost always do a full pipeline stall and render/depth cache flush.

The root cause of both of these problems is that we can't tell the
difference between the render and depth caches in our tracking.  This
commit splits our cache tracking into two sets, one for render and one
for depth, and properly handles transitioning between the two.  We still
flush all the caches whenever anything needs to be flushed.  The idea is
that if we're going to take the hit of a flush and stall, we may as well
flush everything in the hopes that we can avoid a flush by something
else later.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 21:51:59 -08:00
Jason Ekstrand
d6d0ac95d5 i965/blorp: Add more destination flushing
Right now we just always flush the destination for render and aren't
particularly careful about depth or stencil.  Soon, flush_for_render
isn't going to do the same thing as flush_for_depth and we may be doing
a good deal less depth flushing so we should be a bit more precise.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 21:51:59 -08:00
Jason Ekstrand
4a09070295 i965: Add more precise cache tracking helpers
In theory, this will let us track the depth and render caches
separately.  Right now, they're just wrappers around
brw_render_cache_set_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 21:51:59 -08:00
Jason Ekstrand
6830ba0d3b i965: Add stencil buffers to cache set regardless of stencil texturing
We may access them as a texture using blorp regardless of whether or not
stencil texturing is enabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-11-13 21:51:59 -08:00
Jason Ekstrand
4b1e70cc57 i965: Switch over to fully external-or-not MOCS scheme
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 21:35:52 -08:00
Jason Ekstrand
d7a19d69eb i965: Use PTE MOCS for all external buffers
We were already using PTE for all render targets in case one happened to
get scanned out.  However, this still wasn't 100% correct because there
are still possibly cases where we may want to texture from an external
buffer even though we don't know the caching mode.  This can happen, for
instance, on buffers imported from another GPU via prime.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101691
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 21:35:44 -08:00
Jason Ekstrand
bc933d0e84 intel/blorp: Make the MOCS setting part of blorp_address
This makes our MOCS settings significantly more flexible.

Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 19:40:10 -08:00
Jason Ekstrand
deec84fd77 anv/blorp: Add a device parameter to blorp_surf_for_anv_image
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 19:40:09 -08:00
Jason Ekstrand
4639cc716e intel/blorp: Use mocs.tex for depth stencil
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 19:39:57 -08:00
Kenneth Graunke
866158b4b6 intel/tools/error: Decode compute shaders.
This is a bit more annoying than your average shader - we need to look
at MEDIA_INTERFACE_DESCRIPTOR_LOAD in the batch buffer, then hop over
to the dynamic state buffer to read the INTERFACE_DESCRIPTOR_DATA, then
hop over to the instruction buffer to decode the program.

Now that we store all the buffers before decoding, we can actually do
this fairly easily.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-13 17:11:02 -08:00
Kenneth Graunke
7049c38655 intel/tools/error: Use do-while for field iterator loops.
while loops skip the first field of the instruction/structure, which
is not what the code intended.  It works out because the field we're
looking for doesn't happen to be first, but we ought to do it right
regardless.

Found while writing the next patch, where Kernel Start Pointer is
the first field of INTERFACE_DESCRIPTOR_DATA.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-13 17:11:02 -08:00
Kenneth Graunke
8b749ee0ea intel/tools/error: Decode shaders while decoding batch commands.
This makes aubinator_error_decode's shader dumping work like aubinator.
Instead of printing them after the fact, it prints them right inside the
3DSTATE_VS/HS/DS/GS/PS packet that references them.  This saves you the
effort of cross-referencing things and jumping back and forth.

It also reduces a bunch of book-keeping, and eliminates the limitation
that we could only handle 4096 programs.  That code was also broken and
failed to print any shaders if there were under 4096 programs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-13 17:11:02 -08:00
Kenneth Graunke
4979bf2728 intel/tools/error: Save error state sections and decode them later.
This lets us complete parsing and storing of each buffer's data before
we begin decoding the batchbuffer.  This makes it possible to inspect
the state buffer and program buffer, so we can properly decode any
indirect state or shader programs.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 17:11:02 -08:00
Kenneth Graunke
eb8ad56ed2 intel/tools/error: Fix null termination of ring name string.
Ported from intel_error_decode.  We don't want to run off the end.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 17:11:01 -08:00
Kenneth Graunke
ac17b38e79 intel/tools/error: Drop unused MAX_RINGS #define.
Dead code.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 17:11:01 -08:00
Kenneth Graunke
596e860317 intel/tools/error: Refactor buffer matching, add more buffers.
Based on a similar patch to intel_error_decode by Chris Wilson.

While we're de-duplicating the gtt_offset calculation, we can simplify
it to assume two hex digits are there - the kernel has done this since
v4.6, and we already require error states from v4.10.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 17:10:51 -08:00
Kenneth Graunke
4bb119f00b intel/tools/error: Only decode a few sections of error states.
These three are the only we can reasonably decode with genxml.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 17:10:38 -08:00
Kenneth Graunke
00981e7c47 intel/tools/error: Drop unused parameters from decode() helper.
Also change count from a pointer into a value.  We were supposed to
be resetting it to 0 (and failed to), but that's gone since we dropped
the pre-ascii85 handling.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 17:10:38 -08:00
Kenneth Graunke
1898bf11a8 intel/tools/error: Drop support for non-ascii85 encoded error states.
Error state files used to look like:

   render ring --- gtt_offset = 0x0e8f6000
   00000000 :  69040000
   00000004 :  79090000
   ...
   00007ffc :  00000000
   --- ringbuffer = 0x00001000

There were thousands of lines between sections.  The file format changed
with Kernel 4.10, and now has a single ascii85-encoded line following
each section heading.  This is much easier to parse.

There are a bunch of bugs in our handling of the old style format,
where we'd decode the wrong data, at the wrong time.  Fixing all of
these is going to be a giant pain.  It's also a lot of extra code
complexity.  In order to properly decode indirect state, or compute
shaders, we'll also need to parse data in advance of decoding, which
is going to be a giant pain with this ad-hoc "decode everywhere!"
mentality.  So, let's just drop support for the older file format.

This unfortunately requires an error state generated by Kernel 4.10 or
later.  That's probably not the end of the world, as we encourage users
to upgrade to the latest kernel when encountering GPU hangs anyway.  It
might be a giant pain for people with LTS kernels, though...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 17:10:38 -08:00
Kenneth Graunke
53586f88d7 intel/tools/error: Do ascii85 decode first.
The dashes "---" may occur within an ascii85 block, but only an ascii85
block starts with ':' or '~'.

Ported from Chris Wilson's intel-gpu-tools commit:
bceec7e1d8a160226b783c6344eae8cbf4ece144

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 17:10:38 -08:00
Alexander von Gluck IV
8f9d9ddcae c11/haiku: Define missing timespec_get on Haiku
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-13 16:45:14 -06:00
Alexander von Gluck IV
f09e001a05 egl/haiku: Correct invalid void* conversion in calloc
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-13 16:45:10 -06:00
Dylan Baker
46a7fdd7ca meson: Remove build_by_default from amd code
This is the same logic as the previous two patches.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-13 13:43:20 -08:00
Dylan Baker
49fa074726 meson: Don't build intel shared components by default
It's a neat idea, and still useful in some cases, but the intel common
code is used by i965 and anvil only, this is a little clearer.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-13 13:43:15 -08:00
Dylan Baker
2bfd34c518 meson: don't use build_by_default for specific gallium drivers
Using build_by_default : false is convenient for dependencies that can
be pulled in by various diverse components of the build system, the
gallium hardware/software drivers and state trackers do not fit that
description. Instead, these should be guarded using the variable that tracks
whether that driver should be enabled.

This leaves a few helper libraries: trace, rbug, etc, and the generic
winsys bits as `build_by_default : false` because there are a large
number of gallium components that pull them in.

v2: - remove build_by_default from winsys convenience libs as well.
v3: - Always put drivers before winsys for consistency

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-13 13:43:12 -08:00
Dave Airlie
63b6eb9cb9 r600/shader: handle bitfield extract semantics properly.
Fixes:
tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldExtract.shader_test

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-14 06:16:06 +10:00
Dave Airlie
0442809205 r600: handle bitfieldInsert corner case.
This handles the bits >= 32 corner case in bitfieldInsert.

Fixes:
tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldInsert.shader_test.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-14 06:16:06 +10:00
Dave Airlie
53d5dda6f8 r600: add gs tri strip adjacency fix.
Like
radeonsi: generate GS prolog to (partially) fix triangle strip adjacency rotation

evergreen hw suffers from the same problem, so rotate the
geometry inputs to fix this.

This fixes:
./bin/glsl-1.50-geometry-primitive-types GL_TRIANGLE_STRIP_ADJACENCY
on evergreen.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-14 06:16:06 +10:00
Dave Airlie
f3f8615d76 r600: fix isoline tess factor component swapping.
As per radeonsi, the tess factor components for isolines
are reversed.

Fixes: tests/spec/arb_tessellation_shader/execution/isoline.shader_test
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-14 06:16:06 +10:00
Dave Airlie
50330d7115 r600/shader: reserve first register of vertex shader.
r0 in input into vertex shaders contains things like vertexid,
we need to reserve it even if we have no inputs.

This fixes a bunch of tessellation piglits.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-14 06:16:05 +10:00
Jason Ekstrand
00fb21b570 meson: Move -Dvulkan-drivers handling higher in the file
The window-system auto-detection code (specifically for glx) relies on
with_any_vk being available.  This fixes the Vulkan-only build.  Also,
this puts it up near the handling of -Ddri-drivers and -Dgallium-drivers
which seems to make a bit more sense.

Fixes: 118a7f0441 "meson: add support for xlib glx"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-13 11:55:06 -08:00
Jason Ekstrand
3a922d6a61 meson: Stop requiring platforms for Vulkan
It should be perfectly valid to build a completely headless Vulkan
driver.  We don't need to require a platform.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-13 11:54:44 -08:00
Dave Airlie
da31e2c22d r600: don't emit atomic save if we have no atomic counters.
Otherwise we end up emitting the fence.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-14 05:40:47 +10:00
Adam Jackson
257edb5b9a glx/dri3: Fix passing renderType into glXCreateContext
Without this, trying to create a GLX_RGBA_FLOAT_TYPE_ARB context would
fail, because GLX_RGBA_TYPE would be a mismatch with the fbconfig.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-13 10:40:30 -05:00
Adam Jackson
033cfb17db glx/drisw: Fix glXMakeCurrent(dpy, None, ctx)
This is perfectly legal in GL 3.0+.

Fixes piglit/glx-create-context-current-no-framebuffer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-13 10:39:48 -05:00
Adam Jackson
bc1bc6f512 glx: Lower GLX opcode lookup into SendMakeCurrentRequest
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-13 10:39:33 -05:00
Jason Ekstrand
93200ea26d aubinator: Don't skip the first field in each subgroup
The previous iteration algorithm would advance the field pointer right
after we advance the group.  This meant that you would end up with
skipping the first field of the group.  In the common case, where the
only field is a struct (e.g. 3DSTATE_VERTEX_BUFFERS), it would get
skipped entirely.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-13 07:37:23 -08:00
Jason Ekstrand
74a9e51696 intel/genxml: Delete empty groups
They serve no purpose other than to just fill empty space in the packet
so each dword has something.  Just disallowing empty groups is a bit
easier on some of the tools.  This does not change the generated packing
headers in any way.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-13 07:37:23 -08:00
Jason Ekstrand
54a6f7eaca anv: Don't crash on invalid heap sizes when the PCI ID is overriden 2017-11-13 07:37:23 -08:00
Alex Smith
4122d00846 nir/spirv: tg4 requires a sampler
Gather operations in both GLSL and SPIR-V require a sampler. Fixes
gathers returning garbage when using separate texture/samplers (on AMD,
was using an invalid sampler descriptor).

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-13 13:38:18 +00:00
Alex Smith
e9eb3c4753 spirv: Use correct type for sampled images
We should use the result type of the OpSampledImage opcode, rather than
the type of the underlying image/samplers.

This resolves an issue when using separate images and shadow samplers
with glslang. Example:

    layout (...) uniform samplerShadow s0;
    layout (...) uniform texture2D res0;
    ...
    float result = textureLod(sampler2DShadow(res0, s0), uv, 0);

For this, for the combined OpSampledImage, the type of the base image
was being used (which does not have the Depth flag set, whereas the
result type does), therefore it was not being recognised as a shadow
sampler. This led to the wrong LLVM intrinsics being emitted by RADV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-13 13:37:50 +00:00
Alejandro Piñeiro
157c9a1341 spirv: add DO NOT EDIT warning on generated spirv_info.c
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-13 13:28:44 +01:00
Thomas Hellstrom
54a58b2856 loader/dri3: Improve dri3 thread-safety
It turned out that with recent changes that call into dri3 from glFinish(),
it appears like different thread end up waiting for X events simultaneously,
causing deadlocks since they steal events from eachoter and update the dri3
counters behind eachothers backs.

This patch intends to improve on that. It allows at most one thread at a
time to wait on events for a single drawable. If another thread intends to
do the same, it's put to sleep until the first thread finishes waiting, and
then it rechecks counters and optionally retries the waiting. Threads that
poll for X events never pulls X events off the event queue if there are
other threads waiting for events on that drawable. Counters in the
dri3 drawable structure are protected by a mutex. Finally, the mutex we
introduce is never held while waiting for the X server to avoid
unnecessary stalls.

This does not make dri3 drawables completely thread-safe but at least it's a
first step.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102358
Fixes: d5ba75f888 "st/dri2 Plumb the flush_swapbuffer functionality through to dri3"
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-13 12:43:39 +01:00
Juan A. Suarez Romero
2b72ab58e5 etnaviv: automake,meson: include common_3d.xml.h in the sources lists
v2: include the file also in the meson.build (Eric Engestrom).

Fixes: f1e1c60ff6 ("etnaviv: Update from rnndb")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-13 12:26:59 +01:00
Tapani Pälli
41f7de477c egl: EXT_pixel_format_float plumbing
Patch adds support and capability to match with new surface attribute,
component type. Currently no configs with floating point type are exposed.

With this change, following dEQP test starts to pass:

   dEQP-EGL.functional.choose_config.color_component_type_ext.dont_care
   dEQP-EGL.functional.choose_config.color_component_type_ext.fixed
   dEQP-EGL.functional.choose_config.color_component_type_ext.float

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2017-11-13 12:40:26 +02:00
Samuel Pitoiset
934b77f2fe radv: add unlikely() around radv_save_descriptors()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:40 +01:00
Samuel Pitoiset
305745457c radv: optimize calling radv_cmd_buffer_trace_emit()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:38 +01:00
Samuel Pitoiset
957d42271b radv: optimize calling radv_save_pipeline()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:36 +01:00
Samuel Pitoiset
ebab5c8ff4 radv: use vk_zalloc instead of vk_alloc+memset
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:35 +01:00
Samuel Pitoiset
0f68208f1d radv: remove unnecessary memset() in radv_AllocateCommandBuffers()
This should not be needed, if the allocation fails an error is
returned and the host should handle it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:32 +01:00
Samuel Pitoiset
66da4c75bc radv: remove useless initializations in radv_create_cmd_buffer()
There is a memset() above.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:30 +01:00
Samuel Pitoiset
3d95fde661 radv: remove useless memset() in radv_CreateFence()
All radv_fence fields are initialized here.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:28 +01:00
Samuel Pitoiset
cd64a4f705 radv: use vk_error() everywhere an error is returned
For consistency and it might help for debugging purposes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:26 +01:00
Samuel Pitoiset
4e16c6a41e radv: make radv_emit_framebuffer_state() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:04:25 +01:00
Samuel Pitoiset
be01197d8d radv: do not emit the framebuffer when restoring a pass
Instead just dirty RADV_CMD_DIRTY_FRAMEBUFFER and it will be
re-emitted if necessary before the next draw.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:04:22 +01:00
Samuel Pitoiset
f87c58dde3 radv: prefetch VBO descriptors at the right place
Just after the vertex shader.

This seems to give a minor boost for, at least, Serious Sam
Fusion 2017 and Dawn of War 3. I don't see any real impacts
with The Talos Principle.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:16 +01:00
Samuel Pitoiset
9444a34f4a radv: add radv_emit_prefetch_TC_L2_async() helper
Will be used for VBO descriptors prefetching.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:13 +01:00
Samuel Pitoiset
36c2e46328 radv: rename radv_emit_shaders_prefetch() to radv_emit_prefetch()
For consistency because this function will also prefetch VBO
descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:11 +01:00
Iago Toral Quiroga
456e10944f glsl/linker: use without_array() to retrieve type
This is what we do in the condition too, so it makes sense.

v2: Only compute without_array() once (Ilia).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-13 09:22:26 +01:00
Dave Airlie
bec716e844 radv: emit esgs ring size in one place.
This register is the same on all gpus so far, so emit it in one
place and also for the pre-gfx9 gpus set the value in the pipeline
creation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-13 07:17:09 +00:00
Dave Airlie
031e591923 radv: move calculating vs out info regs into pipeline.
This moves some calculations of register values into the pipeline
construction, it saves looking at outinfo in the cmd buffer emit.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-13 07:16:53 +00:00
Rob Clark
4a9aad96aa freedreno/a5xx: fix SSBO emit for non-zero offset
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:29:00 -05:00
Rob Clark
5f25ab4fee freedreno/a5xx: remove obsolete comment
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:29:00 -05:00
Rob Clark
8fcee858d5 freedreno/ir3: don't create split/fo if only writing .x
In case an instruction only writes one register, and it is .x, we can
skip the extra level of fanout indirection.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
e7b2719f69 freedreno/a5xx: indirect grids
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
471aa1b6d0 freedreno/a5xx: add global size compute cap
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
62981bbe65 freedreno/ir3: turn on std430 packing
Seems to fix dEQP compute related tests.. and matches what i965 does, so
perhaps there is some assumption that std430 packing is on by default
somewhere in NIR?
2017-11-12 12:28:59 -05:00
Rob Clark
bedbe7f90c freedreno/a5xx: image support 2017-11-12 12:28:59 -05:00
Rob Clark
819a613ae3 freedreno/ir3: moar better scheduler
Add a new pass that inserts additional dependencies, rather than simply
relying on SSA srcs added in the nir->ir3 frontend.  This makes it
easier to deal with barriers, but the additional false deps also lets us
deal properly with ensuring a write depends on all previous reads.

Since conversion to barrier instructions is lossy (ie. just knowing the
instruction doesn't tell us enough about what other instructions the
barrier applies to), use barrier_class/barrier_conflict fields in the
ir3_instruction to retain this information.

This could probably be relaxed somewhat by considering *which* array/
buffer/image variable is being referenced.  Ie. a write to buffer A
can overtake a read from buffer B, if B is not coherent.  (right?)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
15ea8d128a freedreno/ir3: move macros
I want to add a growable array to ir3_instruction, so we can append
false dependencies for purposes of scheduling barriers, atomics, and
dealing with write after read hazards.

Just code motion preparing for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
9edfc369c0 freedreno/ir3: image support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
eaae81058c freedreno/ir3: shared variable support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
dd75abc6f3 freedreno/ir3: some SSBO cleanups/fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
2f8bdf2e2b freedreno/ir3: split out INSTR4F instructions
Atomic instructions take a different # of src args depending on .g or .l
variant, split these out into different helpers with INSTR*F() helper
macro that lets you specify instruction flag.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
0038deb256 freedreno/ir3: cat6 encoding fixes
Instruction encoding/decoding fixes needed for images, shared variables,
etc.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
4e9a6c6868 freedreno/ir3: add barriers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
4c711f4d18 freedreno/ir3: invert is_same_type_mov() logic
Some instructions (like barriers) have no dst, which causes problems
with dereferencing a NULL dst.  Flip the logic around to reject opc's
that can't be a type of move first, to filter out those instructions.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
6da5130074 freedreno/ir3: add cat7 instructions
Needed for memory and execution barriers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
33f5f63b8f freedreno/ir3: add SSBO get_buffer_size() support
Somehow I overlooked this when adding initial SSBO support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
b267a08404 freedreno/ir3: extract helper for common consts
User consts and driver consts such as UBO addresses and immediates are
handled the same for all shader stages, so split out a shared helper for
these, to make it easier to add more.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
13fe1feb62 freedreno: add image view state tracking
It is unfortunate that image state isn't a real CSO, since (at least for
a4xx/a5xx) it is a combination of sampler and "SSBO" image state, and it
would be useful to pre-compute the state block "register" values rather
than doing it at emit time.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
12c1c3ab23 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
0006b860ce mesa/st/nir: assign driver_location for images
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark
ecbe1e976f st/program: fix compute shader nir references
In case the IR is NIR, the driver takes reference to the nir_shader.
Also, because there are no variants, we need to clone the shader,
instead of sharing the reference with gl_program, which would result
in a double free in _mesa_delete_program().

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-12 12:28:59 -05:00
Rob Clark
5009dc55f2 freedreno/ir3: rename ir3_compile -> ir3_context
Having both an ir3_compile (which was really context for compiling a
single shader variant) and ir3_compiler (which is the compiler object
that compiles all variants, ie. basically holds the RA regset) is a
bit confusing.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Kenneth Graunke
9a0465b3a3 intel/tools: Fix detection of enabled shader stages.
We renamed "Function Enable" to "Enable", which broke our detection
of whether shaders are enabled or not.  So, we'd see a bunch of HS/DS
packets with program offsets of 0, and think that was a valid TCS/TES.

Fixes: c032cae9ff (genxml: Rename "Function Enable" to "Enable".)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-12 00:16:40 -08:00
Timothy Arceri
b99fb1a04d st/atifs: remove unrequired initialisation of gl_program fields
As far as I can tell these fields are only used to query arb
program info and are not related to ATI_fragment_shader.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Miklós Máté <mtmkls@gmail.com>
2017-11-12 11:59:22 +11:00
Timothy Arceri
8fe6abd964 ac: add emit_vertex to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Timothy Arceri
dc42a2177c radeonsi: rework gs_vtx_offset handling
This simplifies things a bit and will enable it to work with the
common NIR -> LLVM code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Timothy Arceri
8c9f3f2c46 nir: add streams to nir data
This will be used by gallium drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Marek Olšák
3a71eac783 st/dri: fix deadlock when waiting on android fences
Android fences can't be deferred, because st/dri calls fence_finish
with ctx = NULL, so the driver can't flush u_threaded_context.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-11 04:12:53 +01:00
Rob Clark
881f6e741f meson: Guard freedreno build with with_gallium_freedreno.
This prevents build failures when libdrm_freedreno is unavailable,
which started happening after the ir3_compiler build was enabled.

(Patch by Rob, commit message by Ken).

Fixes: fecd04a66a ("freedreno/ir3: fix standalone compiler meson build")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-10 17:11:48 -08:00
Andres Gomez
d48492074a docs: update calendar, add news item and link release notes for 17.2.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-11 01:25:40 +02:00
Andres Gomez
30a3da45fa docs: add sha256 checksums for 17.2.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 96ad27f8fc)
2017-11-11 01:25:01 +02:00
Andres Gomez
bb40d2f3b8 docs: add release notes for 17.2.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit ae52410bf0)
2017-11-11 01:24:56 +02:00
Jason Ekstrand
2927014313 i965/gen10: Use the correct form of | for the RCPFE workaround
Found by inspection

Fixes: d3d0fe4572
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-11-10 14:36:57 -08:00
Kenneth Graunke
b8d42cccd0 i965: Make L3 configuration atom listen for TCS/TES program updates.
The L3 configuration code already considers the TCS and TES programs,
but failed to listen for TCS/TES program changes.

This was somehow missing.

Fixes: e9644cb1f9 ("i965: Consider tessellation in get_pipeline_state_l3_weights.")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-11-10 13:34:59 -08:00
Dylan Baker
ad9c2f5469 meson: build gallium-xlib based glx
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-10 13:00:01 -08:00
Dylan Baker
118a7f0441 meson: add support for xlib glx
There is a bunch of churn in the main meson.build so that we can
correctly set the auto tristate of GLX. In particular, don't build
xlib-based glx when dri and gallium are disabled but vulkan is enabled,
in that case just turn glx off.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-10 13:00:01 -08:00
Dylan Baker
13752af4ed meson: move gl pkgconfig generation out of glx
Because the same generation logic is required by xlib glx and
gallium-xlib glx, it makes sense to pull it out.

v2: - Ensure that libgl is defined before trying to generate a pkgconfig
      file with it.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-10 13:00:01 -08:00
Dylan Baker
dc0ec581f2 meson: move wayland-egl into egl folder
This ensure that it's properly guarded, but also makes the code clearer
by grouping related things together.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-10 13:00:01 -08:00
Dylan Baker
140b688c57 meson: add nir_builder_opcodes_h to gallium_auxiliary
This creates a dependency on this header being generated before trying
to compile any of these targets, as well as passing the correct -I to
the compiler to ensure it's included correctly.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-10 12:59:54 -08:00
Dylan Baker
7210d0096a gallium/xlib: remove GL_{MAJOR,MINOR,TINY}
These variables were removed from autotools in 2008 (sha:
80f68e1b6a), but they have lived on here. The Scons build
meanwhile doesn't set a patch/tiny version at all, just major and minor.
This patch removes the unused variables and simply sets the version,
leaving patch/tiny as 0 since that's what the autotools build as been
doing forever. This shouldn't change any behavior.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-10 12:40:08 -08:00
Timothy Arceri
f9e5216f71 radeonsi: get llvm types from ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-11 06:54:25 +11:00
Jon Turney
d1edf6e396 glx/windows: Fix building libwindowsdri when libX11 headers are installed in a non-standard location
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2017-11-10 18:24:12 +00:00
Jon Turney
764f1e4d45 util: include unistd.h, which may be required for usleep prototype
This seems to be dropped in 222a2fb9 "util: move os_time.[ch] to src/util"

../../../src/util/os_time.c: In function ‘os_time_sleep’:
../../../src/util/os_time.c:104:4: error: implicit declaration of function ‘usleep’ [-Werror=implicit-function-declaration]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-10 18:20:35 +00:00
Dylan Baker
854455498c autotools: Set C++ visibility flags on Intel
These flags are set for C sources, but not C++. This causes symbol
visibility leaks from the C++ parts of the Intel compiler.

Fixes: 700bebb958 ("i965: Move the back-end compiler to src/intel/compiler")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-10 09:41:55 -08:00
Andres Gomez
b8c85f5acc docs/releasing: improve the pre-announce template and examples
v2: Choose a proper rejection example (Emil).

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-10 19:36:24 +02:00
Andres Gomez
3bc16339e5 docs/releasing: drop manually exported variables during smoke test
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-10 19:36:00 +02:00
Andres Gomez
5ccbac8e04 docs/releasing: drop custom LLVM_CONFIG if previously manually set
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-10 19:34:51 +02:00
Marek Olšák
e456d4def5 st/dri: fix android fence regression
Fixes piglit - egl_khr_fence_sync/android_native tests.
Broken by 884a0b2a9e.

Introduce state-tracker flush flags, analogous to the pipe ones. Use
the former when with stapi->flush().

Fixes: 884a0b2a9e ("st/dri: use stapi flush instead of pipe flush
when creating fences")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-10 17:17:13 +01:00
Nicolai Hähnle
e7972b8943 util/u_thread: fix compilation on Mac OS
Apparently, it doesn't have pthread barriers.

p_config.h (which was originally used to guard this code) uses the
__APPLE__ macro to detect Mac OS.

Fixes: f0d3a4de75 ("util: move pipe_barrier into src/util and rename to util_barrier")
Cc: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-10 16:37:54 +01:00
Nicolai Hähnle
f53570a7a6 util/u_queue: handle OS_TIMEOUT_INFINITE in util_queue_fence_wait_timeout
Fixes e.g. piglit/bin/bufferstorage-persistent read -auto

Fixes: e6dbc804a8 ("winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-10 16:37:47 +01:00
Nicolai Hähnle
ee880e91cc gallium/u_threaded: fix end_query regression
Ouch...

Fixes: 244536d3d6 ("gallium/u_threaded: avoid syncs for get_query_result")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103653
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-10 16:37:37 +01:00
Bruce Cherniak
d473f91758 swr: Fixed an uncommon freed-memory access during state validation
State validation is performed during clear and draw calls.  Validation
during clear was still accessing vertex buffer state.  When the currently
set vertex buffers are client arrays, this could lead to accessing freed
memory.  Such is the case with the VMD application.

Previously, vertex buffer validation depended on a dirty bit or the
draw info indicating an indexed draw.  This required special handling for
clears.  But, vertex buffer validation still occurred which was unnecessary
and wrong.

Now, only minimal validation is performed during clear, deferring the
remainder to the next draw.  And, by setting the dirty bit in swr_draw_vbo
for indexed draws, vertex buffer validation is only dependent upon a
single dirty bit.

This fixes a bug exposed by the VMD application when changing models.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2017-11-10 08:55:42 -06:00
Rob Clark
fecd04a66a freedreno/ir3: fix standalone compiler meson build
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-10 08:57:33 -05:00
Rob Clark
86154acb57 freedreno/ir3: correct # of dest components for intrinsics
Don't rely on intr->num_components having a valid value.  It doesn't
seem to anymore for non-vectorized intrinsics.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-10 08:57:33 -05:00
Rob Clark
3fcf18634c freedreno/ir3: remove bogus assert
The ssbo atomic instructions are not vectorized.  So num_components is
not expected to be valid.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-10 08:57:33 -05:00
Rob Clark
ef4c42fc3a nir: handle get_buffer_size in nir_lower_atomics_to_ssbo
Overlooked initially, be we need to remap the SSBO index for this as
well.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-10 08:57:33 -05:00
Chad Versace
cd6f79a71d anv/meson: Generate dev_icd.json
I tested this in a setup where the builddir was outside of the srcdir.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-09 16:29:33 -08:00
Chad Versace
b7441ef252 anv: Fix architecture in intel_icd.{arch}.json
Use the host arch, not the target arch. In Meson and in recent
Autotools, the host arch is where the binary will be used. The target
arch is useful only when compiling a compiler.

See: http://mesonbuild.com/Cross-compilation.html
See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
Reported-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-09 16:29:31 -08:00
Chad Versace
9ec33972cc radv: Fix architecture in radeon_icd.{arch}.json
Use the host arch, not the target arch. In Meson and in recent
Autotools, the host arch is where the binary will be used. The target
arch is useful only when compiling a compiler.

See: http://mesonbuild.com/Cross-compilation.html
See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
Reported-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-09 16:29:28 -08:00
Chad Versace
2a4798ad98 anv: Refactor anv_GetImageSubresourceLayout()
Its helper function, anv_surface_get_subresource_layout(), was not very
helpful. So fold it into the main function.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
69e3f0b02e anv/image: Refactor choice of isl_tiling_flags_t
Instead of choosing the tiling flags inside make_surface(), which is
called once per aspect in a loop, and which chooses the same tiling for
each aspect, choose the tiling flags exactly once before entering the
aspect loop.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
7bb4387105 anv: Refactor anv_get_format_plane() - explicit unsupported
The same local variable, 'plane_format', was returned on success *and*
failure. Be more explicit in distinguishing the two cases: return
'plane_format' on success and return 'unsupported' on failure.

This simplifies the diff in upcoming patches for
VK_EXT_image_drm_format_modifier.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
3ee7f4bc2f anv: Remove anv_physical_device_get_format_properties()
Fold its body into its sole caller,
anv_GetPhysicalDeviceFormatProperties().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
891d237667 anv: Simplify anv_physical_device_get_format_properties()
Now that get_image_format_properties() returns the correct
VkFormatFeatureFlags, we can remove the unneeded if-branch and some
local variables.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
b3e2ce0580 anv: Simplify anv_get_image_format_properties()
Now that get_image_format_features() has a VkImageTiling parameter, we
can bypass anv_physical_device_get_format_properties() and call
get_image_format_features() directly.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
cd3fe376e0 anv: Rename get_image_format_properties()
The name is misleading. It looks like vkGetPhysicalDeviceImageFormatProperties(),
but it actually implement vkGetPhysicalDeviceFormatProperties. Let's
rename it to what it actually does, get_image_format_features(), because it
returns VkFormatFeatureFlags.

For consistency, also rename get_buffer_format_properties() to
get_buffer_format_features().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
17ac61a2c9 anv: Fix get_image_format_properties() - YCbCr
Teach it to calculate the format features for YCbCr.

The goal (which is completed in this patch) is to incrementally fix
get_image_format_properties() to return a correct result.  Previously,
it returned incorrect VkFormatFeatureFlags which the caller needed clean
up.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
eaa49ec3fc anv: Fix get_image_format_properties() - 3-channel formats
Teach it to calculate the format features for 3-channel formats.

The goal is to incrementally fix get_image_format_properties() to return
a correct result.  Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
6394e4a380 anv: Refactor get_image_format_properties() - Reduce params
Replace parameters 'enum isl_format' and 'struct anv_format_plane' with
new parameter 'const struct anv_format *'.

The goal is to incrementally fix get_image_format_properties() to return
a correct result.  Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-09 16:01:59 -08:00
Chad Versace
66647074a4 anv: Refactor get_image_format_properties() - base_isl_format
Rename parameter 'base' to 'base_isl_format'.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-09 16:01:59 -08:00
Chad Versace
c22a9f10be anv: Refactor get_image_format_properties() - plane_format
Rename parameter 'format' to 'plane_format'.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
096fc6915b anv: Refactor get_image_format_properties() - ASTC
Teach it to calculate the format features for ASTC.

The goal is to incrementally fix get_image_format_properties() to return
a correct result.  Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.

v2: New commit message

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
8ae4e97536 anv: Refactor get_image_format_properties() - depthstencil (v2)
Teach it to calculate the features of depthstencil formats.

The goal is to incrementally fix get_image_format_properties() to return
a correct result.  Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.

v2: New commit message

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-09 16:01:59 -08:00
Chad Versace
6720abf292 anv: Better types for 'aspect' function params
Some functions have a comment that says "Exactly one bit must be in
'aspect'". So change the type of their 'aspect' parameter from
VkImageAspectFlags to VkImageAspectFlagBits.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-09 16:01:59 -08:00
Chad Versace
342c811646 anv: Refactor get_buffer_format_properties()
Make it a stand-alone function. Pre-patch, for some formats the function
returned incorrect VkFormatFeatureFlags which were cleaned up by the
caller.

This prepares for a cleaner implementation of
VK_EXT_image_drm_format_modifier.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-09 16:01:59 -08:00
Eric Anholt
62deeaa23a broadcom/vc4: Fix simulator mode for the MADVISE usage. 2017-11-09 15:51:56 -08:00
Marek Olšák
272fe94942 mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile
We already have piglit tests testing alpha, luminance, and intensity
formats. They were skipped by piglit until now.

Additionally, I'm enabling one ARB_texture_buffer_range piglit test to run
with the compat profile.

i965 behavior is unchanged except that it doesn't expose TBOs in the Compat
profile. Not sure how that affects the GL version override.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-09 23:55:31 +01:00
Dave Airlie
d4ebdc1a54 docs: update r600 atomic counter status.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:36 +10:00
Dave Airlie
06993e4ee3 r600: add support for hw atomic counters. (v3)
This adds support for the evergreen/cayman atomic counters.

These are implemented using GDS append/consume counters. The values
for each counter are loaded before drawing and saved after each draw
using special CP packets.

v2: move hw atomic assignment into driver.
v3: fix messing up caps (Gert Wollny), only store ranges in driver,
drop buffers.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
2017-11-10 08:39:36 +10:00
Dave Airlie
9e62654d4b st/mesa: add support for hw atomics to glsl->tgsi. (v5)
This adds support for creating the hw atomic tgsi from
the glsl codepaths.

v2: drop the atomic index and move to backend.
v3: drop buffer decls. (Marek)
v4: fix off by one (Gert)
v5: fix off by one the other way (Dave)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:35 +10:00
Dave Airlie
58d061ceef st/mesa: setup hw atomic limits. (v1.1)
HW atomics need to use caps to set some limits, and some
other limits may also need limiting.

This fixes things up to work for evergreen hw, it may need
more changes in the future if other hw wants to use this path.

v1.1: fix indent.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:35 +10:00
Dave Airlie
9f1db21f28 st/mesa: start adding support for hw atomics atom. (v2)
This adds a new atom that calls the new driver API to
bind buffers containing hw atomics.

v2: fixup bindings for sparse buffers. (mareko/nha)
don't bind buffer atomics when hw atomics are enabled.
use NewAtomicBuffer (mareko)

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:35 +10:00
Dave Airlie
e0fb6de313 mesa/program: add hw atomic counter file
This is needed for the GLSL->TGSI translation for hw atomic counters.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:35 +10:00
Dave Airlie
cca5617348 gallium: add hw atomic buffer binding API.
This API binds atomic buffers for all bound shaders (as per the
GL semantics).

This is needed to support cross shader hw atomic counters.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:35 +10:00
Dave Airlie
4b0b82770a gallium/tgsi: start adding hw atomics (v3.2)
This adds support for a hw atomic counters to TGSI.

A new register file for storing atomic counters is added,
along with a new atomic counter semantic, along with docs
for both.

v2: drop semantic, move hw counter to backend,
Ilia pointed out SSO would have busted my plan, and he
was right.
v3: drop BUFFER decls. (Marek)
v3.1: minor fixups for whitespace, set ureg error
if we overflow the hw atomic limits. (nha)
v3.2: fix some docs inconsistencies (Ilia)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:35 +10:00
Dave Airlie
2a06423c00 gallium: add CAPs to support HW atomic counters. (v3)
This looks like an evergreen specific feature, but with atomic
counters AMD have hw specific counters they use instead of operating
on buffers directly. These are separate to the buffer atomics,
so require different limits and code paths.

I've left the CAP for atomic type extensible in case someone
else has a variant on this sort of thing (freedreno maybe?)
and needs to change it.

This adds all the CAPs required to add support for those atomic
counters, along with a related CAP for limiting the number of
output resources.

I'd like to land this and the st patch then I can start to
upstream the evergreen support for these and other GL4.x features.

v2: drop the ATOMIC_COUNTER_MODE cap, just use the return
from the HW counters. If 0 we use the current mode.
v3: fix some rebase errors (Gert Wollny)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:34 +10:00
Dave Airlie
24baca6e75 r600/query: drop rest of vi workaround code.
This isn't needed in r600 anymore.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:16 +10:00
Roland Scheidegger
dd38a4ee0d docs: Fix GL_MESA_program_debug enums
13b303ff92 added the actual enums but
didn't remove the already existing XXXX ones. (And also duplicated
the "fragment" names instead of using the "vertex" names.)

Fixes: 13b303ff92 "docs: Update the list of used MESA GL enums."
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-09 23:35:12 +01:00
Brian Paul
b99c6288c1 st/mesa: remove 'struct' keyword on function parameter
st_src_reg is a class, not a struct.  Simply remove 'struct' to silence
a MSVC compiler warning (class vs. struct mismatch).

Reviewed-by; Charmaine Lee <charmainel@vmware.com>
2017-11-09 14:13:59 -07:00
Brian Paul
750ee4182e threads: fix MinGW build breakage
Fixes: f1a3648784 ("threads: update for late C11 changes")

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-11-09 14:13:59 -07:00
Brian Paul
f8bae523d9 mesa: s/GLint/gl_buffer_index/ for _ColorDrawBufferIndexes
Also fix local variable declarations and replace -1 with BUFFER_NONE.
No Piglit changes.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-09 14:13:59 -07:00
Brian Paul
366453f4d3 mesa: s/GLint/gl_buffer_index/ for _ColorReadBufferIndex
BUFFER_NONE is -1 so no reason for GLint.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-09 14:13:59 -07:00
Brian Paul
9862a8403e mesa: minor reformatting, add const to gl_external_samplers()
This function should probably be moved elsewhere, too.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-09 14:13:59 -07:00
Brian Paul
ad5f407b61 st/mesa: whitespace clean-up in st_mesa_to_tgsi.c
Remove trailing whitespace, fix indentation, wrap lines to 78 columns, etc.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-09 14:13:59 -07:00
Dylan Baker
1873327c4b meson: implement default driver arguments
This allows drivers to be set by OS/arch in a sane manner.

v2: - set _drivers to a list of drivers instead of manually assigning
      each with_*
v3: - Use "auto" instead of "default", which matches the value of other
      automatically configured options.
    - Set vulkan drivers as well
    - Add error message if no automatic drivers are known for a given
      arch/OS combo
    - use not(darwin or windows) instead of (linux or *bsd), which is
      probably more accurate (that way Solaris and other *nix systems
      aren't excluded)
    - rename softpipe to swrast, as swrast is the actual option name

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-11-09 13:10:59 -08:00
Kenneth Graunke
8ecdbb6136 i965: Pretend there are 4 subslices for compute shader threads on Gen9+.
Similar to what we did for pixel shader threads - see gen_device_info.c.

We don't want to bump the actual Maximum Number of Threads though, so
we adjust it here.  For pixel shaders, we don't use max_wm_threads, so
we could just bump it globally.

Supposedly fixes Piglit tests:
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec3-int64_t
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec4-int64_t
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-u64vec4-uint64_t

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-11-09 12:34:11 -08:00
Dylan Baker
3e9533d9b8 meson: Add script to use VERSION file for getting version
Meson has up until this point set it's version in the root meson.build
script, while the other build systems read the VERSION file. This is
just "one more thing" to duplicate between meson and every other build
system. This script is a simple "read, strip, print" sort of deal to
allow meson to read the VERSION file.

I chose to implement this in python since python is portable, and to
keep the meson.build script clean. This is also complicated by the fact
that the project() call *must* be the first non-comment,non-blank in the
toplevel meson.build script.

v2: - Move from scripts/ to bin/
    - use python explicitly to run the scripts to support windows

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-09 11:19:53 -08:00
Boris Brezillon
359a8f6ae5 broadcom/vc4: Mark BOs as purgeable when they enter the BO cache
This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all
BOs placed in the mesa BO cache as purgeable so that the system can
reclaim this memory under memory pressure.

v2:
- Removed BOs from the cache when they've been purged by the kernel
- Check whether the madvise ioctl is supported or not before using it

v3: Don't walk the whole list when we find a busy BO (by anholt, acked by
    Boris)

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-09 10:57:17 -08:00
Boris Brezillon
7d72af4f09 drm-uapi: Update vc4 header from drm-next
Taken from drm-next d65d31388a23 ("Merge tag
'drm-misc-next-fixes-2017-11-07' of
git://anongit.freedesktop.org/drm/drm-misc into drm-next")

v2: Add the NOTSUPP definition from the final drm-next version, not the
    commit (anholt).

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-11-09 10:57:14 -08:00
Eric Anholt
ebcb4c2156 meson: Enable VC4's NEON assembly support.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:30 -08:00
Eric Anholt
9c9fd8ff37 meson: Always link libgallium_dri.so against dep_thread.
Somehow on my cross build the -pthread is getting lost.  All the other
deps seem to work out fine.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:27 -08:00
Eric Anholt
4c94607c21 meson: Drop stale comment about making valgrind conditional.
It was fixed in 5c2ff5773a.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:25 -08:00
Eric Anholt
d975e6940e meson: Leave dep_llvm empty if !with_llvm
The gallium auxiliary build would link against llvm, for the gallivm code
that it didn't build.  This broke the build on my armhf cross, where
libLLVM-3.9.so is not multiarch and thus points to x86-64 libs.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:03 -08:00
Adam Jackson
015cc6bb7c Revert "glx: Implement GLX_EXT_no_config_context (v2)"
Pushed ahead of things actually working.

This reverts commit 5293b96b16.
2017-11-09 11:41:14 -05:00
Marek Olšák
9ceb057ebf radeonsi: pack r600_surface better
160 -> 136 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák
169525684f radeonsi: pack r600_texture better
1752 -> 1736 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák
f8a4b606a2 radeonsi: clean up r600_surface
216 -> 160 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák
6916ee7e17 radeonsi: remove r600_texture::non_disp_tiling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák
a06fe75eac radeonsi: remove DBG_NO_DISCARD_RANGE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Adam Jackson
5293b96b16 glx: Implement GLX_EXT_no_config_context (v2)
This more or less ports EGL_KHR_no_config_context to GLX.

v2: Enable the extension only for those backends that support it.

Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-09 09:35:30 -05:00
Adam Jackson
3f66d54a2a glx: Prepare the DRI backends for GLX_EXT_no_config_context
This should be safe as these backends already support the EGL version of
this extension. DRI1 is not affected because it does not support
GLX_ARB_create_context anyway. DRI-Windows is not prepared to implement
this as there's no equivalent WGL extension, and wglCreateContextAttribs
seems to really want the HDC's pixel format to be set.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-09 09:35:22 -05:00
Adam Jackson
74b701d84c glx: Relax validate_renderType_against_config for EXT_no_config_context
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-09 09:35:07 -05:00
Nicolai Hähnle
ffc2060616 anv: fix build failure
Fixes: e3a8013de8 ("util/u_queue: add util_queue_fence_wait_timeout")
2017-11-09 14:49:19 +01:00
Nicolai Hähnle
cc78d77043 mesa: flush and wait after creating a fallback texture
Fixes non-deterministic failures in
dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_sync.images.texture_source.teximage2d_render
and others in dEQP-EGL.functional.sharing.gles2.multithread.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:20:58 +01:00
Nicolai Hähnle
46444613cf mesa: increase MaxServerWaitTimeout
The current value was introduced in commit a27180d0d8, which claims
that it represents ~1.11 years. However, it is interpreted in nanoseconds,
so it actually only represents ~9.8 hours. That seems a bit short.

Use the largest value consistent with both int32 and int64. It
corresponds to ~292 years in nanoseconds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:20:58 +01:00
Nicolai Hähnle
fbda7958ff st/mesa: remove redundant flushes from st_flush
st_flush should flush state tracker-internal state and the pipe, but
not mesa/main state. Of the four callers:

- glFlush/glFinish already call FLUSH_{VERTICES,STATE}.
- st_vdpau doesn't need to call them.
- st_manager will now call them explicitly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:20:58 +01:00
Nicolai Hähnle
884a0b2a9e st/dri: use stapi flush instead of pipe flush when creating fences
There may be pending operations (e.g. vertices) that need to be flushed
by the state tracker.

Found by inspection.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:20:58 +01:00
Nicolai Hähnle
b921da3b74 radeonsi: use a threaded context even for debug contexts
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:04 +01:00
Nicolai Hähnle
1a6d9e087a radeonsi: record and dump time of flush
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:04 +01:00
Nicolai Hähnle
b07569ad8b ddebug: optionally handle transfer commands like draws
Transfer commands can have associated GPU operations.

Enabled by passing GALLIUM_DDEBUG=transfers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle
18fd2a859d ddebug: dump context and before/after times of draws
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle
ba2f2b6f2a ddebug: generalize print_named_xxx via a PRINT_NAMED macro
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle
c9fefa062b ddebug: rewrite to always use a threaded approach
This patch has multiple goals:

1. Off-load the writing of records in 'always' mode to another thread
   for performance.

2. Allow using ddebug with threaded contexts. This really forces us to
   move some of the "after_draw" handling into another thread.

3. Simplify the different modes of ddebug, both in the code and in
   the user interface, i.e. GALLIUM_DDEBUG. In particular, there's
   no 'pipelined' anymore, since we're always pipelined; and 'noflush'
   is replaced by 'flush', since we no longer flush by default.

4. Fix the fences in pipelining mode. They previously relied on writes
   via pipe_context::clear_buffer. However, on radeonsi, those could
   (quite reasonably) end up in the SDMA buffer. So we use the newly
   added PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE fences instead.

5. Improve pipelined mode overall, using the finer grained information
   provided by the new fences.

Overall, the result is that pipelined mode should be more useful, and
using ddebug in default mode is much less invasive, in the sense that
it changes the overall driver behavior less (which is kind of crucial
for a driver debugging tool).

An example of the new hang debug output:

  Gallium debugger active.
  Hang detection timeout is 1000ms.
  GPU hang detected, collecting information...

  Draw #   driver  prev BOP  TOP  BOP  dump file
  -------------------------------------------------------------
  2          YES      YES    YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000000
  3          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000001
  4          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000002
  5          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000003

  Done.

We can see that there were almost certainly 4 draws in flight when
the hang happened: the top-of-pipe fence was signaled for all 4 draws,
the bottom-of-pipe fence for none of them. In virtually all cases,
we'd expect the first draw in the list to be at fault, but due to the
GPU parallelism, it's possible (though highly unlikely) that one of
the later draws causes a component to get stuck in a way that prevents
the earlier draws from making progress as well.

(In the above example, there were actually only 3 draws truly in flight:
the last draw is a blit that waits for the earlier draws; however, its
top-of-pipe fence is emitted before the cache flush and wait, and so
the fact that the draw hasn't truly started yet can only be seen from a
closer inspection of GPU state.)

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle
e8bb8758dd ddebug: use an atomic increment when numbering files
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle
d6710fe874 dd/util: extract dd_get_debug_filename_and_mkdir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle
8491fcafab gallium/u_dump: add and use util_dump_transfer_usage
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:02 +01:00
Nicolai Hähnle
9b8033a4a7 gallium/u_dump: add util_dump_ns
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:02 +01:00
Nicolai Hähnle
6f4a03b08a gallium/u_dump: export util_dump_ptr
Change format to %p while we're at it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:02 +01:00
Nicolai Hähnle
125a915052 radeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE
v2: use uncached system memory for the fence, and use the CPU to
    clear it so we never read garbage when checking the fence

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:55 +01:00
Nicolai Hähnle
e4627ac8fb radeonsi: document some subtle details of fence_finish & fence_server_sync
v2: remove the change to si_fence_server_sync, we'll handle that more
    robustly

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:50 +01:00
Nicolai Hähnle
14b9fa75e4 gallium: add pipe_context::callback
For running post-draw operations inside the driver thread. ddebug will
use it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:50 +01:00
Nicolai Hähnle
2bdfbb0e53 gallium/u_threaded: implement pipe_context::set_log_context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:49 +01:00
Nicolai Hähnle
244536d3d6 gallium/u_threaded: avoid syncs for get_query_result
Queries should still get marked as flushed when flushes are executed
asynchronously in the driver thread.

To this end, the management of the unflushed_queries list is moved into
the driver thread.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:49 +01:00
Nicolai Hähnle
609a230375 gallium/u_threaded: implement asynchronous flushes
This requires out-of-band creation of fences, and will be signaled to
the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag.

v2:
- remove an incorrect assertion
- handle fence_server_sync for unsubmitted fences by
  relying on the improved cs_add_fence_dependency
- only implement asynchronous flushes on amdgpu

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:42 +01:00
Nicolai Hähnle
11b380ed0c gallium/u_threaded: mark queries flushed only for non-deferred flushes
The driver uses (and must use) the flushed flag of queries as a hint that
it does not have to check for synchronization with currently queued up
commands. Deferred flushes do not actually flush queued up commands, so
we must not set the flushed flag for them.

Found by inspection.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:42 +01:00
Nicolai Hähnle
78a4750d91 radeonsi: move fence functions to si_fence.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:42 +01:00
Nicolai Hähnle
e6dbc804a8 winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences
The idea is to fix the following interleaving of operations
that can arise from deferred fences:

 Thread 1 / Context 1          Thread 2 / Context 2
 --------------------          --------------------
 f = deferred flush
 <------- application-side synchronization ------->
                               fence_server_sync(f)
                               ...
                               flush()
 flush()

We will now stall in fence_server_sync until the flush of context 1
has completed.

This scenario was unlikely to occur previously, because applications
seem to be doing

 Thread 1 / Context 1          Thread 2 / Context 2
 --------------------          --------------------
 f = glFenceSync()
 glFlush()
 <------- application-side synchronization ------->
                               glWaitSync(f)

... and indeed they probably *have* to use this ordering to avoid
deadlocks in the GLX model, where all GL operations conceptually
go through a single connection to the X server. However, it's less
clear whether applications have to do this with other WSI (i.e. EGL).
Besides, even this sequence of GL commands can be translated into
the Gallium-level sequence outlined above when Gallium threading
and asynchronous flushes are used. So it makes sense to be more
robust.

As a side effect, we no longer busy-wait on submission_in_progress.

We won't enable asynchronous flushes on radeon, but add a
cs_add_fence_dependency stub anyway to document the potential
issue.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:22 +01:00
Nicolai Hähnle
1e5c9cf590 gallium: add PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE bits
These bits are intended to be used by the ddebug hang detection and are
named in analogy to the Vulkan stage bits (and the corresponding Radeon
pipeline event).

Hang detection needs fences on the granularity of individual commands,
which nothing else really covers. The closest alternative would have
been PIPE_QUERY_GPU_FINISHED, but (a) queries are a per-context object
and we really want a per-screen object, (b) queries don't offer a
wait with timeout, and (c) in any case, PIPE_QUERY_GPU_FINISHED is
meant to imply that GPU caches are flushed, which the new bits
explicitly aren't.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 13:58:16 +01:00
Nicolai Hähnle
ea6df1ce37 gallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH
Also document some subtleties of pipe_context::flush.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 13:58:16 +01:00
Nicolai Hähnle
e3a8013de8 util/u_queue: add util_queue_fence_wait_timeout
v2:
- style fixes
- fix missing timeout handling in futex path

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 13:58:10 +01:00
Nicolai Hähnle
f1a3648784 threads: update for late C11 changes
C11 threads were changed to use struct timespec instead of xtime, and
thrd_sleep got a second argument.

See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1554.htm and
http://en.cppreference.com/w/c/thread/{thrd_sleep,cnd_timedwait,mtx_timedlock}

Note that cnd_timedwait is spec'd to be relative to TIME_UTC / CLOCK_REALTIME.

v2: Fix Windows build errors. Tested with a default Appveyor config
    that uses Visual Studio 2013. Judging from Brian's email and
    random internet sources, Visual Studio 2015 does have timespec
    and timespec_get, hence the _MSC_VER-based guard which I have
    not tested.

Cc: Jose Fonseca <jfonseca@vmware.com>
Cc: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-11-09 11:57:22 +01:00
Nicolai Hähnle
c50743f61c gallium: remove unused and deprecated u_time.h
Cc: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:57:22 +01:00
Nicolai Hähnle
222a2fb998 util: move os_time.[ch] to src/util
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:57:21 +01:00
Nicolai Hähnle
f76a6cb337 radeonsi: always use async compiles when creating shader/compute states
With Gallium threaded contexts, creating shader/compute states is
effectively a screen operation, so we should not use context state.

In particular, this allows us to avoid using the context's LLVM
TargetMachine.

This isn't an issue yet because u_threaded_context filters out non-async
debug callbacks, and we disable threaded contexts for debug contexts.
However, we may want to change that in the future.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:53:20 +01:00
Nicolai Hähnle
b650fc09c3 radeonsi: fix potential use-after-free of debug callbacks
Found by inspection.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:53:20 +01:00
Nicolai Hähnle
dd7c273e87 radeonsi: move pipe debug callback to si_context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:53:19 +01:00
Nicolai Hähnle
185061aef4 u_queue: add util_queue_finish for waiting for previously added jobs
Schedule one job for every thread, and wait on a barrier inside the job
execution function.

v2: avoid alloca (fixes Windows build error)

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-11-09 11:53:19 +01:00
Nicolai Hähnle
f0d3a4de75 util: move pipe_barrier into src/util and rename to util_barrier
The #if guard is probably not 100% equivalent to the previous PIPE_OS
check, but if anything it should be an over-approximation (are there
pthread implementations without barriers?), so people will get either
a good implementation or compile errors that are easy to fix.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:53:19 +01:00
Nicolai Hähnle
28c95cdb29 gallium: add async debug message forwarding helper
v2: use util_vasprintf for Windows portability

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-11-09 11:53:19 +01:00
Nicolai Hähnle
637240d824 st/mesa: guard sampler views changes with a mutex
Some locking is unfortunately required, because well-formed GL programs
can have multiple threads racing to access the same texture, e.g.: two
threads/contexts rendering from the same texture, or one thread destroying
a context while the other is rendering from or modifying a texture.

Since even the simple mutex caused noticable slowdowns in the piglit
drawoverhead micro-benchmark, this patch uses a slightly more involved
approach to keep locks out of the fast path:

- the initial lookup of sampler views happens without taking a lock
- a per-texture lock is only taken when we have to modify the sampler
  view(s)
- since each thread mostly operates only on the entry corresponding to
  its context, the main issue is re-allocation of the sampler view array
  when it needs to be grown, but the old copy is not freed

Old copies of the sampler views array are kept around in a linked list
until the entire texture object is deleted. The total memory wasted
in this way is roughly equal to the size of the current sampler views
array.

Fixes non-deterministic memory corruption in some
dEQP-EGL.functional.sharing.gles2.multithread.* tests, e.g.
dEQP-EGL.functional.sharing.gles2.multithread.simple.images.texture_source.create_texture_render

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:50:55 +01:00
Nicolai Hähnle
8d20c660a9 st/mesa: re-arrange st_finalize_texture
Move the early-out for surface-based textures earlier. This narrows the
scope of the locking added in a follow-up commit.

Fix one remaining case of initializing a surface-based texture
without properly finalizing it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:50:55 +01:00
Nicolai Hähnle
0dcf30e550 gallium: clarify the constraints on sampler_view_destroy
r600 expects the context that created the sampler view to still be alive
(there is a per-context list of sampler views).

svga currently bails when the context of destruction is not the same as
creation.

The GL state tracker, which is the only one that runs into the
multi-context subtleties (due to share groups), already guarantees that
sampler views are destroyed before their context of creation is destroyed.

Most drivers are context-agnostic, so the warning message in
pipe_sampler_view_release doesn't really make sense.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:50:54 +01:00
Nicolai Hähnle
0f54ee6072 radeonsi: reduce the scope of sel->mutex in si_shader_select_with_key
We only need the lock to guard changes in the variant linked list. The
actual compilation can happen outside the lock, since we use the ready
fence as a guard.

v2: fix double-unlock

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:37:51 +01:00
Nicolai Hähnle
4f493c79ee radeonsi: use ready fences on all shaders, not just optimized ones
There's a race condition between si_shader_select_with_key and
si_bind_XX_shader:

  Thread 1                         Thread 2
  --------                         --------
  si_shader_select_with_key
    begin compiling the first
    variant
    (guarded by sel->mutex)
                                   si_bind_XX_shader
                                     select first_variant by default
                                     as state->current
                                   si_shader_select_with_key
                                     match state->current and early-out

Since thread 2 never takes sel->mutex, it may go on rendering without a
PM4 for that shader, for example.

The solution taken by this patch is to broaden the scope of
shader->optimized_ready to a fence shader->ready that applies to
all shaders. This does not hurt the fast path (if anything it makes
it faster, because we don't explicitly check is_optimized).

It will also allow reducing the scope of sel->mutex locks, but this is
deferred to a later commit for better bisectability.

Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:37:51 +01:00
Nicolai Hähnle
d1ff082637 u_queue: add a futex-based implementation of fences
Fences are now 4 bytes instead of 96 bytes (on my 64-bit system).

Signaling a fence is a single atomic operation in the fast case plus a
syscall in the slow case.

Testing if a fence is signaled is the same as before (a simple comparison),
but waiting on a fence is now no more expensive than just testing it in
the fast (already signaled) case.

v2:
- style fixes
- use p_atomic_xxx macros with the right barriers

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:37:39 +01:00
Nicolai Hähnle
574c59d4f9 u_queue: add util_queue_fence_reset
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:37:39 +01:00
Nicolai Hähnle
1b9d5ece55 u_queue: export util_queue_fence_signal
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:37:38 +01:00
Nicolai Hähnle
b20f955bc1 u_queue: group fence functions together
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:37:38 +01:00
Nicolai Hähnle
0a7f17cf5b util/u_atomic: add p_atomic_xchg
The closest to it in the old-style gcc builtins is __sync_lock_test_and_set,
however, that is only guaranteed to work with values 0 and 1 and only
provides an acquire barrier. I also don't know about other OSes, so we
provide a simple & stupid emulation via p_atomic_cmpxchg.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:37:30 +01:00
Nicolai Hähnle
b4b2a951c8 util: move futex helpers into futex.h
v2: style fixes

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-11-09 11:37:22 +01:00
Kenneth Graunke
688d695868 glsl: Make #pragma STDGL invariant(all) only modify outputs.
According to the GLSL ES 3.20, GLSL 4.50, and GLSL 1.20 specs:

   "To force all output variables to be invariant, use the pragma

       #pragma STDGL invariant(all)

    before all declarations in a shader."

Notably, this is only supposed to affect output variables.  Furthermore,

   "Only variables output from a shader can be candidates for invariance."

It looks like this has been wrong since we first supported the pragma in
2011 (commit 86b4398cd1).

Fixes dEQP-GLES2.functional.shaders.preprocessor.pragmas.pragma_fragment.

v2: Now that all cases are identical (other than compute shaders, which
    have no output variables anyway), we can drop the switch statement
    entirely.  We also don't need the current_function == NULL check;
    this was a hold over from when we had a single var_mode_out for both
    function parameters and shader varyings, in the bad old days.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-08 23:11:48 -08:00
Tapani Pälli
c591b1e594 i965: expose SRGB visuals and turn on EGL_KHR_gl_colorspace
Patch exposes sRGB visuals and adds DRI integer query support for
__DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB. Further changes make sure that
we mark if the app explicitly wanted sRGB and for these framebuffers
we don't turn sRGB off in intel_gles3_srgb_workaround. This way we
keep compatibility for existing applications relying on default sRGB
and ony add more visual support.

With this change, following dEQP tests start to pass:

   dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb
   dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb

v2: some code cleanup (Emil Velikov)
    update num_formats correctly (reported by deveee@gmail.com)

v3: cleanup, remove redundant is_srgb
    rename explicit_srgb as 'need_srgb' to follow style better

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v2)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102264
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102354
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102503
2017-11-09 07:43:25 +02:00
Neil Roberts
4dc8458cd1 glsl: Transform fb buffers are only active if a variable uses them
The GL spec will soon be revised to clarify that a buffer binding for
a transform feedback buffer is only required if a variable is actually
defined to use the buffer binding point. Previously a declaration for
the default transform buffer would make it require a binding even if
nothing was declared to use the default buffer.

Affects:
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-09 05:39:42 +01:00
Jason Ekstrand
951a5dc4cc intel/nir: Use the correct indirect lowering masks in link_shaders
Previously, if we were linking a vec4 VS with a SIMD8/16 FS, we wouldn't
lower indirects on the fragment shader which is wrong.  Instead of using
a single indirect mask, take advantage of our new little helper.

Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-08 20:10:04 -08:00
Ilia Mirkin
f317f72f73 r600g: use SIMPLE_FLOAT for blending to enable some optimizations
Radeonsi also sets this flag. Seems to avoid pulling up the desintation
RT value when the dst blend factor is zero if it's not otherwise being
loaded. Among other things, it allows blending to overwrite infinity/NaN
values in the destination RT.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 22:35:27 -05:00
Ilia Mirkin
35433494f3 nv50: make blending work so that zero wins in a multiplication
This matches nvc0 behavior, tested with the fbo-float-nan piglit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann<tobias.johannes.klausmann@mni.thm.de>
2017-11-08 22:32:43 -05:00
Ian Romanick
9c53b80ff9 glsl: Minor cleanups after previous commit
I think it's more clear to only call emit_access once.  The only
difference between the two calls is the value of size_mul used for the
offset parameter... but you really have to look at it to be sure.

The s/is_64bit/is_double/ change is because there are no int64_t or
uint64_t matrix types.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick
c18d8c61d6 glsl: Use more link_calculate_matrix_stride in lower_buffer_access
I was going to squash this with the previous commit, but there's a lot
of churn in that commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick
1a2beae1b3 glsl: Use link_calculate_matrix_stride in lower_buffer_access and friends
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick
24e78d99db glsl: Refactor matrix stride calculation into a utility function
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick
88f5588f77 glsl/linker: Optimize swizzles again after linking
Without this, the SPIR-V generator has to deal with a bunch of junk
like:

    (swiz z (swiz xxx (swiz x (var_ref packed:binormal.z,light_dir))))

It seems better to cull that stuff out than to add code to deal with
it.  The problem is the way swizzles to and from scalars have to be
handled in SPIR-V.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick
ef1ca06ce8 glsl: Combine nop-swizzle optimization with swizzle-swizzle optimization
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick
c858abb14f glsl: Make the swizzle-swizzle optimization greedy
If there is a long sequence of swizzled swizzles, compact all of them
down to a single swizzle.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick
ae1fd09c1d glsl: Remove program_resource_visitor::visit_field(const glsl_struct_field *)
I could not find any remaining users.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 18:37:29 -08:00
Ian Romanick
2c7657f62c glsl: Silence unused parameter warning
glsl/lower_shared_reference.cpp: In member function ‘virtual void
{anonymous}::lower_shared_reference_visitor::insert_buffer_access(void*,
ir_dereference*, const glsl_type*, ir_rvalue*, unsigned int, int)’:

glsl/lower_shared_reference.cpp:244:58: warning: unused parameter
‘channel’ [-Wunused-parameter]
                                                      int channel)
                                                          ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 18:37:29 -08:00
Dave Airlie
6bec8bcd79 ac/nir: add support for all intrinsics. (v2)
This is derived from tgsi/radeonsi code from the GLSL intrinsics.

This should pre-fix radv for the upcoming spirv patches.

v2: actually use wait_cnt, sleep deprived dad time! (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-09 01:25:59 +00:00
Timothy Arceri
87f02ddfd1 amdgpu: use simple mtx
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 12:07:48 +11:00
Timothy Arceri
f0857fe87b mesa: use simple mtx in core mesa
Results from x11perf -copywinwin10 on Eric's SKL:
   4.33338% ± 0.905054% (n=40)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Yogesh Marathe <yogesh.marathe@intel.com>
2017-11-09 12:07:48 +11:00
Timothy Arceri
f98a2768ca mesa: Add new fast mtx_t mutex type for basic use cases
While modern pthread mutexes are very fast, they still incur a call to an
external DSO and overhead of the generality and features of pthread mutexes.
Most mutexes in mesa only needs lock/unlock, and the idea here is that we can
inline the atomic operation and make the fast case just two intructions.
Mutexes are subtle and finicky to implement, so we carefully copy the
implementation from Ulrich Dreppers well-written and well-reviewed paper:

  "Futexes Are Tricky"
  http://www.akkadia.org/drepper/futex.pdf

We implement "mutex3", which gives us a mutex that has no syscalls on
uncontended lock or unlock.  Further, the uncontended case boils down to a
cmpxchg and an untaken branch and the uncontended unlock is just a locked decr
and an untaken branch.  We use __builtin_expect() to indicate that contention
is unlikely so that gcc will put the contention code out of the main code
flow.

A fast mutex only supports lock/unlock, can't be recursive or used with
condition variables.  We keep the pthread mutex implementation around as
for the few places where we use condition variables or recursive locking.
For platforms or compilers where futex and atomics aren't available,
simple_mtx_t falls back to the pthread mutex.

The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound
applications.  Most CPU bound cases are helped and some of our internal
bind_buffer_object heavy benchmarks gain up to 10%.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 12:07:48 +11:00
Timothy Arceri
6a72eba755 mesa: rework how we free gl_shader_program_data
When I introduced gl_shader_program_data one of the intentions was to
fix a bug where a failed linking attempt freed data required by a
currently active program. However I seem to have failed to finish
hooking up the final steps required to have the data hang around.

Here we create a fresh instance of gl_shader_program_data every
time we link. gl_program has a reference to gl_shader_program_data
so it will be freed once the program is no longer active.

Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Neil Roberts <nroberts@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102177
2017-11-09 12:07:48 +11:00
Timothy Arceri
9c33533586 glsl: use the correct parent when allocating program data members
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-09 12:07:48 +11:00
Timothy Arceri
cf05bb506a glsl: drop cache_fallback
This turned out to be a dead end, it is much easier and less error
prone to just cache the IR used by the drivers backend e.g. TGSI or
NIR.

Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-09 12:07:48 +11:00
Kenneth Graunke
a16dc04ad5 i965: properly initialize brw->cs.base.stage to MESA_SHADER_COMPUTE
This has a bit of a surprising effect:

For the render pipeline, the upload_sampler_state_table atom emits
3DSTATE_BINDING_TABLE_POINTERS_XS.  It tries to avoid this for compute:

   if (GEN_GEN >= 7 && stage_state->stage != MESA_SHADER_COMPUTE) {
      /* Emit a 3DSTATE_SAMPLER_STATE_POINTERS_XS packet. */
      genX(emit_sampler_state_pointers_xs)(brw, stage_state);
   } ...

However, we were failing to initialize brw->cs.base.stage, so it was
left as 0 (MESA_SHADER_VERTEX), causing this condition to break.  We
then emitted 3DSTATE_SAMPLER_STATE_POINTERS_VS in GPGPU mode, when
trying to upload CS samplers.  Nothing good can come of this.

Found by inspection while debugging a GPU hang.  Jordan believes this
helps the Deus Ex: Mankind Divided benchmark mode's stability when
running with shader cache.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-11-08 15:26:18 -08:00
Jason Ekstrand
3e63cf893f intel/nir: Break the linking code into a helper in brw_nir.c
Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-08 14:09:51 -08:00
Jason Ekstrand
7364f080f9 intel/nir: Add a helper for getting the NoIndirect mask
Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-08 14:09:49 -08:00
Matt Turner
77a63d190a nir: Don't print swizzles when there are more than 4 components
... as can happen with various types like mat4, or else we'll smash the
stack writing past the end of components_local[].

Fixes: 5a0d3e1129 ("nir: Print the components referenced for split or
                      packed shader in/outs.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-08 13:22:26 -08:00
Dylan Baker
34593e978c meson: Add threads dependencies to glsl_compiler executable
Fixes compiling the optional standalone glsl compiler.

Reported-by: DrNick (on irc)
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-08 11:36:02 -08:00
Andreas Boll
a6932faae1 glsl: Fix typo fragement -> fragment
Fixes: 94d669b0d2 ("glsl: enforce fragment shader input restrictions in
       GLSL ES 3.10")

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 18:30:48 +00:00
Andreas Boll
4f29ed38f3 broadcom/vc5: Remove unused v3d_compiler.c
Unused since original import of VC5.

Fixes: ade416d023 ("broadcom: Add VC5 NIR compiler.")

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 18:30:47 +00:00
Andreas Boll
6e4d65f674 broadcom/vc5: Add vc5_drm.h to the release tarball
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5",
       for BCM7268.")

Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 18:30:45 +00:00
Gert Wollny
6905d005ef clover: use the unified check for c++11 instead of the gcc version number
So far clover based its test for compiler support on the version of gcc,
while in reality support for c++11 is required. This patch replaces the
version check by the check unified for all modules that require c++11.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 16:03:38 +00:00
Gert Wollny
8f18528cea swr: Replace the check for c++11 by the unified version
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 16:03:38 +00:00
Gert Wollny
09ad2576ec configure: check for -std=c++11 support and enable st/mesa test accordingly
Add a check that tests whether the c++ compiler supports c++11, either
by default, by adding the compiler flag -std=c++11, or by adding a
compiler flag that the user has specified via the environment variable
CXX11_CXXFLAGS.

The test only does a very shallow check of c++11 support, i.e. it tests
whether the define  __cplusplus >= 201103L to confirm language support
by the compiler, and it checks whether the header <tuple> is available
to test the availability of the c++11 standard library.

A make file conditional HAVE_STD_CXX11 is provided that is used in this
patch to enable the test in st/mesa if C++11 support is available.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 16:03:34 +00:00
Emil Velikov
6dd56fafe2 configure.ac: append to existing initializer override flags
Currently we were overwriting the existing warning flags, instead of
adding new [as applicable].

Fixes c5d2e2d43f ("configure: Test for -Wno-initializer-overrides")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-08 15:53:00 +00:00
Emil Velikov
63811f3b7c configure.ac: append to existing MSVC compat flags
Currently we were overwriting the existing warning flags, instead of
adding new [as applicable].

v2: Add missing space before -Werror (Eric)

Fixes e4b2b69e82 ("configure: Add and use AX_CHECK_COMPILE_FLAG")
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-08 15:53:00 +00:00
Dylan Baker
8a36f025f4 meson: Allow building glvnd with EGL and non-dri based GLX
Because meson mirrors the auototools logic, it needs the same changes to
allow building glvnd based egl.

v2: - change if to elif (Eric)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 15:53:00 +00:00
Emil Velikov
85a017230c configure.ac: require xcb* for the omx/va/... when using x11 platform
Targets such as omx and va can work w/o anything X related. Mandate the
xcb* dependencies only when the X11 platform is selected.

Reported-by: Lukas Rusak <lorusak@gmail.com>
Fixes: 63e11ac2b5 ("configure: error out if building VA w/o supported
platform")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Lukas Rusak <lorusak@gmail.com> (v1)
2017-11-08 15:53:00 +00:00
Emil Velikov
b4967561c0 configure.ac: loosen --enable-glvnd check to honour egl
Currently we error out when building GLVND w/o GLX.

That was the original premice before we had EGL. As the commit says,
that error should be reworked to honour both - do so.

v2: Drop noop *);; (Eric)

Reported-by: Lukas Rusak <lorusak@gmail.com>
Fixes: ce562f9e3f ("EGL: Implement the libglvnd interface for EGL (v3)")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Lukas Rusak <lorusak@gmail.com> (v1)
2017-11-08 15:52:56 +00:00
Emil Velikov
61e99ce267 egl/android: add a note about .swap_buffers_with_damage
Android implements the API and does the native damage handling itself.
At the same time it
 a) does call the vendor's eglSwapBuffersWithDamageKHR
 b) does not implement eglSetDamageRegionKHR

There's something strange happening here. For now simply note about the
'lack' of eglSwapBuffersWithDamageKHR support.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-08 14:42:12 +00:00
Emil Velikov
c7b65c330f wayland-drm: static inline wayland_drm_buffer_get
The function is effectively a direct function call into
libwayland-server.so.

Thus GBM no longer depends on the wayland-drm static library, making the
build more straight forward. And the resulting binary is a bit smaller.

Note: we need to move struct wayland_drm_callbacks further up,
otherwise we'll get an error since the type is incomplete.

v2: Rebase, beef-up commit message, update meson, move struct
wayland_drm_callbacks.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> # meson bit only
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> # for the rest
Reviewed-by: Dylan Baker <dylan@pnwbakers.com> # meson
2017-11-08 14:40:12 +00:00
Emil Velikov
ba414dba4f automake: intel: correctly append to the LIBADD variable
Commit 05fc62d89f sets the variable, yet it forgot the update the
existing reference to append (instead of assign).

Thus as-is the expat library was discarded from the link chain when
building with Android.

Fixes: 05fc62d89f ("automake: intel: move expat handling where it's
used")
Cc: Hongxu Jia <hongxu.jia@windriver.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-08 14:23:57 +00:00
Emil Velikov
6ef9482b78 configure: enable the OpenCL ICD by default
Nearly all the distributions* that build Mesa OpenCL, enable the ICD.
Since building a non-ICD driver has the chance of conflicting with
existing OpenCL binary (libOpenCL.so).

Furthermore, some applications expect the library to provide
annotated/versioned symbols.

https://lists.freedesktop.org/archives/mesa-dev/2017-September/171093.html

*Fedora, Suse, Arch, Debian, Ubuntu, FreeBSD use the ICD
Gentoo manages the conflicting files via eselect.

Cc: Matt Turner <mattst88@gmail.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-By: Aaron Watry <awatry@gmail.com>
2017-11-08 14:10:33 +00:00
Emil Velikov
0cd0958544 targets/opencl: don't hardcode the icd file install to /etc/...
Use $(sysconfdir) instead of hardcoding /etc.

While the OpenCL spec expects the file in /etc, people building their
stack can override that, esp. !Linux users.

Furthermore this removes a fundamental violation, which results in the
system file being overwritten even as one explicitly sets --prefix
and/or DESTDIR.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-By: Aaron Watry <awatry@gmail.com>
2017-11-08 14:10:07 +00:00
Emil Velikov
01d91b3718 amd: add amdgpu_asic_addr.h to the sources list
Otherwise it will be missing from the release tarball

Fixes: 7f33e94e43 ("amd/addrlib: update to latest version")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 14:07:27 +00:00
Tobias Droste
5d61fa4e68 gallivm: Use new LLVM fast-math-flags API
LLVM 6 changed the API on the fast-math-flags:
https://reviews.llvm.org/rL317488

NOTE: This also enables the new flag 'ApproxFunc' to allow for
approximations for library functions (sin, cos, ...). I'm not completly
convinced, that this is something mesa should do.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-11-08 10:44:19 +01:00
Juan A. Suarez Romero
d5a641106b glsl: add varying resources for arrays of complex types
This patch is mostly a patch done by Ilia Mirkin.

It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.

v2: fix locations for TCS/TES/GS inputs and outputs (Ilia)

CC: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-11-08 10:12:07 +01:00
Timothy Arceri
36be8c2fcf st/glsl_to_nir: use nir_shader_gather_info()
Use the NIR helper rather than the GLSL IR helper to get in/out
masks. This allows us to ignore varyings removed by NIR
optimisations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 17:33:14 +11:00
Timothy Arceri
c980a3aa31 st/glsl_to_nir: generate NIR earlier
We want to use nir_shader_gather_info() the GLSL IR version might
be including varyings that NIR later eliminates. To do this we
need to generate NIR before we we start using the in/out bitmasks.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 17:33:14 +11:00
Timothy Arceri
f6c0504abc st/glsl_to_nir: delay adding built-in uniforms to Parameters list
Delaying adding built-in uniforms until after we convert to NIR
gives us a better chance to optimise them away. Also NIR allows
us to iterate over the uniforms directly so should be faster.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 17:33:14 +11:00
Marek Olšák
7f33e94e43 amd/addrlib: update to latest version
This uses C++11 initializer lists.

I just overwrote all Mesa files with internal addrlib and discarded
hunks that we should probably keep, but I might have missed something.

The code depending on ADDR_AM_BUILD is removed. We can add it back next
time if needed.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 00:55:13 +01:00
Eric Anholt
3bfcd31e98 braodcom/vc5: Flush the job when it grows over 1GB.
Fixes GL_OUT_OF_MEMORY from streaming-texture-leak (and will hopefully
keep piglit from ooming on my no-swap platform, as well).
2017-11-07 12:58:03 -08:00
Eric Anholt
50906e4583 broadcom/vc5: Do 16-bit unpacking of integer texture returns properly.
We were doing f16 unpacks, which trashed "1" values.  Fixes many piglit
texwrap GL_EXT_texture_integer cases.
2017-11-07 12:58:03 -08:00
Eric Anholt
80da60947b broadcom/vc5: Fix pausing of transform feedback.
Gallium disables it by removing the streamout buffers, not by binding a
program that doesn't have TF outputs.  Fixes piglit
"ext_transform_feedback2/counting with pause"
2017-11-07 12:58:00 -08:00
Eric Anholt
25d199f67d broadcom/vc5: Add support for GL_RASTERIZER_DISCARD
Fixes piglit discard-drawarrays.
2017-11-07 12:57:49 -08:00
Eric Anholt
dfff9ce45e broadcom/vc5: Fix scheduling for a non-SFU R4 write after a dead R4 write.
The v3d_qpu_writes_r*() were only checking for fixed-function accumulator
writes, not normal ALU writes to those regs.

Fixes fs-discard-exit-2 on simulation (but not HW).
2017-11-07 12:57:49 -08:00
Eric Anholt
9ccb6621be broadcom/vc5: Add partial transform feedback query support.
We have to compute the queries in software, so we're counting the
primitives by hand.  We still need to make sure to not increment the
PRIMITIVES_EMITTED if we overflowed, but leave that for later.
2017-11-07 12:57:43 -08:00
Eric Anholt
4f33344e7a broadcom/vc5: Add occlusion query support.
Fixes all of piglit's OQ tests.
2017-11-07 12:56:40 -08:00
Jason Ekstrand
d002950e54 intel/fs/nir: Return Q types from brw_reg_type_for_bit_size
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand
dee58ecd2e intel/fs/nir: Use Q immediates for load_const on gen8+
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand
9bb34892bf intel/fs/nir: Setup immediates based on type in i2b and f2b
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand
1cb210f4bc intel/reg: Add helpers for 64-bit integer immediates
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand
df81b81fb9 compiler/nir_types: Handle vectors in glsl_get_array_element
Most of NIR doesn't allow doing array indexing on a vector (though it
does on a matrix).  However, nir_lower_io handles it just fine and this
behavior is needed for shared variables in Vulkan.  This commit makes
glsl_get_array_element do something sensible for vector types and makes
nir_validate happy with them.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand
ad77775809 nir: Validate base types on array dereferences
We were already validating that the parent type goes along with the
child type but we weren't actually validating that the parent type is
reasonable.  This fixes that.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand
ab9220edd6 nir,intel/compiler: Use a fixed subgroup size
The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared
as a uniform.  This means that it cannot change across an invocation
such as a draw call or a compute dispatch.  For compute shaders, we're
ok because we only ever use one dispatch size.  For fragment, however,
the hardware dynamically chooses between SIMD8 and SIMD16 which violates
the spec.  Instead, let's just pick a subgroup size based on the shader
stage.  The fixed size we choose for compute shaders is a bit higher
than strictly needed but there's no real harm in that.  The advantage is
that, if they do anything interesting with the value, NIR will see it as
an immediate and can optimize better.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
a026458020 nir/lower_subgroups: Lower ballot intrinsics to the specified bit size
Ballot intrinsics return a bitfield of subgroups.  In GLSL and some
SPIR-V extensions, they return a uint64_t.  In SPV_KHR_shader_ballot,
they return a uvec4.  Also, some back-ends would rather pass around
32-bit values because it's easier than messing with 64-bit all the time.
To solve this mess, we make nir_lower_subgroups take a new parameter
called ballot_bit_size and it lowers whichever thing it gets in from the
source language (uint64_t or uvec4) to a scalar with the specified
number of bits.  This replaces a chunk of the old lowering code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
8c2bf020fd nir/builder: Add a nir_imm_intN_t helper
This lets you easily build integer immediates of arbitrary bit size.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
9b35faba42 nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
The SUBGROUP_*_MASK system values are uint64_t when coming in from GLSL
but uvec4 when coming in from SPIR-V.  Lowering based on type allows us
to nicely handle both.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
3ee91ee6ac nir: Make ballot intrinsics variable-size
This way they can return either a uvec4 or a uint64_t.  At the moment,
this is a no-op since we still always return a uint64_t.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
ad127afcfd nir: Add a ssa_dest_init_for_type helper
This would be useful a number of places

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
28da82f978 nir: Add a new subgroups lowering pass
This commit pulls nir_lower_read_invocations_to_scalar along with most
of the guts of nir_opt_intrinsics (which mostly does subgroup lowering)
into a new nir_lower_subgroups pass.  There are various other bits of
subgroup lowering that we're going to want to do so it makes a bit more
sense to keep it all together in one pass.  We also move it in i965 to
happen after nir_lower_system_values to ensure that because we want to
handle the subgroup mask system value intrinsics here.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
1ca3a94427 intel/fs: Don't use automatic exec size inference
The automatic exec size inference can accidentally mess things up if
we're not careful.  For instance, if we have

add(4)    g38.2<4>D    g38.1<8,2,4>D    g38.2<8,2,4>D

then the destination register will end up having a width of 2 with a
horizontal stride of 4 and a vertical stride of 8.  The EU emit code
sees the width of 2 and decides that we really wanted an exec size of 2
which doesn't do what we wanted.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
dc4cf11dfc intel/fs: Explicitly set EXECUTE_1 where needed
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
ab378734f5 intel/eu: Explicitly set EXECUTE_1 where needed
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
8280560705 intel/eu: Make automatic exec sizes a configurable option
We have had a feature in codegen for some time that tries to
automatically infer the execution size of an instruction from the width
of its destination.  For things such as fixed function GS, clipper, and
SF programs, this is very useful because they tend to have lots of
hand-rolled register setup and trying to specify the exec size all the
time would be prohibitive.  For things that come from a higher-level IR,
however, it's easier to just set the right size all the time and the
automatic exec sizes can, in fact, cause problems.  This commit makes it
optional while enabling it by default.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
7a82ad54bb intel/fs: Rework zero-length URB write handling
Originally we tried to handle this case based on slots_valid.  However,
there are a number of ways that this can go wrong.  For one, we throw
away any trailing slots which either aren't written or are set to
VARYING_SLOT_PAD.  Second, even if PSIZ is a valid slot, we may not
actually write anything there.  Between the lot of these, it was
possible to end up in a case where we tried to do a regular URB write
but ended up with a length of 1 which is invalid.  This commit moves it
to the end and makes it based on a new boolean flag urb_written.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
6132992cdb intel/compiler/fs: Set up subgroup invocation as a system value
Subgroup invocation is computed using a vector immediate and some
dispatch-aware arithmetic.  Unfortunately, due to the vector arithmetic,
and the fact that it's frequently read 16-wide, it's not something that
can easily be CSEd by the back-end compiler.  There are a few different
possible approaches to this problem:

 1) Emit the code to calculate the subgroup invocation on-the-fly and
    trust NIR to do the CSE.  This is what we were doing.

 2) Add a back-end instruction for the subgroup ID.  This has the
    advantage of helping the back-end compiler with CSE but has the
    downside of very poor scheduling for the calculation because it has
    to be emitted in the back-end.

 3) Emit the calculation at the top of the program and re-use the
    result.  This gets rid of the CSE problem but comes at the cost of
    an extra live register.

This commit switches us from 1) to 3).  We choose to store the subgroup
invocation values as a W type to reduce the impact of the extra live
register.  Trusting NIR and using 1) was fine but we're soon going to
want to use the subgroup invocation value for other things in the
back-end compiler and this makes it much easier to do without having to
worry about CSE problems.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
295605c930 intel/cs: Push subgroup ID instead of base thread ID
We're going to want subgroup ID for SPIR-V subgroups eventually anyway.
We really only want to push one and calculate the other from it.  It
makes a bit more sense to push the subgroup ID because it's simpler to
calculate and because it's a real API thing.  The only advantage to
pushing the base thread ID is to avoid a single SHL in the shader.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
6411defdcd intel/cs: Re-run final NIR optimizations for each SIMD size
With the advent of SPIR-V subgroup operations, compute shaders will have
to be slightly different depending on the SIMD size at which they
execute.  In order to allow us to do dispatch-width specific things in
NIR, we re-run the final NIR stages for each sIMD width.

One side-effect of this change is that we start rallocing fs_visitors
which means we need DECLARE_RALLOC_CXX_OPERATORS.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
4e79a77cdc intel/compiler: Move the destructor from vec4_visitor to backend_shader
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
16ada419d7 i965/fs: Get rid of the early return in brw_compile_cs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
80ddfab2f5 intel/cs: Rework the way thread local ID is handled
Previously, brw_nir_lower_intrinsics added the param and then emitted a
load_uniform intrinsic to load it directly.  This commit switches things
over to use a specific NIR intrinsic for the thread id.  The one thing I
don't like about this approach is that we have to copy thread_local_id
over to the new visitor in import_uniforms.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
25f7453c9e intel/fs: Mark 64-bit values as being contiguous
This isn't often a problem , when we're in a compute shader, we must
push the thread local ID so we decrement the amount of available push
space by 1 and it's no longer even and 64-bit data can, in theory, span
it.  By marking those uniforms contiguous, we ensure that they never get
split in half between push and pull constants.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
c4c8cba705 intel/cs: Ignore runtime_check_aads_emit for CS
It's only set on gen4-5 which clearly don't support compute shaders.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
d4de813d86 intel/cs: Stop setting dispatch_grf_start_reg
Nothing ever reads it for compute shaders because it's always 1.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
b1a9cdede4 intel/cs: Drop max_dispatch_width checks from compile_cs
The only things that adjust fs_visitor::max_dispatch_width are render
target writes which don't happen in compute shaders so they're
pointless.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
1077981eb5 intel/fs: Remove min_dispatch_width from fs_visitor
It's 8 for everything except compute shaders.  For compute shaders,
there's no need to duplicate the computation and it's just a possible
source of error.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
b299ded02e intel/fs: use pull constant locations to check for first compile of a shader
Before, we bailing in assign_constant_locations based on the minimum
dispatch size.  The more direct thing to do is simply to check for
whether or not we have constant locations and bail if we do.  For
nir_setup_uniforms, it's completely safe to do it multiple times because
we just copy a value from the NIR shader.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
103081c9a9 intel/fs: Retype dest to match value in read[First]Invocation
This is what we really wanted all along.  Always retyping to D works
because that's what get_nir_src() always gives us, at least for 32-bit
types.  The SPIR-V variants of these operations accept arbitrary types
and we need this if we're going to handle 64 or 16-bit values.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
ebaee9da4a intel/fs: Uniformize the index in readInvocation
The index is any value provided by the shader and this can be called in
non-uniform control flow so we can't just take component 0.  Found by
inspection.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
b67230de63 intel/fs: Protect opt_algebraic from OOB BROADCAST indices
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
aa4ff4b98c i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
ec8c6649f1 i965/fs/nir: Minor refactor of store_output
Stop retyping the output of shuffle_64bit_data_for_32bit_write.  It's
always BRW_REGISTER_TYPE_D which is perfectly fine for writing out.
Also, when we change get_nir_src to return something with a 64-bit type
for 64-bit values, the retyping will not be at all what we want.  Also,
retyping the output based on src.type before we whack it back to 32 bits
is a problem because the output is always 32 bits.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
030d2b5016 i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
All callers of this function allocate a fs_reg expressly to pass into
it.  It's much easier if we just let the helper allocate the register.
While we're here, we switch it to doing the MOVs with an integer type so
that we don't accidentally canonicalize floats on half of a double.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
6197a6b7ac i965/fs/nir: Simplify 64-bit store_output
The swizzles weren't doing any good because swiz is just XYZW.  Also, we
were emitting an extra set of MOVs because shuffle_64bit_data_for_32bit
already does a MOV for us.  Finally, the temporary was only ever used
inside the inner loop so there's no need for it to actually be an array.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
18fde36ced intel/fs: Use the original destination region for int MUL lowering
Some hardware (CHV, BXT) have special restrictions on register regions
when doing integer multiplication.  We want to respect those when we
lower to DxW multiplication.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
d54f8ec744 intel/fs: Fix integer multiplication lowering for src/dst hazards
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
fd1bcccc2d intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
The same workaround we need for 64-bit values on little core also takes
care of the Ivy Bridge problem and does so a bit more efficiently so we
can drop that code while we're here.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
6041a31e77 intel/eu: Fix broadcast instruction for 64-bit values on little-core
We're not using broadcast for any 32-bit types right now since we mostly
use it for emit_uniformize on 32-bit buffer indices.  However, SPIR-V
subgroups are going to need it for 64-bit so let's make it work.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
10e4feed39 intel/eu/reg: Add a subscript() helper
This is similar to the identically named fs_reg helper.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
068beb41d8 intel/eu: Just modify the offset in brw_broadcast
This means we have to drop const from a variable but it also means that
100% of the code which deals with the offset limit is in one place.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
e3bcc86133 intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCAST
These restrictions effectively already existed due to the way we use
indirect sources but weren't being directly enforced.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
1b8ef49f48 intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
For some reason, the any/all predicates don't work properly with SIMD32.
In particular, it appears that a SEL with a QtrCtrl of 2H doesn't read
the correct subset of the flag register and you end up getting garbage
in the second half.  Work around this by using a pair of 1-wide MOVs and
scattering the result.  This fixes the any/all instructions for SIMD32.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
1f41663007 intel/fs: Use an explicit D type for vote any/all/eq intrinsics
The any/all intrinsics return a boolean value so D or UD is the correct
type.  Unfortunately, get_nir_dest has the annoying behavior of
returnning a float type by default.  This causes format conversion which
gives us -1.0f or 0.0f in the register.  If the consumer of the result
does an integer comparison to zero, it will give you the right boolean
value but if we do something more clever based on the 0/~0 assumption
for booleans, this will give the wrong value.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
6c00240bc6 intel/fs: Don't stomp f0.1 in SIMD16 ballot
In fragment shaders f0.1 is used for discards so doing ballot after a
discard can potentially cause the discard to not happen.  However, we
don't support SIMD32 fragment shaders yet so this isn't a problem.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
def013a863 intel/fs: Use ANY/ALL32 predicates in SIMD32
We have ANY/ALL32 predicates and, for the most part, they work just
fine.  (See the next commit for more details.)  Also, due to the way
that flag registers are handled in hardware, instruction splitting is
able to split the CMP correctly.  Specifically, that hardware looks at
the execution group and knows to shift it's flag usage up correctly so a
2H instruction will write to f0.1 instead of f0.0.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
0d905597fe intel/fs: Be more explicit about our placement of [un]zip
Before, we were careful to place the zip after the last of the split
instructions but did unzip on-demand.  This changes things so that the
unzips go before all of the split instructions and the unzip comes
explicitly after all the split instructions.  As a side-effect of this
change, we now emit the split instruction from highest SIMD group to
lowest instead of low to high.  We could have kept the old behavior, but
it shouldn't matter and this made the code easier.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
fcd4adb9d0 intel/fs: Pass builders instead of blocks into emit_[un]zip
This makes it far more explicit where we're inserting the instructions
rather than the magic "before and after" stuff that the emit_[un]zip
helpers did based on block and inst.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
e8c9e65185 intel/fs: Use a pure vertical stride for large register strides
Register strides higher than 4 are uncommon but they can happen.  For
instance, if you have a 64-bit extract_u8 operation, we turn that into
UB -> UQ MOV with a source stride of 8.  Our previous calculation would
try to generate a stride of <32;8,8>:ub which is invalid because the
maximum horizontal stride is 4.  To solve this problem, we instead use a
stride of <8;1,0>.  As noted in the comment, this does not work as a
destination but that's ok as very few things actually generate that
stride.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Eric Anholt
bd24f4890f broadcom/vc5: Skip emitting textures that aren't used.
Fixes crashes when ARB_fp uses texture[1] but not 0, as in piglit's
fp-fragment-position.
2017-11-07 09:40:25 -08:00
Eric Anholt
3d5e62dcfa broadcom/vc5: Add missing SRGBA8 ETC2 support.
Fixes piglit oes_compressed_etc2_texture-miptree srgb8-alpha8.
2017-11-07 09:40:25 -08:00
Eric Anholt
6079f7c3c3 broadcom/vc5: Disable early Z test when the FS writes Z.
Fixes piglit early-z.
2017-11-07 09:40:25 -08:00
Eric Anholt
eeb9e80272 broadcom/vc5: Shift the min/max lod fields by the BASE_LEVEL.
The lod clamping is what limits you between base and last level, and the
base level field is just there to help decide where the min/mag change
happens.

Fixes tex-miplevel-selection GL2:texture()
2017-11-07 09:40:25 -08:00
Eric Anholt
521e1d0275 broadcom/vc5: Add support for anisotropic filtering. 2017-11-07 09:40:25 -08:00
Eric Anholt
a266f78741 broadcom/vc5: Fix mipmap filtering enums.
The ordering of the values was even less obvious than I thought, with both
the mip filter and the min filter being in different bits depending on
whether the mip filter is none.

Fixes piglit fs-textureLod-miplevels.shader_test
2017-11-07 09:40:25 -08:00
Eric Anholt
73ec70bf13 broadcom/vc5: Fix height padding of small UIF slices.
The HW doesn't pad the slice's height to make a full 4x4 group of UIF
blocks.  We just need to pad to columns, and the start of the next column
appears in the bottom of the previous column's last block.

Fixes piglit fs-textureOffset-2D.
2017-11-07 09:40:24 -08:00
Eric Anholt
e23c6991be broadcom/vc5: Print the actual offsets in HW for our resource layout debug.
The alignment of level 0 is non-obvious, so it's hard to turn a faulting
address into a slice without this.
2017-11-07 09:40:24 -08:00
Eric Anholt
426c352336 broadcom/vc5: Set the available VS outputs to match the FS inputs.
Fixes piglit glsl-es-3.00/minimum-maximums.txt.
2017-11-07 09:40:24 -08:00
Eric Anholt
f1797928fd broadcom/vc5: Set the max texture LOD bias.
The field is signed 8.8, so the usual 16.0f fits.  Fixes piglit
gl-2.1-minmax.
2017-11-07 09:40:24 -08:00
Eric Anholt
47bd9dac19 broadcom/vc5: Fix translation of stencil ops.
They aren't quite in the same order as the gallium defines.  Fixes piglit
gl-2.0-two-sided-stencil.
2017-11-07 09:40:24 -08:00
Eric Anholt
3be820477f broadcom/vc5: Move stencil state packing to the CSO.
Only the stencil ref comes in as dynamic state at emit time.
2017-11-07 09:19:48 -08:00
Eric Anholt
3da39f2297 broadcom/vc5: Introduce a helper for pre-packing our V3DXX structs.
This is so much more pleasant to write than the manual
V3D33_whatever_pack() calls, and will be useful for when we start doing
actual per-V3D compiles.
2017-11-07 09:19:48 -08:00
Eric Anholt
078b163a9c broadcom/vc5: Add a cl_emit() variant for merging with a pre-packed struct.
Cleans up the hand-written code, at the cost of another ugly macro.
2017-11-07 09:19:48 -08:00
Eric Anholt
735b844b1b broadcom/vc5: Skip emitting depth offset while disabled.
The enable flag is also in the rasterizer state, so it will be emitted
once it's needed.
2017-11-07 09:19:48 -08:00
Eric Anholt
386e9362a5 broadcom/vc5: Don't emit stencil config if not doing stencil test.
As with blending, we'll have the bit flagged again when it gets reenabled
in CONFIGURATION_BITS, so there's no need to emit test state if we're not
testing.
2017-11-07 09:19:48 -08:00
Eric Anholt
f90ee6eb2b broadcom/vc5: Don't emit updated blend factors/funcs while disabled.
The dirty bit will be flagged again when re-enbaled.  Keeps us from
emitting blend state in CLs that never do blending.
2017-11-07 09:19:48 -08:00
Eric Anholt
dd429cb2db broadcom/vc5: Fix missing enum decode for indexed primitives. 2017-11-07 09:19:48 -08:00
Eric Anholt
bb6997e6a3 broadcom/vc5: Drop padding bits from the bottom of the TSDA address.
Fixes misaligned-looking addresses in decode.
2017-11-07 09:19:48 -08:00
Eric Anholt
949ac638bc broadcom/vc5: Make sure the TMU indirect struct is appropriately aligned.
I was hoping that this would help with fbo-generatemipmap hangs, but no
luck.
2017-11-07 09:19:48 -08:00
Kenneth Graunke
cb47de4ff0 broadcom/genxml: Fix decoding of groups with small fields.
Groups containing fields smaller than a byte probably not being decoded
correctly.  For example:

    <group count="32" start="32" size="4">
      <field name="Vertex Element Enables" start="0" end="3" type="uint"/>
    </group>

gen_field_iterator_next would properly walk over each element of the
array, incrementing group_iter.  However, the code to print the actual
values only considered iter->field->start/end, which are 0 and 3 in the
above example.  So it would always fetch bits 3:0 of the current byte,
printing the same value over and over.

Cc: Eric Anholt <eric@anholt.net>
2017-11-07 09:19:48 -08:00
Eric Anholt
47dac5d2bc broadcom/vc5: Use DEPTH24_STENCIL8 for rendering to depth-only textures.
The HW puts the pad bits at the top for DEPTH_COMPONENT24, but we need it
at the bottom for texturing.  Using the format with stencil probably means
we won't be able to do Z24 and separate S8, but I wasn't planning on
supporting that anyway.

Fixes hiz-depth-read-fbo-d24-s0
2017-11-07 09:19:48 -08:00
Chad Versace
3ea37d0a2a anv: Suffix anv-private 'VK' tokens with 'ANV'
I saw VK_IMAGE_ASPECT_ANY_COLOR_BIT while hacking anv_formats.c and got
confused. "Huh? What extension added that?". No extension defines it;
anv_private.h defines it.

To remove confusion, rename the anv-private VK tokens as if they were
extension tokens with the ANV vendor suffix.

I found only two such tokens:

    VK_IMAGE_ASPECT_ANY_COLOR_BIT
    VK_IMAGE_ASPECT_PLANES_BITS

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-07 09:06:41 -08:00
Chad Versace
012b54c6b1 anv: Remove unused variable 'gen'
In anv_physical_device_get_format_properties().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-07 09:06:30 -08:00
Marek Olšák
33000e7c43 radeonsi: add si_screen::has_ls_vgpr_init_bug
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:40 +01:00
Marek Olšák
cde664ab81 radeonsi: use ac_create_target_machine
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:38 +01:00
Marek Olšák
81f81fdb54 radeonsi: use ac_get_llvm_processor_name
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:36 +01:00
Marek Olšák
c29f5fe41c radeonsi/gfx9: don't set gs_table_depth
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:33 +01:00
Marek Olšák
e616743dab radeonsi/gfx9: limit the scissor bug workaround to Vega10 and Raven only
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:26:36 +01:00
Marek Olšák
24e9004708 radeonsi: remove unused field in the PCI ID table
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-07 17:26:36 +01:00
Miklós Máté
cf47dfe8f1 mesa: fix deleting the dummy ATI_fs
The DummyShader is used by GenFragmentShadersATI() as a placeholder to
mark IDs as allocated. Context cleanup wants to delete everything in
ctx->Shared->ATIShaders, and crashes on these placeholders with this
backtrace:
==15060== Invalid free() / delete / delete[] / realloc()
==15060==    at 0x482F478: free (vg_replace_malloc.c:530)
==15060==    by 0x57694F4: _mesa_delete_ati_fragment_shader (atifragshader.c:68)
==15060==    by 0x58B33AB: delete_fragshader_cb (shared.c:208)
==15060==    by 0x5838836: _mesa_HashDeleteAll (hash.c:295)
==15060==    by 0x58B365F: free_shared_state (shared.c:377)
==15060==    by 0x58B3BC2: _mesa_reference_shared_state (shared.c:469)
==15060==    by 0x578687F: _mesa_free_context_data (context.c:1366)
==15060==    by 0x595E9EC: st_destroy_context (st_context.c:642)
==15060==    by 0x5987057: st_context_destroy (st_manager.c:772)
==15060==    by 0x5B018B6: dri_destroy_context (dri_context.c:217)
==15060==    by 0x5B006D3: driDestroyContext (dri_util.c:511)
==15060==    by 0x4A1CBE6: dri3_destroy_context (dri3_glx.c:170)
==15060==  Address 0x7b5dae0 is 0 bytes inside data symbol "DummyShader"

Also, DeleteFragmentShadersATI() should not assert on DummyShader, just
remove the hash entry.

Normally one would define a shader after GenFragmentShadersATI(), and
BindFragmentShaderATI() replaces the placeholder with a real object.
However, the specification doesn't say that one has to define a shader
for each allocated ID.

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-11-07 17:26:36 +01:00
Michel Dänzer
cd3b55ad07 gallium: Guard assertions by NDEBUG instead of DEBUG
This matches the standard assert.h header.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-07 16:47:15 +01:00
Eric Engestrom
1e6f9ea212 meson: only turn on Mesa's DEBUG for buildtype==debug
As discussed in this thread:
https://lists.freedesktop.org/archives/mesa-dev/2017-November/175104.html

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Chad Versace <chadversary@chromium.org>
2017-11-07 11:01:32 +00:00
Eric Engestrom
d5597f09c6 meson: switch default build type to debugoptimized
As discussed in this thread:
https://lists.freedesktop.org/archives/mesa-dev/2017-November/175104.html

Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Christian Schmidbauer <ch.schmidbauer@gmail.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
2017-11-07 11:00:03 +00:00
Eric Engestrom
cc15460e18 meson: drop GLESv1 .so version back to 1.0.0
autotools generates libGLESv1_CM.so.1.0.0, so let's make sure meson
does the same.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-07 10:47:20 +00:00
Eric Engestrom
5be1b1a8ce meson: standardize .so version to major.minor.patch
This `version` field defines the filename for the .so.
The plan .so as well as .so.$major are always symlinks to this.

Unless I'm mistaken, only the major is ever used, so this shouldn't
matter, but for consistency with autotools (and in case it does matter),
let's always have all 3 major.minor.patch components.

(The soname isn't affected, and is always .so.$major)

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-07 10:47:20 +00:00
Dave Airlie
0084f4a422 ac/nir: for ubo load use correct num_components
I was hacking something stupid in doom, and hit an assert for the bitcast
following this, it definitely looks like this should be the number of 32-bit
components, not the instr level ones.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-07 14:54:19 +10:00
Gwan-gyeong Mun
fb87c40a58 nir: fix a typo
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-06 18:11:24 -08:00
Tomasz Figa
0886be093f glsl: Allow precision mismatch on dead data with GLSL ES 1.00
Commit 259fc50545 added linker error for
mismatching uniform precision, as required by GLES 3.0 specification and
conformance test-suite.

Several Android applications, including Forge of Empires, have shaders
which violate this rule, on a dead varying that will be eliminated.
The problem affects a big number of applications using Cocos2D engine
and other GLES implementations accept this, this poses a serious
application compatibility issue.

Starting from GLSL ES 3.0, declarations with conflicting precision
qualifiers are explicitly prohibited. However GLSL ES 1.00 does not
clearly specify the behavior, except that

  "Uniforms are defined to behave as if they are using the same storage in
  the vertex and fragment processors and may be implemented this way.
  If uniforms are used in both the vertex and fragment shaders, developers
  should be warned if the precisions are different. Conversion of
  precision should never be implicit."

The word "used" is not clear in this context and might refer to
 1) declared (same as GLES 3.x)
 2) referred after post-processing, or
 3) linked after all optimizations are done.

Looking at existing applications, 2) or 3) seems to be widely adopted.
To avoid compatibility issues, turn the error into a warning if GLSL ES
version is lower than 3.0 and the data is dead in at least one of the
shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-06 15:16:03 -08:00
Timothy Arceri
a9000cb860 i965: disable NIR linking on HSW and below
Fixes: 379b24a40d "i965: make use of nir linking"

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103537
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 09:14:05 +11:00
Dave Airlie
201b3b8d0d radv: move is_local up to the winsys level.
We can avoid adding the buffer in the non-local case, this will
avoid all the overhead of the indirect call.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:45:59 +00:00
Dave Airlie
25660499b6 radv: wrap cs_add_buffer in an inline. (v2)
The next patch will try and avoid calling the indirect function.

v2: add a missing conversion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:45:59 +00:00
Dave Airlie
31b5da7958 radv: when loading regs no need to add buffer
The function that calls us has just added the buffer to the
list already, no need to try and add it again.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:44:49 +00:00
Dave Airlie
3bf8be41b8 radv: pre-calculate user_data_0 registers and store in pipeline
There's no point recalculating these the whole time on descriptor
emission, just store them at pipeline creation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:44:49 +00:00
Adam Jackson
d547e18184 docs: Mark GLX_ARB_context_flush_control done
Requires an unreleased X server, but from the client GLX side this is as
done as it gets.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-06 16:21:57 -05:00
Neil Roberts
6ce9006d76 i965: Enable flush control
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
2017-11-06 16:09:03 -05:00
Adam Jackson
791d06b23b drisw: Enable flush control for llvmpipe and softpipe
Hilariously this is a fairly big win.  Neil's multi-context-test
improves from ~24 to ~36 fps with llvmpipe on a Core i5-3317U.  softpipe
also improves, from about 2.25 to 3.09 fps (when it's that slow, you're
allowed to be that precise).

I'd have added it to swrast classic, but the testcase wants GL 3.0 and
shaders, and that's not a thing classic has, so I figured making it work
on softpipe was crime enough.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-06 16:09:03 -05:00
Adam Jackson
5cc06bec19 gallium: Wire up flush control
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-06 16:09:03 -05:00
Adam Jackson
c0be3aae6c egl: Implement EGL_KHR_context_flush_control
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-06 16:09:03 -05:00
Neil Roberts
ba7679f48d glx: Implement GLX_ARB_context_flush_control
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
2017-11-06 16:09:02 -05:00
Neil Roberts
b89067c84f dri: Add a flush control extension
This advertises that the driver can accept a new context attribute
__DRI_CTX_ATTRIB_RELEASE_BEHAVIOR.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
2017-11-06 16:09:02 -05:00
Neil Roberts
6d87500fe1 dri: Change __DriverApiRec::CreateContext to take a struct for attribs
Previously the CreateContext method of __DriverApiRec took a set of
arguments to describe the attribute values from the window system API's
CreateContextAttribs function. As more attributes get added this could
quickly get unworkable and every new attribute needs a modification for
every driver.

To fix that, pass the attribute values in a struct instead.  The struct
has a bitmask to specify which members are used. The first three members
(two for the GL version and one for the flags) are always set. If the
bit is not set in the attribute mask then it can be assumed the
attribute has the default value. Drivers will error if unknown bits in
the mask are set.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
2017-11-06 16:09:02 -05:00
Neil Roberts
8c0729fd99 intel: Don't flush the old context in intelMakeCurrent
It shouldn't be necessary to flush the context within the driver
implementation because the old context is explicitly flushed in
_mesa_make_current which is called a little further on. It is useful to
only have a single place that flushes when switching contexts to make it
easier to later implement the GL_KHR_context_flush_control extension.

The flush in intelMakeCurrent was added in commit 5505865 to implement
the GLX semantics that the context should be flushed when it is
released.  When the commit was made there was no flush in
_mesa_make_current because it was only added later in 93102b4c. I think
that later commit effectively makes the first commit redundant.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
2017-11-06 16:08:58 -05:00
Adam Jackson
9ef7158a09 egl/dri2: Factor out context attribute initialization
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-06 16:08:58 -05:00
Wladimir J. van der Laan
96463614a3 etnaviv: Don't over-pad compressed textures
HALIGN_FOUR/SIXTEEN has no meaning for compressed textures, and we can't
render to them anyway. So use the tightest possible packing. This
avoids bugs with non-power-of-two block sizes.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-06 21:31:20 +01:00
Wladimir J. van der Laan
93ba3f29bb etnaviv: ASTC texture support
Add ASTC texture support for hardware that supports this
(currently only GC3000 on i.MX6qp is known to have this).

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-06 21:30:54 +01:00
Wladimir J. van der Laan
f1e1c60ff6 etnaviv: Update from rnndb
Updated as of etnav_viv commit 3b4a8ec.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-06 21:29:19 +01:00
Dave Airlie
4bcb48b831 radv: add initial copy descriptor support. (v2)
It appears the latest dota2 vulkan uses this,
and we get a hang in VR mode without it.

v2: remove finishme I left in after finishing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 19:12:39 +00:00
Marek Olšák
71f5fe36b7 gallium/u_vbuf: use signed vertex buffers offsets for optimal uploads
Uploaded data must start at (stride * start), because we can't modify
start in all cases. If it's the first allocation, it's also the amount
of memory wasted. If the starting offset is larger than the size of
the upload buffer, the buffer is re-created, used for 1 upload, and then
thrown away. If the upload is small, most of the buffer space is unused
and wasted. Keep doing that and the OOM killer comes. It's actually
pretty quick.

With signed VB offsets, we can set min_out_offset = 0
in u_upload_alloc/u_upload_data.

This fixes OOM situations with SPECviewperf.
2017-11-06 19:09:12 +01:00
Marek Olšák
3f58988b81 radeonsi: enable signed vertex buffer offsets 2017-11-06 19:09:12 +01:00
Marek Olšák
24d6318d24 gallium: add PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET 2017-11-06 19:09:12 +01:00
Juan A. Suarez Romero
e17e8934f9 automake: include git_sha1.h.in in release tarball
Fixes:

make[2]: Leaving directory '/home/local/mesa/mesa-17.4.0-devel/_build/sub/src'
make[2]: *** No rule to make target '../../../src/git_sha1.h.in', needed by 'git_sha1.h'.  Stop.
Makefile:660: recipe for target 'all-recursive' failed

Fixes: 16be271c6e "git_sha1_gen: use git_sha1.h.in on all build systems"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-11-06 18:18:42 +01:00
Marek Olšák
adab7f16ff radeonsi: don't map big VRAM buffers for the first upload directly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-06 16:23:20 +01:00
Marek Olšák
4b0dc098b2 gallium/u_threaded: don't map big VRAM buffers for the first upload directly
This improves Paraview "many spheres" performance 4x along with the radeonsi
commit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-06 16:23:20 +01:00
Marek Olšák
a5d3999c31 gallium/u_threaded: clean up tc_improve_map_buffer_flags and prevent reentry
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-06 16:23:20 +01:00
Dave Airlie
60a9705e00 radv: move descriptor sets out of cmd_state.
Instead of storing all the pointers and zeroing them all out,
just store a valid bitmask in the state. This also moves
the CmdBindPipeline path down the cpu usage path for the
multithreading demo as it no longer has to traverse MAX_SETS
to find the active descriptor sets.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:11:03 +00:00
Dave Airlie
3a0d098252 radv: add helper for setting a descriptor.
This is just a simple refactor.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:11:00 +00:00
Dave Airlie
b48063a2f2 radv: move vertex binding out of cmd state.
This isn't required to be cleared, since buffers are only linked
by vertex elements, so if elements are clear then no buffers
should be referenced.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:10:56 +00:00
Dave Airlie
7365626d78 radv: reorder cmd_state to remove a hole.
This just removes a hole in the cmd_state and packs some bools
together.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:10:53 +00:00
Dave Airlie
f0ae06a13c radv: free attachments on end command buffer.
If we allocate attachments in the begin command buffer due to the
render pass continue bit, we were leaking them.

Since renderpasses inside a cmd buffer malloc/free these properly,
and set to NULL, we just need to call free at end.

Fixes a memory leak with multithreading demo.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:03:47 +00:00
Bas Nieuwenhuizen
608af05ffb radv: Optimize calling radv_save_descriptors.
uint32_t data[MAX_SETS * 2] = {}; was getting executed before
the exit and took significant amounts of time. By having the
check outside the function, we skip the execution of the clear.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-04 20:18:17 +01:00
Bas Nieuwenhuizen
cecbcf4b2d radv: Use an array to store descriptor sets.
The vram_list linked list resulted in lots of pointer chasing.
Replacing this with an array instead improves descriptor set
allocation CPU usage by 3x at least (when also considering the free),
because it had to iterate through 300-400 sets on average.

Not a huge improvement as the pre-improvement CPU usage was only
about 2.3% in the busiest thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-04 20:18:17 +01:00
Pierre Moreau
b041687ed1 nv50,nvc0: Display shared memory usage in pipe_debug_message
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
2017-11-04 14:12:07 -04:00
Pierre Moreau
efe532b739 nv50,nvc0: Copy shared memory per block to the program info structure and back
In OpenCL/CUDA kernels, shared memory usage can be defined within the
kernel code. Those usage will only be picked up while parsing the
SPIR-V, during the translation phase of the program.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
2017-11-04 14:12:07 -04:00
Pierre Moreau
49752e99f8 nv50/ir: Store shared memory per block in nv50_ir_prog_info
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
2017-11-04 14:12:07 -04:00
Anuj Phogat
898e5555de i965/gen10: Implement Wa3DStateMode
This workaround doesn't fix any of the piglit hangs we've seen
on CNL. But it might be fixing something we haven't tested yet.

V2: Remove the bits enabling Float blend optimization. It is
    enabled through CACHE_MODE_SS register.
    Update the comment.
    Move gen10 if block on top of gen9 if block.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-03 14:30:34 -07:00
Anuj Phogat
6c681b4cc1 i965/gen10: Enable float blend optimization
This optimization is enabled for previous generations too.
See Mesa commit c17e214a6b
On CNL this bit has been moved to CACHE_MODE_SS register.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-03 14:30:34 -07:00
Anuj Phogat
d3d0fe4572 i965/gen10: Implement WaForceRCPFEHangWorkaround
This workaround doesn't fix any of the piglit hangs we've seen
on CNL. But it might be fixing something we haven't tested yet.

V2: Add the check for Post Sync Operation.
    Update the workaround comment.
    Use braces around if-else.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-03 14:30:34 -07:00
Anuj Phogat
3cf4fe2219 i965/gen10: Implement WaSampleOffsetIZ workaround
There are few other (duplicate) workarounds which have similar recommendations:
WaFlushHangWhenNonPipelineStateAndMarkerStalled
WaCSStallBefore3DSamplePattern
WaPipeControlBefore3DStateSamplePattern

WaPipeControlBefore3DStateSamplePattern has some extra recommendations if
driver is using mid batch context restore. Ignoring it for now because We're
not doing mid-batch context restore in Mesa.

This workaround doesn't fix any of the piglit hangs we've seen
on CNL. But it might be fixing something we haven't tested yet.

V2: Use brw_load_register_imm32() to program CACHE_MODE_0.
    Get rid of brw_flush_gpu_caches().

V3: Make the workaround helper functions static.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by :Nanley Chery <nanley.g.chery@intel.com>
2017-11-03 14:30:33 -07:00
Anuj Phogat
7a09be2dc9 i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-03 14:30:33 -07:00
Anuj Phogat
2d10eb5ed8 i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-03 14:30:33 -07:00
Andrey Grodzovsky
19fc3cdcfb winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx.
Fixes reverted patch f03b7c9 by doing VMID reservation per
process and not per context.
Also updates required amdgpu libdrm version since the change
involved interface updates in amdgpu libdrm.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-11-03 18:06:17 +01:00
Lionel Landwerlin
24ec29b919 i965: perf: list registers to program for queries
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-03 14:25:36 +00:00
Lionel Landwerlin
285a2192f9 i965: perf: factorize code for availability
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-03 14:23:39 +00:00
Lionel Landwerlin
05231a4e74 i965: perf: make revision variable available
This will be used in the next commit to build up register programming.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-03 14:23:22 +00:00
Nicolai Hähnle
ca63a5ed3e glsl: fix interpolateAtXxx(some_vec[idx], ...) with dynamic idx
The dynamic index of a vector (not array!) is lowered to a sequence of
conditional assignments. However, the interpolate_at_* expressions
require that the interpolant is an l-value of a shader input.

So instead of doing conditional assignments of parts of the shader input
and then interpolating that (which is nonsensical), we interpolate the
entire shader input and then do conditional assignments of the interpolated
result.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-03 14:30:08 +01:00
Nicolai Hähnle
4f42450b86 glsl: allow any l-value of an input variable as interpolant in interpolateAt*
The intended rule has been clarified in GLSL 4.60, Section 8.13.2
(Interpolation Functions):

   "For all of the interpolation functions, interpolant must be an l-value
    from an in declaration; this can include a variable, a block or
    structure member, an array element, or some combination of these.
    Component selection operators (e.g., .xy) may be used when specifying
    interpolant."

For members of interface blocks, var->data.must_be_shader_input must be
determined on-the-fly after lowering interface blocks, since we don't want
to disable varying packing for an entire block just because one input in it
is used in interpolateAt*.

v2: keep setting must_be_shader_input in ast_function (Ian)
v3: follow the relaxed rule of GLSL 4.60
v4: only apply the relaxed rules to desktop GL
    (the ES WG decided that the relaxed rules may apply in a future version
     but not retroactively; see also
     dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.*)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101378
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-03 14:30:08 +01:00
Dave Airlie
57372c5a42 nir/serialize: fix build with gcc 4.4.7
I had to build on RHEL6 today, and noticed this.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 15:03:35 +10:00
Dave Airlie
0722b6d693 i915g: remove some unknown cap warnings. 2017-11-03 15:03:30 +10:00
Dave Airlie
cc69f2385e i915g: make gears run again.
We need to validate some structs exist before we dirty the states, and
avoid the problem in some other places.

Fixes: e027935a7 ("st/mesa: don't update unrelated states in non-draw calls such as Clear")
2017-11-03 15:03:30 +10:00
Timothy Arceri
6e2eb96b64 ac: remove the remaining duplicate llvm types
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
e73a467005 ac: remove usused v4f32
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
7f4966731f ac: add v2f32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
cd6cfd1095 ac: use the ac f16 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
8f651ae062 ac: use the ac f32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
368654a299 ac: use the ac f64 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
d927db0672 ac: use the common v8i32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
9db51b2393 ac: use the common v4i32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
ee376ac6f4 ac: add v3i32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
309a51411d ac: add v2i32 to the common code and use it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
c64cfa0392 ac: use the ac i64 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
3d45acf71c ac: remove unused i16 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
4d4799643d ac: use the ac ivoidt llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
209ad5c16f ac: use the ac i8 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
21d71189ec ac: use the ac i1 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
bd59a0bb8b ac: use the ac i32 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
439a2febc4 ac/radeonsi: add support for tex instr without a derefence
These are produced by nir_lower_bitmap(), adding the missing derefence
would cause other issues that need to be hacked around such as
skipping sampler lowering and uniform location assignment, so this
change seems the correct way to go.

Fixes 194 piglit crashes on radeonsi using NIR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:19:51 +11:00
Timothy Arceri
440d08fe93 nir: skip lowering sampler if there is no dereference
This avoids a crash on the output of nir_lower_bitmap().

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:19:46 +11:00
Dave Airlie
de126b0402 r600: add support for early depth/stencil.
This add support for the early depth/stencil property found
on image shaders.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:37 +10:00
Dave Airlie
f3c6149c26 r600: add support for emitting RAT instructions to the assembler.
This adds support for emitting RAT instructions to the assembler.
RAT instructions are used to implement image accessors.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:33 +10:00
Dave Airlie
159bf38c3a r600: add support for mark bit to the assembler.
This adds support to the assembler for the mark bit
 on the export word1.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:30 +10:00
Dave Airlie
90ca378080 r600: add support for valid pixel mode on CF clauses
This just adds support to the assembler for setting the valid
pixel mode on the CF clause.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:26 +10:00
Dave Airlie
d584b4671f r600: add support for some ALU sources.
These special ALU sources provide the shader engine,
simd and hw wave ids.

These are required for images support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:31:50 +10:00
Samuel Pitoiset
bad31f6a65 radv: use the optimal packets order for dispatch calls
This should reduce the time where compute units are idle, mainly
for meta operations because they use a bunch of compute shaders.

This seems to have a really minor positive effect for Talos, at least.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-02 23:03:59 +01:00
Timothy Arceri
cf5f8f55c3 nir: add tess patch support to nir_remove_unused_varyings()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 08:58:39 +11:00
Dylan Baker
4ff6187b84 es2api/ABI-check: Add es3.x symbols
Currently this ABI check only checks for es2 symbols, but es3.x symbols
are also exposed. Exposing these symbols is recommended by Khronos, and
as such the test should accept that as ABI.

see: https://lists.freedesktop.org/archives/mesa-stable/2016-June/004545.html
for the discussion about exposing these symbols

cc: Ian Romanick <idr@freedesktop.org>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-11-02 14:50:52 -07:00
Dylan Baker
a5635d993a meson: Set c visibility args for wayland-drm
Because otherwise gbm will expose wayland symbols that it shouldn't.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-02 14:50:18 -07:00
Timothy Arceri
4837ad4832 st/glsl_to_nir: pass gl_shader_program to st_finalize_nir()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 08:32:35 +11:00
Bas Nieuwenhuizen
806721429a radv: Don't expose heaps with 0 memory.
It confuses CTS. This pregenerates the heap info into the
physical device, so we can use it for translating contiguous
indices into our "standard" ones.

This also makes the WSI a bit smarter in case the first preferred
heap does not exist.

Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
2017-11-02 20:28:19 +01:00
Dylan Baker
a29869e872 gbm: Don't traverse backwards for includes
This is just a bad idea and should be avoided. Instead, make the #include
flat and fix the build systems to pass the proper -I flags

v2: - add an inc_wayland_drm instead passing a path to
      include_directories (Emil)
    - update commit message (Emil)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2017-11-02 11:39:45 -07:00
Dylan Baker
10d869535c automake: Remove unused include path
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-02 11:39:44 -07:00
Marek Olšák
529cdce799 radeonsi: remove 'Authors:' comments
It's inaccurate. Instead, see the copyright and use "git log" and
"git blame" to know the authorship.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-02 18:19:03 +01:00
Jason Ekstrand
172e8e42c4 intel/fs: Don't allocate a param array for zero push constants
Thanks to the ralloc invariant of "any pointer returned from ralloc can
be used as a context", calling ralloc_size with a size of zero will
cause it to allocate at least a header.  If we don't have any push
constants, then NULL is perfectly acceptable (and even preferred).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-11-02 09:55:21 -07:00
Jason Ekstrand
7b4387519c intel/fs: Alloc pull constants off mem_ctx
It doesn't actually matter since the only user of push constants, i965,
ralloc_steals it back to NULL but it's more consistent and probably
fixes memory leaks in some error cases.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-02 09:55:21 -07:00
Dylan Baker
cb831d918b Revert "meson: bump libdrm version required by amdgpu"
This reverts commit d364684711.

The commit that bumped the autotools version was reverted, so lets
revert the meson version to match.

fixes: 1f2640bfa9
       "Revert "winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx.""
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-02 09:49:03 -07:00
Tim Rowley
0023b5ae67 gallivm: allow arch rounding with avx512
Fixes piglit vs-roundeven-{float,vec[234]} with simd16 VS.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-11-02 10:24:54 -05:00
Wladimir J. van der Laan
0ba4320d94 etnaviv: Allow clearing constant buffer using buffer==NULL user_buffer==NULL
Prevents an assertion when using GALLIUM_HUD with ioquake3,
when cso_restore_constant_buffer_slot0 restores an empty
constant buffer in slot 0.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-02 11:03:30 +01:00
Wladimir J. van der Laan
bc71c31842 etnaviv: Don't flush on transfer when UNSYNCHRONIZED
Structure code to only flush when we will potentially call cpu_prep. This
prevents spurious flushes in applications that heavily rely on u_uploader.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-02 11:00:26 +01:00
Wladimir J. van der Laan
8fbd82f464 etnaviv: don't do resolve-in-place without valid TS
GC3000 resolve-in-place assumes that the TS state is configured.
If it is not, this will result in MMU errors. This is especially
apparent when using glGenMipmaps().

Fixes: 78ade65956 ("etnaviv: Do GC3000 resolve-in-place when possible")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-02 10:58:48 +01:00
Samuel Pitoiset
c39f39106d radv: make radv_bind_descriptor_set() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-02 09:36:14 +01:00
Dave Airlie
799ef80059 radv: make sure we set buffers as shareable properly.
This should make sure we don't treat exports buffers as local
bos.

Fixes: a639d40f13 (radv: add support for local bos. (v3))

Tested-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-02 01:01:29 +00:00
Dylan Baker
6594213cfa svga: Use __asm__ instead of asm
__asm__ is portable, and allows the svga driver to be compiled with the
c99 standard instead of requiring the gnu99 standard.

I have compile tested this with GCC and Clang on Linux.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-11-01 15:05:26 -07:00
Marek Olšák
1f2640bfa9 Revert "winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx."
This reverts commit f03b7c9ad9.

The libdrm interface is wrong.
2017-11-01 21:42:31 +01:00
Lionel Landwerlin
8d8b9d11c9 intel: decoder: enable decoding a single field
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Lionel Landwerlin
bb16503542 intel: decoder: expose missing find_enum()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Lionel Landwerlin
ad876f721e intel: decoder: extract field value computation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Lionel Landwerlin
81aee9fd4b intel: decoder: rename field() to field_value()
We would like to avoid collisions with variables named field.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Lionel Landwerlin
69d158573a intel: decoder: rename internal function to free name
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Lionel Landwerlin
20156931bf intel: decoder: simplify field_is_header()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Lionel Landwerlin
cab93a901e intel: common: make intel utils available from C++
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Lionel Landwerlin
ea14ba0179 intel: decoder: remove unused platform field
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Lionel Landwerlin
938f62a1c7 intel: error-decode: implement a rolling window of programs
If we have more programs than what we can store,
aubinator_error_decode will assert. Instead let's have a rolling
window of programs.

v2: Fix overflowing issues (Eric Engestrom)

v3: Go through programs starting at idx_program (Scott)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 17:23:49 +00:00
Brian Paul
eedecb4eca gallium: increase pipe_sampler_view::target bitfield size for MSVC
MSVC treats enums as being signed.  The 4-bit target field isn't large
enough to correctly store the value 8 (for PIPE_TEXTURE_CUBE_ARRAY).
The bitfield value 0x8 was being interpreted as -8 so matching the
target with PIPE_TEXTURE_CUBE_ARRAY in switch statements, etc. was
failing.

To keep the structure size the same, we reduce the format field from
16 bits to 15.  There don't appear to be any other enum bitfields
which need to be adjusted.

This fixes a number of Piglit cube map array tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-01 11:06:02 -06:00
Eric Engestrom
5d4ffb9970 mapi: fix .so path in ABI-check
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-11-01 15:43:46 +00:00
Lionel Landwerlin
38f338c19a intel: decoder: extract instruction/structs length
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:49:12 +00:00
Lionel Landwerlin
279531672e intel: decoder: pack iterator variable declarations
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:49:12 +00:00
Lionel Landwerlin
1cf1591abd intel: decoder: simplify creation of struct when 0-allocated
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:49:12 +00:00
Lionel Landwerlin
eb00b8b18c intel: decoder: add destructor for gen_spec
This makes use of ralloc to simplify the destruction. We can also
store instructions in hash tables.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:49:12 +00:00
Lionel Landwerlin
de213b4af8 intel: decoder: expose helper to test header fields
These fields are of little importance as they're used to recognize
instructions.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:19:20 +00:00
Lionel Landwerlin
68e1853ea3 intel: decoder: don't read qword outside instruction/struct limit
We used to print invalid data when the last field was being clamped to
32bits due to Dword Length of the whole instruction. Here is an
example where the decoder read part of the next instruction instead of
stopping at the 32bit limit:

0x000ce0b4:  0x10000002:  MI_STORE_DATA_IMM
0x000ce0b4:  0x10000002 : Dword 0
    DWord Length: 2
    Store Qword: 0
    Use Global GTT: false
0x000ce0b8:  0x00045010 : Dword 1
    Core Mode Enable: 0
    Address: 0x00045010
0x000ce0bc:  0x00000000 : Dword 2
0x000ce0c0:  0x00000000 : Dword 3
    Immediate Data: 8791026489807077376

With this change we have the proper value :

0x000ce0b4:  0x10000002:  MI_STORE_DATA_IMM (4 Dwords)
0x000ce0b4:  0x10000002 : Dword 0
    DWord Length: 2
    Store Qword: 0
    Use Global GTT: false
0x000ce0b8:  0x00045010 : Dword 1
    Core Mode Enable: 0
    Address: 0x00045010
0x000ce0bc:  0x00000000 : Dword 2
0x000ce0c0:  0x00000000 : Dword 3
    Immediate Data: 0

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:19:20 +00:00
Lionel Landwerlin
f5e5ca1e21 intel: decoder: split out getting the next field and decoding it
Due to the new way we handle fields, we need *not* to forget the first
field when decoding instructions. The issue was that the advance
function was called first and skipped the first field.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:19:20 +00:00
Lionel Landwerlin
ffa011d1e3 intel: decoder: move field name copy
This should be inside the function that actually decodes fields.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:19:20 +00:00
Lionel Landwerlin
0698318d1a intel: decoder: reorder iterator init function
Making the next change more readable.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:19:20 +00:00
Lionel Landwerlin
1b369acdd8 intel: common: print out all dword with field spanning multiple dwords
For example, we were skipping Dword 3 in this PIPE_CONTROL :

0x000ce130:  0x7a000004:  PIPE_CONTROL
    DWord Length: 4
0x000ce134:  0x00000010 : Dword 1
    Flush LLC: false
    Destination Address Type: 0 (PPGTT)
    LRI Post Sync Operation: 0 (No LRI Operation)
    Store Data Index: 0
    Command Streamer Stall Enable: false
    Global Snapshot Count Reset: false
    TLB Invalidate: false
    Generic Media State Clear: false
    Post Sync Operation: 0 (No Write)
    Depth Stall Enable: false
    Render Target Cache Flush Enable: false
    Instruction Cache Invalidate Enable: false
    Texture Cache Invalidation Enable: false
    Indirect State Pointers Disable: false
    Notify Enable: false
    Pipe Control Flush Enable: false
    DC Flush Enable: false
    VF Cache Invalidation Enable: true
    Constant Cache Invalidation Enable: false
    State Cache Invalidation Enable: false
    Stall At Pixel Scoreboard: false
    Depth Cache Flush Enable: false
0x000ce138:  0x00000000 : Dword 2
    Address: 0x00000000
0x000ce140:  0x00000000 : Dword 4
    Immediate Data: 0

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:19:20 +00:00
Lionel Landwerlin
3ae5c57916 intel: decoder: build sorted linked lists of fields
The xml files don't always have fields in order. This might confuse
our parsing of the commands. Let's have the fields in order. To do
this, the easiest way it to use a linked list. It also helps a bit
with the iterator.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:19:20 +00:00
Lionel Landwerlin
957a6eea7a intel: common: expose gen_spec fields
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-01 13:19:20 +00:00
Eric Engestrom
f0ab3f7635 travis: build meson first for quicker feedback
Meson is much quicker to build Mesa, giving quicker feedback if
executed first.

Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-01 09:57:32 +00:00
Eric Engestrom
d364684711 meson: bump libdrm version required by amdgpu
Fixes: f03b7c9ad9 "winsys/amdgpu: Add R600_DEBUG flag to
                             reserve VMID per ctx."
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-01 09:57:32 +00:00
Jordan Justen
1a61a8b9a7 i965: Initialize disk shader cache if MESA_GLSL_CACHE_DISABLE is false
(Apologies for the double negative.)

For now, the shader cache is disabled by default on i965 to allow us
to verify its stability.

In other words, to enable the shader cache on i965, set
MESA_GLSL_CACHE_DISABLE to false or 0. If the variable is unset, then
the shader cache will be disabled.

We use the build-id of i965_dri.so for the timestamp, and the pci
device id for the device name.

v2:
 * Simplify code by forcing link to include build id sha. (Matt)

v3:
 * Don't use a for loop with snprintf for bin to hex. (Matt)
 * Assume fixed length render and timestamp string to further simplify
   code.

Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:46:53 -07:00
Jordan Justen
ccb700526f dri drivers: Always add the sha1 build-id
v4:
 * Add Android build changes. (Emil)

Cc: Dylan Baker <dylanx.c.baker@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen
e5b141634c disk_cache: Fix issue reading GLSL metadata
This would cause the read of the metadata content to fail, which would
prevent the linking from being skipped.

Seen on Rocket League with i965 shader cache.

Fixes: b86ecea344 "util/disk_cache: write cache item metadata to disk"
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen
e6ecd7d73f glsl/shader_cache: Save fs (BlendSupport) metadata
Fixes many GL 4.5 CTS blend tests, such as:

* GL45-CTS.blend_equation_advanced.extension_directive_enable
* GL45-CTS.blend_equation_advanced.extension_directive_warn
* GL45-CTS.blend_equation_advanced.blend_all.GL_MULTIPLY_KHR_all_qualifier
* GL45-CTS.blend_equation_advanced.blend_specific.GL_COLORBURN_KHR

v2:
 * Directly save the BlendSupport field to avoid potentially including
   a pointer in the future in the structure is updated. (tarceri)

Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jordan Justen
7f5204a0db i965: Initialize sha1 hash of dri config options
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen
478a73fdfa i965: Don't link when the program was found in the disk cache
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jordan Justen
c3a8ae105c i965: add cache fallback support using serialized nir
If the i965 gen program cannot be loaded from the cache, then we
fallback to using a serialized nir program.

This is based on "i965: add cache fallback support" by Timothy Arceri
<timothy.arceri@collabora.com>. Tim's version was written to fallback
to compiling from source, and therefore had to be much more complex.
After Connor and Jason implemented nir serialization, I was able to
rewrite and greatly simplify this patch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Timothy Arceri
a4078b819f i965: add support for cached shaders with xfb qualifiers
For now this disables the shader cache when transform feedback is
enabled via the GL API as we don't currently allow for it when
generating the sha for the shader.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Timothy Arceri
15f39e8654 mesa/glsl: add api_enabled flag to gl_transform_feedback_info
This will be used to disable the shader cache when xfb is enabled
via the api as we don't currently allow for it when generating the
sha for the shader.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen
8a019f5601 i965: Add shader cache support for compute
v2:
 * Use MAYBE_UNUSED. (Matt)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Timothy Arceri
42383faf51 i965: add shader cache support for tess stages
v2:
 * Use MAYBE_UNUSED. (Matt)

[jordan.l.justen@intel.com: *_cached_program => brw_disk_cache_*_program]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Timothy Arceri
5a4afd822f i965: add shader cache support for geometry shaders
v2:
 * Use MAYBE_UNUSED. (Matt)

[jordan.l.justen@intel.com: *_cached_program => brw_disk_cache_*_program]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Timothy Arceri
2589e7ddaf i965: Add shader cache support for vertex and fragment stages
This enables the cache on vertex and fragment shaders only.

v2:
 * Use MAYBE_UNUSED. (Matt)

[jordan.l.justen@intel.com: reword subject]
[jordan.l.justen@intel.com: *_cached_program => brw_disk_cache_*_program]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Timothy Arceri
516d50db31 i965: add initial implementation of on disk shader cache
This uses the Mesa disk_cache support to write out the final linked
binary for vertex and fragment shader programs.

This is based off the initial implementation done by Carl Worth. It
has been significantly reworked, first by Tim Arceri, and then by
Jordan Justen.

v2:
 * Squash 'i965: add image param shader cache support'
 * Squash 'i965: add shader cache support for pull param pointers'
 * Sustantially simplified by a rework on top of Jason's 2975e4c56a.
 * Rename load_program_data to read_program_data. (Jason)

v3:
 * Simplify and align program read/write. (Jason)

v4:
 * Don't save prog_data size since we know it from the stage. (Ken)
 * Don't save program size, since prog_data includes the size. (Ken)
 * Remove `assert` that potentially could be triggered by disk
   corruption of the cache entries. (Ken)
 * Fix compute shader scratch allocation. (Ken)
 * Remove special case mapping for non-LLC. (Ken)
 * Remove SET_UPLOAD_PARAMS macro

[jordan.l.justen@intel.com: *_cached_program => brw_disk_cache_*_program]
[jordan.l.justen@intel.com: brw_shader_cache.c => brw_disk_cache.c]
[jordan.l.justen@intel.com: don't map to write program when LLC is present]
[jordan.l.justen@intel.com: set program_written_to_cache on read from cache]
[jordan.l.justen@intel.com: only try cache when status is linking_skipped]
[jordan.l.justen@intel.com: all v2-v4 changes noted above]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen
f9d5a7add4 i965: Calculate thread_count in brw_alloc_stage_scratch
Previously, thread_count was sent in from the stage after some stage
specific calculations. Those stage specific calculations were moved
into brw_alloc_stage_scratch, which will allow the shader cache to
also use the same calculations.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen
f082d7f64f intel/compiler: Add functions to get prog_data and prog_key sizes for a stage
v2:
 * Return unsigned instead of size_t. (Ken)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen
05b1193361 intel/compiler: Add union types for prog_data and prog_key stages
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jordan Justen
4c7a1ec62a blob: Don't set overrun if reading 0 bytes at end of data
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jordan Justen
3dcbc5cdaa intel/compiler: Remove final_program_size from brw_compile_*
The caller can now use brw_stage_prog_data::program_size which is set
by the brw_compile_* functions.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Carl Worth
540636045f intel/compiler: add new field for storing program size
This will be used by the on disk shader cache.

v2:
 * Set in brw_compile_* rather than brw_codegen_*. (Jason)

Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com>
[jordan.l.justen@intel.com: Only add to brw_stage_prog_data]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jordan Justen
1edf0fe612 i965: Don't rely on nir for uses_texture_gather
When a program is restored from the shader cache, prog->nir will be
NULL, but prog->info will be restored.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen
0610a624a1 i965/link: Serialize program to nir after linking for shader cache
If the shader cache is enabled, after linking the program, we
serialize the program to nir. This will be saved out by the glsl
shader cache support.

Later, if the same program is found in the cache, we can use the nir
for a fallback in the unlikely case that the gen binary program is not
found in the cache.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jordan Justen
6b815e405d glsl/shader_cache: Save and restore serialized nir in gl_program
v3:
 * Rename serialized_nir* to driver_cache_blob*. (Tim)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jordan Justen
571bee96d5 main: Add driver cache blob fields to gl_program
These fields can be used to optionally save off a driver blob with the
program metadata. For example, serialized nir, or tgsi.

v3:
 * Rename serialized_nir* to driver_cache_blob*. (Tim)
 * Free memory. (Jason)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:53 -07:00
Jason Ekstrand
54f691311c nir: Add hooks for testing serialization
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-31 23:36:53 -07:00
Connor Abbott
120da00975 nir: add serialization and deserialization
v2 (Jason Ekstrand):
 - Various whitespace cleanups
 - Add helpers for reading/writing objects
 - Rework derefs
 - [de]serialize nir_shader::num_*
 - Fix uses of blob_reserve_bytes
 - Use a bitfield struct for packing tex_instr data

v3:
 - Zero nir_variable struct on deserialization. (Jordan)
 - Allow nir_serialize.h to be included in C++. (Jordan)
 - Handle NULL info.name. (Jason)
 - Set info.name to NULL when name is NULL. (Jordan)

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:53 -07:00
Dave Airlie
57892a23be mesa/st: implement max combined output resources limiting.
if the driver sets the cap, then use the value it gives us.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-01 10:07:07 +10:00
Dave Airlie
d3fdd66401 gallium: add cap for driver specified max combined shader resources.
Some hw (evergreen) has a limit on how many combined (images/buffers/mrts)
a fragment shader can access.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-01 10:07:03 +10:00
Gert Wollny
69eee511c6 r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling
It is possible that the optimizer ends up in an infinite loop in
post_scheduler::schedule_alu(), because post_scheduler::prepare_alu_group()
does not find a proper scheduling. This can be deducted from
pending.count() being larger than zero and not getting smaller.

This patch works around this problem by signalling this failure so that the
optimizers bails out and the un-optimized shader is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-01 09:33:40 +10:00
Timothy Arceri
e80bbd6f52 radeonsi: fix culldist_writemask in nir path
The shared si_create_shader_selector() code already offsets the mask.

Fixes the following piglit tests:

arb_cull_distance/clip-cull-3.shader_test
arb_cull_distance/clip-cull-4.shader_test

Fixes: 29d7bdd179 (radeonsi: scan NIR shaders to obtain required info)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-01 09:41:11 +11:00
Neil Roberts
b697ece10a nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB
Previously the values were calculated by just shifting ~0 by the
invocation ID. This would end up including bits that are higher than
gl_SubGroupSizeARB. The corresponding CTS test effectively requires that
these high bits be zero so it was failing. There is a Piglit test as
well but this appears to checking the wrong values so it passes.

For the two greater-than bitmasks, this patch adds an extra mask with
(~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero.

Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Neil Roberts <nroberts@igalia.com>
2017-10-31 23:28:00 +01:00
Nanley Chery
9e849eb8bb i965: Check CCS_E compatibility for texture view rendering
Only use CCS_E to render to a texture that is CCS_E-compatible with the
original texture's miptree (linear) format. This prevents render
operations from writing data that can't be decoded with the original
miptree format.

On Gen10, with the new CCS_E-enabled formats handled, this enables the
driver to pass the arb_texture_view-rendering-formats piglit test.

v2. Add a TODO for texturing. (Jason)

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 14:26:23 -07:00
Nanley Chery
c7baaafe54 intel/isl: Disable some gen10 CCS_E formats for now
CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2
formats with CCS_E. None of these formats fit within the current
blorp_copy framework so disable them until support is added.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 14:26:23 -07:00
Eric Engestrom
3ba973fe37 meson: pass correct args to gles2 ABI test
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 18:00:11 +00:00
Eric Engestrom
8a3022ffa0 meson: pass correct args to gles1 ABI test
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 18:00:06 +00:00
Eric Engestrom
5eb7bd0e77 meson: pass correct args to gbm symbol test
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 18:00:00 +00:00
Eric Engestrom
be301ab724 meson: pass correct args to wayland-egl symbol test
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 17:59:54 +00:00
Eric Engestrom
64f17440b6 automake+meson: don't run egl symbol check on libglvnd lib
We might want to add a symbol check for the glvnd variant though.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 17:59:49 +00:00
Eric Engestrom
1946de2b71 meson: pass correct env/args to egl tests
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 17:59:44 +00:00
Eric Engestrom
ddb3a695a8 gles2: fail symbol check if lib is missing
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 17:59:39 +00:00
Eric Engestrom
4e7612c54d gles1: fail symbol check if lib is missing
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 17:59:34 +00:00
Eric Engestrom
d830351bfa gbm: fail symbol check if lib is missing
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 17:59:29 +00:00
Eric Engestrom
4e43ba5687 wayland-egl: fail symbol check if lib is missing
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 17:59:24 +00:00
Eric Engestrom
0529a63384 egl: fail symbol check if lib is missing
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-31 17:59:19 +00:00
Dylan Baker
1c59119368 meson: set visibility flags on gbm
This is done in autotools, and is an oversight in the meson build.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-31 10:41:19 -07:00
Dylan Baker
c16486f5db meson: Don't link gbm with threads
It's supposed to be linked with pthread-stubs (if the platform needs
pthread-stubs). Pthread stubs support isn't (yet) implemented in the
meson build, so add a TODO.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-31 10:41:19 -07:00
Dylan Baker
0589331d54 meson: Use true and false instead of yes and no for tristate options
This allows a user to not care whether they're setting a tristate or a
boolean option, which is a nice user facing feature, and something I've
personally run into.

Suggested-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-31 10:37:17 -07:00
Andrey Grodzovsky
f03b7c9ad9 winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-10-31 16:55:24 +01:00
Erik Faye-Lund
5c2ff5773a meson: do not search for needless deps
If we don't want to use these deps, there's no good reason to search
for them in the first place. This should shave a bit of time for the
initial build.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-31 15:44:13 +01:00
Samuel Pitoiset
5010436e09 radv: bail out when binding the same vertex buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-31 10:16:38 +01:00
Samuel Pitoiset
11fdc2cd34 radv: bail out when binding the same index buffer
DOW3 appears to hit this path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-31 10:16:35 +01:00
Erik Faye-Lund
cf41c19d9f meson: use dep_m in libgallium
The u_format_other.c users sqrtf, which on some systems require
a math-library. So let's make sure we link with it.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-31 08:10:37 +01:00
Timothy Arceri
e92405c55a radv: use correct alloc function when loading from disk
Fixes regression in:

dEQP-VK.api.object_management.alloc_callback_fail.graphics_pipeline

Fixes: 1e84e53712 "radv: add cache items to in memory cache when reading from disk"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-31 14:51:55 +11:00
Plamena Manolova
048d4c45c9 i965: Fix ARB_indirect_parameters logic.
This patch modifies the ARB_indirect_parameters logic in
brw_draw_prims, so that our implementation isn't affected if
another application attempts to use predicates. Previously we
were using a predicate with a DELTAS_EQUAL comparison operation
and relying on the MI_PREDICATE_DATA register being 0. Our code
to initialize MI_PREDICATE_DATA to 0 was incorrect, so we were
accidentally using whatever value was written there. Because the
kernel does not initialize the MI_PREDICATE_DATA register on
hardware context creation, we might inherit the value from whatever
context was last running on the GPU (likely another process).
The Haswell command parser also does not currently allow us to write
the MI_PREDICATE_DATA register. Rather than fixing this and requiring
an updated kernel, we switch to a different approach which uses a
SRCS_EQUAL predicate that makes no assumptions about the states of any
of the predicate registers.

Fixes Piglit's spec/arb_indirect_parameters/tf-count-arrays test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103085
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-30 20:40:05 -07:00
Kenneth Graunke
877dd14e88 i965: Don't flag BRW_NEW_SURFACES unless some push constants are dirty.
Due to a gaffe on my part, we were re-emitting all binding table entries
on every single draw call.  The push_constant_packets atom listens to
BRW_NEW_DRAW_CALL, but skips emitting 3DSTATE_CONSTANT_XS for each stage
unless stage_state->push_constants_dirty is true.  However, it flagged
BRW_NEW_SURFACES unconditionally at the end, by mistake.

Instead, it should only flag it if we actually emit 3DSTATE_CONSTANT_XS
for a stage.  We can move it a few lines up, inside the loop - the early
continues will skip over it if push constants aren't dirty for a stage.

With INTEL_NO_HW=1 set, improves performance of GFXBench5 gl_driver_2
on Apollolake at 1280x720 by 1.01122% +/- 0.470723% (n=35).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-10-30 20:38:08 -07:00
Kenneth Graunke
28fcf5cd94 intel/genxml: Fix decoding of groups with fields smaller than a DWord.
Groups containing fields smaller than a DWord were not being decoded
correctly.  For example:

    <group count="32" start="32" size="4">
      <field name="Vertex Element Enables" start="0" end="3" type="uint"/>
    </group>

gen_field_iterator_next would properly walk over each element of the
array, incrementing group_iter, and calling iter_group_offset_bits()
to advance to the proper DWord.  However, the code to print the actual
values only considered iter->field->start/end, which are 0 and 3 in the
above example.  So it would always fetch bits 3:0 of the current DWord
when printing values, instead of advancing to each element of the array,
printing bits 0-3, 4-7, 8-11, and so on.

To fix this, we add new iter->start/end tracking, which properly
advances for each instance of a group's field.

Caught by Matt Turner while working on 3DSTATE_VF_COMPONENT_PACKING,
with a patch to convert it to use an array of bitfields (the example
above).

This also fixes the decoding of 3DSTATE_SBE's "Attribute Active
Component Format" fields.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-30 20:22:55 -07:00
Ian Romanick
53c7b8bdca glsl: Fix bad formatting in a comment
Trivial

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2017-10-30 20:08:25 -07:00
Eric Anholt
2a77c763fe broadcom/vc5: Force blending to treat alpha as 1 for formats without alpha.
Fixes fbo-blending-formats on RGB8 and 565.  We will still need to demote
blending to shader code in the MRT case to fix it in general, but that can
be added when we start doing 32F blending (which also needs to be done in
the shader).
2017-10-30 13:31:32 -07:00
Eric Anholt
61bb0df60e broadcom/vc5: Do BGRA vs RGBA swapping for the BLEND_CONSTANT_COLOR.
Fixes many of the fbo-blending-formats tests.
2017-10-30 13:31:32 -07:00
Eric Anholt
2e3c7beb1e broadcom/vc5: Pack clear colors according to the TLB internal format/type.
The previous packing I did got us all the R*16F and R*32F formats, where
the pipe format basically matched the TLB's format, but since the clear
color will just be memcpyed to the TLB, we should be looking at its format
for deciding how to pack.

Fixes RGB565, RGB5_A1 and RGBA10 fbo-clear-formats tests and improves
4444.
2017-10-30 13:31:32 -07:00
Eric Anholt
828299d1bd broadcom/vc5: Don't do r/b channel swapping on 565.
The HW's format actually matches the gallium format.
2017-10-30 13:31:32 -07:00
Eric Anholt
9e5df1897c broadcom/vc5: Use the proper gallium format for our RGB10_A2.
This keeps us from needing our own reswizzling of the B vs R fields.
2017-10-30 13:31:31 -07:00
Eric Anholt
6d1809a6d6 broadcom/vc5: Add some comments about the texture/output format ordering.
The output formats are consistent with their channels appearing from low
to high in their name.  Textures are interpreted the same way, but their
names may have the channels swapped around.  I'm retaining the texture
names so that we are consistent with the documentation, but I want to
leave a warning for others.
2017-10-30 13:31:28 -07:00
Eric Anholt
2d6088f2a3 broadcom/vc5: Drop duplicated setup of clip_window_height_in_pixels. 2017-10-30 13:31:28 -07:00
Eric Anholt
1b32786de6 broadcom/vc5: Don't forget to actually turn on stencil testing.
I had the rest of stencil state set up, but forgot to actually enable it
in the higher level configuration bits packet.
2017-10-30 13:31:28 -07:00
Eric Anholt
4d2619a6b3 broadcom/vc5: Stop lowering negates to subs.
In the case of fneg(0.0), we were getting back 0.0 instead of -0.0.  We
were also needing an immediate 0 value for ineg, when there's an opcode to
do the job properly.

Fixes fs-floatBitsToInt-neg.shader_test.
2017-10-30 13:31:28 -07:00
Eric Anholt
a797f0eb63 broadcom/vc5: Set up MSAA texture type according to the internal format.
It gets most of EXT_framebuffer_multisample-formats passing, but doesn't
really work for texture views.
2017-10-30 13:31:28 -07:00
Eric Anholt
fe6fc579cb broadcom/vc5: Use the sampler view's format, not the resource's.
This should help with texture views, though I just noticed this while
reading the code.
2017-10-30 13:31:27 -07:00
Eric Anholt
0ec4b4178f broadcom/vc5: Emit raw loads for MSAA buffers.
Similar to stores, but we also need to emit dummy stores in between each
load, to flush out the previous queued load.
2017-10-30 13:31:27 -07:00
Eric Anholt
464f1fb733 broadcom/vc5: Use raw stores for MSAA buffers.
We were storing the resolved pixels in all cases, but nr_samples > 0 means
we should be keeping the per-sample values.

We will probably want to change the job structure at some point, as we'll
want to recognize full-buffer resolves and do the resolved store in the
same job as the original rendering, meaning we'll need to track both the
MSAA and single-sample resources in the job.  However, this will be enough
to build the rest of the MSAA support.
2017-10-30 13:31:27 -07:00
Eric Anholt
e717e3e7cd broadcom/vc5: Add lowering for txf_ms to a txf on a 2x2-scaled texture.
The HW has no native sampler support for multisample textures, but since
we only need to support txf_ms and the layout is UIF, we just need to
scale up the texcoords and then add in the sample.

This drops the old TEXTURE_MSAA_ADDR special uniform, since we're treating
MSAA textures as textures, rather than basically texbos like VC4 had to.
2017-10-30 13:31:27 -07:00
Eric Anholt
b1a8b3979c broadcom/vc5: Lay out MSAA textures/renderbuffers as UIF scaled by 4.
We just need to multiply width/height by 2 each, and always set them up as
UIF tiling, since that's how the TLB will store them in raw (per-sample)
mode.
2017-10-30 13:31:27 -07:00
Eric Anholt
1d8105a167 broadcom/vc5: Keep output height pad out of the store TLB general address.
The equivalent load already had the pad separated out.
2017-10-30 13:31:24 -07:00
Eric Anholt
99c69027e4 broadcom/vc5: Drop padding bits from the texture shader state's address. 2017-10-30 13:31:22 -07:00
Eric Anholt
cf3759a9a4 broadcom/vc5: Drop alignment bits from texture P1's address. 2017-10-30 13:31:19 -07:00
Eric Anholt
607031f411 broadcom/vc5: Drop alignment bits from Z/S rendering mode config address.
Improves CLIF dumping output.
2017-10-30 13:31:16 -07:00
Eric Anholt
d0f7053369 broadcom/xml: Fix address packing for address with >= 8 alignment bits.
We were handing the intra-byte padding fine, but with a 24-bit address
(bottom 8 bits implied 0) we would end up off by 8 bytes in our shift,
impacting vc5's load/store general packets (all other packets we have had
<8 bits of padding).
2017-10-30 13:31:16 -07:00
Eric Anholt
40280b0abe broadcom/clif: Print out the contents of the generic tile list.
This is the real meat of the RCL, so let's get it printed again.
2017-10-30 13:31:16 -07:00
Eric Anholt
10fa685b53 broadcom/clif: Move the CL printing part of CL dumps to a helper.
This will let me reuse the printing for processing branches to other CLs.
2017-10-30 13:31:16 -07:00
Eric Anholt
125f2a751e broadcom/vc5: Lower unpack_*_4x8 to normal math.
We only have 2x16 unpacking in our ALUs.  To enable this, we also need
lower_fdiv for its new instructions, which had been handled at a higher
level previously.
2017-10-30 13:31:16 -07:00
Eric Anholt
eecdbaa985 broadcom/vc5: Add PIPE_TEX_WRAP_CLAMP support for linear-filtered textures.
I already had the texture's wrapping set up to use different behavior for
nearest or linear, so we just needed to saturate the coordinates in linear
mode to get the "proper" blend between the edge and border values.
2017-10-30 13:31:16 -07:00
Eric Anholt
e798455330 broadcom/vc5: Disable GL_ARB_transform_feedback3.
We don't seem to have a way to generally handle gl_SkipComponents.
2017-10-30 13:31:15 -07:00
Eric Anholt
e2d9ed4f39 broadcom/vc5: Fix gl_FragCoord pixel center setup.
Fixes glsl-arb-fragment-coord-conventions.
2017-10-30 13:31:15 -07:00
Eric Anholt
bacbcafec1 broadcom/vc5: Always set up 1D textures as raster order.
1D is the exception to "all V3D textures are tiled", since tiling 1D
textures would just waste memory and cache space.  This ended up being a
problem once we started actually marking 1D textures as 1D instead of 2D.
2017-10-30 13:31:15 -07:00
Eric Anholt
443e1984d2 broadcom/xml: Throw an #error in XML-based codegen for a >1bit bool
I've debugged two nasty errors now due to copy-and-pasting a bool type
when writing a uint field.  Make sure I don't do that again.
2017-10-30 13:31:12 -07:00
Eric Anholt
e2f114b32b broadcom/vc4: Fix bool marking on Rasterizer Oversample Mode.
We don't set this field using the XML codegen, but this would help us
decode the right value in case of 16x (VG) oversampling.
2017-10-30 13:27:03 -07:00
Eric Anholt
45e70bdc8c broadcom/vc5: Mark lookup type as uint, not bool.
Fixes non-2D texturing.
2017-10-30 13:27:03 -07:00
Eric Anholt
77c7b98ba5 broadcom/vc5: Fix GPU hang with no vertex elements used by the VS.
Like VC4, we need to at least have one element set up, but unlike VC4 it
seems we don't need to read it to keep the HW happy.  Fixes GPU hangs with
glsl-no-vertex-attribs.shader_test.
2017-10-30 13:25:45 -07:00
Eric Engestrom
2117d03310 git_sha1_gen: create empty file in fallback path
I missed this part in my conversion, the old stream redirection meant
the file was always created.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103496
Fixes: 7088622e5f "buildsys: move file regeneration logic to
       the script itself"
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-30 17:21:58 +00:00
Lionel Landwerlin
a1faf48636 intel: common: silence compiler warning
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-30 17:15:50 +00:00
Eduardo Lima Mitev
f9de7f5596 glsl/linker: Check that re-declared, inter-shader built-in blocks match
>From GLSL 4.5 spec, section "7.1 Built-In Language Variables", page 130 of
the PDF states:

    "If multiple shaders using members of a built-in block belonging to
     the same interface are linked together in the same program, they must
     all redeclare the built-in block in the same way, as described in
     section 4.3.9 “Interface Blocks” for interface-block matching, or a
     link-time error will result."

Fixes:
* GL45-CTS.CommonBugs.CommonBug_PerVertexValidation

v2 (Neil Roberts):
Explicitly look for gl_PerVertex in the symbol tables instead of
waiting to find a variable in the interface.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102677
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
2017-10-30 18:10:39 +01:00
Eduardo Lima Mitev
f5fe99ac85 glsl: Use the utility function to copy symbols between symbol tables
This effectively factorizes a couple of similar routines.

v2 (Neil Roberts): Non-trivial rebase on master

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
2017-10-30 18:10:39 +01:00
Eduardo Lima Mitev
4c62a270a9 glsl_parser_extra: Add utility to copy symbols between symbol tables
Some symbols gathered in the symbols table during parsing are needed
later for the compile and link stages, so they are moved along the
process. Currently, only functions and non-temporary variables are
copied between symbol tables. However, the built-in gl_PerVertex
interface blocks are also needed during the linking stage (the last
step), to match re-declared blocks of inter-stage shaders.

This patch adds a new utility function that will factorize current code
that copies functions and variables between two symbol tables, and in
addition will copy explicitly declared gl_PerVertex blocks too.

The function will be used in a subsequent patch.

v2 (Neil Roberts):
Allow the src symbol table to be NULL and explicitly copy the
gl_PerVertex symbols in case they are not referenced in the exec_list.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
2017-10-30 18:10:39 +01:00
Eric Engestrom
ceaad79f85 i965: remove unused variable
Fixes: 2c873060d3 "i965: Delete unused
       brw_vs_prog_data::nr_attributes field."
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-10-30 16:32:05 +00:00
Eric Engestrom
c5ec155685 meson: wire up egl/android
Cc: Rob Herring <robh@kernel.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-30 16:32:05 +00:00
Ian Romanick
6403efbe74 glsl: Remove ir_binop_greater and ir_binop_lequal expressions
NIR does not have these instructions.  TGSI and Mesa IR both implement
them using < and >=, repsectively.  Removing them deletes a bunch of
code and means I don't have to add code to the SPIR-V generator for
them.

v2: Rebase on 2+ years of change... and fix a major bug added in the
rebase.

   text	   data	    bss	    dec	    hex	filename
8255291	 268856	 294072	8818219	 868e2b	32-bit i965_dri.so before
8254235	 268856	 294072	8817163	 868a0b	32-bit i965_dri.so after
7815339	 345592	 420592	8581523	 82f193	64-bit i965_dri.so before
7813995	 345560	 420592	8580147	 82ec33	64-bit i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick
34f7e761bc glsl/parser: Track built-in types using the glsl_type directly
Without the lexer changes, tests/glslparsertest/glsl2/tex_rect-02.frag
fails.  Before this change, the parser would determine that
sampler2DRect is not a valid type because the call to
state->symbols->get_type() in ast_type_specifier::glsl_type() would
return NULL.  Since ast_type_specifier::glsl_type() is now going to
return the glsl_type pointer that it received from the lexer, it doesn't
have an opportunity to generate an error.

   text	   data	    bss	    dec	    hex	filename
8255243	 268856	 294072	8818171	 868dfb	32-bit i965_dri.so before
8255291	 268856	 294072	8818219	 868e2b	32-bit i965_dri.so after
7815195	 345592	 420592	8581379	 82f103	64-bit i965_dri.so before
7815339	 345592	 420592	8581523	 82f193	64-bit i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick
747c057530 glsl/parser: Return the glsl_type object from the lexer
This allows us to use a single token for every built-in type except void.

   text	   data	    bss	    dec	    hex	filename
8275163	 269336	 294072	8838571	 86ddab	32-bit i965_dri.so before
8255243	 268856	 294072	8818171	 868dfb	32-bit i965_dri.so after
7836963	 346552	 420592	8604107	 8349cb	64-bit i965_dri.so before
7815195	 345592	 420592	8581379	 82f103	64-bit i965_dri.so after

Yes, the 64-bit binary shrinks by 21k.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick
4171900cf1 glsl/parser: Allocate identifier inside classify_identifier
Passing YYSTYPE into classify_identifier enables a later patch.

   text	   data	    bss	    dec	    hex	filename
8310339	 269336	 294072	8873747	 876713	32-bit i965_dri.so before
8275163	 269336	 294072	8838571	 86ddab	32-bit i965_dri.so after
7845579	 346552	 420592	8612723	 836b73	64-bit i965_dri.so before
7836963	 346552	 420592	8604107	 8349cb	64-bit i965_dri.so after

Yes, the 64-bit binary shrinks by 8k.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick
792acfc44a glsl/parser: Move anonymous struct name handling to the parser
There are two callers of the constructor, and they are right next to
each other.  Move the "#anon_struct" name handling to the parser so that
the conditional can be removed.

I've also deleted part of the comment (about the memory leak) because I
don't think it's quite accurate or relevant.

   text	   data	    bss	    dec	    hex	filename
8310399	 269336	 294072	8873807	 87674f	32-bit i965_dri.so before
8310339	 269336	 294072	8873747	 876713	32-bit i965_dri.so after
7845611	 346552	 420592	8612755	 836b93	64-bit i965_dri.so before
7845579	 346552	 420592	8612723	 836b73	64-bit i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick
fc07ab165b glsl/parser: Silence unused parameter warning
glsl/glsl_parser_extras.cpp: In constructor ‘ast_struct_specifier::ast_struct_specifier(void*, const char*, ast_declarator_list*)’:
glsl/glsl_parser_extras.cpp:1675:50: warning: unused parameter ‘lin_ctx’ [-Wunused-parameter]
 ast_struct_specifier::ast_struct_specifier(void *lin_ctx, const char *identifier,
                                                  ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick
d70e8ef1c1 glsl: Silence unused parameter warnings
glsl/standalone_scaffolding.cpp: In function ‘GLbitfield _mesa_program_state_flags(const gl_state_index*)’:
glsl/standalone_scaffolding.cpp:103:66: warning: unused parameter ‘state’ [-Wunused-parameter]
 _mesa_program_state_flags(const gl_state_index state[STATE_LENGTH])
                                                                  ^
glsl/standalone_scaffolding.cpp: In function ‘char* _mesa_program_state_string(const gl_state_index*)’:
glsl/standalone_scaffolding.cpp:109:67: warning: unused parameter ‘state’ [-Wunused-parameter]
 _mesa_program_state_string(const gl_state_index state[STATE_LENGTH])
                                                                   ^
glsl/standalone_scaffolding.cpp: In function ‘void _mesa_delete_shader(gl_context*, gl_shader*)’:
glsl/standalone_scaffolding.cpp:115:40: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 _mesa_delete_shader(struct gl_context *ctx, struct gl_shader *sh)
                                        ^~~
glsl/standalone_scaffolding.cpp: In function ‘void _mesa_delete_linked_shader(gl_context*, gl_linked_shader*)’:
glsl/standalone_scaffolding.cpp:123:47: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 _mesa_delete_linked_shader(struct gl_context *ctx,
                                               ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Mauro Rossi
7dae419aa7 Android: move drivers' symlinks to /vendor (v2)
Having moved gallium_dri.so library to /vendor/lib/dri
also symlinks need to be coherently created using TARGET_OUT_VENDOR instead of TARGET_OUT
or all non Intel drivers will not be loaded with Android N and earlier,
thus causing SurfaceFlinger SIGABRT

(v2) simplification of post install command

Fixes: c3f75d483c ("Android: move libraries to /vendor")

Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Rob Herring <robh@kernel.org> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-30 15:41:31 +00:00
Emil Velikov
fc7816fd4e Revert "foo"
This reverts commit 27d5a7bce0.

I fat fingered it, failing to reset the checkout before applying the
sequential commit.
2017-10-30 15:32:56 +00:00
Emil Velikov
6997d222f5 docs/release-calendar: update - 17.3.0-rc2 is out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-30 15:24:10 +00:00
Emil Velikov
27d5a7bce0 foo
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-10-30 15:22:26 +00:00
Lionel Landwerlin
a8b1715b8a util: hashtable: make hashing prototypes match
It seems nobody's using the string hashing function. If you try to
pass it directly to the hashtable creation function, you'll get
compiler warning for non matching prototypes. Let's make them match.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-30 15:18:00 +00:00
Andres Gomez
c220202a73 docs: update calendar, add news item and link release notes for 17.2.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-30 16:58:51 +02:00
Andres Gomez
d100e966e8 docs: add sha256 checksums for 17.2.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-30 16:55:19 +02:00
Andres Gomez
bccb074014 docs: add release notes for 17.2.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-30 16:55:18 +02:00
Alex Smith
134a40d2a6 radv: Fix -Wformat-security issue
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103513
Fixes: de88979413 ("radv: Implement VK_AMD_shader_info")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-30 10:58:56 +01:00
Timothy Arceri
1e84e53712 radv: add cache items to in memory cache when reading from disk
Otherwise we will leak them, load duplicates from disk rather
than memory and never write items loaded from disk to the apps
pipeline cache.

Fixes: fd24be134f 'radv: make use of on-disk cache'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-30 17:49:54 +11:00
Tapani Pälli
446c5726ec i965: fix blorp stage_prog_data->param leak
Patch uses mem_ctx for allocation to ensure param array gets freed
later.

==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193
==6164==    at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==6164==    by 0x12E31C6C: ralloc_size (ralloc.c:121)
==6164==    by 0x130189F1: fs_visitor::assign_constant_locations() (brw_fs.cpp:2095)
==6164==    by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715)
==6164==    by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229)
==6164==    by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570)
==6164==    by 0x130C4B07: blorp_compile_fs (blorp.c:194)
==6164==    by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79)
==6164==    by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332)
==6164==    by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261)
==6164==    by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326)
==6164==    by 0x12EFF72B: brw_clear (brw_clear.c:297)

Fixes: 8d90e28839 ("intel/compiler: Allocate pull_param in assign_constant_locations")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-30 08:19:37 +02:00
Kevin Rogovin
d30b5f2f9b i965: correctly assign SamplerCount of INTERFACE_DESCRIPTOR_DATA
We were dividing by 4 twice.  This also papered over a bug where we
were neglecting to clamp the sampler count to the [0, 16] range.

This should have no functional impact, it only affects prefetching.

v2 [Kenneth Graunke]:
 - Clamp sampler_count to [0, 16] to avoid overflowing the valid values
   for this field.  Write a commit message.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-29 22:41:23 -07:00
Kenneth Graunke
992e2cf57f i965: Only set key->high_quality_derivatives when it matters.
This avoids recompiles for shaders that don't use explicit derivatives
when ctx->Hint.FragmentShaderDerivative == GL_NICEST.

For example, GFXBench 5 Aztec Ruins sets the GL_NICEST hint before
compiling any shaders, but none of them use dFdx() or dFdy() - only
implicit derivatives.  This doesn't eliminate any recompiles, but
does eliminate one of the reasons for doing so.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-29 20:54:16 -07:00
Kenneth Graunke
86c68bb886 nir: Make nir_gather_info collect a uses_fddx_fddy flag.
i965 turns fddx/fddy into their coarse/fine variants based on the
ctx->Hint.FragmentShaderDerivative setting.  It needs to know whether
this can impact a shader in order to better guess NOS settings.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-29 20:52:20 -07:00
Kenneth Graunke
f970e4d481 i965: Update brw_wm_debug_recompile() for newer key entries.
Also, reorder them to match the structure's field order, to make it
easier to check that they're all present.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-29 20:52:14 -07:00
Kenneth Graunke
d1b392d060 i965: Delete brw_wm_prog_key::drawable_height.
This has been unused since we switched to nir_lower_wpos_ytransform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-29 20:52:02 -07:00
Alex Smith
de88979413 radv: Implement VK_AMD_shader_info
This allows an app to query shader statistics and get a disassembly of
a shader. RenderDoc git has support for it, so this allows you to view
shader disassembly from a capture.

When this extension is enabled on a device (or when tracing), we now
disable pipeline caching, since we don't get the shader debug info when
we retrieve cached shaders.

v2: Improvements to resource usage reporting
v3: Disassembly string must be null terminated (string_buffer's length
    does not include the terminator)
v4: Fixed LDS reporting. (Bas)

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-29 00:28:45 +02:00
Christian Gmeiner
0a23841a98 etnaviv: add ext_texture_srgb support
Following piglits are passing:
 - glean@texture_srgb
 - spec@ext_texture_srgb@fbo-srgb
 - spec@ext_texture_srgb@tex-srgb
 - spec@ext_texture_srgb@texwrap formats
 - spec@ext_texture_srgb@texwrap formats-s3tc

Btw. this enables GL 2.1 :-)

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-28 21:20:42 +02:00
Topi Pohjolainen
97e01adfd5 intel/compiler/gen9: Pixel shader header only workaround
Fixes intermittent GPU hangs on Broxton with an Intel internal
test case.

There are plenty of similar fragment shaders in piglit that do
not use any varyings and any uniforms. According to the
documentation special timing is needed between pipeline stages.
Apparently we just don't hit that with piglit. Even with the
failing test case one doesn't always get the hang.

Moreover, according to the error states the hang happens
significantly later than the execution of the problematic shader.
There are multiple render cycles (primitive submissions) in between.
I've also seen error states where the ACTHD points outside the
batch. Almost as if the hardware writes somewhere that gets used
later on. That would also explain why piglit doesn't suffer from
this - most tests kick off one render cycle and any corruption
is left unseen.

v2 (Ken): Instead of enabling push constants, enable one of the
          inputs (PSIZ).
v3 (Ken, Jason): Use LAYER instead making vulkan emit_3dstate_sbe()
                 happy.

Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-10-28 10:07:29 +03:00
Brian Paul
2b612431f5 scons: fix OSMesa driver build
Fixes: ea53d9a8eb "glapi: include generated headers without path"

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-27 20:56:13 -06:00
Brian Paul
05592cebd4 scons: fix scons build to find generated glapitable.h
Fixes: ea53d9a8eb "glapi: include generated headers without path"

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-27 16:26:26 -06:00
Brian Paul
1fe4c7b2af gallium: s/unsigned/enum pipe_prim_type/
In the vbuf_render::set_primitive() functions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-27 16:26:26 -06:00
Roland Scheidegger
3e4fd2d4b1 draw: don't cull tris with zero area
Culling tris with zero area seems like a great idea, but apparently with
fill mode line (and point) we're supposed to draw them, at least some tests
for some other state tracker complained otherwise.
Such tris also always seem to be back facing (not sure if this can be
inferred from anything, since in a mathematical sense it cannot really be
determined), so make sure to account for this when filling in the face
information.
(For solid tris, this is of course unnecessary, drivers will throw the tris
away later in any case.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-27 22:37:19 +02:00
Dylan Baker
f7f12780c8 meson: Add a dependency on nir_opcodes_h for freedreno
This fixes a race condition in the build.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2017-10-27 11:30:13 -07:00
Dylan Baker
f121a669c7 meson: build gallium based osmesa
This has been tested with the osdemo from mesa-demos

v2: - Add SELinux dependency
    - fix typo GALLIUM_LLVM -> GALLIUM_LLVMPIPE

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:45 -07:00
Dylan Baker
cbbd5bb889 meson: build classic osmesa
This builds the classic (non-gallium) osmesa with meson. This has been
tested with the osdemo application from mesa-demos.

v2: - Remove unrelated change
    - Add SELinux dependency to osmesa

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:45 -07:00
Dylan Baker
7503ab687b meson: Add generated files to non-shared glapi
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:07 -07:00
Dylan Baker
ea53d9a8eb glapi: include generated headers without path
This has been tested wtih make dist-check and with meson.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:07 -07:00
Dylan Baker
5daed06da2 osmesa: Include generated headers without path
This makes things much easier to ensure correctness with meson. Tested
with make dist-check and with meson.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:07 -07:00
Dylan Baker
06c6675560 meson: move gallium include declarations to src
These are used by non-gallium osmesa, so they need to be defined outside
of the gallium subdirectory.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:07 -07:00
Dylan Baker
63c360d7b2 meson: fix glprocs.h generator
There was a typo that causes the generated file to be called gl_procs.h
instead.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:07 -07:00
Dylan Baker
1f11ac4395 meson: rename all instances of xf86vm to xxf86vm
Because consistency

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:07 -07:00
Dylan Baker
b92992e95c meson: fix pkg-config Gl Require.Private
xf86vm -> xxf86vm

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-27 11:06:07 -07:00
Kenneth Graunke
4f538c3f99 mesa: Accept GL_BACK in get_fb0_attachment with ARB_ES3_1_compatibility.
According to the ARB_ES3_1_compatibility specification,
glGetFramebufferAttachmentParameteriv is supposed to accept BACK,
and it behaves exactly like BACK_LEFT.

Fixes a GL error in GFXBench 5 Aztec Ruins.

Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-10-27 10:19:07 -07:00
Brian Paul
7a718667f3 gallium/os: fix align_malloc() / os_malloc_aligned() comment mix-up
os_free_aligned() is the counterpart to os_malloc_aligned().
Trivial.
2017-10-27 09:54:26 -06:00
Alejandro Piñeiro
fd011376cb formatquery: use correct target check for IMAGE_FORMAT_COMPATIBILITY_TYPE
From the spec:
   "IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the
    resource when used as an image textures is returned in
    <params>. This is equivalent to calling GetTexParameter"

So we would need to return None for any target not supported by
GetTexParameter. By mistake, we were using the target check for
GetTexLevelParameter.

v2: fix typo (GetTextParameter vs GetTexParemeter) on comment (Illia Mirkin)

Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-27 15:04:03 +02:00
Eric Engestrom
05a94a4dfc meson: bring MESA_GIT_SHA1 in line with other build systems
Meson's vcs_tag() uses the output of `git describe`, eg.
  17.3-branchpoint-5-gfbf29c3cd15ae831e249+

Whereas the other build systems used a script that outputs only the sha1
of the HEAD commit, eg.
  fbf29c3cd1

Given that this information is used by printing it next to the version
number, there's some redundancy here, and inconsistency between build
systems.

Bring Meson in line by making it use the same script, with the added
advantage of now supporting the MESA_GIT_SHA1_OVERRIDE env var.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-27 13:38:37 +01:00
Eric Engestrom
7088622e5f buildsys: move file regeneration logic to the script itself
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-27 13:38:37 +01:00
Samuel Pitoiset
a41e2e9cf5 radv: allow to use a compute shader for resetting the query pool
Serious Sam Fusion 2017 uses a huge number of occlusion queries,
and the allocated query pool buffer is greater than 4096 bytes.

This slightly improves performance (tested in Ultra) from
117.2 FPS to 119.7 FPS (~+2%) on my RX480.

This also improves Talos, from 69 FPS to 72/73 FPS (~+5%).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-27 13:47:06 +02:00
Samuel Pitoiset
0d61109bb7 radv: make radv_fill_buffer() return the needed flush bits
Only needed when the CS path is used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-27 13:47:03 +02:00
Eric Engestrom
4b9421d45d meson: wire up selinux
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-27 11:57:03 +01:00
Eric Engestrom
866c8a94d4 wayland-egl: fix wayland cflags
Fixes: 80bfff5c4f "wayland-egl: adds CFLAGS for wayland.egl.h include"
Suggested-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
2017-10-27 11:57:03 +01:00
Eric Engestrom
5d44e35a8f vc4: fix release build
Mesa's DEBUG and assert's NDEBUG are not tied to each other, so we need
to explicitly compile this code out.

Fixes: 3df7892878 "vc4: Drop reloc_count tracking for debug
       asserts on non-debug builds."
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-27 11:57:03 +01:00
Tapani Pälli
0b131ca427 i965: unref push_const_bo in intelDestroyContext
Valgrind shows that leak is caused by gen6_upload_push_constant, add
unref push_const_bo per stage to destructor to fix this (like done for
scratch_bo).

   ==10952== 144 bytes in 1 blocks are definitely lost in loss record 44 of 66
   ==10952==    at 0x4C30A1E: calloc (vg_replace_malloc.c:711)
   ==10952==    by 0x8C02847: bo_alloc_internal.constprop.10 (brw_bufmgr.c:344)
   ==10952==    by 0x8C425C4: intel_upload_space (intel_upload.c:101)
   ==10952==    by 0x8C22ED0: gen6_upload_push_constants (gen6_constant_state.c:154)

v2: remove if conditions, brw_bo_unreference handles NULL (Ken, Emil)

Fixes: 24891d7c05 ("i965: Store per-stage push constant BO pointers.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-10-27 13:49:13 +03:00
Tapani Pälli
eeb3515c3f i965: remove if conditions from scratch_bo unref
brw_bo_unreference handles NULL case

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-27 13:48:56 +03:00
Kenneth Graunke
70cd05d6ac anv: Fix assert about source attrs.
Asserting slot >= 2 made sense when the URB read offset was always 1
(pair of slots).  Commit 566a0c43f0 made
it possible to read from the VUE header in slot 0, by adjusting the
offset to be 0.  So, this assert is now bogus.  Use the one from GL.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-27 03:01:13 -07:00
Kenneth Graunke
49d3c004f1 anv: Drop URB entry output read handling in 3DSTATE_XS.
Commit 566a0c43f0 started setting the
3DSTATE_SBE bit to override these values with the one calculated there.

So, they're dead.  Stop setting them.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-27 03:01:13 -07:00
Kenneth Graunke
2c873060d3 i965: Delete unused brw_vs_prog_data::nr_attributes field.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-10-27 02:53:38 -07:00
Samuel Pitoiset
dd79aa4ad3 radeonsi: update hack for HTILE corruption in ARK: Survival Evolved
It appears that flushing the DB metadata is actually not sufficient
since the driver uses the new VS blit shaders. This looks quite
strange though, but it seems like we need to flush DB for fixing
the corruption.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102955
Fixes: 69ccb9dae7 (radeonsi: use new VS blit shaders (VS inputs in SGPRs)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-27 10:47:30 +02:00
Dave Airlie
a639d40f13 radv: add support for local bos. (v3)
This uses the new kernel interfaces for reduced cs overhead,
We only set the local flag for memory allocations that don't have
 a dedicated allocation and ones that aren't imports.

v2: add to all the internal buffer creation paths.
v3: missed some command submission paths, handle 0/empty bo lists.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 23:59:28 +01:00
Jason Ekstrand
39c5c12f8f i965/miptree: Take an isl_format in render_aux_usage
Not all rendering matches the miptree format.  We allow rendering to
texture views so there are cases where it may not match.  In those
cases, our current scheme of just passing the value of ctx->sRGBEnabled
isn't viable.  Instead, just do what we do for texturing and pass the
view format in directly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-26 15:24:38 -07:00
Jason Ekstrand
78e50185d6 i965/blorp: Use more temporary isl_format variables
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-26 15:24:38 -07:00
Jason Ekstrand
94389943b6 i965/blorp: Use blorp_to_isl_format for src_isl_format in blit_miptrees
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-26 15:24:38 -07:00
Jason Ekstrand
8ab9820d34 spirv: Claim support for the simple memory model
It's rather surprising that we've never actually hit this before.
Aparently, Ian's SPIR-V generator currently claims the Simple when you
don't do anything complex.  We really shouldn't assert-fail on it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
2017-10-26 15:24:38 -07:00
Rob Herring
90dd6e5bb9 Android: egl: add dependency on libnativewindow
system/window.h is no longer available by default and is part of
libnativewindow, so add it to the shared libraries. It has to be conditional
because the library is only present in O and later.

Really, we should only be depending on vndk/window.h now, but that's only
in O and changing would be pretty invasive.

Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-26 16:06:53 -05:00
Dylan Baker
eb3bb03b34 meson: build nouveau vieux driver
Build tested only.

v2: - fix spelling error (veaux -> vieux)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-26 11:30:56 -07:00
Dylan Baker
eecd2860ff meson: build r200 driver
v2: - remove TODO that is done

Build tested only

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-26 11:30:56 -07:00
Dylan Baker
191e785c81 meson: build r100 driver
build tested only

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-26 11:30:56 -07:00
Dylan Baker
8da36268d4 install_megadrivers: print the full path with driver name
Instead of just the path.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-26 11:30:56 -07:00
Kevin Rogovin
e640b3fe13 intel/tools/disasm: correctly observe FILE *out parameter
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-26 10:43:48 -07:00
Kevin Rogovin
75d10e4c84 intel/compiler: brw_validate_instructions to take const void* instead of void*
The disassembler does not (and should not) be modifying the data.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-26 10:43:48 -07:00
Eric Engestrom
109de3049d loader: drop empty function alias
While at it, drop the duplicate return.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
2017-10-26 16:25:33 +01:00
Marek Olšák
3f8e3c2bd8 radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI
See my LLVM patch which fixes the root cause.

Users have to apply this patch and then they have 2 choices:
- Downgrade to LLVM 5.0
- Update to LLVM git after my LLVM patch is pushed.

It won't be possible to use current and earlier development version
of LLVM 6.0.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 17.3 <mesa-stable@lists.freedesktop.org>
2017-10-26 16:44:01 +02:00
Greg V
9d8c91eb91 util: use OpenBSD/NetBSD code on FreeBSD/DragonFly
Obtained from: FreeBSD ports

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
[Emil Velikov: wrap long line]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-26 15:11:38 +01:00
Greg V
cece4ff6a3 winsys/svga/drm: add ERESTART define for *BSD
Obtained from: FreeBSD ports

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-26 15:11:38 +01:00
Greg V
db8519a369 loader: use drmGetDeviceNameFromFd2 from libdrm
Reduce code duplication and automatically benefit from OS-specific
fixes to libdrm (e.g. in FreeBSD ports).

API was introduced with 2.4.74 and we already require 2.4.75 globally.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103283
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-26 15:11:35 +01:00
Daniel Stone
9f7ed60b3e meson: wayland-egl depends on wayland-client
Since wayland-egl.h is currently provided by the core Wayland package,
depend on wayland-client to make sure we get it in our include path.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 108d257a16 ("meson: build libEGL")
Cc: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Gert Wollny <gw.fossdev@gmail.com>
2017-10-26 13:41:09 +01:00
Rob Clark
4f0f80776f freedreno: implement pipe->invalidate_resource()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-26 08:39:32 -04:00
Rob Clark
9fc7de827e freedreno: GL_ARB_texture_barrier
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-26 08:39:32 -04:00
Rob Clark
a6bd23e43b freedreno/a5xx: rename invalidate_resource()
This is different from pipe->invalidate_resource()..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-26 08:39:32 -04:00
Rob Clark
4afcadbcc2 freedreno/a5xx: mem2gmem is read-only for BO
This should be OUT_RELOC() since the operation isn't writing to the
buffer.  Technically it doesn't matter much currently, since we'd
anyways to a gmem2mem later.  But that will change.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-26 08:39:32 -04:00
Rob Clark
a4744c2ae7 freedreno: small rename
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-26 08:39:32 -04:00
Leo Liu
ea3dc75d72 radeon/video: add gfx9 offsets when rejoin the video surface
For CPU access.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-26 08:30:21 -04:00
Samuel Pitoiset
06a12f250f radv: only copy the dynamic states that changed
When binding a new pipeline, we applied all dynamic states
without checking if they really need to be re-emitted. This
doesn't seem to be useful for the meta operations because only
the viewports/scissors are updated.

This should reduce the number of commands added to the IB
when a new graphics pipeline is bound.

Also, rename radv_dynamic_state_copy() to radv_bind_dynamic_state()
and set the dirty flags directly there.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-26 09:37:05 +02:00
Samuel Pitoiset
b1e31c1911 radv: store the dynamic state mask into radv_dynamic_state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-26 09:37:03 +02:00
Samuel Pitoiset
672cf692fb radv: only emit the depth bounds test values when set dynamically
The depth bounds test values are either set at pipeline
creation or dynamically using vkCmdSetDepthBounds().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-26 09:37:00 +02:00
Iago Toral Quiroga
13652e7516 glsl/linker: Fix type checks for location aliasing
From the OpenGL 4.6 spec, section 4.4.1 Input Layout Qualifiers, Page 68,
(Location aliasing):

   "Further, when location aliasing, the aliases sharing the location
    must have the same underlying numerical type  (floating-point or
    integer)."

The current implementation is too strict, since it checks that the
the base types are an exact match instead.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
7276ccf8ed glsl/linker: refactor check_location_aliasing
Mostly, this merges the type checks with all the other checks so
we only have a single loop for this.

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
e2abb75b0e glsl/linker: validate explicit locations for SSO programs
v2:
- we only need to validate inputs to the first stage and outputs
  from the last stage, everything else has already been validated
  during cross_validate_outputs_to_inputs (Timothy).
- Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Illia)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
bdaf058978 glsl/linker: generalize validate_explicit_variable_location for SSO
For non-SSO programs, we only need to validate outputs, since
the cross validation of outputs to inputs will ensure that we
produce linker errors for invalid inputs too.

Hoever, for the SSO path there is no output to input validation,
so we need to validate inputs explicitly. Generalize the function
so it can handle this as well.

Also, notice that vertex shader inputs and fragment shader outputs
are already validated in assign_attribute_or_color_locations()
for both SSO and non-SSO paths, so we should not try to validate
that here again (in fact, the function would require explicit
paths to handle these two cases properly).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
e7b7fe314e glsl/linker: create a helper function to validate explicit locations
Currently, we only validate explicit locations for non-SSO programs.
This creates a helper that we can call from both SSO and non-SSO paths
directly, so we can reuse all the logic behind this.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
ab40acb453 glsl/linker: outputs in the same location must share auxiliary storage
From ARB_enhanced_layouts:

"[...]when location aliasing, the aliases sharing the location
  must have the same underlying numerical type (floating-point or
  integer) and the same auxiliary storage and
  interpolation qualification.[...]"

Add code to the linker to validate that aliased locations do
have the same aux storage.

Fixes:
KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_auxiliary_storage

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
0b565f715d glsl/linker: outputs in the same location must share interpolation
From ARB_enhanced_layouts:

"[...]when location aliasing, the aliases sharing the location
 must have the same underlying numerical type (floating-point or
 integer) and the same auxiliary storage and
 interpolation qualification.[...]"

Add code to the linker to validate that aliased locations do
have the same interpolation.

Fixes:
KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_interpolation

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
c4545676d7 glsl/linker: fix location aliasing checks for interface variables
The existing code was checking the whole interface variable rather
than its members, which is not what we want: we want to check
aliasing for each member in the interface variable.

Surprisingly, there are piglit tests that verify this and were
passing due to a bug in the existing code: when we were computing
the last component used by an interface variable we would use
the 'vector' path and multiply by vector_elements, which is 0 for
interface variables. This made the loop that checks for aliasing
be a no-op and not add the interface variable to the list of outputs
so then we would fail to link when we did not see a matching output
for the same input in the next stage. Since the tests expect a
linker error to happen, they would pass, but not for the right
reason.

Unfortunately, the current implementation uses ir_variable instances
to keep track of explicit locations. Since we don't have
ir_variables instances for individual interface members, we need
to have a custom struct with the data we need. This struct has
the ir_variable (which for interface members is the whole
interface variable), plus the data that we need to validate for
each aliased location, for now only the base type, which for
interface members we will take from the appropriate field inside
the interface variable.

Later patches will expand this custom struct so we can also check
other requirements for location aliasing, specifically that
we have matching interpolation and auxiliary storage, that once
again, we will take from the appropriate field members for the
interface variables.

v2:
 - Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Illia)

Fixes:
KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations

Fixes (these were passing before but for incorrect reasons):
tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test
tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
6aa68772d4 glsl/linker: refactor link-time validation of output locations
Move the checks for explicit locations to a separate function. We
will use this in a follow-up patch to validate locations for interface
variables where we need to validate each interface member rather than
the interface variable itself.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga
b944617224 glsl/linker: report linker errors for invalid explicit locations on inputs
We were assuming that if an input has an invalid explicit location it would
fail to link because it would not find the corresponding output, however,
since we look for the matching output by indexing the explicit_locations
array with the input location, we still need to ensure that we don't index
out of bounds.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Dave Airlie
16cfbef44c ac/llvm: drop pointless wrappers around umsb/imsb
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:34 +10:00
Dave Airlie
82d47b9d38 ac/llvm: consolidate find lsb function.
This was the same between si and ac.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:31 +10:00
Dave Airlie
de2b241111 ac/llvm: drop v4f32empty. (v2)
This was unused.

v2: drop args.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:22 +10:00
Dave Airlie
a76b6c2192 ac/llvm: add i1false/i1true to common code.
These get used in fair few places.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:18 +10:00
Dave Airlie
88b7ddbe65 ac/llvm: use the ac i32 0/1 and f32 0/1 llvm types.
This just avoids having two copies of these.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:13 +10:00
Dave Airlie
f925f5b074 ac/nir: move lds declaration/load/store into shared code.
This was duplicated between both drivers, share here.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:11 +10:00
Dave Airlie
74fc9e9186 st/mesa: enable nir path for all shaders.
There is no reason to block this here, if a driver enables
it, let it handle it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 00:55:59 +01:00
Dave Airlie
3ee2e98aff st/program: add support for gs/tes/tcs nir shaders.
This probably needs more work but this just add the initial
code to convert gs/tcs/tes nir based shaders in the state tracker.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 00:55:56 +01:00
Dave Airlie
3c34d11589 st/program: rework basic variant interface
This just passes st_common_program and uses it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 00:55:52 +01:00
Jason Ekstrand
3720d913dd anv/entrypoints: Dump useful data if mako throws an exception
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
e0519294c7 nir/opt_intrinsics: Rework progress
This commit fixes two issues:  First, we were returning false regardless
of whether or not the function made progress.  Second, we were calling
nir_metadata_preserve far more often than needed; we only need to call
it once per impl.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
d24311b7b5 intel/compiler: Call nir_lower_system_values in brw_preprocess_nir
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
ece206b848 i965/program: Move nir_lower_system_values higher up
We want this to get called before nir_lower_subgroups which is going in
brw_preprocess_nir.  Now that nir_lower_wpos_ytransform can handle
system values, this should be safe to do.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
2cfa3ef438 nir/lower_wpos_ytransform: Support system value intrinsics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
279f8fb69c anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir
We currently have a bug where nir_lower_system_values gets called before
nir_lower_var_copies so it will miss any system value uses which come
from a copy_var intrinsic.  Moving it to after brw_preprocess_nir fixes
this problem.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-25 16:14:09 -07:00
Jason Ekstrand
afa0ddb81e anv/pipeline: Drop nir_lower_clip_cull_distance_arrays
We already handle it in brw_preprocess_nir

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
e758b6519d anv/pipeline: Dump shader immedately after spirv_to_nir
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
562b8d458c intel/eu: Use EXECUTE_1 for JMPI
The PRM says "The execution size must be 1."  In 73137997e2, the
execution size was set to 1 when it should have been BRW_EXECUTE_1
(which maps to 0).  Later, in dc2d3a7f5c, JMPI was used for
line AA on gen6 and earlier and we started manually stomping the
exeution size to BRW_EXECUTE_1 in the generator.  This commit fixes the
original bug and makes brw_JMPI just do the right thing.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 73137997e2
2017-10-25 16:14:09 -07:00
Alejandro Piñeiro
4723933b8e i965/fs: Add brw_reg_type_from_bit_size utility method
Returns the brw_type for a given ssa.bit_size, and a reference type.
So if bit_size is 64, and the reference type is BRW_REGISTER_TYPE_F,
it returns BRW_REGISTER_TYPE_DF. The same applies if bit_size is 32
and reference type is BRW_REGISTER_TYPE_HF it returns BRW_REGISTER_TYPE_F

v2 (Jason Ekstrand):
 - Use better unreachable() messages
 - Add Q types

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
99778e7f9f i965/fs/nir: Use the nir_src_bit_size helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
fa6e74e33e intel/fs: Handle flag read/write aliasing in needs_src_copy
In order to implement the ballot intrinsic, we do a MOV from flag
register to some GRF.  If that GRF is used in a SEL, cmod propagation
helpfully changes it into a MOV from the flag register with a cmod.
This is perfectly valid but when lower_simd_width comes along, it simply
splits into two instructions which both have conditional modifiers.
This is a problem since we're reading the flag register.  This commit
makes us check whether or not flags_written() overlaps with the flag
values that we are reading via the instruction source and, if we have
any interference, will force us to emit a copy of the source.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-25 16:14:09 -07:00
Jan Vesely
a6d38f476b clover: Fix compilation after clang r315871
v2: use a more generic compat function
v3: rename and formatting cleanup

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103388
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
CC: <mesa-stable@lists.freedesktop.org>
2017-10-25 18:57:42 -04:00
Marek Olšák
b85cd69415 glsl_to_tgsi: remove unused glsl_version variable
trivial
2017-10-26 00:43:31 +02:00
Bas Nieuwenhuizen
61a9ef4ab1 radv: Compute ac keys from pipeline key.
The beginning of the end for the shader keys. Not entirely sure
what I'm going to replace them with for the compiler though, so this
is the first step.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 00:28:40 +02:00
Bas Nieuwenhuizen
49d035122e radv: Add single pipeline cache key.
To decouple the key used for info gathering and the cache from
whatever we pass to the compiler.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 00:28:40 +02:00
Bas Nieuwenhuizen
de38491a57 radv: Don't compute as_ls/as_es before hashing.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 00:28:40 +02:00
Jordan Justen
87e71726e0 glsl_to_nir: Zero nir_constant in constant_copy for valgrind & nir_serialize
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 12:36:21 -07:00
Jordan Justen
16867154d8 glsl_to_nir: Zero nir_variable struct for valgrind & nir_serialize
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 12:36:21 -07:00
Jordan Justen
78550869a1 nir: Zero nir_load_const_instr::value for valgrind & nir_serialize
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 12:36:21 -07:00
Jordan Justen
b35e8c3b86 intel/nir: Zero local index const struct for valgrind & nir_serialize
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 12:36:21 -07:00
Jordan Justen
d917f57c2f nir: Zero local_size const struct for valgrind & nir_serialize
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 12:36:21 -07:00
Jordan Justen
abbcdc9b69 glsl: Add field initializers for glsl_struct_field default constructor
This helps valgrind when encode_type_to_blob is used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 12:36:21 -07:00
Jason Ekstrand
23327af91c compiler/types: Support [de]serializing void types
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-25 12:36:21 -07:00
Jason Ekstrand
c1b84256cc nir/intrinsics: Set the correct num_indices for load_output
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-25 12:36:20 -07:00
Connor Abbott
7686f0b316 glsl: move shader_cache type handling to glsl_types
Not sure if this is the best place to put it, but we're going to need
this for NIR too.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 12:36:20 -07:00
Alex Smith
9626128f32 vulkan: Update headers and registry to 1.0.64
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
2017-10-26 05:17:57 +10:00
Matthew Nicholls
27a0b24bf2 ac/nir: generate correct instruction for atomic min/max on unsigned images
v2: fix silly typo

Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 20:52:58 +02:00
Roland Scheidegger
20c77ae639 gallium/util: remove some block alignment assertions
These assertions were revisited a couple of times in the past, and they
still weren't quite right.
The problem I was seeing (with some other state tracker) was a copy between
two 512x512 s3tc textures, but from mip level 0 to mip level 8. Therefore,
the destination has only size 2x2 (not a full block), so the box width/height
was only 2, causing the assertion to trigger for src alignment.
As far as I can tell, such a copy is completely legal, and because a correct
assertion would get ridiculously complicated just get rid of it for good.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-25 19:52:24 +02:00
Eric Engestrom
7983adc60f meson: be explicit about the version required
This way, we know what we're allowed to use (no nested include lists
for instance) and users get immediate feedback when trying to use
unsupported versions, rather than a cryptic crash or things being
silently not built correctly.

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-25 14:05:56 +01:00
Erik Faye-Lund
9e5a5a11ed meson: add opt-out of libunwind
Libunwind has some issues on some platforms, so let's allow people
who have issues to opt-out. This is similar to what we do in automake,
and the implementation is modelled after our opt-out for valgrind.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-25 14:05:24 +02:00
Harish Krupo
d37bcf3cc2 gles2: support for GL_EXT_occlusion_query_boolean
Following test checking entrypoints passes:
   dEQP-EGL.functional.get_proc_address.extension.gl_ext_occlusion_query_boolean

Piglit test 'ext_occlusion_query_boolean-any-samples' passes with these changes.

No changes/regression observed in WebGL occlusion tests or Intel CI.

v2: add es2="2.0" for glapi entrypoints, clean up xml
    dispatch_sanity changes (fix 'make check')

Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-25 14:10:38 +03:00
Tapani Pälli
f5bec8583a mesa: enum checks for GL_EXT_occlusion_query_boolean
Some of the checks are valid for generic ES 3.2 as well.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-25 14:10:38 +03:00
Samuel Pitoiset
9711979df0 radv: print NIR before LLVM IR and disassembly
It's still printed after linking, but it makes more sense to
have SPIRV->NIR->LLVM IR->ASM.

Fixes: f0a2bbd1a4 (radv: move nir print after linking is done)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 11:46:53 +02:00
Bas Nieuwenhuizen
5bfbab2fdc radv: Fix truncation issue hexifying the cache uuid for the disk cache.
Going from binary to hex has a 2x blowup.

Fixes: 1421625292 'radv: create on-disk shader cache'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-25 09:50:05 +02:00
Timothy Arceri
767ca5bdf1 radv: enable lower to scalar nir pass
This will allow dead components of varyings to be removed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 17:02:40 +11:00
Timothy Arceri
8ebaf8192a ac: add support for explicit component packing
This is needed for RADV to support explicit component packing.

This is also required to use the new NIR component splitting /
packing passes.

V2:
 - add commponent packing support for interpolate_at* intrinsics
 - improve store packing support when not all varyings are scalar
   as spotted by Bas the store source was incorrectly offset.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 17:02:40 +11:00
Timothy Arceri
e0e0666584 i965: fix unused var warnings in release build
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-25 14:26:39 +11:00
Dave Airlie
d8cefaa197 radv: use device name in cache creation like radeonsi.
Not sure how useful this is, but it makes it more consistent.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-25 02:26:01 +01:00
Dave Airlie
3cd3035ace radv: use a define for the transition point between cp and compute shader
For certain buffer meta ops we can use the CP or a compute shader,
we should use a define to rather than hardcoding 4096, allows
for easier testing and more consistency.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-25 10:01:13 +10:00
Kenneth Graunke
b704538b00 docs: Mark GL_KHR_no_error as done.
Drivers have supported KHR_no_error for a while.  We'd been leaving it
marked as "in progress" because there's a zillion places that could get
slightly more optimized.  But, Timothy and Samuel have already done
piles of work, and I think we have a solid implementation at this point.

Let's check it off the list.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-24 16:56:58 -07:00
Kenneth Graunke
66b4a7a79e i965: Call gen6_upload_push_constants() even when the stage is disabled.
This properly sets stage_state->push_constant_dirty = true, so that we
emit 3DSTATE_CONSTANT_XS to disable the constant buffer for the shader
stage.  It also sets stage_state->push_const_size = 0.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-10-24 16:14:04 -07:00
Kenneth Graunke
16096e9119 i965: Drop a bunch of downcasting and upcasting of gl_program pointers.
We have a gl_program and we want a gl_program.  There's no point in
converting to brw_program and back again.  This probably made more
sense in the old days before Tim dropped a layer of subclassing.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-10-24 16:14:02 -07:00
Kenneth Graunke
90ed2a10bb i965: Move _mesa_shader_write_subroutine_indices down a level.
Now we call it in one place instead of making every caller do it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-10-24 16:13:59 -07:00
Dave Airlie
a5499b639c radv: only emit dfsm packets if dfsm is allowed.
radeonsi only emits these when dfsm is enabled, so for now
just hinge them on a flag we never set.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-24 23:00:57 +01:00
Rob Clark
4aa69cc425 meson: build freedreno
Mostly copy/pasta from Dylan Baker's conversion of nouveau and i965.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-24 15:33:40 -04:00
Rob Clark
2207af032b meson: extract out variable for nir_algebraic.py
Also needed in freedreno/ir3.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-24 15:33:40 -04:00
Rob Clark
0ca8d53215 freedreno/ir3: use a flag instead of setting PYTHONPATH
Similar to 848da66222, pass an arg to
ir3_nir_trig.py to add to python path, rather than using $PYTHONPATH,
to prep for meson build support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-24 15:33:40 -04:00
Kenneth Graunke
583ce96c94 i965: Don't disable CCS for RT dependencies when dispatching compute.
Compute shaders don't have access to the framebuffer, so there's no
point in worrying whether a texture is bound as a render target.

This saves a bunch of resolves in GFXBench4 Manhattan 3.1, but doesn't
seem to impact performance at all, at least on Apollolake.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-10-24 11:31:33 -07:00
Eric Anholt
e91c3540fc i965: Fix memmem compiler warnings.
gcc is throwing this warning in my meson build:

../src/intel/compiler/brw_eu_validate.c:50:11: warning
argument 1 null where non-null expected [-Wnonnull]
    return memmem(haystack.str, haystack.len,
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                  needle.str, needle.len) != NULL;
                                  ~~~~~~~~~~~~~~~~~~~~~~~

The first check for CONTAINS has a NULL error_msg.str and 0 len.  The
glibc implementation will exit without looking at any haystack bytes if
haystack.len < needle.len, so this was safe, but silence the warning
anyway by guarding against implementation variablility.

Fixes: 122ef3799d ("i965: Only insert error message if not already present")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-10-24 10:51:18 -07:00
Rob Clark
eed9685dd6 freedreno: per-context fd_pipe
To enable per-context priorities, we need to have per-context pipe's.
Unfortunately we still need to keep the global screen pipe, mostly just
for screen->get_timestamp().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-24 12:56:51 -04:00
Rob Clark
9c32333a58 freedreno: rename pipe -> vsc_pipe
To add context priority support we need to have an fd_pipe per context,
rather than per-screen.  Which conflicts with existing ctx->pipe (which
is actually a visibility stream pipe (hw resource).  So just rename it.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-24 12:56:51 -04:00
Rob Clark
7e7096307a freedreno: pass context flags through to fd_context_init()
Prep work for later patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-24 12:56:51 -04:00
Brian Paul
7a6c6e73a8 gallium/util: use util_snprintf() in u_socket_connect()
Instead of plain snprintf().  To fix the MSVC build.

snprintf() is used in various places in Mesa/gallium, but apparently,
not in code built with MSVC.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-24 08:17:15 -06:00
Benjamin Gordon
de3555f834 configure: Allow android as an EGL platform
I'm working on radeonsi support in the Chrome OS Android container
(ARC++).  Mesa in ARC++ uses autotools instead of Android.mk, but all
the necessary EGL bits are there, so the existing check is too strict.

Signed-off-by: Benjamin Gordon <bmgordon@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-24 14:46:22 +01:00
Marek Olšák
2a414c3961 radeonsi: postponed KILL isn't postponed anymore, but maintains WQM
This restores performance for the drirc workaround, i.e.
KILL_IF does:
   visible = src0 >= 0;
   kill_flag &= visible; // accumulate kills
   amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only

And all helper pixels are killed at the end of the shader:
   amdgcn_kill(kill_flag);

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Marek Olšák
da0083f123 radeonsi: use postponed KILL only when derivatives are used
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Marek Olšák
478afbe525 ac: use llvm.amdgcn.kill with LLVM 6.0
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Marek Olšák
1ff9e27cbd ac: replace ac_build_kill with ac_build_kill_if_false
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Timothy Arceri
f0a2bbd1a4 radv: move nir print after linking is done
We now have linking optimisations so we want to delay dumping the
nir until after these are complete.

Fixes: 06f05040eb (radv: Link shaders)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-24 10:41:38 +11:00
Dave Airlie
11d688d9f0 mesa/bufferobj: don't double negate the range
This fixes a regression I introduced refactoring this code,
I managed to invert range twice, I moved the inversion into
the common code, but forgot to stop doing it in the callee.

Fixes: GL45-CTS.multi_bind.dispatch_bind_buffers_base

Fixes: 35ac13ed3 (mesa/bufferobj: consolidate some codepaths between ubo/ssbo/atomics.)
Reported-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-24 08:40:23 +10:00
Timothy Arceri
013313cf89 radv: clone meta shaders before linking
The IR is reused in different pipeline combinations so we need
to clone it to avoid link time optimistaions messing up the
original copy.

Fixes: 06f05040eb (radv: Link shaders)

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-24 09:27:40 +11:00
Brian Paul
069211f205 gallium/util: don't call close() on Windows in u_tests.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 15:10:44 -06:00
Brian Paul
5134c0dedf mesa: use util_strdup() macro in u_debug_symbol.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 15:10:38 -06:00
Brian Paul
89372220b3 mesa: use util_strdup() macro in symbol_table.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 15:10:32 -06:00
Brian Paul
acd6ea0cc0 util: add util_strdup() wrapper macro
To work around MSVC warning that strdup() is a deprecated POSIX function.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 15:10:24 -06:00
Brian Paul
6230773936 gallium/util: replace gethostbyname() with getaddrinfo()
Compiling with MSVC options /we4995 /we4996 (a subset of /sdl) generates
a warning that the gethostbyname() function is deprecated in favor of
getaddrinfo() or GetAddrInfoW().  Replace the call with getaddrinfo().

Untested.  There are no callers to u_socket_connect() in Gallium.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 15:10:01 -06:00
Alex Smith
fee9d05e21 radv: Update code pointer correctly if a variant is already created
This was the actual cause of GPU hangs fixed by 0fdd531457 ("radv:
Fix pipeline cache locking issues"), since multiple threads would end
up trying to create the variants for a single entry.

Now that we're locking around the whole of this function, this isn't
really necessary (we either create all or none of the variants), but
fix this anyway in case things change later.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 17.3 <mesa-stable@lists.freedesktop.org>
2017-10-23 22:36:54 +02:00
Kenneth Graunke
013d331220 i965: Revert absolute mode for constant buffer pointers.
The kernel doesn't initialize the value of the INSTPM or CS_DEBUG_MODE2
registers at context initialization time.  Instead, they're inherited
from whatever happened to be running on the GPU prior to first run of a
new context.  So, when we started setting these, other contexts in the
system started inheriting our values.  Since this controls whether
3DSTATE_CONSTANT_* takes a pointer or an offset, getting the wrong
setting is fatal for almost any process which isn't expecting this.

Unfortunately, VA-API and Beignet don't initialize this (nor does older
Mesa), so they will die horribly if we start doing this.  UXA and SNA
don't use any push constants, so they are unaffected.

Until we have some kind of solution to this problem, I'm going to revert
this patch and abandon using the feature for now.  It will lead to fewer
pushed UBO ranges on Broadwell+, which may lead to lower performance,
though I don't have any data on the impact.

Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102774
2017-10-23 12:03:22 -07:00
Dylan Baker
d4567efa5c meson: build imx driver
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-23 11:45:55 -07:00
Dylan Baker
51558a1d6c meson: build etnaviv driver + winsys
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-23 11:45:38 -07:00
Eric Anholt
ba85525fce ac: Silence a compiler warning about results[0].
We know that num_components will be > 0, but it doesn't.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 10:14:40 -07:00
Eric Anholt
34c04c734f ac: Fix a compiler warning for possibly undefined "name"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 10:14:40 -07:00
Dylan Baker
77f7ef0287 meson: fix egl build for meson version < 0.43
Meson 0.43 added the ability to pass nested lists to
include_directories, so the code that we have works for 0.43, but not
for 0.42. This patch changes the include_directories list to be flat so
it works with 0.42

fixes: 108d257a16 ("meson: build libEGL")
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-23 10:14:40 -07:00
Nicolai Hähnle
f9ccfda9bc amd/common/gfx9: workaround DCC corruption more conservatively
Fixes KHR-GL45.texture_swizzle.smoke and others on Vega.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-23 18:10:20 +02:00
Emil Velikov
a90b4329df docs/release-calendar: update - 17.3.0-rc1 is out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-23 14:31:15 +01:00
Ilia Mirkin
4d24a7cb97 glsl: fix derived cs variables
There are two issues with the current implementation. First, it relies
on the layout(local_size_*) happening in the same shader as the main
function, and secondly it doesn't work for variable group sizes.

In both cases, the simplest fix is to move the setup of these derived
values to a later time, similar to how the gl_VertexID workarounds are
done. There already exist system values defined for both of the derived
values, so we use them unconditionally, and lower them after linking is
performed.

While we're at it, we move to using gl_LocalGroupSizeARB instead of
gl_WorkGroupSize for variable group sizes.

Also the dead code elimination avoidance can be removed, since there
can be situations where gl_LocalGroupSizeARB is needed but has not been
inserted for the shader with main function. As a result, the lowering
code has to insert its own copies of the system values if needed.

Reported-by: Stephane Chevigny <stephane.chevigny@polymtl.ca>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103393
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-23 08:34:56 -04:00
Emil Velikov
4302df8c8e docs: add 17.4.0-devel release notes template
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-23 13:07:06 +01:00
Emil Velikov
c85d64cf67 mesa: bump version to 17.4.0-devel
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-23 13:00:43 +01:00
2267 changed files with 201840 additions and 76734 deletions

View File

@@ -148,6 +148,8 @@ Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
Eric Engestrom <eric@engestrom.ch> <eric.engestrom@imgtec.com>
Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
Fabian Bieler <der.fabe@gmx.net> <fabianbieler@fastmail.fm>

View File

@@ -17,11 +17,11 @@ env:
- DRI2PROTO_VERSION=dri2proto-2.8
- LIBPCIACCESS_VERSION=libpciaccess-0.13.4
- LIBDRM_VERSION=libdrm-2.4.74
- XCBPROTO_VERSION=xcb-proto-1.11
- LIBXCB_VERSION=libxcb-1.11
- XCBPROTO_VERSION=xcb-proto-1.13
- LIBXCB_VERSION=libxcb-1.13
- LIBXSHMFENCE_VERSION=libxshmfence-1.2
- LIBVDPAU_VERSION=libvdpau-1.1
- LIBVA_VERSION=libva-1.6.2
- LIBVA_VERSION=libva-1.7.0
- LIBWAYLAND_VERSION=wayland-1.11.1
- WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
- PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig
@@ -30,6 +30,40 @@ env:
matrix:
include:
- env:
- LABEL="meson Vulkan"
- BUILD=meson
- MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
addons:
apt:
sources:
- llvm-toolchain-trusty-4.0
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-4.0-dev
# Common
- xz-utils
- libexpat1-dev
- libelf-dev
- python3-pip
- env:
- LABEL="meson loaders/classic DRI"
- BUILD=meson
- MESON_OPTIONS="-Dvulkan-drivers= -Dgallium-drivers="
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- python3-pip
- env:
- LABEL="make loaders/classic DRI"
- BUILD=make
@@ -58,12 +92,10 @@ matrix:
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC="gcc-4.8"
- OVERRIDE_CXX="g++-4.8"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
@@ -73,13 +105,41 @@ matrix:
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
- llvm-toolchain-trusty-4.0
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
- llvm-4.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium Drivers RadeonSI"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-4.0
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-4.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
@@ -99,7 +159,7 @@ matrix:
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
- GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
@@ -134,7 +194,7 @@ matrix:
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- GALLIUM_DRIVERS="r600"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
@@ -271,10 +331,8 @@ matrix:
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel check"
- LLVM_VERSION=3.9
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl --with-platforms=x11,wayland"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --enable-dri3 --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
@@ -284,13 +342,12 @@ matrix:
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
- llvm-toolchain-trusty-4.0
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
- llvm-4.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
@@ -342,7 +399,7 @@ matrix:
- BUILD=scons
- SCONSFLAGS="-j4"
- SCONS_TARGET="swr=1"
- LLVM_VERSION=3.9
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# Keep it symmetrical to the make build. There's no actual SWR, yet.
- SCONS_CHECK_COMMAND="true"
@@ -351,13 +408,13 @@ matrix:
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
- llvm-toolchain-trusty-4.0
packages:
- scons
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
- llvm-4.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
@@ -365,45 +422,49 @@ matrix:
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="meson Vulkan"
- BUILD=meson
- MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- libexpat1-dev
- libelf-dev
- python3-pip
- LABEL="macOS make"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make check"
- DRI_LOADERS="--with-platforms=x11 --disable-egl"
os: osx
- env:
- LABEL="meson loaders/classic DRI"
- LABEL="macOS meson"
- BUILD=meson
- MESON_OPTIONS="-Dvulkan-drivers= -Dgallium-drivers="
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- python3-pip
- MESON_OPTIONS="-Degl=false"
os: osx
before_install:
- |
if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then
HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext
# Set PATH for homebrew pip3 installs
PATH="$HOME/Library/Python/3.6/bin:${PATH}"
# Set PKG_CONFIG_PATH for keg-only expat
PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"
# Set PATH for keg-only gettext
PATH="/usr/local/opt/gettext/bin:${PATH}"
# Install xquartz for prereqs ...
XQUARTZ_VERSION="2.7.11"
wget -nv https://dl.bintray.com/xquartz/downloads/XQuartz-${XQUARTZ_VERSION}.dmg
hdiutil attach XQuartz-${XQUARTZ_VERSION}.dmg
sudo installer -pkg /Volumes/XQuartz-${XQUARTZ_VERSION}/XQuartz.pkg -target /
hdiutil detach /Volumes/XQuartz-${XQUARTZ_VERSION}
# ... and set paths
PATH="/opt/X11/bin:${PATH}"
PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"
ACLOCAL="aclocal -I /opt/X11/share/aclocal -I /usr/local/share/aclocal"
fi
install:
- pip install --user mako
- pip2 install --user mako
# Install the latest meson from pip, since the version in the ubuntu repos is
# often quite old.
# Install a more modern meson from pip, since the version in the
# ubuntu repos is often quite old. Avoid >=0.45.0 as it needs python
# 3.5+
- if test "x$BUILD" = xmeson; then
pip3 install --user meson;
pip3 install --user "meson<0.45.0";
fi
# Since libdrm gets updated in configure.ac regularly, try to pick up the
@@ -419,62 +480,64 @@ install:
# Install dependencies where we require specific versions (or where
# disallowed by Travis CI's package whitelisting).
- wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
- tar -jxvf $XORGMACROS_VERSION.tar.bz2
- (cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)
- |
if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then
wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
tar -jxvf $XORGMACROS_VERSION.tar.bz2
(cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
- tar -jxvf $GLPROTO_VERSION.tar.bz2
- (cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
tar -jxvf $GLPROTO_VERSION.tar.bz2
(cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
- tar -jxvf $DRI2PROTO_VERSION.tar.bz2
- (cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
tar -jxvf $DRI2PROTO_VERSION.tar.bz2
(cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
- tar -jxvf $XCBPROTO_VERSION.tar.bz2
- (cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
tar -jxvf $XCBPROTO_VERSION.tar.bz2
(cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
- tar -jxvf $LIBXCB_VERSION.tar.bz2
- (cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
tar -jxvf $LIBXCB_VERSION.tar.bz2
(cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
- tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2
- (cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2
(cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
- tar -jxvf $LIBDRM_VERSION.tar.bz2
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)
wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
tar -jxvf $LIBDRM_VERSION.tar.bz2
(cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)
- wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
- tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
- (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
(cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
- tar -jxvf $LIBVDPAU_VERSION.tar.bz2
- (cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget http://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
tar -jxvf $LIBVDPAU_VERSION.tar.bz2
(cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
- tar -jxvf $LIBVA_VERSION.tar.bz2
- (cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
wget http://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
tar -jxvf $LIBVA_VERSION.tar.bz2
(cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
- wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
- tar -axvf $LIBWAYLAND_VERSION.tar.xz
- (cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
tar -axvf $LIBWAYLAND_VERSION.tar.xz
(cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
- wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
- tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz
- (cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz
(cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)
# Meson requires ninja >= 1.6, but trusty has 1.3.x
- wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip;
- unzip ninja-linux.zip
- mv ninja $HOME/prefix/bin/
# Meson requires ninja >= 1.6, but trusty has 1.3.x
wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip
unzip ninja-linux.zip
mv ninja $HOME/prefix/bin/
# Generate the header since one is missing on the Travis instance
- mkdir -p linux
- printf "%s\n" \
# Generate this header since one is missing on the Travis instance
mkdir -p linux
printf "%s\n" \
"#ifndef _LINUX_MEMFD_H" \
"#define _LINUX_MEMFD_H" \
"" \
@@ -485,6 +548,7 @@ install:
"#define MFD_ALLOW_SEALING 0x0002U" \
"" \
"#endif /* _LINUX_MEMFD_H */" > linux/memfd.h
fi
script:
- if test "x$BUILD" = xmake; then
@@ -512,8 +576,28 @@ script:
scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
fi
- if test "x$BUILD" = xmeson; then
export CFLAGS="$CFLAGS -isystem`pwd`";
meson _build $MESON_OPTIONS;
ninja -C _build;
- |
if test "x$BUILD" = xmeson; then
# Travis CI has moved to LLVM 5.0, and meson is detecting
# automatically the available version in /usr/local/bin based on
# the PATH env variable order preference.
#
# As for 0.44.x, Meson cannot receive the path to the
# llvm-config binary as a configuration parameter. See
# https://github.com/mesonbuild/meson/issues/2887 and
# https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
#
# We want to use the custom (APT) installed version. Therefore,
# let's make Meson find our wanted version sooner than the one
# at /usr/local/bin
#
# Once this is corrected, we would still need a patch similar
# to:
# https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
test -f /usr/bin/$LLVM_CONFIG && ln -s /usr/bin/$LLVM_CONFIG $HOME/prefix/bin/llvm-config
export CFLAGS="$CFLAGS -isystem`pwd`"
meson _build $MESON_OPTIONS
ninja -C _build
fi

View File

@@ -31,6 +31,7 @@ LOCAL_C_INCLUDES += \
MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)
LOCAL_CFLAGS += \
-Wno-error \
-Wno-unused-parameter \
-Wno-pointer-arith \
-Wno-missing-field-initializers \
@@ -51,11 +52,15 @@ LOCAL_CFLAGS += \
-DHAVE___BUILTIN_EXPECT \
-DHAVE___BUILTIN_FFS \
-DHAVE___BUILTIN_FFSLL \
-DHAVE_DLFCN_H \
-DHAVE_FUNC_ATTRIBUTE_FLATTEN \
-DHAVE_FUNC_ATTRIBUTE_UNUSED \
-DHAVE_FUNC_ATTRIBUTE_FORMAT \
-DHAVE_FUNC_ATTRIBUTE_PACKED \
-DHAVE_FUNC_ATTRIBUTE_ALIAS \
-DHAVE_FUNC_ATTRIBUTE_NORETURN \
-DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL \
-DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT \
-DHAVE___BUILTIN_CTZ \
-DHAVE___BUILTIN_POPCOUNT \
-DHAVE___BUILTIN_POPCOUNTLL \
@@ -65,6 +70,9 @@ LOCAL_CFLAGS += \
-DHAVE_PTHREAD=1 \
-DHAVE_DLADDR \
-DHAVE_DL_ITERATE_PHDR \
-DHAVE_LINUX_FUTEX_H \
-DHAVE_ENDIAN_H \
-DHAVE_ZLIB \
-DMAJOR_IN_SYSMACROS \
-fvisibility=hidden \
-Wno-sign-compare

View File

@@ -39,6 +39,7 @@ endif
MESA_DRI_MODULE_REL_PATH := dri
MESA_DRI_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/$(MESA_DRI_MODULE_REL_PATH)
MESA_DRI_MODULE_UNSTRIPPED_PATH := $(TARGET_OUT_SHARED_LIBRARIES_UNSTRIPPED)/$(MESA_DRI_MODULE_REL_PATH)
MESA_DRI_LDFLAGS := -Wl,--build-id=sha1
MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
MESA_PYTHON2 := python

View File

@@ -35,6 +35,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-glx-tls \
--enable-nine \
--enable-opencl \
--enable-opencl-icd \
--enable-opengl \
--enable-va \
--enable-vdpau \
@@ -44,7 +45,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-libunwind \
--with-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv,imx \
--with-vulkan-drivers=intel,radeon
ACLOCAL_AMFLAGS = -I m4
@@ -58,7 +59,13 @@ EXTRA_DIST = \
scons \
SConstruct \
build-support/conftest.dyn \
build-support/conftest.map
build-support/conftest.map \
meson.build \
meson_options.txt \
bin/meson.build \
include/meson.build \
bin/install_megadrivers.py \
bin/meson_get_version.py
noinst_HEADERS = \
include/c99_alloca.h \
@@ -69,12 +76,14 @@ noinst_HEADERS = \
include/drm-uapi/drm_fourcc.h \
include/drm-uapi/drm_mode.h \
include/drm-uapi/i915_drm.h \
include/drm-uapi/tegra_drm.h \
include/drm-uapi/vc4_drm.h \
include/D3D9 \
include/GL/wglext.h \
include/HaikuGL \
include/no_extern_c.h \
include/pci_ids
include/pci_ids \
include/vulkan
# We list some directories in EXTRA_DIST, but don't actually want to include
# the .gitignore files in the tarball.

View File

@@ -74,6 +74,15 @@ EGL
R: Eric Engestrom <eric@engestrom.ch>
F: src/egl/
HAIKU
R: Alexander von Gluck IV <kallisti5@unixzen.com>
F: include/HaikuGL/
F: src/egl/drivers/haiku/
F: src/gallium/state_trackers/hgl/
F: src/gallium/targets/haiku-softpipe/
F: src/gallium/winsys/sw/hgl/
F: src/hgl/
GALLIUM LOADER
R: Emil Velikov <emil.l.velikov@gmail.com>
F: src/gallium/auxiliary/pipe-loader/

View File

@@ -1 +1 @@
17.3.0-devel
18.1.9

View File

@@ -35,13 +35,13 @@ clone_depth: 100
cache:
- win_flex_bison-2.5.9.zip
- llvm-3.3.1-msvc2013-mtd.7z
- llvm-3.3.1-msvc2015-mtd.7z
os: Visual Studio 2013
os: Visual Studio 2015
environment:
WINFLEXBISON_ARCHIVE: win_flex_bison-2.5.9.zip
LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z
LLVM_ARCHIVE: llvm-3.3.1-msvc2015-mtd.7z
install:
# Check pip
@@ -69,10 +69,10 @@ install:
- set LLVM=%CD%\llvm
build_script:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.0 llvm=1
after_build:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 check
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.0 llvm=1 check
# It's possible to setup notification here, as described in

129
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,129 @@
d89f58a6b8436b59dcf3b896c0ccddabed3f78fd
a7d0c53ab89ca86b705014925214101f5bc4187f
# These patches are ignored becuase Jason provided a version rebased on the 18.1
# branch that was pulled instead
#
778e2881a04817e8c10c7a400bf1e37414420194
3b54dd87f707a0fa40a1555bee64aeb06a381c27
eeae4851494c16d2a6591550bfa6ef77d887ebe3
a26693493570a9d0f0fba1be617e01ee7bfff4db
0e7f3febf7e739c075a139ae641d65a0618752f3
# This has a warning that it fixes more than one commit, but isn't needed in
# 18.1
#
a1220e73116bad74f39c1792a0b0cf0e4e5031db
# This doesn't apply and isn't necessary since
# 1cc2e0cc6b47bd5efbf2af266405060785085e6b isn't in the 18.1 branch
#
587e712eda95c31d88ea9d20e59ad0ae59afef4f
# This requires too many previous patch, and Marek (the author) decided to
# to drop it from stable
#
cac7ab1192eefdd8d8b3f25053fb006b5c330eb8
# This patch is excluded since it requires additional patches to be pulled,
# and is mainly aimed at developers, who rarely (if ever) work in the
# stable branch
#
a2f5292c82ad07731d633b36a663e46adc181db9
# This patch required manual backport, which was provided as
# 3953467ee7851792c1d4b1c9435499545a7da9fc
#
4a67ce886a7b3def5f66c1aedf9e5436d157a03c
# This patch required manual backport, which was provided as
# 31677c5aa867e457cd06ae25150be2155e8da3c6
#
1f616a840eac02241c585d28e9dac8f19a297f39
# Jason de-nominated this because it "a) shouldn't be needed and b) is horribly
# broken"
#
11712b9ca17e4e1a819dcb7d020e19c6da77bc90
# None of these are tagged for 18.1, they're for 18.2, but get-pick-list
# still finds them for some reason
#
1c7a2433b270afb65f044d0cf49cb67715f50b5b
0f79b2015bc0c44a8ed470684b6789f0e2e6aa6c
ccbe33af5b086f4b488ac7ca8a8a45ebc9ac189c
3f9cb2eb05152f4f0269e97893a16f23261f095b
f2c0d310d6efe560de8192ab468ba02d50c9ac1e
50a8713d4f90a6c70a23f9f5871420371df283a7
1561e4984eb03d6946d19b820b83a96bbbd83b98
66e12451ac4e4e1c05a48b2cd2b0d3186f779f20
73b342c7a52a93d283799800824311639f372de0
71d5b2fbf83061a1319141d26942771e8c75ff2b
011a811652c74dcc9f56506ebb6075e4bdfe6ef9
f3a78a9da01218df0067b24b52204a4e5f01bc69
f9e8456c39136aa41f85f82758a00e5aa2aab334
0aacb5eab6120aa1410966d23101e16eea3fbcd7
a4a104fc81e93555899050efac23c3cd6ba762ab
24ee53231da84a1be5ec08abebe8a2ff6aa019ca
4c43ec461de4f122d5d6566361d064c816e4ef69
743e11c10b180247488ae0cc24900560e0a74e2b
4ffb575da59fd3aece02734ca4fd3212d5002d55
8c048af5890d43578ca41eb9dcfa60cb9cc3fc9c
c92a463d2341dd7893dd8b54775930ed9be72ac0
ea1e50cc166ae855f9fa91ca6a4f944123298e4e
f73f748323ef5a421ffd8fa0f02afd9627e31023
d4e52281aa9c1acc92619736da8b67d8c02ce380
a5f35aa742c3f1e2fae6a6c2fb53f92822f0cb70
f6e09db2e613c215257b80f40957d580165b5ddf
d4bf954fe61ec231be2bfa5e059f0fb7f6150bd1
abdf396cbeaec2bfe9da2fd773d42fa3022ca8b5
b9f6521157ab55073eec528cacc1f3b567e49503
aa3020592964344c7032396d159e4ab2df743587
063264db5be2941746fa58f164cdc803362753a9
748f4cce183007587a6688ef25ad5f9dbea5c33c
9de062ef207c6062d1fabb70209f4bbc9dc4732d
7d1d1208c2b38890fe065b6431ef2e3b7166bae4
0796c3934ebfe3448acf2d63f478f51c08e33046
864c780566b8782c4fc69b4337db768223717bd8
# These have more than one fixes tag and generate a warning
#
24839663a40257e0468406d72c48d431b5ae2bd4
6ff1c479968819b93c46d24bd898e89ce14ac401
ac0856ae4100a05dcd1fd932d9fd10200f8f7a7c
c9f54486959716762e6818dabb0a73a8cd46df67
# This patch requires patches that would require python 3, the thing they're
# fixing isn't really even a bug, it's more of a style choice, so I'm not going
# to pull them.
#
48820ed8da0ad50d51a58f26e156d82b685492e2
# This patch doesn't apply and either needs to be backported or can just be
# ignored
#
1e40f6948310be07abb2d0198e6602769892cdac
# This patch isn't necessary, since the patch it fixes is not present in 18.1
#
a72dbc461bdb7714656e62cd8f4b00a404c2e6e0
# This requires a much more significant patch not present in 18.1
#
4dc244eb447b1fa4e39d67a58328ed774395c901
# This patches were dropped since they only fix developer tools, which aren't
# built by default and should be of no use to end users or distros
#
97fcccb25ed5f55139c03ebc1c71742f0f25f683
4aec44c0d9c4c0649c362199fac97efe0a3b38a4
# This patch was reverted on master shortly after merging.
#
90819abb56f6b1a0cd4946b13b6caf24fb46e500
# These were supreceeded by patches backported to 18.1
#
3341429d74099b436c3824164837eebd47029ded
9158e0bd82ffdad4baf46221bccbbb3fe4764c11
cc3b99bb48769ccd018b781338b548306af5046b

View File

@@ -16,7 +16,7 @@ latest_branchpoint=`git merge-base origin/master HEAD`
git log --reverse --pretty=%H $latest_branchpoint > already_landed
# ... and the ones cherry-picked.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
@@ -38,7 +38,7 @@ do
# Place every "fixes:" tag on its own line and join with the next word
# on its line or a later one.
fixes=`git show -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`
fixes=`git show --pretty=medium -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`
# For each one try to extract the tag
fixes_count=`echo "$fixes" | wc -l`

View File

@@ -12,7 +12,7 @@
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

View File

@@ -6,6 +6,7 @@ The output of this script goes to stdout.
"""
import argparse
import os
import os.path
import subprocess
@@ -27,10 +28,22 @@ def get_git_sha1():
git_sha1 = ''
return git_sha1
parser = argparse.ArgumentParser()
parser.add_argument('--output', help='File to write the #define in',
required=True)
args = parser.parse_args()
git_sha1 = os.environ.get('MESA_GIT_SHA1_OVERRIDE', get_git_sha1())[:10]
if git_sha1:
git_sha1_h_in_path = os.path.join(os.path.dirname(sys.argv[0]),
'..', 'src', 'git_sha1.h.in')
with open(git_sha1_h_in_path , 'r') as git_sha1_h_in:
sys.stdout.write(git_sha1_h_in.read().replace('@VCS_TAG@', git_sha1))
new_sha1 = git_sha1_h_in.read().replace('@VCS_TAG@', git_sha1)
if os.path.isfile(args.output):
with open(args.output, 'r') as git_sha1_h:
if git_sha1_h.read() == new_sha1:
quit()
with open(args.output, 'w') as git_sha1_h:
git_sha1_h.write(new_sha1)
else:
open(args.output, 'w').close()

View File

@@ -1,6 +1,6 @@
#!/usr/bin/env python
# encoding=utf-8
# Copyright © 2017 Intel Corporation
# Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -35,19 +35,39 @@ def main():
parser.add_argument('drivers', nargs='+')
args = parser.parse_args()
to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), args.libdir)
if os.path.isabs(args.libdir):
to = os.path.join(os.environ.get('DESTDIR', '/'), args.libdir[1:])
else:
to = os.path.join(os.environ['MESON_INSTALL_DESTDIR_PREFIX'], args.libdir)
master = os.path.join(to, os.path.basename(args.megadriver))
if not os.path.exists(to):
if os.path.lexists(to):
os.unlink(to)
os.makedirs(to)
shutil.copy(args.megadriver, master)
for each in args.drivers:
driver = os.path.join(to, each)
if os.path.exists(driver):
if os.path.lexists(driver):
os.unlink(driver)
print('installing {} to {}'.format(args.megadriver, to))
print('installing {} to {}'.format(args.megadriver, driver))
os.link(master, driver)
try:
ret = os.getcwd()
os.chdir(to)
name, ext = os.path.splitext(each)
while ext != '.so':
if os.path.lexists(name):
os.unlink(name)
os.symlink(each, name)
name, ext = os.path.splitext(name)
finally:
os.chdir(ret)
os.unlink(master)

21
bin/meson.build Normal file
View File

@@ -0,0 +1,21 @@
# Copyright © 2017 Eric Engestrom
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
git_sha1_gen_py = files('git_sha1_gen.py')

35
bin/meson_get_version.py Normal file
View File

@@ -0,0 +1,35 @@
#!/usr/bin/env python
# encoding=utf-8
# Copyright © 2017 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
from __future__ import print_function
import os
def main():
filename = os.path.join(os.environ['MESON_SOURCE_ROOT'], 'VERSION')
with open(filename) as f:
version = f.read().strip()
print(version, end='')
if __name__ == '__main__':
main()

View File

@@ -74,24 +74,27 @@ AC_SUBST([OPENCL_VERSION])
# in the first entry.
LIBDRM_REQUIRED=2.4.75
LIBDRM_RADEON_REQUIRED=2.4.71
LIBDRM_AMDGPU_REQUIRED=2.4.85
LIBDRM_AMDGPU_REQUIRED=2.4.91
LIBDRM_INTEL_REQUIRED=2.4.75
LIBDRM_NVVIEUX_REQUIRED=2.4.66
LIBDRM_NOUVEAU_REQUIRED=2.4.66
LIBDRM_FREEDRENO_REQUIRED=2.4.74
LIBDRM_ETNAVIV_REQUIRED=2.4.82
LIBDRM_FREEDRENO_REQUIRED=2.4.91
LIBDRM_ETNAVIV_REQUIRED=2.4.89
dnl Versions for external dependencies
DRI2PROTO_REQUIRED=2.8
GLPROTO_REQUIRED=1.4.14
LIBOMXIL_BELLAGIO_REQUIRED=0.0
LIBVA_REQUIRED=0.38.0
LIBOMXIL_TIZONIA_REQUIRED=0.10.0
LIBVA_REQUIRED=0.39.0
VDPAU_REQUIRED=1.1
WAYLAND_REQUIRED=1.11
WAYLAND_PROTOCOLS_REQUIRED=1.8
XCB_REQUIRED=1.9.3
XCBDRI2_REQUIRED=1.8
XCBDRI3_MODIFIERS_REQUIRED=1.13
XCBGLX_REQUIRED=1.8.1
XCBPRESENT_MODIFIERS_REQUIRED=1.13
XDAMAGE_REQUIRED=1.1
XSHMFENCE_REQUIRED=1.1
XVMC_REQUIRED=1.0.6
@@ -103,15 +106,20 @@ dnl LLVM versions
LLVM_REQUIRED_GALLIUM=3.3.0
LLVM_REQUIRED_OPENCL=3.9.0
LLVM_REQUIRED_R600=3.9.0
LLVM_REQUIRED_RADEONSI=3.9.0
LLVM_REQUIRED_RADV=3.9.0
LLVM_REQUIRED_SWR=3.9.0
LLVM_REQUIRED_RADEONSI=4.0.0
LLVM_REQUIRED_RADV=4.0.0
LLVM_REQUIRED_SWR=4.0.0
dnl Check for progs
AC_PROG_CPP
AC_PROG_CC
AC_PROG_CXX
dnl add this here, so the help for this environmnet variable is close to
dnl other CC/CXX flags related help
AC_ARG_VAR([CXX11_CXXFLAGS], [Compiler flag to enable C++11 support (only needed if not
enabled by default and different from -std=c++11)])
AM_PROG_CC_C_O
AC_PROG_NM
AM_PROG_AS
AX_CHECK_GNU_MAKE
AC_CHECK_PROGS([PYTHON2], [python2.7 python2 python])
@@ -120,6 +128,7 @@ AC_PROG_MKDIR_P
AC_SYS_LARGEFILE
LT_PREREQ([2.2])
LT_INIT([disable-static])
@@ -245,6 +254,7 @@ AX_GCC_FUNC_ATTRIBUTE([visibility])
AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
AX_GCC_FUNC_ATTRIBUTE([weak])
AX_GCC_FUNC_ATTRIBUTE([alias])
AX_GCC_FUNC_ATTRIBUTE([noreturn])
AM_CONDITIONAL([GEN_ASM_OFFSETS], test "x$GEN_ASM_OFFSETS" = xyes)
@@ -309,11 +319,11 @@ AC_LANG_POP([C++])
# - non-Linux/Posix OpenGL portions needs to build on MSVC 2013 (which
# supports most of C99)
# - the rest has no compiler compiler restrictions
AX_CHECK_COMPILE_FLAG([-Werror=pointer-arith], [MSVC2013_COMPAT_CFLAGS="-Werror=pointer-arith"])
AX_CHECK_COMPILE_FLAG([-Werror=vla], [MSVC2013_COMPAT_CFLAGS="-Werror=vla"])
AX_CHECK_COMPILE_FLAG([-Werror=pointer-arith], [MSVC2013_COMPAT_CFLAGS="$MSVC2013_COMPAT_CFLAGS -Werror=pointer-arith"])
AX_CHECK_COMPILE_FLAG([-Werror=vla], [MSVC2013_COMPAT_CFLAGS="$MSVC2013_COMPAT_CFLAGS -Werror=vla"])
AC_LANG_PUSH([C++])
AX_CHECK_COMPILE_FLAG([-Werror=pointer-arith], [MSVC2013_COMPAT_CXXFLAGS="-Werror=pointer-arith"])
AX_CHECK_COMPILE_FLAG([-Werror=vla], [MSVC2013_COMPAT_CXXFLAGS="-Werror=vla"])
AX_CHECK_COMPILE_FLAG([-Werror=pointer-arith], [MSVC2013_COMPAT_CXXFLAGS="$MSVC2013_COMPAT_CXXFLAGS -Werror=pointer-arith"])
AX_CHECK_COMPILE_FLAG([-Werror=vla], [MSVC2013_COMPAT_CXXFLAGS="$MSVC2013_COMPAT_CXXFLAGS -Werror=vla"])
AC_LANG_POP([C++])
AC_SUBST([MSVC2013_COMPAT_CFLAGS])
@@ -327,6 +337,56 @@ if test "x$GCC" = xyes; then
fi
fi
dnl
dnl Check whether C++11 is supported, if the environment variable
dnl CXX11_CXXFLAGS is set it takes precedence.
dnl
AC_LANG_PUSH([C++])
check_cxx11_available() {
output_support=$1
AC_COMPILE_IFELSE(
[AC_LANG_PROGRAM([
#if !(__cplusplus >= 201103L)
#error
#endif
#include <tuple>
])
], [
AC_MSG_RESULT(yes)
cxx11_support=yes
], AC_MSG_RESULT(no))
eval "$output_support=\$cxx11_support"
}
HAVE_CXX11=no
save_CXXFLAGS="$CXXFLAGS"
dnl If the user provides a flag to enable c++11, then we test only this
if test "x$CXX11_CXXFLAGS" != "x"; then
CXXFLAGS="$CXXFLAGS $CXX11_CXXFLAGS"
AC_MSG_CHECKING(whether c++11 is enabled by via $CXX11_CXXFLAGS)
check_cxx11_available HAVE_CXX11
else
dnl test whether c++11 is enabled by default
AC_MSG_CHECKING(whether c++11 is enabled by default)
check_cxx11_available HAVE_CXX11
dnl C++11 not enabled by default, test whether -std=c++11 does the job
if test "x$HAVE_CXX11" != "xyes"; then
CXX11_CXXFLAGS=-std=c++11
CXXFLAGS="$CXXFLAGS $CXX11_CXXFLAGS"
AC_MSG_CHECKING(whether c++11 is enabled by via $CXX11_CXXFLAGS)
check_cxx11_available HAVE_CXX11
fi
fi
CXXFLAGS="$save_CXXFLAGS"
AM_CONDITIONAL(HAVE_STD_CXX11, test "x$HAVE_CXX11" = "xyes")
AC_SUBST(CXX11_CXXFLAGS)
AC_LANG_POP([C++])
dnl even if the compiler appears to support it, using visibility attributes isn't
dnl going to do anything useful currently on cygwin apart from emit lots of warnings
case "$host_os" in
@@ -339,8 +399,10 @@ esac
AC_SUBST([VISIBILITY_CFLAGS])
AC_SUBST([VISIBILITY_CXXFLAGS])
AX_CHECK_COMPILE_FLAG([-Wno-override-init], [WNO_OVERRIDE_INIT="-Wno-override-init"]) # gcc
AX_CHECK_COMPILE_FLAG([-Wno-initializer-overrides], [WNO_OVERRIDE_INIT="-Wno-initializer-overrides"]) # clang
dnl For some reason, the test for -Wno-foo always succeeds with gcc, even if the
dnl option is not supported. Hence, check for -Wfoo instead.
AX_CHECK_COMPILE_FLAG([-Woverride-init], [WNO_OVERRIDE_INIT="$WNO_OVERRIDE_INIT -Wno-override-init"]) # gcc
AX_CHECK_COMPILE_FLAG([-Winitializer-overrides], [WNO_OVERRIDE_INIT="$WNO_OVERRIDE_INIT -Wno-initializer-overrides"]) # clang
AC_SUBST([WNO_OVERRIDE_INIT])
dnl
@@ -371,28 +433,41 @@ fi
AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
dnl Check for new-style atomic builtins
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
dnl Check for new-style atomic builtins. We first check without linking to
dnl -latomic.
AC_MSG_CHECKING(whether __atomic_load_n is supported)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
int main() {
int n;
return __atomic_load_n(&n, __ATOMIC_ACQUIRE);
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
dnl On some platforms, new-style atomics need a helper library
AC_MSG_CHECKING(whether -latomic is needed)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
uint64_t v;
int main() {
return (int)__atomic_load_n(&v, __ATOMIC_ACQUIRE);
}]])], GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=no, GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=yes)
AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC)
if test "x$GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC" = xyes; then
LIBATOMIC_LIBS="-latomic"
fi
struct {
uint64_t *v;
} x;
return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE) &
(int)__atomic_add_fetch(x.v, (uint64_t)1, __ATOMIC_ACQ_REL);
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=yes, GCC_ATOMIC_BUILTINS_SUPPORTED=no)
dnl If that didn't work, we try linking with -latomic, which is needed on some
dnl platforms.
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" != xyes; then
save_LDFLAGS=$LDFLAGS
LDFLAGS="$LDFLAGS -latomic"
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
int main() {
struct {
uint64_t *v;
} x;
return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE) &
(int)__atomic_add_fetch(x.v, (uint64_t)1, __ATOMIC_ACQ_REL);
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=yes LIBATOMIC_LIBS="-latomic",
GCC_ATOMIC_BUILTINS_SUPPORTED=no)
LDFLAGS=$save_LDFLAGS
fi
AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_SUPPORTED)
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = xyes; then
DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
fi
AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
AC_SUBST([LIBATOMIC_LIBS])
dnl Check if host supports 64-bit atomics
@@ -627,6 +702,19 @@ AC_LINK_IFELSE(
LDFLAGS=$save_LDFLAGS
AM_CONDITIONAL(HAVE_LD_DYNAMIC_LIST, test "$have_ld_dynamic_list" = "yes")
dnl
dnl OSX linker does not support build-id
dnl
case "$host_os" in
darwin*)
LD_BUILD_ID=""
;;
*)
LD_BUILD_ID="-Wl,--build-id=sha1"
;;
esac
AC_SUBST([LD_BUILD_ID])
dnl
dnl compatibility symlinks
dnl
@@ -791,8 +879,12 @@ fi
AC_HEADER_MAJOR
AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
AC_CHECK_HEADERS([endian.h])
AC_CHECK_HEADER([dlfcn.h], [DEFINES="$DEFINES -DHAVE_DLFCN_H"])
AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
AC_CHECK_FUNC([memfd_create], [DEFINES="$DEFINES -DHAVE_MEMFD_CREATE"])
AC_MSG_CHECKING([whether strtod has locale support])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
@@ -846,6 +938,7 @@ AC_CHECK_FUNC([posix_memalign], [DEFINES="$DEFINES -DHAVE_POSIX_MEMALIGN"])
dnl Check for zlib
PKG_CHECK_MODULES([ZLIB], [zlib >= $ZLIB_REQUIRED])
DEFINES="$DEFINES -DHAVE_ZLIB"
dnl Check for pthreads
AX_PTHREAD
@@ -865,10 +958,10 @@ dnl In practise that should be sufficient for all platforms, since any
dnl platforms build with GCC and Clang support the flag.
PTHREAD_LIBS="$PTHREAD_LIBS -pthread"
dnl pthread-stubs is mandatory on BSD platforms, due to the nature of the
dnl pthread-stubs is mandatory on some BSD platforms, due to the nature of the
dnl project. Even then there's a notable issue as described in the project README
case "$host_os" in
linux* | cygwin* | darwin* | solaris* | *-gnu* | gnu*)
linux* | cygwin* | darwin* | solaris* | *-gnu* | gnu* | openbsd*)
pthread_stubs_possible="no"
;;
* )
@@ -880,6 +973,9 @@ if test "x$pthread_stubs_possible" = xyes; then
PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs >= 0.4)
fi
dnl Check for futex for fast inline simple_mtx_t.
AC_CHECK_HEADER([linux/futex.h], [DEFINES="$DEFINES -DHAVE_LINUX_FUTEX_H"])
dnl SELinux awareness.
AC_ARG_ENABLE([selinux],
[AS_HELP_STRING([--enable-selinux],
@@ -1206,10 +1302,10 @@ AC_ARG_ENABLE([xa],
[enable_xa=no])
AC_ARG_ENABLE([gbm],
[AS_HELP_STRING([--enable-gbm],
[enable gbm library @<:@default=yes except cygwin@:>@])],
[enable gbm library @<:@default=yes except cygwin and macOS@:>@])],
[enable_gbm="$enableval"],
[case "$host_os" in
cygwin*)
cygwin* | darwin*)
enable_gbm=no
;;
*)
@@ -1234,14 +1330,19 @@ AC_ARG_ENABLE([vdpau],
[enable_vdpau=auto])
AC_ARG_ENABLE([omx],
[AS_HELP_STRING([--enable-omx],
[DEPRECATED: Use --enable-omx-bellagio instead @<:@default=auto@:>@])],
[AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio instead.])],
[DEPRECATED: Use --enable-omx-bellagio or --enable-omx-tizonia instead @<:@default=auto@:>@])],
[AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio or --enable-omx-tizonia instead.])],
[])
AC_ARG_ENABLE([omx-bellagio],
[AS_HELP_STRING([--enable-omx-bellagio],
[enable OpenMAX Bellagio library @<:@default=disabled@:>@])],
[enable_omx_bellagio="$enableval"],
[enable_omx_bellagio=no])
AC_ARG_ENABLE([omx-tizonia],
[AS_HELP_STRING([--enable-omx-tizonia],
[enable OpenMAX Tizonia library @<:@default=disabled@:>@])],
[enable_omx_tizonia="$enableval"],
[enable_omx_tizonia=no])
AC_ARG_ENABLE([va],
[AS_HELP_STRING([--enable-va],
[enable va library @<:@default=auto@:>@])],
@@ -1255,9 +1356,9 @@ AC_ARG_ENABLE([opencl],
AC_ARG_ENABLE([opencl_icd],
[AS_HELP_STRING([--enable-opencl-icd],
[Build an OpenCL ICD library to be loaded by an ICD implementation
@<:@default=disabled@:>@])],
@<:@default=enabled@:>@])],
[enable_opencl_icd="$enableval"],
[enable_opencl_icd=no])
[enable_opencl_icd=yes])
AC_ARG_ENABLE([gallium-tests],
[AS_HELP_STRING([--enable-gallium-tests],
@@ -1273,7 +1374,7 @@ GALLIUM_DRIVERS_DEFAULT="r300,r600,svga,swrast"
AC_ARG_WITH([gallium-drivers],
[AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@],
[comma delimited Gallium drivers list, e.g.
"i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,vc4,vc5,virgl,etnaviv,imx"
"i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,tegra,vc4,vc5,virgl,etnaviv,imx"
@<:@default=r300,r600,svga,swrast@:>@])],
[with_gallium_drivers="$withval"],
[with_gallium_drivers="$GALLIUM_DRIVERS_DEFAULT"])
@@ -1293,11 +1394,17 @@ if test "x$enable_opengl" = xno -a \
"x$enable_xvmc" = xno -a \
"x$enable_vdpau" = xno -a \
"x$enable_omx_bellagio" = xno -a \
"x$enable_omx_tizonia" = xno -a \
"x$enable_va" = xno -a \
"x$enable_opencl" = xno; then
AC_MSG_ERROR([at least one API should be enabled])
fi
if test "x$enable_omx_bellagio" = xyes -a \
"x$enable_omx_tizonia" = xyes; then
AC_MSG_ERROR([Can't enable both bellagio and tizonia at same time])
fi
# Building OpenGL ES1 and/or ES2 without OpenGL is not supported on mesa 9.0.x
if test "x$enable_opengl" = xno -a \
"x$enable_gles1" = xyes; then
@@ -1383,18 +1490,10 @@ AC_ARG_ENABLE([libglvnd],
AM_CONDITIONAL(USE_LIBGLVND, test "x$enable_libglvnd" = xyes)
if test "x$enable_libglvnd" = xyes ; then
dnl XXX: update once we can handle more than libGL/glx.
dnl Namely: we should error out if neither of the glvnd enabled libraries
dnl are built
case "x$enable_glx" in
xno)
AC_MSG_ERROR([cannot build libglvnd without GLX])
;;
xxlib | xgallium-xlib )
AC_MSG_ERROR([cannot build libgvnd when Xlib-GLX or Gallium-Xlib-GLX is enabled])
;;
xdri)
;;
esac
PKG_CHECK_MODULES([GLVND], libglvnd >= 0.2.0)
@@ -1403,20 +1502,24 @@ if test "x$enable_libglvnd" = xyes ; then
DEFINES="${DEFINES} -DUSE_LIBGLVND=1"
DEFAULT_GL_LIB_NAME=GLX_mesa
if test "x$enable_glx" = xno -a "x$enable_egl" = xno; then
AC_MSG_ERROR([cannot build libglvnd without GLX or EGL])
fi
fi
AC_ARG_WITH([gl-lib-name],
[AS_HELP_STRING([--with-gl-lib-name@<:@=NAME@:>@],
[specify GL library name @<:@default=GL@:>@])],
[GL_LIB=$withval],
[GL_LIB="$DEFAULT_GL_LIB_NAME"])
[AC_MSG_ERROR([--with-gl-lib-name is no longer supported. Rename the library manually if needed.])],
[])
AC_ARG_WITH([osmesa-lib-name],
[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
[specify OSMesa library name @<:@default=OSMesa@:>@])],
[OSMESA_LIB=$withval],
[OSMESA_LIB=OSMesa])
AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
[AC_MSG_ERROR([--with-osmesa-lib-name is no longer supported. Rename the library manually if needed.])],
[])
GL_LIB="$DEFAULT_GL_LIB_NAME"
OSMESA_LIB=OSMesa
dnl
dnl Mangled Mesa support
@@ -1428,6 +1531,9 @@ AC_ARG_ENABLE([mangling],
[enable_mangling=no]
)
if test "x${enable_mangling}" = "xyes" ; then
if test "x$enable_libglvnd" = xyes; then
AC_MSG_ERROR([Conflicting options --enable-mangling and --enable-libglvnd.])
fi
DEFINES="${DEFINES} -DUSE_MGL_NAMESPACE"
GL_LIB="Mangled${GL_LIB}"
OSMESA_LIB="Mangled${OSMESA_LIB}"
@@ -1435,6 +1541,15 @@ fi
AC_SUBST([GL_LIB])
AC_SUBST([OSMESA_LIB])
dnl HACK when building glx + glvnd we ship gl.pc, despite that glvnd should do it
dnl Thus we need to use GL as a DSO name.
if test "x$enable_libglvnd" = xyes -a "x$enable_glx" != xno; then
GL_PKGCONF_LIB="GL"
else
GL_PKGCONF_LIB="$GL_LIB"
fi
AC_SUBST([GL_PKGCONF_LIB])
# Check for libdrm
PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED],
[have_libdrm=yes], [have_libdrm=no])
@@ -1538,7 +1653,7 @@ fi
AC_ARG_ENABLE([driglx-direct],
[AS_HELP_STRING([--disable-driglx-direct],
[disable direct rendering in GLX and EGL for DRI \
@<:@default=auto@:>@])],
@<:@default=enabled@:>@])],
[driglx_direct="$enableval"],
[driglx_direct="yes"])
@@ -1562,6 +1677,8 @@ xxlib | xgallium-xlib)
xdri)
# DRI-based GLX
require_dri_shared_libs_and_glapi "GLX"
# find the DRI deps for libGL
dri_modules="x11 xext xdamage >= $XDAMAGE_REQUIRED xfixes x11-xcb xcb xcb-glx >= $XCBGLX_REQUIRED"
@@ -1697,19 +1814,6 @@ if test "x$with_platforms" = xauto; then
with_platforms=$with_egl_platforms
fi
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
WAYLAND_SCANNER='')
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
fi
PKG_CHECK_EXISTS([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED], [have_wayland_protocols=yes], [have_wayland_protocols=no])
if test "x$have_wayland_protocols" = xyes; then
ac_wayland_protocols_pkgdatadir=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols`
fi
AC_SUBST(WAYLAND_PROTOCOLS_DATADIR, $ac_wayland_protocols_pkgdatadir)
# Do per platform setups and checks
platforms=`IFS=', '; echo $with_platforms`
for plat in $platforms; do
@@ -1718,13 +1822,19 @@ for plat in $platforms; do
PKG_CHECK_MODULES([WAYLAND_CLIENT], [wayland-client >= $WAYLAND_REQUIRED])
PKG_CHECK_MODULES([WAYLAND_SERVER], [wayland-server >= $WAYLAND_REQUIRED])
PKG_CHECK_MODULES([WAYLAND_PROTOCOLS], [wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED])
WAYLAND_PROTOCOLS_DATADIR=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols`
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
WAYLAND_SCANNER='')
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
fi
if test "x$WAYLAND_SCANNER" = "x:"; then
AC_MSG_ERROR([wayland-scanner is needed to compile the wayland platform])
fi
if test "x$have_wayland_protocols" = xno; then
AC_MSG_ERROR([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED is needed to compile the wayland platform])
fi
DEFINES="$DEFINES -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED"
;;
@@ -1759,6 +1869,7 @@ for plat in $platforms; do
;;
esac
done
AC_SUBST([WAYLAND_PROTOCOLS_DATADIR])
if test "x$enable_glx" != xno; then
if ! echo "$platforms" | grep -q 'x11'; then
@@ -1771,6 +1882,12 @@ if test x"$enable_dri3" = xyes; then
dri3_modules="x11-xcb xcb >= $XCB_REQUIRED xcb-dri3 xcb-xfixes xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules])
dri3_modifier_modules="xcb-dri3 >= $XCBDRI3_MODIFIERS_REQUIRED xcb-present >= $XCBPRESENT_MODIFIERS_REQUIRED"
PKG_CHECK_MODULES([XCB_DRI3_MODIFIERS], [$dri3_modifier_modules], [have_dri3_modifiers=yes], [have_dri3_modifiers=no])
if test "x$have_dri3_modifiers" == xyes; then
DEFINES="$DEFINES -DHAVE_DRI3_MODIFIERS"
fi
fi
AM_CONDITIONAL(HAVE_PLATFORM_X11, echo "$platforms" | grep -q 'x11')
@@ -1997,6 +2114,9 @@ if test -n "$with_vulkan_drivers"; then
PKG_CHECK_MODULES([AMDGPU], [libdrm >= $LIBDRM_AMDGPU_REQUIRED libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
radeon_llvm_check $LLVM_REQUIRED_RADV "radv"
require_x11_dri3 "radv"
if test "x$acv_mako_found" = xno; then
AC_MSG_ERROR([Python mako module v$PYTHON_MAKO_REQUIRED or higher not found])
fi
HAVE_RADEON_VULKAN=yes
;;
*)
@@ -2114,13 +2234,13 @@ else
have_vdpau_platform=no
fi
if echo $platforms | grep -q "x11\|drm"; then
if echo $platforms | egrep -q "x11|drm"; then
have_omx_platform=yes
else
have_omx_platform=no
fi
if echo $platforms | grep -q "x11\|drm\|wayland"; then
if echo $platforms | egrep -q "x11|drm|wayland"; then
have_va_platform=yes
else
have_va_platform=no
@@ -2142,6 +2262,10 @@ if test -n "$with_gallium_drivers" -a "x$with_gallium_drivers" != xswrast; then
PKG_CHECK_EXISTS([libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED], [enable_omx_bellagio=yes], [enable_omx_bellagio=no])
fi
if test "x$enable_omx_tizonia" = xauto -a "x$have_omx_platform" = xyes; then
PKG_CHECK_EXISTS([libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED], [enable_omx_tizonia=yes], [enable_omx_tizonia=no])
fi
if test "x$enable_va" = xauto -a "x$have_va_platform" = xyes; then
PKG_CHECK_EXISTS([libva >= $LIBVA_REQUIRED], [enable_va=yes], [enable_va=no])
fi
@@ -2151,6 +2275,7 @@ if test "x$enable_dri" = xyes -o \
"x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \
"x$enable_omx_bellagio" = xyes -o \
"x$enable_omx_tizonia" = xyes -o \
"x$enable_va" = xyes; then
need_gallium_vl=yes
fi
@@ -2159,8 +2284,11 @@ AM_CONDITIONAL(NEED_GALLIUM_VL, test "x$need_gallium_vl" = xyes)
if test "x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \
"x$enable_omx_bellagio" = xyes -o \
"x$enable_omx_tizonia" = xyes -o \
"x$enable_va" = xyes; then
PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
if echo $platforms | grep -q "x11"; then
PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
fi
need_gallium_vl_winsys=yes
fi
AM_CONDITIONAL(NEED_GALLIUM_VL_WINSYS, test "x$need_gallium_vl_winsys" = xyes)
@@ -2190,9 +2318,27 @@ if test "x$enable_omx_bellagio" = xyes; then
fi
PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED])
gallium_st="$gallium_st omx_bellagio"
AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 1, [Use Bellagio for OMX IL])
else
AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 0)
fi
AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes)
if test "x$enable_omx_tizonia" = xyes; then
if test "x$have_omx_platform" != xyes; then
AC_MSG_ERROR([OMX requires at least one of the x11 or drm platforms])
fi
PKG_CHECK_MODULES([OMX_TIZONIA],
[libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED
tizilheaders >= $LIBOMXIL_TIZONIA_REQUIRED
libtizplatform >= $LIBOMXIL_TIZONIA_REQUIRED])
gallium_st="$gallium_st omx_tizonia"
AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 1, [Use Tizoina for OMX IL])
else
AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 0)
fi
AM_CONDITIONAL(HAVE_ST_OMX_TIZONIA, test "x$enable_omx_tizonia" = xyes)
if test "x$enable_va" = xyes; then
if test "x$have_va_platform" != xyes; then
AC_MSG_ERROR([VA requires at least one of the x11 drm or wayland platforms])
@@ -2241,8 +2387,8 @@ if test "x$enable_opencl" = xyes; then
AC_MSG_ERROR([cannot enable OpenCL without Gallium])
fi
if test $GCC_VERSION_MAJOR -lt 4 -o $GCC_VERSION_MAJOR -eq 4 -a $GCC_VERSION_MINOR -lt 7; then
AC_MSG_ERROR([gcc >= 4.7 is required to build clover])
if test "x$HAVE_CXX11" != "xyes"; then
AC_MSG_ERROR([clover requires c++11 support])
fi
if test "x$have_libclc" = xno; then
@@ -2366,6 +2512,15 @@ AC_ARG_WITH([omx-bellagio-libdir],
$PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libomxil-bellagio`])
AC_SUBST([OMX_BELLAGIO_LIB_INSTALL_DIR])
dnl Directory for OMX_TIZONIA libs
AC_ARG_WITH([omx-tizonia-libdir],
[AS_HELP_STRING([--with-omx-tizonia-libdir=DIR],
[directory for the OMX_TIZONIA libraries])],
[OMX_TIZONIA_LIB_INSTALL_DIR="$withval"],
[OMX_TIZONIA_LIB_INSTALL_DIR=`$PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libtizcore`])
AC_SUBST([OMX_TIZONIA_LIB_INSTALL_DIR])
dnl Directory for VA libs
AC_ARG_WITH([va-libdir],
@@ -2410,12 +2565,13 @@ dnl Surfaceless is an alternative for the last one.
dnl
require_basic_egl() {
case "$with_platforms" in
*drm*|*surfaceless*)
*drm*|*surfaceless*|*android*)
;;
*)
AC_MSG_ERROR([$1 requires one of these:
1) --with-platforms=drm (X, Wayland, offscreen rendering based on DRM)
2) --with-platforms=surfaceless (offscreen only)
3) --with-platforms=android (Android only)
Recommended options: drm,x11])
;;
esac
@@ -2519,6 +2675,10 @@ if test -n "$with_gallium_drivers"; then
ximx)
HAVE_GALLIUM_IMX=yes
;;
xtegra)
HAVE_GALLIUM_TEGRA=yes
require_libdrm "tegra"
;;
xswrast)
HAVE_GALLIUM_SOFTPIPE=yes
if test "x$enable_llvm" = xyes; then
@@ -2528,10 +2688,9 @@ if test -n "$with_gallium_drivers"; then
xswr)
llvm_require_version $LLVM_REQUIRED_SWR "swr"
swr_require_cxx_feature_flags "C++11" "__cplusplus >= 201103L" \
",-std=c++11" \
SWR_CXX11_CXXFLAGS
AC_SUBST([SWR_CXX11_CXXFLAGS])
if test "x$HAVE_CXX11" != "xyes"; then
AC_MSG_ERROR([swr requires c++11 support])
fi
swr_require_cxx_feature_flags "AVX" "defined(__AVX__)" \
",-target-cpu=sandybridge,-mavx,-march=core-avx,-tp=sandybridge" \
@@ -2578,6 +2737,11 @@ if test -n "$with_gallium_drivers"; then
AC_MSG_ERROR([swr enabled but no swr architectures selected])
fi
# test if more than one swr arch configured
if test `echo $swr_archs | wc -w` -eq 1; then
HAVE_SWR_BUILTIN=yes
fi
HAVE_GALLIUM_SWR=yes
;;
xvc4)
@@ -2615,8 +2779,8 @@ if test -n "$with_gallium_drivers"; then
fi
# XXX: Keep in sync with LLVM_REQUIRED_SWR
AM_CONDITIONAL(SWR_INVALID_LLVM_VERSION, test "x$LLVM_VERSION" != x3.9.0 -a \
"x$LLVM_VERSION" != x3.9.1)
AM_CONDITIONAL(SWR_INVALID_LLVM_VERSION, test "x$LLVM_VERSION" != x4.0.0 -a \
"x$LLVM_VERSION" != x4.0.1)
if test "x$enable_llvm" = "xyes" -a "$with_gallium_drivers"; then
llvm_require_version $LLVM_REQUIRED_GALLIUM "gallium"
@@ -2627,6 +2791,7 @@ AM_CONDITIONAL(HAVE_SWR_AVX, test "x$HAVE_SWR_AVX" = xyes)
AM_CONDITIONAL(HAVE_SWR_AVX2, test "x$HAVE_SWR_AVX2" = xyes)
AM_CONDITIONAL(HAVE_SWR_KNL, test "x$HAVE_SWR_KNL" = xyes)
AM_CONDITIONAL(HAVE_SWR_SKX, test "x$HAVE_SWR_SKX" = xyes)
AM_CONDITIONAL(HAVE_SWR_BUILTIN, test "x$HAVE_SWR_BUILTIN" = xyes)
dnl We need to validate some needed dependencies for renderonly drivers.
@@ -2638,6 +2803,9 @@ if test "x$HAVE_GALLIUM_VC4" != xyes -a "x$HAVE_GALLIUM_PL111" = xyes ; then
AC_MSG_ERROR([Building with pl111 requires vc4])
fi
if test "x$HAVE_GALLIUM_NOUVEAU" != xyes -a "x$HAVE_GALLIUM_TEGRA" = xyes; then
AC_MSG_ERROR([Building with tegra requires nouveau])
fi
detect_old_buggy_llvm() {
dnl llvm-config may not give the right answer when llvm is a built as a
@@ -2712,6 +2880,18 @@ if test "x$enable_llvm" = xyes; then
fi
fi
fi
dnl The gallium-xlib GLX and gallium OSMesa targets directly embed the
dnl swr/llvmpipe driver into the final binary. Adding LLVM_LIBS results in
dnl the LLVM library propagated in the Libs.private of the respective .pc
dnl file which ensures complete dependency information when statically
dnl linking.
if test "x$enable_glx" == xgallium-xlib; then
GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
fi
if test "x$enable_gallium_osmesa" = xyes; then
OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
fi
fi
AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)
@@ -2720,11 +2900,11 @@ AM_CONDITIONAL(HAVE_GALLIUM_PL111, test "x$HAVE_GALLIUM_PL111" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_R300, test "x$HAVE_GALLIUM_R300" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_R600, test "x$HAVE_GALLIUM_R600" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_RADEONSI, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_RADEON_COMMON, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_ETNAVIV, test "x$HAVE_GALLIUM_ETNAVIV" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_IMX, test "x$HAVE_GALLIUM_IMX" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_TEGRA, test "x$HAVE_GALLIUM_TEGRA" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SWR, test "x$HAVE_GALLIUM_SWR" = xyes)
@@ -2789,8 +2969,8 @@ AM_CONDITIONAL(HAVE_ARM_ASM, test "x$asm_arch" = xarm)
AC_SUBST([NINE_MAJOR], 1)
AC_SUBST([NINE_MINOR], 0)
AC_SUBST([NINE_TINY], 0)
AC_SUBST([NINE_VERSION], "$NINE_MAJOR.$NINE_MINOR.$NINE_TINY")
AC_SUBST([NINE_PATCH], 0)
AC_SUBST([NINE_VERSION], "$NINE_MAJOR.$NINE_MINOR.$NINE_PATCH")
AC_SUBST([VDPAU_MAJOR], 1)
AC_SUBST([VDPAU_MINOR], 0)
@@ -2807,15 +2987,10 @@ AM_CONDITIONAL(HAVE_VULKAN_COMMON, test "x$VULKAN_DRIVERS" != "x")
AC_SUBST([XVMC_MAJOR], 1)
AC_SUBST([XVMC_MINOR], 0)
XA_HEADER="$srcdir/src/gallium/state_trackers/xa/xa_tracker.h"
XA_MAJOR=`grep "#define XA_TRACKER_VERSION_MAJOR" $XA_HEADER | $SED 's/^#define XA_TRACKER_VERSION_MAJOR //'`
XA_MINOR=`grep "#define XA_TRACKER_VERSION_MINOR" $XA_HEADER | $SED 's/^#define XA_TRACKER_VERSION_MINOR //'`
XA_TINY=`grep "#define XA_TRACKER_VERSION_PATCH" $XA_HEADER | $SED 's/^#define XA_TRACKER_VERSION_PATCH //'`
AC_SUBST([XA_MAJOR], $XA_MAJOR)
AC_SUBST([XA_MINOR], $XA_MINOR)
AC_SUBST([XA_TINY], $XA_TINY)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
AC_SUBST([XA_MAJOR], 2)
AC_SUBST([XA_MINOR], 3)
AC_SUBST([XA_PATCH], 0)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_PATCH")
AC_ARG_ENABLE(valgrind,
[AS_HELP_STRING([--enable-valgrind],
@@ -2868,21 +3043,17 @@ AC_CONFIG_FILES([Makefile
src/gallium/auxiliary/Makefile
src/gallium/auxiliary/pipe-loader/Makefile
src/gallium/drivers/freedreno/Makefile
src/gallium/drivers/ddebug/Makefile
src/gallium/drivers/i915/Makefile
src/gallium/drivers/llvmpipe/Makefile
src/gallium/drivers/noop/Makefile
src/gallium/drivers/nouveau/Makefile
src/gallium/drivers/pl111/Makefile
src/gallium/drivers/r300/Makefile
src/gallium/drivers/r600/Makefile
src/gallium/drivers/radeon/Makefile
src/gallium/drivers/radeonsi/Makefile
src/gallium/drivers/rbug/Makefile
src/gallium/drivers/softpipe/Makefile
src/gallium/drivers/svga/Makefile
src/gallium/drivers/swr/Makefile
src/gallium/drivers/trace/Makefile
src/gallium/drivers/tegra/Makefile
src/gallium/drivers/etnaviv/Makefile
src/gallium/drivers/imx/Makefile
src/gallium/drivers/vc4/Makefile
@@ -2892,17 +3063,20 @@ AC_CONFIG_FILES([Makefile
src/gallium/state_trackers/dri/Makefile
src/gallium/state_trackers/glx/xlib/Makefile
src/gallium/state_trackers/nine/Makefile
src/gallium/state_trackers/omx_bellagio/Makefile
src/gallium/state_trackers/omx/Makefile
src/gallium/state_trackers/omx/bellagio/Makefile
src/gallium/state_trackers/omx/tizonia/Makefile
src/gallium/state_trackers/osmesa/Makefile
src/gallium/state_trackers/va/Makefile
src/gallium/state_trackers/vdpau/Makefile
src/gallium/state_trackers/xa/Makefile
src/gallium/state_trackers/xa/xa_tracker.h
src/gallium/state_trackers/xvmc/Makefile
src/gallium/targets/d3dadapter9/Makefile
src/gallium/targets/d3dadapter9/d3d.pc
src/gallium/targets/dri/Makefile
src/gallium/targets/libgl-xlib/Makefile
src/gallium/targets/omx-bellagio/Makefile
src/gallium/targets/omx/Makefile
src/gallium/targets/opencl/Makefile
src/gallium/targets/opencl/mesa.icd
src/gallium/targets/osmesa/Makefile
@@ -2929,6 +3103,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/winsys/sw/null/Makefile
src/gallium/winsys/sw/wrapper/Makefile
src/gallium/winsys/sw/xlib/Makefile
src/gallium/winsys/tegra/drm/Makefile
src/gallium/winsys/vc4/drm/Makefile
src/gallium/winsys/vc5/drm/Makefile
src/gallium/winsys/virgl/drm/Makefile
@@ -2978,6 +3153,9 @@ $SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_bl
rm -f src/compiler/spirv/spirv_info.lo
echo "# dummy" > src/compiler/spirv/.deps/spirv_info.Plo
rm -f src/compiler/nir/.deps/nir_intrinsics.Plo
echo "# dummy" > src/compiler/nir/.deps/nir_intrinsics.Plo
dnl
dnl Output some configuration info for the user
dnl
@@ -3095,7 +3273,11 @@ fi
echo ""
if test "x$HAVE_GALLIUM_SWR" != x; then
echo " SWR archs: $swr_archs"
if test "x$HAVE_SWR_BUILTIN" = xyes; then
echo " SWR archs: $swr_archs (builtin)"
else
echo " SWR archs: $swr_archs"
fi
fi
dnl Libraries
@@ -3115,6 +3297,7 @@ defines=`echo $DEFINES | $SED 's/^ *//;s/ */ /;s/ *$//'`
echo ""
echo " CFLAGS: $cflags"
echo " CXXFLAGS: $cxxflags"
echo " CXX11_CXXFLAGS: $CXX11_CXXFLAGS"
echo " LDFLAGS: $ldflags"
echo " Macros: $defines"
echo ""

View File

@@ -43,6 +43,7 @@
<li><a href="install.html" target="_parent">Compiling / Installing</a>
<ul>
<li><a href="autoconf.html" target="_parent">Autoconf</a></li>
<li><a href="meson.html" target="_parent">Meson</a></li>
</ul>
</li>
<li><a href="precompiled.html" target="_parent">Precompiled Libraries</a>

View File

@@ -88,22 +88,40 @@ This is a work-around for that.
<li>MESA_GL_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) and possibly the GL API type.
<ul>
<li> The format should be MAJOR.MINOR[FC]
<li> FC is an optional suffix that indicates a forward compatible context.
This is only valid for versions &gt;= 3.0.
<li> GL versions &lt; 3.0 are set to a compatibility (non-Core) profile
<li> GL versions = 3.0, see below
<li> GL versions &gt; 3.0 are set to a Core profile
<li> Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC
<ul>
<li> 2.1 - select a compatibility (non-Core) profile with GL version 2.1
<li> 3.0 - select a compatibility (non-Core) profile with GL version 3.0
<li> 3.0FC - select a Core+Forward Compatible profile with GL version 3.0
<li> 3.1 - select a Core profile with GL version 3.1
<li> 3.1FC - select a Core+Forward Compatible profile with GL version 3.1
</ul>
<li> Mesa may not really implement all the features of the given version.
(for developers only)
<li>The format should be MAJOR.MINOR[FC|COMPAT]
<li>FC is an optional suffix that indicates a forward compatible
context. This is only valid for versions &gt;= 3.0.
<li>COMPAT is an optional suffix that indicates a compatibility
context or GL_ARB_compatibility support. This is only valid for
versions &gt;= 3.1.
<li>GL versions &lt;= 3.0 are set to a compatibility (non-Core)
profile
<li>GL versions = 3.1, depending on the driver, it may or may not
have the ARB_compatibility extension enabled.
<li>GL versions &gt;= 3.2 are set to a Core profile
<li>Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC, 3.1COMPAT, X.Y, X.YFC,
X.YCOMPAT.
<ul>
<li>2.1 - select a compatibility (non-Core) profile with GL
version 2.1.
<li>3.0 - select a compatibility (non-Core) profile with GL
version 3.0.
<li>3.0FC - select a Core+Forward Compatible profile with GL
version 3.0.
<li>3.1 - select GL version 3.1 with GL_ARB_compatibility enabled
per the driver default.
<li>3.1FC - select GL version 3.1 with forward compatibility and
GL_ARB_compatibility disabled.
<li>3.1COMPAT - select GL version 3.1 with GL_ARB_compatibility
enabled.
<li>X.Y - override GL version to X.Y without changing the profile.
<li>X.YFC - select a Core+Forward Compatible profile with GL
version X.Y.
<li>X.YCOMPAT - select a Compatibility profile with GL version
X.Y.
</ul>
<li>Mesa may not really implement all the features of the given
version. (for developers only)
</ul>
<li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) for OpenGL ES.
@@ -135,6 +153,16 @@ home directory.
<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.
<li>MESA_SHADER_CAPTURE_PATH - see <a href="shading.html#capture">Capturing Shaders</a></li>
<li>MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH - see <a href="shading.html#replacement">Experimenting with Shader Replacements</a></li>
<li>MESA_VK_VERSION_OVERRIDE - changes the Vulkan physical device version
as returned in VkPhysicalDeviceProperties::apiVersion.
<ul>
<li>The format should be MAJOR.MINOR[.PATCH]</li>
<li>This will not let you force a version higher than the driver's
instance versionas advertised by vkEnumerateInstanceVersion</li>
<li>This can be very useful for debugging but some features may not be
implemented correctly. (For developers only)</li>
</ul>
</li>
</ul>
@@ -241,7 +269,7 @@ Mesa EGL supports different sets of environment variables. See the
Especially useful to toggle hud at specific points of application and
disable for unencumbered viewing the rest of the time. For example, set
GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).
Use kill -10 <pid> to toggle the hud as desired.
Use kill -10 &lt;pid&gt; to toggle the hud as desired.
<li>GALLIUM_HUD_DUMP_DIR - specifies a directory for writing the displayed
hud values into files.
<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=true for
@@ -313,6 +341,12 @@ such as the OpenGL program's name and command line arguments.
<li>See the driver code for other, lesser-used variables.
</ul>
<h3>WGL environment variables</h3>
<ul>
<li>WGL_SWAP_INTERVAL - to set a swap interval, equivalent to calling
wglSwapIntervalEXT() in an application. If this environment variable
is set, application calls to wglSwapIntervalEXT() will have no effect.
</ul>
<h3>VA-API state tracker environment variables</h3>
<ul>

View File

@@ -23,7 +23,7 @@ The specifications follow.
<ul>
<li><a href="specs/MESA_agp_offset.spec">MESA_agp_offset.spec</a>
<li><a href="specs/OLD/MESA_agp_offset.spec">MESA_agp_offset.spec</a>
<li><a href="specs/MESA_copy_sub_buffer.spec">MESA_copy_sub_buffer.spec</a>
<li><a href="specs/MESA_drm_image.spec">MESA_drm_image.spec</a>
<li><a href="specs/MESA_multithread_makecurrent.spec">MESA_multithread_makecurrent.spec</a>
@@ -33,7 +33,7 @@ The specifications follow.
<li><a href="specs/OLD/MESA_program_debug.spec">MESA_program_debug.spec</a> (obsolete)
<li><a href="specs/MESA_release_buffers.spec">MESA_release_buffers.spec</a>
<li><a href="specs/OLD/MESA_resize_buffers.spec">MESA_resize_buffers.spec</a> (obsolete)
<li><a href="specs/MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>
<li><a href="specs/OLD/MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>
<li><a href="specs/MESA_shader_debug.spec">MESA_shader_debug.spec</a>
<li><a href="specs/OLD/MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)
<li><a href="specs/MESA_swap_control.spec">MESA_swap_control.spec</a>

View File

@@ -24,10 +24,13 @@ not started
# OpenGL Core and Compatibility context support
OpenGL 3.1 and later versions are only supported with the Core profile.
There are no plans to support GL_ARB_compatibility. The last supported OpenGL
version with all deprecated features is 3.0. Some of the later GL features
are exposed in the 3.0 context as extensions.
Some drivers do not support the Compatibility profile or the
ARB_compatibility extensions. If an application does not request a
specific version without the forward-compatiblity flag, such drivers
will be limited to OpenGL 3.0. If an application requests OpenGL 3.1,
it will get a context that may or may not have the ARB_compatibility
extension enabled. Some of the later GL features are exposed in the 3.0
context as extensions.
Feature Status
@@ -102,7 +105,7 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GL_ARB_shader_bit_encoding DONE (freedreno, swr)
GL_ARB_texture_rgb10_a2ui DONE (freedreno, swr)
GL_ARB_texture_swizzle DONE (freedreno, swr)
GL_ARB_timer_query DONE (swr)
GL_ARB_timer_query DONE (freedreno, swr)
GL_ARB_instanced_arrays DONE (freedreno, swr)
GL_ARB_vertex_type_2_10_10_10_rev DONE (freedreno, swr)
@@ -110,7 +113,7 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_draw_buffers_blend DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (i965/gen7+, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
GL_ARB_gpu_shader5 DONE (i965/gen7+)
- 'precise' qualifier DONE
- Dynamically uniform sampler array indices DONE (softpipe)
@@ -126,97 +129,97 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
- New overload resolution rules DONE
GL_ARB_gpu_shader_fp64 DONE (i965/gen7+, llvmpipe, softpipe)
GL_ARB_sample_shading DONE (i965/gen6+, nv50)
GL_ARB_shader_subroutine DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_shader_subroutine DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_tessellation_shader DONE (i965/gen7+)
GL_ARB_texture_buffer_object_rgb32 DONE (i965/gen6+, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_object_rgb32 DONE (freedreno, i965/gen6+, llvmpipe, softpipe, swr)
GL_ARB_texture_cube_map_array DONE (i965/gen6+, nv50, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_query_lod DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_query_lod DONE (freedreno, i965, nv50, llvmpipe, softpipe)
GL_ARB_transform_feedback2 DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_transform_feedback3 DONE (i965/gen7+, llvmpipe, softpipe, swr)
GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_ES2_compatibility DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_ES2_compatibility DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_get_program_binary DONE (0 or 1 binary formats)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_shader_precision DONE (i965/gen7+, all drivers that support GLSL 4.10)
GL_ARB_vertex_attrib_64bit DONE (i965/gen7+, llvmpipe, softpipe)
GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe)
GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, radeonsi
GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_texture_compression_bptc DONE (i965, r600)
GL_ARB_texture_compression_bptc DONE (freedreno, i965)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965, softpipe)
GL_ARB_shader_atomic_counters DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_shader_image_load_store DONE (i965, softpipe)
GL_ARB_transform_feedback_instanced DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_shader_image_load_store DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_internalformat_query DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_internalformat_query DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_map_buffer_alignment DONE (all drivers)
GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, radeonsi
GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
GL_ARB_clear_buffer_object DONE (all drivers)
GL_ARB_compute_shader DONE (i965, softpipe)
GL_ARB_copy_image DONE (i965, nv50, r600, softpipe, llvmpipe)
GL_ARB_compute_shader DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_copy_image DONE (i965, nv50, softpipe, llvmpipe)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, r600, llvmpipe, softpipe)
GL_ARB_framebuffer_no_attachments DONE (i965, r600, softpipe)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_framebuffer_no_attachments DONE (freedreno, i965, softpipe)
GL_ARB_internalformat_query2 DONE (all drivers)
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (i965, r600, llvmpipe, softpipe, swr)
GL_ARB_multi_draw_indirect DONE (freedreno, i965, llvmpipe, softpipe, swr)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_robust_buffer_access_behavior DONE (i965)
GL_ARB_shader_image_size DONE (i965, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, softpipe)
GL_ARB_stencil_texturing DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (nv50, i965, r600, llvmpipe)
GL_ARB_shader_image_size DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_shader_storage_buffer_object DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_stencil_texturing DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (freedreno, nv50, i965, llvmpipe)
GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_view DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (i965, nv50, r600, llvmpipe, swr)
GL_ARB_clear_texture DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_buffer_storage DONE (freedreno, i965, nv50, llvmpipe, swr)
GL_ARB_clear_texture DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_enhanced_layouts DONE (i965, nv50, llvmpipe, softpipe)
- compile-time constant expressions DONE
- explicit byte offsets for blocks DONE
- forced alignment within blocks DONE
- specified vec4-slot component numbers DONE (i965, nv50, llvmpipe, softpipe)
- specified vec4-slot component numbers DONE
- specified transform/feedback layout DONE
- input/output block locations DONE
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object DONE (i965/hsw+)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_stencil8 DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_stencil8 DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, llvmpipe, softpipe, swr)
GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
GL_ARB_ES3_1_compatibility DONE (i965/hsw+)
GL_ARB_clip_control DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_cull_distance DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_ES3_1_compatibility DONE (i965/hsw+, r600)
GL_ARB_clip_control DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_cull_distance DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_derivative_control DONE (i965, nv50, r600)
GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, r600)
GL_ARB_texture_barrier DONE (i965, nv50, r600)
GL_ARB_texture_barrier DONE (freedreno, i965, nv50, r600)
GL_KHR_context_flush_control DONE (all - but needs GLX/EGL extension to be useful)
GL_KHR_robustness DONE (i965)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
@@ -225,39 +228,39 @@ GL 4.6, GLSL 4.60
GL_ARB_gl_spirv in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_indirect_parameters DONE (i965/gen7+, nvc0, radeonsi)
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_polygon_offset_clamp DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr)
GL_ARB_shader_atomic_counter_ops DONE (i965/gen7+, nvc0, radeonsi, softpipe)
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_polygon_offset_clamp DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr)
GL_ARB_shader_atomic_counter_ops DONE (freedreno/a5xx, i965/gen7+, nvc0, r600, radeonsi, softpipe)
GL_ARB_shader_draw_parameters DONE (i965, nvc0, radeonsi)
GL_ARB_shader_group_vote DONE (i965, nvc0, radeonsi)
GL_ARB_spirv_extensions in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_texture_filter_anisotropic DONE (i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, radeonsi, llvmpipe, softpipe)
GL_KHR_no_error started (Timothy Arceri)
GL_ARB_texture_filter_anisotropic DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, nvc0, radeonsi, llvmpipe, softpipe)
GL_KHR_no_error DONE (all drivers)
(*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting
These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_compute_shader DONE (i965/gen7+, softpipe)
GL_ARB_draw_indirect DONE (i965/gen7+, r600, llvmpipe, softpipe, swr)
GL_ARB_compute_shader DONE (freedreno/a5xx, i965/gen7+, softpipe)
GL_ARB_draw_indirect DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments DONE (i965/gen7+, r600, softpipe)
GL_ARB_framebuffer_no_attachments DONE (freedreno, i965/gen7+, softpipe)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965/gen7+, softpipe)
GL_ARB_shader_image_load_store DONE (i965/gen7+, softpipe)
GL_ARB_shader_image_size DONE (i965/gen7+, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965/gen7+, softpipe)
GL_ARB_shader_atomic_counters DONE (freedreno/a5xx, i965/gen7+, softpipe)
GL_ARB_shader_image_load_store DONE (freedreno/a5xx, i965/gen7+, softpipe)
GL_ARB_shader_image_size DONE (freedreno/a5xx, i965/gen7+, softpipe)
GL_ARB_shader_storage_buffer_object DONE (freedreno/a5xx, i965/gen7+, softpipe)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965/gen7+, nv50, r600, llvmpipe, softpipe)
GL_ARB_stencil_texturing DONE (freedreno, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965/gen7+, nv50, llvmpipe, softpipe)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (i965/gen7+, r600)
GS5 Packing/bitfield/conversion functions DONE (i965/gen6+, r600)
GS5 Enhanced textureGather DONE (freedreno, i965/gen7+,)
GS5 Packing/bitfield/conversion functions DONE (i965/gen6+)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
Additional functionality not covered above:
@@ -269,10 +272,10 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
GL_EXT_color_buffer_float DONE (all drivers)
GL_KHR_blend_equation_advanced DONE (i965, nvc0)
GL_KHR_blend_equation_advanced DONE (i965, nvc0, radeonsi)
GL_KHR_debug DONE (all drivers)
GL_KHR_robustness DONE (i965, nvc0, radeonsi)
GL_KHR_texture_compression_astc_ldr DONE (i965/gen9+)
GL_KHR_texture_compression_astc_ldr DONE (freedreno, i965/gen9+)
GL_OES_copy_image DONE (all drivers)
GL_OES_draw_buffers_indexed DONE (all drivers that support GL_ARB_draw_buffers_blend)
GL_OES_draw_elements_base_vertex DONE (all drivers)
@@ -293,7 +296,7 @@ GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES version:
GL_ARB_bindless_texture DONE (radeonsi)
GL_ARB_bindless_texture DONE (nvc0, radeonsi)
GL_ARB_cl_event not started
GL_ARB_compute_variable_group_size DONE (nvc0, radeonsi)
GL_ARB_ES3_2_compatibility DONE (i965/gen8+)
@@ -305,8 +308,8 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_ARB_sample_locations not started
GL_ARB_seamless_cubemap_per_texture DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
GL_ARB_shader_ballot DONE (i965/gen8+, nvc0, radeonsi)
GL_ARB_shader_clock DONE (i965/gen7+, nv50, nvc0, radeonsi)
GL_ARB_shader_stencil_export DONE (i965/gen9+, radeonsi, softpipe, llvmpipe, swr)
GL_ARB_shader_clock DONE (i965/gen7+, nv50, nvc0, r600, radeonsi)
GL_ARB_shader_stencil_export DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr)
GL_ARB_shader_viewport_layer_array DONE (i965/gen6+, nvc0, radeonsi)
GL_ARB_sparse_buffer DONE (radeonsi/CIK+)
GL_ARB_sparse_texture not started
@@ -316,8 +319,8 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_EXT_memory_object DONE (radeonsi)
GL_EXT_memory_object_fd DONE (radeonsi)
GL_EXT_memory_object_win32 not started
GL_EXT_semaphore not started
GL_EXT_semaphore_fd not started
GL_EXT_semaphore DONE (radeonsi)
GL_EXT_semaphore_fd DONE (radeonsi)
GL_EXT_semaphore_win32 not started
GL_KHR_blend_equation_advanced_coherent DONE (i965/gen9+)
GL_KHR_texture_compression_astc_hdr DONE (i965/bxt)
@@ -328,10 +331,10 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_OES_required_internalformat DONE (all drivers)
GL_OES_surfaceless_context DONE (all drivers)
GL_OES_texture_compression_astc DONE (core only)
GL_OES_texture_float DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_float_linear DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float_linear DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_float DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_float_linear DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float_linear DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_view not started - based on GL_ARB_texture_view
GL_OES_viewport_array DONE (i965, nvc0, radeonsi)
GLX_ARB_context_flush_control not started

View File

@@ -16,6 +16,118 @@
<h1>News</h1>
<h2>April 18, 2018</h2>
<p>
<a href="relnotes/18.0.1.html">Mesa 18.0.1</a> is released.
This is a bug-fix release.
</p>
<h2>April 18, 2018</h2>
<p>
<a href="relnotes/17.3.9.html">Mesa 17.3.9</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 17.3.9 will be the final release in the
17.3 series. Users of 17.3 are encouraged to migrate to the 18.0
series in order to obtain future fixes.
</p>
<h2>April 03, 2018</h2>
<p>
<a href="relnotes/17.3.8.html">Mesa 17.3.8</a> is released.
This is a bug-fix release.
</p>
<h2>March 27, 2018</h2>
<p>
<a href="relnotes/18.0.0.html">Mesa 18.0.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>March 21, 2018</h2>
<p>
<a href="relnotes/17.3.7.html">Mesa 17.3.7</a> is released.
This is a bug-fix release.
</p>
<h2>February 26, 2018</h2>
<p>
<a href="relnotes/17.3.6.html">Mesa 17.3.6</a> is released.
This is a bug-fix release.
</p>
<h2>February 19, 2018</h2>
<p>
<a href="relnotes/17.3.5.html">Mesa 17.3.5</a> is released.
This is a bug-fix release.
</p>
<h2>February 15, 2018</h2>
<p>
<a href="relnotes/17.3.4.html">Mesa 17.3.4</a> is released.
This is a bug-fix release.
</p>
<h2>January 18, 2018</h2>
<p>
<a href="relnotes/17.3.3.html">Mesa 17.3.3</a> is released.
This is a bug-fix release.
</p>
<h2>January 8, 2018</h2>
<p>
<a href="relnotes/17.3.2.html">Mesa 17.3.2</a> is released.
This is a bug-fix release.
</p>
<h2>December 22, 2017</h2>
<p>
<a href="relnotes/17.2.8.html">Mesa 17.2.8</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 17.2.8 will be the final release in the
17.2 series. Users of 17.2 are encouraged to migrate to the 17.3
series in order to obtain future fixes.
</p>
<h2>December 21, 2017</h2>
<p>
<a href="relnotes/17.3.1.html">Mesa 17.3.1</a> is released.
This is a bug-fix release.
</p>
<h2>December 14, 2017</h2>
<p>
<a href="relnotes/17.2.7.html">Mesa 17.2.7</a> is released.
This is a bug-fix release.
</p>
<h2>December 8, 2017</h2>
<p>
<a href="relnotes/17.3.0.html">Mesa 17.3.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>November 25, 2017</h2>
<p>
<a href="relnotes/17.2.6.html">Mesa 17.2.6</a> is released.
This is a bug-fix release.
</p>
<h2>November 10, 2017</h2>
<p>
<a href="relnotes/17.2.5.html">Mesa 17.2.5</a> is released.
This is a bug-fix release.
</p>
<h2>October 30, 2017</h2>
<p>
<a href="relnotes/17.2.4.html">Mesa 17.2.4</a> is released.
This is a bug-fix release.
</p>
<h2>October 19, 2017</h2>
<p>
<a href="relnotes/17.2.3.html">Mesa 17.2.3</a> is released.
@@ -32,6 +144,10 @@ This is a bug-fix release.
<p>
<a href="relnotes/17.1.10.html">Mesa 17.1.10</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 17.1.10 will be the final release in the
17.1 series. Users of 17.1 are encouraged to migrate to the 17.2
series in order to obtain future fixes.
</p>
<h2>September 17, 2017</h2>

View File

@@ -75,7 +75,7 @@ you think you've spotted a bug let developers know by filing a
Version 2.6.4 or later should work.
</li>
<li><a href="http://www.makotemplates.org/">Python Mako module</a> -
Python Mako module is required. Version 0.3.4 or later should work.
Python Mako module is required. Version 0.8.0 or later should work.
</li>
<li>lex / yacc - for building the Mesa IR and GLSL compiler.
<div>

View File

@@ -20,7 +20,7 @@
The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation.
Shaders, point/line/triangle rasterization and vertex processing are
implemented with LLVM IR which is translated to x86 or x86-64 machine
implemented with LLVM IR which is translated to x86, x86-64, or ppc64le machine
code.
Also, the driver is multithreaded to take advantage of multiple CPU cores
(up to 8 at this time).
@@ -32,24 +32,36 @@ It's the fastest software rasterizer for Mesa.
<ul>
<li>
<p>An x86 or amd64 processor; 64-bit mode recommended.</p>
<p>
For x86 or amd64 processors, 64-bit mode is recommended.
Support for SSE2 is strongly encouraged. Support for SSE3 and SSE4.1 will
yield the most efficient code. The fewer features the CPU has the more
likely is that you run into underperforming, buggy, or incomplete code.
likely it is that you will run into underperforming, buggy, or incomplete code.
</p>
<p>
For ppc64le processors, use of the Altivec feature (the Vector
Facility) is recommended if supported; use of the VSX feature (the
Vector-Scalar Facility) is recommended if supported AND Mesa is
built with LLVM version 4.0 or later.
</p>
<p>
See /proc/cpuinfo to know what your CPU supports.
</p>
</li>
<li>
<p>LLVM: version 3.4 recommended; 3.3 or later required.</p>
<p>Unless otherwise stated, LLVM version 3.4 is recommended; 3.3 or later is required.</p>
<p>
For Linux, on a recent Debian based distribution do:
</p>
<pre>
aptitude install llvm-dev
</pre>
<p>
If you want development snapshot builds of LLVM for Debian and derived
distributions like Ubuntu, you can use the APT repository at <a
href="https://apt.llvm.org/" title="Debian Development packages for LLVM"
>apt.llvm.org</a>, which are maintained by Debian's LLVM maintainer.
</p>
<p>
For a RPM-based distribution do:
</p>
@@ -108,10 +120,10 @@ To build everything on Linux invoke scons as:
scons build=debug libgl-xlib
</pre>
Alternatively, you can build it with GNU make, if you prefer, by invoking it as
Alternatively, you can build it with autoconf/make with:
<pre>
make linux-llvm
./configure --enable-glx=gallium-xlib --with-gallium-drivers=swrast --disable-dri --disable-gbm --disable-egl
make
</pre>
but the rest of these instructions assume that scons is used.
@@ -228,8 +240,8 @@ build/linux-???-debug/gallium/drivers/llvmpipe:
</ul>
<p>
Some of this tests can output results and benchmarks to a tab-separated-file
for posterior analysis, e.g.:
Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:
</p>
<pre>
build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
@@ -240,8 +252,8 @@ for posterior analysis, e.g.:
<ul>
<li>
When looking to this code by the first time start in lp_state_fs.c, and
then skim through the lp_bld_* functions called in there, and the comments
When looking at this code for the first time, start in lp_state_fs.c, and
then skim through the lp_bld_* functions called there, and the comments
at the top of the lp_bld_*.c functions.
</li>
<li>

178
docs/meson.html Normal file
View File

@@ -0,0 +1,178 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Compilation and Installation using Meson</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Compilation and Installation using Meson</h1>
<h2 id="basic">1. Basic Usage</h2>
<p><strong>The Meson build system is generally considered stable and ready
for production</strong></p>
<p>The meson build is tested on on Linux, macOS, Cygwin and Haiku, it should
work on FreeBSD, DragonflyBSD, NetBSD, and OpenBSD.</p>
<p><strong>Mesa requires Meson >= 0.42.0 to build in general.</strong>
Additionaly, to build the Clover OpenCL state tracker or the OpenSWR driver
meson 0.44.0 or greater is required.
Some older versions of meson do not check that they are too old and will error
out in odd ways.
</p>
<p>
The meson program is used to configure the source directory and generates
either a ninja build file or Visual Studio® build files. The latter must
be enabled via the --backend switch, as ninja is the default backend on all
operating systems. Meson only supports out-of-tree builds, and must be passed a
directory to put built and generated sources into. We'll call that directory
"build" for examples.
</p>
<pre>
meson build/
</pre>
<p>
To see a description of your options you can run <code>meson configure</code>
along with a build directory to view the selected options for. This will show
your meson global arguments and project arguments, along with their defaults
and your local settings.
Moes does not currently support listing options before configure a build
directory, but this feature is being discussed upstream.
</p>
<pre>
meson configure build/
</pre>
<p>
With additional arguments <code>meson configure</code> is used to change
options on already configured build directory. All options passed to this
command are in the form -D "command"="value".
</p>
<pre>
meson configure build/ -Dprefix=/tmp/install -Dglx=true
</pre>
<p>
Once you've run the initial <code>meson</code> command successfully you can use
your configured backend to build the project. With ninja, the -C option can be
be used to point at a directory to build.
</p>
<pre>
ninja -C build/
</pre>
<p>
Without arguments, it will produce libGL.so and/or several other libraries
depending on the options you have chosen. Later, if you want to rebuild for a
different configuration, you should run <code>ninja clean</code> before
changing the configuration, or create a new out of tree build directory for
each configuration you want to build.
http://mesonbuild.com/Using-multiple-build-directories.html
</p>
<dt><code>Environment Variables</code></dt>
<dd><p>Meson supports the standard CC and CXX envrionment variables for
changing the default compiler, and CFLAGS, CXXFLAGS, and LDFLAGS for setting
options to the compiler and linker.
The default compilers depends on your operating system. Meson supports most of
the popular compilers, a complete list is available
<a href="http://mesonbuild.com/Reference-tables.html#compiler-ids">here</a>.
These arguments are consumed and stored by meson when it is initialized or
re-initialized. Therefore passing them to meson configure will not do anything,
and passing them to ninja will only do something if ninja decides to
re-initialze meson, for example, if a meson.build file has been changed.
Changing these variables will not cause all targets to be rebuilt, so running
ninja clean is recomended when changing CFLAGS or CXXFLAGS. meson will never
change compiler in a configured build directory.
</p>
<pre>
CC=clang CXX=clang++ meson build-clang
ninja -C build-clang
ninja -C build-clang clean
touch meson.build
CFLAGS=-Wno-typedef-redefinition ninja -C build-clang
</pre>
<p>Meson also honors DESTDIR for installs</p>
</dd>
<dl>
<dt><code>LLVM</code></dt>
<dd><p>Meson includes upstream logic to wrap llvm-config using it's standard
dependncy interface. It will search $PATH (or %PATH% on windows) for
llvm-config, so using an LLVM from a non-standard path is as easy as
<code>PATH=/path/with/llvm-config:$PATH meson build</code>.
</p></dd>
</dl>
<dl>
<dt><code>PKG_CONFIG_PATH</code></dt>
<dd><p>The
<code>pkg-config</code> utility is a hard requirement for configuring and
building Mesa on Unix-like systems. It is used to search for external libraries
on the system. This environment variable is used to control the search path for
<code>pkg-config</code>. For instance, setting
<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for package
metadata in <code>/usr/X11R6</code> before the standard directories.</p>
</dd>
</dl>
<p>
One of the oddities of meson is that some options are different when passed to
the <code>meson</code> than to <code>meson configure</code>. These options are
passed as --option=foo to <code>meson</code>, but -Doption=foo to <code>meson
configure</code>. Mesa defined options are always passed as -Doption=foo.
<p>
<p>For those coming from autotools be aware of the following:</p>
<dl>
<dt><code>--buildtype/-Dbuildtype</code></dt>
<dd><p>This option will set the compiler debug/optimisation levels to aid
debugging the Mesa libraries.</p>
<p>Note that in meson this defaults to "debugoptimized", and not setting it to
"release" will yield non-optimal performance and binary size. Not using "debug"
may interfer with debbugging as some code and validation will be optimized
away.
</p>
<p> For those wishing to pass their own optimization flags, use the "plain"
buildtype, which causes meson to inject no additional compiler arguments, only
those in the C/CXXFLAGS and those that mesa itself defines.</p>
</dd>
</dl>
<dl>
<dt><code>-Db_ndebug</code></dt>
<dd><p>This option controls assertions in meson projects. When set to false
(the default) assertions are enabled, when set to true they are disabled. This
is unrelated to the <code>buildtype</code>; setting the latter to
<code>release</code> will not turn off assertions.
</p>
</dd>
</dl>

View File

@@ -27,5 +27,5 @@ ARB_texture_float:
enable this extension.
[1] https://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
[1] https://patents.google.com/patent/US6650327B1
[2] https://www.opengl.org/registry/specs/ARB/texture_float.txt

View File

@@ -37,74 +37,98 @@ if you'd like to nominate a patch in the next stable release.
<th>Release</th>
<th>Release manager</th>
<th>Notes</th>
<tr>
<td rowspan="3">18.0</td>
<td>2018-04-20</td>
<td>18.0.2</td>
<td>Juan A. Suarez Romero</td>
<td></td>
</tr>
<tr>
<td rowspan="4">17.2</td>
<td>2017-10-27</td>
<td>17.2.4</td>
<td>2018-05-04</td>
<td>18.0.3</td>
<td>Juan A. Suarez Romero</td>
<td></td>
</tr>
<tr>
<td>2018-05-18</td>
<td>18.0.4</td>
<td>Juan A. Suarez Romero</td>
<td>Last planned 18.0.x release</td>
</tr>
<tr>
<td rowspan="8">18.1</td>
<td>2018-04-20</td>
<td>18.1.0rc1</td>
<td>Dylan Baker</td>
<td></td>
</tr>
<tr>
<td>2018-04-27</td>
<td>18.1.0rc2</td>
<td>Dylan Baker</td>
<td></td>
</tr>
<tr>
<td>2018-05-04</td>
<td>18.1.0rc3</td>
<td>Dylan Baker</td>
<td></td>
</tr>
<tr>
<td>2018-05-11</td>
<td>18.1.0rc4</td>
<td>Dylan Baker</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td>TBD</td>
<td>18.1.1</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>TBD</td>
<td>18.1.2</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>TBD</td>
<td>18.1.3</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>TBD</td>
<td>18.1.4</td>
<td>Emil Velikov</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td rowspan="4">18.2</td>
<td>2018-07-20</td>
<td>18.2.0rc1</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2017-11-10</td>
<td>17.2.5</td>
<td>2018-07-27</td>
<td>18.2.0rc2</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2017-11-24</td>
<td>17.2.6</td>
<td>2018-08-03</td>
<td>18.2.0rc3</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2017-12-08</td>
<td>17.2.7</td>
<td>Emil Velikov</td>
<td>Final planned release for the 17.2 series</td>
</tr>
<tr>
<td rowspan="7">17.3</td>
<td>2017-10-20</td>
<td>17.3.0-rc1</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>2017-10-27</td>
<td>17.3.0-rc2</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>2017-11-03</td>
<td>17.3.0-rc3</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>2017-11-10</td>
<td>17.3.0-rc4</td>
<td>Emil Velikov</td>
<td>May be promoted to 17.3.0 final</td>
</tr>
<tr>
<td>2017-11-24</td>
<td>17.3.1</td>
<td>2018-08-10</td>
<td>18.2.0rc4</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2017-12-08</td>
<td>17.3.2</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>2017-12-22</td>
<td>17.3.3</td>
<td>Emil Velikov</td>
<td></td>
<td>Last planned RC/Final release</td>
</tr>
</table>

View File

@@ -96,7 +96,7 @@ described in the same section.
<p>
Nomination happens in the mesa-stable@ mailing list. However,
maintainer is resposible of checking for forgotten candidates in the
maintainer is responsible of checking for forgotten candidates in the
master branch. This is achieved by a combination of ad-hoc scripts and
a casual search for terms such as regression, fix, broken and similar.
</p>
@@ -272,6 +272,11 @@ It is followed by a brief period (normally 24 or 48 hours) before the actual
release is made.
</p>
<p>
Be aware to add a note to warn about a final release in a series, if
that is the case.
</p>
<h2>Terminology used</h2>
<ul><li>Nominated</ul>
@@ -311,6 +316,10 @@ The candidate for the Mesa X.Y.Z is now available. Currently we have:
- NUMBER nominated (outstanding)
- and NUMBER rejected patches
[If applicable:
Note: this is the final anticipated release in the SERIES series. Users are
encouraged to migrate to the NEXT_SERIES series in order to obtain future fixes.]
BRIEF SUMMARY OF CHANGES
Take a look at section "Mesa stable queue" for more information.
@@ -374,6 +383,9 @@ Queued (NUMBER)
AUTHOR (NUMBER):
COMMIT SUMMARY
[If applicable:
Squashed with
COMMIT SUMMARY]
For example:
@@ -382,16 +394,21 @@ Jonas Pfeil (1):
Squashed with
ralloc: don't leave out the alignment factor
Rejected (NUMBER)
=================
Rejected (11)
=============
AUTHOR (NUMBER):
SHA COMMIT SUMMARY
Reason: ...
For example:
Emil Velikov (1)
a39ad18 configure.ac: honour LLVM_LIBDIR when linking against LLVM
Reason: The patch was reverted shortly after it was merged.
</pre>
@@ -457,9 +474,9 @@ Here is one solution that I've been using.
cd .. &amp;&amp; rm -rf mesa-$__version
# Test the automake binaries
tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version
# Restore LLVM_CONFIG, if applicable:
# export LLVM_CONFIG=`echo $save_LLVM_CONFIG`; unset save_LLVM_CONFIG
tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version
./configure \
--with-dri-drivers=i965,swrast \
--with-gallium-drivers=swrast \
@@ -471,6 +488,10 @@ Here is one solution that I've been using.
--enable-egl \
--with-platforms=x11,drm,wayland,surfaceless
make &amp;&amp; DESTDIR=`pwd`/test make install
# Drop LLVM_CONFIG, if applicable:
# unset LLVM_CONFIG
__glxinfo_cmd='glxinfo 2>&amp;1 | egrep -o "Mesa.*|Gallium.*|.*dri\.so"'
__glxgears_cmd='glxgears 2>&amp;1 | grep -v "configuration file"'
__es2info_cmd='es2_info 2>&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'
@@ -500,8 +521,10 @@ Here is one solution that I've been using.
unset LIBGL_DRIVERS_PATH
unset LIBGL_DEBUG
unset LIBGL_ALWAYS_SOFTWARE
unset GALLIUM_DRIVER
export VK_ICD_FILENAMES=`pwd`/src/intel/vulkan/dev_icd.json
steam steam://rungameid/570 -vconsole -vulkan
unset VK_ICD_FILENAMES
</pre>
<h3>Update version in file VERSION</h3>
@@ -580,7 +603,8 @@ Something like the following steps will do the trick:
<p>
Also, edit docs/relnotes.html to add a link to the new release notes,
edit docs/index.html to add a news entry, and remove the version from
edit docs/index.html to add a news entry and a note in case of the
last release in a series, and remove the version from
docs/release-calendar.html. Then commit and push:
</p>
@@ -596,6 +620,11 @@ docs/release-calendar.html. Then commit and push:
Use the generated template during the releasing process.
</p>
<p>
Again, pay attention to add a note to warn about a final release in a
series, if that is the case.
</p>
<h1 id="website">Update the mesa3d.org website</h1>

View File

@@ -21,6 +21,24 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/18.1.0.html">18.1.0 release notes</a>
<li><a href="relnotes/18.0.1.html">18.0.1 release notes</a>
<li><a href="relnotes/17.3.9.html">17.3.9 release notes</a>
<li><a href="relnotes/17.3.8.html">17.3.8 release notes</a>
<li><a href="relnotes/18.0.0.html">18.0.0 release notes</a>
<li><a href="relnotes/17.3.7.html">17.3.7 release notes</a>
<li><a href="relnotes/17.3.6.html">17.3.6 release notes</a>
<li><a href="relnotes/17.3.5.html">17.3.5 release notes</a>
<li><a href="relnotes/17.3.4.html">17.3.4 release notes</a>
<li><a href="relnotes/17.3.3.html">17.3.3 release notes</a>
<li><a href="relnotes/17.3.2.html">17.3.2 release notes</a>
<li><a href="relnotes/17.2.8.html">17.2.8 release notes</a>
<li><a href="relnotes/17.3.1.html">17.3.1 release notes</a>
<li><a href="relnotes/17.2.7.html">17.2.7 release notes</a>
<li><a href="relnotes/17.3.0.html">17.3.0 release notes</a>
<li><a href="relnotes/17.2.6.html">17.2.6 release notes</a>
<li><a href="relnotes/17.2.5.html">17.2.5 release notes</a>
<li><a href="relnotes/17.2.4.html">17.2.4 release notes</a>
<li><a href="relnotes/17.2.3.html">17.2.3 release notes</a>
<li><a href="relnotes/17.2.2.html">17.2.2 release notes</a>
<li><a href="relnotes/17.1.10.html">17.1.10 release notes</a>

132
docs/relnotes/17.2.4.html Normal file
View File

@@ -0,0 +1,132 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.4 Release Notes / October 30, 2017</h1>
<p>
Mesa 17.2.4 is a bug fix release which fixes bugs found since the 17.2.3 release.
</p>
<p>
Mesa 17.2.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
cb266edc5cf7226219ebaf556ca2e03dff282e0324d20afd80423a5754d1272c mesa-17.2.4.tar.gz
5ba408fecd6e1132e5490eec1a2f04466214e4c65c8b89b331be844768c2e550 mesa-17.2.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (8):</p>
<ul>
<li>cherry-ignore: configure.ac: rework llvm detection and handling</li>
<li>cherry-ignore: glsl: fix derived cs variables</li>
<li>cherry-ignore: added 17.3 nominations.</li>
<li>cherry-ignore: radv: Don't use vgpr indexing for outputs on GFX9.</li>
<li>cherry-ignore: radv: Disallow indirect outputs for GS on GFX9 as well.</li>
<li>cherry-ignore: mesa/bufferobj: don't double negate the range</li>
<li>cherry-ignore: broadcom/vc5: Propagate vc4 aliasing fix to vc5.</li>
<li>Update version to 17.2.4</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>ac/nir: Fix nir_texop_lod on GFX for 1D arrays.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv/image: bump all the offset to uint64_t.</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.3</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>vulkan/wsi: Free the event in x11_manage_fifo_queues().</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>clover: Fix compilation after clang r315871</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/intrinsics: Set the correct num_indices for load_output</li>
<li>intel/fs: Handle flag read/write aliasing in needs_src_copy</li>
<li>anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir</li>
<li>intel/eu: Use EXECUTE_1 for JMPI</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Revert absolute mode for constant buffer pointers.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>Revert "mesa: fix texture updates for ATI_fragment_shader"</li>
</ul>
<p>Matthew Nicholls (1):</p>
<ul>
<li>ac/nir: generate correct instruction for atomic min/max on unsigned images</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>st/mesa: Initialize textures array in st_framebuffer_validate</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: add the draw count buffer to the list of buffers</li>
</ul>
<p>Stefan Schake (1):</p>
<ul>
<li>broadcom/vc4: Fix aliasing issue</li>
</ul>
</div>
</body>
</html>

156
docs/relnotes/17.2.5.html Normal file
View File

@@ -0,0 +1,156 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.5 Release Notes / November 10, 2017</h1>
<p>
Mesa 17.2.5 is a bug fix release which fixes bugs found since the 17.2.4 release.
</p>
<p>
Mesa 17.2.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
25b40e72fad64b096c2d8d6fe9579369954debe7970d4ad53e5033c7eec2918b mesa-17.2.5.tar.gz
7f7f914b7b9ea0b15f2d9d01a4375e311b0e90e55683b8e8a67ce8691eb1070f mesa-17.2.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (8):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.4</li>
<li>cherry-ignore: radv: copy indirect lowering settings from radeonsi</li>
<li>cherry-ignore: i965: fix blorp stage_prog_data-&gt;param leak</li>
<li>cherry-ignore: etnaviv: don't do resolve-in-place without valid TS</li>
<li>cherry-ignore: intel/fs: Alloc pull constants off mem_ctx</li>
<li>cherry-ignore: added 17.3 nominations.</li>
<li>cherry-ignore: automake: include git_sha1.h.in in release tarball</li>
<li>Update version to 17.2.5</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Don't expose heaps with 0 memory.</li>
<li>radv: Don't use vgpr indexing for outputs on GFX9.</li>
<li>radv: Disallow indirect outputs for GS on GFX9 as well.</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>i915g: make gears run again.</li>
<li>radv: free attachments on end command buffer.</li>
<li>radv: add initial copy descriptor support. (v2)</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>vc4: fix release build</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>spirv: Claim support for the simple memory model</li>
<li>i965/blorp: Use blorp_to_isl_format for src_isl_format in blit_miptrees</li>
<li>i965/blorp: Use more temporary isl_format variables</li>
<li>i965/miptree: Take an isl_format in render_aux_usage</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>mesa: Accept GL_BACK in get_fb0_attachment with ARB_ES3_1_compatibility.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/video: add gfx9 offsets when rejoin the video surface</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>st/dri: don't expose modifiers in EGL if the driver doesn't implement them</li>
<li>ac/surface/gfx9: don't allow DCC for the smallest mipmap levels</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>i965: Check CCS_E compatibility for texture view rendering</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>amd/common/gfx9: workaround DCC corruption more conservatively</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>i965: unref push_const_bo in intelDestroyContext</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>radv: copy indirect lowering settings from radeonsi</li>
</ul>
<p>Tomasz Figa (1):</p>
<ul>
<li>glsl: Allow precision mismatch on dead data with GLSL ES 1.00</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>intel/compiler/gen9: Pixel shader header only workaround</li>
</ul>
</div>
</body>
</html>

187
docs/relnotes/17.2.6.html Normal file
View File

@@ -0,0 +1,187 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.6 Release Notes / November 25, 2017</h1>
<p>
Mesa 17.2.6 is a bug fix release which fixes bugs found since the 17.2.5 release.
</p>
<p>
Mesa 17.2.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
a9ed76702ffb14ad674ad48899f5c8c7e3a0f987911878a5dfdc4117dce5b415 mesa-17.2.6.tar.gz
6ad85224620330be26ab68c8fc78381b12b38b610ade2db8716b38faaa8f30de mesa-17.2.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (2):</p>
<ul>
<li>glx/drisw: Fix glXMakeCurrent(dpy, None, ctx)</li>
<li>glx/dri3: Fix passing renderType into glXCreateContext</li>
</ul>
<p>Alex Smith (2):</p>
<ul>
<li>spirv: Use correct type for sampled images</li>
<li>nir/spirv: tg4 requires a sampler</li>
</ul>
<p>Andres Gomez (14):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.5</li>
<li>cherry-ignore: intel/fs: Use a pure vertical stride for large register strides</li>
<li>cherry-ignore: intel/nir: Use the correct indirect lowering masks in link_shaders</li>
<li>cherry-ignore: intel/fs: Use the original destination region for int MUL lowering</li>
<li>cherry-ignore: intel/fs: refactors</li>
<li>cherry-ignore: r600/shader: reserve first register of vertex shader.</li>
<li>cherry-ignore: anv/cmd_buffer: Advance the address when initializing clear colors</li>
<li>cherry-ignore: anv/cmd_buffer: Take bo_offset into account in fast clear state addresses</li>
<li>cherry-ignore: i965: Mark BOs as external when we export their handle</li>
<li>cherry-ignore: added 17.3 nominations.</li>
<li>cherry-ignore: glsl: Fix typo fragement -&gt; fragment</li>
<li>cherry-ignore: egl: pass the dri2_dpy to the $plat_teardown functions</li>
<li>cherry-ignore: Revert "intel/fs: Use a pure vertical stride for large register strides"</li>
<li>Update version to 17.2.6</li>
</ul>
<p>Anuj Phogat (2):</p>
<ul>
<li>i965: Program DWord Length in MI_FLUSH_DW</li>
<li>i965/gen8+: Fix the number of dwords programmed in MI_FLUSH_DW</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>radv: Free syncobj with multiple imports.</li>
<li>radv: Free temporary syncobj after waiting on it.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600: fix isoline tess factor component swapping.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Add a fallback when fourcc query isn't supported</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>autotools: Set C++ visibility flags on Intel</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>targets/opencl: don't hardcode the icd file install to /etc/...</li>
<li>configure.ac: loosen --enable-glvnd check to honour egl</li>
<li>configure.ac: require xcb* for the omx/va/... when using x11 platform</li>
</ul>
<p>George Barrett (1):</p>
<ul>
<li>glsl: Catch subscripted calls to undeclared subroutines</li>
</ul>
<p>Jason Ekstrand (9):</p>
<ul>
<li>intel/fs: Use ANY/ALL32 predicates in SIMD32</li>
<li>intel/fs: Use an explicit D type for vote any/all/eq intrinsics</li>
<li>intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all</li>
<li>intel/eu/reg: Add a subscript() helper</li>
<li>intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core</li>
<li>intel/fs: Fix integer multiplication lowering for src/dst hazards</li>
<li>intel/fs: Mark 64-bit values as being contiguous</li>
<li>intel/fs: Rework zero-length URB write handling</li>
<li>i965: Add stencil buffers to cache set regardless of stencil texturing</li>
</ul>
<p>Kenneth Graunke (5):</p>
<ul>
<li>i965: properly initialize brw-&gt;cs.base.stage to MESA_SHADER_COMPUTE</li>
<li>i965: Make L3 configuration atom listen for TCS/TES program updates.</li>
<li>intel/tools: Fix detection of enabled shader stages.</li>
<li>i965: Implement another VF cache invalidate workaround on Gen8+.</li>
<li>i965: Upload invariant state once at the start of the batch on Gen4-5.</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>i965/fs: Fix extract_i8/u8 to a 64-bit destination</li>
<li>i965/fs: Split all 32-&gt;64-bit MOVs on CHV, BXT, GLK</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>glsl: Transform fb buffers are only active if a variable uses them</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>ddebug: fix use-after-free of streamout targets</li>
</ul>
<p>Tim Rowley (2):</p>
<ul>
<li>swr/rast: Use gather instruction for i32gather_ps on simd16/avx512</li>
<li>swr/rast: Faster emulated simd16 permute</li>
</ul>
<p>Timothy Arceri (3):</p>
<ul>
<li>glsl: drop cache_fallback</li>
<li>glsl: use the correct parent when allocating program data members</li>
<li>mesa: rework how we free gl_shader_program_data</li>
</ul>
</div>
</body>
</html>

247
docs/relnotes/17.2.7.html Normal file
View File

@@ -0,0 +1,247 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.7 Release Notes / December 14, 2017</h1>
<p>
Mesa 17.2.7 is a bug fix release which fixes bugs found since the 17.2.6 release.
</p>
<p>
Mesa 17.2.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e8d837a1cd55014e636e9caf6c75cfbe1b3e4be9ab3fa125f5ef38398aa12e97 mesa-17.2.7.tar.gz
50cfdea8df55045797b4d0409591c04c784d9551c4da09b8178874dbe5a37a68 mesa-17.2.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102006">Bug 102006</a> - gstreamer vaapih264enc segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102552">Bug 102552</a> - Null dereference due to not checking return value of util_format_description</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103732">Bug 103732</a> - [swr] often gets stuck in piglit's glx-multi-context-single-window test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of memfd_create follows non-static declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Add LLVM version to the device name string</li>
</ul>
<p>Andres Gomez (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.6</li>
<li>docs: remove bug 103626 from fix list as per 17.2.6</li>
</ul>
<p>Ben Crocker (2):</p>
<ul>
<li>docs/llvmpipe.html: Minor edits</li>
<li>docs/llvmpipe: document ppc64le as alternative architecture to x86.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600/sb: handle jump after target to end of program. (v2)</li>
</ul>
<p>Denis Pauk (1):</p>
<ul>
<li>gallium/{r600, radeonsi}: Fix segfault with color format (v2)</li>
</ul>
<p>Eduardo Lima Mitev (3):</p>
<ul>
<li>glsl_parser_extra: Add utility to copy symbols between symbol tables</li>
<li>glsl: Use the utility function to copy symbols between symbol tables</li>
<li>glsl/linker: Check that re-declared, inter-shader built-in blocks match</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>gl_table.py: add extern C guard for the generated glapitable.h</li>
<li>cherry-ignore: radeonsi: allow DMABUF exports for local buffers</li>
<li>Update version to 17.2.7</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>compiler: use NDEBUG to guard asserts</li>
</ul>
<p>Fabian Bieler (2):</p>
<ul>
<li>glsl: Match order of gl_LightSourceParameters elements.</li>
<li>glsl: Fix gl_NormalScale.</li>
</ul>
<p>Frank Richter (1):</p>
<ul>
<li>gallium/wgl: fix default pixel format issue</li>
</ul>
<p>George Kyriazis (1):</p>
<ul>
<li>swr: Handle resource across context changes</li>
</ul>
<p>Gert Wollny (2):</p>
<ul>
<li>r600: Emit EOP for more CF instruction types</li>
<li>r600/sb: do not convert if-blocks that contain indirect array access</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>glsl: fix derived cs variables</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>nir/opcodes: Fix constant-folding of bitfield_insert</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Disable regular fast-clears (CCS_D) on gen9+</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>glsl: add varying resources for arrays of complex types</li>
</ul>
<p>Julien Isorce (1):</p>
<ul>
<li>st/va: change frame_idx from array to hash table</li>
</ul>
<p>Kai Wasserbäch (1):</p>
<ul>
<li>docs: Point to apt.llvm.org for development snapshot packages</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>meta: Initialize depth/clear values on declaration.</li>
<li>meta: Fix ClearTexture with GL_DEPTH_COMPONENT.</li>
<li>i965: Fix Smooth Point Enables.</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi: fix layered DCC fast clear</li>
<li>radeonsi/gfx9: fix importing shared textures with DCC</li>
<li>radeonsi: flush the context after resource_copy_region for buffer exports</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>i965/fs: Handle negating immediates on MADs when propagating saturates</li>
<li>util: Fix SHA1 implementation on big endian</li>
<li>util: Fix disk_cache index calculation on big endian</li>
<li>i965/fs: Unpack count argument to 64-bit shift ops on Atom</li>
</ul>
<p>Nicolai Hähnle (3):</p>
<ul>
<li>radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check</li>
<li>glsl: allow any l-value of an input variable as interpolant in interpolateAt*</li>
<li>glsl: fix interpolateAtXxx(some_vec[idx], ...) with dynamic idx</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>nvc0/ir: Properly lower 64-bit shifts when the shift value is &gt;32</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa/gles: adjust internal format in glTexSubImage2D error checks</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>glsl: get correct member type when processing xfb ifc arrays</li>
</ul>
<p>Vadym Shovkoplias (2):</p>
<ul>
<li>intel/blorp: Fix possible NULL pointer dereferencing</li>
<li>glx/dri3: Remove unused deviceName variable</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>anv: Check if memfd_create is already defined.</li>
</ul>
</div>
</body>
</html>

112
docs/relnotes/17.2.8.html Normal file
View File

@@ -0,0 +1,112 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.8 Release Notes / December 22, 2017</h1>
<p>
Mesa 17.2.8 is a bug fix release which fixes bugs found since the 17.2.7 release.
</p>
<p>
Mesa 17.2.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c715c3a3d6fe26a69c096f573ec416e038a548f0405e3befedd5136517527a84 mesa-17.2.8.tar.gz
6e940345cceaadfd805d701ed2b956589fa77fe8c39991da30ed51ea6b9d095f mesa-17.2.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (6):</p>
<ul>
<li>cherry-ignore: swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.</li>
<li>cherry-ignore: added 17.3 nominations.</li>
<li>cherry-ignore: radv: port merge tess info from anv</li>
<li>cherry-ignore: main: Clear shader program data whenever ProgramBinary is called</li>
<li>cherry-ignore: r600: set DX10_CLAMP for compute shader too</li>
<li>Update version to 17.2.8</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>spirv: Fix loading an entire block at once.</li>
<li>radv: Fix multi-layer blits.</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>xlib: call _mesa_warning() instead of fprintf()</li>
<li>gallium/aux: include nr_samples in util_resource_size() computation</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.7</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965/vec4: use a temp register to compute offsets for pull loads</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move destroy command before feedback command</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>util: Assume little endian in the absence of platform-specific handling</li>
<li>util: Add a SHA1 unit test program</li>
</ul>
<p>Roland Scheidegger (2):</p>
<ul>
<li>r600: use min_dx10/max_dx10 instead of min/max</li>
<li>r600: use DX10_CLAMP bit in shader setup</li>
</ul>
</div>
</body>
</html>

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.0 Release Notes / TBD</h1>
<h1>Mesa 17.3.0 Release Notes / December 8. 2017</h1>
<p>
Mesa 17.3.0 is a new development release.
@@ -33,7 +33,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD.
0cb1ffe2b4637d80f08df3bdfeb300352dcffd8ff4f6711278639b084e3f07f9 mesa-17.3.0.tar.gz
29a0a3a6c39990d491a1a58ed5c692e596b3bfc6c01d0b45e0b787116c50c6d9 mesa-17.3.0.tar.xz
</pre>
@@ -58,14 +59,187 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
<ul>
TBD
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100613">Bug 100613</a> - Regression in Mesa 17 on s390x (zSystems)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101334">Bug 101334</a> - AMD SI cards: Some vulkan apps freeze the system</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101655">Bug 101655</a> - Explicit sync support for android</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101691">Bug 101691</a> - gfx corruption on windowed 3d-apps running on dGPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101709">Bug 101709</a> - [llvmpipe] piglit gl-1.0-scissor-offscreen regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101766">Bug 101766</a> - Assertion `!&quot;invalid type&quot;' failed when constant expression involves literal of different type</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101832">Bug 101832</a> - [PATCH][regression][bisect] Xorg fails to start after f50aa21456d82c8cb6fbaa565835f1acc1720a5d</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101851">Bug 101851</a> - [regression] libEGL_common.a undefined reference to '__gxx_personality_v0'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101867">Bug 101867</a> - Launch options window renders black in Feral Games in current Mesa trunk</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101876">Bug 101876</a> - SIGSEGV when launching Steam</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101910">Bug 101910</a> - [BYT] ES31-CTS.functional.copy_image.non_compressed.viewclass_96_bits.rgb32f_rgb32f</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101925">Bug 101925</a> - playstore/webview crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101941">Bug 101941</a> - Getting different output depending on attribute declaration order</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101961">Bug 101961</a> - Serious Sam Fusion hangs system completely</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101981">Bug 101981</a> - Commit ddc32537d6db69198e88ef0dfe19770bf9daa536 breaks rendering in multiple applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101982">Bug 101982</a> - Weston crashes when running an OpenGL program on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101983">Bug 101983</a> - [G33] ES2-CTS.functional.shaders.struct.uniform.sampler_nested* regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101989">Bug 101989</a> - ES3-CTS.functional.state_query.integers.viewport_getinteger regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102006">Bug 102006</a> - gstreamer vaapih264enc segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102014">Bug 102014</a> - Mesa git build broken by commit bc7f41e11d325280db12e7b9444501357bc13922</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102015">Bug 102015</a> - [Regression,bisected]: Segfaults with various programs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102024">Bug 102024</a> - FORMAT_FEATURE_SAMPLED_IMAGE_BIT not supported for D16_UNORM and D32_SFLOAT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102038">Bug 102038</a> - assertion failure in update_framebuffer_size</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102050">Bug 102050</a> - commit b4f639d02a causes build breakage on Android 32bit builds</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102052">Bug 102052</a> - No package 'expat' found</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102062">Bug 102062</a> - Segfault at eglCreateContext in android-x86</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102125">Bug 102125</a> - [softpipe] piglit arb_texture_view-targets regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102148">Bug 102148</a> - Crash when running qopenglwidget example on mesa llvmpipe win32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102201">Bug 102201</a> - [regression, SI] GPU crash in Unigine Valley</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102241">Bug 102241</a> - gallium/wgl: SwapBuffers freezing regularly with swap interval enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102274">Bug 102274</a> - assertion failure in ir_validate.cpp:240</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102308">Bug 102308</a> - segfault in glCompressedTextureSubImage3D</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102358">Bug 102358</a> - WarThunder freezes at start, with activated vsync (vblank_mode=2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102377">Bug 102377</a> - PIPE_*_4BYTE_ALIGNED_ONLY caps crashing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102429">Bug 102429</a> - [regression, SI] Performance decrease in Unigine Valley &amp; Heaven</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102454">Bug 102454</a> - glibc 2.26 doesn't provide anymore xlocale.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102461">Bug 102461</a> - [llvmpipe] piglit glean fragprog1 XPD test 1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102467">Bug 102467</a> - src/mesa/state_tracker/st_cb_readpixels.c:178]: (warning) Redundant assignment</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102496">Bug 102496</a> - Frontbuffer rendering corruption on mesa master</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102502">Bug 102502</a> - [bisected] Kodi crashes since commit 707d2e8b - gallium: fold u_trim_pipe_prim call from st/mesa to drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102530">Bug 102530</a> - [bisected] Kodi crashes when launching a stream - commit bd2662bf</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102552">Bug 102552</a> - Null dereference due to not checking return value of util_format_description</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102565">Bug 102565</a> - u_debug_stack.c:114: undefined reference to `_Ux86_64_getcontext'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102573">Bug 102573</a> - fails to build on armel</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102665">Bug 102665</a> - test_glsl_to_tgsi_lifetime.cpp:53:67: error: &gt;&gt; should be &gt; &gt; within a nested template argument list</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102685">Bug 102685</a> - piglit.spec.glsl-1_50.compiler.vs-redeclares-pervertex-out-before-global-redeclaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102844">Bug 102844</a> - memory leak with glDeleteProgram for shader program type GL_COMPUTE_SHADER</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102847">Bug 102847</a> - swr fail to build with llvm-5.0.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102852">Bug 102852</a> - Scons: Support the new Scons 3.0.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102904">Bug 102904</a> - piglit and gl45 cts linker tests regressed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102924">Bug 102924</a> - mesa (git version) images too dark</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102940">Bug 102940</a> - Regression: Vulkan KMS rendering crashes since 17.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102955">Bug 102955</a> - HyperZ related rendering issue in ARK: Survival Evolved</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102999">Bug 102999</a> - [BISECTED,REGRESSION] Failing Android EGL dEQP with RGBA configs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103002">Bug 103002</a> - string_buffer_test.cpp:43: error: ISO C++ forbids initialization of member str1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085">Bug 103085</a> - [ivb byt hsw] piglit.spec.arb_indirect_parameters.tf-count-arrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103101">Bug 103101</a> - [SKL][bisected] DiRT Rally GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103128">Bug 103128</a> - [softpipe] piglit fs-ldexp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103214">Bug 103214</a> - GLES CTS functional.state_query.indexed.atomic_counter regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103247">Bug 103247</a> - Performance regression: car chase, manhattan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103253">Bug 103253</a> - blob.h:138:1: error: unknown type name 'ssize_t'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103265">Bug 103265</a> - [llvmpipe] piglit depth-tex-compare regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103323">Bug 103323</a> - Possible unintended error message in file pixel.c line 286</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103537">Bug 103537</a> - i965: Shadow of Mordor broken since commit 379b24a40d3d34ffdaaeb1b328f50e28ecb01468 on Haswell</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103759">Bug 103759</a> - plasma desktop corrupted rendering</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of memfd_create follows non-static declaration</li>
</ul>
<h2>Changes</h2>
<ul>
TBD
</ul>
</div>
</body>

191
docs/relnotes/17.3.1.html Normal file
View File

@@ -0,0 +1,191 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.1 Release Notes / December 21, 2017</h1>
<p>
Mesa 17.3.1 is a bug fix release which fixes bugs found since the 17.3.0 release.
</p>
<p>
Mesa 17.3.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
b0bb0419dbe3043ed4682a28eaf95721f427ca3f23a3c2a7dc77dbe8a3b6384d mesa-17.3.1.tar.gz
9ae607e0998a586fb2c866cfc8e45e6f52d1c56cb1b41288253ea83eada824c1 mesa-17.3.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Add LLVM version to the device name string</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>spirv: Fix loading an entire block at once.</li>
<li>radv: Don't advertise VK_EXT_debug_report.</li>
<li>radv: Fix multi-layer blits.</li>
</ul>
<p>Ben Crocker (1):</p>
<ul>
<li>docs/llvmpipe: document ppc64le as alternative architecture to x86.</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>xlib: call _mesa_warning() instead of fprintf()</li>
<li>gallium/aux: include nr_samples in util_resource_size() computation</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: port merge tess info from anv</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.0</li>
<li>util: scons: wire up the sha1 test</li>
<li>cherry-ignore: meson: fix strtof locale support check</li>
<li>cherry-ignore: util: add mesa-sha1 test to meson</li>
<li>Update version to 17.3.1</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>compiler: use NDEBUG to guard asserts</li>
</ul>
<p>Fabian Bieler (2):</p>
<ul>
<li>glsl: Match order of gl_LightSourceParameters elements.</li>
<li>glsl: Fix gl_NormalScale.</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/sb: do not convert if-blocks that contain indirect array access</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>nir/opcodes: Fix constant-folding of bitfield_insert</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Switch over to fully external-or-not MOCS scheme</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>travis: disable Meson build</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>meta: Initialize depth/clear values on declaration.</li>
<li>meta: Fix ClearTexture with GL_DEPTH_COMPONENT.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move destroy command before feedback command</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>radeonsi: flush the context after resource_copy_region for buffer exports</li>
<li>radeonsi: allow DMABUF exports for local buffers</li>
<li>winsys/amdgpu: disable local BOs again due to worse performance</li>
<li>radeonsi: don't call force_dcc_off for buffers</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>util: Assume little endian in the absence of platform-specific handling</li>
<li>util: Add a SHA1 unit test program</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>nvc0/ir: Properly lower 64-bit shifts when the shift value is &gt;32</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>glsl: get correct member type when processing xfb ifc arrays</li>
</ul>
<p>Vadym Shovkoplias (2):</p>
<ul>
<li>glx/dri3: Remove unused deviceName variable</li>
<li>util/disk_cache: Remove unneeded free() on always null string</li>
</ul>
</div>
</body>
</html>

109
docs/relnotes/17.3.2.html Normal file
View File

@@ -0,0 +1,109 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.2 Release Notes / January 8, 2018</h1>
<p>
Mesa 17.3.2 is a bug fix release which fixes bugs found since the 17.3.1 release.
</p>
<p>
Mesa 17.3.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f997e80f14c385f9a2ba827c2b74aebf1b7426712ca4a81c631ef9f78e437bf4 mesa-17.3.2.tar.gz
e2844a13f2d6f8f24bee65804a51c42d8dc6ae9c36cff7ee61d0940e796d64c6 mesa-17.3.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97852">Bug 97852</a> - Unreal Engine corrupted preview viewport</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103801">Bug 103801</a> - [i965] &gt;Observer_ issue</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104288">Bug 104288</a> - Steamroll needs allow_glsl_cross_stage_interpolation_mismatch=true</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Fix DCC compatible formats.</li>
</ul>
<p>Brendan King (1):</p>
<ul>
<li>egl: link libEGL against the dynamic version of libglapi</li>
</ul>
<p>Dave Airlie (6):</p>
<ul>
<li>radv/gfx9: add support for 3d images to blit 2d paths</li>
<li>radv: handle depth/stencil image copy with layouts better. (v3.1)</li>
<li>radv/meta: fix blit paths for depth/stencil (v2.1)</li>
<li>radv: fix issue with multisample positions and interp_var_at_sample.</li>
<li>radv/gfx9: add 3d sampler image-&gt;buffer copy shader. (v3)</li>
<li>radv: don't do format replacement on tc compat htile surfaces.</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.1</li>
<li>Update version to 17.3.2</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>egl: let each platform decided how to handle LIBGL_ALWAYS_SOFTWARE</li>
</ul>
<p>Rob Herring (1):</p>
<ul>
<li>egl/android: Fix build break with dri2_initialize_android _EGLDisplay parameter</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv/gfx9: fix primitive topology when adjacency is used</li>
<li>radv: use a faster version for nir_op_pack_half_2x16</li>
</ul>
<p>Tapani Pälli (2):</p>
<ul>
<li>mesa: add AllowGLSLCrossStageInterpolationMismatch workaround</li>
<li>drirc: set allow_glsl_cross_stage_interpolation_mismatch for more games</li>
</ul>
</div>
</body>
</html>

151
docs/relnotes/17.3.3.html Normal file
View File

@@ -0,0 +1,151 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.3 Release Notes / January 18, 2018</h1>
<p>
Mesa 17.3.3 is a bug fix release which fixes bugs found since the 17.3.2 release.
</p>
<p>
Mesa 17.3.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c733d37a161501cd81dc9b309ccb613753b98eafc6d35e0847548a6642749772 mesa-17.3.3.tar.gz
41bac5de0ef6adc1f41a1ec0f80c19e361298ce02fa81b5f9ba4fdca33a9379b mesa-17.3.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104214">Bug 104214</a> - Dota crashes when switching from game to desktop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104492">Bug 104492</a> - Compute Shader: Wrong alignment when assigning struct value to structured SSBO</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104551">Bug 104551</a> - Check if Mako templates for Python are installed</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (3):</p>
<ul>
<li>anv: Add missing unlock in anv_scratch_pool_alloc</li>
<li>anv: Take write mask into account in has_color_buffer_write_enabled</li>
<li>anv: Make sure state on primary is correct after CmdExecuteCommands</li>
</ul>
<p>Andres Gomez (1):</p>
<ul>
<li>anv: Import mako templates only during execution of anv_extensions</li>
</ul>
<p>Bas Nieuwenhuizen (11):</p>
<ul>
<li>radv: Invert condition for all samples identical during resolve.</li>
<li>radv: Flush caches before subpass resolve.</li>
<li>radv: Fix fragment resolve destination offset.</li>
<li>radv: Use correct framebuffer size for partial FS resolves.</li>
<li>radv: Always use fragment resolve if dest uses DCC.</li>
<li>Revert "radv/gfx9: fix block compression texture views."</li>
<li>radv: Use correct HTILE expanded words.</li>
<li>radv: Allow writing 0 scissors.</li>
<li>ac/nir: Handle loading data from compact arrays.</li>
<li>radv: Invalidate L1 for VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT.</li>
<li>ac/nir: Sanitize location_frac for local variables.</li>
</ul>
<p>Dave Airlie (8):</p>
<ul>
<li>radv: fix events on compute queues.</li>
<li>radv: fix pipeline statistics end query on compute queue</li>
<li>radv/gfx9: fix 3d image to image transfers on compute queues.</li>
<li>radv/gfx9: fix 3d image clears on compute queues</li>
<li>radv/gfx9: fix buffer to image for 3d images on compute queues</li>
<li>radv/gfx9: fix block compression texture views.</li>
<li>radv/gfx9: use a bigger hammer to flush cb/db caches.</li>
<li>radv/gfx9: use correct swizzle parameter to work out border swizzle.</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.2</li>
</ul>
<p>Florian Will (1):</p>
<ul>
<li>glsl: Respect std430 layout in lower_buffer_access</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>cherry-ignore: intel/fs: Use the original destination region for int MUL lowering</li>
<li>cherry-ignore: i965/fs: Use UW types when using V immediates</li>
<li>cherry-ignore: main: Clear shader program data whenever ProgramBinary is called</li>
<li>cherry-ignore: egl: pass the dri2_dpy to the $plat_teardown functions</li>
<li>cherry-ignore: vulkan/wsi: free cmd pools</li>
<li>Update version to 17.3.3</li>
</ul>
<p>Józef Kucia (1):</p>
<ul>
<li>radeonsi: fix alpha-to-coverage if color writes are disabled</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Require space for MI_BATCHBUFFER_END.</li>
<li>i965: Torch public intel_batchbuffer_emit_dword/float helpers.</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: disable in-place resolve for non-supertiled surfaces</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>anv: VkDescriptorSetLayoutBinding can have descriptorCount == 0</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>loader/dri3: Avoid freeing renderbuffers in use</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: fix invalid sign masks in avx512 simdlib code</li>
</ul>
</div>
</body>
</html>

275
docs/relnotes/17.3.4.html Normal file
View File

@@ -0,0 +1,275 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.4 Release Notes / January 15, 2018</h1>
<p>
Mesa 17.3.4 is a bug fix release which fixes bugs found since the 17.3.3 release.
</p>
<p>
Mesa 17.3.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
2d3a4c3cbc995b3e192361dce710d8c749e046e7575aa1b7d8fc9e6b4df28f84 mesa-17.3.4.tar.gz
71f995e233bc5df1a0dd46c980d1720106e7f82f02d61c1ca50854b5e02590d0 mesa-17.3.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104745">Bug 104745</a> - HEVC VDPAU decoding broken on RX 460 with UVD Firmware v1.130</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104818">Bug 104818</a> - mesa fails to build on ia64</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen&lt;8</li>
</ul>
<p>Bas Nieuwenhuizen (10):</p>
<ul>
<li>radv: Fix ordering issue in meta memory allocation failure path.</li>
<li>radv: Fix memory allocation failure path in compute resolve init.</li>
<li>radv: Fix freeing meta state if the device pipeline cache fails to allocate.</li>
<li>radv: Fix fragment resolve init memory allocation failure paths.</li>
<li>radv: Fix bufimage failure deallocation.</li>
<li>radv: Init variant entry with memset.</li>
<li>radv: Don't allow 3d or 1d depth/stencil textures.</li>
<li>ac/nir: Use instance_rate_inputs per attribute, not per variable.</li>
<li>ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.</li>
<li>ac/nir: Fix vector extraction if source vector has &gt;4 elements.</li>
</ul>
<p>Boyuan Zhang (2):</p>
<ul>
<li>radeon/vcn: add and manage render picture list</li>
<li>radeon/uvd: add and manage render picture list</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>configure.ac: add missing llvm dependencies to .pc files</li>
</ul>
<p>Dave Airlie (10):</p>
<ul>
<li>r600/sb: fix a bug emitting ar load from a constant.</li>
<li>ac/nir: account for view index in the user sgpr allocation.</li>
<li>radv: add fs_key meta format support to resolve passes.</li>
<li>radv: don't use hw resolve for integer image formats</li>
<li>radv: don't use hw resolves for r16g16 norm formats.</li>
<li>radv: move spi_baryc_cntl to pipeline</li>
<li>r600/sb: insert the else clause when we might depart from a loop</li>
<li>radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)</li>
<li>radv/gfx9: fix block compression texture views. (v2)</li>
<li>virgl: also remove dimension on indirect.</li>
</ul>
<p>Eleni Maria Stea (1):</p>
<ul>
<li>mesa: Fix function pointers initialization in status tracker</li>
</ul>
<p>Emil Velikov (18):</p>
<ul>
<li>cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext</li>
<li>cherry-ignore: swr: refactor swr_create_screen to allow for proper cleanup on error</li>
<li>cherry-ignore: anv: add explicit 18.0 only nominations</li>
<li>cherry-ignore: radv: fix sample_mask_in loading. (v3.1)</li>
<li>cherry-ignore: meson: multiple fixes</li>
<li>cherry-ignore: swr/rast: support llvm 3.9 type declarations</li>
<li>Revert "cherry-ignore: intel/fs: Use the original destination region for int MUL lowering"</li>
<li>cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs</li>
<li>cherry-ignore: add gen10 fixes</li>
<li>cherry-ignore: add r600/amdgpu 18.0 nominations</li>
<li>cherry-ignore: add i965 shader cache fixes</li>
<li>cherry-ignore: nir: mark unused space in packed_tex_data</li>
<li>radv: Stop advertising VK_KHX_multiview</li>
<li>cherry-ignore: radv: Don't expose VK_KHX_multiview on android.</li>
<li>configure.ac: correct driglx-direct help text</li>
<li>cherry-ignore: add meson fix</li>
<li>cherry-ignore: add a few more meson fixes</li>
<li>Update version to 17.3.4</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>radeon: remove left over dead code</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/shader: Initialize max_driver_temp_used correctly for the first time</li>
</ul>
<p>Grazvydas Ignotas (2):</p>
<ul>
<li>st/va: release held locks in error paths</li>
<li>st/vdpau: release held lock in error path</li>
</ul>
<p>Igor Gnatenko (1):</p>
<ul>
<li>link mesautil with pthreads</li>
</ul>
<p>Indrajit Das (4):</p>
<ul>
<li>st/omx_bellagio: Update default intra matrix per MPEG2 spec</li>
<li>radeon/uvd: update quantiser matrices only when requested</li>
<li>radeon/vcn: update quantiser matrices only when requested</li>
<li>st/va: clear pointers for mpeg2 quantiser matrices</li>
</ul>
<p>Jason Ekstrand (19):</p>
<ul>
<li>i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer</li>
<li>i965: Add more precise cache tracking helpers</li>
<li>i965/blorp: Add more destination flushing</li>
<li>i965: Track the depth and render caches separately</li>
<li>i965: Track format and aux usage in the render cache</li>
<li>Re-enable regular fast-clears (CCS_D) on gen9+</li>
<li>i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage</li>
<li>i965/miptree: Add an explicit tiling parameter to create_for_bo</li>
<li>i965/miptree: Use the tiling from the modifier instead of the BO</li>
<li>i965/bufmgr: Add a create_from_prime_tiled function</li>
<li>i965: Set tiling on BOs imported with modifiers</li>
<li>i965/miptree: Take an aux_usage in prepare/finish_render</li>
<li>i965/miptree: Add an aux_disabled parameter to render_aux_usage</li>
<li>i965/surface_state: Drop brw_aux_surface_disabled</li>
<li>intel/fs: Use the original destination region for int MUL lowering</li>
<li>anv/pipeline: Don't look at blend state unless we have an attachment</li>
<li>anv/cmd_buffer: Re-emit the pipeline at every subpass</li>
<li>anv: Stop advertising VK_KHX_multiview</li>
<li>i965: Call prepare_external after implicit window-system MSAA resolves</li>
</ul>
<p>Jon Turney (3):</p>
<ul>
<li>configure: Default to gbm=no on osx</li>
<li>glx/apple: include util/debug.h for env_var_as_boolean prototype</li>
<li>glx/apple: locate dispatch table functions to wrap by name</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>svga: Prevent use after free.</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.3</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Bind null render targets for shadow sampling + color.</li>
<li>i965: Bump official kernel requirement to Linux v3.9.</li>
</ul>
<p>Lucas Stach (2):</p>
<ul>
<li>etnaviv: dirty TS state when framebuffer has changed</li>
<li>renderonly: fix dumb BO allocation for non 32bpp formats</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: don't ignore pitch for imported textures</li>
</ul>
<p>Matthew Nicholls (2):</p>
<ul>
<li>radv: restore previous stencil reference after depth-stencil clear</li>
<li>radv: remove predication on cache flushes</li>
</ul>
<p>Maxin B. John (1):</p>
<ul>
<li>anv_icd.py: improve reproducible builds</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>winsys/radeon: Compute is_displayable in surf_drm_to_winsys</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>r600: don't do stack workarounds for hemlock</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: create pipeline layout objects for all meta operations</li>
</ul>
<p>Samuel Thibault (1):</p>
<ul>
<li>glx: fix non-dri build</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>ac: fix buffer overflow bug in 64bit SSBO loads</li>
<li>ac: fix visit_ssa_undef() for doubles</li>
</ul>
</div>
</body>
</html>

66
docs/relnotes/17.3.5.html Normal file
View File

@@ -0,0 +1,66 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.5 Release Notes / February 19, 2018</h1>
<p>
Mesa 17.3.5 is a bug fix release which fixes bugs found since the 17.3.4 release.
</p>
<p>
Mesa 17.3.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
bc1ee20366aae2affc37c89228f871f438136f70252005e9f842169bde976788 mesa-17.3.5.tar.gz
eb9228fc8aaa71e0205c1481c5b157752ebaec9b646b030d27478e25a6d7936a mesa-17.3.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.4</li>
<li>Update version to 17.3.5</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>ac/nir: Fix conflict resolution typo in handle_vs_input_decl</li>
</ul>
</div>
</body>
</html>

85
docs/relnotes/17.3.6.html Normal file
View File

@@ -0,0 +1,85 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.6 Release Notes / February 27, 2018</h1>
<p>
Mesa 17.3.6 is a bug fix release which fixes bugs found since the 17.3.5 release.
</p>
<p>
Mesa 17.3.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
d5e10ea3f0d11b06d2b0b235bba372a04278c39bc0e712090bda1f61842db188 mesa-17.3.6.tar.gz
e5915680d44ac9d05defdec529db7459ac9edd441c9845266eff2e2d3e57fbf8 mesa-17.3.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.5</li>
<li>Update version to 17.3.6</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>i965/draw: Do resolves properly for textures used by TXF</li>
<li>i965: Replace draw_aux_buffer_disabled with draw_aux_usage</li>
<li>i965/draw: Set NEW_AUX_STATE when draw aux changes</li>
<li>i965: Stop disabling aux during texture preparation</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Don't disable CCS for RT dependencies when dispatching compute.</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>i965: Don't try to disable render aux buffers for compute</li>
</ul>
</div>
</body>
</html>

312
docs/relnotes/17.3.7.html Normal file
View File

@@ -0,0 +1,312 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.7 Release Notes / March 21, 2018</h1>
<p>
Mesa 17.3.7 is a bug fix release which fixes bugs found since the 17.3.7 release.
</p>
<p>
Mesa 17.3.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f08de6d0ccb3dbca04b44790d85c3ff9e7b1cc4189d1b7c7167e5ba7d98736c0 mesa-17.3.7.tar.gz
0595904a8fba65a8fe853a84ad3c940205503b94af41e8ceed245fada777ac1e mesa-17.3.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Fix CmdCopyImage between uncompressed and compressed images</li>
</ul>
<p>Andriy Khulap (1):</p>
<ul>
<li>i965: Fix RELOC_WRITE typo in brw_store_data_imm64()</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>isl: Don't use surface format R32_FLOAT for typed atomic integer operations</li>
</ul>
<p>Bas Nieuwenhuizen (6):</p>
<ul>
<li>radv: Always lower indirect derefs after nir_lower_global_vars_to_local.</li>
<li>radeonsi: Export signalled sync file instead of -1.</li>
<li>radv: Implement WaitForFences with !waitAll.</li>
<li>radv: Implement waiting on non-submitted fences.</li>
<li>radv: Fix copying from 3D images starting at non-zero depth.</li>
<li>radv: Increase the number of dynamic uniform buffers.</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>glx: Properly handle cases where screen creation fails</li>
</ul>
<p>Daniel Stone (3):</p>
<ul>
<li>i965: Fix bugs in intel_from_planar</li>
<li>egl/wayland: Fix ARGB/XRGB transposition in config map</li>
<li>egl/wayland: Always use in-tree wayland-egl-backend.h</li>
</ul>
<p>Dave Airlie (9):</p>
<ul>
<li>r600: fix cubemap arrays</li>
<li>r600/sb/cayman: fix indirect ubo access on cayman</li>
<li>r600: fix xfb stream check.</li>
<li>ac/nir: to integer the args to bcsel.</li>
<li>r600/cayman: fix fragcood loading recip generation.</li>
<li>radv: don't support tc-compat on multisample d32s8 at all.</li>
<li>virgl: remap query types to hw support.</li>
<li>ac/nir: don't apply slice rounding on txf_ms</li>
<li>r600: implement callstack workaround for evergreen.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>glapi/check_table: Remove 'extern "C"' block</li>
<li>glapi: remove APPLE extensions from test</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.6</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.</li>
<li>ac/nir: Fix compiler warning about uninitialized dw_addr.</li>
<li>glsl/tests: Fix strict aliasing warning about int64/double.</li>
<li>glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>i965: Fix KHR_blend_equation_advanced with some render targets.</li>
</ul>
<p>Frank Binns (1):</p>
<ul>
<li>egl/dri2: fix segfault when display initialisation fails</li>
</ul>
<p>George Kyriazis (1):</p>
<ul>
<li>swr/rast: blend_epi32() should return Integer, not Float</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>
</ul>
<p>Gurchetan Singh (1):</p>
<ul>
<li>mesa: don't clamp just based on ARB_viewport_array extension</li>
</ul>
<p>Iago Toral Quiroga (2):</p>
<ul>
<li>i965/sbe: fix number of inputs for active components</li>
<li>i965/vec4: use a temp register to compute offsets for pull loads</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>radv: Really use correct HTILE expanded words.</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>intel/isl: Add an isl_color_value_is_zero helper</li>
<li>vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails</li>
<li>intel/fs: Set up sampler message headers in the visitor on gen7+</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>configure.ac: pthread-stubs not present on OpenBSD</li>
</ul>
<p>Jordan Justen (3):</p>
<ul>
<li>i965: Create new program cache bo when clearing the program cache</li>
<li>program: Don't reset SamplersValidated when restoring from shader cache</li>
<li>intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview</li>
</ul>
<p>Juan A. Suarez Romero (14):</p>
<ul>
<li>cherry-ignore: Explicit 18.0 only nominations</li>
<li>cherry-ignore: r600/compute: only mark buffer/image state dirty for fragment shaders</li>
<li>cherry-ignore: anv: Move setting current_pipeline to cmd_state_init</li>
<li>cherry-ignore: anv: Be more careful about fast-clear colors</li>
<li>cherry-ignore: Add patches that has a specific version for 17.3</li>
<li>cherry-ignore: r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>
<li>cherry-ignore: intel/compiler: Memory fence commit must always be enabled for gen10+</li>
<li>cherry-ignore: i965: Avoid problems from referencing orphaned BOs after growing.</li>
<li>cherry-ignore: include all Meson related fixes</li>
<li>cherry-ignore: ac/shader: fix vertex input with components.</li>
<li>cherry-ignore: i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.</li>
<li>cherry-ignore: anv/image: Separate modifiers from legacy scanout</li>
<li>cherry-ignore: glsl: Fix memory leak with known glsl_type instances</li>
<li>Update version to 17.3.7</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nvir/nvc0: fix legalizing of ld unlock c0[0x10000]</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Emit CS stall before MEDIA_VFE_STATE.</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: perf: ensure reading config IDs from sysfs isn't interrupted</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: align command buffer starting address to fix some Raven hangs</li>
<li>configure.ac: blacklist libdrm 2.4.90</li>
</ul>
<p>Michal Navratil (1):</p>
<ul>
<li>winsys/amdgpu: allow non page-aligned size bo creation from pointer</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>glsl/linker: fix bug when checking precision qualifier</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>ac/nir: use ordered float comparisons except for not equal</li>
<li>Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"</li>
</ul>
<p>Stephan Gerhold (1):</p>
<ul>
<li>util/build-id: Fix address comparison for binaries with LOAD vaddr &gt; 0</li>
</ul>
<p>Thomas Hellstrom (2):</p>
<ul>
<li>svga: Fix a leftover debug hack</li>
<li>loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: fix MemoryBuffer build break for llvm-6</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>nir: fix interger divide by zero crash during constant folding</li>
</ul>
<p>Tobias Droste (1):</p>
<ul>
<li>gallivm: Use new LLVM fast-math-flags API</li>
</ul>
<p>Vadym Shovkoplias (1):</p>
<ul>
<li>mesa: add glsl version query (v4)</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>swr/rast: Fix macOS macro.</li>
</ul>
</div>
</body>
</html>

147
docs/relnotes/17.3.8.html Normal file
View File

@@ -0,0 +1,147 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.8 Release Notes / April 03, 2018</h1>
<p>
Mesa 17.3.8 is a bug fix release which fixes bugs found since the 17.3.7 release.
</p>
<p>
Mesa 17.3.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
175d2ca9be2af3a8db6cd603986096d75da70f59699528d7b6675d542a305e23 mesa-17.3.8.tar.gz
8f9d9bf281c48e4a8f5228816577263b4c655248dc7666e75034ab422951a6b1 mesa-17.3.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>
<li>st/nine: Fixes warning about implicit conversion</li>
<li>st/nine: Fix non inversible matrix check</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv/pipeline: fail if TCS/TES compile fail</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: get correct offset into LDS for indexed vars.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Make swrast display_sync the correct queue</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>meson/configure: detect endian.h instead of trying to guess when it's available</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>mesa: Don't write to user buffer in glGetTexParameterIuiv on error</li>
<li>i965/vec4: Fix null destination register in 3-source instructions</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Emit texture cache invalidates around blorp_copy</li>
</ul>
<p>Jordan Justen (2):</p>
<ul>
<li>i965: Calculate thread_count in brw_alloc_stage_scratch</li>
<li>i965: Hard code CS scratch_ids_per_subslice for Cherryview</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.7</li>
<li>cherry-ignore: ac/nir: pass the nir variable through tcs loading.</li>
<li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>
<li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>
<li>cherry-ignore: docs: fix 18.0 release note version</li>
<li>Update version to 17.3.8</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move feedback command inside of destroy function</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>nir: fix per_vertex_output intrinsic</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>
<li>nir: fix crash in loop unroll corner case</li>
</ul>
</div>
</body>
</html>

162
docs/relnotes/17.3.9.html Normal file
View File

@@ -0,0 +1,162 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.9 Release Notes / April 18, 2018</h1>
<p>
Mesa 17.3.9 is a bug fix release which fixes bugs found since the 17.3.8 release.
</p>
<p>
Mesa 17.3.9 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
4d625f65a1ff4cd8cfeb39e38f047507c6dea047502a0d53113c96f54588f340 mesa-17.3.9.tar.gz
c5beb5fc05f0e0c294fefe1a393ee118cb67e27a4dca417d77c297f7d4b6e479 mesa-17.3.9.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>dri_util: when overriding, always reset the core version</li>
<li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>
</ul>
<p>Axel Davy (2):</p>
<ul>
<li>st/nine: Declare lighting consts for ff shaders</li>
<li>st/nine: Do not use scratch for face register</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>ac/nir: Add workaround for GFX9 buffer views.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Initialise modifier to INVALID for DRI2</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>glsl: remove unreachable assert()</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>gbm: remove never-implemented function</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>mesa: Inherit texture view multi-sample information from the original texture images.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/vars_to_ssa: Remove copies from the correct set</li>
<li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>
<li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>
<li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.8</li>
<li>cherry-ignore: Explicit 18.0 only nominations</li>
<li>Update version to 17.3.9</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix number of planes for depth &amp; stencil</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix picking the method for resolve subpass</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965: Extend the negative 32-bit deltas to 64-bits</li>
</ul>
<p>Timothy Arceri (6):</p>
<ul>
<li>gallium/pipebuffer: fix parenthesis location</li>
<li>glsl: always call do_lower_jumps() after loop unrolling</li>
<li>ac: add if/loop build helpers</li>
<li>radeonsi: make use of if/loop build helpers in ac</li>
<li>ac: make use of if/loop build helpers</li>
<li>mesa: free debug messages when destroying the debug state</li>
</ul>
<p>Xiong, James (1):</p>
<ul>
<li>i965: return the fourcc saved in __DRIimage when possible</li>
</ul>
</div>
</body>
</html>

321
docs/relnotes/18.0.0.html Normal file
View File

@@ -0,0 +1,321 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.0 Release Notes / March 27 2018</h1>
<p>
Mesa 18.0.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 18.0.1.
</p>
<p>
Mesa 18.0.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
93c2d3504b2871ac2146603fb1270f341d36a39695e2950a469c5eac74f98457 mesa-18.0.0.tar.gz
694e5c3d37717d23258c1f88bc134223c5d1aac70518d2f9134d6df3ee791eea mesa-18.0.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false"</li>
<li>GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on r600/evergreen+</li>
<li>GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on r600/evergreen+</li>
<li>GL_ARB_shader_storage_buffer_object on r600/evergreen+<li>
<li>GL_ARB_compute_shader on r600/evergreen+<li>
<li>GL_ARB_cull_distance on r600/evergreen+</li>
<li>GL_ARB_enhanced_layouts on r600/evergreen+</li>
<li>GL_ARB_bindless_texture on nvc0/kepler</li>
<li>OpenGL 4.3 on r600/evergreen with hw fp64 support</li>
<li>Support 1 binary format for GL_ARB_get_program_binary on i965.
(For the 18.0 release, 0 formats continue to be supported in
compatibility profiles.)</li>
<li>Cannonlake support on i965 and anv</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85564">Bug 85564</a> - Dead Island rendering issues</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97852">Bug 97852</a> - Unreal Engine corrupted preview viewport</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101560">Bug 101560</a> - SPIR-V OpSwitch with int64 not supported even though shaderInt64 is true</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101691">Bug 101691</a> - gfx corruption on windowed 3d-apps running on dGPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102264">Bug 102264</a> - Missing MESA_FORMAT_{B8G8R8A8,B8G8R8X8}_SRGB formats</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102354">Bug 102354</a> - Mesa 17.2 no longer can give SRGB-capable framebuffer on i965, even though Mesa 17.1.x does.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102358">Bug 102358</a> - WarThunder freezes at start, with activated vsync (vblank_mode=2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU hang in Valve games based on Source 1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102503">Bug 102503</a> - Report SRGB framebuffer to SuperTuxKart to workaround SuperTuxKart crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102665">Bug 102665</a> - test_glsl_to_tgsi_lifetime.cpp:53:67: error: &gt;&gt; should be &gt; &gt; within a nested template argument list</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102897">Bug 102897</a> - Separate bind points are not implemented correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102955">Bug 102955</a> - HyperZ related rendering issue in ARK: Survival Evolved</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085">Bug 103085</a> - [ivb byt hsw] piglit.spec.arb_indirect_parameters.tf-count-arrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103101">Bug 103101</a> - [SKL][bisected] DiRT Rally GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103128">Bug 103128</a> - [softpipe] piglit fs-ldexp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103283">Bug 103283</a> - drm_get_device_name_for_fd is broken on FreeBSD</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103496">Bug 103496</a> - svga_screen.c:26:46: error: git_sha1.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103513">Bug 103513</a> - [build failure] radv_shader.c:683:2: error: format not a string literal and no format arguments [-Werror=format-security]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103537">Bug 103537</a> - i965: Shadow of Mordor broken since commit 379b24a40d3d34ffdaaeb1b328f50e28ecb01468 on Haswell</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103653">Bug 103653</a> - Unreal segfault since gallium/u_threaded: avoid syncs for get_query_result</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103658">Bug 103658</a> - addrlib/gfx9/gfx9addrlib.cpp:727:50: error: expected expression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103674">Bug 103674</a> - u_queue.c:173:7: error: implicit declaration of function 'timespec_get' is invalid in C99</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103759">Bug 103759</a> - plasma desktop corrupted rendering</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103784">Bug 103784</a> - [bisected] Egl changes breaks all of EGL</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103801">Bug 103801</a> - [i965] &gt;Observer_ issue</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103808">Bug 103808</a> - [radeonsi, bisected] World of Warcraft scribbling all over screen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103902">Bug 103902</a> - Portal 2 game hangs at startup with latest mesa dev</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103904">Bug 103904</a> - Source engine-based games won't hang at start without R600_DEBUG=vs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of memfd_create follows non-static declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103942">Bug 103942</a> - KHR-GL46.enhanced_layouts.varying* regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103955">Bug 103955</a> - Using array in structure results in wrong GLSL compilation output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104005">Bug 104005</a> - [sklgt4e] GPU hangs in Car_Chase</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104141">Bug 104141</a> - include/c11/threads_posix.h:96: undefined reference to `pthread_once'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104183">Bug 104183</a> - mesa-17.3.0/src/broadcom/qpu/qpu_pack.c:171]: (error) Invalid memcmp() argument</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104199">Bug 104199</a> - [i965 bisected] BIO and EM Vision in &gt;Observer_ is broken since commit af2c320190f3c73180f1610c8df955a7fa2a4d09</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104213">Bug 104213</a> - NULL pointer access crashes on compiling Vulkan compute shaders after &quot;anv: Add support for the variablePointers feature&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104214">Bug 104214</a> - Dota crashes when switching from game to desktop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104226">Bug 104226</a> - [bisected] Anvil accesses uninitialized memory while compiling shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104231">Bug 104231</a> - DispatchSanity_test.GL30 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104246">Bug 104246</a> - Talos Principle Vulkan version crash: spirv_to_nir() returns NULL entry_point</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104271">Bug 104271</a> - i965: Timeout in dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.5</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104288">Bug 104288</a> - Steamroll needs allow_glsl_cross_stage_interpolation_mismatch=true</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104331">Bug 104331</a> - [r600g] Ogre demo &quot;TutorialUAV01&quot; crash at r600_decompress_color_images</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104338">Bug 104338</a> - NULL pointer access crash on Sacha Willems' Vulkan raytracing demo after &quot;spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104359">Bug 104359</a> - Mesa freezes in &quot;vtn_cfg_walk_blocks&quot; with Sacha Willems' hdr, parallaxmapping and specializationconstants Vulkan demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104424">Bug 104424</a> - DOOM 2016 broken by spirv OpStore validation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104490">Bug 104490</a> - [radeonsi/290x] Dota2 fails to start (can't create opengl context)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104492">Bug 104492</a> - Compute Shader: Wrong alignment when assigning struct value to structured SSBO</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104551">Bug 104551</a> - Check if Mako templates for Python are installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104677">Bug 104677</a> - radv_generate_graphics_pipeline_key reads input rate from incorrect binding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104690">Bug 104690</a> - [G33] regression: piglit.spec.!opengl 1_4.draw-batch and gl-1_4-dlist-multidrawarrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104742">Bug 104742</a> - [swrast] piglit gl-1.4-dlist-multidrawarrays regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104746">Bug 104746</a> - [swrast] piglit attribs regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104749">Bug 104749</a> - rasterizer/jitter/JitManager.cpp:252:91: error: no matching function for call to llvm::DIBuilder::createBasicType(const char [8], int, llvm::dwarf::TypeKind)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL</li>
</ul>
</div>
</body>
</html>

225
docs/relnotes/18.0.1.html Normal file
View File

@@ -0,0 +1,225 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.1 Release Notes / April 18, 2018</h1>
<p>
Mesa 18.0.1 is a bug fix release which fixes bugs found since the 18.0.0 release.
</p>
<p>
Mesa 18.0.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
0c93ba892c0610f5dd87f2e2673b9445187995c395b3ddb33fd4260bfb291e89 mesa-18.0.1.tar.gz
b2d2f5b5dbaab13e15cb0dcb5ec81887467f55ebc9625945b303a3647cd87954 mesa-18.0.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>dri_util: when overriding, always reset the core version</li>
<li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>
</ul>
<p>Axel Davy (5):</p>
<ul>
<li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>
<li>st/nine: Fixes warning about implicit conversion</li>
<li>st/nine: Fix non inversible matrix check</li>
<li>st/nine: Declare lighting consts for ff shaders</li>
<li>st/nine: Do not use scratch for face register</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>ac/nir: Add workaround for GFX9 buffer views.</li>
<li>radv: Don't set instance count using predication.</li>
<li>radv: Always reset draw user SGPRs after secondary command buffer.</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv/pipeline: fail if TCS/TES compile fail</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Initialise modifier to INVALID for DRI2</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Make swrast display_sync the correct queue</li>
</ul>
<p>Dylan Baker (4):</p>
<ul>
<li>meson: don't use compiler.has_header</li>
<li>autotools: include meson_get_version</li>
<li>meson: Set .so version for xa like autotools does</li>
<li>meson: fix megadriver symlinking</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.0</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>meson/configure: detect endian.h instead of trying to guess when it's available</li>
<li>docs: fix 18.0 release note version</li>
<li>gbm: remove never-implemented function</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>mesa: Inherit texture view multi-sample information from the original texture images.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>i965/vec4: Fix null destination register in 3-source instructions</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/vars_to_ssa: Remove copies from the correct set</li>
<li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>
<li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>
<li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>
</ul>
<p>Juan A. Suarez Romero (5):</p>
<ul>
<li>cherry-ignore anv: Be more careful about fast-clear colors</li>
<li>cherry-ignore: ac/shader: fix vertex input with components.</li>
<li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>
<li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>
<li>Update version to 18.0.1</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move feedback command inside of destroy function</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965/perf: fix config registration when uploading to kernel</li>
</ul>
<p>Marc Dietrich (1):</p>
<ul>
<li>meson: fix HAVE_LLVM version define in meson build</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>
</ul>
<p>Mark Thompson (1):</p>
<ul>
<li>st/va: Enable vaExportSurfaceHandle()</li>
</ul>
<p>Rob Clark (3):</p>
<ul>
<li>nir: fix per_vertex_output intrinsic</li>
<li>freedreno/a5xx: fix page faults on last level</li>
<li>freedreno/a5xx: don't align height for PIPE_BUFFER</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: fix picking the method for resolve subpass</li>
<li>radv: fix radv_layout_dcc_compressed() when image doesn't have DCC</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965: Extend the negative 32-bit deltas to 64-bits</li>
</ul>
<p>Timothy Arceri (7):</p>
<ul>
<li>ac: add if/loop build helpers</li>
<li>radeonsi: make use of if/loop build helpers in ac</li>
<li>ac: make use of if/loop build helpers</li>
<li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>
<li>nir: fix crash in loop unroll corner case</li>
<li>gallium/pipebuffer: fix parenthesis location</li>
<li>glsl: always call do_lower_jumps() after loop unrolling</li>
</ul>
<p>Xiong, James (1):</p>
<ul>
<li>i965: return the fourcc saved in __DRIimage when possible</li>
</ul>
</div>
</body>
</html>

268
docs/relnotes/18.1.0.html Normal file
View File

@@ -0,0 +1,268 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.0 Release Notes / May 18 2018</h1>
<p>
Mesa 18.1.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 18.1.1.
</p>
<p>
Mesa 18.1.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
b1c1dbb42597190503d3abc518b12de880623f097c6cb6c293ecf69ae87e6fbf mesa-18.1.0.tar.gz
c855c5b67ef993b7621f76d8b120769ec0415f1c3616eaff44ef7f7f300aceba mesa-18.1.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>OpenGL 3.1 with ARB_compatibility on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe, svga</li>
<li>GL_ARB_bindless_texture on nvc0/maxwell+</li>
<li>GL_ARB_transform_feedback_overflow_query on nvc0</li>
<li>GL_EXT_semaphore on radeonsi</li>
<li>GL_EXT_semaphore_fd on radeonsi</li>
<li>GL_EXT_shader_framebuffer_fetch on i965 on desktop GL (GLES was already supported)</li>
<li>GL_EXT_shader_framebuffer_fetch_non_coherent on i965</li>
<li>GL_KHR_blend_equation_advanced on radeonsi</li>
<li>Disk shader cache support for i965 enabled by default</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99549">Bug 99549</a> - pp: Failed to translate a shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100259">Bug 100259</a> - [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102905">Bug 102905</a> - [R600] Miscompilation of TGSI to VLIW causes artifacts in Gallium Nine with Crysis2 bump mapping</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104335">Bug 104335</a> - [OpenGL CTS][SKL,KBL] KHR-GL45.vertex_attrib_64bit.limits_test occasionally fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104717">Bug 104717</a> - Rocket League: grass rendering broken with nir</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104732">Bug 104732</a> - [radv] Binding descriptor sets disturbs other pipeline bindings</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104794">Bug 104794</a> - piglit.spec.arb_internalformat_query2.samples and num_sample_counts pname checks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104803">Bug 104803</a> - SIGSEGV in state_tracker/st_glsl_to_tgsi_temprename.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104863">Bug 104863</a> - 186 assertions in piglit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104908">Bug 104908</a> - Texture Compression Hint not converted to enum16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104989">Bug 104989</a> - [r600] [bisected] OpenGL applications can't render anything at all</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105026">Bug 105026</a> - glxgears asserts with pp_jimenezmlaa=1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105052">Bug 105052</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105067">Bug 105067</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105088">Bug 105088</a> - brw_nir_uniforms.cpp:256:10: error: non-constant-expression cannot be narrowed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105161">Bug 105161</a> - KHR_blend_equation_advanced doesn't work in GLSL 1.10-1.40 shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105183">Bug 105183</a> - Weird assertion in NIR linker</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105211">Bug 105211</a> - build failure after zwp_dmabuf commit if wayland-protocols is not installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105229">Bug 105229</a> - [KBL SKL BDW HSW] [Regression] KHR-GLES31.core.shader_image_load_store.advanced-sso-simple failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105238">Bug 105238</a> - ast.h:648:16: error: union member 'i' has a non-trivial constructor</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105262">Bug 105262</a> - [R600] [BISECTED] ttf fonts are invisible in many programs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105274">Bug 105274</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105444">Bug 105444</a> - Enable GL disk shader cache when transform feedback is enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105471">Bug 105471</a> - [g33] [bisected] dEQP-GLES2.functional.shaders failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105497">Bug 105497</a> - shader-db crashes on 72 core system after ast_type_qualifier bitset change</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105529">Bug 105529</a> - u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105621">Bug 105621</a> - Build failure on GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105634">Bug 105634</a> - Android build test fails when building brw_oa_metrics.c</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105737">Bug 105737</a> - st_tests_common.cpp:140:42: error: no matching function for call to 'tgsi_get_opcode_info'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105738">Bug 105738</a> - commit f7ffa504a065dc2631fd38cc5fe885b277f4e7e7 causes artifacting in radv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105740">Bug 105740</a> - glsl_types.cpp(524): error: a dynamically-initialized local static variable is not allowed inside of a statement expression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105807">Bug 105807</a> - [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105817">Bug 105817</a> - scons build broken by glSpecializeShaderARB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105820">Bug 105820</a> - [m32] piglit regressions relinking program without shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105952">Bug 105952</a> - radv causes GPU hang on SI</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105960">Bug 105960</a> - [bisected] meson build test fails with: undefined reference to `etna_pm_create_query'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106131">Bug 106131</a> - meson/ninja build missing file gtest.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106133">Bug 106133</a> - make check &quot;OSError: [Errno 24] Too many open files&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106174">Bug 106174</a> - vulkan dota2 broken (segfaulting), found bug commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106180">Bug 106180</a> - [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106450">Bug 106450</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106462">Bug 106462</a> - piglit.spec.arb_vertex_array_bgra.get regression</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Remove incomplete GLX_SGIX_swap_barrier stubs from the Xlib libGL</li>
<li>Remove incomplete GLX_SGIX_swap_group stubs from the Xlib libGL</li>
</ul>
</div>
</body>
</html>

168
docs/relnotes/18.1.1.html Normal file
View File

@@ -0,0 +1,168 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.1 Release Notes / June 1 2018</h1>
<p>
Mesa 18.1.1 is a bug fix release which fixes bugs found since the 18.1.0 release.
</p>
<p>
Mesa 18.1.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
366a35f7530a016f2a8284fb0ee5759eeb216b4d6fa47f0e96b89ad2e43faf96 mesa-18.1.1.tar.gz
d3312a2ede5aac14a47476b208b8e3a401367838330197c4588ab8ad420d7781 mesa-18.1.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>None<p>
<h2>Changes</h2>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965/glk: Add l3 banks count for 2x6 configuration</li>
</ul>
<p>Bas Nieuwenhuizen (7):</p>
<ul>
<li>radv: Fix multiview queries.</li>
<li>radv: Translate logic ops.</li>
<li>radv: Fix up 2_10_10_10 alpha sign.</li>
<li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>
<li>amd/addrlib: Use defines in autotools build.</li>
<li>radv: Fix SRGB compute copies.</li>
<li>radv: Only expose subgroup shuffles on VI+.</li>
</ul>
<p>Christoph Haag (1):</p>
<ul>
<li>radv: fix VK_EXT_descriptor_indexing</li>
</ul>
<p>Dave Airlie (5):</p>
<ul>
<li>radv/resolve: do fmask decompress on all layers.</li>
<li>radv: resolve all layers in compute resolve path.</li>
<li>radv: use compute path for multi-layer images.</li>
<li>virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.</li>
<li>tgsi/scan: add hw atomic to the list of memory accessing files</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>docs: Add sha sums for release</li>
<li>VERSION: bump to 18.1.1 for next release</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>vulkan: don't free uninitialised memory</li>
</ul>
<p>Francisco Jerez (4):</p>
<ul>
<li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>
<li>i965: Move buffer texture size calculation into a common helper function.</li>
<li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>
<li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nv30: ensure that displayable formats are marked accordingly</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>eg/compute: Use reference counting to handle compute memory pool.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>
<li>intel/blorp: Support blits and clears on surfaces with offsets</li>
</ul>
<p>Jose Dapena Paz (1):</p>
<ul>
<li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>
</ul>
<p>Kai Wasserbäch (1):</p>
<ul>
<li>opencl: autotools: Fix linking order for OpenCL target</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>
<li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>
<li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>dri3: Stricter SBC wraparound handling</li>
</ul>
<p>Nanley Chery (4):</p>
<ul>
<li>i965: Add and use a getter for the miptree aux buffer</li>
<li>i965: Add and use a single miptree aux_buf field</li>
<li>i965/miptree: Fix handling of uninitialized MCS buffers</li>
<li>i965/miptree: Zero-initialize CCS_D buffers</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>spirv: fix visiting inner loops with same break/continue block</li>
<li>radv: fix centroid interpolation</li>
</ul>
<p>Stuart Young (1):</p>
<ul>
<li>etnaviv: Fix missing rnndb file in tarballs</li>
</ul>
<p>Thierry Reding (3):</p>
<ul>
<li>tegra: Treat resources with modifiers as scanout</li>
<li>tegra: Fix scanout resources without modifiers</li>
<li>tegra: Remove usage of non-stable UAPI</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: add glUniform*ui{v} support to display lists</li>
</ul>
</div>
</body>
</html>

170
docs/relnotes/18.1.2.html Normal file
View File

@@ -0,0 +1,170 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.2 Release Notes / June 15 2018</h1>
<p>
Mesa 18.1.2 is a bug fix release which fixes bugs found since the 18.1.1 release.
</p>
<p>
Mesa 18.1.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
a644df23937f4078a2bd9a54349f6315c1955f5e3a4ac272832da51dea4d3c11 mesa-18.1.1.tar.gz
070bf0648ba5b242d7303ceed32aed80842f4c0ba16e5acc1a650a46eadfb1f9 mesa-18.1.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>None<p>
<h2>Changes</h2>
<p>Alex Smith (4):</p>
<ul>
<li>radv: Consolidate GFX9 merged shader lookup logic</li>
<li>radv: Handle GFX9 merged shaders in radv_flush_constants()</li>
<li>radeonsi: Fix crash on shaders using MSAA image load/store</li>
<li>radv: Set active_stages the same whether or not shaders were cached</li>
</ul>
<p>Andrew Galante (2):</p>
<ul>
<li>meson: Test for __atomic_add_fetch in atomic checks</li>
<li>configure.ac: Test for __atomic_add_fetch in atomic checks</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.</li>
</ul>
<p>Cameron Kumar (1):</p>
<ul>
<li>vulkan/wsi: Destroy swapchain images after terminating FIFO queues</li>
</ul>
<p>Dylan Baker (6):</p>
<ul>
<li>docs/relnotes: Add sha256 sums for mesa 18.1.1</li>
<li>cherry-ignore: add commits not to pull</li>
<li>cherry-ignore: Add patches from Jason that he rebased on 18.1</li>
<li>meson: work around gentoo applying -m32 to host compiler in cross builds</li>
<li>cherry-ignore: Add another patch</li>
<li>version: bump version for 18.1.2 release</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>autotools: add missing android file to package</li>
<li>configure: radv depends on mako</li>
<li>i965: fix resource leak</li>
</ul>
<p>Jason Ekstrand (10):</p>
<ul>
<li>intel/eu: Add some brw_get_default_ helpers</li>
<li>intel/eu: Copy fields manually in brw_next_insn</li>
<li>intel/eu: Set flag [sub]register number differently for 3src</li>
<li>intel/blorp: Don't vertex fetch directly from clear values</li>
<li>intel/isl: Add bounds-checking assertions in isl_format_get_layout</li>
<li>intel/isl: Add bounds-checking assertions for the format_info table</li>
<li>i965/screen: Refactor query_dma_buf_formats</li>
<li>i965/screen: Use RGBA non-sRGB formats for images</li>
<li>anv: Set fence/semaphore types to NONE in impl_cleanup</li>
<li>i965/screen: Return false for unsupported formats in query_modifiers</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>mesa/program_binary: add implicit UseProgram after successful ProgramBinary</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>glsl: Add ir_binop_vector_extract in NIR</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Fix batch-last mode to properly swap BOs.</li>
<li>anv: Disable __gen_validate_value if NDEBUG is set.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>r300g/swtcl: make pipe_context uploaders use malloc'd memory as before</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>meson: Fix -latomic check</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>glx: Fix number of property values to read in glXImportContextEXT</li>
</ul>
<p>Nicolas Boichat (1):</p>
<ul>
<li>configure.ac/meson.build: Fix -latomic test</li>
</ul>
<p>Philip Rebohle (1):</p>
<ul>
<li>radv: Use correct color format for fast clears</li>
</ul>
<p>Samuel Pitoiset (3):</p>
<ul>
<li>radv: fix a GPU hang when MRTs are sparse</li>
<li>radv: fix missing ZRANGE_PRECISION(1) for GFX9+</li>
<li>radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold</li>
</ul>
<p>Scott D Phillips (1):</p>
<ul>
<li>intel/tools: add intel_sanitize_gpu to EXTRA_DIST</li>
</ul>
<p>Thomas Petazzoni (1):</p>
<ul>
<li>configure.ac: rework -latomic check</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>ac: fix possible truncation of intrinsic name</li>
<li>radeonsi: fix possible truncation on renderer string</li>
</ul>
</div>
</body>
</html>

167
docs/relnotes/18.1.3.html Normal file
View File

@@ -0,0 +1,167 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.3 Release Notes / June 29 2018</h1>
<p>
Mesa 18.1.3 is a bug fix release which fixes bugs found since the 18.1.2 release.
</p>
<p>
Mesa 18.1.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
2a1e36280d01ad18ba6d5b3fbd653ceaa109eaa031b78eb5dfaa4df452742b66 mesa-18.1.3.tar.gz
54f08deeda0cd2f818e8d40140040ed013de7852573002453b7f50da9ea738ce mesa-18.1.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105396">Bug 105396</a> - tc compatible htile sets depth of htiles of discarded fragments to 1.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105399">Bug 105399</a> - [snb] GPU hang: after geometry shader emits no geometry, the program hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106756">Bug 106756</a> - Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106774">Bug 106774</a> - GLSL IR copy propagates loads of SSBOs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106903">Bug 106903</a> - radv: Fragment shader output goes to wrong attachments when render targets are sparse</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106907">Bug 106907</a> - Correct Transform Feedback Varyings information is expected after using ProgramBinary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106912">Bug 106912</a> - radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106980">Bug 106980</a> - Basemark GPU vulkan benchmark fails.</li>
</ul>
<h2>Changes</h2>
<p>Andrii Simiklit (1):</p>
<ul>
<li>i965/gen6/gs: Handle case where a GS doesn't allocate VUE</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>radv: Fix output for sparse MRTs.</li>
<li>ac/surface: Set compressZ for stencil-only surfaces.</li>
</ul>
<p>Christian Gmeiner (1):</p>
<ul>
<li>util/bitset: include util/macro.h</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>glsl: allow standalone semicolons outside main()</li>
</ul>
<p>Dylan Baker (8):</p>
<ul>
<li>docs: Add release notes for 18.1.2</li>
<li>cherry-ignore: Add 587e712eda95c31d88ea9d20e59ad0ae59afef4f</li>
<li>meson: Fix auto option for va</li>
<li>meson: Fix auto option for xvmc</li>
<li>meson: Correct behavior of vdpau=auto</li>
<li>cherry-ignore: Ignore cac7ab1192eefdd8d8b3f25053fb006b5c330eb8</li>
<li>cherry-ignore: add a2f5292c82ad07731d633b36a663e46adc181db9</li>
<li>VERSION: bump version to 18.1.3</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>configure: use compliant grep regex checks</li>
<li>glsl/tests/glcpp: reinstate "error out if no tests found"</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>radv: fix reported number of available VGPRs</li>
<li>radv: fix bitwise check</li>
<li>meson: fix i965/anv/isl genX static lib names</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>glsl: Don't copy propagate from SSBO or shared variables either</li>
<li>glsl: Don't copy propagate elements from SSBO or shared variables either</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir: Handle call instructions in foreach_src</li>
<li>nir/validate: Use the type from the tail of call parameter derefs</li>
</ul>
<p>Lukas Rusak (2):</p>
<ul>
<li>meson: only build vl_winsys_dri.c when x11 platform is used</li>
<li>meson: fix private libs when building without glx</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers</li>
<li>ac/gpu_info: report real total memory sizes</li>
<li>ac/gpu_info: add kernel_flushes_hdp_before_ib</li>
<li>radeonsi: always put persistent buffers into GTT on radeon</li>
<li>mesa: fix glGetInteger64v for arrays of integers</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno/ir3: fix base_vertex</li>
</ul>
<p>Samuel Pitoiset (6):</p>
<ul>
<li>radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8</li>
<li>radv: update the ZRANGE_PRECISION value for the TC-compat bug</li>
<li>radv: fix emitting the TCS regs on GFX9</li>
<li>radv: fix HTILE metadata initialization in presence of subpass clears</li>
<li>radv: ignore pInheritanceInfo for primary command buffers</li>
<li>radv: use separate bind points for the dynamic buffers</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>glsl: serialize data from glTransformFeedbackVaryings</li>
</ul>
<p>Tomeu Vizoso (1):</p>
<ul>
<li>virgl: Remove debugging left-overs</li>
</ul>
</div>
</body>
</html>

150
docs/relnotes/18.1.4.html Normal file
View File

@@ -0,0 +1,150 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.4 Release Notes / July 13 2018</h1>
<p>
Mesa 18.1.4 is a bug fix release which fixes bugs found since the 18.1.3 release.
</p>
<p>
Mesa 18.1.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
SHA256: 8acd42e4ac4d1e96ed22344073b3d4fef03d10f225f4eaf3f88c001dfc10e2db mesa-18.1.4.tar.gz
SHA256: 3061488b5d85504092cf4343816cfb2d96f2ad9bc2edec31fc96933d184cf58b mesa-18.1.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106906">Bug 106906</a> - Failed to recongnize keyword “sampler2DRect” and &quot;sampler2DRectShadow&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106928">Bug 106928</a> - When starting a match Rocket League crashes on &quot;Go&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107193">Bug 107193</a> - piglit.spec.arb_compute_shader.linker.bug-93840 fails</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (1):</p>
<ul>
<li>glx: Don't allow glXMakeContextCurrent() with only one valid drawable</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600/sb: cleanup if_conversion iterator to be legal C++</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>docs: Add SHA256 sums to notes for 18.1.3</li>
<li>Bump version for release</li>
</ul>
<p>Iago Toral Quiroga (3):</p>
<ul>
<li>anv/cmd_buffer: make descriptors dirty when emitting base state address</li>
<li>anv/cmd_buffer: clean dirty push constants flag after emitting push constants</li>
<li>anv/cmd_buffer: never shrink the push constant buffer size</li>
</ul>
<p>Ian Romanick (4):</p>
<ul>
<li>i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible</li>
<li>intel/compiler: Relax mixed type restriction for saturating immediates</li>
<li>i965/vec4: Properly handle sign(-abs(x))</li>
<li>i965/fs: Properly handle sign(-abs(x))</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>intel/fs: Split instructions low to high in lower_simd_width</li>
<li>anv: Be more careful about hashing pipeline layouts</li>
<li>intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN</li>
</ul>
<p>Jose Maria Casanova Crespo (3):</p>
<ul>
<li>i965/fs: Register allocator shoudn't use grf127 for sends dest</li>
<li>intel/compiler: grf127 can not be dest when src and dest overlap in send</li>
<li>i965/fs: unspills shoudn't use grf127 as dest since Gen8+</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: fix clear color bo address relocation</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2</li>
<li>glsl/cache: save and restore ExternalSamplersUsed</li>
<li>st/dri: fix a crash in server_wait_sync</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>i965: Fix output register sizes when variable ranges are interleaved</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>nvc0/ir: fix TargetNVC0::insnCanLoadOffset()</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>r600/sb: fix crash in fold_alu_op3</li>
</ul>
<p>Ross Burton (1):</p>
<ul>
<li>egl: fix build race in automake</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix emitting the view index on GFX9</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>glsl: skip comparison opt when adding vars of different size</li>
<li>nir: fix selection of loop terminator when two or more have the same limit</li>
</ul>
<p>zhaowei yuan (1):</p>
<ul>
<li>glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2</li>
</ul>
</div>
</body>
</html>

183
docs/relnotes/18.1.5.html Normal file
View File

@@ -0,0 +1,183 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.4 Release Notes / July 13 2018</h1>
<p>
Mesa 18.1.5 is a bug fix release which fixes bugs found since the 18.1.4 release.
</p>
<p>
Mesa 18.1.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
SHA256: f966d5d5d373a5b8a16ed5036c1e7f05d4ad46d130f793bf9782c3ac9133a02e mesa-18.1.5.tar.gz
SHA256: 69dbe6f1a6660386f5beb85d4fcf003ee23023ed7b9a603de84e9a37e8d98dea mesa-18.1.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103274">Bug 103274</a> - BRW allocates too much heap memory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107275">Bug 107275</a> - NIR segfaults after spirv-opt</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107295">Bug 107295</a> - Access violation on glDrawArrays with count &gt;= 2048</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107312">Bug 107312</a> - Mesa-git RPM build fails after commit 8cacf38f527d42e41441ef8c25d95d4b2f4e8602</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107366">Bug 107366</a> - NIR verification crashes on piglit tests</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BIT</li>
</ul>
<p>Bas Nieuwenhuizen (7):</p>
<ul>
<li>radv: Select correct entries for binning.</li>
<li>radv: Fix number of samples used for binning.</li>
<li>radv: Disable disabled color buffers in rbplus opts.</li>
<li>nir: Do not use continue block after removing it.</li>
<li>util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.</li>
<li>nir: Fix end of function without return warning/error.</li>
<li>radv: Still enable inmemory &amp; API level caching if disk cache is not enabled.</li>
</ul>
<p>Chad Versace (2):</p>
<ul>
<li>anv/android: Fix type error in call to vk_errorf()</li>
<li>anv/android: Fix Autotools build for VK_ANDROID_native_buffer</li>
</ul>
<p>Chih-Wei Huang (1):</p>
<ul>
<li>Android: fix a missing nir_intrinsics.h error</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>i965: Sweep NIR after linking phase to free held memory</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600: enable tess_input_info for TES</li>
</ul>
<p>Dylan Baker (5):</p>
<ul>
<li>docs: Add sha256 sums for 18.1.4 tarballs</li>
<li>cherry-ignore: add 4a67ce886a7b3def5f66c1aedf9e5436d157a03c</li>
<li>cherry-ignore: Add 1f616a840eac02241c585d28e9dac8f19a297f39</li>
<li>cherry-ignore: add 11712b9ca17e4e1a819dcb7d020e19c6da77bc90</li>
<li>bump version to 18.1.5</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.</li>
<li>meson: Move xvmc test tools from unit tests to installed tools.</li>
</ul>
<p>Harish Krupo (1):</p>
<ul>
<li>egl: Fix missing clamping in eglSetDamageRegionKHR</li>
</ul>
<p>Jan Vesely (3):</p>
<ul>
<li>radeonsi: Refuse to accept code with unhandled relocations</li>
<li>clover: Report error when pipe driver fails to create compute state</li>
<li>clover: Catch errors from executing event action</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV</li>
<li>nir/serialize: Alloc constants off the variable</li>
<li>blorp: Handle the RGB workaround more like other workarounds</li>
<li>intel/blorp: Handle 3-component formats in clears</li>
<li>intel/compiler: Account for built-in uniforms in analyze_ubo_ranges</li>
<li>spirv: Fix a couple of image atomic load/store bugs</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>gallium/tests: Don't ignore S3TC errors.</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nir: fix printing of vec16 type</li>
</ul>
<p>Lepton Wu (1):</p>
<ul>
<li>virgl: Fix flush in virgl_encoder_inline_write.</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>st/mesa: call resource_changed when binding a EGLImage to a texture</li>
</ul>
<p>Mauro Rossi (2):</p>
<ul>
<li>radv: winsys/amdgpu: include missing pthread.h header</li>
<li>android: util/disk_cache: fix building errors in gallium drivers</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>gallium: Check pipe_screen::resource_changed before dereferencing it</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>draw: force draw pipeline if there's more than 65535 vertices</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>anv: fix assert in anv_CmdBindDescriptorSets()</li>
</ul>
<p>Samuel Pitoiset (3):</p>
<ul>
<li>radv: make sure to wait for CP DMA when needed</li>
<li>radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9</li>
<li>radv: fix a memleak for merged shaders on GFX9</li>
</ul>
</div>
</body>
</html>

188
docs/relnotes/18.1.6.html Normal file
View File

@@ -0,0 +1,188 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.6 Release Notes / August 13 2018</h1>
<p>
Mesa 18.1.6 is a bug fix release which fixes bugs found since the 18.1.5 release.
</p>
<p>
Mesa 18.1.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
580e03328ffefe1fd43b19ab7669f20d931601a1c0a4c0f8b9c65d6e81a06df3 mesa-18.1.6.tar.gz
bb7ce759069801804fcfb8152da3457f76cd7b4e0096e4870ff5adcb5c894289 mesa-18.1.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728">Bug 13728</a> - [G965] Some objects in Neverwinter Nights Linux version not displayed correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98699">Bug 98699</a> - &quot;float[a+++4 ? 1:1] f;&quot; crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99730">Bug 99730</a> - Metro Redux game(s) needs override for midshader extension declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106382">Bug 106382</a> - Shader cache breaks INTEL_DEBUG=shader_time</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107117">Bug 107117</a> - mesa-18.1: regression with TFP on intel with modesettings and glamor acceleration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212">Bug 107212</a> - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (1):</p>
<ul>
<li>glx: GLX_MESA_multithread_makecurrent is direct-only</li>
</ul>
<p>Andres Gomez (3):</p>
<ul>
<li>ddebug: use util_snprintf() in dd_get_debug_filename_and_mkdir</li>
<li>gallium/aux/util: use util_snprintf() in test_texture_barrier</li>
<li>glsl: use util_snprintf()</li>
</ul>
<p>Christian Gmeiner (1):</p>
<ul>
<li>etnaviv: fix typo in query names</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600: reduce num compute threads to 1024.</li>
</ul>
<p>Dylan Baker (6):</p>
<ul>
<li>docs: Add sha-256 sums for 18.1.5</li>
<li>nir/meson: fix c vs cpp args for nir test</li>
<li>gallium: fix ddebug on windows</li>
<li>cherry-ignore: add patches that get-pick-list is finding in error</li>
<li>cherry-ignore: Add some additional patches that are for 18.2</li>
<li>bump version to 18.1.6</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>swr: don't export swr_create_screen_internal</li>
<li>automake: require shared glapi when using DRI based libGL</li>
<li>autotools: error out when using the broken --with-{gl, osmesa}-lib-name</li>
<li>autotools: error out when building with mangling and glvnd</li>
<li>autotools: use correct gl.pc LIBS when using glvnd</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>vc4: Fix a leak of the no-vertex-elements workaround BO.</li>
<li>vc4: Respect a sampler view's first_layer field.</li>
<li>vc4: Ignore samplers for finding uniform offsets.</li>
<li>egl: Fix leak of X11 pixmaps backing pbuffers in DRI3.</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>meson, install_megadrivers: Also remove stale symlinks</li>
</ul>
<p>Jan Vesely (2):</p>
<ul>
<li>clover: Reduce wait_count in abort path.</li>
<li>clover: Don't extend illegal integer types.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir: Take if uses into account in ssa_def_components_read</li>
<li>i965/fs: Flag all slots of a flat input as flat</li>
</ul>
<p>Jon Turney (1):</p>
<ul>
<li>meson: use correct keyword to fix a meson warning</li>
</ul>
<p>Jordan Justen (2):</p>
<ul>
<li>i965, anv: Use INTEL_DEBUG for disk_cache driver flags</li>
<li>i965: Disable shader cache with INTEL_DEBUG=shader_time</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>wayland/egl: update surface size on window resize</li>
<li>wayland/egl: initialize window surface size to window size</li>
</ul>
<p>Karol Herbst (2):</p>
<ul>
<li>nir/lower_int64: mark all metadata as dirty</li>
<li>nvc0/ir: return 0 in imageLoad on incomplete textures</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>ac/surface: fix MSAA corruption on Vega due to FMASK tile swizzle</li>
</ul>
<p>Mauro Rossi (2):</p>
<ul>
<li>radv: generate entrypoints for VK_ANDROID_native_buffer</li>
<li>radv: move vk_format_table.c to generated sources</li>
</ul>
<p>Olivier Fourdan (1):</p>
<ul>
<li>dri3: For 1.2, use root window instead of pixmap drawable</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>glsl: handle error case with ast_post_inc, ast_post_dec</li>
</ul>
<p>Vlad Golovkin (1):</p>
<ul>
<li>swr: Remove unnecessary memset call</li>
</ul>
<p>vadym.shovkoplias (1):</p>
<ul>
<li>drirc: Allow extension midshader for Metro Redux</li>
</ul>
</div>
</body>
</html>

104
docs/relnotes/18.1.7.html Normal file
View File

@@ -0,0 +1,104 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.7 Release Notes / August 24 2018</h1>
<p>
Mesa 18.1.7 is a bug fix release which fixes bugs found since the 18.1.6 release.
</p>
<p>
Mesa 18.1.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
0c3c240bcd1352d179e65993214f9d55a399beac852c3ab4433e8df9b6c51c83 mesa-18.1.7.tar.gz
655e3b32ce3bdddd5e6e8768596e5d4bdef82d0dd37067c324cc4b2daa207306 mesa-18.1.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105975">Bug 105975</a> - i965 always reports 0 viewport subpixel bits</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107098">Bug 107098</a> - Segfault after munmap(kms_sw_dt-&gt;ro_mapped)</li>
</ul>
<h2>Changes</h2>
<p>Alexander Tsoy (1):</p>
<ul>
<li>meson: fix build for egl platform_x11 without dri3 and gbm</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Fix missing Android platform define.</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>i965: Advertise 8 bits subpixel precision for viewport bounds on gen6+</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600/eg: rework atomic counter emission with flushes</li>
</ul>
<p>Dylan Baker (7):</p>
<ul>
<li>docs: Add sha256 sums for 18.1.6</li>
<li>cherry-ignore: Add additional 18.2 only patches</li>
<li>cherry-ignore: Add more 18.2 patches</li>
<li>cherry-ignore: Add more 18.2 patches</li>
<li>cherry-ignore: Add a couple of patches with &gt; 1 fixes tags</li>
<li>cherry-ignore: more 18.2 patches</li>
<li>bump version for 18.1.7 release</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>intel: Switch the order of the 2x MSAA sample positions</li>
<li>anv/lower_ycbcr: Use the binding array size for bounds checks</li>
</ul>
<p>Ray Strode (1):</p>
<ul>
<li>gallium/winsys/kms: don't unmap what wasn't mapped</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv/winsys: fix creating the BO list for virtual buffers</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>radv: add Doom workaround</li>
</ul>
</div>
</body>
</html>

180
docs/relnotes/18.1.8.html Normal file
View File

@@ -0,0 +1,180 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.8 Release Notes / September 7 2018</h1>
<p>
Mesa 18.1.8 is a bug fix release which fixes bugs found since the 18.1.7 release.
</p>
<p>
Mesa 18.1.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
8ec62f215dd1bb3910987f9941c6fc31632a0874e618815cf1e8e29445c86e0a mesa-18.1.8.tar.gz
bd1be67fe9c73b517765264ac28911c84144682d28dbff140e1c2deb2f44c21b mesa-18.1.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93355">Bug 93355</a> - [BXT,SKLGT4e] intermittent ext_framebuffer_multisample.accuracy fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101247">Bug 101247</a> - Mesa fails to link GLSL programs with unused output blocks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104809">Bug 104809</a> - anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to not having depthBoundsTest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105904">Bug 105904</a> - Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106738">Bug 106738</a> - No test for miptrees with DRI modifiers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106865">Bug 106865</a> - [GLK] piglit.spec.ext_framebuffer_multisample.accuracy stencil tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107359">Bug 107359</a> - [Regression] [bisected] [OpenGL CTS] [SKL,BDW] KHR-GL46.texture_barrier*-texels, GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners, and GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners fail with some configuration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107477">Bug 107477</a> - [DXVK] Setting high shader quality in GTA V results in LLVM error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107579">Bug 107579</a> - [SNB] The graphic corruption when we reuse the GS compiled and used for TFB when statebuffer contain magic trash in the unused space</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107601">Bug 107601</a> - Rise of the Tomb Raider Segmentation Fault when the game starts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107760">Bug 107760</a> - GPU Hang when Playing DiRT 3 Complete Edition using Steam Play with DXVK</li>
</ul>
<h2>Changes</h2>
<p>Andrii Simiklit (1):</p>
<ul>
<li>i965/gen6/xfb: handle case where transform feedback is not active</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Add missing checks in radv_get_image_format_properties.</li>
<li>radv: Fix CMASK dimensions.</li>
<li>radv: Use a lower max offchip buffer count.</li>
</ul>
<p>Christian Gmeiner (1):</p>
<ul>
<li>tegra: fix memory leak</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Don't expose sRGB formats to clients</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>ac/radeonsi: fix CIK copy max size</li>
</ul>
<p>Dylan Baker (10):</p>
<ul>
<li>docs: Add mesa 18.1.7 notes</li>
<li>cherry-ignore: add a patch</li>
<li>cherry-ignore: Add more 18.2 only patches</li>
<li>meson: Actually load translation files</li>
<li>cherry-ignore: Add more 18.2 patches</li>
<li>cherry-ignore: Add additional patch</li>
<li>cherry-ignore: Add patch that doesn't apply to 18.1</li>
<li>cherry-ignore: Add a couple of two fixes warning patches</li>
<li>cherry-ignore: Add patch that needs more significant patches to function</li>
<li>Bump version to 18.1.8</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: update required mako version</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>radv: place pointer length into cache uuid</li>
</ul>
<p>Gurchetan Singh (2):</p>
<ul>
<li>meson: fix egl build for surfaceless</li>
<li>meson: fix egl build for android</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>i965/vec4: Clamp indirect tes input array reads with 0x0fffffff</li>
<li>i965/vec4: Correctly handle uniform sources in generate_tes_add_indirect_urb_offset</li>
</ul>
<p>Jason Ekstrand (5):</p>
<ul>
<li>anv: Fill holes in the VF VUE to zero</li>
<li>nir/algebraic: Be more careful converting ushr to extract_u8/16</li>
<li>egl/dri2: Add a helper for the number of planes for a FOURCC format</li>
<li>egl/dri2: Guard against invalid fourcc formats</li>
<li>anv/blorp: Do more flushing around HiZ clears</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>egl/wayland: do not leak wl_buffer when it is locked</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: blorp: support multiple aspect blits</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>glapi: actually implement GL_EXT_robustness for GLES</li>
</ul>
<p>Nanley Chery (7):</p>
<ul>
<li>intel/isl: Avoid tiling some 16K-wide render targets</li>
<li>i965: Make blt_pitch public</li>
<li>i965/miptree: Drop an if case from retile_as_linear</li>
<li>i965/miptree: Use the correct BLT pitch</li>
<li>i965/miptree: Use miptree_map in map_blit functions</li>
<li>i965/miptree: Fix can_blit_slice()</li>
<li>i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix passing clip/cull distances from VS to PS</li>
</ul>
<p>vadym.shovkoplias (1):</p>
<ul>
<li>glsl/linker: Allow unused in blocks which are not declated on previous stage</li>
</ul>
</div>
</body>
</html>

177
docs/relnotes/18.1.9.html Normal file
View File

@@ -0,0 +1,177 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.8 Release Notes / September 24 2018</h1>
<p>
Mesa 18.1.9 is a bug fix release which fixes bugs found since the 18.1.8 release.
</p>
<p>
Mesa 18.1.9 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103241">Bug 103241</a> - Anv crashes when using 64-bit vertex inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104926">Bug 104926</a> - swrast: Mesa 17.3.3 produces: HW cursor for format 875713089 not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107280">Bug 107280</a> - [DXVK] Batman: Arkham City with tessellation enabled hangs on SKL GT4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107772">Bug 107772</a> - Mesa preprocessor matches if(def)s &amp; endifs incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107779">Bug 107779</a> - Access violation with some games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107810">Bug 107810</a> - The 'va_end' call is missed after 'va_copy' in 'util_vsnprintf' function under windows</li>
</ul>
<h2>Changes</h2>
<p>Andrii Simiklit (4):</p>
<ul>
<li>apple/glx/log: added missing va_end() after va_copy()</li>
<li>mesa/util: don't use the same 'va_list' instance twice</li>
<li>mesa/util: don't ignore NULL returned from 'malloc'</li>
<li>mesa/util: add missing va_end() after va_copy()</li>
</ul>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Use build ID if available for cache UUID.</li>
<li>radv: Only allow 16 user SGPRs for compute on GFX9+.</li>
<li>radv: Set the user SGPR MSB for Vega.</li>
<li>radv: Fix driver UUID SHA1 init.</li>
</ul>
<p>Christopher Egert (1):</p>
<ul>
<li>radeon: fix ColorMask</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>virgl: don't send a shader create with no data. (v2)</li>
</ul>
<p>Dylan Baker (10):</p>
<ul>
<li>docs/relnotes: Add sha256 sums for mesa 18.1.8</li>
<li>cherry-ignore: Add additional 18.2 patch</li>
<li>meson: Print a message about why a libdrm version was selected</li>
<li>cherry-ignore: add another 18.2 patch</li>
<li>cherry-ignore: Add patches that don't apply cleanly and are for developer tools</li>
<li>cherry-ignore: Add more 18.2 patches</li>
<li>cherry-ignore: add 18.2 patchs</li>
<li>cherry-ignore: add a patch that was reverted on master</li>
<li>cherry-ignore: one final update</li>
<li>Bump version to 18.1.9</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>winsys/virgl: avoid unintended behavior</li>
<li>virgl: adjust strides when mapping temp-resources</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>winsys/virgl: correct resource and handle allocation (v2)</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>anv/pipeline: Only consider double elements which actually exist</li>
<li>i965: Workaround the gen9 hw astc5x5 sampler bug</li>
<li>anv: Re-emit vertex buffers when the pipeline changes</li>
<li>anv: Disable the vertex cache when tessellating on SKL GT4</li>
<li>anv: Clamp scissors to the framebuffer boundary</li>
<li>anv/query: Write both dwords in emit_zero_queries</li>
</ul>
<p>Josh Pieper (1):</p>
<ul>
<li>st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)</li>
</ul>
<p>Kenneth Feng (1):</p>
<ul>
<li>amd: Add Picasso device id</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>st/mesa: help fix stencil border color for GL_DEPTH_STENCIL textures</li>
<li>radeonsi: fix HTILE for NPOT textures with mipmapping on SI/CI</li>
<li>r600: fix HTILE for NPOT textures with mipmapping</li>
<li>radeonsi: fix printing a BO list into ddebug reports</li>
</ul>
<p>Mathias Fröhlich (1):</p>
<ul>
<li>tnl: Fix green gun regression in xonotic.</li>
</ul>
<p>Mauro Rossi (3):</p>
<ul>
<li>android: broadcom/genxml: fix collision with intel/genxml header-gen macro</li>
<li>android: broadcom/cle: add gallium include path</li>
<li>android: broadcom/cle: export the broadcom top level path headers</li>
</ul>
<p>Michal Srb (1):</p>
<ul>
<li>st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>loader/dri3: Only wait for back buffer fences in dri3_get_buffer</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>nvir: Always split 64-bit IMAD/IMUL operations</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>intel: compiler option msse2 and mstackrealign</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>glsl: fixer lexer for unreachable defines</li>
</ul>
</div>
</body>
</html>

View File

@@ -13,8 +13,7 @@ Contact
Status
Shipping (Mesa 4.0.4 and later. Only implemented in particular
XFree86/DRI drivers.)
Obsolete. Effectively superseded by ARB_vertex_buffer_object.
Version

View File

@@ -12,7 +12,7 @@ Contact
Status
XXX - Not complete yet!!!
Obsolete.
Version

View File

@@ -12,7 +12,7 @@ Contact
Status
Shipping since Mesa 2.6 in February, 1998.
Obsolete.
Version

View File

@@ -46,14 +46,14 @@ GL_MESA_shader_debug.spec: (obsolete)
GL_DEBUG_ASSERT_MESA 0x875B
GL_MESA_program_debug: (obsolete)
GL_FRAGMENT_PROGRAM_CALLBACK_MESA 0x????
GL_VERTEX_PROGRAM_CALLBACK_MESA 0x????
GL_FRAGMENT_PROGRAM_POSITION_MESA 0x????
GL_VERTEX_PROGRAM_POSITION_MESA 0x????
GL_FRAGMENT_PROGRAM_CALLBACK_FUNC_MESA 0x????
GL_FRAGMENT_PROGRAM_CALLBACK_DATA_MESA 0x????
GL_VERTEX_PROGRAM_CALLBACK_FUNC_MESA 0x????
GL_VERTEX_PROGRAM_CALLBACK_DATA_MESA 0x????
GL_FRAGMENT_PROGRAM_POSITION_MESA 0x8BB0
GL_FRAGMENT_PROGRAM_CALLBACK_MESA 0x8BB1
GL_FRAGMENT_PROGRAM_CALLBACK_FUNC_MESA 0x8BB2
GL_FRAGMENT_PROGRAM_CALLBACK_DATA_MESA 0x8BB3
GL_VERTEX_PROGRAM_POSITION_MESA 0x8BB4
GL_VERTEX_PROGRAM_CALLBACK_MESA 0x8BB5
GL_VERTEX_PROGRAM_CALLBACK_FUNC_MESA 0x8BB6
GL_VERTEX_PROGRAM_CALLBACK_DATA_MESA 0x8BB7
GL_MESAX_texture_stack:
GL_TEXTURE_1D_STACK_MESAX 0x8759
@@ -63,15 +63,8 @@ GL_MESAX_texture_stack:
GL_TEXTURE_1D_STACK_BINDING_MESAX 0x875D
GL_TEXTURE_2D_STACK_BINDING_MESAX 0x875E
GL_MESA_program_debug
GL_FRAGMENT_PROGRAM_POSITION_MESA 0x8BB0
GL_FRAGMENT_PROGRAM_CALLBACK_MESA 0x8BB1
GL_FRAGMENT_PROGRAM_CALLBACK_FUNC_MESA 0x8BB2
GL_FRAGMENT_PROGRAM_CALLBACK_DATA_MESA 0x8BB3
GL_FRAGMENT_PROGRAM_POSITION_MESA 0x8BB4
GL_FRAGMENT_PROGRAM_CALLBACK_MESA 0x8BB5
GL_FRAGMENT_PROGRAM_CALLBACK_FUNC_MESA 0x8BB6
GL_FRAGMENT_PROGRAM_CALLBACK_DATA_MESA 0x8BB7
GL_MESA_program_binary_formats:
GL_PROGRAM_BINARY_FORMAT_MESA 0x875F
GL_MESA_tile_raster_order
GL_TILE_RASTER_ORDER_FIXED_MESA 0x8BB8

View File

@@ -246,6 +246,10 @@ release.
Note: resending patch identical to one on mesa-dev@ or one that differs only
by the extra mesa-stable@ tag is <strong>not</strong> recommended.
</p>
<p>
If you are not the author of the original patch, please Cc: them in your
nomination request.
</p>
<h3 id="thetag">The stable tag</h3>

View File

@@ -933,6 +933,7 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePixmapSurfaceHI (EGLDisplay dpy, EGLConfi
#define EGL_DRM_BUFFER_STRIDE_MESA 0x31D4
#define EGL_DRM_BUFFER_USE_SCANOUT_MESA 0x00000001
#define EGL_DRM_BUFFER_USE_SHARE_MESA 0x00000002
#define EGL_DRM_BUFFER_USE_CURSOR_MESA 0x00000004
typedef EGLImageKHR (EGLAPIENTRYP PFNEGLCREATEDRMIMAGEMESAPROC) (EGLDisplay dpy, const EGLint *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLEXPORTDRMIMAGEMESAPROC) (EGLDisplay dpy, EGLImageKHR image, EGLint *name, EGLint *handle, EGLint *stride);
#ifdef EGL_EGLEXT_PROTOTYPES

View File

@@ -34,13 +34,6 @@ extern "C" {
#include <EGL/eglplatform.h>
#ifdef EGL_MESA_drm_image
/* Mesa's extension to EGL_MESA_drm_image... */
#ifndef EGL_DRM_BUFFER_USE_CURSOR_MESA
#define EGL_DRM_BUFFER_USE_CURSOR_MESA 0x0004
#endif
#endif
#ifndef EGL_WL_bind_wayland_display
#define EGL_WL_bind_wayland_display 1

View File

@@ -104,6 +104,12 @@ typedef struct ANativeWindow* EGLNativeWindowType;
typedef struct egl_native_pixmap_t* EGLNativePixmapType;
typedef void* EGLNativeDisplayType;
#elif defined(USE_OZONE)
typedef intptr_t EGLNativeDisplayType;
typedef intptr_t EGLNativeWindowType;
typedef intptr_t EGLNativePixmapType;
#elif defined(__unix__) || defined(__APPLE__)
#if defined(MESA_EGL_NO_X11_HEADERS)
@@ -124,11 +130,13 @@ typedef Window EGLNativeWindowType;
#endif /* MESA_EGL_NO_X11_HEADERS */
#elif __HAIKU__
#elif defined(__HAIKU__)
#include <kernel/image.h>
typedef void *EGLNativeDisplayType;
typedef khronos_uintptr_t EGLNativePixmapType;
typedef khronos_uintptr_t EGLNativeWindowType;
typedef void *EGLNativeDisplayType;
typedef khronos_uintptr_t EGLNativePixmapType;
typedef khronos_uintptr_t EGLNativeWindowType;
#else
#error "Platform not recognized"

View File

@@ -47,9 +47,9 @@
# define GLAPI __declspec(dllimport)
# else /* for use with static link lib build of Win32 edition only */
# define GLAPI extern
# endif /* _STATIC_MESA support */
# endif
# if defined(__MINGW32__) && defined(GL_NO_STDCALL) || defined(UNDER_CE) /* The generated DLLs by MingW with STDCALL are not compatible with the ones done by Microsoft's compilers */
# define GLAPIENTRY
# define GLAPIENTRY
# else
# define GLAPIENTRY __stdcall
# endif

View File

@@ -9212,6 +9212,11 @@ GLAPI void APIENTRY glGetPerfQueryInfoINTEL (GLuint queryId, GLuint queryNameLen
#define GL_PACK_INVERT_MESA 0x8758
#endif /* GL_MESA_pack_invert */
#ifndef GL_MESA_program_binary_formats
#define GL_MESA_program_binary_formats 1
#define GL_PROGRAM_BINARY_FORMAT_MESA 0x875F
#endif /* GL_MESA_program_binary_formats */
#ifndef GL_MESA_resize_buffers
#define GL_MESA_resize_buffers 1
typedef void (APIENTRYP PFNGLRESIZEBUFFERSMESAPROC) (void);

View File

@@ -82,7 +82,7 @@ typedef struct __DRI2flushExtensionRec __DRI2flushExtension;
typedef struct __DRI2throttleExtensionRec __DRI2throttleExtension;
typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension;
typedef struct __DRI2interopExtensionRec __DRI2interopExtension;
typedef struct __DRI2blobExtensionRec __DRI2blobExtension;
typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension;
typedef struct __DRIimageDriverExtensionRec __DRIimageDriverExtension;
@@ -336,6 +336,30 @@ struct __DRI2throttleExtensionRec {
enum __DRI2throttleReason reason);
};
/**
* Extension for EGL_ANDROID_blob_cache
*/
#define __DRI2_BLOB "DRI2_Blob"
#define __DRI2_BLOB_VERSION 1
typedef void
(*__DRIblobCacheSet) (const void *key, signed long keySize,
const void *value, signed long valueSize);
typedef signed long
(*__DRIblobCacheGet) (const void *key, signed long keySize,
void *value, signed long valueSize);
struct __DRI2blobExtensionRec {
__DRIextension base;
/**
* Set cache functions for setting and getting cache entries.
*/
void (*set_cache_funcs) (__DRIscreen *screen,
__DRIblobCacheSet set, __DRIblobCacheGet get);
};
/**
* Extension for fences / synchronization objects.
@@ -1106,6 +1130,16 @@ struct __DRIdri2LoaderExtensionRec {
#define __DRI_CTX_PRIORITY_MEDIUM 1
#define __DRI_CTX_PRIORITY_HIGH 2
/**
* \name Context release behaviors.
*/
/*@{*/
#define __DRI_CTX_ATTRIB_RELEASE_BEHAVIOR 5
#define __DRI_CTX_RELEASE_BEHAVIOR_NONE 0
#define __DRI_CTX_RELEASE_BEHAVIOR_FLUSH 1
/*@}*/
/**
* \name Reasons that __DRIdri2Extension::createContextAttribs might fail
*/
@@ -1216,6 +1250,9 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FORMAT_ARGB1555 0x100c
#define __DRI_IMAGE_FORMAT_R16 0x100d
#define __DRI_IMAGE_FORMAT_GR1616 0x100e
#define __DRI_IMAGE_FORMAT_YUYV 0x100f
#define __DRI_IMAGE_FORMAT_XBGR2101010 0x1010
#define __DRI_IMAGE_FORMAT_ABGR2101010 0x1011
#define __DRI_IMAGE_USE_SHARE 0x0001
#define __DRI_IMAGE_USE_SCANOUT 0x0002
@@ -1251,7 +1288,15 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FOURCC_XRGB8888 0x34325258
#define __DRI_IMAGE_FOURCC_ABGR8888 0x34324241
#define __DRI_IMAGE_FOURCC_XBGR8888 0x34324258
#define __DRI_IMAGE_FOURCC_SARGB8888 0x83324258
#define __DRI_IMAGE_FOURCC_SARGB8888 0x83324258
#define __DRI_IMAGE_FOURCC_ARGB2101010 0x30335241
#define __DRI_IMAGE_FOURCC_XRGB2101010 0x30335258
#define __DRI_IMAGE_FOURCC_ABGR2101010 0x30334241
#define __DRI_IMAGE_FOURCC_XBGR2101010 0x30334258
#define __DRI_IMAGE_FOURCC_RGBA1010102 0x30334152
#define __DRI_IMAGE_FOURCC_RGBX1010102 0x30335852
#define __DRI_IMAGE_FOURCC_BGRA1010102 0x30334142
#define __DRI_IMAGE_FOURCC_BGRX1010102 0x30335842
#define __DRI_IMAGE_FOURCC_YUV410 0x39565559
#define __DRI_IMAGE_FOURCC_YUV411 0x31315559
#define __DRI_IMAGE_FOURCC_YUV420 0x32315559
@@ -1715,6 +1760,21 @@ typedef struct __DRInoErrorExtensionRec {
__DRIextension base;
} __DRInoErrorExtension;
/*
* Flush control driver extension.
*
* Existence of this extension means the driver can accept the
* \c __DRI_CTX_ATTRIB_RELEASE_BEHAVIOR attribute in
* \c __DRIdri2ExtensionRec::createContextAttribs.
*/
#define __DRI2_FLUSH_CONTROL "DRI_FlushControl"
#define __DRI2_FLUSH_CONTROL_VERSION 1
typedef struct __DRI2flushControlExtensionRec __DRI2flushControlExtension;
struct __DRI2flushControlExtensionRec {
__DRIextension base;
};
/**
* DRI config options extension.
*

View File

@@ -2334,6 +2334,11 @@ GL_APICALL void GL_APIENTRY glGetPerfQueryInfoINTEL (GLuint queryId, GLuint quer
#endif
#endif /* GL_INTEL_performance_query */
#ifndef GL_MESA_program_binary_formats
#define GL_MESA_program_binary_formats 1
#define GL_PROGRAM_BINARY_FORMAT_MESA 0x875F
#endif /* GL_MESA_program_binary_formats */
#ifndef GL_MESA_shader_integer_functions
#define GL_MESA_shader_integer_functions 1
#endif /* GL_MESA_shader_integer_functions */

View File

@@ -30,9 +30,6 @@
#define EMULATED_THREADS_H_INCLUDED_
#include <time.h>
#ifdef _MSC_VER
#include <thr/xtimec.h> // for xtime
#endif
#ifndef TIME_UTC
#define TIME_UTC 1
@@ -44,14 +41,6 @@
typedef void (*tss_dtor_t)(void*);
typedef int (*thrd_start_t)(void*);
#ifndef _MSC_VER
struct xtime {
time_t sec;
long nsec;
};
typedef struct xtime xtime;
#endif
/*-------------------- enumeration constants --------------------*/
enum {

View File

@@ -132,19 +132,15 @@ cnd_signal(cnd_t *cond)
// 7.25.3.5
static inline int
cnd_timedwait(cnd_t *cond, mtx_t *mtx, const xtime *xt)
cnd_timedwait(cnd_t *cond, mtx_t *mtx, const struct timespec *abs_time)
{
struct timespec abs_time;
int rt;
assert(mtx != NULL);
assert(cond != NULL);
assert(xt != NULL);
assert(abs_time != NULL);
abs_time.tv_sec = xt->sec;
abs_time.tv_nsec = xt->nsec;
rt = pthread_cond_timedwait(cond, mtx, &abs_time);
rt = pthread_cond_timedwait(cond, mtx, abs_time);
if (rt == ETIMEDOUT)
return thrd_busy;
return (rt == 0) ? thrd_success : thrd_error;
@@ -235,24 +231,21 @@ thrd_yield(void);
// 7.25.4.4
static inline int
mtx_timedlock(mtx_t *mtx, const xtime *xt)
mtx_timedlock(mtx_t *mtx, const struct timespec *ts)
{
assert(mtx != NULL);
assert(xt != NULL);
assert(ts != NULL);
{
#ifdef EMULATED_THREADS_USE_NATIVE_TIMEDLOCK
struct timespec ts;
int rt;
ts.tv_sec = xt->sec;
ts.tv_nsec = xt->nsec;
rt = pthread_mutex_timedlock(mtx, &ts);
rt = pthread_mutex_timedlock(mtx, ts);
if (rt == 0)
return thrd_success;
return (rt == ETIMEDOUT) ? thrd_busy : thrd_error;
#else
time_t expire = time(NULL);
expire += xt->sec;
expire += ts->tv_sec;
while (mtx_trylock(mtx) != thrd_success) {
time_t now = time(NULL);
if (expire < now)
@@ -342,13 +335,10 @@ thrd_join(thrd_t thr, int *res)
// 7.25.5.7
static inline void
thrd_sleep(const xtime *xt)
thrd_sleep(const struct timespec *time_point, struct timespec *remaining)
{
struct timespec req;
assert(xt);
req.tv_sec = xt->sec;
req.tv_nsec = xt->nsec;
nanosleep(&req, NULL);
assert(time_point != NULL);
nanosleep(time_point, remaining);
}
// 7.25.5.8
@@ -392,14 +382,15 @@ tss_set(tss_t key, void *val)
/*-------------------- 7.25.7 Time functions --------------------*/
// 7.25.6.1
#ifndef HAVE_TIMESPEC_GET
static inline int
xtime_get(xtime *xt, int base)
timespec_get(struct timespec *ts, int base)
{
if (!xt) return 0;
if (!ts) return 0;
if (base == TIME_UTC) {
xt->sec = time(NULL);
xt->nsec = 0;
clock_gettime(CLOCK_REALTIME, ts);
return base;
}
return 0;
}
#endif

View File

@@ -75,6 +75,20 @@ Configuration macro:
#error EMULATED_THREADS_USE_NATIVE_CV requires _WIN32_WINNT>=0x0600
#endif
/* Visual Studio 2015 and later */
#if _MSC_VER >= 1900
#define HAVE_TIMESPEC
#define HAVE_TIMESPEC_GET
#elif defined(__MINGW32__)
#define HAVE_TIMESPEC
#endif
#ifndef HAVE_TIMESPEC
struct timespec {
time_t tv_sec;
long tv_nsec;
};
#endif
/*---------------------------- macros ----------------------------*/
#ifdef EMULATED_THREADS_USE_NATIVE_CALL_ONCE
@@ -146,9 +160,9 @@ static unsigned __stdcall impl_thrd_routine(void *p)
return (unsigned)code;
}
static DWORD impl_xtime2msec(const xtime *xt)
static DWORD impl_timespec2msec(const struct timespec *ts)
{
return (DWORD)((xt->sec * 1000U) + (xt->nsec / 1000000L));
return (DWORD)((ts->tv_sec * 1000U) + (ts->tv_nsec / 1000000L));
}
#ifdef EMULATED_THREADS_USE_NATIVE_CALL_ONCE
@@ -206,7 +220,7 @@ static void impl_cond_do_signal(cnd_t *cond, int broadcast)
ReleaseSemaphore(cond->sem_queue, nsignal, NULL);
}
static int impl_cond_do_wait(cnd_t *cond, mtx_t *mtx, const xtime *xt)
static int impl_cond_do_wait(cnd_t *cond, mtx_t *mtx, const struct timespec *ts)
{
int nleft = 0;
int ngone = 0;
@@ -219,7 +233,7 @@ static int impl_cond_do_wait(cnd_t *cond, mtx_t *mtx, const xtime *xt)
mtx_unlock(mtx);
w = WaitForSingleObject(cond->sem_queue, xt ? impl_xtime2msec(xt) : INFINITE);
w = WaitForSingleObject(cond->sem_queue, ts ? impl_timespec2msec(ts) : INFINITE);
timeout = (w == WAIT_TIMEOUT);
EnterCriticalSection(&cond->monitor);
@@ -378,15 +392,15 @@ cnd_signal(cnd_t *cond)
// 7.25.3.5
static inline int
cnd_timedwait(cnd_t *cond, mtx_t *mtx, const xtime *xt)
cnd_timedwait(cnd_t *cond, mtx_t *mtx, const struct timespec *abs_time)
{
if (!cond || !mtx || !xt) return thrd_error;
if (!cond || !mtx || !abs_time) return thrd_error;
#ifdef EMULATED_THREADS_USE_NATIVE_CV
if (SleepConditionVariableCS(&cond->condvar, mtx, impl_xtime2msec(xt)))
if (SleepConditionVariableCS(&cond->condvar, mtx, impl_timespec2msec(abs_time)))
return thrd_success;
return (GetLastError() == ERROR_TIMEOUT) ? thrd_busy : thrd_error;
#else
return impl_cond_do_wait(cond, mtx, xt);
return impl_cond_do_wait(cond, mtx, abs_time);
#endif
}
@@ -438,12 +452,12 @@ mtx_lock(mtx_t *mtx)
// 7.25.4.4
static inline int
mtx_timedlock(mtx_t *mtx, const xtime *xt)
mtx_timedlock(mtx_t *mtx, const struct timespec *ts)
{
time_t expire, now;
if (!mtx || !xt) return thrd_error;
if (!mtx || !ts) return thrd_error;
expire = time(NULL);
expire += xt->sec;
expire += ts->tv_sec;
while (mtx_trylock(mtx) != thrd_success) {
now = time(NULL);
if (expire < now)
@@ -579,10 +593,11 @@ thrd_join(thrd_t thr, int *res)
// 7.25.5.7
static inline void
thrd_sleep(const xtime *xt)
thrd_sleep(const struct timespec *time_point, struct timespec *remaining)
{
assert(xt);
Sleep(impl_xtime2msec(xt));
assert(time_point);
assert(!remaining); /* not implemented */
Sleep(impl_timespec2msec(time_point));
}
// 7.25.5.8
@@ -633,14 +648,16 @@ tss_set(tss_t key, void *val)
/*-------------------- 7.25.7 Time functions --------------------*/
// 7.25.6.1
#ifndef HAVE_TIMESPEC_GET
static inline int
xtime_get(xtime *xt, int base)
timespec_get(struct timespec *ts, int base)
{
if (!xt) return 0;
if (!ts) return 0;
if (base == TIME_UTC) {
xt->sec = time(NULL);
xt->nsec = 0;
ts->tv_sec = time(NULL);
ts->tv_nsec = 0;
return base;
}
return 0;
}
#endif

View File

@@ -164,6 +164,7 @@ test_c99_compat_h(const void * restrict a,
# define HAVE_FUNC_ATTRIBUTE_FORMAT 1
# define HAVE_FUNC_ATTRIBUTE_PACKED 1
# define HAVE_FUNC_ATTRIBUTE_ALIAS 1
# define HAVE_FUNC_ATTRIBUTE_NORETURN 1
# if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)
/* https://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/Other-Builtins.html */

View File

@@ -13,9 +13,9 @@ $ make headers_install INSTALL_HDR_PATH=/path/to/install
The last update was done at the following kernel commit :
commit 7846b12fe0b5feab5446d892f41b5140c1419109
Merge: 7ebdb0d d78acfe
commit 78230c46ec0a91dd4256c9e54934b3c7095a7ee3
Merge: b65bd4031156 037f03155b7d
Author: Dave Airlie <airlied@redhat.com>
Date: Tue Aug 29 10:38:14 2017 +1000
Date: Wed Mar 21 14:07:03 2018 +1000
Merge branch 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux into drm-next
Merge tag 'omapdrm-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into drm-next

View File

@@ -731,6 +731,28 @@ struct drm_syncobj_array {
__u32 pad;
};
/* Query current scanout sequence number */
struct drm_crtc_get_sequence {
__u32 crtc_id; /* requested crtc_id */
__u32 active; /* return: crtc output is active */
__u64 sequence; /* return: most recent vblank sequence */
__s64 sequence_ns; /* return: most recent time of first pixel out */
};
/* Queue event to be delivered at specified sequence. Time stamp marks
* when the first pixel of the refresh cycle leaves the display engine
* for the display
*/
#define DRM_CRTC_SEQUENCE_RELATIVE 0x00000001 /* sequence is relative to current */
#define DRM_CRTC_SEQUENCE_NEXT_ON_MISS 0x00000002 /* Use next sequence if we've missed */
struct drm_crtc_queue_sequence {
__u32 crtc_id;
__u32 flags;
__u64 sequence; /* on input, target sequence. on output, actual sequence */
__u64 user_data; /* user data passed to event */
};
#if defined(__cplusplus)
}
#endif
@@ -813,6 +835,9 @@ extern "C" {
#define DRM_IOCTL_WAIT_VBLANK DRM_IOWR(0x3a, union drm_wait_vblank)
#define DRM_IOCTL_CRTC_GET_SEQUENCE DRM_IOWR(0x3b, struct drm_crtc_get_sequence)
#define DRM_IOCTL_CRTC_QUEUE_SEQUENCE DRM_IOWR(0x3c, struct drm_crtc_queue_sequence)
#define DRM_IOCTL_UPDATE_DRAW DRM_IOW(0x3f, struct drm_update_draw)
#define DRM_IOCTL_MODE_GETRESOURCES DRM_IOWR(0xA0, struct drm_mode_card_res)
@@ -857,6 +882,11 @@ extern "C" {
#define DRM_IOCTL_SYNCOBJ_RESET DRM_IOWR(0xC4, struct drm_syncobj_array)
#define DRM_IOCTL_SYNCOBJ_SIGNAL DRM_IOWR(0xC5, struct drm_syncobj_array)
#define DRM_IOCTL_MODE_CREATE_LEASE DRM_IOWR(0xC6, struct drm_mode_create_lease)
#define DRM_IOCTL_MODE_LIST_LESSEES DRM_IOWR(0xC7, struct drm_mode_list_lessees)
#define DRM_IOCTL_MODE_GET_LEASE DRM_IOWR(0xC8, struct drm_mode_get_lease)
#define DRM_IOCTL_MODE_REVOKE_LEASE DRM_IOWR(0xC9, struct drm_mode_revoke_lease)
/**
* Device specific ioctls should only be in their respective headers
* The device specific ioctl range is from 0x40 to 0x9f.
@@ -887,6 +917,7 @@ struct drm_event {
#define DRM_EVENT_VBLANK 0x01
#define DRM_EVENT_FLIP_COMPLETE 0x02
#define DRM_EVENT_CRTC_SEQUENCE 0x03
struct drm_event_vblank {
struct drm_event base;
@@ -897,6 +928,16 @@ struct drm_event_vblank {
__u32 crtc_id; /* 0 on older kernels that do not support this */
};
/* Event delivered at sequence. Time stamp marks when the first pixel
* of the refresh cycle leaves the display engine for the display
*/
struct drm_event_crtc_sequence {
struct drm_event base;
__u64 user_data;
__s64 time_ns;
__u64 sequence;
};
/* typedef area */
typedef struct drm_clip_rect drm_clip_rect_t;
typedef struct drm_drawable_info drm_drawable_info_t;

View File

@@ -178,7 +178,7 @@ extern "C" {
#define DRM_FORMAT_MOD_VENDOR_NONE 0
#define DRM_FORMAT_MOD_VENDOR_INTEL 0x01
#define DRM_FORMAT_MOD_VENDOR_AMD 0x02
#define DRM_FORMAT_MOD_VENDOR_NV 0x03
#define DRM_FORMAT_MOD_VENDOR_NVIDIA 0x03
#define DRM_FORMAT_MOD_VENDOR_SAMSUNG 0x04
#define DRM_FORMAT_MOD_VENDOR_QCOM 0x05
#define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06
@@ -188,7 +188,7 @@ extern "C" {
#define DRM_FORMAT_RESERVED ((1ULL << 56) - 1)
#define fourcc_mod_code(vendor, val) \
((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | (val & 0x00ffffffffffffffULL))
((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | ((val) & 0x00ffffffffffffffULL))
/*
* Format Modifier tokens:
@@ -338,29 +338,17 @@ extern "C" {
*/
#define DRM_FORMAT_MOD_VIVANTE_SPLIT_SUPER_TILED fourcc_mod_code(VIVANTE, 4)
/* NVIDIA Tegra frame buffer modifiers */
/*
* Some modifiers take parameters, for example the number of vertical GOBs in
* a block. Reserve the lower 32 bits for parameters
*/
#define __fourcc_mod_tegra_mode_shift 32
#define fourcc_mod_tegra_code(val, params) \
fourcc_mod_code(NV, ((((__u64)val) << __fourcc_mod_tegra_mode_shift) | params))
#define fourcc_mod_tegra_mod(m) \
(m & ~((1ULL << __fourcc_mod_tegra_mode_shift) - 1))
#define fourcc_mod_tegra_param(m) \
(m & ((1ULL << __fourcc_mod_tegra_mode_shift) - 1))
/* NVIDIA frame buffer modifiers */
/*
* Tegra Tiled Layout, used by Tegra 2, 3 and 4.
*
* Pixels are arranged in simple tiles of 16 x 16 bytes.
*/
#define NV_FORMAT_MOD_TEGRA_TILED fourcc_mod_tegra_code(1, 0)
#define DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED fourcc_mod_code(NVIDIA, 1)
/*
* Tegra 16Bx2 Block Linear layout, used by TK1/TX1
* 16Bx2 Block Linear layout, used by desktop GPUs, and Tegra K1 and later
*
* Pixels are arranged in 64x8 Groups Of Bytes (GOBs). GOBs are then stacked
* vertically by a power of 2 (1 to 32 GOBs) to form a block.
@@ -380,7 +368,21 @@ extern "C" {
* Chapter 20 "Pixel Memory Formats" of the Tegra X1 TRM describes this format
* in full detail.
*/
#define NV_FORMAT_MOD_TEGRA_16BX2_BLOCK(v) fourcc_mod_tegra_code(2, v)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(v) \
fourcc_mod_code(NVIDIA, 0x10 | ((v) & 0xf))
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB \
fourcc_mod_code(NVIDIA, 0x10)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB \
fourcc_mod_code(NVIDIA, 0x11)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB \
fourcc_mod_code(NVIDIA, 0x12)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB \
fourcc_mod_code(NVIDIA, 0x13)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB \
fourcc_mod_code(NVIDIA, 0x14)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB \
fourcc_mod_code(NVIDIA, 0x15)
/*
* Broadcom VC4 "T" format

View File

@@ -38,14 +38,18 @@ extern "C" {
#define DRM_DISPLAY_MODE_LEN 32
#define DRM_PROP_NAME_LEN 32
#define DRM_MODE_TYPE_BUILTIN (1<<0)
#define DRM_MODE_TYPE_CLOCK_C ((1<<1) | DRM_MODE_TYPE_BUILTIN)
#define DRM_MODE_TYPE_CRTC_C ((1<<2) | DRM_MODE_TYPE_BUILTIN)
#define DRM_MODE_TYPE_BUILTIN (1<<0) /* deprecated */
#define DRM_MODE_TYPE_CLOCK_C ((1<<1) | DRM_MODE_TYPE_BUILTIN) /* deprecated */
#define DRM_MODE_TYPE_CRTC_C ((1<<2) | DRM_MODE_TYPE_BUILTIN) /* deprecated */
#define DRM_MODE_TYPE_PREFERRED (1<<3)
#define DRM_MODE_TYPE_DEFAULT (1<<4)
#define DRM_MODE_TYPE_DEFAULT (1<<4) /* deprecated */
#define DRM_MODE_TYPE_USERDEF (1<<5)
#define DRM_MODE_TYPE_DRIVER (1<<6)
#define DRM_MODE_TYPE_ALL (DRM_MODE_TYPE_PREFERRED | \
DRM_MODE_TYPE_USERDEF | \
DRM_MODE_TYPE_DRIVER)
/* Video mode flags */
/* bit compatible with the xrandr RR_ definitions (bits 0-13)
*
@@ -66,8 +70,8 @@ extern "C" {
#define DRM_MODE_FLAG_PCSYNC (1<<7)
#define DRM_MODE_FLAG_NCSYNC (1<<8)
#define DRM_MODE_FLAG_HSKEW (1<<9) /* hskew provided */
#define DRM_MODE_FLAG_BCAST (1<<10)
#define DRM_MODE_FLAG_PIXMUX (1<<11)
#define DRM_MODE_FLAG_BCAST (1<<10) /* deprecated */
#define DRM_MODE_FLAG_PIXMUX (1<<11) /* deprecated */
#define DRM_MODE_FLAG_DBLCLK (1<<12)
#define DRM_MODE_FLAG_CLKDIV2 (1<<13)
/*
@@ -99,6 +103,20 @@ extern "C" {
#define DRM_MODE_FLAG_PIC_AR_16_9 \
(DRM_MODE_PICTURE_ASPECT_16_9<<19)
#define DRM_MODE_FLAG_ALL (DRM_MODE_FLAG_PHSYNC | \
DRM_MODE_FLAG_NHSYNC | \
DRM_MODE_FLAG_PVSYNC | \
DRM_MODE_FLAG_NVSYNC | \
DRM_MODE_FLAG_INTERLACE | \
DRM_MODE_FLAG_DBLSCAN | \
DRM_MODE_FLAG_CSYNC | \
DRM_MODE_FLAG_PCSYNC | \
DRM_MODE_FLAG_NCSYNC | \
DRM_MODE_FLAG_HSKEW | \
DRM_MODE_FLAG_DBLCLK | \
DRM_MODE_FLAG_CLKDIV2 | \
DRM_MODE_FLAG_3D_MASK)
/* DPMS flags */
/* bit compatible with the xorg definitions. */
#define DRM_MODE_DPMS_ON 0
@@ -173,6 +191,10 @@ extern "C" {
DRM_MODE_REFLECT_X | \
DRM_MODE_REFLECT_Y)
/* Content Protection Flags */
#define DRM_MODE_CONTENT_PROTECTION_UNDESIRED 0
#define DRM_MODE_CONTENT_PROTECTION_DESIRED 1
#define DRM_MODE_CONTENT_PROTECTION_ENABLED 2
struct drm_mode_modeinfo {
__u32 clock;
@@ -341,7 +363,7 @@ struct drm_mode_get_connector {
__u32 pad;
};
#define DRM_MODE_PROP_PENDING (1<<0)
#define DRM_MODE_PROP_PENDING (1<<0) /* deprecated, do not use */
#define DRM_MODE_PROP_RANGE (1<<1)
#define DRM_MODE_PROP_IMMUTABLE (1<<2)
#define DRM_MODE_PROP_ENUM (1<<3) /* enumerated type with text strings */
@@ -576,8 +598,11 @@ struct drm_mode_crtc_lut {
};
struct drm_color_ctm {
/* Conversion matrix in S31.32 format. */
__s64 matrix[9];
/*
* Conversion matrix in S31.32 sign-magnitude
* (not two's complement!) format.
*/
__u64 matrix[9];
};
struct drm_color_lut {
@@ -749,9 +774,9 @@ struct drm_format_modifier {
* If the number formats grew to 128, and formats 98-102 are
* supported with the modifier:
*
* 0x0000003c00000000 0000000000000000
* 0x0000007c00000000 0000000000000000
* ^
* |__offset = 64, formats = 0x3c00000000
* |__offset = 64, formats = 0x7c00000000
*
*/
__u64 formats;
@@ -782,6 +807,72 @@ struct drm_mode_destroy_blob {
__u32 blob_id;
};
/**
* Lease mode resources, creating another drm_master.
*/
struct drm_mode_create_lease {
/** Pointer to array of object ids (__u32) */
__u64 object_ids;
/** Number of object ids */
__u32 object_count;
/** flags for new FD (O_CLOEXEC, etc) */
__u32 flags;
/** Return: unique identifier for lessee. */
__u32 lessee_id;
/** Return: file descriptor to new drm_master file */
__u32 fd;
};
/**
* List lesses from a drm_master
*/
struct drm_mode_list_lessees {
/** Number of lessees.
* On input, provides length of the array.
* On output, provides total number. No
* more than the input number will be written
* back, so two calls can be used to get
* the size and then the data.
*/
__u32 count_lessees;
__u32 pad;
/** Pointer to lessees.
* pointer to __u64 array of lessee ids
*/
__u64 lessees_ptr;
};
/**
* Get leased objects
*/
struct drm_mode_get_lease {
/** Number of leased objects.
* On input, provides length of the array.
* On output, provides total number. No
* more than the input number will be written
* back, so two calls can be used to get
* the size and then the data.
*/
__u32 count_objects;
__u32 pad;
/** Pointer to objects.
* pointer to __u32 array of object ids
*/
__u64 objects_ptr;
};
/**
* Revoke lease
*/
struct drm_mode_revoke_lease {
/** Unique ID of lessee
*/
__u32 lessee_id;
};
#if defined(__cplusplus)
}
#endif

View File

@@ -86,6 +86,62 @@ enum i915_mocs_table_index {
I915_MOCS_CACHED,
};
/*
* Different engines serve different roles, and there may be more than one
* engine serving each role. enum drm_i915_gem_engine_class provides a
* classification of the role of the engine, which may be used when requesting
* operations to be performed on a certain subset of engines, or for providing
* information about that group.
*/
enum drm_i915_gem_engine_class {
I915_ENGINE_CLASS_RENDER = 0,
I915_ENGINE_CLASS_COPY = 1,
I915_ENGINE_CLASS_VIDEO = 2,
I915_ENGINE_CLASS_VIDEO_ENHANCE = 3,
I915_ENGINE_CLASS_INVALID = -1
};
/**
* DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
*
*/
enum drm_i915_pmu_engine_sample {
I915_SAMPLE_BUSY = 0,
I915_SAMPLE_WAIT = 1,
I915_SAMPLE_SEMA = 2
};
#define I915_PMU_SAMPLE_BITS (4)
#define I915_PMU_SAMPLE_MASK (0xf)
#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
#define I915_PMU_CLASS_SHIFT \
(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
#define __I915_PMU_ENGINE(class, instance, sample) \
((class) << I915_PMU_CLASS_SHIFT | \
(instance) << I915_PMU_SAMPLE_BITS | \
(sample))
#define I915_PMU_ENGINE_BUSY(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
#define I915_PMU_ENGINE_WAIT(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
#define I915_PMU_ENGINE_SEMA(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
#define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0)
#define I915_PMU_REQUESTED_FREQUENCY __I915_PMU_OTHER(1)
#define I915_PMU_INTERRUPTS __I915_PMU_OTHER(2)
#define I915_PMU_RC6_RESIDENCY __I915_PMU_OTHER(3)
#define I915_PMU_LAST I915_PMU_RC6_RESIDENCY
/* Each region is a minimum of 16k, and there are at most 255 of them.
*/
#define I915_NR_TEX_REGIONS 255 /* table size 2k - maximum due to use
@@ -262,6 +318,7 @@ typedef struct _drm_i915_sarea {
#define DRM_I915_PERF_OPEN 0x36
#define DRM_I915_PERF_ADD_CONFIG 0x37
#define DRM_I915_PERF_REMOVE_CONFIG 0x38
#define DRM_I915_QUERY 0x39
#define DRM_IOCTL_I915_INIT DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
#define DRM_IOCTL_I915_FLUSH DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -319,6 +376,7 @@ typedef struct _drm_i915_sarea {
#define DRM_IOCTL_I915_PERF_OPEN DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
#define DRM_IOCTL_I915_PERF_ADD_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
#define DRM_IOCTL_I915_PERF_REMOVE_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
#define DRM_IOCTL_I915_QUERY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
/* Allow drivers to submit batchbuffers directly to hardware, relying
* on the security mechanisms provided by hardware.
@@ -450,6 +508,27 @@ typedef struct drm_i915_irq_wait {
*/
#define I915_PARAM_HAS_EXEC_FENCE_ARRAY 49
/*
* Query whether every context (both per-file default and user created) is
* isolated (insofar as HW supports). If this parameter is not true, then
* freshly created contexts may inherit values from an existing context,
* rather than default HW values. If true, it also ensures (insofar as HW
* supports) that all state set by this context will not leak to any other
* context.
*
* As not every engine across every gen support contexts, the returned
* value reports the support of context isolation for individual engines by
* returning a bitmask of each engine class set to true if that class supports
* isolation.
*/
#define I915_PARAM_HAS_CONTEXT_ISOLATION 50
/* Frequency of the command streamer timestamps given by the *_TIMESTAMP
* registers. This used to be fixed per platform but from CNL onwards, this
* might vary depending on the parts.
*/
#define I915_PARAM_CS_TIMESTAMP_FREQUENCY 51
typedef struct drm_i915_getparam {
__s32 param;
/*
@@ -839,6 +918,7 @@ struct drm_i915_gem_exec_fence {
#define I915_EXEC_FENCE_WAIT (1<<0)
#define I915_EXEC_FENCE_SIGNAL (1<<1)
#define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SIGNAL << 1))
__u32 flags;
};
@@ -1280,7 +1360,9 @@ struct drm_intel_overlay_attrs {
* active on a given plane.
*/
#define I915_SET_COLORKEY_NONE (1<<0) /* disable color key matching */
#define I915_SET_COLORKEY_NONE (1<<0) /* Deprecated. Instead set
* flags==0 to disable colorkeying.
*/
#define I915_SET_COLORKEY_DESTINATION (1<<1)
#define I915_SET_COLORKEY_SOURCE (1<<2)
struct drm_intel_sprite_colorkey {
@@ -1526,15 +1608,115 @@ struct drm_i915_perf_oa_config {
__u32 n_flex_regs;
/*
* These fields are pointers to tuples of u32 values (register
* address, value). For example the expected length of the buffer
* pointed by mux_regs_ptr is (2 * sizeof(u32) * n_mux_regs).
* These fields are pointers to tuples of u32 values (register address,
* value). For example the expected length of the buffer pointed by
* mux_regs_ptr is (2 * sizeof(u32) * n_mux_regs).
*/
__u64 mux_regs_ptr;
__u64 boolean_regs_ptr;
__u64 flex_regs_ptr;
};
struct drm_i915_query_item {
__u64 query_id;
#define DRM_I915_QUERY_TOPOLOGY_INFO 1
/*
* When set to zero by userspace, this is filled with the size of the
* data to be written at the data_ptr pointer. The kernel sets this
* value to a negative value to signal an error on a particular query
* item.
*/
__s32 length;
/*
* Unused for now. Must be cleared to zero.
*/
__u32 flags;
/*
* Data will be written at the location pointed by data_ptr when the
* value of length matches the length of the data to be written by the
* kernel.
*/
__u64 data_ptr;
};
struct drm_i915_query {
__u32 num_items;
/*
* Unused for now. Must be cleared to zero.
*/
__u32 flags;
/*
* This points to an array of num_items drm_i915_query_item structures.
*/
__u64 items_ptr;
};
/*
* Data written by the kernel with query DRM_I915_QUERY_TOPOLOGY_INFO :
*
* data: contains the 3 pieces of information :
*
* - the slice mask with one bit per slice telling whether a slice is
* available. The availability of slice X can be queried with the following
* formula :
*
* (data[X / 8] >> (X % 8)) & 1
*
* - the subslice mask for each slice with one bit per subslice telling
* whether a subslice is available. The availability of subslice Y in slice
* X can be queried with the following formula :
*
* (data[subslice_offset +
* X * subslice_stride +
* Y / 8] >> (Y % 8)) & 1
*
* - the EU mask for each subslice in each slice with one bit per EU telling
* whether an EU is available. The availability of EU Z in subslice Y in
* slice X can be queried with the following formula :
*
* (data[eu_offset +
* (X * max_subslices + Y) * eu_stride +
* Z / 8] >> (Z % 8)) & 1
*/
struct drm_i915_query_topology_info {
/*
* Unused for now. Must be cleared to zero.
*/
__u16 flags;
__u16 max_slices;
__u16 max_subslices;
__u16 max_eus_per_subslice;
/*
* Offset in data[] at which the subslice masks are stored.
*/
__u16 subslice_offset;
/*
* Stride at which each of the subslice masks for each slice are
* stored.
*/
__u16 subslice_stride;
/*
* Offset in data[] at which the EU masks are stored.
*/
__u16 eu_offset;
/*
* Stride at which each of the EU masks for each subslice are stored.
*/
__u16 eu_stride;
__u8 data[];
};
#if defined(__cplusplus)
}
#endif

View File

@@ -0,0 +1,209 @@
/*
* Copyright (c) 2012-2013, NVIDIA CORPORATION. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _TEGRA_DRM_H_
#define _TEGRA_DRM_H_
#include "drm.h"
#if defined(__cplusplus)
extern "C" {
#endif
#define DRM_TEGRA_GEM_CREATE_TILED (1 << 0)
#define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1)
struct drm_tegra_gem_create {
__u64 size;
__u32 flags;
__u32 handle;
};
struct drm_tegra_gem_mmap {
__u32 handle;
__u32 pad;
__u64 offset;
};
struct drm_tegra_syncpt_read {
__u32 id;
__u32 value;
};
struct drm_tegra_syncpt_incr {
__u32 id;
__u32 pad;
};
struct drm_tegra_syncpt_wait {
__u32 id;
__u32 thresh;
__u32 timeout;
__u32 value;
};
#define DRM_TEGRA_NO_TIMEOUT (0xffffffff)
struct drm_tegra_open_channel {
__u32 client;
__u32 pad;
__u64 context;
};
struct drm_tegra_close_channel {
__u64 context;
};
struct drm_tegra_get_syncpt {
__u64 context;
__u32 index;
__u32 id;
};
struct drm_tegra_get_syncpt_base {
__u64 context;
__u32 syncpt;
__u32 id;
};
struct drm_tegra_syncpt {
__u32 id;
__u32 incrs;
};
struct drm_tegra_cmdbuf {
__u32 handle;
__u32 offset;
__u32 words;
__u32 pad;
};
struct drm_tegra_reloc {
struct {
__u32 handle;
__u32 offset;
} cmdbuf;
struct {
__u32 handle;
__u32 offset;
} target;
__u32 shift;
__u32 pad;
};
struct drm_tegra_waitchk {
__u32 handle;
__u32 offset;
__u32 syncpt;
__u32 thresh;
};
struct drm_tegra_submit {
__u64 context;
__u32 num_syncpts;
__u32 num_cmdbufs;
__u32 num_relocs;
__u32 num_waitchks;
__u32 waitchk_mask;
__u32 timeout;
__u64 syncpts;
__u64 cmdbufs;
__u64 relocs;
__u64 waitchks;
__u32 fence; /* Return value */
__u32 reserved[5]; /* future expansion */
};
#define DRM_TEGRA_GEM_TILING_MODE_PITCH 0
#define DRM_TEGRA_GEM_TILING_MODE_TILED 1
#define DRM_TEGRA_GEM_TILING_MODE_BLOCK 2
struct drm_tegra_gem_set_tiling {
/* input */
__u32 handle;
__u32 mode;
__u32 value;
__u32 pad;
};
struct drm_tegra_gem_get_tiling {
/* input */
__u32 handle;
/* output */
__u32 mode;
__u32 value;
__u32 pad;
};
#define DRM_TEGRA_GEM_BOTTOM_UP (1 << 0)
#define DRM_TEGRA_GEM_FLAGS (DRM_TEGRA_GEM_BOTTOM_UP)
struct drm_tegra_gem_set_flags {
/* input */
__u32 handle;
/* output */
__u32 flags;
};
struct drm_tegra_gem_get_flags {
/* input */
__u32 handle;
/* output */
__u32 flags;
};
#define DRM_TEGRA_GEM_CREATE 0x00
#define DRM_TEGRA_GEM_MMAP 0x01
#define DRM_TEGRA_SYNCPT_READ 0x02
#define DRM_TEGRA_SYNCPT_INCR 0x03
#define DRM_TEGRA_SYNCPT_WAIT 0x04
#define DRM_TEGRA_OPEN_CHANNEL 0x05
#define DRM_TEGRA_CLOSE_CHANNEL 0x06
#define DRM_TEGRA_GET_SYNCPT 0x07
#define DRM_TEGRA_SUBMIT 0x08
#define DRM_TEGRA_GET_SYNCPT_BASE 0x09
#define DRM_TEGRA_GEM_SET_TILING 0x0a
#define DRM_TEGRA_GEM_GET_TILING 0x0b
#define DRM_TEGRA_GEM_SET_FLAGS 0x0c
#define DRM_TEGRA_GEM_GET_FLAGS 0x0d
#define DRM_IOCTL_TEGRA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_CREATE, struct drm_tegra_gem_create)
#define DRM_IOCTL_TEGRA_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_MMAP, struct drm_tegra_gem_mmap)
#define DRM_IOCTL_TEGRA_SYNCPT_READ DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_READ, struct drm_tegra_syncpt_read)
#define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr)
#define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait)
#define DRM_IOCTL_TEGRA_OPEN_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_OPEN_CHANNEL, struct drm_tegra_open_channel)
#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel)
#define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt)
#define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit)
#define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base)
#define DRM_IOCTL_TEGRA_GEM_SET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_SET_TILING, struct drm_tegra_gem_set_tiling)
#define DRM_IOCTL_TEGRA_GEM_GET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_GET_TILING, struct drm_tegra_gem_get_tiling)
#define DRM_IOCTL_TEGRA_GEM_SET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_SET_FLAGS, struct drm_tegra_gem_set_flags)
#define DRM_IOCTL_TEGRA_GEM_GET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_GET_FLAGS, struct drm_tegra_gem_get_flags)
#if defined(__cplusplus)
}
#endif
#endif

View File

@@ -41,6 +41,10 @@ extern "C" {
#define DRM_VC4_SET_TILING 0x08
#define DRM_VC4_GET_TILING 0x09
#define DRM_VC4_LABEL_BO 0x0a
#define DRM_VC4_GEM_MADVISE 0x0b
#define DRM_VC4_PERFMON_CREATE 0x0c
#define DRM_VC4_PERFMON_DESTROY 0x0d
#define DRM_VC4_PERFMON_GET_VALUES 0x0e
#define DRM_IOCTL_VC4_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_SUBMIT_CL, struct drm_vc4_submit_cl)
#define DRM_IOCTL_VC4_WAIT_SEQNO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_WAIT_SEQNO, struct drm_vc4_wait_seqno)
@@ -53,6 +57,10 @@ extern "C" {
#define DRM_IOCTL_VC4_SET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_SET_TILING, struct drm_vc4_set_tiling)
#define DRM_IOCTL_VC4_GET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GET_TILING, struct drm_vc4_get_tiling)
#define DRM_IOCTL_VC4_LABEL_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_LABEL_BO, struct drm_vc4_label_bo)
#define DRM_IOCTL_VC4_GEM_MADVISE DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GEM_MADVISE, struct drm_vc4_gem_madvise)
#define DRM_IOCTL_VC4_PERFMON_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_CREATE, struct drm_vc4_perfmon_create)
#define DRM_IOCTL_VC4_PERFMON_DESTROY DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_DESTROY, struct drm_vc4_perfmon_destroy)
#define DRM_IOCTL_VC4_PERFMON_GET_VALUES DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_GET_VALUES, struct drm_vc4_perfmon_get_values)
struct drm_vc4_submit_rcl_surface {
__u32 hindex; /* Handle index, or ~0 if not present. */
@@ -171,6 +179,15 @@ struct drm_vc4_submit_cl {
* wait ioctl).
*/
__u64 seqno;
/* ID of the perfmon to attach to this job. 0 means no perfmon. */
__u32 perfmonid;
/* Unused field to align this struct on 64 bits. Must be set to 0.
* If one ever needs to add an u32 field to this struct, this field
* can be used.
*/
__u32 pad2;
};
/**
@@ -305,6 +322,8 @@ struct drm_vc4_get_hang_state {
#define DRM_VC4_PARAM_SUPPORTS_ETC1 4
#define DRM_VC4_PARAM_SUPPORTS_THREADED_FS 5
#define DRM_VC4_PARAM_SUPPORTS_FIXED_RCL_ORDER 6
#define DRM_VC4_PARAM_SUPPORTS_MADVISE 7
#define DRM_VC4_PARAM_SUPPORTS_PERFMON 8
struct drm_vc4_get_param {
__u32 param;
@@ -333,6 +352,82 @@ struct drm_vc4_label_bo {
__u64 name;
};
/*
* States prefixed with '__' are internal states and cannot be passed to the
* DRM_IOCTL_VC4_GEM_MADVISE ioctl.
*/
#define VC4_MADV_WILLNEED 0
#define VC4_MADV_DONTNEED 1
#define __VC4_MADV_PURGED 2
#define __VC4_MADV_NOTSUPP 3
struct drm_vc4_gem_madvise {
__u32 handle;
__u32 madv;
__u32 retained;
__u32 pad;
};
enum {
VC4_PERFCNT_FEP_VALID_PRIMS_NO_RENDER,
VC4_PERFCNT_FEP_VALID_PRIMS_RENDER,
VC4_PERFCNT_FEP_CLIPPED_QUADS,
VC4_PERFCNT_FEP_VALID_QUADS,
VC4_PERFCNT_TLB_QUADS_NOT_PASSING_STENCIL,
VC4_PERFCNT_TLB_QUADS_NOT_PASSING_Z_AND_STENCIL,
VC4_PERFCNT_TLB_QUADS_PASSING_Z_AND_STENCIL,
VC4_PERFCNT_TLB_QUADS_ZERO_COVERAGE,
VC4_PERFCNT_TLB_QUADS_NON_ZERO_COVERAGE,
VC4_PERFCNT_TLB_QUADS_WRITTEN_TO_COLOR_BUF,
VC4_PERFCNT_PLB_PRIMS_OUTSIDE_VIEWPORT,
VC4_PERFCNT_PLB_PRIMS_NEED_CLIPPING,
VC4_PERFCNT_PSE_PRIMS_REVERSED,
VC4_PERFCNT_QPU_TOTAL_IDLE_CYCLES,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_VERTEX_COORD_SHADING,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_FRAGMENT_SHADING,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_EXEC_VALID_INST,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_TMUS,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_SCOREBOARD,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_VARYINGS,
VC4_PERFCNT_QPU_TOTAL_INST_CACHE_HIT,
VC4_PERFCNT_QPU_TOTAL_INST_CACHE_MISS,
VC4_PERFCNT_QPU_TOTAL_UNIFORM_CACHE_HIT,
VC4_PERFCNT_QPU_TOTAL_UNIFORM_CACHE_MISS,
VC4_PERFCNT_TMU_TOTAL_TEXT_QUADS_PROCESSED,
VC4_PERFCNT_TMU_TOTAL_TEXT_CACHE_MISS,
VC4_PERFCNT_VPM_TOTAL_CLK_CYCLES_VDW_STALLED,
VC4_PERFCNT_VPM_TOTAL_CLK_CYCLES_VCD_STALLED,
VC4_PERFCNT_L2C_TOTAL_L2_CACHE_HIT,
VC4_PERFCNT_L2C_TOTAL_L2_CACHE_MISS,
VC4_PERFCNT_NUM_EVENTS,
};
#define DRM_VC4_MAX_PERF_COUNTERS 16
struct drm_vc4_perfmon_create {
__u32 id;
__u32 ncounters;
__u8 events[DRM_VC4_MAX_PERF_COUNTERS];
};
struct drm_vc4_perfmon_destroy {
__u32 id;
};
/*
* Returns the values of the performance counters tracked by this
* perfmon (as an array of ncounters u64 values).
*
* No implicit synchronization is performed, so the user has to
* guarantee that any jobs using this perfmon have already been
* completed (probably by blocking on the seqno returned by the
* last exec that used the perfmon).
*/
struct drm_vc4_perfmon_get_values {
__u32 id;
__u64 values_ptr;
};
#if defined(__cplusplus)
}
#endif

View File

@@ -20,6 +20,9 @@
inc_drm_uapi = include_directories('drm-uapi')
inc_vulkan = include_directories('vulkan')
inc_d3d9 = include_directories('D3D9')
inc_gl_internal = include_directories('GL/internal')
inc_haikugl = include_directories('HaikuGL')
if with_gles1
install_headers(
@@ -34,13 +37,13 @@ if with_gles2
subdir : 'GLES2',
)
install_headers(
'GLES3/gl3.h', 'GLES3/gl32.h', 'GLES3/gl32.h', 'GLES3/gl3ext.h',
'GLES3/gl3.h', 'GLES3/gl31.h', 'GLES3/gl32.h', 'GLES3/gl3ext.h',
'GLES3/gl3platform.h',
subdir : 'GLES3',
)
endif
if with_gles1 or with_gles2 # or with_egl
if with_gles1 or with_gles2 or with_egl
install_headers('KHR/khrplatform.h', subdir : 'KHR')
endif
@@ -52,10 +55,10 @@ if with_opengl
endif
if with_glx != 'disabled'
install_headers('GL/glx.h', 'GL/glext.h', 'GL/glx_mangle.h', subdir : 'GL')
install_headers('GL/glx.h', 'GL/glxext.h', 'GL/glx_mangle.h', subdir : 'GL')
endif
if with_osmesa
if with_osmesa != 'none'
install_headers('GL/osmesa.h', subdir : 'GL')
endif
@@ -66,3 +69,44 @@ if with_egl
subdir : 'EGL',
)
endif
if with_dri
install_headers('GL/internal/dri_interface.h', subdir : 'GL/internal')
endif
if with_gallium_st_nine
install_headers(
'd3dadapter/d3dadapter9.h', 'd3dadapter/drm.h', 'd3dadapter/present.h',
subdir : 'd3dadapter',
)
endif
if with_platform_haiku
install_headers(
'HaikuGL/GLRenderer.h', 'HaikuGL/GLView.h', 'HaikuGL/OpenGLKit.h',
subdir : 'opengl',
)
endif
# Only install the headers if we are building a stand alone implementation and
# not an ICD enabled implementation
if with_gallium_opencl and not with_opencl_icd
install_headers(
'CL/cl.h',
'CL/cl.hpp',
'CL/cl_d3d10.h',
'CL/cl_d3d11.h',
'CL/cl_dx9_media_sharing.h',
'CL/cl_egl.h',
'CL/cl_ext.h',
'CL/cl_gl.h',
'CL/cl_gl_ext.h',
'CL/cl_platform.h',
'CL/opencl.h',
subdir: 'CL'
)
endif
if with_intel_vk
install_headers('vulkan/vulkan_intel.h', subdir : 'vulkan')
endif

View File

@@ -163,19 +163,27 @@ CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x5926, kbl_gt3, "Intel(R) Iris Plus Graphics 640 (Kaby Lake GT3e)")
CHIPSET(0x5927, kbl_gt3, "Intel(R) Iris Plus Graphics 650 (Kaby Lake GT3e)")
CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x3184, glk, "Intel(R) HD Graphics (Geminilake)")
CHIPSET(0x3185, glk_2x6, "Intel(R) HD Graphics (Geminilake 2x6)")
CHIPSET(0x3E90, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3E93, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3E91, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E92, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3184, glk, "Intel(R) UHD Graphics 605 (Geminilake)")
CHIPSET(0x3185, glk_2x6, "Intel(R) UHD Graphics 600 (Geminilake 2x6)")
CHIPSET(0x3E90, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")
CHIPSET(0x3E93, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")
CHIPSET(0x3E99, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3E91, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
CHIPSET(0x3E92, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
CHIPSET(0x3E96, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E9B, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E9A, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E9B, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
CHIPSET(0x3E94, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3EA9, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3EA2, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA5, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA6, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA7, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA8, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA5, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x5A49, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
CHIPSET(0x5A4A, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
CHIPSET(0x5A41, cnl_3x8, "Intel(R) HD Graphics (Cannonlake 3x8 GT1)")
@@ -188,3 +196,11 @@ CHIPSET(0x5A50, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x5A51, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x5A52, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x5A54, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x8A50, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A51, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A52, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
CHIPSET(0x8A5D, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
CHIPSET(0x8A71, icl_1x8, "Intel(R) HD Graphics (Ice Lake 1x8 GT0.5)")

View File

@@ -1,229 +1,239 @@
CHIPSET(0x6780, TAHITI_6780, TAHITI)
CHIPSET(0x6784, TAHITI_6784, TAHITI)
CHIPSET(0x6788, TAHITI_6788, TAHITI)
CHIPSET(0x678A, TAHITI_678A, TAHITI)
CHIPSET(0x6790, TAHITI_6790, TAHITI)
CHIPSET(0x6791, TAHITI_6791, TAHITI)
CHIPSET(0x6792, TAHITI_6792, TAHITI)
CHIPSET(0x6798, TAHITI_6798, TAHITI)
CHIPSET(0x6799, TAHITI_6799, TAHITI)
CHIPSET(0x679A, TAHITI_679A, TAHITI)
CHIPSET(0x679B, TAHITI_679B, TAHITI)
CHIPSET(0x679E, TAHITI_679E, TAHITI)
CHIPSET(0x679F, TAHITI_679F, TAHITI)
CHIPSET(0x6780, TAHITI)
CHIPSET(0x6784, TAHITI)
CHIPSET(0x6788, TAHITI)
CHIPSET(0x678A, TAHITI)
CHIPSET(0x6790, TAHITI)
CHIPSET(0x6791, TAHITI)
CHIPSET(0x6792, TAHITI)
CHIPSET(0x6798, TAHITI)
CHIPSET(0x6799, TAHITI)
CHIPSET(0x679A, TAHITI)
CHIPSET(0x679B, TAHITI)
CHIPSET(0x679E, TAHITI)
CHIPSET(0x679F, TAHITI)
CHIPSET(0x6800, PITCAIRN_6800, PITCAIRN)
CHIPSET(0x6801, PITCAIRN_6801, PITCAIRN)
CHIPSET(0x6802, PITCAIRN_6802, PITCAIRN)
CHIPSET(0x6806, PITCAIRN_6806, PITCAIRN)
CHIPSET(0x6808, PITCAIRN_6808, PITCAIRN)
CHIPSET(0x6809, PITCAIRN_6809, PITCAIRN)
CHIPSET(0x6810, PITCAIRN_6810, PITCAIRN)
CHIPSET(0x6811, PITCAIRN_6811, PITCAIRN)
CHIPSET(0x6816, PITCAIRN_6816, PITCAIRN)
CHIPSET(0x6817, PITCAIRN_6817, PITCAIRN)
CHIPSET(0x6818, PITCAIRN_6818, PITCAIRN)
CHIPSET(0x6819, PITCAIRN_6819, PITCAIRN)
CHIPSET(0x684C, PITCAIRN_684C, PITCAIRN)
CHIPSET(0x6800, PITCAIRN)
CHIPSET(0x6801, PITCAIRN)
CHIPSET(0x6802, PITCAIRN)
CHIPSET(0x6806, PITCAIRN)
CHIPSET(0x6808, PITCAIRN)
CHIPSET(0x6809, PITCAIRN)
CHIPSET(0x6810, PITCAIRN)
CHIPSET(0x6811, PITCAIRN)
CHIPSET(0x6816, PITCAIRN)
CHIPSET(0x6817, PITCAIRN)
CHIPSET(0x6818, PITCAIRN)
CHIPSET(0x6819, PITCAIRN)
CHIPSET(0x684C, PITCAIRN)
CHIPSET(0x6820, VERDE_6820, VERDE)
CHIPSET(0x6821, VERDE_6821, VERDE)
CHIPSET(0x6822, VERDE_6822, VERDE)
CHIPSET(0x6823, VERDE_6823, VERDE)
CHIPSET(0x6824, VERDE_6824, VERDE)
CHIPSET(0x6825, VERDE_6825, VERDE)
CHIPSET(0x6826, VERDE_6826, VERDE)
CHIPSET(0x6827, VERDE_6827, VERDE)
CHIPSET(0x6828, VERDE_6828, VERDE)
CHIPSET(0x6829, VERDE_6829, VERDE)
CHIPSET(0x682A, VERDE_682A, VERDE)
CHIPSET(0x682B, VERDE_682B, VERDE)
CHIPSET(0x682C, VERDE_682C, VERDE)
CHIPSET(0x682D, VERDE_682D, VERDE)
CHIPSET(0x682F, VERDE_682F, VERDE)
CHIPSET(0x6830, VERDE_6830, VERDE)
CHIPSET(0x6831, VERDE_6831, VERDE)
CHIPSET(0x6835, VERDE_6835, VERDE)
CHIPSET(0x6837, VERDE_6837, VERDE)
CHIPSET(0x6838, VERDE_6838, VERDE)
CHIPSET(0x6839, VERDE_6839, VERDE)
CHIPSET(0x683B, VERDE_683B, VERDE)
CHIPSET(0x683D, VERDE_683D, VERDE)
CHIPSET(0x683F, VERDE_683F, VERDE)
CHIPSET(0x6820, VERDE)
CHIPSET(0x6821, VERDE)
CHIPSET(0x6822, VERDE)
CHIPSET(0x6823, VERDE)
CHIPSET(0x6824, VERDE)
CHIPSET(0x6825, VERDE)
CHIPSET(0x6826, VERDE)
CHIPSET(0x6827, VERDE)
CHIPSET(0x6828, VERDE)
CHIPSET(0x6829, VERDE)
CHIPSET(0x682A, VERDE)
CHIPSET(0x682B, VERDE)
CHIPSET(0x682C, VERDE)
CHIPSET(0x682D, VERDE)
CHIPSET(0x682F, VERDE)
CHIPSET(0x6830, VERDE)
CHIPSET(0x6831, VERDE)
CHIPSET(0x6835, VERDE)
CHIPSET(0x6837, VERDE)
CHIPSET(0x6838, VERDE)
CHIPSET(0x6839, VERDE)
CHIPSET(0x683B, VERDE)
CHIPSET(0x683D, VERDE)
CHIPSET(0x683F, VERDE)
CHIPSET(0x6600, OLAND_6600, OLAND)
CHIPSET(0x6601, OLAND_6601, OLAND)
CHIPSET(0x6602, OLAND_6602, OLAND)
CHIPSET(0x6603, OLAND_6603, OLAND)
CHIPSET(0x6604, OLAND_6604, OLAND)
CHIPSET(0x6605, OLAND_6605, OLAND)
CHIPSET(0x6606, OLAND_6606, OLAND)
CHIPSET(0x6607, OLAND_6607, OLAND)
CHIPSET(0x6608, OLAND_6608, OLAND)
CHIPSET(0x6610, OLAND_6610, OLAND)
CHIPSET(0x6611, OLAND_6611, OLAND)
CHIPSET(0x6613, OLAND_6613, OLAND)
CHIPSET(0x6617, OLAND_6617, OLAND)
CHIPSET(0x6620, OLAND_6620, OLAND)
CHIPSET(0x6621, OLAND_6621, OLAND)
CHIPSET(0x6623, OLAND_6623, OLAND)
CHIPSET(0x6631, OLAND_6631, OLAND)
CHIPSET(0x6600, OLAND)
CHIPSET(0x6601, OLAND)
CHIPSET(0x6602, OLAND)
CHIPSET(0x6603, OLAND)
CHIPSET(0x6604, OLAND)
CHIPSET(0x6605, OLAND)
CHIPSET(0x6606, OLAND)
CHIPSET(0x6607, OLAND)
CHIPSET(0x6608, OLAND)
CHIPSET(0x6610, OLAND)
CHIPSET(0x6611, OLAND)
CHIPSET(0x6613, OLAND)
CHIPSET(0x6617, OLAND)
CHIPSET(0x6620, OLAND)
CHIPSET(0x6621, OLAND)
CHIPSET(0x6623, OLAND)
CHIPSET(0x6631, OLAND)
CHIPSET(0x6660, HAINAN_6660, HAINAN)
CHIPSET(0x6663, HAINAN_6663, HAINAN)
CHIPSET(0x6664, HAINAN_6664, HAINAN)
CHIPSET(0x6665, HAINAN_6665, HAINAN)
CHIPSET(0x6667, HAINAN_6667, HAINAN)
CHIPSET(0x666F, HAINAN_666F, HAINAN)
CHIPSET(0x6660, HAINAN)
CHIPSET(0x6663, HAINAN)
CHIPSET(0x6664, HAINAN)
CHIPSET(0x6665, HAINAN)
CHIPSET(0x6667, HAINAN)
CHIPSET(0x666F, HAINAN)
CHIPSET(0x6640, BONAIRE_6640, BONAIRE)
CHIPSET(0x6641, BONAIRE_6641, BONAIRE)
CHIPSET(0x6646, BONAIRE_6646, BONAIRE)
CHIPSET(0x6647, BONAIRE_6647, BONAIRE)
CHIPSET(0x6649, BONAIRE_6649, BONAIRE)
CHIPSET(0x6650, BONAIRE_6650, BONAIRE)
CHIPSET(0x6651, BONAIRE_6651, BONAIRE)
CHIPSET(0x6658, BONAIRE_6658, BONAIRE)
CHIPSET(0x665C, BONAIRE_665C, BONAIRE)
CHIPSET(0x665D, BONAIRE_665D, BONAIRE)
CHIPSET(0x665F, BONAIRE_665F, BONAIRE)
CHIPSET(0x6640, BONAIRE)
CHIPSET(0x6641, BONAIRE)
CHIPSET(0x6646, BONAIRE)
CHIPSET(0x6647, BONAIRE)
CHIPSET(0x6649, BONAIRE)
CHIPSET(0x6650, BONAIRE)
CHIPSET(0x6651, BONAIRE)
CHIPSET(0x6658, BONAIRE)
CHIPSET(0x665C, BONAIRE)
CHIPSET(0x665D, BONAIRE)
CHIPSET(0x665F, BONAIRE)
CHIPSET(0x9830, KABINI_9830, KABINI)
CHIPSET(0x9831, KABINI_9831, KABINI)
CHIPSET(0x9832, KABINI_9832, KABINI)
CHIPSET(0x9833, KABINI_9833, KABINI)
CHIPSET(0x9834, KABINI_9834, KABINI)
CHIPSET(0x9835, KABINI_9835, KABINI)
CHIPSET(0x9836, KABINI_9836, KABINI)
CHIPSET(0x9837, KABINI_9837, KABINI)
CHIPSET(0x9838, KABINI_9838, KABINI)
CHIPSET(0x9839, KABINI_9839, KABINI)
CHIPSET(0x983A, KABINI_983A, KABINI)
CHIPSET(0x983B, KABINI_983B, KABINI)
CHIPSET(0x983C, KABINI_983C, KABINI)
CHIPSET(0x983D, KABINI_983D, KABINI)
CHIPSET(0x983E, KABINI_983E, KABINI)
CHIPSET(0x983F, KABINI_983F, KABINI)
CHIPSET(0x9830, KABINI)
CHIPSET(0x9831, KABINI)
CHIPSET(0x9832, KABINI)
CHIPSET(0x9833, KABINI)
CHIPSET(0x9834, KABINI)
CHIPSET(0x9835, KABINI)
CHIPSET(0x9836, KABINI)
CHIPSET(0x9837, KABINI)
CHIPSET(0x9838, KABINI)
CHIPSET(0x9839, KABINI)
CHIPSET(0x983A, KABINI)
CHIPSET(0x983B, KABINI)
CHIPSET(0x983C, KABINI)
CHIPSET(0x983D, KABINI)
CHIPSET(0x983E, KABINI)
CHIPSET(0x983F, KABINI)
CHIPSET(0x9850, MULLINS_9850, MULLINS)
CHIPSET(0x9851, MULLINS_9851, MULLINS)
CHIPSET(0x9852, MULLINS_9852, MULLINS)
CHIPSET(0x9853, MULLINS_9853, MULLINS)
CHIPSET(0x9854, MULLINS_9854, MULLINS)
CHIPSET(0x9855, MULLINS_9855, MULLINS)
CHIPSET(0x9856, MULLINS_9856, MULLINS)
CHIPSET(0x9857, MULLINS_9857, MULLINS)
CHIPSET(0x9858, MULLINS_9858, MULLINS)
CHIPSET(0x9859, MULLINS_9859, MULLINS)
CHIPSET(0x985A, MULLINS_985A, MULLINS)
CHIPSET(0x985B, MULLINS_985B, MULLINS)
CHIPSET(0x985C, MULLINS_985C, MULLINS)
CHIPSET(0x985D, MULLINS_985D, MULLINS)
CHIPSET(0x985E, MULLINS_985E, MULLINS)
CHIPSET(0x985F, MULLINS_985F, MULLINS)
CHIPSET(0x9850, MULLINS)
CHIPSET(0x9851, MULLINS)
CHIPSET(0x9852, MULLINS)
CHIPSET(0x9853, MULLINS)
CHIPSET(0x9854, MULLINS)
CHIPSET(0x9855, MULLINS)
CHIPSET(0x9856, MULLINS)
CHIPSET(0x9857, MULLINS)
CHIPSET(0x9858, MULLINS)
CHIPSET(0x9859, MULLINS)
CHIPSET(0x985A, MULLINS)
CHIPSET(0x985B, MULLINS)
CHIPSET(0x985C, MULLINS)
CHIPSET(0x985D, MULLINS)
CHIPSET(0x985E, MULLINS)
CHIPSET(0x985F, MULLINS)
CHIPSET(0x1304, KAVERI_1304, KAVERI)
CHIPSET(0x1305, KAVERI_1305, KAVERI)
CHIPSET(0x1306, KAVERI_1306, KAVERI)
CHIPSET(0x1307, KAVERI_1307, KAVERI)
CHIPSET(0x1309, KAVERI_1309, KAVERI)
CHIPSET(0x130A, KAVERI_130A, KAVERI)
CHIPSET(0x130B, KAVERI_130B, KAVERI)
CHIPSET(0x130C, KAVERI_130C, KAVERI)
CHIPSET(0x130D, KAVERI_130D, KAVERI)
CHIPSET(0x130E, KAVERI_130E, KAVERI)
CHIPSET(0x130F, KAVERI_130F, KAVERI)
CHIPSET(0x1310, KAVERI_1310, KAVERI)
CHIPSET(0x1311, KAVERI_1311, KAVERI)
CHIPSET(0x1312, KAVERI_1312, KAVERI)
CHIPSET(0x1313, KAVERI_1313, KAVERI)
CHIPSET(0x1315, KAVERI_1315, KAVERI)
CHIPSET(0x1316, KAVERI_1316, KAVERI)
CHIPSET(0x1317, KAVERI_1317, KAVERI)
CHIPSET(0x1318, KAVERI_1318, KAVERI)
CHIPSET(0x131B, KAVERI_131B, KAVERI)
CHIPSET(0x131C, KAVERI_131C, KAVERI)
CHIPSET(0x131D, KAVERI_131D, KAVERI)
CHIPSET(0x1304, KAVERI)
CHIPSET(0x1305, KAVERI)
CHIPSET(0x1306, KAVERI)
CHIPSET(0x1307, KAVERI)
CHIPSET(0x1309, KAVERI)
CHIPSET(0x130A, KAVERI)
CHIPSET(0x130B, KAVERI)
CHIPSET(0x130C, KAVERI)
CHIPSET(0x130D, KAVERI)
CHIPSET(0x130E, KAVERI)
CHIPSET(0x130F, KAVERI)
CHIPSET(0x1310, KAVERI)
CHIPSET(0x1311, KAVERI)
CHIPSET(0x1312, KAVERI)
CHIPSET(0x1313, KAVERI)
CHIPSET(0x1315, KAVERI)
CHIPSET(0x1316, KAVERI)
CHIPSET(0x1317, KAVERI)
CHIPSET(0x1318, KAVERI)
CHIPSET(0x131B, KAVERI)
CHIPSET(0x131C, KAVERI)
CHIPSET(0x131D, KAVERI)
CHIPSET(0x67A0, HAWAII_67A0, HAWAII)
CHIPSET(0x67A1, HAWAII_67A1, HAWAII)
CHIPSET(0x67A2, HAWAII_67A2, HAWAII)
CHIPSET(0x67A8, HAWAII_67A8, HAWAII)
CHIPSET(0x67A9, HAWAII_67A9, HAWAII)
CHIPSET(0x67AA, HAWAII_67AA, HAWAII)
CHIPSET(0x67B0, HAWAII_67B0, HAWAII)
CHIPSET(0x67B1, HAWAII_67B1, HAWAII)
CHIPSET(0x67B8, HAWAII_67B8, HAWAII)
CHIPSET(0x67B9, HAWAII_67B9, HAWAII)
CHIPSET(0x67BA, HAWAII_67BA, HAWAII)
CHIPSET(0x67BE, HAWAII_67BE, HAWAII)
CHIPSET(0x67A0, HAWAII)
CHIPSET(0x67A1, HAWAII)
CHIPSET(0x67A2, HAWAII)
CHIPSET(0x67A8, HAWAII)
CHIPSET(0x67A9, HAWAII)
CHIPSET(0x67AA, HAWAII)
CHIPSET(0x67B0, HAWAII)
CHIPSET(0x67B1, HAWAII)
CHIPSET(0x67B8, HAWAII)
CHIPSET(0x67B9, HAWAII)
CHIPSET(0x67BA, HAWAII)
CHIPSET(0x67BE, HAWAII)
CHIPSET(0x6900, ICELAND_, ICELAND)
CHIPSET(0x6901, ICELAND_, ICELAND)
CHIPSET(0x6902, ICELAND_, ICELAND)
CHIPSET(0x6903, ICELAND_, ICELAND)
CHIPSET(0x6907, ICELAND_, ICELAND)
CHIPSET(0x6900, ICELAND)
CHIPSET(0x6901, ICELAND)
CHIPSET(0x6902, ICELAND)
CHIPSET(0x6903, ICELAND)
CHIPSET(0x6907, ICELAND)
CHIPSET(0x6920, TONGA_, TONGA)
CHIPSET(0x6921, TONGA_, TONGA)
CHIPSET(0x6928, TONGA_, TONGA)
CHIPSET(0x6929, TONGA_, TONGA)
CHIPSET(0x692B, TONGA_, TONGA)
CHIPSET(0x692F, TONGA_, TONGA)
CHIPSET(0x6930, TONGA_, TONGA)
CHIPSET(0x6938, TONGA_, TONGA)
CHIPSET(0x6939, TONGA_, TONGA)
CHIPSET(0x6920, TONGA)
CHIPSET(0x6921, TONGA)
CHIPSET(0x6928, TONGA)
CHIPSET(0x6929, TONGA)
CHIPSET(0x692B, TONGA)
CHIPSET(0x692F, TONGA)
CHIPSET(0x6930, TONGA)
CHIPSET(0x6938, TONGA)
CHIPSET(0x6939, TONGA)
CHIPSET(0x9870, CARRIZO_, CARRIZO)
CHIPSET(0x9874, CARRIZO_, CARRIZO)
CHIPSET(0x9875, CARRIZO_, CARRIZO)
CHIPSET(0x9876, CARRIZO_, CARRIZO)
CHIPSET(0x9877, CARRIZO_, CARRIZO)
CHIPSET(0x9870, CARRIZO)
CHIPSET(0x9874, CARRIZO)
CHIPSET(0x9875, CARRIZO)
CHIPSET(0x9876, CARRIZO)
CHIPSET(0x9877, CARRIZO)
CHIPSET(0x7300, FIJI_, FIJI)
CHIPSET(0x7300, FIJI)
CHIPSET(0x67E0, POLARIS11_, POLARIS11)
CHIPSET(0x67E1, POLARIS11_, POLARIS11)
CHIPSET(0x67E3, POLARIS11_, POLARIS11)
CHIPSET(0x67E7, POLARIS11_, POLARIS11)
CHIPSET(0x67E8, POLARIS11_, POLARIS11)
CHIPSET(0x67E9, POLARIS11_, POLARIS11)
CHIPSET(0x67EB, POLARIS11_, POLARIS11)
CHIPSET(0x67EF, POLARIS11_, POLARIS11)
CHIPSET(0x67FF, POLARIS11_, POLARIS11)
CHIPSET(0x67E0, POLARIS11)
CHIPSET(0x67E1, POLARIS11)
CHIPSET(0x67E3, POLARIS11)
CHIPSET(0x67E7, POLARIS11)
CHIPSET(0x67E8, POLARIS11)
CHIPSET(0x67E9, POLARIS11)
CHIPSET(0x67EB, POLARIS11)
CHIPSET(0x67EF, POLARIS11)
CHIPSET(0x67FF, POLARIS11)
CHIPSET(0x67C0, POLARIS10_, POLARIS10)
CHIPSET(0x67C1, POLARIS10_, POLARIS10)
CHIPSET(0x67C2, POLARIS10_, POLARIS10)
CHIPSET(0x67C4, POLARIS10_, POLARIS10)
CHIPSET(0x67C7, POLARIS10_, POLARIS10)
CHIPSET(0x67C8, POLARIS10_, POLARIS10)
CHIPSET(0x67C9, POLARIS10_, POLARIS10)
CHIPSET(0x67CA, POLARIS10_, POLARIS10)
CHIPSET(0x67CC, POLARIS10_, POLARIS10)
CHIPSET(0x67CF, POLARIS10_, POLARIS10)
CHIPSET(0x67D0, POLARIS10_, POLARIS10)
CHIPSET(0x67DF, POLARIS10_, POLARIS10)
CHIPSET(0x67C0, POLARIS10)
CHIPSET(0x67C1, POLARIS10)
CHIPSET(0x67C2, POLARIS10)
CHIPSET(0x67C4, POLARIS10)
CHIPSET(0x67C7, POLARIS10)
CHIPSET(0x67C8, POLARIS10)
CHIPSET(0x67C9, POLARIS10)
CHIPSET(0x67CA, POLARIS10)
CHIPSET(0x67CC, POLARIS10)
CHIPSET(0x67CF, POLARIS10)
CHIPSET(0x67D0, POLARIS10)
CHIPSET(0x67DF, POLARIS10)
CHIPSET(0x98E4, STONEY_, STONEY)
CHIPSET(0x98E4, STONEY)
CHIPSET(0x6980, POLARIS12_, POLARIS12)
CHIPSET(0x6981, POLARIS12_, POLARIS12)
CHIPSET(0x6985, POLARIS12_, POLARIS12)
CHIPSET(0x6986, POLARIS12_, POLARIS12)
CHIPSET(0x6987, POLARIS12_, POLARIS12)
CHIPSET(0x6995, POLARIS12_, POLARIS12)
CHIPSET(0x6997, POLARIS12_, POLARIS12)
CHIPSET(0x699F, POLARIS12_, POLARIS12)
CHIPSET(0x6980, POLARIS12)
CHIPSET(0x6981, POLARIS12)
CHIPSET(0x6985, POLARIS12)
CHIPSET(0x6986, POLARIS12)
CHIPSET(0x6987, POLARIS12)
CHIPSET(0x6995, POLARIS12)
CHIPSET(0x6997, POLARIS12)
CHIPSET(0x699F, POLARIS12)
CHIPSET(0x6860, VEGA10_, VEGA10)
CHIPSET(0x6861, VEGA10_, VEGA10)
CHIPSET(0x6862, VEGA10_, VEGA10)
CHIPSET(0x6863, VEGA10_, VEGA10)
CHIPSET(0x6864, VEGA10_, VEGA10)
CHIPSET(0x6867, VEGA10_, VEGA10)
CHIPSET(0x6868, VEGA10_, VEGA10)
CHIPSET(0x687F, VEGA10_, VEGA10)
CHIPSET(0x686C, VEGA10_, VEGA10)
CHIPSET(0x694C, VEGAM)
CHIPSET(0x694E, VEGAM)
CHIPSET(0x15DD, RAVEN_, RAVEN)
CHIPSET(0x6860, VEGA10)
CHIPSET(0x6861, VEGA10)
CHIPSET(0x6862, VEGA10)
CHIPSET(0x6863, VEGA10)
CHIPSET(0x6864, VEGA10)
CHIPSET(0x6867, VEGA10)
CHIPSET(0x6868, VEGA10)
CHIPSET(0x687F, VEGA10)
CHIPSET(0x686C, VEGA10)
CHIPSET(0x69A0, VEGA12)
CHIPSET(0x69A1, VEGA12)
CHIPSET(0x69A2, VEGA12)
CHIPSET(0x69A3, VEGA12)
CHIPSET(0x69AF, VEGA12)
CHIPSET(0x15DD, RAVEN)
CHIPSET(0x15D8, RAVEN)

View File

@@ -89,32 +89,4 @@ extern "C"
} // extern "C"
#endif // __cplusplus
// Platform-specific headers required by platform window system extensions.
// These are enabled prior to #including "vulkan.h". The same enable then
// controls inclusion of the extension interfaces in vulkan.h.
#ifdef VK_USE_PLATFORM_ANDROID_KHR
#include <android/native_window.h>
#endif
#ifdef VK_USE_PLATFORM_MIR_KHR
#include <mir_toolkit/client_types.h>
#endif
#ifdef VK_USE_PLATFORM_WAYLAND_KHR
#include <wayland-client.h>
#endif
#ifdef VK_USE_PLATFORM_WIN32_KHR
#include <windows.h>
#endif
#ifdef VK_USE_PLATFORM_XLIB_KHR
#include <X11/Xlib.h>
#endif
#ifdef VK_USE_PLATFORM_XCB_KHR
#include <xcb/xcb.h>
#endif
#endif

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,126 @@
#ifndef VULKAN_ANDROID_H_
#define VULKAN_ANDROID_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_android_surface 1
struct ANativeWindow;
#define VK_KHR_ANDROID_SURFACE_SPEC_VERSION 6
#define VK_KHR_ANDROID_SURFACE_EXTENSION_NAME "VK_KHR_android_surface"
typedef VkFlags VkAndroidSurfaceCreateFlagsKHR;
typedef struct VkAndroidSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkAndroidSurfaceCreateFlagsKHR flags;
struct ANativeWindow* window;
} VkAndroidSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateAndroidSurfaceKHR)(VkInstance instance, const VkAndroidSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateAndroidSurfaceKHR(
VkInstance instance,
const VkAndroidSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#define VK_ANDROID_external_memory_android_hardware_buffer 1
struct AHardwareBuffer;
#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_SPEC_VERSION 3
#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_EXTENSION_NAME "VK_ANDROID_external_memory_android_hardware_buffer"
typedef struct VkAndroidHardwareBufferUsageANDROID {
VkStructureType sType;
void* pNext;
uint64_t androidHardwareBufferUsage;
} VkAndroidHardwareBufferUsageANDROID;
typedef struct VkAndroidHardwareBufferPropertiesANDROID {
VkStructureType sType;
void* pNext;
VkDeviceSize allocationSize;
uint32_t memoryTypeBits;
} VkAndroidHardwareBufferPropertiesANDROID;
typedef struct VkAndroidHardwareBufferFormatPropertiesANDROID {
VkStructureType sType;
void* pNext;
VkFormat format;
uint64_t externalFormat;
VkFormatFeatureFlags formatFeatures;
VkComponentMapping samplerYcbcrConversionComponents;
VkSamplerYcbcrModelConversion suggestedYcbcrModel;
VkSamplerYcbcrRange suggestedYcbcrRange;
VkChromaLocation suggestedXChromaOffset;
VkChromaLocation suggestedYChromaOffset;
} VkAndroidHardwareBufferFormatPropertiesANDROID;
typedef struct VkImportAndroidHardwareBufferInfoANDROID {
VkStructureType sType;
const void* pNext;
struct AHardwareBuffer* buffer;
} VkImportAndroidHardwareBufferInfoANDROID;
typedef struct VkMemoryGetAndroidHardwareBufferInfoANDROID {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
} VkMemoryGetAndroidHardwareBufferInfoANDROID;
typedef struct VkExternalFormatANDROID {
VkStructureType sType;
void* pNext;
uint64_t externalFormat;
} VkExternalFormatANDROID;
typedef VkResult (VKAPI_PTR *PFN_vkGetAndroidHardwareBufferPropertiesANDROID)(VkDevice device, const struct AHardwareBuffer* buffer, VkAndroidHardwareBufferPropertiesANDROID* pProperties);
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryAndroidHardwareBufferANDROID)(VkDevice device, const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo, struct AHardwareBuffer** pBuffer);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkGetAndroidHardwareBufferPropertiesANDROID(
VkDevice device,
const struct AHardwareBuffer* buffer,
VkAndroidHardwareBufferPropertiesANDROID* pProperties);
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryAndroidHardwareBufferANDROID(
VkDevice device,
const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo,
struct AHardwareBuffer** pBuffer);
#endif
#ifdef __cplusplus
}
#endif
#endif

7470
include/vulkan/vulkan_core.h Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,58 @@
#ifndef VULKAN_IOS_H_
#define VULKAN_IOS_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_MVK_ios_surface 1
#define VK_MVK_IOS_SURFACE_SPEC_VERSION 2
#define VK_MVK_IOS_SURFACE_EXTENSION_NAME "VK_MVK_ios_surface"
typedef VkFlags VkIOSSurfaceCreateFlagsMVK;
typedef struct VkIOSSurfaceCreateInfoMVK {
VkStructureType sType;
const void* pNext;
VkIOSSurfaceCreateFlagsMVK flags;
const void* pView;
} VkIOSSurfaceCreateInfoMVK;
typedef VkResult (VKAPI_PTR *PFN_vkCreateIOSSurfaceMVK)(VkInstance instance, const VkIOSSurfaceCreateInfoMVK* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateIOSSurfaceMVK(
VkInstance instance,
const VkIOSSurfaceCreateInfoMVK* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,58 @@
#ifndef VULKAN_MACOS_H_
#define VULKAN_MACOS_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_MVK_macos_surface 1
#define VK_MVK_MACOS_SURFACE_SPEC_VERSION 2
#define VK_MVK_MACOS_SURFACE_EXTENSION_NAME "VK_MVK_macos_surface"
typedef VkFlags VkMacOSSurfaceCreateFlagsMVK;
typedef struct VkMacOSSurfaceCreateInfoMVK {
VkStructureType sType;
const void* pNext;
VkMacOSSurfaceCreateFlagsMVK flags;
const void* pView;
} VkMacOSSurfaceCreateInfoMVK;
typedef VkResult (VKAPI_PTR *PFN_vkCreateMacOSSurfaceMVK)(VkInstance instance, const VkMacOSSurfaceCreateInfoMVK* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateMacOSSurfaceMVK(
VkInstance instance,
const VkMacOSSurfaceCreateInfoMVK* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,65 @@
#ifndef VULKAN_MIR_H_
#define VULKAN_MIR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_mir_surface 1
#define VK_KHR_MIR_SURFACE_SPEC_VERSION 4
#define VK_KHR_MIR_SURFACE_EXTENSION_NAME "VK_KHR_mir_surface"
typedef VkFlags VkMirSurfaceCreateFlagsKHR;
typedef struct VkMirSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkMirSurfaceCreateFlagsKHR flags;
MirConnection* connection;
MirSurface* mirSurface;
} VkMirSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateMirSurfaceKHR)(VkInstance instance, const VkMirSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceMirPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, MirConnection* connection);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateMirSurfaceKHR(
VkInstance instance,
const VkMirSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceMirPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
MirConnection* connection);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,58 @@
#ifndef VULKAN_VI_H_
#define VULKAN_VI_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_NN_vi_surface 1
#define VK_NN_VI_SURFACE_SPEC_VERSION 1
#define VK_NN_VI_SURFACE_EXTENSION_NAME "VK_NN_vi_surface"
typedef VkFlags VkViSurfaceCreateFlagsNN;
typedef struct VkViSurfaceCreateInfoNN {
VkStructureType sType;
const void* pNext;
VkViSurfaceCreateFlagsNN flags;
void* window;
} VkViSurfaceCreateInfoNN;
typedef VkResult (VKAPI_PTR *PFN_vkCreateViSurfaceNN)(VkInstance instance, const VkViSurfaceCreateInfoNN* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateViSurfaceNN(
VkInstance instance,
const VkViSurfaceCreateInfoNN* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,65 @@
#ifndef VULKAN_WAYLAND_H_
#define VULKAN_WAYLAND_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_wayland_surface 1
#define VK_KHR_WAYLAND_SURFACE_SPEC_VERSION 6
#define VK_KHR_WAYLAND_SURFACE_EXTENSION_NAME "VK_KHR_wayland_surface"
typedef VkFlags VkWaylandSurfaceCreateFlagsKHR;
typedef struct VkWaylandSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkWaylandSurfaceCreateFlagsKHR flags;
struct wl_display* display;
struct wl_surface* surface;
} VkWaylandSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateWaylandSurfaceKHR)(VkInstance instance, const VkWaylandSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceWaylandPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, struct wl_display* display);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateWaylandSurfaceKHR(
VkInstance instance,
const VkWaylandSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceWaylandPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
struct wl_display* display);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,276 @@
#ifndef VULKAN_WIN32_H_
#define VULKAN_WIN32_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_win32_surface 1
#define VK_KHR_WIN32_SURFACE_SPEC_VERSION 6
#define VK_KHR_WIN32_SURFACE_EXTENSION_NAME "VK_KHR_win32_surface"
typedef VkFlags VkWin32SurfaceCreateFlagsKHR;
typedef struct VkWin32SurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkWin32SurfaceCreateFlagsKHR flags;
HINSTANCE hinstance;
HWND hwnd;
} VkWin32SurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateWin32SurfaceKHR)(VkInstance instance, const VkWin32SurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceWin32PresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateWin32SurfaceKHR(
VkInstance instance,
const VkWin32SurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceWin32PresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex);
#endif
#define VK_KHR_external_memory_win32 1
#define VK_KHR_EXTERNAL_MEMORY_WIN32_SPEC_VERSION 1
#define VK_KHR_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME "VK_KHR_external_memory_win32"
typedef struct VkImportMemoryWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportMemoryWin32HandleInfoKHR;
typedef struct VkExportMemoryWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportMemoryWin32HandleInfoKHR;
typedef struct VkMemoryWin32HandlePropertiesKHR {
VkStructureType sType;
void* pNext;
uint32_t memoryTypeBits;
} VkMemoryWin32HandlePropertiesKHR;
typedef struct VkMemoryGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
VkExternalMemoryHandleTypeFlagBits handleType;
} VkMemoryGetWin32HandleInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandleKHR)(VkDevice device, const VkMemoryGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandlePropertiesKHR)(VkDevice device, VkExternalMemoryHandleTypeFlagBits handleType, HANDLE handle, VkMemoryWin32HandlePropertiesKHR* pMemoryWin32HandleProperties);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandleKHR(
VkDevice device,
const VkMemoryGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandlePropertiesKHR(
VkDevice device,
VkExternalMemoryHandleTypeFlagBits handleType,
HANDLE handle,
VkMemoryWin32HandlePropertiesKHR* pMemoryWin32HandleProperties);
#endif
#define VK_KHR_win32_keyed_mutex 1
#define VK_KHR_WIN32_KEYED_MUTEX_SPEC_VERSION 1
#define VK_KHR_WIN32_KEYED_MUTEX_EXTENSION_NAME "VK_KHR_win32_keyed_mutex"
typedef struct VkWin32KeyedMutexAcquireReleaseInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t acquireCount;
const VkDeviceMemory* pAcquireSyncs;
const uint64_t* pAcquireKeys;
const uint32_t* pAcquireTimeouts;
uint32_t releaseCount;
const VkDeviceMemory* pReleaseSyncs;
const uint64_t* pReleaseKeys;
} VkWin32KeyedMutexAcquireReleaseInfoKHR;
#define VK_KHR_external_semaphore_win32 1
#define VK_KHR_EXTERNAL_SEMAPHORE_WIN32_SPEC_VERSION 1
#define VK_KHR_EXTERNAL_SEMAPHORE_WIN32_EXTENSION_NAME "VK_KHR_external_semaphore_win32"
typedef struct VkImportSemaphoreWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkSemaphore semaphore;
VkSemaphoreImportFlags flags;
VkExternalSemaphoreHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportSemaphoreWin32HandleInfoKHR;
typedef struct VkExportSemaphoreWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportSemaphoreWin32HandleInfoKHR;
typedef struct VkD3D12FenceSubmitInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t waitSemaphoreValuesCount;
const uint64_t* pWaitSemaphoreValues;
uint32_t signalSemaphoreValuesCount;
const uint64_t* pSignalSemaphoreValues;
} VkD3D12FenceSubmitInfoKHR;
typedef struct VkSemaphoreGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkSemaphore semaphore;
VkExternalSemaphoreHandleTypeFlagBits handleType;
} VkSemaphoreGetWin32HandleInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkImportSemaphoreWin32HandleKHR)(VkDevice device, const VkImportSemaphoreWin32HandleInfoKHR* pImportSemaphoreWin32HandleInfo);
typedef VkResult (VKAPI_PTR *PFN_vkGetSemaphoreWin32HandleKHR)(VkDevice device, const VkSemaphoreGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkImportSemaphoreWin32HandleKHR(
VkDevice device,
const VkImportSemaphoreWin32HandleInfoKHR* pImportSemaphoreWin32HandleInfo);
VKAPI_ATTR VkResult VKAPI_CALL vkGetSemaphoreWin32HandleKHR(
VkDevice device,
const VkSemaphoreGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
#endif
#define VK_KHR_external_fence_win32 1
#define VK_KHR_EXTERNAL_FENCE_WIN32_SPEC_VERSION 1
#define VK_KHR_EXTERNAL_FENCE_WIN32_EXTENSION_NAME "VK_KHR_external_fence_win32"
typedef struct VkImportFenceWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkFence fence;
VkFenceImportFlags flags;
VkExternalFenceHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportFenceWin32HandleInfoKHR;
typedef struct VkExportFenceWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportFenceWin32HandleInfoKHR;
typedef struct VkFenceGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkFence fence;
VkExternalFenceHandleTypeFlagBits handleType;
} VkFenceGetWin32HandleInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkImportFenceWin32HandleKHR)(VkDevice device, const VkImportFenceWin32HandleInfoKHR* pImportFenceWin32HandleInfo);
typedef VkResult (VKAPI_PTR *PFN_vkGetFenceWin32HandleKHR)(VkDevice device, const VkFenceGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkImportFenceWin32HandleKHR(
VkDevice device,
const VkImportFenceWin32HandleInfoKHR* pImportFenceWin32HandleInfo);
VKAPI_ATTR VkResult VKAPI_CALL vkGetFenceWin32HandleKHR(
VkDevice device,
const VkFenceGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
#endif
#define VK_NV_external_memory_win32 1
#define VK_NV_EXTERNAL_MEMORY_WIN32_SPEC_VERSION 1
#define VK_NV_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME "VK_NV_external_memory_win32"
typedef struct VkImportMemoryWin32HandleInfoNV {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagsNV handleType;
HANDLE handle;
} VkImportMemoryWin32HandleInfoNV;
typedef struct VkExportMemoryWin32HandleInfoNV {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
} VkExportMemoryWin32HandleInfoNV;
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandleNV)(VkDevice device, VkDeviceMemory memory, VkExternalMemoryHandleTypeFlagsNV handleType, HANDLE* pHandle);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandleNV(
VkDevice device,
VkDeviceMemory memory,
VkExternalMemoryHandleTypeFlagsNV handleType,
HANDLE* pHandle);
#endif
#define VK_NV_win32_keyed_mutex 1
#define VK_NV_WIN32_KEYED_MUTEX_SPEC_VERSION 1
#define VK_NV_WIN32_KEYED_MUTEX_EXTENSION_NAME "VK_NV_win32_keyed_mutex"
typedef struct VkWin32KeyedMutexAcquireReleaseInfoNV {
VkStructureType sType;
const void* pNext;
uint32_t acquireCount;
const VkDeviceMemory* pAcquireSyncs;
const uint64_t* pAcquireKeys;
const uint32_t* pAcquireTimeoutMilliseconds;
uint32_t releaseCount;
const VkDeviceMemory* pReleaseSyncs;
const uint64_t* pReleaseKeys;
} VkWin32KeyedMutexAcquireReleaseInfoNV;
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,66 @@
#ifndef VULKAN_XCB_H_
#define VULKAN_XCB_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_xcb_surface 1
#define VK_KHR_XCB_SURFACE_SPEC_VERSION 6
#define VK_KHR_XCB_SURFACE_EXTENSION_NAME "VK_KHR_xcb_surface"
typedef VkFlags VkXcbSurfaceCreateFlagsKHR;
typedef struct VkXcbSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkXcbSurfaceCreateFlagsKHR flags;
xcb_connection_t* connection;
xcb_window_t window;
} VkXcbSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateXcbSurfaceKHR)(VkInstance instance, const VkXcbSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceXcbPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, xcb_connection_t* connection, xcb_visualid_t visual_id);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateXcbSurfaceKHR(
VkInstance instance,
const VkXcbSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceXcbPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
xcb_connection_t* connection,
xcb_visualid_t visual_id);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,66 @@
#ifndef VULKAN_XLIB_H_
#define VULKAN_XLIB_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_xlib_surface 1
#define VK_KHR_XLIB_SURFACE_SPEC_VERSION 6
#define VK_KHR_XLIB_SURFACE_EXTENSION_NAME "VK_KHR_xlib_surface"
typedef VkFlags VkXlibSurfaceCreateFlagsKHR;
typedef struct VkXlibSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkXlibSurfaceCreateFlagsKHR flags;
Display* dpy;
Window window;
} VkXlibSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateXlibSurfaceKHR)(VkInstance instance, const VkXlibSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceXlibPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, Display* dpy, VisualID visualID);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateXlibSurfaceKHR(
VkInstance instance,
const VkXlibSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceXlibPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
Display* dpy,
VisualID visualID);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,54 @@
#ifndef VULKAN_XLIB_RANDR_H_
#define VULKAN_XLIB_RANDR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2017 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_EXT_acquire_xlib_display 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"
typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);
typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
VkDisplayKHR display);
VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
RROutput rrOutput,
VkDisplayKHR* pDisplay);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,54 @@
#ifndef VULKAN_XLIB_XRANDR_H_
#define VULKAN_XLIB_XRANDR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_EXT_acquire_xlib_display 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"
typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);
typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
VkDisplayKHR display);
VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
RROutput rrOutput,
VkDisplayKHR* pDisplay);
#endif
#ifdef __cplusplus
}
#endif
#endif

File diff suppressed because it is too large Load Diff

View File

@@ -21,45 +21,129 @@
option(
'platforms',
type : 'string',
value : 'x11,wayland,drm,surfaceless',
description : 'comma separated list of window systems to support. wayland, x11, surfaceless, drm, etc.'
value : 'auto',
description : 'comma separated list of window systems to support. If this is set to auto all platforms applicable to the OS will be enabled.'
)
option(
'dri3',
type : 'combo',
value : 'auto',
choices : ['auto', 'yes', 'no'],
choices : ['auto', 'true', 'false'],
description : 'enable support for dri3'
)
option(
'dri-drivers',
type : 'string',
value : 'i915,i965',
description : 'comma separated list of dri drivers to build.'
value : 'auto',
description : 'comma separated list of dri drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
)
option(
'dri-drivers-path',
type : 'string',
value : '',
description : 'Location of dri drivers. Default: $libdir/dri.'
description : 'Location to install dri drivers. Default: $libdir/dri.'
)
option(
'dri-search-path',
type : 'string',
value : '',
description : 'Locations to search for dri drivers, passed as colon separated list. Default: dri-drivers-path.'
)
option(
'gallium-drivers',
type : 'string',
value : 'pl111,radeonsi,nouveau,swrast,vc4',
description : 'comma separated list of gallium drivers to build.'
value : 'auto',
description : 'comma separated list of gallium drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
)
option(
'gallium-media',
'gallium-extra-hud',
type : 'boolean',
value : false,
description : 'Enable HUD block/NIC I/O HUD status support',
)
option(
'gallium-vdpau',
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'enable gallium vdpau state tracker.',
)
option(
'vdpau-libs-path',
type : 'string',
value : '',
description : 'comma separated list of gallium media APIs to build (omx,va,vdpau,xvmc).'
description : 'path to put vdpau libraries. defaults to $libdir/vdpau.'
)
option(
'gallium-xvmc',
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'enable gallium xvmc state tracker.',
)
option(
'xvmc-libs-path',
type : 'string',
value : '',
description : 'path to put xvmc libraries. defaults to $libdir.'
)
option(
'gallium-omx',
type : 'combo',
value : 'auto',
choices : ['auto', 'disabled', 'bellagio', 'tizonia'],
description : 'enable gallium omx state tracker.',
)
option(
'omx-libs-path',
type : 'string',
value : '',
description : 'path to put omx libraries. defaults to omx-bellagio pkg-config pluginsdir.'
)
option(
'gallium-va',
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'enable gallium va state tracker.',
)
option(
'va-libs-path',
type : 'string',
value : '',
description : 'path to put va libraries. defaults to $libdir/dri.'
)
option(
'gallium-xa',
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'enable gallium xa state tracker.',
)
option(
'gallium-nine',
type : 'boolean',
value : false,
description : 'build gallium "nine" Direct3D 9.x state tracker.',
)
option(
'gallium-opencl',
type : 'combo',
choices : ['icd', 'standalone', 'disabled'],
value : 'disabled',
description : 'build gallium "clover" OpenCL state tracker.',
)
option(
'd3d-drivers-path',
type : 'string',
value : '',
description : 'Location of D3D drivers. Default: $libdir/d3d',
)
option(
'vulkan-drivers',
type : 'string',
value : 'intel,amd',
description : 'comma separated list of vulkan drivers to build.'
value : 'auto',
description : 'comma separated list of vulkan drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
)
option(
'shader-cache',
@@ -101,7 +185,7 @@ option(
'gbm',
type : 'combo',
value : 'auto',
choices : ['auto', 'yes', 'no'],
choices : ['auto', 'true', 'false'],
description : 'Build support for gbm platform'
)
option(
@@ -115,7 +199,7 @@ option(
'egl',
type : 'combo',
value : 'auto',
choices : ['auto', 'yes', 'no'],
choices : ['auto', 'true', 'false'],
description : 'Build support for EGL platform'
)
option(
@@ -132,15 +216,31 @@ option(
)
option(
'llvm',
type : 'boolean',
value : true,
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'Build with LLVM support.'
)
option(
'valgrind',
type : 'boolean',
value : true,
description : 'Build with valgrind support if possible'
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'Build with valgrind support'
)
option(
'libunwind',
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'Use libunwind for stack-traces'
)
option(
'lmsensors',
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'Enable HUD lmsensors support.'
)
option(
'build-tests',
@@ -154,3 +254,35 @@ option(
value : false,
description : 'Enable floating point textures and renderbuffers. This option may be patent encumbered, please read docs/patents.txt and consult with your lawyer before turning this on.'
)
option(
'selinux',
type : 'boolean',
value : false,
description : 'Build an SELinux-aware Mesa'
)
option(
'osmesa',
type : 'combo',
value : 'none',
choices : ['none', 'classic', 'gallium'],
description : 'Build OSmesa.'
)
option(
'osmesa-bits',
type : 'combo',
value : '8',
choices : ['8', '16', '32'],
description : 'Number of channel bits for OSMesa.'
)
option(
'swr-arches',
type : 'string',
value : 'avx,avx2',
description : 'Comma delemited swr architectures. choices : avx,avx2,knl,skx'
)
option(
'tools',
type : 'string',
value : '',
description : 'Comma delimited list of tools to build. choices : freedreno,glsl,intel,nir,nouveau,xvmc or all'
)

View File

@@ -134,7 +134,9 @@ def check_cc(env, cc, expr, cpp_opt = '-E'):
source.write('#if !(%s)\n#error\n#endif\n' % expr)
source.close()
pipe = SCons.Action._subproc(env, [env['CC'], cpp_opt, source.name],
# sys.stderr.write('%r %s %s\n' % (env['CC'], cpp_opt, source.name));
pipe = SCons.Action._subproc(env, env.Split(env['CC']) + [cpp_opt, source.name],
stdin = 'devnull',
stderr = 'devnull',
stdout = 'devnull')
@@ -352,9 +354,15 @@ def generate(env):
if check_header(env, 'xlocale.h'):
cppdefines += ['HAVE_XLOCALE_H']
if check_header(env, 'endian.h'):
cppdefines += ['HAVE_ENDIAN_H']
if check_functions(env, ['strtod_l', 'strtof_l']):
cppdefines += ['HAVE_STRTOD_L']
if check_functions(env, ['timespec_get']):
cppdefines += ['HAVE_TIMESPEC_GET']
if platform == 'windows':
cppdefines += [
'WIN32',

View File

@@ -19,21 +19,14 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
.PHONY: git_sha1.h.tmp
git_sha1.h.tmp:
@$(PYTHON2) $(top_srcdir)/bin/git_sha1_gen.py > $@
git_sha1.h: git_sha1.h.tmp
@echo "updating git_sha1.h"
@if ! cmp -s git_sha1.h.tmp git_sha1.h; then \
mv git_sha1.h.tmp git_sha1.h ;\
else \
rm git_sha1.h.tmp ;\
fi
.PHONY: git_sha1.h
git_sha1.h: $(top_srcdir)/src/git_sha1.h.in
@echo "updating $@"
@$(PYTHON2) $(top_srcdir)/bin/git_sha1_gen.py --output $@
BUILT_SOURCES = git_sha1.h
CLEANFILES = $(BUILT_SOURCES)
EXTRA_DIST =
EXTRA_DIST = git_sha1.h.in meson.build
SUBDIRS = . gtest util mapi/glapi/gen mapi
@@ -64,7 +57,7 @@ endif
# include only conditionally ?
SUBDIRS += compiler
## Optionally required by GBM, EGL
## Optionally required by EGL
if HAVE_PLATFORM_WAYLAND
SUBDIRS += egl/wayland/wayland-drm
endif
@@ -74,7 +67,6 @@ SUBDIRS += vulkan
endif
EXTRA_DIST += vulkan/registry/vk.xml
EXTRA_DIST += vulkan/registry/vk_android_native_buffer.xml
if HAVE_AMD_DRIVERS
SUBDIRS += amd

View File

@@ -24,22 +24,12 @@ def write_git_sha1_h_file(filename):
to retrieve the git hashid and write the header file. An empty file
will be created if anything goes wrong."""
tempfile = "git_sha1.h.tmp"
with open(tempfile, "w") as f:
args = [ python_cmd, Dir('#').abspath + '/bin/git_sha1_gen.py' ]
try:
subprocess.Popen(args, stdout=f).wait()
except:
print("Warning: exception in write_git_sha1_h_file()")
return
if not os.path.exists(filename) or not filecmp.cmp(tempfile, filename):
# The filename does not exist or it's different from the new file,
# so replace old file with new.
if os.path.exists(filename):
os.remove(filename)
os.rename(tempfile, filename)
return
args = [ python_cmd, Dir('#').abspath + '/bin/git_sha1_gen.py', '--output', filename ]
try:
subprocess.call(args)
except:
print("Warning: exception in write_git_sha1_h_file()")
return
# Create the git_sha1.h header file

Some files were not shown because too many files have changed in this diff Show More